8. Detector calibration for operation

This chapter introduces the calibrations required to bring the Septemboard detector into a usable state for data taking at an experiment and, in particular, to interpret the data taken with it, sec. 8.1. The calibrations purely related to the Timepix ASIC itself – required for noise-free operation at the lowest possible thresholds – can be found in appendix 19.1. The correct working parameters for the FADC are discussed in sec. 8.2. Finally, the scintillators need to be set to their correct discriminator thresholds, see sec. 8.3.

8.1. Timepix calibrations

As alluded to above, here we focus on interpreting data taken with a Timepix ASIC. This means introducing the Time over Threshold (ToT) calibration method used to interpret recorded ToT values as charges, sec. 8.1.1. Further, based on the recorded charge the gas gain can be determined, as discussed in sec. 8.1.2.

Important references for the Timepix in general and for the calibration procedures explained below and in the appendix are (Llopart et al. 2007; Llopart Cudie 2007; Llopart and Poikela 2006; Lupberger 2016).

8.1.1. ToT calibration

The purpose of the ToT (\textbf{T}ime \textbf{o}ver \textbf{T}hreshold) calibration is not stable operation of a Timepix based detector, but the interpretation of the recorded data: it is needed to convert the ToT values recorded by each pixel into a charge, i.e. a number of recorded electrons.

This is done by injecting charges onto the individual pixels – 'test pulses'. Each pixel contains a capacitor onto which very precise voltage pulses can be injected. In the case of the Timepix1, this capacitance is \(\SI{8}{fF}\) (Llopart and Poikela 2006). Knowing the capacitance and the voltage induced on it, the number of injected electrons is easily calculated from

\[ Q = C U. \]
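For the \(\SI{8}{fF}\) capacitance this works out to roughly \(\num{50}\) electrons per \(\si{mV}\) of test pulse height, which reappears as the conversion factor \(α\) below. A quick sanity check of that number in plain Nim:

import std / math
# Q = C·U for the 8 fF test pulse capacitor and a 1 mV pulse
let cap = 8e-15          # capacitance in F
let U   = 1e-3           # test pulse height in V
let Q   = cap * U        # injected charge, 8e-18 C
echo Q / 1.602176634e-19 # ≈ 50 electrons per mV of pulse height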

By varying the injected charge and recording the resulting ToT values of the pixels, a relation between electrons and ToT values is determined:

\[ f(p) = a p + b - \frac{c}{p - t} \] where \(a, b, c\) and \(t\) are parameters to be determined via a numerical fit and \(p\) is the test pulse height in \(\si{mV}\).

Inverting this relation, a charge can be computed for a given ToT value:

\[ f(\mathtt{ToT}) = \frac{α}{2 a} \left( \mathtt{ToT} - (b - a t) + \sqrt{ \left( \mathtt{ToT} - (b - a t) \right)^2 + 4 (a b t + a c - a t \mathtt{ToT} ) } \right) \] where \(\mathtt{ToT}\) is the time over threshold value recorded for a pixel, the constants \(a, b, c, t\) are the fit parameters determined above and \(α\) is the conversion factor from a pulse height in \(\si{mV}\) to a number of electrons (about \(\num{50}\) electrons per \(\si{mV}\), see above).
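As a minimal sketch (our own naming; the actual implementation is quoted via its doc string in the extended section below), the fit function and its inversion to a charge in electrons can be written as:

import std / math

type ToTCalib = object
  a, b, c, t: float ## fit parameters of the ToT calibration

proc totOfPulse(p: float, cal: ToTCalib): float =
  ## the fit function f(p) = a·p + b - c/(p - t), with p in mV
  cal.a * p + cal.b - cal.c / (p - cal.t)

proc totToCharge(tot: float, cal: ToTCalib, α = 50.0): float =
  ## inverts the fit function for the pulse height and scales by α ≈ 50 e⁻/mV
  ## to obtain the charge in electrons
  let u = tot - (cal.b - cal.a * cal.t)
  result = α / (2.0 * cal.a) *
    (u + sqrt(u * u + 4.0 * (cal.a * cal.b * cal.t + cal.a * cal.c - cal.a * cal.t * tot)))

The doc string of the TimepixAnalysis implementation, reproduced in the extended section below, contains exactly this expression with \(α = 50\).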

An example of a ToT calibration of one chip is shown in fig. 1.

tot_calib_Run2_chip_3.svg
Figure 1: Example of a ToT calibration measurement for the chip H10 W69, the center chip of the Septemboard, as it was done for the CAST Run-2 data taking period.
8.1.1.1. Generate plots for the ToT calibration    extended

The plots for all calibrations (of this sort) are produced using the plotCalibration tool, ./../CastData/ExternCode/TimepixAnalysis/Plotting/plotCalibration/plotCalibration.nim. The single plot used in the thesis above can be produced like this:

# To generate fig:septem:tot_calibration_example
plotCalibration --tot --runPeriod=Run2 --useTeX \
                --file ~/CastData/ExternCode/TimepixAnalysis/resources/ChipCalibrations/Run2/chip3/TOTCalib3.txt \
                --outpath ~/phd/Figs/detector/calibration/

Note that we hand over the calibration data file directly and do not use the InGrid database. Either works: if a chip number and a run period are passed instead of the ToT calibration text file, the tool reads from the database instead.

See appendix section 19.2.1 for all ToT calibrations for all Septemboard chips.

The following is the doc string of the ToT calibration conversion function used in TimepixAnalysis.

  ## calculates the charge in electrons from the TOT value, based on the TOT calibration
  ## from MarlinTPC:
  ## measured and fitted is ToT[clock cycles] in dependency of TestPulseHeight [mV]
  ## fit function is:
  ##   ToT[clock cycles] = a*TestPulseHeight [mV]  + b - ( c / (TestPulseHeight [mV] -t) )
  ## isolating TestPulseHeight gives:
  ##   TestPulseHeight [mV] = 1/(2*a) * (ToT[clock cycles] - (b - a*t) +
  ##                          SQRT( (ToT[clock cycles] - (b - a*t))^2 +
  ##                                4*(a*b*t + a*c - a*t*ToT[clock cycles]) ) )
  ## conversion from charge to electrons:
  ##   electrons = 50 * testpulse[mV]
  ## so we have:
  ## Charge[electrons] = 50 / (2*a) * (ToT[clock cycles] - (b - a*t) +
  ##                     SQRT( (ToT[clock cycles] - (b - a*t))^2 +
  ##                           4*(a*b*t + a*c - a*t*ToT[clock cycles]) ) )

8.1.2. Pólya distribution & gas gain

In sec. 6.3.6 we introduced the Pólya distribution to describe the statistical distribution of the gas amplification stage. In the practical context of the Septemboard detector and the ToT calibration, this corresponds to the histogram of all charge values recorded by the detector (equivalently, of all ToT values). As the ToT calibration function is non-linear, the histogram has equal bin widths in ToT space, but increasing bin widths in charge space.

Such a histogram can be seen in fig. 2, for a \(\SI{90}{min}\) slice of background data of the center chip of the Septemboard. The reasoning behind looking at a fixed time interval for the Pólya is explained in section 11.2.2. The pink line represents the fit of the Pólya distribution following eq. \eqref{eq:theory:polya_distribution} to the data. The dashed part of the line was not used for the fit and is only an extension using the final fit parameters. At the lower end of the charge values, a cutoff due to the chip's activation threshold is clearly visible. Note also how the bin widths increase from low to high charge values. The fit determines a gas gain of \(\num{3604.3}\), compared to the mean of the data yielding \(\num{3171.0}\). Following (Christoph Krieger 2018), any gas gain quoted in the remainder of the thesis refers to the mean of the data and not the fit parameter; the fit mainly serves as a verification of the general behavior.
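For reference, a sketch of such a fit function, assuming the common parameterization of the Pólya in terms of the gas gain \(G\) (the mean of the distribution) and shape parameter \(θ\) (the exact form used is eq. \eqref{eq:theory:polya_distribution}):

import std / math

proc polya(q: float, N, G, θ: float): float =
  ## Pólya distribution in the common parameterization: `G` is the gas gain,
  ## `θ` the shape parameter and `N` a normalization constant.
  let k = θ + 1.0
  result = N / G * pow(k, k) / gamma(k) * pow(q / G, θ) * exp(-k * q / G)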

gas_gain_run_267_chip_3_2_90_min_1541650002.svg
Figure 2: An example of a Pólya distribution of chip 3 using the calibration of July 2018 based on \SI{90}{min} of background data. A cutoff at low charges is visible. The pink line represents the fit of the Pólya distribution to the data. In the dashed region the line was extended using the final fit parameters.

A secondary use case for the Pólya distribution is the determination of the activation threshold via the observed cutoff. More on this in appendix 19.1.4.

8.1.2.1. Generate Pólya plot for chip 3, run period 3    extended

We produce a Pólya plot by performing the reconstruction of a single run. In principle all gas gain plots are produced during the regular reconstruction anyway, but that would mean just copying over a single file. Better to showcase how they are produced in a standalone fashion.

Say we use run 267 and it is in some directory:

cp ~/CastData/data/2018_2/Run_267_181108-02-05_rtBackground.tar.gz /tmp/Run_267_181108-02-05.tar.gz
cd /tmp
raw_data_manipulation -p Run_267_181108-02-05.tar.gz --runType background --out raw_267.h5
reconstruction -i raw_267.h5 --out reco_267.h5
reconstruction -i reco_267.h5 --only_charge
reconstruction -i reco_267.h5 --only_gas_gain --useTeX

where we first copy the raw data to /tmp, parse it, reconstruct it, perform charge calibration and finally fit the gas gain data. That produces plots in ./../../../tmp/out/raw_267_2023-11-30_15-01-39/. For the thesis we use: gas_gain_run_267_chip_3_2_90_min_1541650002.pdf

The gas gain of the fit and from the data are both in the title of the plot.

Fit: 3604.3 Data: 3171.0

8.1.3. Final Septemboard calibrations

The detector was calibrated according to the descriptions of the previous sections and appendix 19 prior to both major data taking campaigns at CAST (see sec. 10.6 for a detailed overview of both campaigns), once in October 2017 and then again in July 2018.

For an overview of all calibration parameters of interest, see the appendix 19.2. Tables 4 and 5 show the THL and THS DAC values 1 used for the Septemboard detector at CAST during the data taking campaigns from October 2017 to March 2018 (Run-2) and October 2018 to December 2018 (Run-3), respectively. The other DACs were set to the same values for all chips in both data taking campaigns with the detector, shown in tab. 6.

Table 4: The THL and THS DAC values for each of the chips of the Septemboard (board H) detector used at CAST for the data taking campaign from October 2017 to March 2018 (Run-2).
| DAC | E6 W69 | K6 W69 | H9 W69 | H10 W69 | G10 W69 | D9 W69 | L8 W69 |
|-----|--------|--------|--------|---------|---------|--------|--------|
| THL | 435    | 435    | 405    | 450     | 450     | 400    | 470    |
| THS | 66     | 69     | 66     | 64      | 66      | 65     | 66     |
Table 5: The THL and THS DAC values for each of the chips of the Septemboard (board H) detector used at CAST for the data taking campaign from October 2018 to December 2018 (Run-3).
| DAC | E6 W69 | K6 W69 | H9 W69 | H10 W69 | G10 W69 | D9 W69 | L8 W69 |
|-----|--------|--------|--------|---------|---------|--------|--------|
| THL | 419    | 386    | 369    | 434     | 439     | 402    | 462    |
| THS | 68     | 66     | 66     | 65      | 69      | 65     | 64     |
Table 6: DAC values and settings that are common between data taking periods and all chips of the Septemboard for CAST, from the fsr configuration file.
| DAC         | Value      |
|-------------|------------|
| IKrum       | 20         |
| Hist        | 0          |
| GND         | 80         |
| Coarse      | 7          |
| CTPR        | 4294967295 |
| BiasLVDS    | 128        |
| SenseDAC    | 1          |
| DACCode     | 6          |
| RefLVDS     | 128        |
| Vcas        | 130        |
| ExtDAC      | 0          |
| Disc        | 127        |
| Preamp      | 255        |
| FBK         | 128        |
| BuffAnalogA | 127        |
| BuffAnalogB | 127        |
8.1.3.1. Generate the FSR tables for all chips and each run period    extended

Let's generate the tables of the FSR DAC values for all chips for each of the different run periods.

We will use the ChipCalibrations directory that is part of the TimepixAnalysis repository in the resources directory.

Further, we will create a plot of all pixels showing the noise peak in THL values based on the optimized equalization.

import std / [sequtils, strutils, os, tables, algorithm]

const path = "/home/basti/CastData/ExternCode/TimepixAnalysis/resources/ChipCalibrations/"
const periods = ["Run2", "Run3"]
const chipInfo = "chipInfo.txt"
const thrMean = "thresholdMeans$#.txt"
const chips = toSeq(0 .. 6)

import ggplotnim
proc readThresholdMeans(path: string, chip: int): DataFrame =
  echo path / thrMean
  result = readCsv(path / (thrMean % $chip), sep = '\t', colNames = @["x", "y", "min", "max", "bit", "opt"])
    .select("opt")
    .rename(f{"THL" <- "opt"})
    .mutate(f{"chip" <- chip})

# parse the names of the chips from the run info file
var df = newDataFrame()    
for period in periods:
  var header = @["DAC"]
  var tab = initTable[int, seq[(string, int)]]()
  var dfPeriod = newDataFrame()
  for chip in chips:
    proc toTuple(s: seq[seq[string]]): seq[(string, int)] =
      for x in s:
        doAssert x.len == 2
        result.add (x[0], x[1].parseInt)
    let chipPath = path / period / "chip" & $chip
    let data = readFile(chipPath / "fsr" & $chip & ".txt")
      .splitLines
      .filterIt(it.len > 0)
      .mapIt(it.split)
      .toTuple()

    tab[chip] = data

    # read chip name and add to header
    proc readChipName(chip: int): string =
      result = readFile(chipPath / chipInfo)
        .splitLines()[0] 
      result.removePrefix("chipName: ")
    header.add readChipName(chip)

    dfPeriod.add readThresholdMeans(chipPath, chip)
  dfPeriod["Run"] = period
  df.add dfPeriod
  
  proc invertTable(tab: Table[int, seq[(string, int)]]): Table[string, seq[(int, int)]] =
    result = initTable[string, seq[(int, int)]]()
    for chip, data in pairs(tab):
      for (dac, value) in data:
        if dac notin result:
          result[dac] = newSeq[(int, int)]()
        result[dac].add (chip, value)

  proc wrap(s: string): string = "|" & s & "|\n"
  proc toOrgTable(s: seq[seq[string]], header: seq[string]): string =
    let tabLine = wrap toSeq(0 ..< header.len).mapIt("------").join("|")
    result = tabLine
    result.add wrap(header.join("|"))
    result.add tabLine
    for x in s:
      doAssert x.len == header.len
      result.add wrap(x.join("|"))
    result.add tabLine

  proc toOrgTable(tab: Table[string, seq[(int, int)]],
                  header: seq[string]): string =
    var commonDacs = newSeq[seq[string]]()
    var diffDacs = newSeq[seq[string]]()
    for (dac, row) in pairs(tab):
      var fullRow = @[dac]
      let toAdd = row.sortedByIt(it[0]).mapIt($it[1])
      if toAdd.deduplicate.len > 1:
        fullRow.add toAdd
        diffDacs.add fullRow 
      else:
        commonDacs.add @[dac, toAdd.deduplicate[0]]
    result.add toOrgTable(diffDacs, header)
    result.add "\n\n"
    result.add toOrgTable(commonDacs, @["DAC", "Value"])
    # now add common dacs table
  echo "Run: ", period
  echo tab.invertTable.toOrgTable(header)

echo df["THL", float].min  
ggplot(df.filter(f{`THL` > 100}), aes("THL", fill = factor("chip"))) +
  facet_wrap("Run") + 
  geom_histogram(binWidth = 1.0, position = "identity", alpha = 0.7, hdKind = hdOutline) +
  ggtitle("Optimized THL distribution of the noise peak for each chip") +
  ylab(r"\# pixels", margin = 2.0) +
  facetHeaderText(font = font(12.0, alignKind = taCenter)) +
  themeLatex(fWidth = 0.9, width = 600, baseTheme = singlePlot) +
  scale_x_continuous(breaks = 8) + 
  margin(left = 3.0, right = 3.5) + 
  ggsave("/home/basti/phd/Figs/detector/calibration/septemboard_all_thl_optimized.pdf",
         useTeX = true, standalone = true,
         width = 1000, height = 600)

Run: Run2

| DAC | E6 W69 | K6 W69 | H9 W69 | H10 W69 | G10 W69 | D9 W69 | L8 W69 |
|-----|--------|--------|--------|---------|---------|--------|--------|
| THL | 435    | 435    | 405    | 450     | 450     | 400    | 470    |
| THS | 66     | 69     | 66     | 64      | 66      | 65     | 66     |

| DAC         | Value      |
|-------------|------------|
| IKrum       | 20         |
| Hist        | 0          |
| GND         | 80         |
| Coarse      | 7          |
| CTPR        | 4294967295 |
| BiasLVDS    | 128        |
| SenseDAC    | 1          |
| DACCode     | 6          |
| RefLVDS     | 128        |
| Vcas        | 130        |
| ExtDAC      | 0          |
| Disc        | 127        |
| Preamp      | 255        |
| FBK         | 128        |
| BuffAnalogA | 127        |
| BuffAnalogB | 127        |

Run: Run3

| DAC | E6 W69 | K6 W69 | H9 W69 | H10 W69 | G10 W69 | D9 W69 | L8 W69 |
|-----|--------|--------|--------|---------|---------|--------|--------|
| THL | 419    | 386    | 369    | 434     | 439     | 402    | 462    |
| THS | 68     | 66     | 66     | 65      | 69      | 65     | 64     |

| DAC         | Value      |
|-------------|------------|
| IKrum       | 20         |
| Hist        | 0          |
| GND         | 80         |
| Coarse      | 7          |
| CTPR        | 4294967295 |
| BiasLVDS    | 128        |
| SenseDAC    | 1          |
| DACCode     | 6          |
| RefLVDS     | 128        |
| Vcas        | 130        |
| ExtDAC      | 0          |
| Disc        | 127        |
| Preamp      | 255        |
| FBK         | 128        |
| BuffAnalogA | 127        |
| BuffAnalogB | 127        |

8.1.4. High voltage

The high voltage (HV) settings used for the Septemboard detector are shown in tab. 7. The target is a drift field on the order of \(\SI{500}{V.cm⁻¹}\) and an amplification field of about \(\SI{60}{kV.cm⁻¹}\). The main voltages to choose are the grid voltage (which determines the amplification field) and the cathode voltage (which determines the drift field). The other voltages are computed based on a constant field gradient. The entries 'ring 1' and 'ring 29' are the voltages applied to the field shaping rings running around the detector volume to achieve a more homogeneous field. The HV for the Septemboard is controlled via an iseg 2 HV module, while the veto scintillator (requiring positive high voltage) is controlled via a CAEN N470 3.

Table 7: Table of high voltages in use for the InGrid Mk. IV. Note that the veto scintillator is not controlled via the iseg module, but by a CAEN N470.
| Description | Channel | Voltage / V | TripCurrent / mA |
|-------------|---------|-------------|------------------|
| grid        | 0       | -300        | 0.050            |
| anode       | 1       | -375        | 0.050            |
| cathode     | 2       | -1875       | 0.050            |
| ring 1      | 3       | -415        | 0.100            |
| ring 29     | 4       | -1830       | 0.100            |
| veto scinti | 5       | +1200       | 2                |
| SiPM        | 6       | -65.6       | 0.05             |
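As a rough cross check of these target fields, a short sketch using the unchained package, assuming a drift distance of \(\SI{3}{cm}\) and a \(\SI{50}{μm}\) gap between grid and readout (numbers not stated in this section):

import unchained
# Drift field: cathode to grid potential difference over the (assumed) 3 cm drift distance
echo (1875.V - 300.V) / 3.cm  # 525 V•cm⁻¹, i.e. on the order of 500 V/cm
# Amplification field: grid voltage over the (assumed) 50 μm grid–readout gap
echo 300.V / 50.μm            # 6 V•μm⁻¹ = 60 kV•cm⁻¹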

8.2. FADC calibration

The FADC requires attention to multiple aspects. First, settings must be chosen that configure the operating characteristics, the data readout and the trigger threshold. Next, the Ortec 474 shaping amplifier has multiple settings. Finally, in order to interpret the signals received from the FADC, a so-called "pedestal run" should be recorded.

FADC settings
The FADC settings – more details in the configuration file explanation of appendix section 17.2.2 – configure the FADC to run at a frequency of \(\SI{1}{GHz}\), so as to cover a running time interval of \(\SI{2.56}{μs}\) in each channel. While it could run at up to \(\SI{2}{GHz}\), \(\SI{1}{ns}\) per time bin is accurate enough given the time scales associated with the amplifier (see below) and a longer interval is more useful. Further, an external trigger is sent out to TOS if the threshold is exceeded. The threshold itself is set to \(\SI{-40}{mV}\) 4. The value was chosen by trial and error to avoid triggers caused by baseline noise. Given the range of up to \(\SI{-1}{V}\), the relative threshold is rather low. Finally, the operation mode and data readout are set, the input channel is chosen and the pedestal run parameters are configured (see below).
Amplifier settings
The Ortec 474 shaping amplifier has three settings of note: the absolute gain (a multiplier), the signal integration time and the differentiation time. The initial settings were an amplification of 6x, an integration time of \(\SI{50}{ns}\) and a differentiation time of \(\SI{50}{ns}\). However, these were changed during the data taking campaign, see more on this in section 11.4.1.
Pedestals
The \(4 · \num{2560}\) registers of the FADC are part of 4 separate cyclic registers. Due to hardware implementation details, the absolute recorded value of each register is arbitrary. In a pedestal run, multiple measurements (\(\mathcal{O}(10)\)) of a certain length (\(\SI{100}{ms}\) in our case) are performed and the recorded values averaged. The result represents the typical baseline value of each register, hence the name 'pedestal'. To interpret a real measured signal, these pedestals are subtracted from the recorded signal (see the sketch below). Each of the 4 FADC channels may have very different pedestal values, but within a single channel they usually lie within \(\lesssim\num{50}\) ADC values of each other. Fig. 3 shows the pedestals of all 4 FADC channels as they were recorded before CAST data taking started. The pedestals drift over time, but the associated time scales are long. Alternatively, the pedestals can be computed from real data via a truncated mean in each register, which we discuss later in sec. 9.5.1.

More details to each of these will be given later where it is of importance.
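To make the pedestal subtraction concrete, a minimal sketch (our own helper; the actual TimepixAnalysis routines appear in the extended sections below):

proc subtractPedestals(raw, pedestals: seq[uint16]): seq[float] =
  ## Subtracts the per-register pedestal values from a raw FADC event.
  doAssert raw.len == pedestals.len # 4 · 2560 registers
  result = newSeq[float](raw.len)
  for i in 0 ..< raw.len:
    result[i] = raw[i].float - pedestals[i].float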

fadc_pedestal_split_by_channel.svg
Figure 3: The FADC pedestal run used initially for CAST data, split by each of the 4 FADC channels. Each channel's pedestals vary by about \(\mathcal{O}(30)\) ADC values. The first few registers of each channel are not shown, as they are outliers deviating by \(\sim\num{100}\).

8.2.1. Generate plot of pedestals    extended

import std / [strutils, os, sequtils, algorithm]
import ggplotnim

const path = "/home/basti/CastData/ExternCode/TimepixAnalysis/resources/pedestalRuns/"
const file = "pedestalRun000042_1_182143774.txt-fadc"
# read the FADC files using our CSV parser. Everything `#` is header
# aside from the last 3 lines. Skip those using `maxLines`
var df = readCsv(path / file, header = "#", maxLines = 10240)
  .rename(f{"val" <- "nb of channels: 0"})
# generate indices 0, 0, 0, 0, 1, 1, 1, 1, ..., 2559, 2559, 2559, 2559 
df["Register"] = toSeq(0 ..< 2560).repeat(4).concat.sorted
df["Channel"] = @[1, 2, 3, 4].repeat(2560).concat
when false:
  df["Channel"] = @[1, 2, 3, 4].repeat(2560)
  df = df.mutate(f{int -> bool: "even?" ~ `Channel` mod 2 == 0})
  echo df
  echo df.tail(20)
  ggplot(df, aes("Channel", "val", color = "even?")) +
    geom_point(size = 1.0) +
    ggtitle("FADC channel values of pedestal run") +  
    ggsave("/home/basti/phd/Figs/detector/calibration/fadc_pedestal_run.pdf")
  #         useTeX = true, standalone = true)
  ggplot(df.group_by("even?").filter(f{float -> bool: `val` < percentile(col("val"), 95)}),
         aes("Channel", "val", color = "even?")) +
    facet_wrap("even?", scales = "free") +
    facetMargin(0.5) +
    margin(bottom = 1.0, right = 3.0) +
    geom_point(size = 1.0) +
    legendPosition(0.91, 0.0) +
    ggtitle("FADC channel values of pedestal run, split by even and odd channel numbers") +
    ggsave("/home/basti/phd/Figs/detector/calibration/fadc_pedestal_split_even_odd.pdf", width = 1200, height = 600)#
  #         useTeX = true, standalone = true)
else:
  ggplot(df.group_by("Channel").filter(f{float -> bool: `val` < percentile(col("val"), 99)}),
         aes("Register", "val", color = "Channel")) +
    facet_wrap("Channel", scales = "free") +
    geom_point(size = 1.0) +
    facetMargin(0.7) +
    margin(bottom = 1.0, right = 2.0) +
    scale_y_continuous(breaks = 5) +
    legendPosition(0.87, 0.0) +
    ylab("Pedestal value [ADC]") +
    ggtitle("FADC register pedestal values, split by channels") +
    xlab("Register", margin = 0.0) +
    themeLatex(fWidth = 1.0, width = 600, baseTheme = singlePlot) + 
    ggsave("/home/basti/phd/Figs/detector/calibration/fadc_pedestal_split_by_channel.pdf",
          useTeX = true, standalone = true)

8.2.2. Calculate pedestals from real FADC data    extended

We will now see what happens if we compute the FADC pedestals from the raw data by computing a truncated mean of all FADC files in a single run and comparing to the real pedestal run we normally use as a reference.

UPDATE: <2022-12-26 Mon 00:48> This was a big success. We will use this in our real data from now on, move this to statusAndProgress and rewrite the section above with that in mind, i.e. explain how to calc pedestals.

  • [ ] REWRITE MAIN TEXT
  • [ ] TEST PEDESTAL CALC FOR BACKGROUND DATA
  • [ ] look at c-blake's ideas using `MovingStat`, a logarithmic histogram etc. to avoid the multiple passes over the data as we do using `sort`! See my journal.org notes about this!
import std / [strutils, os, sequtils, sugar, algorithm]
import ggplotnim

# to read from H5 input
import nimhdf5
import ingrid / tos_helpers
import ingrid / ingrid_types

proc readFadc(f: string): DataFrame =
  # read the FADC files using our CSV parser. Everything `#` is header
  # aside from the last 3 lines. Skip those using `maxLines`
  result = readCsv(f, header = "#", maxLines = 10240)
    .rename(f{"val" <- "nb of channels: 0"})
  #result["Channel"] = toSeq(0 ..< result.len)
  result["Register"] = toSeq(0 ..< 2560).repeat(4).concat.sorted
  result["Channel"] = @[1, 2, 3, 4].repeat(2560).concat

const Size = 5000

proc getFadcDset(h5f: H5File, runNumber: int): H5DataSet =
  let fadcGroup = fadcRawPath(runNumber)
  doAssert fadcGroup in h5f
  let group = h5f[fadcGroup.grp_str]
  result = h5f[(group.name / "raw_fadc").dset_str]
  
proc readChannel(h5f: H5File, dset: H5DataSet, start: int): seq[uint16] =
  let toRead = min(Size, dset.shape[0] - start)
  result = read_hyperslab(dset, uint16,
                          offset = @[start, 0],
                          count = @[toRead, dset.shape[1]])
  
import weave
proc percIdx(q: float, len: int): int = (len.float * q).round.int

proc biasedTruncMean*[T](x: Tensor[T], axis: int, qLow, qHigh: float): Tensor[float] =
  ## Computes the *biased* truncated mean of `x` by removing the quantiles `qLow` on the
  ## bottom end and `qHigh` on the upper end of the data. Both should be given as
  ## a fraction of events to remove on the respective end.
  ## E.g. `qLow = 0.05, qHigh = 0.99` removes anything below the 5-th percentile and above the 99-th.
  ##
  ## Note: uses `weave` internally to multithread along the desired axis!
  doAssert x.rank == 2
  result = newTensorUninit[float](x.shape[axis])
  init(Weave)
  let xBuf = x.toUnsafeView()
  let resBuf = result.toUnsafeView()
  let notAxis = if axis == 0: 1 else: 0
  let numH = x.shape[notAxis] # assuming row column major, 0 is # rows, 1 is # cols
  let numW = x.shape[axis]
  parallelFor i in 0 ..< numW:
    captures: {xBuf, resBuf, numH, numW, axis, qLow, qHigh}
    let xT = xBuf.fromBuffer(numH, numW)
    # get a sorted slice for index `i`
    let subSorted = xT.atAxisIndex(axis, i).squeeze.sorted
    let plow = percIdx(qLow, numH) 
    let phih = percIdx(qHigh, numH)

    var resT = resBuf.fromBuffer(numW)
    ## compute the biased truncated mean by slicing sorted data to lower and upper
    ## percentile index
    var red = 0.0 
    for j in max(0, plow) ..< min(numH, phih): # loop manually as data is `uint16` to convert
      red += subSorted[j].float
    resT[i] = red / (phih - plow).float
  syncRoot(Weave)
  exit(Weave)

defColumn(uint16, uint8)  
proc readFadcH5(f: string, runNumber: int): DataFrame = #seq[DataTable[colType(uint16, uint8)]] =
  let h5f = H5open(f, "r")
  let registers = toSeq(0 ..< 2560).repeat(4).concat.sorted
  let channels = @[1, 2, 3, 4].repeat(2560).concat
  let idxs = arange(3, 2560, 4)
  ## XXX: maybe just read the hyperslab that corresponds to a single channel over
  ## the whole run? That's the whole point of those after all.
  ##  -> That is way too slow unfortunately
  ## XXX: better replace logic by going row wise N elements instead of column wise.
  ## Has the advantage that our memory requirements are constant and not dependent
  ## on the number of elements in the run. If we then average over the resulting N
  ## pedestals, it should be fine.
  let dset = getFadcDset(h5f, runNumber)
  var val = newTensor[float](2560 * 4)
  when true:
    var slices = 0
    for i in arange(0, dset.shape[0], Size):
      # read 
      let data = readChannel(h5f, dset, i)
      let toRead = min(Size, dset.shape[0] - i)
      echo "Reading..."
      let dT = data.toTensor.reshape([toRead, data.len div toRead])
      echo "Read ", i, " to read up to : ", toRead, " now processing..."
      inc slices
      val += biasedTruncMean(dT, axis = 1, qLow = 0.2, qHigh = 0.98)
    for i in 0 ..< 2560 * 4:
      val[i] /= slices.float
  result = toDf({"Channel" : channels, val, "Register" : registers})
  #for fadc in h5f.iterFadcFromH5(runNumber):
  #  let datCh3 = fadc.data[idxs] # .mapIt(it.int)
  #  var df = toDf({"val" : dat, "Register" : registers, "Channel" : channels})
  #  result.add df

## Main function to avoid bug of closure capturing old variable  
proc readFadcData(path: string, runNumber: int): DataFrame =
  var df = newDataFrame()
  if path.endsWith(".h5"):
    doAssert runNumber > 0
    df = readFadcH5(path, runNumber)
  else:
    var dfs = newSeq[DataFrame]()
    var i = 0
    for f in walkFiles(path / "*.txt-fadc"):
      echo "Parsing ", f
      dfs.add readFadc(f)
      inc i
      if i > 5000: break
    let dfP = assignStack(dfs)
    var reg = newSeq[int]()
    var val = newSeq[float]()
    var chs = newSeq[int]()
    for (tup, subDf) in groups(dfP.group_by(["Channel", "Register"])):
      echo "At ", tup
      let p20 = percentile(subDf["val", float], 20)
      let p80 = percentile(subDf["val", float], 80)
      reg.add tup[1][1].toInt
      chs.add tup[0][1].toInt
      let dfF = if p80 - p20 > 0: subDf.filter(dfFn(subDf, f{float: `val` >= p20 and `val` <= p80}))
                else: subDf
      val.add dfF["val", float].mean
    df = toDf({"Channel" : chs, val, "Register" : reg})
  df.writeCsv("/t/pedestal.csv")
  echo df.pretty(-1)
  result = df

proc main(path: string, outfile: string = "/t/empirical_fadc_pedestal_diff.pdf",
          plotVoltage = false,
          runNumber = -1,
          onlyCsv = false
         ) =
  let pData = readFadcData(path, runNumber)
  if onlyCsv: return
  # distinct names required, as redefining `path` would clash with the parameter
  const pedestalPath = "/home/basti/CastData/ExternCode/TimepixAnalysis/resources/pedestalRuns/"
  const pedestalFile = "pedestalRun000042_1_182143774.txt-fadc"
  let pReal = readFadc(pedestalPath / pedestalFile)
  echo "DATA= ", pData
  echo "REAL= ", pReal
  var df = bind_rows([("Data", pData), ("Real", pReal)], "ID")
    .spread("ID", "val")
    .mutate(f{"Diff" ~ abs(`Data` - `Real`)})
    # alternatively compute the voltage corresponding to the FADC register values,
    # assuming 12 bit working mode (sampling_mode == 0)
    .mutate(f{"DiffVolt" ~ `Diff` / 2048.0})
  var plt: GgPlot
  if plotVoltage:
    plt = ggplot(df, aes("Register", "DiffVolt", color = "Channel")) +
      ylim(0, 100.0 / 2048.0)    
  else:
    plt = ggplot(df, aes("Register", "Diff", color = "Channel")) +
      ylim(0, 100)
  plt +
    geom_point(size = 1.5, alpha = 0.75) +
    ylab("Difference between mean and actual pedestal [ADC]") + 
    ggtitle("Attempt at computing pedestal values based on truncated mean of data") +
    margin(top = 2) +
    xlim(0, 2560) + 
    ggsave(outfile)
  
when isMainModule:
  import cligen
  dispatch main

We can call it on different runs with FADC data:

code/attempt_fadc_pedestals_from_data \
    -p ~/CastData/data/2017/Run_96_171123-10-42 \
    --outfile ~/phd/Figs/detector/calibration/pedestal_from_data_compare_real_run96.pdf
# alternatively, for run 77:
#    -p /mnt/1TB/CAST/2017/DataRuns/Run_77_171102-05-24 \
#    --outfile ~/phd/Figs/detector/calibration/pedestal_from_data_compare_real_run77.pdf

for an early 2017 run

code/attempt_fadc_pedestals_from_data \
    -p /mnt/1TB/CAST/2018_2/DataRuns/Run_303_181217-16-52 \
    --outfile ~/phd/Figs/detector/calibration/pedestal_from_data_compare_real_run303.pdf

for a late 2018 run.

These yield fig. 4

Figure 4(a): Run 77
Figure 4(b): Run 303
Figure 4: Calculation of the FADC pedestals from data by averaging all channels over all FADC data files of a single run using a truncated mean from the 20-th to 80-th percentile of the distribution. Both data runs show a comparatively small variance. Arguably it may make sense to \textit{always} compute it based on each run instead of relying on a pedestal run though.

Surprisingly, the deviation for the late 2018 run is lower than for the 2017 run, despite the 2017 run being closer in time to the real pedestal run.

Keep in mind that in our data taking we used the 12 bit readout mode. This means the register values divided by \num{2048} correspond to the voltages recorded by the register. As such, the absolute deviations correspond to less than \(\SI{48.8}{mV}\) (the \num{100} ADC limit of the figure), which is small given the absolute range of \(±\SI{1}{V}\) of the FADC.
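As a quick check of that number (a small helper of our own; 12 bit mode as stated above):

# In 12 bit sampling mode the register value divided by 2048 gives the voltage in V.
proc adcToVolt(adc: float): float = adc / 2048.0
echo adcToVolt(100.0) * 1000.0, " mV" # ≈ 48.8 mV for a 100 ADC deviation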

import nimhdf5, ggplotnim
import std / [strutils, os, sequtils]
import ingrid / [tos_helpers, fadc_helpers, ingrid_types, fadc_analysis]

proc stripPrefix(s, p: string): string =
  result = s
  result.removePrefix(p)

proc plotIdx(df: DataFrame, fadcData: Tensor[float], idx: int) =
  let xmin = df["argMinval", int][idx]
  let xminY = df["minvals", float][idx]
  let xminlineX = @[xmin, xmin] # one point for x of min, max
  let fData = fadcData[idx, _].squeeze
  let xminlineY = linspace(fData.min, fData.max, 2)

  let riseStart = df["riseStart", int][idx]
  let fallStop = df["fallStop", int][idx]
  let riseStartX = @[riseStart, riseStart]
  let fallStopX = @[fallStop, fallStop]
  let baseline = df["baseline", float][idx]  
  let baselineY = @[baseline, baseline]
  
  let df = toDf({ "x"         : toSeq(0 ..< 2560),
                  "baseline"  : baseline,
                  "data"      : fData,
                  "xminX"     : xminlineX, 
                  "xminY"     : xminlineY,
                  "riseStart" : riseStartX,
                  "fallStop"  : fallStopX })
                  # Comparison has to be done by hand unfortunately
  let path = "/t/fadc_spectrum_baseline.pdf"
  ggplot(df, aes("x", "data")) +
    geom_line() +
    geom_point(color = color(0.1, 0.1, 0.1, 0.1)) +
    geom_line(data = df.head(2), aes = aes("x", "baseline"),
                     color = "blue") +
    geom_line(data = df.head(2), aes = aes("xminX", "xminY"),
                     color = "red") +
    geom_line(data = df.head(2), aes = aes("riseStart", "xminY"),
                     color = "green") +
    geom_line(data = df.head(2), aes = aes("fallStop", "xminY"),
                     color = "pink") +
    ggsave(path)

proc plotCompare(data, real: Tensor[float]) =
  let path = "/t/fadc_spectrum_compare.pdf"
  for idx in 0 ..< data.shape[0]:
    let df = toDf({ "x" : toSeq(0 ..< 2560),
                    "data" : data[idx, _].squeeze,
                    "real" : real[idx, _].squeeze })
      .gather(["data", "real"], "type", "vals")
    ggplot(df, aes("x", "vals", color = "type")) +
      geom_line() +
      #geom_point(color = color(0.1, 0.1, 0.1, 0.1)) +
      ggsave(path)
    sleep(1000)

proc getFadcData(fadcRun: ProcessedFadcRun, pedestal: seq[uint16]): Tensor[float] =
  let
    fadc_ch0_indices = getCh0Indices()
    # we demand at least 4 dips, before we can consider an event as noisy
    n_dips = 4
    # the percentile considered for the calculation of the minimum
    min_percentile = 0.95
    numFiles = fadcRun.eventNumber.len
  
  var fData = ReconstructedFadcRun(
    fadc_data: newTensorUninit[float]([numFiles, 2560]),
    eventNumber: fadcRun.eventNumber,
    noisy: newSeq[int](numFiles),
    minVals: newSeq[float](numFiles)
  )

  for i in 0 ..< fadcRun.eventNumber.len:
    let slice = fadcRun.rawFadcData[i, _].squeeze
    let data = slice.fadcFileToFadcData(
      pedestal,
      fadcRun.trigRecs[i], fadcRun.settings.postTrig, fadcRun.settings.bitMode14,
      fadc_ch0_indices).data
    fData.fadc_data[i, _] = data.unsqueeze(axis = 0)
    fData.noisy[i]        = data.isFadcFileNoisy(n_dips)
    fData.minVals[i]      = data.calcMinOfPulse(min_percentile)
  when false:
    let data = fData.fadcData.toSeq2D
    let (baseline, xMin, riseStart, fallStop, riseTime, fallTime) = calcRiseAndFallTime(
      data, 
      false
    )
    let df = toDf({ "argMinval" : xMin.mapIt(it.float),
                    "baseline" : baseline.mapIt(it.float),
                    "riseStart" : riseStart.mapIt(it.float),
                    "fallStop" : fallStop.mapIt(it.float),
                    "riseTime" : riseTime.mapIt(it.float),
                    "fallTime" : fallTime.mapIt(it.float),
                    "minvals" : fData.minvals })
    for idx in 0 ..< df.len:
      plotIdx(df, fData.fadc_data, idx)
      sleep(1000)

  result = fData.fadcData

proc readFadc(f: string): DataFrame =
  # read the FADC files using our CSV parser. Everything `#` is header
  # aside from the last 3 lines. Skip those using `maxLines`
  result = readCsv(f, header = "#", maxLines = 10240)
    .rename(f{"val" <- "nb of channels: 0"})
  #result["Channel"] = toSeq(0 ..< result.len)
  result["Register"] = toSeq(0 ..< 2560).repeat(4).concat.sorted
  result["Channel"] = @[1, 2, 3, 4].repeat(2560).concat

proc main(fname: string, runNumber: int) =
  var h5f = H5open(fname, "r")
  let pedestal = readCsv("/t/pedestal.csv") # created from above
    .arrange(["Register", "Channel"])
  echo pedestal
  const path = "/home/basti/CastData/ExternCode/TimepixAnalysis/resources/pedestalRuns/"
  const file = "pedestalRun000042_1_182143774.txt-fadc"
  let pReal = readFadc(path / file)
  
  let fileInfo = h5f.getFileInfo()
  for run in fileInfo.runs:
    if run == runNumber:
      let fadcRun = h5f.readFadcFromH5(run)
      let fromData = fadcRun.getFadcData(pedestal["val", uint16].toSeq1D)
      let fromReal = fadcRun.getFadcData(pReal["val", uint16].toSeq1D)

      plotCompare(fromData, fromReal)
      
when isMainModule:
  import cligen
  dispatch main
  • [ ] SHOW PLOT OF THE ABOVE (plotCompare) AS EXAMPLE OF BAD PEDESTALS VS GOOD PEDESTALS? -> Would be nice for extended thesis!

8.3. Scintillator calibration

The final piece of the detector requiring calibration are the two scintillators. As both are only used as digital triggers and no analogue signal information is recorded, a suitable discriminator threshold voltage has to be set for each.

8.3.1. Large scintillator paddle

The large veto scintillator paddle was calibrated at the RD51 laboratory at CERN prior to the data taking campaign, in March 2017. Using two smaller, calibrated scintillators to form a three-fold coincidence setup, measurements were taken at different thresholds, each \(\SI{10}{\minute}\) long. An amplifier was installed after the PMT to increase the signal.

The Canberra 2007 base and the Bicron Corp. 31.49x15.74M2BC408/2-X PMT require a positive high voltage; it was supplied with \(\SI{1200}{V}\). Table 8 shows the recorded measurements. Based on these, a threshold of \(\SI{-110}{mV}\) was chosen for the CAST data taking. Fig. 5 also shows the data from the table. While the coincidence counts at \(\SI{-110}{mV}\) are lower than on the visible plateau starting at \(\SI{-100}{mV}\), this threshold was chosen because the raw counts were still considered too high compared to the expectation based on the cosmic muon rate and the size of the scintillator. 5

Table 8: Measurements for the calibration of the large veto scintillator taken at RD51 at CERN with two smaller, calibrated scintillators in a coincidence. Each measurement was \SI{10}{min}. The thresholds set on the discriminator for the veto scintillator were originally measured with a 10x scaling and have been rescaled here to their correct values.
| Threshold / mV | Counts Szinti | Counts Coincidence |
|----------------|---------------|--------------------|
| -59.8          | 31221         | 634                |
| -70.0          | 30132         | 674                |
| -80.4          | 28893         | 635                |
| -90.3          | 28076         | 644                |
| -100.5         | 27012         | 684                |
| -110.3         | 25259         | 566                |
| -120.0         | 22483         | 495                |
| -130.3         | 19314         | 437                |
| -140.3         | 16392         | 356                |
| -150.5         | 13677         | 312                |
| -160.0         | 11866         | 267                |
| -170.1         | 10008         | 243                |
veto_scintillator_calibration_coinc_rd51.svg
Figure 5: Calibration measurements for the veto scintillator printed in table 8. The line simply connects the data points. The error bars represent Poisson-like \(\sqrt{N}\) uncertainties.
8.3.1.1. Generate the plots of the scintillator calibration data    extended
import ggplotnim, sequtils, math
# data of tab. 8; originally passed in as an org table `tbl`
let df = toDf({ "Thr" : @[-59.8, -70.0, -80.4, -90.3, -100.5, -110.3,
                          -120.0, -130.3, -140.3, -150.5, -160.0, -170.1],
                "Szinti" : @[31221, 30132, 28893, 28076, 27012, 25259,
                             22483, 19314, 16392, 13677, 11866, 10008],
                "Coinc" : @[634, 674, 635, 644, 684, 566,
                            495, 437, 356, 312, 267, 243] })
  .mutate(f{"SzintiErr" ~ sqrt(`Szinti`)},
          f{"CoincErr" ~ sqrt(`Coinc`)})
## XXX: `ebLinesT` is broken?!  
ggplot(df, aes("Thr", "Szinti")) +
  geom_point() + geom_line() +
  geom_errorbar(aes = aes(yMin = f{`Szinti` - `SzintiErr`},
                          yMax = f{`Szinti` + `SzintiErr`}),
                errorBarKind = ebLines,
                color = parseHex("FF00FF")) + 
  xlab(r"Threshold [\si{mV}]") + ylab(r"Counts [\#]") +
  ggtitle(r"Calibration measurements of \SI{10}{min} each") + 
  ggsave("/home/basti/phd/Figs/detector/calibration/veto_scintillator_calibration_rd51.pdf",
         useTeX = true, standalone = true)

ggplot(df, aes("Thr", "Coinc")) +
  geom_point() + geom_line() +
  geom_errorbar(aes = aes(yMin = f{`Coinc` - `CoincErr`},
                          yMax = f{`Coinc` + `CoincErr`}),
                errorBarKind = ebLines) +   
  xlab(r"Threshold [\si{mV}]") + ylab(r"Counts [\#]") +
  ggtitle(r"Calibration measurements of \SI{10}{min} each in 3-way coincidence") +
  themeLatex(fWidth = 0.9, width = 600, baseTheme = singlePlot) + 
  ggsave("/home/basti/phd/Figs/detector/calibration/veto_scintillator_calibration_coinc_rd51.pdf",
         useTeX = true, standalone = true)
8.3.1.2. Notes taken of calibration before CAST data taking    extended

See the appendix 19.3 for a reproduction of the notes taken during the calibration of the veto paddle scintillator.

8.3.1.3. Raw scintillator data    extended
veto_scintillator_calibration_rd51.svg
Figure 6: Calibration measurements for the veto scintillator printed in table 8. In this case the raw data is shown instead of the coincidence counts. The line simply connects the data points. The error bars are colored so that they are visible at all.

Looking at the raw counts in particular, in hindsight I would now probably choose a threshold closer to \SI{-90}{mV} or \SI{-95}{mV}. But well.

8.3.1.4. Calculate expected rate    extended

Let's compute the expected rate based on a mean cosmic muon rate at the surface and the area of the scintillator.

  • [X] NOTE: <2023-03-14 Tue 11:49> The talk about the veto system of the SDD detector at the IAXO collaboration meeting March 2023 had the following number for the sea level muon rate: 0.017 Hz•cm⁻² -> Ah, this is close to ~1 cm⁻²•min⁻¹!

    import unchained
    let rate = 0.017.Hz•cm⁻²
    echo "Rate in ", rate.toDef(min⁻¹•cm⁻²)
    

The scintillator has a size of 31.49 x 15.74 inches and the mean cosmic muon rate at sea level is roughly \(\SI{1}{cm⁻².min⁻¹}\). The measurement time was \(\SI{600}{s}\).

import unchained
let rate = 1.cm⁻²•min⁻¹
let area = 31.49.inch * 15.74.inch
let time = 600.s
echo "Expected rate: ", time * rate * area

So about 32,000 counts in the 10 min time frame. It's a bit frustrating that for some reason during that calibration we assumed a muon rate of 100 Hz m⁻² sr⁻¹, so that we only got an expected number of counts of about 20,000.

If we assume the 100 Hz m⁻² sr⁻¹ number and integrate only over \(θ\) (not \(φ\) as we should also!) using the \(\cos² θ\) dependence we get a more comparable number:

import unchained
let integral = 1.5708.sr # ∫_{-π/2}^{π/2} cos²(θ) dθ = π/2 ≈ 1.5708
let rate = 100.Hz•m⁻²•sr⁻¹
let time = 600.s
let area = 31.49.inch * 15.74.inch
echo rate * time * area * integral

In any case, it seems our assumption of 20,000 counts, as seen in appendix 19.3, was clearly flawed and led to a possibly too large (in magnitude) threshold for the discriminator.

In addition: why the hell did I not take note of the size of the scintillators that Theodoros gave us? That would be a much better cross check for the expected rate. It is certainly possible that our choice of \(\SI{-110}{mV}\) was actually due to an expected coincidence rate matching at that threshold given the sizes of the calibrated scintillators, but I can't verify that anymore.

8.3.2. SiPM

The SiPM was calibrated during the bachelor thesis of Jannes Schmitz in 2016 (Schmitz 2017) based on a coincidence measurement with calibrated scintillators.

Footnotes:

1

The THL DAC is the global threshold DAC of all pixels. The THS DAC is responsible for the range in which each pixel can be adjusted around the global value. See appendix 19.1 for more information.

4

The trigger threshold DAC is a 12-bit DAC. Its values correspond to \(\SI{-1}{V}\) at 0x000 and \(\SI{+1}{V}\) at 0xFFF. Hence \(\num{1966}\) (seen in the configuration file) corresponds to \((1966 / 4095 · 2 - 1)\,\si{V} ≈ \SI{-40}{mV}\).

5

While it is unclear to me now given it has been over 5 years, I believe at the time of the calibration we wrongly assumed a muon rate of \(\SI{100}{Hz.m⁻².sr⁻¹}\) instead of about \(\SI{1}{cm⁻².min⁻¹}\). The former number only works out if one integrates it over the \(\cos^2(θ)\) dependence, but only along \(θ\) and not \(φ\)! Either way, the number seems problematic. It likely misguided us into choosing a threshold too large in magnitude, as the correct rate yields an expected number of counts of \(\sim\num{32000}\) compared to only \(\sim\num{20000}\) in our naive approach.
