How to extract digitizer peaks with the SCS Toolbox¶
Workflow during a beamtime¶
Record first data with signal on the digitizer.
Find peak integration parameters using check_peak_params().
Update the new parameters in DAMNIT, so that each new run is automatically processed and the result saved in the usr/processed_runs folder. As long as the right bunch pattern is selected for peak extraction, there is no need to care about the number of pulses or the period, as they will be adjusted to match the bunch pattern of each run.
For analysis, load digitizer data using load_processed_peaks() or load_all_processed_peaks(). This is much faster than loading the raw traces and re-performing peak integration.
The integration parameters used for the processed data can be checked via check_processed_peak_params().
Peak-integration parameters¶
[1]:
import toolbox_scs as tb
import matplotlib.pyplot as plt
%matplotlib widget
plt.rcParams['figure.constrained_layout.use'] = True
Cupy is not installed in this environment, no access to the GPU
Extracting peaks from a raw trace is done using the get_digitizer_peaks() function of the SCS Toolbox, by integration over the area of the peak and subtraction of a baseline. For one peak, the parameters pulseStart and pulseStop (sample numbers in the raw trace) define the integration region, and baseStart and baseStop define the baseline region. In most cases, the pulse pattern is regular and there are npulses pulses separated by a fixed period (in samples). The peak extraction is repeated for each peak, with pulseStart, pulseStop, baseStart and baseStop shifted by the period.
An example of integration parameters:
[2]:
params = {'pulseStart': 100,
          'pulseStop': 120,
          'baseStart': 80,
          'baseStop': 99,
          'period': 96,
          'npulses': 25}
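To make the role of each parameter concrete, here is a minimal NumPy sketch of baseline-subtracted trapezoidal integration on a synthetic trace. It is not the toolbox implementation: the function name integrate_regular_peaks, the half-open slicing of the windows and the use of the baseline mean are assumptions made for illustration.

```python
import numpy as np

def integrate_regular_peaks(trace, params):
    """Baseline-subtracted trapezoidal integration of regularly spaced peaks.

    Illustrative only: windows are treated as half-open slices, which is an
    assumption and not necessarily what the toolbox does.
    """
    # offset of each pulse relative to the first one
    shifts = np.arange(params['npulses']) * params['period']
    peaks = np.empty(params['npulses'])
    for i, s in enumerate(shifts):
        # mean level of the baseline window, shifted to the current pulse
        base = trace[params['baseStart'] + s:params['baseStop'] + s].mean()
        pulse = trace[params['pulseStart'] + s:params['pulseStop'] + s] - base
        # trapezoidal rule with unit sample spacing
        peaks[i] = 0.5 * (pulse[1:] + pulse[:-1]).sum()
    return peaks

# synthetic trace: constant offset plus 3 identical rectangular pulses
params = {'pulseStart': 100, 'pulseStop': 120, 'baseStart': 80,
          'baseStop': 99, 'period': 96, 'npulses': 3}
trace = np.full(400, -400.0)
for k in range(params['npulses']):
    start = params['pulseStart'] + k * params['period']
    trace[start + 5:start + 15] += 50.0  # height 50, width 10 samples

peaks = integrate_regular_peaks(trace, params)
print(peaks)
```

Because the constant offset of -400 is removed by the baseline subtraction, each of the three identical pulses integrates to the same value, independent of the offset.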
If the pattern is not regular, a list of starting positions can be provided to pulseStart, while pulseStop, baseStart and baseStop remain integers and relate to the first peak only. In such a case, period has no meaning and npulses is equal to len(pulseStart):
[3]:
params = {'pulseStart': [100, 200, 500, 600, 900, 1000, 2000, 10000, 15500],
          'period': 0,
          'pulseStop': 110,
          'baseStop': 99,
          'baseStart': 90,
          'npulses': 9}
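Since pulseStop, baseStart and baseStop relate to the first peak only, the windows of the other peaks follow by shifting them by the distance of each start position from the first one. The sketch below spells this out; integration_windows is a hypothetical helper written for this page, and the shifting rule is our reading of the description above, not code taken from the toolbox.

```python
import numpy as np

def integration_windows(params):
    """Return per-pulse window boundaries as arrays (illustrative helper)."""
    starts = np.atleast_1d(params['pulseStart'])
    if starts.size == 1 and params['npulses'] > 1:
        # regular pattern: generate the start positions from the period
        starts = starts[0] + np.arange(params['npulses']) * params['period']
    # shift of each pulse with respect to the first one
    offsets = starts - starts[0]
    windows = {'pulseStart': starts}
    for key in ('pulseStop', 'baseStart', 'baseStop'):
        # the integer given for the first peak is shifted for every peak
        windows[key] = params[key] + offsets
    return windows

params = {'pulseStart': [100, 200, 500, 600, 900, 1000, 2000, 10000, 15500],
          'period': 0,
          'pulseStop': 110,
          'baseStop': 99,
          'baseStart': 90,
          'npulses': 9}
windows = integration_windows(params)
for key, values in windows.items():
    print(key, values)
```

The same helper also covers the regular case, where a single integer pulseStart is expanded using period and npulses.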
Let’s assume that we are interested in the APD signal on diode 8 looking at the FEL, which corresponds to Ch9 of Fast ADC 2 (mnemonic FastADC2_9raw). We can check how the peak-finding algorithm performs by using tb.check_peak_params() and inspecting the regions of integration it finds. This shows a plot with the first and last pulses identified by the peak-finding algorithm and displays the region of integration and the region for baseline subtraction. The function returns a dictionary, good_params, that has all parameters necessary to perform the trapezoidal integration over the digitizer trace.
[4]:
proposal, runNB = 2956, 13
good_params = tb.check_peak_params(proposal, runNB, 'FastADC2_9raw', bunchPattern='sase3')
good_params
{'pulseStart': 6763, 'period': 96, 'pulseStop': 6770, 'baseStop': 6762, 'baseStart': 6757, 'npulses': 400}
Bunch pattern sase3: 400 pulses, 96 samples between two pulses
Auto-find peak parameters: 400 pulses, 96 samples between two pulses
[4]:
{'pulseStart': 6763,
'period': 96,
'pulseStop': 6770,
'baseStop': 6762,
'baseStart': 6757,
'npulses': 400}
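Before reusing a parameter set on other runs, a quick consistency check can catch obvious mistakes. The sketch below is a hypothetical helper, not part of the toolbox: the rules it applies (baseline window before the pulse window, both fitting within one period, last pulse inside the trace) are assumptions based on the examples on this page, and the trace length of 100000 samples is taken from the sampleId dimension of the processed data of this run.

```python
def check_windows(params, trace_length):
    """Return a list of problems found in a regular-pattern parameter set."""
    problems = []
    ordered = (params['baseStart'] < params['baseStop']
               <= params['pulseStart'] < params['pulseStop'])
    if not ordered:
        problems.append('windows are not ordered baseline -> pulse')
    if params['pulseStop'] - params['baseStart'] > params['period']:
        problems.append('baseline and pulse windows span more than one period')
    # end of the integration window of the last pulse
    last_stop = params['pulseStop'] + (params['npulses'] - 1) * params['period']
    if last_stop > trace_length:
        problems.append('last pulse window exceeds the trace length')
    return problems

good_params = {'pulseStart': 6763, 'period': 96, 'pulseStop': 6770,
               'baseStop': 6762, 'baseStart': 6757, 'npulses': 400}
problems = check_windows(good_params, trace_length=100000)
print(problems or 'parameters look consistent')

# too many pulses for the length of the trace
too_many = dict(good_params, npulses=1000)
problems_too_many = check_windows(too_many, trace_length=100000)
print(problems_too_many)
```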
Extracting peaks¶
The integration parameters are either user-provided or automatically computed by a peak-finding algorithm in get_digitizer_peaks() and check_peak_params(). The bunch pattern, when provided, is used to determine the parameters or to check consistency with user-provided parameters, and to align the pulse ID. The minimum required inputs to extract peaks are:
the bunchPattern source (‘sase3’ if the device is looking at the FEL or ‘scs_ppl’ if the device is looking at the PP laser), leaving integParams=None to let the peak-finding algorithm operate,
or an integParams dict including the pulseStart, pulseStop, baseStart, baseStop, period and npulses keys.
In most cases, automatic peak finding provides good integration parameters. If it fails, or if we want to define fixed parameters to consistently analyze a series of runs, it is necessary to provide the parameters via integParams.
If both the bunch pattern and the integration parameters are provided, the period and npulses of the user-provided parameters (integParams) will be overridden (with a warning in case of mismatch), except if pulseStart is a list (e.g. in the case of irregular patterns).
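The override rule described above can be summarized in a few lines of Python. This is only a sketch of the stated behaviour, not the toolbox code: reconcile_with_bunch_pattern and its arguments bp_npulses and bp_period are hypothetical names, and the wording of the warning is made up.

```python
import warnings

def reconcile_with_bunch_pattern(integ_params, bp_npulses, bp_period):
    """Return a copy of integ_params made consistent with the bunch pattern."""
    params = dict(integ_params)
    # irregular pattern: a list (or array) of start positions is kept as is
    if hasattr(params['pulseStart'], '__len__'):
        return params
    for key, bp_value in (('npulses', bp_npulses), ('period', bp_period)):
        if params[key] != bp_value:
            warnings.warn(f"{key}={params[key]} overridden by bunch "
                          f"pattern value {bp_value}")
            params[key] = bp_value
    return params

# regular pattern with an outdated number of pulses
user_params = {'pulseStart': 6763, 'pulseStop': 6770, 'baseStart': 6757,
               'baseStop': 6762, 'period': 96, 'npulses': 25}
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter('always')
    fixed = reconcile_with_bunch_pattern(user_params, bp_npulses=400,
                                         bp_period=96)
print(fixed['npulses'], [str(w.message) for w in caught])

# irregular pattern: returned unchanged
irregular = {'pulseStart': [100, 200, 500], 'pulseStop': 110,
             'baseStart': 90, 'baseStop': 99, 'period': 0, 'npulses': 3}
unchanged = reconcile_with_bunch_pattern(irregular, bp_npulses=400,
                                         bp_period=96)
```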
Once the parameters are found, we can extract the peaks using get_digitizer_peaks(), with bunchPattern='sase3'.
[5]:
peaks = tb.get_digitizer_peaks(proposal, runNB, 'FastADC2_9raw', integParams=good_params, bunchPattern='sase3')
peaks
{'pulseStart': 6763, 'period': 96, 'pulseStop': 6770, 'baseStop': 6762, 'baseStart': 6757, 'npulses': 400}
[5]:
<xarray.Dataset> Size: 25MB
Dimensions:          (trainId: 7898, sa3_pId: 400)
Coordinates:
  * trainId          (trainId) uint64 63kB 1501374970 1501374971 ... 1501382869
  * sa3_pId          (sa3_pId) int32 2kB 772 776 780 784 ... 2356 2360 2364 2368
Data variables:
    FastADC2_9peaks  (trainId, sa3_pId) float64 25MB -1.04e+04 ... -1.866e+03
We see here that peaks has a variable FastADC2_9peaks with dimensions ['trainId', 'sa3_pId'], which is exactly what we wanted: for each train, the raw trace was automatically reduced to a vector along sa3_pId, with one integrated value per pulse.
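Having the peaks labelled by trainId and sa3_pId makes the usual reductions one-liners in xarray. The sketch below uses a small synthetic dataset with the same variable and dimension names as above (random values, 5 trains and 10 pulses instead of the real sizes), so it runs without access to the proposal data.

```python
import numpy as np
import xarray as xr

rng = np.random.default_rng(0)
train_ids = np.arange(1501374970, 1501374975, dtype=np.uint64)
pulse_ids = np.arange(772, 772 + 4 * 10, 4, dtype=np.int32)

# synthetic stand-in for the dataset returned by get_digitizer_peaks()
peaks = xr.Dataset(
    {'FastADC2_9peaks': (('trainId', 'sa3_pId'),
                         rng.normal(-1e4, 1e3, (5, 10)))},
    coords={'trainId': train_ids, 'sa3_pId': pulse_ids},
)

da = peaks['FastADC2_9peaks']
per_pulse = da.mean(dim='trainId')   # average over trains, one value per pulse
per_train = da.mean(dim='sa3_pId')   # average over pulses, one value per train
first_pulse = da.sel(sa3_pId=772)    # all trains, first SASE 3 pulse only
print(per_pulse.dims, per_train.dims, first_pulse.dims)
```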
Note that we could also have omitted the integParams parameter in get_digitizer_peaks() to force the automatic peak-finding algorithm:
[6]:
peaks = tb.get_digitizer_peaks(proposal, runNB, 'FastADC2_9raw', bunchPattern='sase3')
{'pulseStart': 6763, 'period': 96, 'pulseStop': 6770, 'baseStop': 6762, 'baseStart': 6757, 'npulses': 400}
If the peak-finding algorithm fails¶
The best strategy is to inspect the raw trace with get_dig_avg_trace(), or to use check_peak_params() with show_all=True, and determine the regions of integration manually. Once the integration parameter dictionary is created, one can feed it to the integParams argument of get_digitizer_peaks().
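When working from an average trace, a simple threshold can provide a first guess to refine by eye. The sketch below is a crude illustration on a synthetic trace and is unrelated to the toolbox's own peak-finding algorithm: guess_params, its threshold, margin and window widths are all hypothetical choices that need adjusting to the actual pulse shape.

```python
import numpy as np

def guess_params(avg_trace, period, npulses, threshold=0.5, margin=2,
                 pulse_width=12, base_width=6):
    """Crude first guess of integration parameters from an average trace."""
    # the median is a robust estimate of the baseline when pulses are sparse
    deviation = np.abs(avg_trace - np.median(avg_trace))
    # first sample whose deviation exceeds a fraction of the largest peak
    onset = int(np.flatnonzero(deviation > threshold * deviation.max())[0])
    pulse_start = onset - margin
    return {'pulseStart': pulse_start,
            'pulseStop': pulse_start + pulse_width,
            'baseStart': pulse_start - 1 - base_width,
            'baseStop': pulse_start - 1,
            'period': period,
            'npulses': npulses}

# synthetic average trace: negative-going pulses every 96 samples
avg_trace = np.full(1000, -400.0)
for k in range(5):
    avg_trace[300 + 96 * k:308 + 96 * k] -= 1000.0

guess = guess_params(avg_trace, period=96, npulses=5)
print(guess)
```

The resulting dictionary has the same keys as the integParams examples above, so it can be edited by hand after comparing it with a plot of the trace.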
Save / load processed peaks¶
If we have found good integration parameters, it is worth saving the integrated peaks as processed data. This can be done by setting save=True in get_digitizer_peaks(). The location can be chosen with subdir; by default the data are written to the usr/processed_runs folder of the proposal.
[7]:
peaks = tb.get_digitizer_peaks(proposal, runNB, 'FastADC2_9raw', integParams=good_params, bunchPattern='sase3',
save=True)
{'pulseStart': 6763, 'period': 96, 'pulseStop': 6770, 'baseStop': 6762, 'baseStart': 6757, 'npulses': 400}
saved data into /gpfs/exfel/exp/SCS/202202/p002956/usr/processed_runs/r0013/r0013-digitizers-data.h5.
To load the processed data:
[8]:
tb.load_processed_peaks(proposal, runNB, 'FastADC2_9peaks')
[8]:
<xarray.DataArray 'FastADC2_9peaks' (trainId: 7898, sa3_pId: 400)> Size: 25MB
array([[-10396.5, -21040.5, -11618. , ...,  -2953. , -14307.5,  -9120. ],
       [-14979.5,  -3093.5,  -8993. , ...,  -5679.5,  -4255. , -25531. ],
       [-22282. ,  -2814. , -21852.5, ..., -14205.5,  -8499.5, -13072. ],
       ...,
       [  -607.5,     99. ,   -660. , ...,  -1162.5,  -1217. ,   -808. ],
       [  -999.5,   -681. ,   -327. , ...,  -1800. ,   -935. ,   -660. ],
       [ -1075. ,  -1200. ,   -358. , ...,   -753. ,   -864. ,  -1866.5]])
Coordinates:
  * trainId  (trainId) uint64 63kB 1501374970 1501374971 ... 1501382869
  * sa3_pId  (sa3_pId) int32 2kB 772 776 780 784 788 ... 2356 2360 2364 2368
Attributes:
    FastADC2_9peaks_pulseStart:  6763
    FastADC2_9peaks_period:      96
    FastADC2_9peaks_pulseStop:   6770
    FastADC2_9peaks_baseStop:    6762
    FastADC2_9peaks_baseStart:   6757
    FastADC2_9peaks_npulses:     400
Note that the attributes are the peak integration parameters used for peak extraction.
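Since each attribute name is the variable name followed by an underscore and the parameter name, the integration parameters can be recovered from the attributes with plain dictionary handling. The sketch below uses a dict copied from the output above in place of the real attrs mapping; params_from_attrs is a hypothetical helper, not a toolbox function.

```python
def params_from_attrs(attrs, name):
    """Collect the attributes prefixed with `name` into a parameter dict."""
    prefix = name + '_'
    return {key[len(prefix):]: value
            for key, value in attrs.items() if key.startswith(prefix)}

# attributes as displayed above (stand-in for the attrs of the loaded data)
attrs = {'FastADC2_9peaks_pulseStart': 6763,
         'FastADC2_9peaks_period': 96,
         'FastADC2_9peaks_pulseStop': 6770,
         'FastADC2_9peaks_baseStop': 6762,
         'FastADC2_9peaks_baseStart': 6757,
         'FastADC2_9peaks_npulses': 400}

params = params_from_attrs(attrs, 'FastADC2_9peaks')
print(params)
```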
It is also possible to load the entire dataset containing the peaks and average traces of all the processed sources by omitting the mnemonic argument:
[9]:
tb.load_processed_peaks(proposal, runNB)
[9]:
<xarray.Dataset> Size: 26MB
Dimensions:          (trainId: 7898, sa3_pId: 400, sampleId: 100000)
Coordinates:
  * trainId          (trainId) uint64 63kB 1501374970 1501374971 ... 1501382869
  * sa3_pId          (sa3_pId) int32 2kB 772 776 780 784 ... 2356 2360 2364 2368
Dimensions without coordinates: sampleId
Data variables:
    FastADC2_9peaks  (trainId, sa3_pId) float64 25MB -1.04e+04 ... -1.866e+03
    FastADC2_9avg    (sampleId) float64 800kB -411.8 -412.0 ... -411.3 -411.2
Attributes:
    FastADC2_9peaks_pulseStart:  6763
    FastADC2_9peaks_period:      96
    FastADC2_9peaks_pulseStop:   6770
    FastADC2_9peaks_baseStop:    6762
    FastADC2_9peaks_baseStart:   6757
    FastADC2_9peaks_npulses:     400
It is also possible to check the integration parameters that were used for peak extraction:
[10]:
tb.check_processed_peak_params(proposal, runNB, 'FastADC2_9peaks')
[10]:
{'pulseStart': 6763,
'period': 96,
'pulseStop': 6770,
'baseStop': 6762,
'baseStart': 6757,
'npulses': 400}
Example of irregular PPL pattern¶
[11]:
# irregular pattern
proposal, runNB, field = 8716, 28, 'I0_ILHraw'
params = tb.check_peak_params(proposal, runNB, field,
bunchPattern='scs_ppl', show_all=True)
{'pulseStart': array([22998, 23094, 23190, 23286, 23382, 23478, 23574, 23670, 23766,
23862, 23958, 24054, 24150, 24246, 24342, 24438, 24534, 24630,
24726, 24822, 24918, 25014, 25110, 25206, 27606, 27702, 27798,
27894, 27990, 28086, 28182, 28278, 28374, 28470, 28566, 28662,
28758, 28854, 28950, 29046, 29142, 29238, 29334, 29430, 29526,
29622, 29718, 29814, 32214, 32310, 32406, 32502, 32598, 32694,
32790, 32886, 32982, 33078, 33174, 33270, 33366, 33462, 33558,
33654, 33750, 33846, 33942, 34038, 34134, 34230, 34326, 34422,
36822, 36918, 37014, 37110, 37206, 37302, 37398, 37494, 37590,
37686, 37782, 37878, 37974, 38070, 38166, 38262, 38358, 38454,
38550, 38646, 38742, 38838, 38934, 39030, 41430, 41526, 41622,
41718, 41814, 41910, 42006, 42102, 42198, 42294, 42390, 42486,
42582, 42678, 42774, 42870, 42966, 43062, 43158, 43254, 43350,
43446, 43542, 43638, 46038, 46134, 46230, 46326, 46422, 46518,
46614, 46710, 46806, 46902, 46998, 47094, 47190, 47286, 47382,
47478, 47574, 47670, 47766, 47862, 47958, 48054, 48150, 48246,
50646, 50742, 50838, 50934, 51030, 51126, 51222, 51318, 51414,
51510, 51606, 51702, 51798, 51894, 51990, 52086, 52182, 52278,
52374, 52470, 52566, 52662, 52758, 52854, 55254, 55350, 55446,
55542, 55638, 55734, 55830, 55926, 56022, 56118, 56214, 56310,
56406, 56502, 56598, 56694, 56790, 56886, 56982, 57078, 57174,
57270, 57366, 57462]), 'period': 0, 'pulseStop': 23009, 'baseStop': 22995, 'baseStart': 22989, 'npulses': 192}
Bunch pattern scs_ppl: Not a regular pattern. 192 pulses, pulse_ids=[ 0 4 8 12 16 20 24 28 32 36 40 44 48 52
56 60 64 68 72 76 80 84 88 92 192 196 200 204
208 212 216 220 224 228 232 236 240 244 248 252 256 260
264 268 272 276 280 284 384 388 392 396 400 404 408 412
416 420 424 428 432 436 440 444 448 452 456 460 464 468
472 476 576 580 584 588 592 596 600 604 608 612 616 620
624 628 632 636 640 644 648 652 656 660 664 668 768 772
776 780 784 788 792 796 800 804 808 812 816 820 824 828
832 836 840 844 848 852 856 860 960 964 968 972 976 980
984 988 992 996 1000 1004 1008 1012 1016 1020 1024 1028 1032 1036
1040 1044 1048 1052 1152 1156 1160 1164 1168 1172 1176 1180 1184 1188
1192 1196 1200 1204 1208 1212 1216 1220 1224 1228 1232 1236 1240 1244
1344 1348 1352 1356 1360 1364 1368 1372 1376 1380 1384 1388 1392 1396
1400 1404 1408 1412 1416 1420 1424 1428 1432 1436].
Auto-find peak parameters: Not a regular pattern. 192 pulses, pulse_ids=[ 0 4 8 12 16 20 24 28 32 36 40 44 48 52
56 60 64 68 72 76 80 84 88 92 192 196 200 204
208 212 216 220 224 228 232 236 240 244 248 252 256 260
264 268 272 276 280 284 384 388 392 396 400 404 408 412
416 420 424 428 432 436 440 444 448 452 456 460 464 468
472 476 576 580 584 588 592 596 600 604 608 612 616 620
624 628 632 636 640 644 648 652 656 660 664 668 768 772
776 780 784 788 792 796 800 804 808 812 816 820 824 828
832 836 840 844 848 852 856 860 960 964 968 972 976 980
984 988 992 996 1000 1004 1008 1012 1016 1020 1024 1028 1032 1036
1040 1044 1048 1052 1152 1156 1160 1164 1168 1172 1176 1180 1184 1188
1192 1196 1200 1204 1208 1212 1216 1220 1224 1228 1232 1236 1240 1244
1344 1348 1352 1356 1360 1364 1368 1372 1376 1380 1384 1388 1392 1396
1400 1404 1408 1412 1416 1420 1424 1428 1432 1436].
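The two printed arrays are related in a simple way: in this output, pulse IDs that are 4 apart correspond to sample positions that are 96 apart, that is 24 samples per pulse ID. The sketch below rebuilds the pulseStart array from the pulse IDs under that conversion factor. The variable names are illustrative, and the factor and the first start position are read off this particular run, not taken from the toolbox.

```python
import numpy as np

# 8 groups of 24 pulses: groups start every 192 pulse IDs, pulses every 4 IDs
pulse_ids = (np.arange(8)[:, None] * 192 + np.arange(24) * 4).ravel()

samples_per_pulse_id = 96 // 4   # inferred from the output above
first_pulse_start = 22998        # first element of the pulseStart array

# sample position of each pulse in the digitizer trace
pulse_start = first_pulse_start + pulse_ids * samples_per_pulse_id
print(len(pulse_start))
print(pulse_start[:3], pulse_start[24], pulse_start[-1])
```

The start of the second group (index 24) and the last element can be compared directly with the pulseStart array printed by check_peak_params().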