How to extract digitizer peaks with the SCS Toolbox

[1]:
import toolbox_scs as tb
import matplotlib.pyplot as plt
%matplotlib notebook
plt.rcParams['figure.constrained_layout.use'] = True

First, to explore the data contained in a run, we open a DataCollection using open_run(), which is equivalent to the extra_data.open_run() function:

[2]:
proposal, runNB = 2956, 13
run = tb.open_run(proposal, runNB)

Each channel of each digitizer stores a string that describes which device is connected to that channel. A dictionary of all channels can be accessed as follows:

[3]:
tb.digitizer_signal_description(run)
[3]:
{'FastADC_Ch0': 'CHEM DIAG X-ray PD',
 'FastADC_Ch1': 'PPL I0 at in-coupling window',
 'FastADC_Ch2': 'CHEM APD',
 'FastADC_Ch3': 'PPL Reflectometer PD',
 'FastADC_Ch4': '',
 'FastADC_Ch5': 'PPL 800 nm I0 ILH PD',
 'FastADC_Ch6': '',
 'FastADC_Ch7': 'No Description',
 'FastADC_Ch8': 'No Description',
 'FastADC_Ch9': 'CHEM DIAG Laser PD',
 'FastADC2_Ch0': 'No Description',
 'FastADC2_Ch1': 'No Description',
 'FastADC2_Ch2': 'No Description',
 'FastADC2_Ch3': 'No Description',
 'FastADC2_Ch4': 'No Description',
 'FastADC2_Ch5': 'diode 2 - diam = 5mm',
 'FastADC2_Ch6': 'I0_400 nm',
 'FastADC2_Ch7': 'diode 5  - diam = 3mm + Filter (Ti 400 nm / Polyimide 200 nm)',
 'FastADC2_Ch8': 'diode 6 - 10x10 mm + filter longpass > 550 nm',
 'FastADC2_Ch9': 'diode 8 - APD diam = 3 mm + Filter (Ti 400 nm / Polyimide 200 nm)'}
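With twenty channels, it can help to filter this dictionary rather than read it by eye. The following is a minimal sketch in plain Python. The helper names `described_channels` and `find_channels` are hypothetical (they are not part of toolbox_scs), and the dictionary is a hand-copied subset of the output above:

```python
# Subset of the description dictionary shown above, copied by hand for illustration.
descriptions = {
    'FastADC_Ch2': 'CHEM APD',
    'FastADC_Ch4': '',
    'FastADC_Ch7': 'No Description',
    'FastADC2_Ch5': 'diode 2 - diam = 5mm',
    'FastADC2_Ch9': 'diode 8 - APD diam = 3 mm + Filter (Ti 400 nm / Polyimide 200 nm)',
}


def described_channels(desc):
    """Return only the channels that carry a meaningful description."""
    return {ch: text for ch, text in desc.items()
            if text.strip() and text != 'No Description'}


def find_channels(desc, keyword):
    """Return the channels whose description contains keyword (case-insensitive)."""
    return [ch for ch, text in desc.items() if keyword.lower() in text.lower()]


used = described_channels(descriptions)
apd_channels = find_channels(descriptions, 'apd')
print(sorted(used))
print(apd_channels)
```

In a real session, `descriptions` would simply be the return value of tb.digitizer_signal_description(run).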

As is customary, we can always list the available mnemonics for the run as follows:

[4]:
tb.mnemonics_for_run(run).keys()
[4]:
dict_keys(['sase3', 'sase2', 'sase1', 'maindump', 'bunchpattern', 'bunchPatternTable', 'npulses_sase3', 'npulses_sase1', 'BAM414', 'BAM1932M', 'BAM1932S', 'nrj', 'nrj_target', 'M2BEND', 'tpi', 'VSLIT', 'ESLIT', 'HSLIT', 'transmission', 'transmission_col2', 'GATT_pressure', 'UND', 'UND2', 'UND3', 'XTD10_photonFlux', 'XTD10_photonFlux_sigma', 'XTD10_XGM', 'XTD10_XGM_sigma', 'XTD10_SA3', 'XTD10_SA3_sigma', 'XTD10_SA1', 'XTD10_SA1_sigma', 'XTD10_slowTrain', 'XTD10_slowTrain_SA1', 'XTD10_slowTrain_SA3', 'SCS_photonFlux', 'SCS_photonFlux_sigma', 'SCS_XGM', 'SCS_XGM_sigma', 'SCS_SA1', 'SCS_SA1_sigma', 'SCS_SA3', 'SCS_SA3_sigma', 'SCS_slowTrain', 'SCS_slowTrain_SA1', 'SCS_slowTrain_SA3', 'HFM_BENDING', 'VFM_BENDING', 'AFS_DelayLine', 'AFS_FocusLens', 'PP800_PhaseShifter', 'PP800_SynchDelayLine', 'PP800_DelayLine', 'PP800_HalfWP', 'PP800_HWP_POWER', 'PP800_FocusLens', 'FFT_FocusLens', 'hRIXS_det', 'hRIXS_delay', 'hRIXS_index', 'hRIXS_norm', 'hRIXS_ABB', 'hRIXS_ABL', 'hRIXS_ABR', 'hRIXS_ABT', 'hRIXS_DRX', 'hRIXS_DTY1', 'hRIXS_DTZ', 'hRIXS_GMX', 'hRIXS_GRX', 'hRIXS_GTLY', 'hRIXS_GTRY', 'hRIXS_GTX', 'hRIXS_GTZ', 'XRD_DRY', 'XRD_SRX', 'XRD_SRY', 'XRD_SRZ', 'XRD_STX', 'XRD_STY', 'XRD_STZ', 'XRD_SXT1Y', 'XRD_SXT2Y', 'XRD_SXTX', 'XRD_SXTZ', 'FastADC0peaks', 'FastADC0raw', 'FastADC1peaks', 'FastADC1raw', 'FastADC2peaks', 'FastADC2raw', 'FastADC3peaks', 'FastADC3raw', 'FastADC4peaks', 'FastADC4raw', 'FastADC5peaks', 'FastADC5raw', 'FastADC6peaks', 'FastADC6raw', 'FastADC7peaks', 'FastADC7raw', 'FastADC8peaks', 'FastADC8raw', 'FastADC9peaks', 'FastADC9raw', 'FastADC2_0peaks', 'FastADC2_0raw', 'FastADC2_1peaks', 'FastADC2_1raw', 'FastADC2_2peaks', 'FastADC2_2raw', 'FastADC2_3peaks', 'FastADC2_3raw', 'FastADC2_4peaks', 'FastADC2_4raw', 'FastADC2_5peaks', 'FastADC2_5raw', 'FastADC2_6peaks', 'FastADC2_6raw', 'FastADC2_7peaks', 'FastADC2_7raw', 'FastADC2_8peaks', 'FastADC2_8raw', 'FastADC2_9peaks', 'FastADC2_9raw'])

Let’s assume that we are interested in the APD signal on diode 8, which corresponds to Ch9 of Fast ADC 2. We will load the raw traces of this channel (mnemonic FastADC2_9raw) and extract the peaks.
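The channel keys of the description dictionary and the mnemonic names follow slightly different conventions ('FastADC2_Ch9' versus 'FastADC2_9raw', 'FastADC_Ch0' versus 'FastADC0raw'). The sketch below converts one into the other. The helper `channel_to_mnemonic` is hypothetical, and the naming rule is inferred only from the two listings shown above:

```python
import re


def channel_to_mnemonic(channel_key, kind='raw'):
    """Convert e.g. 'FastADC2_Ch9' to 'FastADC2_9raw' (hypothetical helper)."""
    match = re.fullmatch(r'(FastADC\d*)_Ch(\d+)', channel_key)
    if match is None:
        raise ValueError(f'unrecognised channel key: {channel_key!r}')
    digitizer, channel = match.groups()
    # In the listing above, the first digitizer has no index and no separator
    # ('FastADC0raw'), whereas indexed digitizers keep an underscore
    # ('FastADC2_0raw').
    sep = '' if digitizer == 'FastADC' else '_'
    return f'{digitizer}{sep}{channel}{kind}'


print(channel_to_mnemonic('FastADC2_Ch9'))
print(channel_to_mnemonic('FastADC_Ch0'))
print(channel_to_mnemonic('FastADC_Ch3', kind='peaks'))
```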

This is, in principle, all done automatically in the load() function. The only piece of information that is required is which bunch pattern to use for the extraction ('sase3' if the device is looking at the FEL or 'scs_ppl' if the device is looking at the PP laser), so that all data loaded simultaneously (XGM, BAM, digitizer, …) is pulse-aligned. By default, data from FastADC2 uses the 'sase3' bunch pattern (parameter fadc2_bp), so in this case we can omit this parameter in the load function.

A peak-finding algorithm analyses the averaged trace, locates the peaks, and performs a trapezoidal integration over an interval centered on each peak.

[5]:
proposal = 2956
fields = ['SCS_SA3', 'nrj', 'FastADC2_9raw']
run, ds = tb.load(proposal, 13, fields)
ds
[5]:
<xarray.Dataset>
Dimensions:            (pulse_slot: 2700, sa3_pId: 400, trainId: 7804)
Coordinates:
  * trainId            (trainId) uint64 1501374970 1501374971 ... 1501382869
  * sa3_pId            (sa3_pId) int64 772 776 780 784 ... 2356 2360 2364 2368
Dimensions without coordinates: pulse_slot
Data variables:
    bunchPatternTable  (trainId, pulse_slot) uint32 2211625 0 ... 16777216
    nrj                (trainId) float64 931.6 931.6 931.5 ... 937.1 937.0 937.1
    FastADC2_9peaks    (trainId, sa3_pId) float64 -1.532e+04 ... -2.655e+03
    SCS_SA3            (trainId, sa3_pId) float64 553.6 1.003e+03 ... 626.1
Attributes:
    runFolder:  /gpfs/exfel/exp/SCS/202202/p002956/raw/r0013

We see here that ds has a variable FastADC2_9peaks with dimensions ['trainId', 'sa3_pId'], which is exactly what we wanted: each raw trace was automatically reduced to one integrated value per pulse along the sa3_pId dimension.

If, for instance, the recorded signal were an optical laser (OL) pulse and the OL pattern differed from the SASE 3 bunch pattern, we would need to specify fadc2_bp='scs_ppl' in the load function.

We can always inspect how the peak extraction was performed by using the following function:

[6]:
good_params = tb.check_peak_params(run, 'FastADC2_9raw')
good_params
bunch pattern sase3: 400 pulses, 96 samples between two pulses
Auto-find peak params: 400 pulses, 96 samples between two pulses
[6]:
{'pulseStart': 6759,
 'pulseStop': 6776,
 'baseStart': 6743,
 'baseStop': 6755,
 'period': 96,
 'npulses': 400}

This shows a plot with the first and last pulses identified by the peak-finding algorithm and displays the region of integration and the region used for baseline subtraction. The function returns a dictionary, good_params, that holds all parameters necessary to perform the trapezoidal integration over the digitizer trace.
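To make the role of each parameter concrete, here is a minimal NumPy sketch of a baseline-subtracted trapezoidal integration on a synthetic trace. It is not the toolbox's implementation. It assumes that the stop indices are exclusive (Python slice convention), that the baseline is the mean over the base window, and that every window repeats with the given period. The function `integrate_peaks` and the synthetic data are illustrative only:

```python
import numpy as np


def integrate_peaks(trace, params):
    """Baseline-subtracted trapezoidal integration of equally spaced pulses."""
    trace = np.asarray(trace, dtype=float)
    peaks = np.empty(params['npulses'])
    for i in range(params['npulses']):
        off = i * params['period']
        # Baseline: mean of the samples in the base window of this pulse.
        base = trace[params['baseStart'] + off:params['baseStop'] + off].mean()
        pulse = trace[params['pulseStart'] + off:params['pulseStop'] + off] - base
        # Trapezoidal rule with unit sample spacing.
        peaks[i] = 0.5 * (pulse[1:] + pulse[:-1]).sum()
    return peaks


# Synthetic trace: constant offset of -450 plus three triangular pulses.
params = {'pulseStart': 20, 'pulseStop': 31, 'baseStart': 5, 'baseStop': 15,
          'period': 50, 'npulses': 3}
trace = np.full(200, -450.0)
shape = np.array([0, 2, 4, 6, 8, 10, 8, 6, 4, 2, 0], dtype=float)
for i, amplitude in enumerate([1.0, 2.0, 0.5]):
    start = params['pulseStart'] + i * params['period']
    trace[start:start + shape.size] += amplitude * shape

peaks = integrate_peaks(trace, params)
print(peaks)  # [ 50. 100.  25.]
```

The constant offset drops out thanks to the baseline subtraction, and the integrated values scale with the pulse amplitudes. With real data the sign of the result follows the polarity of the signal, which is why the extracted peaks above are negative.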

If the peak-finding algorithm fails

[7]:
run_b, ds_b = tb.load(proposal, 14, fields)
ds_b
The period from the bunch pattern is different than that found by the peak-finding algorithm. Either the algorithm failed or the bunch pattern source (sase3) is not correct.
[7]:
<xarray.Dataset>
Dimensions:            (pulse_slot: 2700, sa3_pId: 400, trainId: 81)
Coordinates:
  * trainId            (trainId) uint64 1501384603 1501384604 ... 1501384683
  * sa3_pId            (sa3_pId) int64 772 776 780 784 ... 2356 2360 2364 2368
Dimensions without coordinates: pulse_slot
Data variables:
    bunchPatternTable  (trainId, pulse_slot) uint32 2144041 0 ... 16777216
    nrj                (trainId) float64 925.4 925.5 925.5 ... 926.1 926.1 926.1
    FastADC2_9peaks    (trainId, sa3_pId) float64 -104.5 -541.0 ... -111.5
    SCS_SA3            (trainId, sa3_pId) float64 96.74 318.3 ... 318.1 127.5
Attributes:
    runFolder:  /gpfs/exfel/exp/SCS/202202/p002956/raw/r0014

Here we get a warning indicating that the period found by the peak-finding algorithm does not match the one derived from the bunch pattern used to extract the peaks. If the bunch pattern in the load function was set correctly, this is most likely due to a failure of the peak-finding algorithm. To check, we can plot the average trace and check the integration parameters:

[8]:
plt.figure()
tb.get_dig_avg_trace(run_b, 'FastADC2_9raw').plot()

bad_params = tb.check_peak_params(run_b, 'FastADC2_9raw')
bad_params
bunch pattern sase3: 400 pulses, 96 samples between two pulses
Auto-find peak params: 1820 pulses, 43 samples between two pulses
[8]:
{'pulseStart': 0,
 'pulseStop': 12,
 'baseStart': 0,
 'baseStop': 1,
 'period': 43,
 'npulses': 1820}
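The comparison reported in the two printed lines can be reproduced by hand. The sketch below derives the expected period from the pulse IDs and compares it with the period found on the trace. The factor of 24 samples per pulse-ID slot is an assumption inferred from the numbers in this notebook (96 samples for pulse IDs spaced by 4), and both helper functions are hypothetical:

```python
def expected_period(pulse_ids, samples_per_slot=24):
    """Samples between two consecutive pulses, derived from the pulse IDs."""
    steps = {b - a for a, b in zip(pulse_ids, pulse_ids[1:])}
    if len(steps) != 1:
        raise ValueError('pulses are not equally spaced')
    # samples_per_slot is an assumed digitizer property, see the text above.
    return steps.pop() * samples_per_slot


def periods_match(pulse_ids, found_period, samples_per_slot=24):
    """True if the peak-finding period agrees with the bunch pattern."""
    return expected_period(pulse_ids, samples_per_slot) == found_period


# Pulse IDs as in the sa3_pId coordinate: 772, 776, ..., 2368.
sa3_pulse_ids = list(range(772, 2369, 4))
print(len(sa3_pulse_ids), expected_period(sa3_pulse_ids))  # 400 96
print(periods_match(sa3_pulse_ids, 96))  # True  (run 13)
print(periods_match(sa3_pulse_ids, 43))  # False (run 14)
```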

We see that the peaks are very weak, and this led to a bad assignment of the pulse parameters by the peak-finding algorithm. From here, we can redefine the integration parameters according to the position of the pulses on the average trace. We also know that the integration parameters of the previously loaded run are good. We will now use these integration parameters to extract the data from run 14.
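Reading the pulse position off the average trace can also be done programmatically. The following is a rough sketch, not the toolbox's algorithm: it takes the period and number of pulses from the bunch pattern, locates the first sample that deviates from the median baseline by more than a threshold, and places the windows around it. The function name, the window widths, and the threshold are all illustrative assumptions that would need tuning on real data:

```python
import numpy as np


def params_from_average_trace(avg_trace, period, npulses, threshold,
                              width, margin=2, base_gap=4, base_len=12):
    """Build an integration-parameter dictionary from the first detected pulse.

    All keyword defaults are illustrative choices, not toolbox defaults.
    """
    avg_trace = np.asarray(avg_trace, dtype=float)
    baseline = np.median(avg_trace)
    # Samples deviating from the baseline by more than the threshold
    # (absolute value, so positive and negative pulses are both handled).
    hits = np.flatnonzero(np.abs(avg_trace - baseline) > threshold)
    if hits.size == 0:
        raise ValueError('no sample deviates from the baseline by more than threshold')
    pulse_start = int(hits[0]) - margin
    base_stop = pulse_start - base_gap
    return {'pulseStart': pulse_start,
            'pulseStop': pulse_start + width,
            'baseStart': base_stop - base_len,
            'baseStop': base_stop,
            'period': period,
            'npulses': npulses}


# Synthetic average trace: offset of -450 with five negative pulses, 96 samples apart.
avg_trace = np.full(1000, -450.0)
for i in range(5):
    avg_trace[300 + 96 * i:310 + 96 * i] -= 200.0

manual_params = params_from_average_trace(avg_trace, period=96, npulses=5,
                                          threshold=50.0, width=14)
print(manual_params)
```

A dictionary built this way should still be verified visually, for instance by passing it to tb.check_peak_params through its params argument.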

We first load the fields, but specify to the load function that we want to keep the raw trace without performing peak integration (extract_fadc2=False).

[9]:
new_params = good_params

# or we can define the new parameters manually:
# new_params = {'pulseStart': 6759,
#               'pulseStop': 6776,
#               'baseStart': 6743,
#               'baseStop': 6755,
#               'period': 96,
#               'npulses': 400}

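# Note: runNB still holds 13 from cell [2], so this call reloads run 13
# (see runFolder in the output). Pass 14 instead to reprocess the run
# where the peak finding failed.
run_b, ds_b = tb.load(proposal, runNB, fields, extract_fadc2=False)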
ds_b
[9]:
<xarray.Dataset>
Dimensions:            (fadc2_samplesId: 100000, pulse_slot: 2700, sa3_pId: 400, trainId: 7804)
Coordinates:
  * trainId            (trainId) uint64 1501374970 1501374971 ... 1501382869
  * sa3_pId            (sa3_pId) int64 772 776 780 784 ... 2356 2360 2364 2368
Dimensions without coordinates: fadc2_samplesId, pulse_slot
Data variables:
    bunchPatternTable  (trainId, pulse_slot) uint32 2211625 0 ... 16777216
    nrj                (trainId) float64 931.6 931.6 931.5 ... 937.1 937.0 937.1
    FastADC2_9raw      (trainId, fadc2_samplesId) int16 -458 -425 ... -463 -470
    SCS_SA3            (trainId, sa3_pId) float64 553.6 1.003e+03 ... 626.1
Attributes:
    runFolder:  /gpfs/exfel/exp/SCS/202202/p002956/raw/r0013

In ds_b, FastADC2_9raw is the raw trace with dimensions ['trainId', 'fadc2_samplesId']. We can now call get_digitizer_peaks():

[10]:
peaks = tb.get_digitizer_peaks(run_b, 'FastADC2_9raw',
                               integParams=good_params, bunchPattern='sase3')
peaks
[10]:
<xarray.Dataset>
Dimensions:          (sa3_pId: 400, trainId: 7898)
Coordinates:
  * sa3_pId          (sa3_pId) int64 772 776 780 784 788 ... 2356 2360 2364 2368
  * trainId          (trainId) uint64 1501374970 1501374971 ... 1501382869
Data variables:
    FastADC2_9peaks  (trainId, sa3_pId) float64 -1.565e+04 ... -2.689e+03

peaks is an xarray Dataset that contains the peaks extracted from the variables given in the mnemonics argument (here mnemonics='FastADC2_9raw').

get_digitizer_peaks() gives the option to merge the resulting data into an existing dataset (merge_with). This can be conveniently used to create the final dataset with extracted peaks:

[11]:
ds_c = tb.get_digitizer_peaks(run_b, 'FastADC2_9raw', merge_with=ds_b,
                              integParams=good_params, bunchPattern='sase3')
ds_c
[11]:
<xarray.Dataset>
Dimensions:            (pulse_slot: 2700, sa3_pId: 400, trainId: 7804)
Coordinates:
  * trainId            (trainId) uint64 1501374970 1501374971 ... 1501382869
  * sa3_pId            (sa3_pId) int64 772 776 780 784 ... 2356 2360 2364 2368
Dimensions without coordinates: pulse_slot
Data variables:
    bunchPatternTable  (trainId, pulse_slot) uint32 2211625 0 ... 16777216
    nrj                (trainId) float64 931.6 931.6 931.5 ... 937.1 937.0 937.1
    SCS_SA3            (trainId, sa3_pId) float64 553.6 1.003e+03 ... 626.1
    FastADC2_9peaks    (trainId, sa3_pId) float64 -1.565e+04 ... -2.689e+03
Attributes:
    runFolder:  /gpfs/exfel/exp/SCS/202202/p002956/raw/r0013
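Note that peaks held 7898 trains while the merged dataset holds 7804, the same number as ds_b. This suggests that only trains present in both datasets are kept, although the exact merging rule is an assumption here. A minimal NumPy sketch of such an alignment on train IDs, with made-up IDs and values:

```python
import numpy as np


def align_on_train_ids(ids_a, data_a, ids_b, data_b):
    """Keep only the rows whose train ID appears in both inputs."""
    common, idx_a, idx_b = np.intersect1d(ids_a, ids_b, assume_unique=True,
                                          return_indices=True)
    return common, data_a[idx_a], data_b[idx_b]


# Made-up example: five trains of per-pulse peaks, four trains of scalar data.
ids_peaks = np.array([100, 101, 102, 103, 104], dtype=np.uint64)
peak_values = np.arange(10.0).reshape(5, 2)  # (trainId, pulse)
ids_other = np.array([101, 103, 104, 105], dtype=np.uint64)
nrj = np.array([931.6, 931.5, 937.1, 937.0])  # (trainId,)

common, peaks_aligned, nrj_aligned = align_on_train_ids(
    ids_peaks, peak_values, ids_other, nrj)
print(common)
print(peaks_aligned.shape, nrj_aligned.shape)
```

Only trains 101, 103 and 104 survive, so both aligned arrays share the same train axis.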

Summary with another example

Check digitizer signal descriptions and load data

[12]:
proposal, runNB = 2953, 1
print(tb.digitizer_signal_description(tb.open_run(proposal, runNB)))

fields = ['FastADC0raw', 'chem_Y']
run, ds = tb.load(proposal, runNB, fields, fadc_bp='sase3')
{'FastADC_Ch0': 'CHEM DIAG X-ray PD', 'FastADC_Ch1': 'PPL I0 at in-coupling window', 'FastADC_Ch2': 'CHEM APD', 'FastADC_Ch3': 'PPL Reflectometer PD', 'FastADC_Ch4': '', 'FastADC_Ch5': 'PPL 800 nm I0 ILH PD', 'FastADC_Ch6': '', 'FastADC_Ch7': 'No Description', 'FastADC_Ch8': 'No Description', 'FastADC_Ch9': 'CHEM DIAG Laser PD'}
The period from the bunch pattern is different than that found by the peak-finding algorithm. Either the algorithm failed or the bunch pattern source (sase3) is not correct.

Check integration parameters

[13]:
params = tb.check_peak_params(run, 'FastADC0raw', bunchPattern='sase3')
plt.figure()
tb.get_dig_avg_trace(run, 'FastADC0raw').plot()
bunch pattern sase3: 400 pulses, 96 samples between two pulses
Auto-find peak params: 521 pulses, 93 samples between two pulses
[13]:
[<matplotlib.lines.Line2D at 0x2b57b8d0d588>]

Define and check new integration parameters

[14]:
params['period'] = 96
params['npulses'] = 400
params['pulseStart'] = 22491
params['pulseStop'] = 22525
params['baseStart'] = 22480
params['baseStop'] = 22485

params = tb.check_peak_params(run, 'FastADC0raw',
                              bunchPattern='sase3', params=params)
params
bunch pattern sase3: 400 pulses, 96 samples between two pulses
Auto-find peak params: 400 pulses, 96 samples between two pulses
[14]:
{'pulseStart': 22491,
 'pulseStop': 22525,
 'baseStart': 22480,
 'baseStop': 22485,
 'period': 96,
 'npulses': 400}
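When parameters are typed in by hand, a quick consistency check can catch slips before the extraction. The sketch below is a hypothetical helper, not part of toolbox_scs. It assumes that the baseline window precedes the pulse window, that both fit within one period, and that the trace holds 100000 samples (the FastADC2 trace length seen earlier; the actual length for this channel is not shown here):

```python
def validate_params(params, n_samples):
    """Return a list of problems found in an integration-parameter dictionary."""
    problems = []
    if not params['pulseStart'] < params['pulseStop']:
        problems.append('pulseStart must be smaller than pulseStop')
    if not params['baseStart'] < params['baseStop']:
        problems.append('baseStart must be smaller than baseStop')
    if params['baseStop'] > params['pulseStart']:
        problems.append('baseline window overlaps the pulse window')
    if params['pulseStop'] - params['baseStart'] > params['period']:
        problems.append('windows are longer than one period')
    last_stop = params['pulseStop'] + (params['npulses'] - 1) * params['period']
    if last_stop > n_samples:
        problems.append('last pulse window exceeds the trace length')
    return problems


# Parameters defined manually above.
manual = {'pulseStart': 22491, 'pulseStop': 22525, 'baseStart': 22480,
          'baseStop': 22485, 'period': 96, 'npulses': 400}
# Parameters returned by the failed peak finding on run 14.
failed = {'pulseStart': 0, 'pulseStop': 12, 'baseStart': 0,
          'baseStop': 1, 'period': 43, 'npulses': 1820}

print(validate_params(manual, n_samples=100000))  # []
print(validate_params(failed, n_samples=100000))
```

Such a check only tests internal consistency. It cannot tell whether the windows actually sit on the pulses, which is what the plot from check_peak_params is for.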

Reload data with raw trace and extract peaks using the new integration parameters

[15]:
run, ds = tb.load(proposal, runNB, fields, extract_fadc=False)
ds = tb.get_digitizer_peaks(run, 'FastADC0raw', merge_with=ds,
                            integParams=params, bunchPattern='sase3')
ds
[15]:
<xarray.Dataset>
Dimensions:            (pulse_slot: 2700, sa3_pId: 400, trainId: 1817)
Coordinates:
  * trainId            (trainId) uint64 1471028273 1471028274 ... 1471028272
  * sa3_pId            (sa3_pId) int64 1056 1060 1064 1068 ... 2644 2648 2652
Dimensions without coordinates: pulse_slot
Data variables:
    bunchPatternTable  (trainId, pulse_slot) uint32 2146089 0 ... 16777216
    chem_Y             (trainId) float32 224.2802 224.28229 ... 224.2802
    FastADC0peaks      (trainId, sa3_pId) float64 239.0 9.5 ... -69.0 101.5
Attributes:
    runFolder:  /gpfs/exfel/exp/SCS/202202/p002953/raw/r0001

Note that the mnemonics parameter in get_digitizer_peaks can take a list of mnemonics, so this peak extraction can be applied simultaneously to several channels of the same digitizer (mnemonics=['FastADC2_5raw', 'FastADC2_9raw'] for instance).