toolbox_scs.load

Toolbox for SCS.

Various utility functions to quickly process data measured at the SCS instrument.

Copyright (2019) SCS Team.

Module Contents

Functions

load([proposalNB, runNB, fields, data, display, ...])

Load a run and extract the data. Output is an xarray with aligned trainIds.

run_by_path(path)

Return the specified run.

find_run_path(proposalNB, runNB[, data])

Return the run path given the specified proposal and run numbers.

open_run(proposalNB, runNB[, subset])

Get the extra_data.DataCollection for a given run of a proposal.

get_array([run, mnemonic, stepsize, subset, data, ...])

Loads one data array for the specified mnemonic and rounds its values to integer multiples of stepsize.

load_run_values(prop_or_run[, runNB, which])

Load the run value for each mnemonic whose source is a CONTROL source.

concatenateRuns(runs)

Sorts and concatenates a list of runs with identical data variables along the trainId dimension.

check_data_rate(run[, fields])

Calculates the fraction of train ids that contain data in a run.

toolbox_scs.load.load(proposalNB=None, runNB=None, fields=None, data='all', display=False, validate=False, subset=None, rois={}, extract_digitizers=True, extract_xgm=True, extract_bam=True, bunchPattern='sase3', parallelize=True)[source]

Load a run and extract the data. Output is an xarray with aligned trainIds.

Parameters:
  • proposalNB (str, int) – proposal number e.g. ‘p002252’ or 2252

  • runNB (str, int) – run number as integer

  • fields (str, list of str, list of dict) –

    list of mnemonics to load specific data such as “fastccd” or “SCS_XGM”, or dictionaries defining a custom mnemonic such as {“extra”: {‘source’: ‘SCS_CDIFFT_MAG/SUPPLY/CURRENT’, ‘key’: ‘actual_current.value’, ‘dim’: None}}

  • data (str or Sequence of str) – ‘raw’, ‘proc’ (processed), or any other location relative to the proposal path with data per run to access. May also be ‘all’ (both ‘raw’ and ‘proc’) or a sequence of strings to load data from several locations, with later locations overwriting sources present in earlier ones. The default is ‘all’.

  • display (bool) – whether to show the run.info or not

  • validate (bool) – whether to run extra-data-validate or not

  • subset (slice or extra_data.by_index or numpy.s_) – a subset of trains to load, e.g. extra_data.by_index[:5] for the first 5 trains. If None, all trains are retrieved.

  • rois (dict) –

    a dictionary of mnemonics with a list of roi definitions and the desired names, for example: {‘fastccd’: {‘ref’: {‘roi’: by_index[730:890, 535:720], ‘dim’: [‘ref_x’, ‘ref_y’]}, ‘sam’: {‘roi’: by_index[1050:1210, 535:720], ‘dim’: [‘sam_x’, ‘sam_y’]}}}

  • extract_digitizers (bool) – If True, extracts the peaks from digitizer variables and aligns the pulse Id according to the bunch pattern given by bunchPattern.

  • extract_xgm (bool) – If True, extracts the values from XGM variables (e.g. ‘SCS_SA3’, ‘XTD10_XGM’) and aligns the pulse Id with the sase1 / sase3 bunch pattern.

  • extract_bam (bool) – If True, extracts the values from BAM variables (e.g. ‘BAM1932M’) and aligns the pulse Id with the sase3 bunch pattern.

  • bunchPattern (str) –

    bunch pattern used to extract the Fast ADC pulses. A string or a dict as in:

    {'FFT_PD2': 'sase3', 'ILH_I0': 'scs_ppl'}
    

    Ignored if extract_digitizers=False.

  • parallelize (bool) – from EXtra-Data: enable or disable opening files in parallel. Particularly useful if creating child processes is not allowed (e.g. in a daemonized multiprocessing.Process).

Returns:

run, ds – the extra_data DataCollection of the run, and an xarray Dataset with aligned trainIds and pulseIds

Return type:

DataCollection, xarray.Dataset

Example

>>> import toolbox_scs as tb
>>> run, data = tb.load(2212, 208, ['SCS_SA3', 'MCP2apd', 'nrj'])
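The fields and rois arguments are plain Python dictionaries. A minimal sketch of assembling both for a call to load (the source and key names are taken from the docstring above; np.s_ is used purely for illustration in place of extra_data.by_index, which is written with the same slice syntax):

```python
import numpy as np

# Custom mnemonic: maps a name to a device source, a key, and an
# extra dimension label (None for scalar data).
fields = [
    'SCS_SA3',          # regular ToolBox mnemonic
    {'extra': {'source': 'SCS_CDIFFT_MAG/SUPPLY/CURRENT',
               'key': 'actual_current.value',
               'dim': None}},
]

# Region-of-interest definitions for an area detector mnemonic.
rois = {'fastccd': {'ref': {'roi': np.s_[730:890, 535:720],
                            'dim': ['ref_x', 'ref_y']},
                    'sam': {'roi': np.s_[1050:1210, 535:720],
                            'dim': ['sam_x', 'sam_y']}}}

# Requires access to facility data:
# run, ds = tb.load(2212, 208, fields, rois=rois)
```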
toolbox_scs.load.run_by_path(path)[source]

Return the specified run.

Wraps the extra_data RunDirectory routine to simplify its use for the SCS Toolbox user.

Parameters:

path (str) – path to the run directory

Returns:

run – DataCollection object containing information about the specified run. Data can be loaded using built-in class methods.

Return type:

extra_data.DataCollection

toolbox_scs.load.find_run_path(proposalNB, runNB, data='raw')[source]

Return the run path given the specified proposal and run numbers.

Parameters:
  • proposalNB ((str, int)) – proposal number e.g. ‘p002252’ or 2252

  • runNB ((str, int)) – run number as integer

  • data (str) – ‘raw’, ‘proc’ (processed) or ‘all’ (both ‘raw’ and ‘proc’) to access data from either or both of those folders. If ‘all’ is used, sources present in ‘proc’ overwrite those in ‘raw’. The default is ‘raw’.

Returns:

path – The run path.

Return type:

str
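
find_run_path spares the user from assembling the path by hand. As an illustration only, a sketch following the usual EuXFEL directory convention (the layout and the explicit cycle argument are assumptions; the real function also resolves the proposal cycle, which cannot be derived from the numbers alone):

```python
def sketch_run_path(proposalNB, runNB, cycle, data='raw'):
    """Illustrative only: build a run path following the usual EuXFEL
    layout. find_run_path performs the real lookup, including the
    proposal cycle."""
    prop = int(str(proposalNB).lstrip('p'))
    run = int(str(runNB).lstrip('r'))
    return f'/gpfs/exfel/exp/SCS/{cycle}/p{prop:06d}/{data}/r{run:04d}'

print(sketch_run_path(2252, 17, '201901'))
# -> /gpfs/exfel/exp/SCS/201901/p002252/raw/r0017
```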

toolbox_scs.load.open_run(proposalNB, runNB, subset=None, **kwargs)[source]

Get the extra_data.DataCollection for a given run of a proposal. Wraps the extra_data open_run routine and adds subset selection, for the convenience of the toolbox user. More information can be found in the extra_data documentation.

Parameters:
  • proposalNB ((str, int)) – proposal number e.g. ‘p002252’ or 2252

  • runNB ((str, int)) – run number e.g. 17 or ‘r0017’

  • subset (slice or extra_data.by_index or numpy.s_) – a subset of trains to load, e.g. extra_data.by_index[:5] for the first 5 trains. If None, all trains are retrieved.

  • **kwargs – additional keyword arguments forwarded to extra_data.open_run, e.g. data (str, default ‘raw’) or include (str, default ‘*’).

Returns:

run – DataCollection object containing information about the specified run. Data can be loaded using built-in class methods.

Return type:

extra_data.DataCollection

toolbox_scs.load.get_array(run=None, mnemonic=None, stepsize=None, subset=None, data='raw', proposalNB=None, runNB=None)[source]

Loads one data array for the specified mnemonic and rounds its values to integer multiples of stepsize for consistent grouping (no rounding if stepsize is None). Returns a 1D array of ones if mnemonic is None.

Parameters:
  • run (extra_data.DataCollection) – DataCollection containing the data. Used if proposalNB and runNB are None.

  • mnemonic (str) – Identifier of a single item in the mnemonic collection. None creates a dummy 1D array of ones with length equal to the number of trains.

  • stepsize (float) – nominal stepsize of the array data - values will be rounded to integer multiples of this value.

  • subset (slice or extra_data.by_index or numpy.s_) – a subset of trains to load, e.g. extra_data.by_index[:5] for the first 5 trains. If None, all trains are retrieved.

  • data (str or Sequence of str) – ‘raw’, ‘proc’ (processed), or any other location relative to the proposal path with data per run to access. May also be ‘all’ (both ‘raw’ and ‘proc’) or a sequence of strings to load data from several locations, with later locations overwriting sources present in earlier ones. The default is ‘raw’.

  • proposalNB ((str, int)) – proposal number e.g. ‘p002252’ or 2252.

  • runNB ((str, int)) – run number e.g. 17 or ‘r0017’.

Returns:

data – xarray DataArray containing rounded array values using the trainId as coordinate.

Return type:

xarray.DataArray

Raises:

ToolBoxValueError – Toolbox-specific exception indicating an invalid mnemonic entry

Example

>>> import toolbox_scs as tb
>>> run = tb.open_run(2212, 235)
>>> mnemonic = 'PP800_PhaseShifter'
>>> data_PhaseShifter = tb.get_array(run, mnemonic, 0.5)
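The stepsize rounding reduces jitter in a slowly scanned value so that identical nominal positions group together. A sketch of the rounding rule as described above (whether get_array uses exactly this formula is an assumption; the docstring only states “integer multiples of stepsize”):

```python
import numpy as np

def round_to_stepsize(values, stepsize=None):
    # Round values to the nearest integer multiple of stepsize;
    # with stepsize=None the values are returned unchanged.
    values = np.asarray(values, dtype=float)
    if stepsize is None:
        return values
    return np.round(values / stepsize) * stepsize

print(round_to_stepsize([1.02, 1.48, 1.51], 0.5))
```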
toolbox_scs.load.load_run_values(prop_or_run, runNB=None, which='mnemonics')[source]

Load the run value for each mnemonic whose source is a CONTROL source (see extra-data DataCollection.get_run_value() for details)

Parameters:
  • prop_or_run (extra_data DataCollection or int) – The run (DataCollection) to check for mnemonics. Alternatively, the proposal number (int), for which the runNB is also required.

  • runNB (int) – The run number. Only used if the first argument is the proposal number.

  • which (str) – ‘mnemonics’ or ‘all’. If ‘mnemonics’, only the run values for the ToolBox mnemonics are retrieved. If ‘all’, a dictionary of the run values of all control sources is returned.

Returns:

run_values – a dictionary containing the run values for the mnemonics or for all control sources.

toolbox_scs.load.concatenateRuns(runs)[source]

Sorts and concatenate a list of runs with identical data variables along the trainId dimension.

Parameters:

runs (list) – the xarray Datasets to concatenate

Returns:

a concatenated xarray Dataset
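concatenateRuns operates on the xarray Datasets returned by load. A minimal sketch of the equivalent xarray operation, assuming every dataset has identical data variables (sorting by the first trainId is an assumption about the implementation):

```python
import xarray as xr

def concatenate_runs_sketch(runs):
    # Sort the datasets by their first trainId, then concatenate
    # along the trainId dimension.
    ordered = sorted(runs, key=lambda ds: ds.trainId.values[0])
    return xr.concat(ordered, dim='trainId')

a = xr.Dataset({'x': ('trainId', [1., 2.])}, coords={'trainId': [100, 101]})
b = xr.Dataset({'x': ('trainId', [3., 4.])}, coords={'trainId': [98, 99]})
ds = concatenate_runs_sketch([a, b])
print(ds.trainId.values)
```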

toolbox_scs.load.check_data_rate(run, fields=None)[source]

Calculates the fraction of train ids that contain data in a run.

Parameters:
  • run (extra_data DataCollection) – the DataCollection associated to the data.

  • fields (str, list of str or dict) – mnemonics to check. If None, all mnemonics in the run are checked. A custom mnemonic can be defined with a dictionary: {‘extra’: {‘source’: ‘SCS_CDIFFT_MAG/SUPPLY/CURRENT’, ‘key’: ‘actual_current.value’}}

Returns:

ret – dictionary with mnemonics as keys and the fraction of train ids that contain data as values.
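The returned fractions are simple ratios of populated train ids to all train ids in the run. A sketch of the computation per mnemonic (function and argument names are illustrative, not the ToolBox implementation):

```python
def data_rate(train_ids_with_data, all_train_ids):
    # Fraction of the run's train ids for which this mnemonic
    # actually recorded data.
    all_ids = set(all_train_ids)
    return len(set(train_ids_with_data) & all_ids) / len(all_ids)

print(data_rate([0, 1, 2, 5], range(10)))  # -> 0.4
```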