xfel-calibrate configuration

The European XFEL offline calibration is executed through a command line interface. Running `xfel-calibrate DETECTOR CALIBRATION --<configurations>` submits the notebook for the selected detector calibration to SLURM nodes on MAXWELL.

The offline calibration CLI machinery relies on several configuration files that are necessary for the calibration process. These files define which notebook to process and how to submit it to MAXWELL resources.

- `settings.py`: Contains the tool's environment definitions.
- `notebooks.py`: The module in which every calibration notebook is connected to a detector calibration for the CLI.

Settings

`settings.py` is a Python configuration file that configures the tool's environment.

    import os

    # Path into which temporary files from each run are placed
    temp_path = "{}/temp/".format(os.getcwd())

    # Path to use for calling Python. If the environment is set up correctly, the plain command is enough
    python_path = "python"

    # Path to store reports in
    report_path = "{}/calibration_reports/".format(os.getcwd())

    # Also try to output the report to an out_folder defined by the notebook
    try_report_to_output = True

    # the command to run this concurrently. It is prepended to the actual call
    launcher_command = "sbatch -p exfel -t 24:00:00 --mem 500G --mail-type END --requeue --output {temp_path}/slurm-%j.out"
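
As a rough illustration of how these settings could fit together, the sketch below fills the `{temp_path}` placeholder in `launcher_command` and prepends the result to a job script. The helper `build_submission`, the script name `job.sh`, and the example path are hypothetical and not part of xfel-calibrate; the tool's actual submission logic may differ.

```python
import shlex

# Value copied from the settings above.
launcher_command = (
    "sbatch -p exfel -t 24:00:00 --mem 500G --mail-type END "
    "--requeue --output {temp_path}/slurm-%j.out"
)


def build_submission(script, temp_path):
    """Hypothetical helper: prepend the formatted launcher to a job script."""
    # str.format only fills {temp_path}; SLURM's %j (job ID) is left untouched.
    prefix = shlex.split(launcher_command.format(temp_path=temp_path))
    return prefix + [script]


command = build_submission("job.sh", temp_path="/tmp/calibration/temp")
print(" ".join(command))
```

Returning the command as a list of arguments, rather than a single string, keeps it suitable for `subprocess.run` without invoking a shell.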

Notebooks

The `notebooks.py` module configures the connection between the notebooks and the command line. It uses a nested dictionary: the first level has a key for each detector, and the second level has keys for the calibration types. The third level contains the notebook entries (`notebook`, `pre_notebooks`, and `dep_notebooks`) along with the relevant concurrency parameters.

Example for xfel-calibrate/notebooks.py

    notebooks = {
        "AGIPD": {
            "DARK": {
                "notebook":
                    "notebooks/AGIPD/Characterize_AGIPD_Gain_Darks_NBC.ipynb",
                "dep_notebooks": [
                    "notebooks/generic/overallmodules_Darks_Summary_NBC.ipynb"],
                "concurrency": {"parameter": "modules",
                                "use function": "find_modules",
                                "cluster cores": 8},
            },
            "PC": {
                "notebook": "notebooks/AGIPD/Chracterize_AGIPD_Gain_PC_NBC.ipynb",
                "concurrency": {"parameter": "modules",
                                "default concurrency": 16,
                                "cluster cores": 32},
            },
            "CORRECT": {
                "pre_notebooks": ["notebooks/AGIPD/AGIPD_Retrieve_Constants_Precorrection.ipynb"],
                "notebook": "notebooks/AGIPD/AGIPD_Correct_and_Verify.ipynb",
                "dep_notebooks": [
                    "notebooks/AGIPD/AGIPD_Correct_and_Verify_Summary_NBC.ipynb"],
                "concurrency": {"parameter": "sequences",
                                "use function": "balance_sequences",
                                "default concurrency": [-1],
                                "cluster cores": 16},
            },
            ...
        }
    }
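
The lookup that this structure enables can be sketched as follows. This is a minimal illustration, not the tool's actual code: the helper `find_notebook_config` is hypothetical, upper-casing the arguments is an assumption, and the dictionary is a shortened copy of the example above.

```python
# Shortened copy of the example configuration above.
notebooks = {
    "AGIPD": {
        "DARK": {
            "notebook":
                "notebooks/AGIPD/Characterize_AGIPD_Gain_Darks_NBC.ipynb",
            "dep_notebooks": [
                "notebooks/generic/overallmodules_Darks_Summary_NBC.ipynb"],
            "concurrency": {"parameter": "modules",
                            "use function": "find_modules",
                            "cluster cores": 8},
        },
    }
}


def find_notebook_config(detector, calibration):
    """Hypothetical helper: resolve DETECTOR and CALIBRATION to a config."""
    # Assumption: keys are upper case, so normalize the user input first.
    detector, calibration = detector.upper(), calibration.upper()
    try:
        return notebooks[detector][calibration]
    except KeyError:
        raise ValueError(
            f"No notebook configured for {detector} {calibration}") from None


config = find_notebook_config("agipd", "dark")
print(config["notebook"])
print(config["concurrency"]["parameter"])
```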

As described above, each calibration-type dictionary, such as `DARK` or `CORRECT`, references the notebooks that will be executed and specifies the concurrency settings for the main calibration notebook.

- `notebook`: The main calibration notebook, and the notebook to which the concurrency configuration applies.
- `pre_notebooks`: These notebooks run before the main notebook, as they usually prepare essential data for it before it runs across multiple SLURM nodes, e.g. retrieving constant file paths before processing.
- `dep_notebooks`: These notebooks depend on the processing of all SLURM nodes running the main notebook, e.g. running summary plots over the processed files.
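
The ordering described above can be sketched as a simple plan: pre-notebooks first, then one main-notebook job per concurrency value, then the dependent notebooks. The function `execution_plan` and the stage labels are hypothetical; how xfel-calibrate actually chains the SLURM jobs is not shown here.

```python
def execution_plan(config, values):
    """Hypothetical helper: list (stage, notebook, value) tuples in run order."""
    plan = []
    # 1. Pre-notebooks run once, before the main notebook.
    for nb in config.get("pre_notebooks", []):
        plan.append(("pre", nb, None))
    # 2. The main notebook runs once per value of the concurrency parameter.
    for value in values:
        plan.append(("main", config["notebook"], value))
    # 3. Dependent notebooks run after all main-notebook jobs have finished.
    for nb in config.get("dep_notebooks", []):
        plan.append(("dep", nb, None))
    return plan


# Notebook paths copied from the CORRECT entry of the example above.
correct = {
    "pre_notebooks": [
        "notebooks/AGIPD/AGIPD_Retrieve_Constants_Precorrection.ipynb"],
    "notebook": "notebooks/AGIPD/AGIPD_Correct_and_Verify.ipynb",
    "dep_notebooks": [
        "notebooks/AGIPD/AGIPD_Correct_and_Verify_Summary_NBC.ipynb"],
}

for stage, notebook, value in execution_plan(correct, values=[0, 1]):
    print(stage, notebook, value)
```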

Tip

It is good practice to name command line enabled notebooks with an _NBC suffix as shown in the above example.

concurrency dictionary:
- `parameter`: The name of the notebook parameter used to distribute the processing of the notebook across multiple computing resources. The parameter should be of type list.
- `use function`: Set this when a function within the notebook should affect the SLURM nodes used, e.g. `balance_sequences`. The function is expected in the first cell of the main notebook.

Note

The function only needs to be defined, but not executed within the notebook context itself.

- `default concurrency`: The default concurrency to use if not defined by the user. For example, a `default concurrency` of 16 with a parameter named `modules` of type list(int) leads to running 16 concurrent SLURM jobs with `modules` values 0 to 15, one value per node.
- `cluster cores`: Only for notebooks using ipcluster. The number of cores to use.
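
The `default concurrency` example above can be sketched in a few lines. This assumes an integer default of N expands to the values 0 to N-1, one per SLURM job, with each value wrapped in a list because the parameter is list-typed. The function `expand_default_concurrency` is hypothetical, and only the integer case described above is covered.

```python
def expand_default_concurrency(concurrency):
    """Hypothetical helper: one list-typed parameter value per SLURM job."""
    default = concurrency["default concurrency"]
    if not isinstance(default, int):
        # Other forms (such as a list) are outside the scope of this sketch.
        raise TypeError("this sketch only handles an integer default")
    # e.g. 16 -> [0], [1], ..., [15]: each job receives a one-element list.
    return [[i] for i in range(default)]


# Concurrency settings copied from the PC entry of the example above.
pc = {"parameter": "modules", "default concurrency": 16, "cluster cores": 32}
jobs = expand_default_concurrency(pc)
print(len(jobs))          # 16
print(jobs[0], jobs[-1])  # [0] [15]
```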