Environments

This repository is stored in the directory /gpfs/exfel/sw/software/euxfel-environment-management on Maxwell.

An environment is created per European XFEL experiment cycle so that previous environments are preserved for reproducibility. The files defining each environment are stored in ./environments/${CYCLE} (note: ./ refers to this git repository, not the GPFS software directory).

Each environment directory will have a few files:

  • 0-desy-pinned.yml - environment file with a few packages that we should keep in sync with DESY. For example, if our version of ipympl (the interactive plotting backend for matplotlib) differs from the one in the DESY Conda environment that runs Max-JHub, interactive plotting may break due to incompatibilities.
  • 1-base.yml - Conda environment file containing packages which are available on a Conda channel
  • 2-custom.yml - optional Conda environment file containing packages built from custom recipes (see Recipes)
  • environment.yml - file generated by merging the three files from above into one
  • environment.lock.yml - the output of conda env export, contains the versions of all packages installed in the environment
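As a quick sanity check, the layout described above can be verified with a few lines of Python. This is only a sketch: the `missing_files` helper is hypothetical and not part of this repository, and it assumes that every file in the list except 2-custom.yml is expected to exist once an environment has been merged and locked.

```python
from pathlib import Path

# File names taken from the list above; only 2-custom.yml is documented as optional.
# Assumption: the generated files (environment.yml, environment.lock.yml) already exist.
REQUIRED_FILES = (
    "0-desy-pinned.yml",
    "1-base.yml",
    "environment.yml",
    "environment.lock.yml",
)


def missing_files(env_dir: Path) -> list[str]:
    """Return the required file names that are absent from env_dir."""
    return [name for name in REQUIRED_FILES if not (env_dir / name).is_file()]
```

Calling `missing_files(Path("./environments/202302"))` returns an empty list when the directory is complete.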

Environments exist within an installation of Conda. Setting up a new Conda installation is very rarely required and is covered in the Instances section.

Creating a New Specification

The first step in creating a new environment is activating an installation, which can be done with module load exfel mambaforge. Loading this module initialises the Conda instance into the base environment, which provides useful tools for environment management.

Once a Conda instance has been activated, you can create a new directory under either ./environments/${CYCLE} or ./applications/${APPLICATION_NAME}/${APPLICATION_VERSION}, depending on whether the environment is intended to be a generic environment that users will activate to write and execute their own code, or exists only to provide a specific application.

If a new cycle environment is being created, copy the 0-desy-pinned.yml, 1-base.yml, and 2-custom.yml files from the previous cycle and use those as a starting point. At this stage some pinned versions can be shifted, e.g. updating python to the latest release.

Generating 0-desy-pinned.yml

To update/generate a new 0-desy-pinned.yml file run:

/gpfs/exfel/sw/software/euxfel-environment-management/scripts/utility.py dump

If you are not starting from a previous cycle's files, create a standard Conda environment file defining the channels and the dependencies you require:

Info

If the environment is likely to be used as a Jupyter kernel on Max-JHub, take care to use the same versions of a few packages; see Interactive Plotting Issues in Jupyter Notebooks. The 0-desy-pinned.yml file is intended to help prevent these issues.

channels:
  - conda-forge
  - nodefaults
dependencies:
  - numpy  # for example

Place any dependencies under the dependencies section. Note that these are Conda dependencies from Conda channels, not from PyPI, so package names may differ (search https://anaconda.org/ to look through the official channels).

Once all required dependencies have been added to the dependencies list, carry on with the instructions in Locking and Installing a New Environment, as well as Creating a Modulefile.

Locking and Installing a New Environment

First run module load exfel mambaforge. This activates the Conda installation that the environment will be created within.

Now, cd into the directory of the environment you are working on (e.g. cd ./environments/202302), and run the following command:

../../scripts/utility.py merge

This will merge all of the individual environment files into a single environment.yml file which can then be used to create/update an environment.
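To illustrate what such a merge involves, here is a minimal sketch. It is not the implementation of utility.py, whose behaviour may differ (for example in how conflicting pins are resolved). It assumes the YAML files have already been parsed into dictionaries; `merge_specs` and the package name `my-custom-package` are hypothetical.

```python
def merge_specs(specs):
    """Merge parsed environment files in order, dropping exact duplicate entries."""
    merged = {"channels": [], "dependencies": []}
    for spec in specs:
        for key in merged:
            for item in spec.get(key, []):
                if item not in merged[key]:
                    merged[key].append(item)
    return merged


# Stand-ins for the parsed contents of 0-desy-pinned.yml, 1-base.yml and 2-custom.yml.
pinned = {"channels": ["conda-forge", "nodefaults"], "dependencies": ["ipympl=0.9.3"]}
base = {"channels": ["conda-forge", "nodefaults"], "dependencies": ["numpy", "ipympl=0.9.3"]}
custom = {"dependencies": ["my-custom-package"]}  # hypothetical package name

merged = merge_specs([pinned, base, custom])
```

Because the pinned file is merged first, its entries come first in the merged dependency list. This sketch only removes identical strings; it does not detect two different pins for the same package.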

The command will print instructions on what to do next:

Next steps:

For a new environment, create it via:

mamba env update -n 202302 -f ./environment.yml

For an existing environment, update it via:

mamba update --no-update-deps -n 202302 -f ./environment.yml

After update/install is complete create a 'lock file' by running export:

mamba env export -n 202302 --no-builds -f environment.lock.yml

Add and commit any new or modified files, then push.

So, to install a new environment, run:

mamba env update -n 202302 -f ./environment.yml

mamba env export -n 202302 --no-builds -f environment.lock.yml

git add .

git commit -m "Add environment for cycle 202302"

Modifying Existing Specifications

To add a new package to an existing environment, add it to 1-base.yml if it is already available on a Conda channel, or to 2-custom.yml if we have to create the recipe ourselves.

Once all required dependencies have been added to the dependency files, run the merge script:

../../scripts/utility.py merge

The command will print instructions on what to do next. To update an existing environment, run:

mamba update --no-update-deps -n 202302 -f ./environment.yml

mamba env export -n 202302 --no-builds -f environment.lock.yml

git add .

git commit -m "Add package X to environment for cycle 202302"
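The new-environment and existing-environment workflows differ only in the install command. The sketch below makes that explicit by building the documented commands as argument lists. The `next_steps` helper is hypothetical, and the commit messages are placeholders.

```python
import shlex


def next_steps(cycle: str, existing: bool) -> list[list[str]]:
    """Return the documented mamba/git commands as argument lists."""
    if existing:
        install = ["mamba", "update", "--no-update-deps", "-n", cycle, "-f", "./environment.yml"]
        message = f"Update environment for cycle {cycle}"  # placeholder message
    else:
        install = ["mamba", "env", "update", "-n", cycle, "-f", "./environment.yml"]
        message = f"Add environment for cycle {cycle}"
    return [
        install,
        ["mamba", "env", "export", "-n", cycle, "--no-builds", "-f", "environment.lock.yml"],
        ["git", "add", "."],
        ["git", "commit", "-m", message],
    ]


# Render the commands as shell lines for review before running them by hand.
commands = [shlex.join(args) for args in next_steps("202302", existing=True)]
```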

Creating a Modulefile

If you are updating an environment, the modulefile does not require changes. If you have created a new environment, you need to create a modulefile for it so that users can easily activate it.

There are a few ways to create a modulefile which enables a Conda environment:

1. Adapt Existing Modulefile

If this is a new cycle environment (or any environment based on mambaforge) then you can copy an existing modulefile and adjust the paths. For example, the current exfel-python/202301 modulefile does:

#%Module 1.0
proc ModulesHelp {} {
    puts stdout    "Mamba environment for cycle 202301"
}

module-whatis  "Module loads the mamba environment for cycle 202301"

module load mambaforge

prepend-path    PATH /gpfs/exfel/sw/software/mambaforge/22.11/envs/202301/bin
prepend-path    XML_CATALOG_FILES file {///gpfs/exfel/sw/software/mambaforge/22.11/envs/202301/etc/xml/catalog file} ///etc/xml/catalog
setenv          CONDA_DEFAULT_ENV 202301
setenv          CONDA_PREFIX /gpfs/exfel/sw/software/mambaforge/22.11/envs/202301
setenv          CONDA_PROMPT_MODIFIER {(202301) }
setenv          CONDA_SHLVL 1
setenv          GSETTINGS_SCHEMA_DIR /gpfs/exfel/sw/software/mambaforge/22.11/envs/202301/share/glib-2.0/schemas
setenv          GSETTINGS_SCHEMA_DIR_CONDA_BACKUP {}

Copying this file and adjusting any relevant paths/names is typically enough to get a working modulefile.
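The copy-and-adjust step could also be scripted. The sketch below turns the 202301 modulefile shown above into a template with the cycle and environment prefix as placeholders. `render_modulefile` is a hypothetical helper, and it assumes environments live under `<conda_root>/envs/<cycle>`, as in the paths above.

```python
from string import Template

# Mirrors the exfel-python/202301 modulefile, with placeholders for cycle and prefix.
MODULEFILE = Template("""\
#%Module 1.0
proc ModulesHelp {} {
    puts stdout    "Mamba environment for cycle $cycle"
}

module-whatis  "Module loads the mamba environment for cycle $cycle"

module load mambaforge

prepend-path    PATH $prefix/bin
prepend-path    XML_CATALOG_FILES file {//$prefix/etc/xml/catalog file} ///etc/xml/catalog
setenv          CONDA_DEFAULT_ENV $cycle
setenv          CONDA_PREFIX $prefix
setenv          CONDA_PROMPT_MODIFIER {($cycle) }
setenv          CONDA_SHLVL 1
setenv          GSETTINGS_SCHEMA_DIR $prefix/share/glib-2.0/schemas
setenv          GSETTINGS_SCHEMA_DIR_CONDA_BACKUP {}
""")


def render_modulefile(cycle: str, conda_root: str) -> str:
    """Fill in the template for one cycle environment."""
    prefix = f"{conda_root}/envs/{cycle}"  # assumed layout, taken from the paths above
    return MODULEFILE.substitute(cycle=cycle, prefix=prefix)
```

For example, `render_modulefile("202302", "/gpfs/exfel/sw/software/mambaforge/22.11")` produces the same file with every 202301 replaced by 202302.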

Note

Some module files may contain an additional guard case:

if {[info commands set-function] eq {set-function}} {
  ...
}

This checks whether a command (in this case set-function) exists. It was required because the default module version on Maxwell was very old and did not support some features, such as setting shell functions.

Now that Environment Modules has been updated this is no longer required, but it may still be present in some files.

2. Create Via sh-to-mod

Once the mambaforge module is loaded, conda activate commands can be used as normal, since the module sets the required shell functions. This means that we can start a shell, load mambaforge, create a basic activation script which runs conda activate ..., and then use sh-to-mod to convert that script into a modulefile:

$ module load exfel mambaforge

$ export TERM=xterm-256color  # some of the changes made by activation depend on an interactive coloured terminal so make sure TERM is set

$ echo "conda activate 202301" > "activate"

$ module sh-to-mod bash ./activate

#%Module
prepend-path    PATH /gpfs/exfel/sw/software/mambaforge/22.11/envs/202301/bin
prepend-path    XML_CATALOG_FILES file {///gpfs/exfel/sw/software/mambaforge/22.11/envs/202301/etc/xml/catalog file} ///etc/xml/catalog
setenv          CONDA_DEFAULT_ENV 202301
setenv          CONDA_PREFIX /gpfs/exfel/sw/software/mambaforge/22.11/envs/202301
setenv          CONDA_PROMPT_MODIFIER {(202301) }
setenv          CONDA_SHLVL 1
setenv          GSETTINGS_SCHEMA_DIR_CONDA_BACKUP /gpfs/exfel/sw/software/mambaforge/22.11/envs/202301/share/glib-2.0/schemas

This modulefile is enough to activate a Conda environment, assuming that mambaforge is already loaded. Given that this is not necessarily the case, there are a few modifications that should be made to this file:

  1. Add help and description:

    proc ModulesHelp {} {
        puts stdout    "Mamba environment for cycle 202301"
    }
    
    module-whatis  "Module loads the mamba environment for cycle 202301"
    
  2. Add a call to the metrics.sh script which tracks module activations:

    if { [ module-info mode load ] } {
        system    "/gpfs/exfel/sw/software/local/etc/metrics.sh 'mamba/202301'"
    }
    
  3. Optionally, create scripts for other shells (ksh, fish) and use a switch statement to support them:

    switch -- [module-info shelltype] {
        sh {
          ...
        }
        fish {
          ...
        }
    }

Note that this will partially work even without module load mambaforge: Python and its dependencies will work correctly, but conda/mamba commands will not.

3. Direct system Calls

Have the modulefile run conda activate ...; directly in the user's shell. This works just as well as calling the activate command by hand, but can add significant latency because the activate command accesses a lot of files.

Example module file
#%Module 1.0
proc ModulesHelp {} {
    puts stdout    "Mamba environment for cycle 202301"
}

module-whatis  "Module loads the mamba environment for cycle 202301"

module load mambaforge

# NOTE: the final semicolon is required
puts stdout "mamba activate 202301;"

Setting a Default Modulefile

To set a default modulefile, go to the directory the modulefile is in (for cycle environments that would be /gpfs/exfel/sw/software/xfel_modules/mamba) and create a file called .version containing:

#%Module1.0
set ModulesVersion "{NEW_CYCLE_NUMBER}"

This will set the default for that directory to the new cycle number.
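If this step is scripted, writing the file is a one-liner. This is a sketch with a hypothetical helper name, using the file contents exactly as shown above.

```python
from pathlib import Path


def write_default_version(module_dir: Path, cycle: str) -> Path:
    """Write a .version file that makes `cycle` the default module in module_dir."""
    path = module_dir / ".version"
    path.write_text(f'#%Module1.0\nset ModulesVersion "{cycle}"\n')
    return path
```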

Creating a Kernel for Max-JHub

The following directory contains kernels which automatically appear in the kernel list on Max-JHub:

/gpfs/exfel/sw/software/local/share/jupyter/kernels

To add a new kernel, create a new directory in this location with the name of the kernel, and then create a kernel.json file in that directory with the following contents:

{
  "argv": [
    "${PYTHON_PATH}",
    "-m",
    "ipykernel_launcher",
    "-f",
    "{connection_file}"
  ],
  "display_name": "xfel (202301)",
  "language": "python"
}

The display_name is what will be displayed in the Max-JHub kernel list. It should be descriptive and (if for a cycle environment) include the cycle number.

Replace ${PYTHON_PATH} with the path to the python executable for the environment, which can be found by loading/activating the environment and running which python.
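The kernel file can also be generated rather than written by hand. The following is a sketch: `build_kernel_spec` and `write_kernel` are hypothetical helpers, and the Python path and kernel name in the usage note are assumptions for illustration.

```python
import json
from pathlib import Path


def build_kernel_spec(python_path: str, display_name: str) -> dict:
    """Build the kernel.json contents shown above for a given interpreter."""
    return {
        # "{connection_file}" is a literal placeholder that Jupyter fills in at launch.
        "argv": [python_path, "-m", "ipykernel_launcher", "-f", "{connection_file}"],
        "display_name": display_name,
        "language": "python",
    }


def write_kernel(kernels_dir: Path, kernel_name: str, spec: dict) -> Path:
    """Create <kernels_dir>/<kernel_name>/kernel.json and return its path."""
    kernel_dir = kernels_dir / kernel_name
    kernel_dir.mkdir(parents=True, exist_ok=True)
    path = kernel_dir / "kernel.json"
    path.write_text(json.dumps(spec, indent=2) + "\n")
    return path
```

For example, `build_kernel_spec("/gpfs/exfel/sw/software/mambaforge/22.11/envs/202301/bin/python", "xfel (202301)")` reproduces the file above, assuming that path is what which python reports for the environment. The result can then be passed to `write_kernel` with the kernels directory and a kernel name such as `xfel-202301` (a made-up name).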

FAQ

Interactive Plotting Issues in Jupyter Notebooks

If an environment has issues with interactive plotting in Jupyter notebooks, it is likely that the packages in it are out of sync with those in the environment running Jupyter/JupyterLab on Max-JHub.

The module used for Max-JHub is listed on the DESY documentation page: https://confluence.desy.de/display/MXW/JupyterHub+on+Maxwell. Currently this is conda/3.8.

Load that environment and check the versions of the following packages:

  • ipympl
  • ipywidgets
  • matplotlib

Then pin them in your environment to the versions currently used by Max-JHub. As of August 2023 this means:

- ipympl=0.9.3
- ipywidgets=7.7.0
- matplotlib=3.5.2
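To check whether an environment matches these pins, the installed versions can be compared from within that environment. This is a sketch with hypothetical helper names. It reads versions via `importlib.metadata`, which reports the installed Python distribution's version, assumed here to match the Conda package version.

```python
from importlib.metadata import PackageNotFoundError, version

# Pins listed above (August 2023); update these when Max-JHub changes.
PINS = {"ipympl": "0.9.3", "ipywidgets": "7.7.0", "matplotlib": "3.5.2"}


def installed_versions(names):
    """Map each package name to its installed version, or None if not installed."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


def find_mismatches(pins, installed):
    """Return {name: (expected, found)} for every package that differs from its pin."""
    return {
        name: (expected, installed.get(name))
        for name, expected in pins.items()
        if installed.get(name) != expected
    }
```

Running `find_mismatches(PINS, installed_versions(PINS))` with the environment's Python returns an empty dictionary when all three packages match.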