.. _calversion:

Tutorial Calculation
====================

Author: Astrid Muennich

Version: 0.1

A small example of how to adapt a notebook to run with the offline
calibration package "pycalibration".

The first cell contains all parameters that should be exposed to the
command line.

To run this notebook with several different input parameters in parallel,
by submitting multiple SLURM jobs, for example for various random seeds,
we can do the following::

    xfel-calibrate TUTORIAL TEST --random-seed 1,2,3,4

or::

    xfel-calibrate TUTORIAL TEST --random-seed 1-5

Either form will produce 4 jobs (the upper bound of a range is exclusive)::

    Parsed input 1,2,3,4 to [1, 2, 3, 4]
    Submitted job: 1169340
    Submitted job: 1169341
    Submitted job: 1169342
    Submitted job: 1169343
    Submitted the following SLURM jobs:
    1169340,1169341,1169342,1169343

.. code-block:: python

    out_folder = "/gpfs/exfel/data/scratch/amunnich/tutorial"  # output folder
    sensor_size = [10, 30]  # defining the picture size
    random_seed = [2345]  # random seed for filling the fake data array; change it
                          # to produce different results (a range is allowed)
    runs = 500  # how many iterations to fill histograms
    cluster_profile = "tutorial"

First include what we need and set up the cluster profile for parallel
processing on one node, utilising more than one core. Everything that
produces output in a cell will show up in the report, e.g. prints, but
also return values and errors.

.. code-block:: python

    import matplotlib
    %matplotlib inline
    import matplotlib.pyplot as plt
    import numpy as np

    # if not using SLURM: make sure a cluster is running with
    # ipcluster start --n=4 --profile=tutorial
    # give it a while to start
    from ipyparallel import Client

    print("Connecting to profile {}".format(cluster_profile))
    view = Client(profile=cluster_profile)[:]
    view.use_dill()


Create some random data
-----------------------

.. code-block:: python

    def data_creation(random_seed):
        # seed the generator by calling np.random.seed, not assigning to it
        np.random.seed(random_seed)
        return np.random.random(sensor_size)

.. code-block:: python

    # In order to run several random seeds in parallel, the parameter has to
    # be a list. To use the current single value in this notebook we use the
    # first entry in the list.
    random_seed_single = random_seed[0]

    fake_data = []
    for i in range(runs):
        fake_data.append(data_creation(random_seed_single + 10 * i))

Create some random images and plot them. Everything we write here in the
markup cells will show up as text in the report.

.. code-block:: python

    plt.subplot(211)
    plt.imshow(fake_data[0], interpolation="nearest")
    plt.title('Random Image')
    plt.ylabel('sensor height')
    plt.subplot(212)
    plt.imshow(fake_data[5], interpolation="nearest")
    plt.xlabel('sensor width')
    plt.ylabel('sensor height')
    plt.subplots_adjust(bottom=0.1, right=0.8, top=0.9)
    cax = plt.axes([0.85, 0.1, 0.075, 0.9])
    plt.colorbar(cax=cax).ax.set_ylabel("# counts")
    plt.show()

These plots show two randomly filled sensor images. We can also use markup
cells as captions for images.


Simple Analysis
---------------

.. code-block:: python

    mean = []
    std = []
    for im in fake_data:
        mean.append(im.mean())
        std.append(im.std())

To parallelise jobs we use the ipyparallel client. This will run an
ipcluster on one node, with the number of cores specified in
``xfel_calibrate/notebooks.py``.

.. code-block:: python

    from functools import partial

    def parallel_stats(input):
        return input.mean(), input.std()

    # partial is a no-op here, but becomes useful once parallel_stats
    # takes additional fixed arguments
    p = partial(parallel_stats)
    results = view.map_sync(p, fake_data)
    p_mean = [x[0] for x in results]
    p_std = [x[1] for x in results]

We calculate the mean value of all images, as well as the standard
deviation.
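As a quick sanity check, which is not part of the original notebook, the
serial and parallel results can be compared directly: since both variants
call ``mean()`` and ``std()`` on the same arrays, they should agree to
within floating-point tolerance.

.. code-block:: python

    # Illustrative addition: the serial loop and the parallel map should
    # yield identical statistics, since they run the same NumPy calls.
    assert np.allclose(mean, p_mean)
    assert np.allclose(std, p_std)
    print("Serial and parallel statistics agree for all {} images"
          .format(len(fake_data)))

The distributions of these statistics are histogrammed below, for both the
serial and the parallel computation.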
.. code-block:: python

    plt.subplot(221)
    plt.hist(mean, 50)
    plt.xlabel('mean')
    plt.ylabel('counts')
    plt.title('Mean value')
    plt.subplot(222)
    plt.hist(p_mean, 50)
    plt.xlabel('mean parallel')
    plt.ylabel('counts')
    plt.title('Parallel Mean value')
    plt.subplot(223)
    plt.hist(std, 50)
    plt.xlabel('std')
    plt.ylabel('counts')
    plt.title('Std value')
    plt.subplot(224)
    plt.hist(p_std, 50)
    plt.xlabel('std parallel')
    plt.ylabel('counts')
    plt.title('Parallel Std value')
    plt.subplots_adjust(top=0.99, bottom=0.01, left=0.01,
                        right=0.99, hspace=0.7, wspace=0.35)
    plt.show()
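For reference, the ``--random-seed 1-5`` notation used at the top expands
with an exclusive upper bound, just like Python's built-in ``range``. A
minimal sketch of such a parser (a hypothetical helper for illustration,
not the actual ``xfel-calibrate`` implementation) could look like this:

.. code-block:: python

    def parse_values(spec):
        """Hypothetical sketch: expand "1-5" or "1,2,3,4" to [1, 2, 3, 4]."""
        if "-" in spec:
            start, end = spec.split("-")
            return list(range(int(start), int(end)))  # upper bound is exclusive
        return [int(v) for v in spec.split(",")]

    print(parse_values("1-5"))      # [1, 2, 3, 4]
    print(parse_values("1,2,3,4"))  # [1, 2, 3, 4]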