Writing a context¶
The context configures the programmable portion of the pipeline. It is written in Python code, which expresses the transformation steps to be performed, declares parameters or even runs a scripted measurement. It is structured into a series of functions, each of which performs operations on the data and may return a result. These functions are called views in the sense that they allow a particular “view” into the data.
Views¶
A view represents a single contained operation of the data pipeline, which takes one or more inputs and produces a result.
A view has two fundamental properties, which specify its behaviour and, in some cases, its semantics. The output type defines how the result of a view is to be handled and visualized on the client side. The stage defines at which point in the pipeline the view is executed. The possible values for these properties can be found in the enums metropc.ViewOutput and metropc.ViewStage, respectively. The default values are metropc.ViewOutput.VECTOR and metropc.ViewStage.POOL.
In order for a function to serve as a view, it must be decorated with the @View
decorator and declare the input source via annotations for each of its non-optional arguments:
@View
def view_name(arg1: 'argument_source'):
    # Perform any operation
    return awesome_result
View decorator¶
The @View
decorator may be used by itself without any further arguments, but it is typically more useful to narrow down one or more of its properties for each use. This may be done via regular keyword arguments found further below, but it is recommended instead to make use of its virtual properties, that each combine one or more keyword arguments into memorable groups. Several of these virtual properties can in turn be combined by joining them with an underscore as well as augmented further by keyword arguments, e.g.
@View.Vector_MovingAverage(N=25, skip_empty=False, postprocessing=lambda x: x[x > 0])
The virtual properties available right now are listed below. If their use causes a different view implementation to be used, it may introduce additional keyword arguments. Please refer to the respective documentation of each view implementation in the advanced section for more details.
- Compute, Scalar, ScalarFast, ScalarSlow, Vector, VectorLine, VectorDistribution, Matrix, Image, Points, PointsBinned, PointsScatter
Virtual properties for each possible value of
metropc.ViewOutput
defining the view’s output type.
- Pool, Reduce, Step, Action
Virtual properties for each currently supported value of
metropc.ViewStage
defining the view’s execution stage. Keep in mind that events are processed synchronously in the reduce stage, so views there should complete as quickly as possible to not slow down the whole pipeline. Expensive computations can be prepared by a compute view in the pool stage, which passes its results down to be combined in the reduce stage.
Views in the action stage will often set
feedback=True
on the view in order to preserve and propagate the result upstream the pipeline.
- Histogram
Use the HistogramView implementation that bins the view invocation results into histograms. It supports metropc.ViewOutput.VECTOR and metropc.ViewOutput.MATRIX with a number of additional configuration parameters depending on the output type. Must always run in the reduce stage.
- SortedVector
Use the SortedVectorView implementation that sorts and reduces vector values based on an additional scalar parameter, as a sort of hybrid vector/matrix histogram. Must always run in the reduce stage and automatically chooses the metropc.ViewOutput.MATRIX output type.
- LocalAverage
Use the LocalAverageView implementation that outputs the average of
N=10
view invocations at a time. Must always run in the reduce stage.
- GlobalAverage
Use the GlobalAverageView implementation that outputs the average of all view invocations every
output_every=10
view results. Must always run in the reduce stage.
- MovingAverage
Use the MovingAverageView implementation that outputs the moving average of the last N=10 view results every output_every=10 results. Must always run in the reduce stage.
- StepAverage
Use the StepAverageView implementation that outputs the average of all view invocations that occurred during an operator step at its end. Must always run in the reduce stage.
- StepStacked
Use the StepStackedView implementation that outputs the stacked result of the last view invocation in each operator step. An additional axis is prepended to the original view result shape. Must always run in the reduce stage at the end of a step (the step stage).
- Extremum
Use the ExtremumView implementation that outputs the view invocation result only if the passed callable op evaluates to True when passed the current value and the last value for which it evaluated to True. It expects the view to return two results: the actual result to be returned if op evaluates to True, and the value to compare with op. This may be used to always output the most extreme result, e.g. the one with the highest intensity until a result with an even higher intensity occurs. Must always run in the reduce stage.
- Maximum
An alias for Extremum using the greater-than operator, i.e. only outputting a result if it is numerically greater than all previous ones, e.g.
@View.Vector_Maximum
def brightest_shot(spectrum: 'argument_source'):
    return spectrum, spectrum.sum()
It is equivalent to
@View.Vector_Extremum(op=operator.gt)
.
- Minimum
An alias for Extremum using the less-than operator, i.e. only outputting a result if it is numerically less than all previous ones.
The combination of all keyword arguments applied by the used virtual properties yields the arguments to create the view object. The resulting mapping of arguments must not be ambiguous, i.e. contain more than one value for the same argument. The full declaration of the underlying @View decorator found below contains a few additional flags, which may only be used directly.
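The merging rule for virtual properties can be illustrated with a small stand-alone sketch. It does not use metropc: the VIRTUAL_PROPERTIES table, the values in it and the resolve_view_kwargs function are hypothetical names chosen for illustration, and the real decorator may resolve its properties differently. The sketch also assumes that two properties contributing the same value for an argument are not a conflict.

```python
# Hypothetical table: each virtual property contributes keyword arguments.
# Neither the table nor its values are taken from metropc itself.
VIRTUAL_PROPERTIES = {
    'Vector': {'output': 'VECTOR'},
    'Matrix': {'output': 'MATRIX'},
    'Reduce': {'stage': 'REDUCE'},
    'MovingAverage': {'view_impl': 'MovingAverageView', 'stage': 'REDUCE'},
}


def resolve_view_kwargs(combined_name, **explicit_kwargs):
    """Merge the keyword arguments of underscore-joined virtual properties."""
    sources = [(part, VIRTUAL_PROPERTIES[part])
               for part in combined_name.split('_')]
    sources.append(('keyword arguments', explicit_kwargs))

    merged = {}
    for origin, kwargs in sources:
        for key, value in kwargs.items():
            # Two different values for the same argument are ambiguous.
            if key in merged and merged[key] != value:
                raise ValueError(
                    f'ambiguous value for {key!r} introduced by {origin}')
            merged[key] = value

    return merged


print(resolve_view_kwargs('Vector_MovingAverage', N=25))
```

Combining two output types, e.g. resolve_view_kwargs('Vector_Matrix'), raises a ValueError in this sketch because the output argument would receive two different values.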
Data paths¶
All data flowing through the pipeline carries a data path to define its origin. It is used to declare where the data for a view argument should come from or identify the data frames in the ZMQ output stream. These are string values and follow the general schema <type>#<identifier>
, but the path type may often be omitted if unambiguous. The currently supported types are:
- Internal path
internal#
Provides access to various pipeline-internal values:
event_id
Event ID for the data currently being processed. This may be overwritten by a frontend-specific alias, e.g. train_id at European XFEL.
sequence_id
Strictly increasing ID indicating order of processing. May be -1 if not supplied by the frontend, i.e. without explicit sequencing of events.
ctx_version
Version of the current context, which is increased each time it is reconfigured.
context
Context object this view is currently being executed in.
stage_runner
metropc.stage.StageRunner object the context runs in.
identity
Unique identifier for the stage instance in the control network.
runtime_id
Unique identifier for the stage runner, e.g. process ID or thread ident.
- View path
view#
References the result of a view with its name being used as the identifier.
It supports pattern matching based on fnmatch to include the results of multiple and potentially unknown views. The matches may be passed as a list of view results or a dict mapping view names to their respective results by prepending the path with the respective pass method [by-index] or [by-name], e.g.
view#[by-index]path/to/*
If the pass method is omitted, [by-index] is chosen by default. The usage of wildcard characters requires explicit declaration of the path type view#, as only valid view names can be detected automatically.
- Symbolic path
sym#
Provides an alias for another (potentially longer) path. Symbols may be registered using the globally available
symbols(*args, **kwargs)
function.
- Karabo path
karabo#
(European XFEL only)
References a property (slow data) or pipeline (fast data) of a Karabo device running in the same topic and reachable by the processing device. As with the notation used within Karabo, it consists of the device ID, followed by a dot . for slow data or a colon : for fast data and the corresponding descriptor key, e.g.
SPB_IRU_AGIPD1M/MOTOR/Z_STEPPER.absolutePosition
SA3_XTD10_XGM/XGM/DOOCS:output
Pipeline data may be indexed to extract and pass around only a portion of its hash. This is strongly recommended in order to always restrict the input data to the contained numpy arrays, as this special case is significantly more efficient than serializing the hash itself. This index is specified via brackets [ ] following the pipeline name and uses dots . to navigate the hash hierarchy, e.g.
SA3_XTD10_XGM/XGM/DOOCS:output[data.intensityTD]
The pipeline key supports regular expressions as well as the simpler wildcard characters * (any characters) and ? (a single character), returning a list of matching results found in the hash, e.g.
SA3_XTD10_XGM/XGM/DOOCS:output[data.intensitySa?TD]
Some device pipelines with increased bandwidth requirements are not accessible directly, but may run in an isolated network with their data aggregator (DA) device. In that case, the data can be received by switching the corresponding DA to the monitoring state and connecting through its output pipeline, declared by this pipeline name following @, e.g.
SQS_DIGITIZER_UTC1/ADC/1:network[digitizers.channel_1_A.raw.samples]@SQS_DAQ_DATA/DA/2:output
Technically, the identifier before @ is treated as the source name to filter for, while the actual pipeline to connect to is listed after it. If the latter is omitted, both are assumed to be identical. This may also be used to distinguish between different source names sent over the same pipeline, e.g. the online calibration pipeline.
References to the device proxy object itself are not supported.
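The wildcard matching described for view paths can be sketched with Python's fnmatch module. This is a minimal illustration and not metropc's own resolver: the resolve_view_pattern function and the results dictionary are hypothetical, and sorting the matches by name is an assumption made here only to get a reproducible list order.

```python
from fnmatch import fnmatchcase


def resolve_view_pattern(path, results):
    """Resolve a 'view#[method]pattern' path against a dict of view results.

    Hypothetical helper for illustration, `results` maps view names to
    their most recent result.
    """
    path_type, _, identifier = path.partition('#')
    if path_type != 'view':
        raise ValueError('wildcards require the explicit view# path type')

    method = 'by-index'  # default pass method
    if identifier.startswith('['):
        method, _, identifier = identifier[1:].partition(']')

    # Assumption: matches are ordered by view name.
    matches = {name: value for name, value in sorted(results.items())
               if fnmatchcase(name, identifier)}

    if method == 'by-name':
        return matches
    if method == 'by-index':
        return list(matches.values())
    raise ValueError(f'unknown pass method {method!r}')


results = {'path/to/b': 2, 'path/to/a': 1, 'other': 3}
print(resolve_view_pattern('view#path/to/*', results))
print(resolve_view_pattern('view#[by-name]path/to/*', results))
```

Note that the * wildcard of fnmatch also matches the / separator, so in this sketch path/to/* includes views nested more deeply below path/to/.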
If a view argument has a default value but an input annotation, it is considered optional and the view is executed even if it is not present. Optional arguments may also not contain an annotation at all and are thus ignored by the pipeline. This is a useful pattern if views are defined in a loop and the loop variable is required in the function body. In most cases however, such a pattern is better served by using view prototypes or view groups described in the advanced chapter.
for channel in ['1_A', '1_B']:
    @View.Vector(name=f'raw/{channel}', channel=channel)
    def raw_signal(data: f'DEVICE:output[channel_{channel}]'):
        # channel may be used in this function body now.
        ...
Note that without this keyword argument, the function’s closure would always refer to the last value of channel, not the one during its iteration.
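The closure behaviour mentioned here is a general property of Python and can be reproduced without metropc. The sketch below compares a function that looks up the loop variable when it is called with one that binds it as a default argument at definition time. The function names are illustrative only, and how metropc itself binds the value is not shown.

```python
def make_views_late_binding():
    views = []
    for channel in ['1_A', '1_B']:
        def raw_signal():
            # `channel` is looked up when the function is called,
            # i.e. after the loop has finished.
            return channel
        views.append(raw_signal)
    return [view() for view in views]


def make_views_bound_default():
    views = []
    for channel in ['1_A', '1_B']:
        def raw_signal(channel=channel):
            # The default value was evaluated at definition time.
            return channel
        views.append(raw_signal)
    return [view() for view in views]


print(make_views_late_binding())   # ['1_B', '1_B']
print(make_views_bound_default())  # ['1_A', '1_B']
```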
Data coordinates and annotations¶
Any data returned by a view can be annotated by wrapping it into an xarray.DataArray, a labeled version of the regular numpy.ndarray. For example, this allows assigning arbitrary coordinates to its data content, adding axis labels or including graphical annotations in its visualization:
import xarray as xr

@View.Vector
def some_view(...):
    # Compute `y_spectrum` and calibration `x_mq`.

    # Wrap data in an xarray, set the y axis label and add two
    # labeled vertical lines.
    return xr.DataArray(
        y_spectrum,
        dims=['m/q'],
        coords={'m/q': x_mq},
        attrs={'ylabel': 'rel. int.',
               'vlines': {16/4: 'O4+', 16/5: 'O5+'}})
In general, the coordinates and their dimension names are directly used as plot axes and their labels. Please refer to the xarray documentation for more details. All builtin custom view implementations explained further below have full support for xarray.DataArray and will preserve its metadata, while some are actually built on it, for example to define their coordinate axes.
Some visualizations, e.g. a vector plot, may interpret the first dimension of 2D data as a collection of 1D plots and thus include this dimension as a plot legend:
@View.Vector
def combine_channels(channelA: '...', channelB: '...'):
    # np.stack expects a sequence of arrays as its first argument.
    return xr.DataArray(
        np.stack([channelA, channelB]),
        dims=['channel', 'samples'],
        coords={'channel': ['A', 'B']})
Additional annotations can be passed in the xarray.DataArray.attrs
mapping:
ylabel
Name of the Y axis for 1D plots.
vlines
Vertical annotation lines for 1D plots, either specified as an iterable of X coordinates or a map of X coordinates to labels, e.g. {16/4: 'O4+', 16/5: 'O5+'} as in the first example of this section.
hlines
Same as vlines, but for horizontal annotation lines specified by Y coordinates.
Custom view implementations¶
By default, a view is represented by metropc.core.View. This basic type directly forwards the input arguments to the view function and then returns its result. However, a different view implementation extending metropc.core.View may be used for additional features, e.g. returning the average result of all view invocations.
Histograms and binning¶
The metropc.builtin.HistogramView
type provides automatic binning of view results into vectors or matrices. The binned axis may either be predetermined or allowed to grow to encompass the generated data.
# No boundaries, explicit step size.
@View.Vector_Histogram(bin_step=2.0)
def growing_hist(...):
    # Do something.
    return array_with_points

# Fixed start, but no end, implicit bin_step=1.0
@View.Vector_Histogram(bin_min=-30)
def onesided_hist(...):
    ...

# Both ends fixed and step size fixed.
@View.Vector_Histogram(bin_min=30, bin_max=40, bin_step=0.5)
def twosided_hist(...):
    ...

# Both ends fixed, but number of bins instead of step size.
@View.Vector_Histogram(bin_min=-5, bin_max=5, bin_count=11)
def twosided_hist2(...):
    ...
For MATRIX/IMAGE view outputs, the corresponding bin_* arguments can be specified for either axis by prepending x_* and y_*. Please see the documentation of metropc.builtin.HistogramView.BinnedAxis.__init__() for further reference of these parameters.
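The idea of a binned axis that grows with the data, as in the growing_hist example, can be sketched in plain Python. This is not the HistogramView implementation: the GrowingHistogram class and its methods are hypothetical, and it only covers the case of a fixed bin_step without bin_min or bin_max.

```python
import math
from collections import Counter


class GrowingHistogram:
    """Accumulate values into equally wide bins without fixed boundaries.

    Hypothetical stand-in for illustration, not metropc's HistogramView.
    """

    def __init__(self, bin_step=1.0):
        self.bin_step = bin_step
        self.counts = Counter()  # bin index -> number of entries

    def add(self, values):
        for value in values:
            self.counts[math.floor(value / self.bin_step)] += 1

    def result(self):
        """Return (left bin edges, counts) covering all data seen so far."""
        if not self.counts:
            return [], []
        indices = range(min(self.counts), max(self.counts) + 1)
        edges = [index * self.bin_step for index in indices]
        # A Counter yields 0 for bins that never received an entry.
        return edges, [self.counts[index] for index in indices]


hist = GrowingHistogram(bin_step=2.0)
hist.add([0.5, 1.9, 4.1, -0.1])
print(hist.result())  # ([-2.0, 0.0, 2.0, 4.0], [1, 2, 0, 1])
```

Storing counts per bin index rather than in a fixed array is what lets the axis extend in either direction as new data arrives.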
The metropc.builtin.SortedVectorView
is a hybrid between vector and matrix histogram. It sorts vector results based on a corresponding scalar value into bins, reducing the vector value by one of several reduction methods, e.g. averaging.
Averages¶
There are view implementations provided for three different kinds of averages:
- Local average via metropc.builtin.LocalAverageView, computing and then returning the average for a fixed number of view results.
- Global average via metropc.builtin.GlobalAverageView, computing the average until view buffers are cleared explicitly and returning the average periodically.
- Moving average via metropc.builtin.MovingAverageView, computing the average of a rolling buffer of view results, i.e. removing the oldest entry when inserting a new one, and returning the average periodically.
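The difference between the three kinds of averages can be made concrete with a stand-alone sketch operating on scalar values. The classes below are hypothetical and only mimic the behaviour described in the list; the parameter names N and output_every are borrowed from the virtual properties, while everything else, including returning None when nothing is emitted, is an assumption.

```python
from collections import deque


class LocalAverage:
    """Average N results, emit the average, then start over."""

    def __init__(self, N=10):
        self.N = N
        self.buffer = []

    def add(self, value):
        self.buffer.append(value)
        if len(self.buffer) < self.N:
            return None
        average = sum(self.buffer) / self.N
        self.buffer.clear()
        return average


class GlobalAverage:
    """Average all results so far, emitting every `output_every` results."""

    def __init__(self, output_every=10):
        self.output_every = output_every
        self.total = 0.0
        self.count = 0

    def add(self, value):
        self.total += value
        self.count += 1
        if self.count % self.output_every == 0:
            return self.total / self.count
        return None


class MovingAverage:
    """Average the last N results, emitting every `output_every` results."""

    def __init__(self, N=10, output_every=10):
        self.buffer = deque(maxlen=N)  # drops the oldest entry when full
        self.output_every = output_every
        self.count = 0

    def add(self, value):
        self.buffer.append(value)
        self.count += 1
        if self.count % self.output_every == 0:
            return sum(self.buffer) / len(self.buffer)
        return None
```

Feeding the values 1, 3, 5, 7 into LocalAverage(N=2) emits 2.0 and 6.0, whereas GlobalAverage(output_every=2) emits 2.0 and 4.0 because it never discards earlier results.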
Extrema¶
metropc.builtin.ExtremumView returns a result only if it exceeds all previous results. Two results are expected from its view function: the result to be returned if the extremum condition succeeds, and a comparable value.
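A minimal sketch of this selection logic, independent of metropc, is given below. The Extremum class is hypothetical; in particular, accepting the very first result unconditionally is an assumption, as the behaviour before any value has been accepted is not specified here.

```python
import operator


class Extremum:
    """Emit a result only if `op(current, last_accepted)` is true.

    Hypothetical stand-in for illustration, not metropc's ExtremumView.
    """

    _unset = object()

    def __init__(self, op):
        self.op = op
        self.last = self._unset

    def add(self, result, value):
        # Assumption: the first result is always accepted.
        if self.last is self._unset or self.op(value, self.last):
            self.last = value
            return result
        return None


maximum = Extremum(operator.gt)
spectra = [[1, 2], [5, 5], [2, 2], [6, 6]]
outputs = [maximum.add(spectrum, sum(spectrum)) for spectrum in spectra]
print(outputs)  # [[1, 2], [5, 5], None, [6, 6]]
```

Passing operator.lt instead of operator.gt yields the behaviour described for the Minimum alias.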
Step interaction¶
Stacked step results are available via metropc.builtin.StepStackedView and the average of all results in a step via metropc.builtin.StepAverageView. Both are mostly superseded by automatic binning/sorting.
View prototypes¶
A view prototype appears identical to a regular view, but does not actually create a view when it is encountered. Instead, it serves as a blueprint for views and may be instantiated multiple times later to an actual view, e.g.:
@ViewPrototype
def proto(...):
    ...

# Instantiate the prototype into an actual view.
instance = proto(...)
Prototypes may be defined in other Python modules and imported. This is NOT possible with actual views, as their definition is a stateful change with side effects on the current context. View prototypes in contrast do not belong to any context until they are instantiated to a view.
Any view property may be overwritten upon instantiation. In addition, the prototype’s name and argument annotations may contain further formatting fields, which can (and should) be filled when the prototype is instantiated. However, it is illegal to use any keyword parameter of View.__init__ as such a field name. These additional fields are then also injected into the global scope of the final view’s kernel.
If no name is specified either on the generic prototype or the instantiated view, a default name is chosen based on the kernel’s name and an increasing index.
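How formatting fields in a name and in argument annotations might be filled on instantiation can be sketched as follows. This is an assumption-laden illustration rather than the ViewPrototype implementation: the PrototypeSketch class, the use of str.format for the fields and the exact form of the default name are all hypothetical.

```python
import itertools


class PrototypeSketch:
    """Blueprint holding a name and annotations with formatting fields.

    Hypothetical stand-in for illustration, not metropc's ViewPrototype.
    """

    _index = itertools.count()

    def __init__(self, kernel, name=None, annotations=None):
        self.kernel = kernel
        self.name = name
        self.annotations = annotations or {}

    def instantiate(self, name=None, **fields):
        name = name or self.name
        if name is None:
            # Assumed default: kernel name plus an increasing index.
            name = f'{self.kernel.__name__}_{next(self._index)}'
        return {
            'name': name.format(**fields),
            'annotations': {arg: source.format(**fields)
                            for arg, source in self.annotations.items()},
        }


def raw_signal(data):
    return data


proto = PrototypeSketch(
    raw_signal, name='raw/{channel}',
    annotations={'data': 'DEVICE:output[channel_{channel}]'})

view_a = proto.instantiate(channel='1_A')
view_b = proto.instantiate(channel='1_B')
print(view_a['name'], view_a['annotations'])
```

The same blueprint yields one description per channel, which is the pattern that otherwise requires defining views in a loop.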
View groups¶
A view group is an abstract collection of views, view prototypes and parameters, which may be instantiated all together into their actual counterparts:
class Component(ViewGroup):
    signal_region: Parameter = slice(0, None)

    @View
    def raw(...):
        ...

    @View
    def averaged(...):
        ...

    @ViewPrototype
    def reference(...):
        ...

device = Component(prefix='device/')
A view group introduces a set of formatting fields common to all its items, which may be used in any of their names or argument annotations similar to a view prototype.
For views, a group may be seen as a set of prototypes instantiated at the same time. For view prototypes, it introduces an additional prototyping layer, i.e. first the group is instantiated and then any prototype from the resulting group instance. For parameters, they are added to the parameter table with a prefix drawn from a view group field passed to the Parameter
annotation, 'prefix'
by default. This causes this field to be obligatory when instantiating the view group.
Its main application is the modularization of common functionality, e.g. a set of views needed as baseline for a particular component. As with view prototypes, a view group is independent of any context object and may be imported from external modules.
When subclassing a view group, it is required to call its initializer in order to actually concretize its items and pass all formatting fields required to resolve names or annotations.
Parameters¶
The context can define parameters, which may be changed at runtime on the frontend side. On the context side, they are injected as symbols with that name into the global scope. When a value is changed on the frontend side, the corresponding global variable is updated, too.
The supported types are:
- int, float, bool, str
- slice
- numpy.ndarray for numeric dtypes
- list/tuple of str
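The injection of parameters into the global scope can be illustrated with plain Python. The sketch below uses a dictionary as a stand-in for the global scope of a context file; define_parameters and set_parameter are hypothetical helpers, not metropc functions, and only demonstrate why a function picks up a changed value on its next call.

```python
# Stand-in for the global scope of a context file.
context_globals = {}

context_source = '''
def above_threshold(values):
    # `threshold` is resolved in the global scope on every call.
    return [value for value in values if value > threshold]
'''


def define_parameters(scope, **parameters):
    """Inject parameters as symbols into the given global scope."""
    scope.update(parameters)


def set_parameter(scope, name, value):
    """Update a parameter, as a frontend might do at runtime."""
    if name not in scope:
        raise KeyError(f'unknown parameter {name!r}')
    scope[name] = value


define_parameters(context_globals, threshold=2)
exec(context_source, context_globals)
view = context_globals['above_threshold']

print(view([1, 2, 3, 4]))  # [3, 4]
set_parameter(context_globals, 'threshold', 3)
print(view([1, 2, 3, 4]))  # [4]
```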
Importing other modules¶
Context code can import other Python modules, which in turn may define views and other metropc objects. As these may be stateful changes, there are a few semantic differences to Python’s regular import system.
- If loaded from a file, the parent directory of the context file is added to the interpreter’s search path for modules.
- If loaded from a file and a loaded module is under the context file’s parent, it is automatically reloaded whenever a new context is created.
- Regular views may hence only be used in modules local to a context file, i.e. located in a file in the same or lower directory than the context file. In any other case, the module may not be reloaded automatically, causing stateful changes to not occur. Such a module may only define view prototypes or groups.
To gain access to the global symbols regularly available in a context file, you can import from the metropc.context
API package, e.g.
from metropc.context import *
Global symbols¶
In a context file, a number of global symbols are always accessible. Note that not all of these symbols are available in imported modules at all times.
- @View(name=None, label=None, output=ViewOutput.VECTOR, stage=ViewStage.POOL, view_impl=View, postprocessing=None, hidden=False, feedback=False, transient=True, skip_empty=True)¶
Declare the decorated function or generator as a view.
This decorator may be used without any arguments, i.e. @View, with keyword arguments @View(rank=1) or using virtual properties @View.Vector as described above.
Parameters:
- name (str) – Optional name to use instead of the callable’s name, the callable’s name is used by default.
- label (str) – Optional label to use for display purposes, the view’s name is used by default.
- output (metropc.ViewOutput) – Expected output for this view, which may be a generic type or a hinted type with further information about how it should be visualized.
- stage (metropc.ViewStage) – Point of execution for this view, which may be in the event worker pool (POOL, default), when reducing the pool’s results (REDUCE), at the end of each step (STEP) or upon request (ACTION).
- view_impl (type) – Class to instantiate the new view object from. A custom view implementation may be used to transparently implement custom view behaviour, e.g. returning the averaged result across several invocations. There are a number of builtin view implementations in addition to the regular View class, which are typically selected via custom properties.
- postprocessing (Callable) – Optional callable applied to any data returned from this view.
- hidden (bool) – Whether this view is hidden in the pipeline index, False by default.
- feedback (bool) – Whether the results of this view are sent upstream to the pool stage, False by default. This may be used to share data generated in a view across the whole pipeline, e.g. a background image. Note that data upstreamed in this way is no longer bound to a specific event! Instead, the most current result is used until updated. The feedback process may take one or more events to reach each node in each stage.
- transient (bool) – Whether the results of this view should be recorded on disk by a client, False by default. This is currently not implemented by any client implementation and may become deprecated.
- overwrite – Whether the view may overwrite a view existing under the same name, False by default. If disabled, defining a view with the same name raises an exception.
- skip_empty (bool) – Whether to automatically skip event execution if one or more arguments are considered empty, True by default.
- @Operator¶
Declare the decorated asynchronous generator as an operator.
- parameters(*args, **kwargs)¶
Define one or more parameters.
A single dictionary args and/or keyword arguments for each parameter may be passed.
- symbols(*args, **kwargs)¶
Define one or more symbolic paths.
A single dictionary args and/or keyword arguments for each symbolic path may be passed.
Be careful with defining your own global symbols in a context file. The code will run in many different stages asynchronously, such as the pool’s worker processes and the reduce stage, each with their own namespaces.
Internal context state¶
The metropc.core.Context
object, available via the global symbol ctx
(outside views) and internal#context
path (inside views) in context code, may be interacted with directly for configuration beyond the regular view machinery.
In principle, the view hierarchy can be manipulated on a very low level this way. Beware that the asynchronous nature of the pipeline distributed over isolated stages may break unexpectedly though, if the public interfaces are circumvented.
The most common use cases here are defining required metropc versions or features and creating virtual views, i.e. view paths not backed by an actual view. This can become useful for arbitrary counters not known at creation time or to define labels for parents of view paths.
# Require metropc version 1.3 or higher.
ctx.require_version('1.3')

@View(name='path/to/my/view')
def analyse(ctx: 'internal#context', ...):
    # Create a virtual name path/to/my/counter with counts 1.
    ctx.increase_view_counter('path/to/my/counter', +1)

# Give the immediate parent path of the view above a label and docstring.
ctx.set_view_docs('path/to/my', 'View\'s parent',
                  'The parent of my view, which is the path above it.')
Operators¶
An operator is a coroutine running on the frontend side, which may mark the beginning and end of steps. View objects can react to these boundaries and perform operations based on them, such as computing the average of all events that occurred within a step. This may be used to group results as a function of external factors such as the value of a particular view. In many cases, this may also be possible by using the metropc.builtin.SortedVectorView view implementation.
This is an advanced feature and requires tight integration with the frontend. A repository of pre-defined operators for common tasks such as motor position is available for some frontends. Please contact the data analysis team (da-support@xfel.eu) for support.