Troubleshooting¶
How to use Command Line Interface¶
The Karabo devices working with detectors developed at Paul Scherrer Institut make use of the control software developed there, called SLS Detector Software [1]. This software package allows the detectors to be operated from the Command Line Interface (CLI). For all the detectors installed at EuXFEL, it is always possible to use the CLI in parallel to the Karabo devices, or in their place if the CONTROL device has issues. In order to do so, however, xctrl access to the Host running the CONTROL device is needed.
In order to run multiple CONTROL device instances on the same Host, every time a CONTROL device is instantiated it reserves a different segment of the shared memory and identifies it with a randomly generated DETECTOR_ID; to use the CLI in parallel to Karabo it is therefore necessary to know the last DETECTOR_ID generated. To find it, look in the /dev/shm folder on the Host:
ssh as xctrl to the Host and run:
source karabo/activate
go into the /dev/shm folder and run:
ls -lrth
You should see a list of entries like the ones shown in figure label_dev_shm. We are interested in the last entries of this list and in the multi-digit number before the word ‘_module’: this is the DETECTOR_ID that is needed (‘657808900’ in the example). The second integer number is the MODULE_ID, which is relevant for multi-module detectors.
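The lookup described above can be sketched as a small shell helper. This is only an illustration: the function name `latest_detector_id` is hypothetical, and it assumes that the entry names end in `<DETECTOR_ID>_module_<MODULE_ID>`, as described in the text; whatever precedes the DETECTOR_ID is not relied upon.

```shell
# Sketch: print the DETECTOR_ID of the most recently modified shared-memory
# entry. Assumption: entry names end in "<DETECTOR_ID>_module_<MODULE_ID>".
latest_detector_id() {
    # $1: directory to scan (defaults to /dev/shm)
    # ls -t lists the newest entry first; sed keeps only names that match
    # the assumed pattern and prints the DETECTOR_ID part.
    ls -t "${1:-/dev/shm}" 2>/dev/null |
        sed -n 's/^.*_\([0-9][0-9]*\)_module_[0-9][0-9]*$/\1/p' |
        head -n 1
}
```

Calling `latest_detector_id` without arguments on the Host should print the number to use as DETECTOR_ID; it is still worth comparing it with the `ls -lrth` listing before sending commands.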
At this point, the syntax to send a command is
sls_detector_put DETECTOR_ID-MODULE_ID: <command>
For example, if we want to stop the acquisition of the detector in figure label_dev_shm, we will run:
sls_detector_put 657808900-0: stop
and for the slave:
sls_detector_put 657808900-1: stop
Similarly, to retrieve a parameter value:
sls_detector_get DETECTOR_ID-MODULE_ID: <parameter>
Continuing the above example, if we want to retrieve the number of frames set:
sls_detector_get 657808900-0: frames
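Since a mistyped ID would address the wrong shared-memory segment, it can help to build the `DETECTOR_ID-MODULE_ID:` prefix with a small helper. The following is a minimal sketch; `sls_target` is a hypothetical name, not part of the SLS Detector Software, and it only validates and formats the two numbers in the form shown above.

```shell
# Sketch: build the "DETECTOR_ID-MODULE_ID:" prefix used in the commands above.
# sls_target is a hypothetical helper, not part of the SLS Detector Software.
sls_target() {
    # $1: DETECTOR_ID, $2: MODULE_ID (defaults to 0)
    for v in "$1" "${2:-0}"; do
        case "$v" in
            ''|*[!0-9]*)
                echo "sls_target: IDs must be non-negative integers" >&2
                return 1
                ;;
        esac
    done
    printf '%s-%s:\n' "$1" "${2:-0}"
}
```

With this helper, the last example could be written as `sls_detector_get "$(sls_target 657808900 0)" frames`.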
Note: the CONTROL device clears the shared memory when it is shut down, so the procedure above is needed only to work in parallel to the CONTROL device (for example, to send commands not exposed in Karabo), or when the CONTROL device is in ERROR or unresponsive, and in that case before shutting it down.
How to update firmware and server¶
In order to upgrade the firmware version, the CLI needs to be used. The slsDetectorSoftware client must have the same version as the server running on the detector to be upgraded.
For server versions above v6.1.1, and for a detector running mutually compatible versions of firmware and software, a description is given on the PSI FW_Upgrade page. That description, however, assumes expert users.
A more detailed description is given here:
Once the client has been initialized, clean the shared memory with:
sls_detector_get free
Set up the network:
sls_detector_put hostname MODULE-HOSTNAME
In order to update both server version and firmware, run:
sls_detector_put update gotthard2DetectorServervxxx xxx.rbf
Otherwise, to just update the firmware, run:
sls_detector_put programfpga xxx.rbf
where gotthard2DetectorServervxxx and xxx.rbf indicate respectively the server file and the .rbf file containing the firmware upgrade instructions, which should be located in the directory where the command is being launched. The procedure will last a few minutes, and progress will be communicated on the command line output.
If the detector is a 25 \(\mu m\) version, the update must be performed for both half-modules.
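Because both files have to be in the directory from which the command is launched, a quick check before starting can avoid a failed update. The sketch below is an assumption-laden illustration: `check_update_files` is a hypothetical helper, and the file names are whatever the caller passes in.

```shell
# Sketch: confirm that the server file and the firmware file are both in the
# current directory before launching the update.
# check_update_files is a hypothetical helper name.
check_update_files() {
    # $1: server file, $2: firmware .rbf file
    status=0
    for f in "$1" "$2"; do
        if [ ! -f "./$f" ]; then
            echo "missing in $(pwd): $f" >&2
            status=1
        fi
    done
    case "$2" in
        *.rbf) ;;
        *) echo "firmware file should end in .rbf: $2" >&2; status=1 ;;
    esac
    return "$status"
}
```

For example, `check_update_files gotthard2DetectorServervxxx xxx.rbf && echo "ready to update"` prints the message only when both files are present.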
Older client versions or incompatible SW and FW versions¶
In this case, many of the steps that are hidden in the update command must be done explicitly. The following instructions assume that the module has automatic server restart.
Copy the gotthard2DetectorServervxxx file to the module via:
scp gotthard2DetectorServervxxx root@MODULE-HOSTNAME:~/
Log in to the module via:
ssh root@MODULE-HOSTNAME
Subsequently, from there:
Give the server run privileges:
chmod 775 gotthard2DetectorServervxxx
Create a symbolic link:
ln -sf gotthard2DetectorServervxxx gotthard2DetectorServer
Create in the root directory a file named ‘update.txt’; this will allow the server to start in update mode at the following start, ignoring the checksums that verify software and firmware compatibility.
Power cycle the detector.
Now, from the client host, clean the shared memory:
sls_detector_get free
After restart, set up the network:
sls_detector_put hostname MODULE-HOSTNAME
For safety reasons, power off the chips:
sls_detector_put powerchip 0
Update the firmware:
sls_detector_put programfpga xxx.rbf
Remove the update mode:
sls_detector_put updatemode 0
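The client-side part of this procedure (after the power cycle) can be collected in one place so that the order is not lost. The sketch below only prints the commands and executes nothing; `manual_fw_update_plan` is a hypothetical helper name, and the hostname and file name are supplied by the caller.

```shell
# Sketch: print, in order, the client-side commands of the manual procedure.
# Nothing is executed here; review the output and run the lines by hand.
# manual_fw_update_plan is a hypothetical helper name.
manual_fw_update_plan() {
    # $1: module hostname, $2: firmware .rbf file
    cat <<EOF
sls_detector_get free
sls_detector_put hostname $1
sls_detector_put powerchip 0
sls_detector_put programfpga $2
sls_detector_put updatemode 0
EOF
}
```

For instance, `manual_fw_update_plan MODULE-HOSTNAME xxx.rbf` prints the five commands with the placeholders filled in.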
Karabo CONTROL device down or stuck¶
If the Karabo CONTROL device goes down or is otherwise unresponsive while the detector is in ACQUIRE state, shutting it down and trying to instantiate it again will result in a failure to connect, because the TCP port to which all the instructions from the server to the RECEIVER are sent is busy with data transfer. To properly restart the device, it is therefore necessary to stop the acquisition before instantiating the CONTROL device again.
There are three possible methods, which are listed below.
Killing the gotthard2DetectorServer¶
This method is usable by anyone who has access to the control network. It consists of killing the server running on the detector, in this way interrupting any operation currently ongoing. To do this, after shutting down the CONTROL device, follow these steps:
In the Karabo CONTROL device under the key ‘Detector Hostname’ the alias(es) of the readout board(s) are listed: one name for a 50 \(\mu m\) device, two names for a 25 \(\mu m\) model, e.g. ‘hostname1’, ‘hostname2’;
ssh onto each one of these devices with:
ssh root@hostname1
run:
killall gotthard2DetectorServer
This will kill all the gotthard2DetectorServer processes, but since they are configured for automatic respawn, they should restart immediately. Run:
ps ax
to check if the processes are running again; an example of a healthy detector module is shown in label_ps_ax.
After this has been done on all the relevant boards, the CONTROL device can be instantiated again.
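The respawn check can also be done from the control network without reading the full process list by eye. This is a minimal sketch: `server_running` is a hypothetical helper that only looks for the server name in a process listing read from standard input.

```shell
# Sketch: succeed if a process listing read from stdin mentions the server.
# server_running is a hypothetical helper name.
server_running() {
    grep -q 'gotthard2DetectorServer'
}
```

A possible use is `ssh root@hostname1 ps ax | server_running && echo "server is back"`, repeated for each readout board.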
Stopping the acquisition from command line¶
This is the cleanest and safest method, but it is the most elaborate and it requires xctrl access to the host where the Karabo CONTROL device was running. You can see its name in the ‘Host’ key in the device. Before shutting down the CONTROL device:
connect to the Host and find the DETECTOR_ID as explained in How to use Command Line Interface;
initialize the local Karabo environment if not done already;
from command line run:
sls_detector_put DETECTOR_ID-0: stop
sls_detector_put DETECTOR_ID-0: rx_stop
sls_detector_put DETECTOR_ID-0: clearbusy
sls_detector_put DETECTOR_ID-0: highvoltage 0
Repeat this for the slave module by replacing ‘-0’ with ‘-1’ in the commands above, if there are two modules, i.e. if the detector is a 25 \(\mu m\) model.
Afterwards, the CONTROL device can be shut down and instantiated again.
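For a two-module detector the same four commands have to be typed twice, so it may be convenient to generate them. The sketch below prints the stop sequence for every module of one detector; `stop_sequence` is a hypothetical helper name, and it only prints the command lines in the form used in this section without running them.

```shell
# Sketch: print the stop sequence for every module of one detector.
# stop_sequence is a hypothetical helper name; nothing is executed.
stop_sequence() {
    # $1: DETECTOR_ID, $2: number of modules (defaults to 1)
    m=0
    while [ "$m" -lt "${2:-1}" ]; do
        for cmd in "stop" "rx_stop" "clearbusy" "highvoltage 0"; do
            printf 'sls_detector_put %s-%s: %s\n' "$1" "$m" "$cmd"
        done
        m=$((m + 1))
    done
}
```

For example, `stop_sequence 657808900 2` prints eight lines, first for module 0 and then for module 1; after checking them, they can be pasted into the shell on the Host.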
Last resort¶
If the two methods above fail, the last resort is to power cycle the detector by removing the enable. The firmware currently installed should power down the device in a safe way (provided the LV stays up long enough, i.e. 15 to 30 s), that is, removing the HV first and only afterwards powering down the ASICs and the electronics. The safety of this procedure, however, has not been thoroughly tested yet; it should be used only as a last resort and not as regular practice.
Emergency shutdown¶
In order to put the detector into a safe mode, it is useful to power it down. In this case, it is important that the power down procedure is performed correctly: removing the LV before or while powering down the sensor is dangerous for the ASIC and should never be done. As stated in the `Power`_ section, the correct power down procedure is:
- remove the HV; if using the built-in HV module, set High Voltage to zero, then wait for 10 - 15 seconds;
- remove the enable; at this point the firmware will start the power down of the electronics, starting from the ASICs, and it will be completed in a few seconds (the current will rapidly drop to a value close to zero);
- removing the LV is not necessary; it is, however, suggested in order to put the detector in a safe condition when it will not be used for long periods of time (e.g. it is not necessary to remove it while power cycling).
After the detector is correctly powered down, it may be useful to put back the protective cover on it. It is also safe to disconnect the detector from the LV if there is the possibility of an uncontrolled delivery of LV and enable.
Power glitches¶
Power glitches that compromise the stability of the LV supply can trigger an unsafe power down by removing the LV while the HV is still on. To avoid this, the firmware is equipped with a safe power down procedure which, if the enable signal is suddenly removed, powers down HV, ASICs and electronics, in this order. For this to work, it is important that the LV (and the LV only, not the enable) is connected to a UPS system that provides at least 30 s of supply in case of emergency:
- for the custom power supply built internally at EuXFEL this is already the case;
- for the MPODs this is foreseen, but not yet implemented;
- for the power cord this is of course not implemented at all.
In any case, it is important to make sure that when power is back the detector does not get powered in an uncontrolled manner. To do this:
- after the safe power down, shut down the CONTROL and RECEIVER devices;
- check that the HV is set to zero; if power is provided through the built-in power module, make sure, when the Karabo CONTROL device is available again, that its property High Voltage is set equal to zero;
- check that LV and enable signal are not accidentally delivered in an uncontrolled way:
  - while using the power cord, switch the button to the OFF position;
  - while using Karabo-controlled power supplies, when the Karabo devices are available again, make sure that the LV and enable channels are OFF;
- instantiate the CONTROL and RECEIVER device(s); the CONTROL device should go into the UNKNOWN state;
- when ready, follow the power up procedure detailed in `Power`_.
Footnotes
[1] https://slsdetectorgroup.github.io/devdoc/