From f7bdc2acff3c13a6d632c28c4569690ab106eed7 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Picca=20Fr=C3=A9d=C3=A9ric-Emmanuel?= Date: Fri, 18 Aug 2017 14:48:52 +0200 Subject: Import Upstream version 0.5.0+dfsg --- doc/source/Tutorials/specfile_to_hdf5.rst | 323 ++++++++++++++++++++++++++++++ 1 file changed, 323 insertions(+) create mode 100644 doc/source/Tutorials/specfile_to_hdf5.rst (limited to 'doc/source/Tutorials/specfile_to_hdf5.rst') diff --git a/doc/source/Tutorials/specfile_to_hdf5.rst b/doc/source/Tutorials/specfile_to_hdf5.rst new file mode 100644 index 0000000..31f8383 --- /dev/null +++ b/doc/source/Tutorials/specfile_to_hdf5.rst @@ -0,0 +1,323 @@ + +SpecFile as HDF5 +================ + +Introduction to SPEC data files +------------------------------- + +SPEC data files are ASCII files. +They contain two general types of block of lines: + + - header lines starting with a ``#`` immediately followed by one or more characters + identifying the information that follows + - data lines + +Header lines +++++++++++++ + +There are two types of headers. The first type is the *file header*. File headers always start +with a ``#F`` line. +The metadata stored in a file header applies to all the content of the data file, until a +new file header is encountered. There can be more than one file header, but a file with +multiple headers can be treated as multiple SPEC files concatenated into a single one. +File headers are sometimes missing. + +A file header contains general information: + + - ``#F`` - file name + - ``#E`` - epoch + - ``#D`` - file time and date + - ``#C`` - First comment (SPEC title, SPEC user) + - ``#O`` - Motor names (separated by at least two blank spaces) + +The second type of header is the *scan header*. A scan header must start with a ``#S`` line +and must be preceded by an empty line. This also applies to files without file headers: in +such a case, the file must start with an empty line. 
+The metadata stored in scan headers applies to a single block of data lines. + +A scan header contains the following information: + + - ``#S`` - Mandatory first line showing the scan number and the + command that was used to record the scan + - ``#D`` - scan time and date + - ``#Q`` - *H, K, L* values + - ``#P`` - Motor positions (corresponding motor names are in file header ``#O``) + - ``#N`` - Number of data columns in the following data block + - ``#L`` - Column labels (``#N`` labels separated by two blank spaces) + +Users can also define their own types of header lines in their macros. + +There can sometimes be a block of scan header lines after a data block, but before the ``#S`` of the next +scan. + +Data lines +++++++++++ + +Data blocks are structured as 2D arrays. Each line contains ``#N`` values, each value +corresponding to the label with the same position in the ``#L`` scan header line. +This implies that each column corresponds to one series of measurements. + +A column typically contains motor positions for a given positioner, a timestamp or the measurement +of a sensor. + +MCA data +++++++++ + +Newer SPEC files can also contain multichannel analyser data, interleaved between *normal* data lines. +A multichannel analyser records multiple values per single measurement. +This is typically a histogram of number of counts against channels (*MCA spectrum*), used to analyze the energy distribution +of a process. + +SPEC data files containing MCA data have additional scan header lines: + + - ``#@MCA %16C`` - a spectrum will usually extend for more than one line. + ``%16C`` indicates 16 values per line. + - ``#@CHANN`` - contains 4 values: + + - the number of channels per spectrum + - the first channel number + - the last channel number + - the increment between two channel numbers (usually 1) + - ``#@CALIB`` - 3 polynomial calibration values a, b, c. ( i.e. 
energy = a + b * channel + c * channel ^ 2) + - ``#@CTIME`` - 3 values: preset time, live time, elapsed time + +The actual MCA data for a single spectrum usually spans multiple lines. +A spectrum starts on a new line with ``@A``, and when it spans multiple lines, all +lines except the last one end with a continuation character ``\``. + +Example of SPEC file +++++++++++++++++++++ + +Example of file header:: + + #F ./data/binary_mixtures_mca1.100211 + #E 1295362398 + #D Thu Feb 10 22:43:43 2011 + #C id10b User = opid10 + #O0 delta gamma omega theta mu sigma sigmat xt + #O1 zt zt1 thd chid rhod xd yd zd + #O2 att0 arcf zf PhiD phigH chigH ygH + #O3 zgH phigV chigV xgV ygV zgV gslithg gslitho + #O4 gslitvo gslitvg slit1T slit1B slit1F slit1R slit1hg slit1ho + #O5 slit1vg slit1vo s0T s0B s0R s0F + #O6 s0hg s0ho s0vg s0vo TRT + #O7 pi trough hv1 mpxthl apdwin apdthl apdhv xcrl2 + #O8 thcrl2 zcrl2 picou picod vdrift vmulti vglo vghi + #O9 rien + +Example of scan and data block, without MCA:: + + #S 30 ascan tz3 29.35 29.75 100 0.5 + #D Sat Oct 31 15:43:21 1998 + #T 0.5 (Seconds) + #G0 0 + #G1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 + #G2 0 + #Q + #P0 40.135381 40.262001 65.6 70 35 -1.83 0 -36.1 + #P1 0 0 -1.98 0 0 35.6 86.2 -29.5 + #P2 3.0688882 24.893749 295.98749 28 -27.249938 + #N 22 + #L TZ3 Epoch Seconds If2 If3 If5 If6 If7 If8 I0 It ItdI0 If1dI0 If2dI0 If3dI0 If4dI0 If5dI0 If6dI0 If7dI0 If8dI0 If1 If4 + 29.35 45246 0.000264 478 302 206 201 209 264 177860 646 0.00363207 0.00468346 0.00268751 0.00169796 0.00146745 0.00115821 0.0011301 0.00117508 0.00148431 833 261 + 29.353976 45249 0.000295 549 330 219 208 227 295 178021 684 0.00384224 0.00537577 0.00308391 0.00185371 0.00158408 0.00123019 0.0011684 0.00127513 0.00165711 957 282 + 29.357952 45251 0.000313 604 368 231 215 229 313 178166 686 0.00385034 0.00603931 0.0033901 0.00206549 0.00166698 0.00129654 0.00120674 0.00128532 0.00175679 1076 297 + 29.362028 45253 0.000333 671 390 237 226 
236 333 178387 672 0.00376709 0.00683346 0.00376148 0.00218626 0.00176582 0.00132857 0.00126691 0.00132297 0.00186673 1219 315 + 29.366004 45256 0.000343 734 419 248 229 236 343 178082 664 0.00372862 0.00765939 0.0041217 0.00235285 0.00185308 0.00139262 0.00128592 0.00132523 0.00192608 1364 330 + 29.36998 45258 0.00036 847 448 254 229 248 360 178342 668 0.00374561 0.00857342 0.0047493 0.00251203 0.00194009 0.00142423 0.00128405 0.00139059 0.00201859 1529 346 + +Synthetic example of file with 3 scans. The last scan includes data of 3 multichannel analysers, sharing the +same MCA header. + +:: + + #F /tmp/sf.dat + #E 1455180875 + #D Thu Feb 11 09:54:35 2016 + #C imaging User = opid17 + #O0 Pslit HGap MRTSlit UP MRTSlit DOWN + #O1 Sslit1 VOff Sslit1 HOff Sslit1 VGap + #o0 pshg mrtu mrtd + #o2 ss1vo ss1ho ss1vg + + #S 1 ascan ss1vo -4.55687 -0.556875 40 0.2 + #D Thu Feb 11 09:55:20 2016 + #T 0.2 (Seconds) + #P0 180.005 -0.66875 0.87125 + #P1 14.74255 16.197579 12.238283 + #N 3 + #L MRTSlit UP second column 3rd_col + -1.23 5.89 8 + 8.478100E+01 5 1.56 + 3.14 2.73 -3.14 + 1.2 2.3 3.4 + + #S 25 ascan c3th 1.33245 1.52245 40 0.15 + #D Sat 2015/03/14 03:53:50 + #P0 80.005 -1.66875 1.87125 + #P1 4.74255 6.197579 2.238283 + #N 4 + #L column0 column1 col2 col3 + 0.0 0.1 0.2 0.3 + 1.0 1.1 1.2 1.3 + 2.0 2.1 2.2 2.3 + 3.0 3.1 3.2 3.3 + + #S 1 aaaaaa + #D Thu Feb 11 10:00:32 2016 + #@MCA %16C + #@CHANN 20 0 19 1 + #@CALIB 1.2 2.3 3.4 + #@CTIME 123.4 234.5 345.6 + #N 2 + #L uno duo + 1 2 + @A 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15\ + 16 17 18 19 + 3 4 + @A 0 0 2 4 15 10 5 1 0 0 0 0 1 0 0 0\ + 0 0 0 0 + 5 6 + @A 0 0 0 0 5 7 2 0 0 0 0 0 1 0 0 0\ + 0 0 0 1 + +Reading a SpecFile as an HDF5 file +---------------------------------- + +Introduction to the spech5 module ++++++++++++++++++++++++++++++++++ + +The *silx* module :mod:`silx.io.spech5` can be used to expose SPEC files in a hierarchical tree structure +and access them through an API that mimics the *h5py* Python library used to 
read HDF5 files. + +The structure exposed is as follows:: + + / + 1.1/ + title = "…" + start_time = "…" + instrument/ + specfile/ + file_header = ["…", "…", …] + scan_header = ["…", "…", …] + positioners/ + motor_name = value + … + mca_0/ + data = … + calibration = … + channels = … + preset_time = … + elapsed_time = … + live_time = … + + mca_1/ + … + … + measurement/ + colname0 = … + colname1 = … + … + mca_0/ + data -> /1.1/instrument/mca_0/data + info -> /1.1/instrument/mca_0/ + … + 2.1/ + … + +Scans appear as *Groups* at the root level. The name of a scan group is +made of two numbers separated by a dot, the first one being the *scan number* from the ``#S`` +header line, and the second one being the *scan order*. +If a scan number appears multiple times in a SPEC file, the scan order is incremented. +For example, the scan *3.2* designates the second occurrence of scan number 3 in a given file. + +Data is stored in the ``measurement`` subgroup, one dataset per column. The dataset name +is the column label as it appears on the ``#L`` header line. + +The ``instrument`` subgroup contains the following subgroups: + + - ``specfile`` - contains two datasets, ``file_header`` and ``scan_header``, + containing all header lines as a long string. Lines are delimited by the ``\n`` character. + - ``positioners`` - contains one dataset per motor (positioner), containing + either the single motor position from the ``#P`` header line, or a complete 1D array + of positions if the motor name corresponds to a data column (i.e. if the motor name + from the ``#O`` header line is identical to a label on the ``#L`` header line) + - one subgroup per MCA analyser/device containing a 2D ``data`` array with all spectra + recorded by this analyser, as well as datasets for the various MCA metadata + (``#@`` header lines). The first dimension of the ``data`` array corresponds to the number + of points and the second one to the spectrum length. 
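The scan naming rule described above (scan number from the ``#S`` line, then scan order) can be sketched in plain Python. This is only an illustration of the rule, not silx code: ``scan_keys`` is a hypothetical helper, and it assumes that the scan order simply counts occurrences of a scan number in file order.

```python
from collections import Counter


def scan_keys(scan_numbers):
    """Return "<scan number>.<scan order>" names for scans in file order.

    Hypothetical helper, not part of silx: it only mirrors the naming
    rule described in the text.
    """
    seen = Counter()
    keys = []
    for number in scan_numbers:
        # scan order = how many times this scan number has appeared so far
        seen[number] += 1
        keys.append("%d.%d" % (number, seen[number]))
    return keys


# Scan numbers taken from the "#S" lines of the synthetic three-scan file
print(scan_keys([1, 25, 1]))
```

Under that assumption, the synthetic file shown earlier (scans numbered 1, 25 and 1) would be exposed as groups ``1.1``, ``25.1`` and ``1.2``.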
+ + +In addition to the data columns, the ``measurement`` group contains one subgroup per MCA analyser/device +with links to the data already contained in ``instrument/mca_...``. + +spech5 examples ++++++++++++++++ + +Accessing groups and datasets: + +.. code-block:: python + + from silx.io.spech5 import SpecH5 + + # Open a SpecFile + sfh5 = SpecH5("test.dat") + + # using SpecH5 as a regular group to access scans + scan1group = sfh5["1.1"] # This retrieves scan 1.1 + scan1group = sfh5[0] # This retrieves the first scan, regardless of its number. + instrument_group = scan1group["instrument"] + + # alternative: full path access + measurement_group = sfh5["/1.1/measurement"] + + # accessing a scan data column by name as a 1D numpy array + data_array = measurement_group["Pslit HGap"] + + # accessing all mca-spectra for one MCA device as a 2D array + mca_0_spectra = measurement_group["mca_0/data"] + + +Files and groups are iterable, so you can loop over their members. + +.. code-block:: python + + from silx.io.spech5 import SpecH5 + + # get all column names (labels) in all scans in a file + for scan_group in SpecH5("test.dat"): + # keep only data columns: skip the mca_* subgroups (compare the last path component) + dataset_names = [item.name for item in scan_group["measurement"] if not + item.name.split("/")[-1].startswith("mca")] + print("Found labels in scan " + scan_group.name + " :") + print(", ".join(dataset_names)) + +Converting SPEC data to HDF5 +++++++++++++++++++++++++++++ + +The *silx* module :mod:`silx.io.spectoh5` can be used to convert a SPEC file into an +HDF5 file with the same structure as the one exposed by the :mod:`spech5` module. + +.. code-block:: python + + from silx.io.spectoh5 import convert + + convert("/home/pierre/myspecfile.dat", "myfile.h5") + + +You can then read the file with any HDF5 reader. + + +In addition to the function :func:`silx.io.spectoh5.convert`, which is simplified +on purpose, you can use the more flexible :func:`silx.io.spectoh5.write_spec_to_h5`. + +This way, you can choose to write scans into a specific HDF5 group in the output file. 
+You can also decide whether you want to overwrite an existing file, or append data to it. +You can specify whether existing data with the same name as input data should be overwritten +or ignored. + +This allows you to repeatedly transfer new content of a SPEC file to an existing HDF5 file, in between +two scans. + +The following script is an example of a command line interface to :func:`write_spec_to_h5`. + +.. literalinclude:: ../../../examples/spectoh5.py + :lines: 42- + -- cgit v1.2.3