Machine Learning-based Spectra Classification

Nearly 120 IT professionals, scientists and managers from the Photon and Neutron (PaN) community attended the 2nd European PaN EOSC Symposium organised jointly by PaNOSC and ExPaNDS on 26th October 2021. The second part of the first session focused on a selection of use cases relating to some of the tools and services developed in the EOSC projects, for FAIR data catalogues, data analysis and simulation.

Here is the fifth use case. The presentation starts 1h42m49s in.

Spectroscopy techniques are widely used and produce huge amounts of data, especially at facilities with very high repetition rates. At the European XFEL, X-ray pulses can be generated with a separation of only 220 ns, up to a maximum of 27,000 pulses per second. In experiments at the European XFEL (e.g. SCS, FXE, MID and HED), spectral changes can indicate a change in the system under investigation and thus the progress of the experiment. Immediate feedback on the current status (e.g. the time-resolved state of the sample) is essential for quickly judging how to proceed with the experiment. The major spectral changes that we aim to capture are either a change in the intensity distribution (e.g. the drop or appearance of peaks at certain locations) or a shift of those peaks along the spectrum.
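As a rough back-of-the-envelope check, the two figures quoted above can be combined to see what they imply about the pulse pattern. This is a minimal sketch that uses only those two numbers; the variable names are illustrative and nothing here is taken from facility documentation.

```python
# Figures quoted in the text above.
pulse_spacing_s = 220e-9      # minimum separation between pulses
pulses_per_second = 27_000    # maximum number of pulses per second

# Instantaneous repetition rate while pulses are being delivered.
burst_rate_hz = 1.0 / pulse_spacing_s

# Total time per second occupied by pulses at the minimum spacing.
busy_time_s = pulses_per_second * pulse_spacing_s

# Spacing the pulses would have if spread evenly over one second.
even_spacing_s = 1.0 / pulses_per_second

print(f"burst rate: {burst_rate_hz / 1e6:.2f} MHz")
print(f"busy time per second: {busy_time_s * 1e3:.2f} ms")
print(f"even spacing would be: {even_spacing_s * 1e6:.1f} us")
```

At the minimum spacing, the pulses occupy only a few milliseconds of each second, so the data arrive in short, dense bursts. This is one reason why feedback during data collection needs a fast classifier.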

Machine Learning (ML) opens up new avenues for data-driven analysis in spectroscopy by offering the possibility to quickly recognise such specific changes on the fly during data collection. However, it usually requires large amounts of clearly annotated data. Hence, it is important that research outputs align with the FAIR principles. For XFEL experiments, introducing the NeXus data format standard in future experiments is suggested.

Yue Sun presented an example showing how neural network-based ML can accurately classify the system state when the data is properly provided. The demonstrated solution automatically finds the regions (or bins) with high separability, where the spectra classes differ significantly. By training an individual neural network for each bin and combining them with a weighting technique, a robust classification of any new spectral curve can be obtained quickly.
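The idea of per-bin classifiers combined by weighting can be sketched as follows. This is not the presented implementation. It assumes synthetic two-class spectra, equal-width bins, a Fisher-style score as the separability measure, a small logistic regression as a stand-in for each per-bin neural network, and weights proportional to the separability score. All function names and parameters are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_spectra(n, peak_height, n_points=200):
    """Synthetic spectra: a fixed peak at index 50 plus a class-dependent peak at index 140."""
    x = np.arange(n_points)
    base = np.exp(-0.5 * ((x - 50) / 5.0) ** 2)
    extra = peak_height * np.exp(-0.5 * ((x - 140) / 5.0) ** 2)
    return base + extra + 0.05 * rng.standard_normal((n, n_points))

def bin_features(spectra, n_bins):
    """Split each spectrum into equal-width bins; result has shape (n, n_bins, width)."""
    n, n_points = spectra.shape
    return spectra.reshape(n, n_bins, n_points // n_bins)

def fisher_score(a, b):
    """Separability of two classes in one bin, based on the mean intensity in that bin."""
    ma, mb = a.mean(axis=1), b.mean(axis=1)
    return (ma.mean() - mb.mean()) ** 2 / (ma.var() + mb.var() + 1e-12)

def train_logistic(X, y, lr=0.5, steps=500):
    """Tiny logistic regression trained by gradient descent (stand-in for a per-bin network)."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
        w -= lr * X.T @ (p - y) / len(y)
        b -= lr * np.mean(p - y)
    return w, b

# Training data: class 0 has no second peak, class 1 has one (assumed example).
n_bins = 10
train0, train1 = make_spectra(200, 0.0), make_spectra(200, 0.5)
b0, b1 = bin_features(train0, n_bins), bin_features(train1, n_bins)

# Rank bins by separability and keep the three most separable ones.
scores = np.array([fisher_score(b0[:, k], b1[:, k]) for k in range(n_bins)])
top = np.argsort(scores)[-3:]
weights = scores[top] / scores[top].sum()

# Train one classifier per selected bin.
X = np.concatenate([b0, b1])
y = np.concatenate([np.zeros(200), np.ones(200)])
models = [train_logistic(X[:, k], y) for k in top]

def classify(spectra):
    """Weighted combination of the per-bin class-1 probabilities, thresholded at 0.5."""
    bins = bin_features(spectra, n_bins)
    probs = np.stack(
        [1.0 / (1.0 + np.exp(-(bins[:, k] @ w + b))) for k, (w, b) in zip(top, models)],
        axis=1,
    )
    return (probs @ weights > 0.5).astype(int)

print("selected bins:", sorted(top.tolist()))
print("predictions for new class-1 spectra:", classify(make_spectra(5, 0.5)))
```

Weighting by separability means that bins where the classes barely differ contribute almost nothing to the final decision, so a noisy or uninformative region cannot easily override the informative ones.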

Scientific topics: spectroscopy

Keywords: machine learning, XFEL, FAIR, large dataset, NeXus, spectroscopy, wp5-ExPaNDS

Resource type: video, slides

Target audience: PaN Community, scientists

Licence: Creative Commons Attribution 4.0

Language: English

DOI: 10.5281/zenodo.5636331

Authors: Yue Sun