Computational Reproducibility with the shell, Git and Docker
Keywords: scientific software, reproducibility, docker container, version control, shell
Resource type: slides
https://osf.io/ar57z/
https://pan-training.eu/materials/computational-reproducibility-with-the-shell-git-and-docker
This workshop provides an overview of tools for computational reproducibility in research. By the end of the workshop, participants will be able to navigate the shell and use it for version control with Git. They will know which types of projects require which kinds of computing environment and be able to set up a Docker container to run computations reproducibly.
research data scientist
PhD students
researchers
scientists
Containerized Serial Crystallography “CrystFEL” VISA workflow
Scientific topics: crystallography
Keywords: CrystFEL, Serial crystallography, pulsed X-ray beam, VISA
Resource type: workflow
https://pan-training.eu/workflows/containerized-serial-crystallography-visa-workflow
https://pan-training.eu/materials/containerized-serial-crystallography-crystfel-visa-workflow
Experiments generate up to 150 TB of data per day, which is stored at the measurement facility. Datasets of this size are impractical for users to take home.
Subsequent analysis therefore needs to be performed remotely, which makes it attractive for deployment as a cloud-like use case. Involving EOSC in the analysis and re-use of this data is therefore appropriate.
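To give a feel for why such volumes are impractical to take home, the following minimal sketch estimates how long one day of data at the quoted upper bound (150 TB) would take to transfer. The link rates are illustrative assumptions, not figures from the workflow; at a sustained 1 Gbit/s the transfer takes roughly two weeks.

```python
def transfer_time_days(size_terabytes: float, link_gbit_per_s: float) -> float:
    """Days needed to move `size_terabytes` over a link sustaining `link_gbit_per_s`."""
    size_bits = size_terabytes * 1e12 * 8          # decimal terabytes -> bits
    seconds = size_bits / (link_gbit_per_s * 1e9)  # Gbit/s -> bit/s
    return seconds / 86_400                        # seconds -> days


if __name__ == "__main__":
    daily_volume_tb = 150  # upper bound quoted in the description
    for rate in (1, 10, 100):  # assumed sustained link rates in Gbit/s
        days = transfer_time_days(daily_volume_tb, rate)
        print(f"{rate:>3} Gbit/s: {days:.2f} days")
```

Real transfers are usually slower than the nominal link rate, so these numbers are best read as lower bounds.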
Serial crystallography is a beam-line technique for collecting information on the structure of a protein without growing large protein crystals. Instead, a large number of small protein crystals are held in a pulsed X-ray beam. In a second step, the resulting series of images is used to reconstruct a precise 3-D image of the protein structure. Serial crystallography is the preferred technique for obtaining diffraction data from proteins at room temperature, where radiation damage from the X-ray beam sets in rapidly. The standard software for analysing serial crystallography data is “CrystFEL”.
The proposed workflow is implemented in a standard fashion, so it can easily be adopted on other systems or for other containerized applications. The only requirements are an Apptainer installation on the system, a Docker or Singularity/Apptainer image of the application, and an adjusted configuration file for the wrapper script.
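As an illustration of what such a wrapper might do, the sketch below assembles an `apptainer exec` command line from an image reference, a command, and a set of bind mounts. It is not the workflow's actual wrapper script: the function name `build_apptainer_command`, the image URI, and the paths are hypothetical stand-ins for values that would come from the configuration file.

```python
import shlex


def build_apptainer_command(image, command, binds=None):
    """Return the argv list for running `command` inside `image` with Apptainer."""
    argv = ["apptainer", "exec"]
    for host_path, container_path in (binds or {}).items():
        # --bind makes a host directory visible inside the container
        argv += ["--bind", f"{host_path}:{container_path}"]
    argv.append(str(image))
    argv += list(command)
    return argv


if __name__ == "__main__":
    # Hypothetical values standing in for entries of the wrapper's configuration file.
    argv = build_apptainer_command(
        image="docker://example.org/crystfel:latest",  # assumed image reference
        command=["indexamajig", "--help"],              # CrystFEL's indexing program
        binds={"/data/experiment": "/data"},            # assumed host:container paths
    )
    print(shlex.join(argv))
```

The resulting list can be passed to `subprocess.run(argv, check=True)` on a system where Apptainer is installed; only the image reference and paths need to change to wrap a different application.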
research data scientist
scientific
Full-field Tomography at PSI
Keywords: synchrotron, imaging, Jupyter notebooks, Python, Pulmonary arterial hypertension
Resource type: workflow
https://pan-training.eu/workflows/backup-fork-of-full-field-tomography-at-psi-wip#workflow
https://pan-training.eu/materials/full-field-tomography-at-psi
This workflow gives some details on the instrument that produces the data (the TOMCAT beamline) and on PSI's data infrastructure.
If you are more interested in the science and want to reproduce the data without the surrounding details and context, please refer to the Pulmonary arterial hypertension research workflow.
Tomography datasets are often large (100 GB to a few TB) and difficult to compress and transfer. Tomographic reconstruction places high demands on compute (GPU) and storage resources for the intermediate and/or final results. In addition, the optional image segmentation step may require a large amount of memory.
The offline analysis (after the experiment) could be performed remotely by users at home, which makes it attractive for deployment as a cloud-like use case. Finally, the technique is applied at many facilities and in different scientific domains, so a portable result is more useful.
This entire process is illustrated with a typical experiment.
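The memory demands mentioned above can be estimated with simple arithmetic. The minimal sketch below computes the in-memory size of a reconstructed volume; the 2048³ volume shape and the data types are illustrative assumptions, not values taken from the workflow.

```python
import math


def volume_size_gib(shape, bytes_per_voxel=4):
    """Size in GiB of a dense volume with the given shape and voxel width."""
    voxels = math.prod(shape)
    return voxels * bytes_per_voxel / 2**30


if __name__ == "__main__":
    shape = (2048, 2048, 2048)  # assumed reconstructed volume, in voxels
    print(f"float32 reconstruction: {volume_size_gib(shape, 4):.0f} GiB")
    print(f"uint8 segmentation labels: {volume_size_gib(shape, 1):.0f} GiB")
```

Under these assumptions a single float32 volume already occupies 32 GiB, before any intermediate copies made during reconstruction or segmentation.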
research data scientist
life scientists
Offline data analysis tutorial
Scientific topics: small angle x-ray scattering
Keywords: Extra-data, Extra-geom, pyFAI, data analysis
Resource type: jupyter notebook
https://indico.desy.de/event/33127/contributions/117081/
https://pan-training.eu/materials/offline-analysis-tutorial
Tutorial on offline data analysis at European XFEL, using MID instrument data as an example
PaN users
beamline users
research data scientist
A deep dive into the mathematics of Machine Learning
Keywords: machine learning
Resource type: slides
https://indico.desy.de/event/33127/contributions/117087/attachments/71317/91043/user_meeting_ml_intro_jan_2022.pdf
https://pan-training.eu/materials/a-deep-dive-into-the-mathematics-of-machine-learning
Crash course on Machine Learning
research data scientist
masters students
PhD students
Jupyter notebooks on Machine Learning for scientific data analysis
Resource type: jupyter notebook
https://git.xfel.eu/danilo/ml-tutorial
https://pan-training.eu/materials/jupyter-notebooks-on-machine-learning-for-scientific-data-analysis
Jupyter Notebooks serving as supplementary material for a tutorial on Machine Learning, originally presented at the 2022 European XFEL user meeting.
research data scientist
PhD students
masters students
RDMO - Research Data Management Organiser
Keywords: FAIR, DMP
Resource type: tool
https://rdmo.aip.de/
https://pan-training.eu/materials/rdmo-research-data-management-organiser
A tool to support the planning, implementation, and organisation of research data management.
data curator
research data scientist
research data engineer
instrument scientist
PaN users
PaNdata software catalogue
Scientific topics: photon and neutron technique, crystallography, imaging, macromolecular crystallography, nuclear resonant scattering, powder diffraction, ptychography, radiotherapy, small angle x-ray scattering, small angle inelastic scattering, single crystal diffraction, surface crystallography, tomography
Keywords: software, FAIR, catalogue
Resource type: tool
https://software.pan-data.eu/
https://pan-training.eu/materials/pandata-software-catalogue
The PaNdata software catalogue is a database of software used mainly for analysing data from neutron and photon experiments.
scientists
neutron community
photon community
research data scientist
Delivering data services to EOSC
Keywords: OAI-PMH, metadata, harvesting, SciCat, ICAT, B2FIND, OpenAIRE, research data, wp3-ExPaNDS
Resource type: wiki
https://github.com/ExPaNDS-eu/ExPaNDS/wiki/Delivering-data-services-to-EOSC
https://pan-training.eu/materials/delivering-data-services-to-eosc
Wiki page recording the ExPaNDS training workshop on data services for EOSC (06/04/2021)
data curator
research data scientist