9 materials found

Target audience: research data scientist 


Computational Reproducibility with the shell, Git and Docker

This workshop provides an overview of tools for computational reproducibility in research. By the end of the workshop, participants will be able to navigate via the shell and use it for version control with Git. They will know which types of projects require which forms of computing environment and will be able to set up a Docker container to run computations in a reproducible way.

Keywords: scientific software, reproducibility, docker container, version control, shell

Resource type: slides

URL: https://pan-training.eu/materials/computational-reproducibility-with-the-shell-git-and-docker

Target audience: research data scientist, PhD students, researchers, scientists
Containerized Serial Crystallography “CrystFEL” VISA workflow

Experiments generate up to 150 TB of data per day, which is saved at the measurement facility. Such large datasets are impractical for users to take home. Subsequent analysis must therefore be performed remotely, which makes it attractive to deploy as a cloud-like use case. Involving EOSC in the analysis and re-use of this data is an appropriate use case.

Serial crystallography is a beamline technique for collecting information on the structure of a protein without growing large protein crystals. Instead, a large number of small protein crystals are held in a pulsed X-ray beam. In a second step, the series of images produced is used to reconstruct a precise 3-D image of the protein structure. Serial crystallography is the preferred technique for obtaining diffraction data of proteins at room temperature, where radiation damage from the X-ray beam sets in rapidly. The standard software for analysing serial crystallography data is “CrystFEL”.

The proposed workflow is implemented in a standard fashion, so that it can easily be adopted on other systems or for other containerized applications. The only requirements are an Apptainer installation on the system, a Docker or Singularity/Apptainer image of the application, and an adjusted configuration file for the wrapper script.

Scientific topics: crystallography

Keywords: CrystFEL, Serial crystallography, pulsed X-ray beam, VISA

Resource type: workflow

URL: https://pan-training.eu/materials/containerized-serial-crystallography-crystfel-visa-workflow

Target audience: research data scientist, scientific
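The requirements listed for this workflow (an Apptainer installation, a container image, and a configuration for the wrapper script) can be illustrated with a minimal sketch. It is not the workflow's actual wrapper: the `ContainerJob` fields, the image name `crystfel.sif`, and the bind paths are hypothetical placeholders. The sketch only assembles an `apptainer exec` command line and does not run it.

```python
import shlex
from dataclasses import dataclass, field


@dataclass
class ContainerJob:
    """Hypothetical settings that a wrapper's configuration file might supply."""
    image: str                      # path to a .sif file, or a docker:// URI
    command: list[str]              # program and arguments to run in the container
    binds: list[str] = field(default_factory=list)  # "host_path:container_path" mounts


def build_apptainer_command(job: ContainerJob) -> list[str]:
    """Assemble an `apptainer exec` invocation as an argument list."""
    args = ["apptainer", "exec"]
    for bind in job.binds:
        args += ["--bind", bind]    # make a host directory visible in the container
    return args + [job.image, *job.command]


# Placeholder values for illustration only.
job = ContainerJob(
    image="crystfel.sif",
    command=["indexamajig", "--help"],  # indexamajig is CrystFEL's indexing program
    binds=["/data/raw:/data"],
)

# Print the command in a form that could be pasted into a shell.
print(shlex.join(build_apptainer_command(job)))
```

Building the command as a list, rather than as one string, means it could be passed directly to `subprocess.run` without shell quoting issues if the sketch were extended to launch the container.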
Full-field Tomography at PSI

This workflow includes details on the instrument that produced the data (the TOMCAT beamline) and on PSI's data infrastructure. If you are more interested in the science and want to reproduce the data without the surrounding details and context, please refer to the Pulmonary arterial hypertension research workflow.

Tomography datasets are often large (100 GB to a few TB) and difficult to compress and transfer. Tomographic reconstruction is highly demanding on compute (GPU) resources and on storage for the intermediate and/or final results. In addition, the optional image segmentation step may be demanding on computer memory. The offline analysis (after the experiment) could be performed remotely by users at home, which makes it attractive to deploy as a cloud-like use case. Finally, this technique is applied at many facilities and in different scientific domains, so a portable result is more useful. The entire process is illustrated with a typical experiment.

Keywords: synchrotron, imaging, Jupyter notebooks, Python, Pulmonary arterial hypertension

Resource type: workflow

URL: https://pan-training.eu/materials/full-field-tomography-at-psi

Target audience: research data scientist, life scientists
Offline data analysis tutorial

Tutorial on offline data analysis at European XFEL, using data from the MID instrument as an example.

Scientific topics: small angle x-ray scattering

Keywords: Extra-data, Extra-geom, pyFAI, data analysis

Resource type: jupyter notebook

URL: https://pan-training.eu/materials/offline-analysis-tutorial

Target audience: PaN users, beamline users, research data scientist
A deep dive into the mathematics of Machine Learning

Crash course on Machine Learning

Keywords: machine learning

Resource type: slides

URL: https://pan-training.eu/materials/a-deep-dive-into-the-mathematics-of-machine-learning

Target audience: research data scientist, masters students, PhD students
Jupyter notebooks on Machine Learning for scientific data analysis

Jupyter Notebooks serving as supplementary material for a tutorial on Machine Learning, originally presented at the 2022 European XFEL user meeting.

Resource type: jupyter notebook

URL: https://pan-training.eu/materials/jupyter-notebooks-on-machine-learning-for-scientific-data-analysis

Target audience: research data scientist, PhD students, masters students
RDMO - Research Data Management Organiser

A tool to support the planning, implementation, and organisation of research data management.

Keywords: FAIR, DMP

Resource type: tool

URL: https://pan-training.eu/materials/rdmo-research-data-management-organiser

Target audience: data curator, research data scientist, research data engineer, instrument scientist, PaN users
PaNdata software catalogue

The PaNdata software catalogue is a database of software used mainly for analysing data from neutron and photon experiments.

Scientific topics: photon and neutron technique, crystallography, imaging, macromolecular crystallography, nuclear resonant scattering, powder diffraction, ptychography, radiotherapy, small angle x-ray scattering, small angle inelastic scattering, single crystal diffraction, surface crystallography, tomography

Keywords: software, FAIR, catalogue

Resource type: tool

URL: https://pan-training.eu/materials/pandata-software-catalogue

Target audience: scientists, neutron community, Photon Community, research data scientist
Delivering data services to EOSC

Wiki page recording the ExPaNDS training workshop on data services for EOSC (06/04/2021)

Keywords: OAI-PMH, metadata, harvesting, SciCat, ICAT, B2FIND, OpenAIRE, research data, wp3-ExPaNDS

Resource type: wiki

URL: https://pan-training.eu/materials/delivering-data-services-to-eosc

Target audience: data curator, research data scientist