Introduction to the usage of open data from the Large Hadron Collider for computer scientists in the context of machine learning
Timo Saala, Matthias Schott
SciPost Phys. Lect. Notes 96 (2025) · published 23 June 2025
- doi: 10.21468/SciPostPhysLectNotes.96
Abstract
Deep learning techniques have evolved rapidly in recent years, significantly impacting various scientific fields, including experimental particle physics. To effectively leverage the latest developments in computer science for particle physics, a strengthened collaboration between computer scientists and physicists is essential. As all machine learning techniques depend on the availability and comprehensibility of extensive data, clear data descriptions and commonly used data formats are prerequisites for successful collaboration. In this study, we converted open data from the Large Hadron Collider, recorded in the ROOT data format commonly used in high-energy physics, to pandas DataFrames, a well-known format in computer science. Additionally, we provide a brief introduction to the data's content and interpretation. This paper aims to serve as a starting point for future interdisciplinary collaborations between computer scientists and physicists, fostering closer ties and facilitating efficient knowledge exchange.
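As an illustration of the kind of conversion described in the abstract, the following is a minimal sketch of one possible route from a ROOT file to a pandas DataFrame. It is not necessarily the authors' pipeline. It assumes the third-party `uproot` and `pandas` packages; the function names and the branch names `pt` and `eta` are hypothetical and chosen only for illustration. The sketch also assumes that all requested branches hold one variable-length list per event, with the same number of entries per event (for example, one entry per reconstructed particle).

```python
def explode_events(columns):
    """Turn per-event lists into one flat row per particle.

    `columns` maps a branch name to a sequence with one entry per event;
    each entry is itself a sequence with one value per particle.
    Assumption: every branch has the same multiplicity within an event.
    """
    names = list(columns)
    n_events = len(columns[names[0]])
    rows = []
    for event in range(n_events):
        n_particles = len(columns[names[0]][event])
        for particle in range(n_particles):
            row = {"event": event, "particle": particle}
            for name in names:
                row[name] = float(columns[name][event][particle])
            rows.append(row)
    return rows


def read_root_columns(path, tree_name, branches):
    """Read the requested branches of a TTree as a dict of NumPy arrays."""
    import uproot  # third-party; imported here so the helper above has no dependencies

    with uproot.open(path) as root_file:
        tree = root_file[tree_name]
        # library="np" returns {branch name: array with one entry per event}
        return tree.arrays(branches, library="np")


def rows_to_dataframe(rows):
    """Build a pandas DataFrame with one row per particle."""
    import pandas as pd  # third-party

    return pd.DataFrame(rows)
```

With a real file, the three steps would be chained as `rows_to_dataframe(explode_events(read_root_columns(path, tree_name, ["pt", "eta"])))`, where the path, tree name, and branch names depend on the dataset at hand. Events without any particles contribute no rows, and the `event` column keeps track of which collision each row came from.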