Description
One common issue across vastly different fields of research and industry is the ever-increasing need for data storage. With experiments recording more complex data at higher rates, the recorded data are quickly outgrowing available storage capacity. This issue is especially prominent in LHC experiments such as ATLAS, where within five years the resources needed are expected to exceed the available storage many times over (assuming a flat budget model and current technology trends) [1]. Since the data formats used are already highly compressed, storage constraints could require more drastic measures such as lossy compression, where some data accuracy is lost during the compression process.
In our work, building on a number of undergraduate projects [2,3,4,5,6,7], we have developed an interdisciplinary open-source tool for machine learning-based lossy compression. The tool utilizes an autoencoder neural network, which is trained to compress and decompress data based on correlations between the different variables in the dataset. The process is lossy, meaning that the original data values and distributions cannot be reconstructed precisely. However, for variables and observables where the precision loss is tolerable, the high compression ratio allows more data to be stored, yielding greater statistical power.
The tool we have developed is called Baler and is available as an open-source project [8][9].
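
As an illustration of the general approach, the sketch below shows a minimal autoencoder-based lossy compression loop in PyTorch. The layer sizes, latent dimension, training loop, and the 24-variable placeholder dataset are illustrative assumptions, not Baler's actual configuration; the sketch only demonstrates how an encoder can map correlated variables to a smaller latent representation that is stored in place of the original values.

    # Minimal sketch of autoencoder-based lossy compression.
    # All sizes and hyperparameters here are assumptions for illustration only.
    import torch
    import torch.nn as nn

    class AutoEncoder(nn.Module):
        def __init__(self, n_features: int, latent_dim: int):
            super().__init__()
            # Encoder maps each event's variables to a smaller latent vector ...
            self.encoder = nn.Sequential(
                nn.Linear(n_features, 64), nn.ReLU(),
                nn.Linear(64, latent_dim),
            )
            # ... and the decoder reconstructs the original variables from it.
            self.decoder = nn.Sequential(
                nn.Linear(latent_dim, 64), nn.ReLU(),
                nn.Linear(64, n_features),
            )

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            return self.decoder(self.encoder(x))

    data = torch.randn(10_000, 24)                     # placeholder for a real dataset
    model = AutoEncoder(n_features=24, latent_dim=6)   # 4x fewer stored values per event
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.MSELoss()

    # Train so the network learns the correlations between variables.
    for epoch in range(20):
        optimizer.zero_grad()
        reconstruction = model(data)
        loss = loss_fn(reconstruction, data)  # reconstruction error = information loss
        loss.backward()
        optimizer.step()

    # "Compression": store only the latent representation (plus the decoder weights).
    # "Decompression": run the decoder to approximately recover the original values.
    compressed = model.encoder(data)
    decompressed = model.decoder(compressed)

The trade-off is governed by the size of the latent representation: a smaller latent vector gives a higher compression ratio at the cost of a larger reconstruction error, which is why the method is best suited to variables where some loss of precision is acceptable.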
[1] - https://cerncourier.com/a/time-to-adapt-for-big-data/
[2] - http://lup.lub.lu.se/student-papers/record/9049610
[3] - http://lup.lub.lu.se/student-papers/record/9012882
[4] - http://lup.lub.lu.se/student-papers/record/9004751
[5] - http://lup.lub.lu.se/student-papers/record/9075881
[6] - https://zenodo.org/record/5482611#.Y3Yysy2l3Jz
[7] - https://zenodo.org/record/4012511#.Y3Yyny2l3Jz
[8] - https://zenodo.org/record/7817467#.ZED-65FBzmE
[9] - https://github.com/baler-collaboration/baler