Parton density extractions with the Bayesian analysis toolkit

Ritu Aggarwal, Michiel Botje, Allen Caldwell, Francesca Capel, Oliver Schulz

Hadrons, such as protons and neutrons, are made up of quarks held together by the strong force. At high energy scales, the valence quarks that define these hadrons exist in a sea of virtual quarks and gluons. The parton distribution functions (PDFs) describe this structure and are of fundamental importance to our understanding of quantum chromodynamics (QCD), as well as its application to LHC physics and the development of cosmic ray air showers in the Earth's atmosphere. PDFs can be extracted from accelerator measurements in which hadrons are probed through collisions with electrons. A limitation of existing approaches to analysing this data is the reliance on the chi-square statistic and the coupled assumption of Normal-distributed observations. We are working on a new statistical method for PDF extraction, which overcomes this limitation by forward modelling the problem from an input PDF to the expected number of events in a detector. This model will then be fit using Markov Chain Monte Carlo to enable inference of the PDF parameters. Our project builds on the QCDNUM software for fast QCD evolution and the Bayesian Analysis Toolkit developed at the ODSL to allow inference. We initially focus on the "high-x" regime, where the chi-square method cannot be used due to low event numbers.
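
As a rough illustration of why a Poisson likelihood suits the low-count regime, the sketch below scores two toy forward models against binned event counts. It is a minimal sketch only: the power-law forward model, the function names and the numbers are all made up for illustration and do not represent the QCDNUM-based model used in the project.

```python
import math


def poisson_log_likelihood(observed, expected):
    """Log-likelihood of independent Poisson counts, summed over bins."""
    total = 0.0
    for n, nu in zip(observed, expected):
        # log Poisson(n | nu) = n*log(nu) - nu - log(n!)
        total += n * math.log(nu) - nu - math.lgamma(n + 1)
    return total


def expected_counts(norm, slope, x_centres):
    """Toy forward model (hypothetical): counts fall like norm * (1 - x)**slope."""
    return [norm * (1.0 - x) ** slope for x in x_centres]


# Made-up high-x bins with few or zero events per bin.
x_centres = [0.55, 0.65, 0.75, 0.85, 0.95]
observed = [12, 5, 2, 0, 0]

# Compare a steeply falling and a shallow toy model on the same counts.
ll_steep = poisson_log_likelihood(observed, expected_counts(200.0, 3.5, x_centres))
ll_shallow = poisson_log_likelihood(observed, expected_counts(200.0, 1.0, x_centres))
print(ll_steep, ll_shallow)
```

Bins with zero observed events contribute to the Poisson likelihood without any special treatment, whereas a chi-square statistic with Normal-distributed errors is not well defined for them.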

Searching for astrophysical neutrino sources with Bayesian hierarchical modelling

Francesca Capel, Christian Haack, Martin Ha Minh, Hans Niederhausen, Elisa Resconi, Lisa Schumacher

The IceCube neutrino observatory has discovered a diffuse flux of astrophysical neutrinos. However, the nature of the responsible sources is difficult to determine, even with over ten years of data. Identifying individual sources is problematic because, at the highest energies, where the atmospheric neutrino background vanishes, only a handful of neutrinos are recorded each year. Additionally, these neutrinos can reach us from vast cosmological distances. As the production of neutrinos is closely connected to both the acceleration of cosmic rays and the production of gamma-rays, understanding the origins of the IceCube neutrinos falls into a larger multi-messenger picture. Existing point source search techniques are limited to a low number of free parameters to be computationally tractable and focus on single datasets to reduce complexity. We are working on an alternative approach employing a Bayesian hierarchical model. This method can overcome the limitations mentioned above and allow us to include more information about the underlying astrophysics in the form of Bayesian priors. It is possible to carry out more exhaustive searches for specific source models, complementing existing work and potentially uncovering weak sources in particular cases. The extendability of the framework means that it could also be applied to fit data from multiple detectors or messengers in the future.

Development of statistical tests for neutrino flare searches in IceCube

Martina Karl, Elisa Resconi, Lolian Shtembari, Philipp Eller

In 2018, the IceCube collaboration presented the detection of a neutrino flare coming from the direction of a previously detected high-energy neutrino alert event. We are extending the search for neutrino flares in IceCube archival data to the positions of other IceCube high-energy alert events. Together with the ODSL, we are developing the statistical techniques for a highly sensitive, fast and accurate detection of significant flares.

The treatment based on a likelihood-ratio test statistic is computationally challenging, and we are speeding up this process using state-of-the-art techniques, such as expectation maximization. At the same time, we are also investigating completely new test statistics, for example ones based on the spacings in time between consecutive events.
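
To give a feel for what a spacing-based test statistic can look like, the sketch below computes the classic Greenwood statistic, the sum of squared normalised spacings between consecutive events. This is an illustrative stand-in and not necessarily the statistic developed in the project; the function name and the event times are made up.

```python
def greenwood_statistic(event_times, t_start, t_end):
    """Sum of squared spacings between consecutive events, normalised to the window.

    Ranges from 1/(n+1) for perfectly even spacing up to 1 when all
    n events pile up at a single point in time.
    """
    duration = t_end - t_start
    edges = [t_start] + sorted(event_times) + [t_end]
    spacings = [(b - a) / duration for a, b in zip(edges, edges[1:])]
    return sum(d * d for d in spacings)


# Made-up arrival times in a window of 50 time units.
steady_events = [10.0, 20.0, 30.0, 40.0]  # evenly spread, background-like
flare_events = [24.0, 24.5, 25.0, 25.5]   # clustered, flare-like

g_steady = greenwood_statistic(steady_events, 0.0, 50.0)
g_flare = greenwood_statistic(flare_events, 0.0, 50.0)
print(g_steady, g_flare)
```

Clustered events produce a few very large and many very small spacings, which drives the statistic up relative to a steady event rate. Such a statistic needs only the sorted event times, so it is cheap to evaluate compared with a full likelihood fit.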

Fast Spectrum Calculation with Neural Networks for the KATRIN experiment

Christian Karl, Susanne Mertens, Philipp Eller

The Karlsruhe Tritium Neutrino (KATRIN) experiment is designed to directly measure the neutrino mass with a sensitivity of 0.2 eV (90% CL), by performing a high-precision spectroscopic measurement of the tritium beta decay spectrum close to the endpoint, where the neutrino mass manifests itself as a small shape distortion. One of the main challenges of the experiment is a precise modeling of the spectral shape and the treatment of all systematic effects. Current methods are based on numerical integration, which is computationally expensive. Together with the ODSL, we are developing a solution to overcome this computational bottleneck by using multi-dimensional interpolation, obtained by a neural network.

Interactive Visualization of 3D Galactic Dust Maps

Reimar Leike, Torsten Enßlin, Jakob Knollmüller

Galactic dust grains are aggregations of molecules in the interstellar medium (ISM), which form preferentially at denser locations. Dust plays a central role in galactic physics. It obscures our view of the stars behind it by absorbing optical light, and it confuses our view of the cosmic microwave background by re-emitting the absorbed energy at longer wavelengths.
Knowing the 3D dust distribution in the Milky Way is of paramount importance for a number of scientific questions, ranging from ISM chemical processes such as the formation of proto-biological molecules, through star formation and Galactic magnetism, to cosmology.
The goal of the project is to build a web service to query sub-volumes and visualizations of three-dimensional Galactic dust maps.

Chemical Networks

Tommaso Grassi, Barbara Ercolano, Jakob Knollmüller

In many astrophysical applications, the cost of evolving in time a chemical network represented by a system of ordinary differential equations (ODEs) grows significantly with its size, and often represents a significant computational bottleneck. In this project, we explore a new class of methods that combine machine learning techniques for reducing complex data sets (autoencoders), the optimization of multi-parameter systems (standard backpropagation), and the robustness of well-established ODE solvers to explicitly incorporate time dependence. This new method allows us to find a compressed and simplified version of a large chemical network in a semi-automated fashion that can be solved with a standard ODE solver, while also enabling interpretability of the compressed, latent network.

Other Projects / Student Projects

The Bayesian Analysis Toolkit

The Bayesian Analysis Toolkit, BAT, is a software package designed to help solve statistical problems encountered in Bayesian inference. BAT is based on Bayes' Theorem and is currently realized with the use of Markov Chain Monte Carlo. This gives access to the full posterior probability distribution and enables straightforward parameter estimation, limit setting and uncertainty propagation. Novel sampling methods, optimization schemes and parallelization are example development areas.
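
The idea of sampling a posterior with Markov Chain Monte Carlo can be sketched with a plain random-walk Metropolis sampler. This is a minimal illustration in Python with made-up data and hypothetical function names; it does not use or represent BAT's own interface or its sampling algorithms.

```python
import math
import random


def log_posterior(mu, data, sigma=1.0):
    """Unnormalised log-posterior for a Gaussian mean with a flat prior (assumed)."""
    return -0.5 * sum(((x - mu) / sigma) ** 2 for x in data)


def metropolis(log_target, start, n_steps, step_size, rng):
    """Random-walk Metropolis: propose a Gaussian step, accept with min(1, ratio)."""
    samples = []
    current = start
    current_lp = log_target(current)
    for _ in range(n_steps):
        proposal = current + rng.gauss(0.0, step_size)
        proposal_lp = log_target(proposal)
        accept_prob = math.exp(min(0.0, proposal_lp - current_lp))
        if rng.random() < accept_prob:
            current, current_lp = proposal, proposal_lp
        samples.append(current)
    return samples


rng = random.Random(42)
data = [4.8, 5.1, 5.3, 4.9, 5.4]  # made-up measurements with unit uncertainty

chain = metropolis(lambda mu: log_posterior(mu, data), 0.0, 20000, 0.5, rng)
kept = chain[2000:]  # discard burn-in

posterior_mean = sum(kept) / len(kept)
posterior_std = math.sqrt(sum((s - posterior_mean) ** 2 for s in kept) / len(kept))
print(posterior_mean, posterior_std)
```

Because the chain yields samples from the full posterior, summaries such as means, credible intervals and limits can be read off the samples directly, and uncertainties can be propagated by transforming the samples.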

UltraNest Bayesian inference engine

When scientific models are compared to data, two tasks are important: 1) constraining the model parameters and 2) comparing the model to other models. UltraNest is a general-purpose statistical inference package which can robustly fit arbitrary models specified in Python, C, C++, Fortran, Julia or R. With a focus on correctness and speed (in that order), UltraNest is especially useful for multi-modal or non-Gaussian parameter spaces, computationally expensive models, and robust pipelines. Parallelisation to computing clusters and resuming incomplete runs are supported.


The Dark Matter Data Centre

We will build a platform to host and combine overarching information on experimental studies, astronomical observations and theoretical modeling of Dark Matter to facilitate the combination and cross-correlation of existing and forthcoming data. This data centre will allow tests for Dark Matter (DM) candidates passing all existing benchmarks in cosmology, astro- and particle physics, experiments and in theory. We plan to probe for tensions between different data sets and theories that point towards new, hidden properties of DM. The data centre will make the data available to the international community for further global analysis and model benchmarking, following examples in astro- and high-energy physics.

Universal Imaging Using Information Field Theory

In order to reconstruct a good image of a spatially varying quantity, a field, from incomplete and noisy measurement data, it is necessary to combine the measurements with knowledge about general physical properties of the field, such as its smoothness, correlation structure, or freedom from divergence. Information field theory uses the elegant formalism of field theories to mathematically derive optimal Bayesian imaging algorithms for different measurement situations. These algorithms can be implemented efficiently and generally by means of the "Numerical Information Field Theory" (NIFTy) programming package. Algorithms using NIFTy are already used in radio and gamma-ray astronomy. NIFTy is developing into a universal tool for imaging problems in astronomy, particle physics, medicine, and other fields.

Point source detection in particle astrophysics: A comparison of Bayesian and Frequentist statistical approaches

Valentin Minoz, Francesca Capel

Summary: The search for the sources of astrophysical neutrinos is a central open question in particle astrophysics. Thanks to substantial experimental efforts, we now have large-scale neutrino detectors in the oceans and polar ice. The neutrino sky seems mostly isotropic, but hints of possible source-neutrino associations have started to emerge, leading to much excitement within the astrophysics community. As more data are collected and future experiments planned, the question of how to statistically quantify point source detection robustly becomes increasingly pertinent. The standard approach of null-hypothesis testing leads to reporting the results in terms of a p-value, with detection typically corresponding to surpassing the coveted 5-sigma threshold. While widely used, p-values and significance thresholds are notorious in the statistical community as challenging to interpret and potentially misleading. We explore an alternative Bayesian approach to reporting point source detection and the connections and differences with the Frequentist view.
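
The contrast between the two ways of reporting a detection can be illustrated with a toy counting experiment. The sketch below assumes a known background expectation, a made-up observed count and a uniform prior on the signal strength; all names and numbers are hypothetical and are not taken from the project.

```python
import math
from statistics import NormalDist


def poisson_pmf(n, mean):
    """Poisson probability of observing n events for a given mean."""
    return math.exp(n * math.log(mean) - mean - math.lgamma(n + 1))


def p_value(n_obs, background):
    """Frequentist p-value: P(N >= n_obs) under the background-only hypothesis."""
    return 1.0 - sum(poisson_pmf(k, background) for k in range(n_obs))


def bayes_factor(n_obs, background, s_max, n_grid=2000):
    """Evidence ratio of signal-plus-background to background-only.

    The signal strength s gets a uniform prior on [0, s_max] (an assumption);
    the marginal likelihood is integrated with the midpoint rule.
    """
    ds = s_max / n_grid
    integral = sum(
        poisson_pmf(n_obs, background + (i + 0.5) * ds) for i in range(n_grid)
    ) * ds
    return (integral / s_max) / poisson_pmf(n_obs, background)


n_obs, background = 8, 2.0  # made-up counting experiment

p = p_value(n_obs, background)
z = NormalDist().inv_cdf(1.0 - p)  # one-sided significance in sigma
bf = bayes_factor(n_obs, background, s_max=20.0)
print(p, z, bf)
```

The p-value depends only on the background hypothesis, whereas the Bayes factor also depends on the prior range chosen for the signal, so the two numbers answer different questions and should not be expected to agree.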

Systematic Comparison of Monte Carlo Sampling Algorithms

Ganna Moharram, Johannes Buchner, Philipp Eller

Plausible ranges of physical parameters are inferred by fitting models to data sets. Monte Carlo sampling algorithms are commonly used fitting tools. This project compares the reliability and performance of modern inference algorithms under diverse models from astrophysics, particle physics and cosmology.