
Modern cosmology thrives on combining data from multiple experiments to tighten constraints on the parameters that describe our Universe. However, this process is often hampered by a major computational bottleneck: nuisance parameters. In our paper, 2207.11457, we introduce a framework that addresses this challenge, enabling efficient and reliable combination of cosmological constraints. This work, led by Harry Bevins and involving Will Handley, Pablo Lemos, Peter Sims, Eloy de Lera Acedo, and Anastasia Fialkov, demonstrates a method for marginal Bayesian statistics that streamlines complex joint analyses.

The Challenge of Nuisance Parameters

Cosmological datasets, such as those from the Dark Energy Survey (1708.01530) and Planck (1807.06209), are not only sensitive to the cosmological parameters we care about (like the dark matter density). They are also affected by instrumental systematics, foreground contamination, and other effects that must be modelled; the parameters describing these effects are known as “nuisance” parameters. When combining datasets, the total number of parameters can easily exceed 20 or 30, while only a small fraction ($\sim$5-6) represent the key physical quantities of interest. A full Bayesian analysis requires sampling this entire high-dimensional space, which quickly becomes computationally expensive.

A Nuisance-Free Approach to Likelihoods

Our work presents a powerful solution: the construction of a “nuisance-free likelihood.” The core insight is that, given the Bayesian evidence ($\mathcal{Z}$) and posterior samples from a nested sampling run, one can reconstruct the likelihood of only the parameters of interest ($\theta$), with the nuisance parameters ($\alpha$) marginalised out, and without any further high-dimensional integration. As derived in the paper, the nuisance-free likelihood $\mathcal{L}(\theta)$ can be expressed as:

$\mathcal{L}(\theta) = \frac{\mathcal{P}(\theta)\mathcal{Z}}{\pi(\theta)}$

This equation is the key to our method. It shows that the marginalised likelihood can be recovered from three components:

  • $\mathcal{P}(\theta)$: The marginal posterior distribution of the parameters of interest.
  • $\pi(\theta)$: The marginal prior distribution.
  • $\mathcal{Z}$: The Bayesian evidence, a standard output of nested sampling algorithms.

This approach constitutes a lossless compression in the parameters of interest: we can discard the high-dimensional nuisance parameter samples and still recover the same marginal posterior that a full, combined analysis would have produced, up to the fidelity of the density estimators used to represent $\mathcal{P}(\theta)$ and $\pi(\theta)$.
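
In log space the identity simply reads $\log\mathcal{L}(\theta) = \log\mathcal{P}(\theta) + \log\mathcal{Z} - \log\pi(\theta)$. Below is a minimal sketch of this relation in Python; the toy Gaussian densities and the log-evidence value are placeholders standing in for real marginal density estimators and a real nested sampling output.

```python
import numpy as np

def nuisance_free_loglike(theta, log_marginal_posterior, log_marginal_prior, log_evidence):
    """Nuisance-free log-likelihood: log L = log P + log Z - log pi."""
    return log_marginal_posterior(theta) + log_evidence - log_marginal_prior(theta)

def toy_log_posterior(theta):
    # Placeholder 1D Gaussian marginal posterior (mean 0.3, std 0.1).
    return -0.5 * np.sum((theta - 0.3) ** 2 / 0.1**2) - 0.5 * np.log(2 * np.pi * 0.1**2)

def toy_log_prior(theta):
    # Placeholder standard Gaussian marginal prior.
    return -0.5 * np.sum(theta**2) - 0.5 * np.log(2 * np.pi)

log_Z = -12.3  # placeholder log-evidence from a nested sampling run

print(nuisance_free_loglike(np.array([0.3]), toy_log_posterior, toy_log_prior, log_Z))
```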

The margarine Toolkit

To make this theoretical framework practical, we use our public software package, margarine (2205.12841). This tool uses machine-learning density estimation to build accurate models of the marginal prior and posterior distributions from a set of representative (weighted) samples. Specifically, margarine employs a combination of:

  • Masked Autoregressive Flows (MAFs): A type of normalizing flow that uses neural networks to learn and replicate complex probability distributions.
  • Kernel Density Estimators (KDEs): A non-parametric method to estimate the probability density function of a random variable.

By training these models on the outputs of individual nested sampling runs, we can generate fast, reusable, and reliable representations of $\mathcal{P}(\theta)$ and $\pi(\theta)$, which are then used to compute the nuisance-free likelihood.
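
The sketch below illustrates this workflow for a single experiment. The file names and log-evidence value are hypothetical placeholders, and the margarine call signatures follow the package documentation as we recall it, so they may differ from the current release; treat this as an illustration of the idea rather than a definitive recipe.

```python
import numpy as np
from margarine.maf import MAF  # Masked Autoregressive Flow density estimator

# Weighted posterior samples of the 6 cosmological parameters of interest
# (nuisance columns already dropped). File names are hypothetical placeholders.
theta = np.loadtxt("des_cosmo_samples.txt")    # shape (n_samples, 6)
weights = np.loadtxt("des_weights.txt")        # shape (n_samples,)

# Flow representing the marginal posterior P(theta).
posterior_flow = MAF(theta, weights=weights)
posterior_flow.train(10000)

# Flow representing the marginal prior pi(theta), trained on prior samples.
prior_samples = np.loadtxt("des_cosmo_prior_samples.txt")  # hypothetical file
prior_flow = MAF(prior_samples)
prior_flow.train(10000)

log_Z_des = -5.0e3  # placeholder: log-evidence reported by the original DES run

def log_like_des(point):
    """Nuisance-free DES log-likelihood: log P(theta) + log Z - log pi(theta)."""
    point = np.atleast_2d(point)
    return posterior_flow.log_prob(point) + log_Z_des - prior_flow.log_prob(point)
```

A KDE from margarine could be swapped in for either flow wherever a non-parametric estimate is preferable.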

Demonstration: Combining DES and Planck

We validated our method by applying it to a classic cosmological problem: combining constraints from the DES Year 1 and Planck datasets. A previous analysis (1902.04029) performed this combination with a full nested sampling run over 41 parameters (6 cosmological + 35 nuisance). Our approach using margarine involves the following steps (a code sketch follows the list):

  1. Running DES and Planck analyses separately to obtain their evidences and posterior samples.
  2. Using margarine to build density estimators for the 6 shared cosmological parameters for each experiment.
  3. Constructing the nuisance-free likelihood for each experiment.
  4. Multiplying these likelihoods and running a new, fast nested sampling analysis in the vastly reduced 6-dimensional parameter space.
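
The final step might look like the sketch below, assuming log_like_des and log_like_planck are nuisance-free log-likelihood callables built as in the earlier margarine sketch. The choice of dynesty as the nested sampler is ours for illustration (not necessarily the sampler used in the paper), and the uniform prior bounds are placeholders for the actual shared cosmological prior.

```python
import numpy as np
import dynesty

ndim = 6  # shared cosmological parameters

def combined_loglike(theta):
    # Likelihoods multiply, so the nuisance-free log-likelihoods add.
    logl = np.asarray(log_like_des(theta)) + np.asarray(log_like_planck(theta))
    return float(np.squeeze(logl))

def prior_transform(u):
    # Map the unit hypercube onto the shared cosmological prior.
    # These uniform bounds are placeholders; use the actual joint prior ranges.
    lower = np.array([0.01, 0.01, 0.6, 0.9, 2.5, 0.01])
    upper = np.array([0.20, 0.10, 0.9, 1.1, 3.5, 0.20])
    return lower + u * (upper - lower)

sampler = dynesty.NestedSampler(combined_loglike, prior_transform, ndim)
sampler.run_nested()
results = sampler.results  # 6D posterior samples and log-evidence for the combination
```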

The results are remarkable. Our method accurately reproduced the combined posterior and log-evidence found in the full 41-dimensional analysis. More importantly, it dramatically reduced the computational cost. Since nested sampling runtimes scale approximately as the cube of the number of dimensions, reducing the problem from 41 to 6 parameters results in a speed-up of approximately $(41/6)^3 \approx 319$ times, before even accounting for the faster evaluation of the emulated likelihood compared to the original. This work paves the way for a new era of efficient, collaborative cosmology, where public libraries of marginal likelihoods can be rapidly combined to distill new scientific insights from complex datasets.

Harry Bevins, Will Handley, Eloy de Lera Acedo, Anastasia Fialkov

Content generated by gemini-2.5-pro using this prompt.

Image generated by imagen-3.0-generate-002 using this prompt.