{% raw %}
Title: Create a Markdown Blog Post Integrating Research Details and a Featured Paper
====================================================================================
This task involves generating a Markdown file (ready for a GitHub-served Jekyll site) that integrates our research details with a featured research paper. The output must follow the exact format and conventions described below.
====================================================================================
Output Format (Markdown):
------------------------------------------------------------------------------------
---
layout: post
title: "Accelerated nested sampling with $β$-flows for gravitational waves"
date: 2024-11-26
categories: papers
---



Content generated by [gemini-1.5-pro](https://deepmind.google/technologies/gemini/) using [this prompt](/prompts/content/2024-11-26-2411.17663.txt).
Image generated by [imagen-3.0-generate-002](https://deepmind.google/technologies/gemini/) using [this prompt](/prompts/images/2024-11-26-2411.17663.txt).
------------------------------------------------------------------------------------
====================================================================================
Please adhere strictly to the following instructions:
====================================================================================
Section 1: Content Creation Instructions
====================================================================================
1. **Generate the Page Body:**
- Write a well-composed, engaging narrative that is suitable for a scholarly audience interested in advanced AI and astrophysics.
- Ensure the narrative is original and reflects the tone, style, and content of the "Homepage Content" block (provided below), but do not reuse its text verbatim.
- Use bullet points, subheadings, or other formatting to enhance readability.
2. **Highlight Key Research Details:**
- Emphasize the contributions and impact of the paper, focusing on its methodology, significance, and context within current research.
- Specifically highlight the lead author, Metha Prathaban. When referencing any author, use Markdown links built from the Author Information block, preferring academic or GitHub links over social media (a hedged link-building sketch follows this list).
3. **Integrate Data from Multiple Sources:**
- Seamlessly weave information from the following:
- **Paper Metadata (YAML):** Essential details including the title and authors.
- **Paper Source (TeX):** Technical content from the paper.
- **Bibliographic Information (bbl):** Extract bibliographic references.
- **Author Information (YAML):** Profile details for constructing Markdown links.
- Merge insights from the Paper Metadata, TeX source, Bibliographic Information, and Author Information blocks into a coherent narrative; do not treat them as separate or isolated pieces (a sketch of extracting these details programmatically follows this list).
- Insert the generated narrative between the HTML comment markers shown in the Output Format block above, immediately after the front matter and before the attribution lines.
4. **Generate Bibliographic References:**
- Review the Bibliographic Information block carefully.
- For each reference that includes a DOI or arXiv identifier:
- For DOIs, generate a link formatted as:
[10.1234/xyz](https://doi.org/10.1234/xyz)
- For arXiv entries, generate a link formatted as:
[2103.12345](https://arxiv.org/abs/2103.12345)
- **Important:** Do not use any LaTeX citation commands (e.g., `\cite{...}`). Every reference must be rendered directly as a Markdown link built from its DOI or arXiv identifier, as in the examples below (a hedged extraction sketch follows this list).
- **Incorrect:** `\cite{10.1234/xyz}`
- **Correct:** `[10.1234/xyz](https://doi.org/10.1234/xyz)`
- Ensure that at least three (3) of the most relevant references are naturally integrated into the narrative.
- Ensure that the link to the Featured paper [2411.17663](https://arxiv.org/abs/2411.17663) is included in the first sentence.
5. **Final Formatting Requirements:**
- The output must be plain Markdown; do not wrap it in Markdown code fences.
- Preserve the YAML front matter exactly as provided.
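
To make the author-link rule in point 2 concrete, here is a minimal Python sketch of how Markdown links might be built from the Author Information block. It assumes each entry is a dict with a `name` key and optional `academic`, `github`, and `social` URL fields; the actual block may use different field names, so treat `author_link` as a hypothetical helper rather than part of any existing tooling.

```python
def author_link(author: dict) -> str:
    """Render an author as a Markdown link, preferring academic or GitHub
    profiles over social media; fall back to the plain name if no URL exists.

    The 'academic'/'github'/'social' field names are assumptions about the
    Author Information block, not guaranteed keys.
    """
    for key in ("academic", "github", "social"):
        url = author.get(key)
        if url:
            return f"[{author['name']}]({url})"
    return author["name"]
```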
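
Similarly, the details most often needed from the Paper Metadata block in point 3 (title, arXiv identifier, author names, abstract) could be gathered as sketched below. This assumes the metadata has already been deserialised into a plain Python dict with the same keys as the feedparser dump in Section 2; `paper_facts` is a hypothetical helper, not part of any existing pipeline.

```python
def paper_facts(metadata: dict) -> dict:
    """Collect the fields most useful for the narrative from the arXiv
    metadata (assumed to be a plain dict mirroring the feedparser keys)."""
    # 'http://arxiv.org/abs/2411.17663v1' -> '2411.17663'
    arxiv_id = metadata["id"].rsplit("/", 1)[-1].split("v")[0]
    return {
        "title": metadata["title"],
        "arxiv_id": arxiv_id,
        "abs_url": f"https://arxiv.org/abs/{arxiv_id}",
        "authors": [a["name"] for a in metadata["authors"]],
        "abstract": " ".join(metadata["summary"].split()),
    }
```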
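
Finally, the DOI and arXiv link formats required in point 4 could be produced along these lines. The regular expressions are rough assumptions about how identifiers appear in a `.bbl` entry, not a full parser, so expect to adapt them to the actual Bibliographic Information block.

```python
import re

# Rough patterns for spotting identifiers in a .bbl entry (assumptions, not a parser).
DOI_RE = re.compile(r"10\.\d{4,9}/[^\s,{}]+")
ARXIV_RE = re.compile(r"\b(\d{4}\.\d{4,5})(?:v\d+)?\b")

def reference_link(bbl_entry: str) -> str | None:
    """Return a Markdown link for the first DOI or arXiv id found, else None."""
    doi = DOI_RE.search(bbl_entry)
    if doi:
        return f"[{doi.group(0)}](https://doi.org/{doi.group(0)})"
    arxiv = ARXIV_RE.search(bbl_entry)
    if arxiv:
        return f"[{arxiv.group(1)}](https://arxiv.org/abs/{arxiv.group(1)})"
    return None
```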
====================================================================================
Section 2: Provided Data for Integration
====================================================================================
1. **Homepage Content (Tone and Style Reference):**
```markdown
---
layout: home
---

The Handley Research Group is dedicated to advancing our understanding of the Universe through the development and application of cutting-edge artificial intelligence and Bayesian statistical inference methods. Our research spans a wide range of cosmological topics, from the very first moments of the Universe to the nature of dark matter and dark energy, with a particular focus on analyzing complex datasets from next-generation surveys.
## Research Focus
Our core research revolves around developing innovative methodologies for analyzing large-scale cosmological datasets. We specialize in Simulation-Based Inference (SBI), a powerful technique that leverages our ability to simulate realistic universes to perform robust parameter inference and model comparison, even when likelihood functions are intractable ([LSBI framework](https://arxiv.org/abs/2501.03921)). This focus allows us to tackle complex astrophysical and instrumental systematics that are challenging to model analytically ([Foreground map errors](https://arxiv.org/abs/2211.10448)).
A key aspect of our work is the development of next-generation SBI tools ([Gradient-guided Nested Sampling](https://arxiv.org/abs/2312.03911)), particularly those based on neural ratio estimation. These methods offer significant advantages in efficiency and scalability for high-dimensional inference problems ([NRE-based SBI](https://arxiv.org/abs/2207.11457)). We are also pioneering the application of these methods to the analysis of Cosmic Microwave Background ([CMB](https://arxiv.org/abs/1908.00906)) data, Baryon Acoustic Oscillations ([BAO](https://arxiv.org/abs/1701.08165)) from surveys like DESI and 4MOST, and gravitational wave observations.
Our AI initiatives extend beyond standard density estimation to encompass a broader range of machine learning techniques, such as:
* **Emulator Development:** We develop fast and accurate emulators of complex astrophysical signals ([globalemu](https://arxiv.org/abs/2104.04336)) for efficient parameter exploration and model comparison ([Neural network emulators](https://arxiv.org/abs/2503.13263)).
* **Bayesian Neural Networks:** We explore the full posterior distribution of Bayesian neural networks for improved generalization and interpretability ([BNN marginalisation](https://arxiv.org/abs/2205.11151)).
* **Automated Model Building:** We are developing novel techniques to automate the process of building and testing theoretical cosmological models using a combination of symbolic computation and machine learning ([Automated model building](https://arxiv.org/abs/2006.03581)).
Additionally, we are active in the development and application of advanced sampling methods like nested sampling ([Nested sampling review](https://arxiv.org/abs/2205.15570)), including dynamic nested sampling ([Dynamic nested sampling](https://arxiv.org/abs/1704.03459)) and its acceleration through techniques like posterior repartitioning ([Accelerated nested sampling](https://arxiv.org/abs/2411.17663)).
## Highlight Achievements
Our group has a strong publication record in high-impact journals and on the arXiv preprint server. Some key highlights include:
* Development of novel AI-driven methods for analyzing the 21-cm signal from the Cosmic Dawn ([21-cm analysis](https://arxiv.org/abs/2201.11531)).
* Contributing to the Planck Collaboration's analysis of CMB data ([Planck 2018](https://arxiv.org/abs/1807.06205)).
* Development of the PolyChord nested sampling software ([PolyChord](https://arxiv.org/abs/1506.00171)), which is now widely used in cosmological analyses.
* Contributions to the GAMBIT global fitting framework ([GAMBIT CosmoBit](https://arxiv.org/abs/2009.03286)).
* Applying SBI to constrain dark matter models ([Dirac Dark Matter EFTs](https://arxiv.org/abs/2106.02056)).
## Future Directions
We are committed to pushing the boundaries of cosmological analysis through our ongoing and future projects, including:
* Applying SBI to test extensions of General Relativity ([Modified Gravity](https://arxiv.org/abs/2006.03581)).
* Developing AI-driven tools for efficient and robust calibration of cosmological experiments ([Calibration for astrophysical experimentation](https://arxiv.org/abs/2307.00099)).
* Exploring the use of transformers and large language models for automating the process of cosmological model building.
* Applying our expertise to the analysis of data from next-generation surveys like Euclid, the Vera Rubin Observatory, and the Square Kilometre Array. This will allow us to probe the nature of dark energy with increased precision ([Dynamical Dark Energy](https://arxiv.org/abs/2503.08658)), search for parity violation in the large-scale structure ([Parity Violation](https://arxiv.org/abs/2410.16030)), and explore a variety of other fundamental questions.
Content generated by [gemini-1.5-pro](https://deepmind.google/technologies/gemini/) using [this prompt](/prompts/content/index.txt).
Image generated by [imagen-3.0-generate-002](https://deepmind.google/technologies/gemini/) using [this prompt](/prompts/images/index.txt).
```
2. **Paper Metadata:**
```yaml
!!python/object/new:feedparser.util.FeedParserDict
dictitems:
id: http://arxiv.org/abs/2411.17663v1
guidislink: true
link: http://arxiv.org/abs/2411.17663v1
updated: '2024-11-26T18:26:20Z'
updated_parsed: !!python/object/apply:time.struct_time
- !!python/tuple
- 2024
- 11
- 26
- 18
- 26
- 20
- 1
- 331
- 0
- tm_zone: null
tm_gmtoff: null
published: '2024-11-26T18:26:20Z'
published_parsed: !!python/object/apply:time.struct_time
- !!python/tuple
- 2024
- 11
- 26
- 18
- 26
- 20
- 1
- 331
- 0
- tm_zone: null
tm_gmtoff: null
title: "Accelerated nested sampling with $\u03B2$-flows for gravitational waves"
title_detail: !!python/object/new:feedparser.util.FeedParserDict
dictitems:
type: text/plain
language: null
base: ''
value: "Accelerated nested sampling with $\u03B2$-flows for gravitational waves"
summary: 'There is an ever-growing need in the gravitational wave community for
fast
and reliable inference methods, accompanied by an informative error bar. Nested
sampling satisfies the last two requirements, but its computational cost can
become prohibitive when using the most accurate waveform models. In this paper,
we demonstrate the acceleration of nested sampling using a technique called
posterior repartitioning. This method leverages nested sampling''s unique
ability to separate prior and likelihood contributions at the algorithmic
level. Specifically, we define a `repartitioned prior'' informed by the
posterior from a low-resolution run. To construct this repartitioned prior, we
use a $\beta$-flow, a novel type of conditional normalizing flow designed to
better learn deep tail probabilities. $\beta$-flows are trained on the entire
nested sampling run and conditioned on an inverse temperature $\beta$. Applying
our methods to simulated and real binary black hole mergers, we demonstrate how
they can reduce the number of likelihood evaluations required for convergence
by up to an order of magnitude, enabling faster model comparison and parameter
estimation. Furthermore, we highlight the robustness of using $\beta$-flows
over standard normalizing flows to accelerate nested sampling. Notably,
$\beta$-flows successfully recover the same posteriors and evidences as
traditional nested sampling, even in cases where standard normalizing flows
fail.'
summary_detail: !!python/object/new:feedparser.util.FeedParserDict
dictitems:
type: text/plain
language: null
base: ''
value: 'There is an ever-growing need in the gravitational wave community for
fast
and reliable inference methods, accompanied by an informative error bar. Nested
sampling satisfies the last two requirements, but its computational cost can
become prohibitive when using the most accurate waveform models. In this paper,
we demonstrate the acceleration of nested sampling using a technique called
posterior repartitioning. This method leverages nested sampling''s unique
ability to separate prior and likelihood contributions at the algorithmic
level. Specifically, we define a `repartitioned prior'' informed by the
posterior from a low-resolution run. To construct this repartitioned prior,
we
use a $\beta$-flow, a novel type of conditional normalizing flow designed
to
better learn deep tail probabilities. $\beta$-flows are trained on the entire
nested sampling run and conditioned on an inverse temperature $\beta$. Applying
our methods to simulated and real binary black hole mergers, we demonstrate
how
they can reduce the number of likelihood evaluations required for convergence
by up to an order of magnitude, enabling faster model comparison and parameter
estimation. Furthermore, we highlight the robustness of using $\beta$-flows
over standard normalizing flows to accelerate nested sampling. Notably,
$\beta$-flows successfully recover the same posteriors and evidences as
traditional nested sampling, even in cases where standard normalizing flows
fail.'
authors:
- !!python/object/new:feedparser.util.FeedParserDict
dictitems:
name: Metha Prathaban
- !!python/object/new:feedparser.util.FeedParserDict
dictitems:
name: Harry Bevins
- !!python/object/new:feedparser.util.FeedParserDict
dictitems:
name: Will Handley
author_detail: !!python/object/new:feedparser.util.FeedParserDict
dictitems:
name: Will Handley
author: Will Handley
arxiv_comment: 12 pages, 13 figures
links:
- !!python/object/new:feedparser.util.FeedParserDict
dictitems:
href: http://arxiv.org/abs/2411.17663v1
rel: alternate
type: text/html
- !!python/object/new:feedparser.util.FeedParserDict
dictitems:
title: pdf
href: http://arxiv.org/pdf/2411.17663v1
rel: related
type: application/pdf
arxiv_primary_category:
term: astro-ph.IM
scheme: http://arxiv.org/schemas/atom
tags:
- !!python/object/new:feedparser.util.FeedParserDict
dictitems:
term: astro-ph.IM
scheme: http://arxiv.org/schemas/atom
label: null
- !!python/object/new:feedparser.util.FeedParserDict
dictitems:
term: astro-ph.HE
scheme: http://arxiv.org/schemas/atom
label: null
- !!python/object/new:feedparser.util.FeedParserDict
dictitems:
term: gr-qc
scheme: http://arxiv.org/schemas/atom
label: null
```
3. **Paper Source (TeX):**
```tex
% mnras_guide.tex
%
% MNRAS LaTeX user guide
%
% v3.3 released 23 April 2024
%
% v3.2 released 20 July 2023
%
% v3.1 released 11 June 2020
%
% v3.0 released 22 May 2015
% (version numbers match those of mnras.cls)
%
% Copyright (C) Royal Astronomical Society 2024
% Authors:
% Keith T. Smith (Royal Astronomical Society)
% Change log
%
% v3.3 April 2024
% Updated \pubyear element to print current year
%
% v3.2 July 2023
% Updated guidance on use of amssymb package
%
% v3.0 September 2013 - May 2015
% First version: complete rewrite of the user guide
% Basic structure taken from mnras_template.tex by the same author
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Basic setup. Most papers should leave these options alone.
\documentclass[fleqn,usenatbib,useAMS]{mnras}
%%%%% AUTHORS - PLACE YOUR OWN PACKAGES HERE %%%%%
% Only include extra packages if you really need them. Avoid using amssymb if newtxmath is enabled, as these packages can cause conflicts. newtxmath covers the same math symbols while producing a consistent Times New Roman font. Common packages are:
\usepackage{graphicx} % Including figure files
\usepackage{amsmath} % Advanced maths commands
\usepackage{multicol} % Multi-column entries in tables
\usepackage{bm} % Bold maths symbols, including upright Greek
\usepackage{pdflscape} % Landscape pages
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%% AUTHORS - PLACE YOUR OWN MACROS HERE %%%%%%
% Please keep new commands to a minimum, and use \newcommand not \def to avoid
% overwriting existing commands. Example:
%\newcommand{\pcm}{\,cm$^{-2}$} % per cm-squared
\newcommand{\kms}{\,km\,s$^{-1}$} % kilometres per second
\newcommand{\bibtex}{\textsc{Bib}\!\TeX} % bibtex. Not quite the correct typesetting, but close enough
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Use vector fonts, so it zooms properly in on-screen viewing software
% Don't change these lines unless you know what you are doing
\usepackage[T1]{fontenc}
\usepackage{ae,aecompl}
% MNRAS is set in Times font. If you don't have this installed (most LaTeX
% installations will be fine) or prefer the old Computer Modern fonts, comment
% out the following line
\usepackage{newtxtext,newtxmath}
% Depending on your LaTeX fonts installation, you might get better results with one of these:
%\usepackage{mathptmx}
%\usepackage{txfonts}
%%%%%%%%%%%%%%%%%%% TITLE PAGE %%%%%%%%%%%%%%%%%%%
% Title of the paper, and the short title which is used in the headers.
% Keep the title short and informative.
\title[MNRAS \LaTeX\ guide for authors]{\textit{Monthly Notices of the Royal Astronomical
Society}: \LaTeX\ guide for authors}
% The list of authors, and the short list which is used in the headers.
% If you need two or more lines of authors, add an extra line using \newauthor
\author[K. T. Smith]{Keith T. Smith$^{1}$%
\thanks{Contact e-mail: \href{mailto:mn@ras.ac.uk}{mn@ras.ac.uk}}%
\thanks{Present address: Science magazine, AAAS Science International, \mbox{82-88}~Hills Road, Cambridge CB2~1LQ, UK}%
\\
% List of institutions
$^{1}$Royal Astronomical Society, Burlington House, Piccadilly, London W1J 0BQ, UK}
% These dates will be filled out by the publisher
\date{Last updated 2024 April 23; in original form 2013 September 5}
% Enter the current year, for the copyright statements etc.
\pubyear{{\the\year{}}}
% Don't change these lines
\begin{document}
\label{firstpage}
\pagerange{\pageref{firstpage}--\pageref{lastpage}}
\maketitle
% Abstract of the paper
\begin{abstract}
This is a guide for preparing papers for \textit{Monthly Notices of the Royal Astronomical Society} using the \verb'mnras' \LaTeX\ package.
It provides instructions for using the additional features in the document class.
This is not a general guide on how to use \LaTeX, and nor does it replace the journal's instructions to authors.
See \texttt{mnras\_template.tex} for a simple template.
\end{abstract}
% Select between one and six entries from the list of approved keywords.
% Don't make up new ones.
\begin{keywords}
editorials, notices -- miscellaneous
\end{keywords}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%% BODY OF PAPER %%%%%%%%%%%%%%%%%%
% The MNRAS class isn't designed to include a table of contents, but for this document one is useful.
% I therefore have to do some kludging to make it work without masses of blank space.
\begingroup
\let\clearpage\relax
\tableofcontents
\endgroup
\newpage
\section{Introduction}
The journal \textit{Monthly Notices of the Royal Astronomical Society} (MNRAS) encourages authors to prepare their papers using \LaTeX.
The style file \verb'mnras.cls' can be used to approximate the final appearance of the journal, and provides numerous features to simplify the preparation of papers.
This document, \verb'mnras_guide.tex', provides guidance on using that style file and the features it enables.
This is not a general guide on how to use \LaTeX, of which many excellent examples already exist.
We particularly recommend \textit{Wikibooks \LaTeX}\footnote{\url{https://en.wikibooks.org/wiki/LaTeX}}, a collaborative online textbook which is of use to both beginners and experts.
Alternatively there are several other online resources, and most academic libraries also hold suitable beginner's guides.
For guidance on the contents of papers, journal style, and how to submit a paper, see the MNRAS Instructions to Authors\footnote{\label{foot:itas}\url{http://www.oxfordjournals.org/our_journals/mnras/for_authors/}}.
Only technical issues with the \LaTeX\ class are considered here.
\section{Obtaining and installing the MNRAS package}
Some \LaTeX\ distributions come with the MNRAS package by default.
If yours does not, you can either install it using your distribution's package manager, or download it from the Comprehensive \TeX\ Archive Network\footnote{\url{http://www.ctan.org/tex-archive/macros/latex/contrib/mnras}} (CTAN).
The files can either be installed permanently by placing them in the appropriate directory (consult the documentation for your \LaTeX\ distribution), or used temporarily by placing them in the working directory for your paper.
To use the MNRAS package, simply specify \verb'mnras' as the document class at the start of a \verb'.tex' file:
\begin{verbatim}
\documentclass{mnras}
\end{verbatim}
Then compile \LaTeX\ (and if necessary \bibtex) in the usual way.
\section{Preparing and submitting a paper}
We recommend that you start with a copy of the \texttt{mnras\_template.tex} file.
Rename the file, update the information on the title page, and then work on the text of your paper.
Guidelines for content, style etc. are given in the instructions to authors on the journal's website$^{\ref{foot:itas}}$.
Note that this document does not follow all the aspects of MNRAS journal style (e.g. it has a table of contents).
If a paper is accepted, it is professionally typeset and copyedited by the publishers.
It is therefore likely that minor changes to presentation will occur.
For this reason, we ask authors to ignore minor details such as slightly long lines, extra blank spaces, or misplaced figures, because these details will be dealt with during the production process.
Papers must be submitted electronically via the online submission system; paper submissions are not permitted.
For full guidance on how to submit a paper, see the instructions to authors.
\section{Class options}
\label{sec:options}
There are several options which can be added to the document class line like this:
\begin{verbatim}
\documentclass[option1,option2]{mnras}
\end{verbatim}
The available options are:
\begin{itemize}
\item \verb'letters' -- used for papers in the journal's Letters section.
\item \verb'onecolumn' -- single column, instead of the default two columns. This should be used {\it only} if necessary for the display of numerous very long equations.
\item \verb'doublespacing' -- text has double line spacing. Please don't submit papers in this format.
\item \verb'referee' -- \textit{(deprecated)} single column, double spaced, larger text, bigger margins. Please don't submit papers in this format.
\item \verb'galley' -- \textit{(deprecated)} no running headers, no attempt to align the bottom of columns.
\item \verb'landscape' -- \textit{(deprecated)} sets the whole document on landscape paper.
\item \verb"usenatbib" -- \textit{(all papers should use this)} this uses Patrick Daly's \verb"natbib.sty" package for citations.
\item \verb"usegraphicx" -- \textit{(most papers will need this)} includes the \verb'graphicx' package, for inclusion of figures and images.
\item \verb'useAMS' -- adds support for upright Greek characters \verb'\upi', \verb'\umu' and \verb'\upartial' ($\upi$, $\umu$ and $\upartial$). Only these three are included, if you require other symbols you will need to include the \verb'amsmath' package (see section~\ref{sec:packages}).
\item \verb"usedcolumn" -- includes the package \verb"dcolumn", which includes two new types of column alignment for use in tables.
\end{itemize}
Some of these options are deprecated and retained for backwards compatibility only.
Others are used in almost all papers, but again are retained as options to ensure that papers written decades ago will continue to compile without problems.
If you want to include any other packages, see section~\ref{sec:packages}.
\section{Title page}
If you are using \texttt{mnras\_template.tex} the necessary code for generating the title page, headers and footers is already present.
Simply edit the title, author list, institutions, abstract and keywords as described below.
\subsection{Title}
There are two forms of the title: the full version used on the first page, and a short version which is used in the header of other odd-numbered pages (the `running head').
Enter them with \verb'\title[]{}' like this:
\begin{verbatim}
\title[Running head]{Full title of the paper}
\end{verbatim}
The full title can be multiple lines (use \verb'\\' to start a new line) and may be as long as necessary, although we encourage authors to use concise titles. The running head must be $\le~45$ characters on a single line.
See appendix~\ref{sec:advanced} for more complicated examples.
\subsection{Authors and institutions}
Like the title, there are two forms of author list: the full version which appears on the title page, and a short form which appears in the header of the even-numbered pages. Enter them using the \verb'\author[]{}' command.
If the author list is more than one line long, start a new line using \verb'\newauthor'. Use \verb'\\' to start the institution list. Affiliations for each author should be indicated with a superscript number, and correspond to the list of institutions below the author list.
For example, if I were to write a paper with two coauthors at another institution, one of whom also works at a third location:
\begin{verbatim}
\author[K. T. Smith et al.]{
Keith T. Smith,$^{1}$
A. N. Other,$^{2}$
and Third Author$^{2,3}$
\\
$^{1}$Affiliation 1\\
$^{2}$Affiliation 2\\
$^{3}$Affiliation 3}
\end{verbatim}
Affiliations should be in the format `Department, Institution, Street Address, City and Postal Code, Country'.
Email addresses can be inserted with the \verb'\thanks{}' command which adds a title page footnote.
If you want to list more than one email, put them all in the same \verb'\thanks' and use \verb'\footnotemark[]' to refer to the same footnote multiple times.
Present addresses (if different to those where the work was performed) can also be added with a \verb'\thanks' command.
\subsection{Abstract and keywords}
The abstract is entered in an \verb'abstract' environment:
\begin{verbatim}
\begin{abstract}
The abstract of the paper.
\end{abstract}
\end{verbatim}
\noindent Note that there is a word limit on the length of abstracts.
For the current word limit, see the journal instructions to authors$^{\ref{foot:itas}}$.
Immediately following the abstract, a set of keywords is entered in a \verb'keywords' environment:
\begin{verbatim}
\begin{keywords}
keyword 1 -- keyword 2 -- keyword 3
\end{keywords}
\end{verbatim}
\noindent There is a list of permitted keywords, which is agreed between all the major astronomy journals and revised every few years.
Do \emph{not} make up new keywords!
For the current list of allowed keywords, see the journal's instructions to authors$^{\ref{foot:itas}}$.
\section{Sections and lists}
Sections and lists are generally the same as in the standard \LaTeX\ classes.
\subsection{Sections}
\label{sec:sections}
Sections are entered in the usual way, using \verb'\section{}' and its variants. It is possible to nest up to four section levels:
\begin{verbatim}
\section{Main section}
\subsection{Subsection}
\subsubsection{Subsubsection}
\paragraph{Lowest level section}
\end{verbatim}
\noindent The other \LaTeX\ sectioning commands \verb'\part', \verb'\chapter' and \verb'\subparagraph{}' are deprecated and should not be used.
Some sections are not numbered as part of journal style (e.g. the Acknowledgements).
To insert an unnumbered section use the `starred' version of the command: \verb'\section*{}'.
See appendix~\ref{sec:advanced} for more complicated examples.
\subsection{Lists}
Two forms of lists can be used in MNRAS -- numbered and unnumbered.
For a numbered list, use the \verb'enumerate' environment:
\begin{verbatim}
\begin{enumerate}
\item First item
\item Second item
\item etc.
\end{enumerate}
\end{verbatim}
\noindent which produces
\begin{enumerate}
\item First item
\item Second item
\item etc.
\end{enumerate}
Note that the list uses lowercase Roman numerals, rather than the \LaTeX\ default Arabic numerals.
For an unnumbered list, use the \verb'description' environment without the optional argument:
\begin{verbatim}
\begin{description}
\item First item
\item Second item
\item etc.
\end{description}
\end{verbatim}
\noindent which produces
\begin{description}
\item First item
\item Second item
\item etc.
\end{description}
Bulleted lists using the \verb'itemize' environment should not be used in MNRAS; it is retained for backwards compatibility only.
\section{Mathematics and symbols}
The MNRAS class mostly adopts standard \LaTeX\ handling of mathematics, which is briefly summarised here.
See also section~\ref{sec:packages} for packages that support more advanced mathematics.
Mathematics can be inserted into the running text using the syntax \verb'$1+1=2$', which produces $1+1=2$.
Use this only for short expressions or when referring to mathematical quantities; equations should be entered as described below.
\subsection{Equations}
Equations should be entered using the \verb'equation' environment, which automatically numbers them:
\begin{verbatim}
\begin{equation}
a^2=b^2+c^2
\end{equation}
\end{verbatim}
\noindent which produces
\begin{equation}
a^2=b^2+c^2
\end{equation}
By default, the equations are numbered sequentially throughout the whole paper. If a paper has a large number of equations, it may be better to number them by section (2.1, 2.2 etc.). To do this, add the command \verb'\numberwithin{equation}{section}' to the preamble.
It is also possible to produce un-numbered equations by using the \LaTeX\ built-in \verb'\['\textellipsis\verb'\]' and \verb'$$'\textellipsis\verb'$$' commands; however MNRAS requires that all equations are numbered, so these commands should be avoided.
\subsection{Special symbols}
\begin{table}
\caption{Additional commands for special symbols commonly used in astronomy. These can be used anywhere.}
\label{tab:anysymbols}
\begin{tabular*}{\columnwidth}{@{}l@{\hspace*{50pt}}l@{\hspace*{50pt}}l@{}}
\hline
Command & Output & Meaning\\
\hline
\verb'\sun' & \sun & Sun, solar\\[2pt] % additional height spacing for enhanced symbol legibility
\verb'\earth' & \earth & Earth, terrestrial\\[2pt]
\verb'\micron' & \micron & microns\\[2pt]
\verb'\degr' & \degr & degrees\\[2pt]
\verb'\arcmin' & \arcmin & arcminutes\\[2pt]
\verb'\arcsec' & \arcsec & arcseconds\\[2pt]
\verb'\fdg' & \fdg & fraction of a degree\\[2pt]
\verb'\farcm' & \farcm & fraction of an arcminute\\[2pt]
\verb'\farcs' & \farcs & fraction of an arcsecond\\[2pt]
\verb'\fd' & \fd & fraction of a day\\[2pt]
\verb'\fh' & \fh & fraction of an hour\\[2pt]
\verb'\fm' & \fm & fraction of a minute\\[2pt]
\verb'\fs' & \fs & fraction of a second\\[2pt]
\verb'\fp' & \fp & fraction of a period\\[2pt]
\verb'\diameter' & \diameter & diameter\\[2pt]
\verb'\sq' & \sq & square, Q.E.D.\\[2pt]
\hline
\end{tabular*}
\end{table}
\begin{table}
\caption{Additional commands for mathematical symbols. These can only be used in maths mode.}
\label{tab:mathssymbols}
\begin{tabular*}{\columnwidth}{l@{\hspace*{40pt}}l@{\hspace*{40pt}}l}
\hline
Command & Output & Meaning\\
\hline
\verb'\upi' & $\upi$ & upright pi\\[2pt] % additional height spacing for enhanced symbol legibility
\verb'\umu' & $\umu$ & upright mu\\[2pt]
\verb'\upartial' & $\upartial$ & upright partial derivative\\[2pt]
\verb'\lid' & $\lid$ & less than or equal to\\[2pt]
\verb'\gid' & $\gid$ & greater than or equal to\\[2pt]
\verb'\la' & $\la$ & less than of order\\[2pt]
\verb'\ga' & $\ga$ & greater than of order\\[2pt]
\verb'\loa' & $\loa$ & less than approximately\\[2pt]
\verb'\goa' & $\goa$ & greater than approximately\\[2pt]
\verb'\cor' & $\cor$ & corresponds to\\[2pt]
\verb'\sol' & $\sol$ & similar to or less than\\[2pt]
\verb'\sog' & $\sog$ & similar to or greater than\\[2pt]
\verb'\lse' & $\lse$ & less than or homotopic to \\[2pt]
\verb'\gse' & $\gse$ & greater than or homotopic to\\[2pt]
\verb'\getsto' & $\getsto$ & from over to\\[2pt]
\verb'\grole' & $\grole$ & greater over less\\[2pt]
\verb'\leogr' & $\leogr$ & less over greater\\
\hline
\end{tabular*}
\end{table}
Some additional symbols of common use in astronomy have been added in the MNRAS class. These are shown in tables~\ref{tab:anysymbols}--\ref{tab:mathssymbols}. The command names are -- as far as possible -- the same as those used in other major astronomy journals.
Many other mathematical symbols are also available, either built into \LaTeX\ or via additional packages. If you want to insert a specific symbol but don't know the \LaTeX\ command, we recommend using the Detexify website\footnote{\url{http://detexify.kirelabs.org}}.
Sometimes font or coding limitations mean a symbol may not get smaller when used in sub- or superscripts, and will therefore be displayed at the wrong size. There is no need to worry about this as it will be corrected by the typesetter during production.
To produce bold symbols in mathematics, use \verb'\bmath' for simple variables, and the \verb'bm' package for more complex symbols (see section~\ref{sec:packages}). Vectors are set in bold italic, using \verb'\mathbfit{}'.
For matrices, use \verb'\mathbfss{}' to produce a bold sans-serif font e.g. \mathbfss{H}; this works even outside maths mode, but not all symbols are available (e.g. Greek). For $\nabla$ (del, used in gradients, divergence etc.) use \verb'$\nabla$'.
\subsection{Ions}
A new \verb'\ion{}{}' command has been added to the class file, for the correct typesetting of ionisation states.
For example, to typeset singly ionised calcium use \verb'\ion{Ca}{ii}', which produces \ion{Ca}{ii}.
\section{Figures and tables}
\label{sec:fig_table}
Figures and tables (collectively called `floats') are mostly the same as built into \LaTeX.
\subsection{Basic examples}
\begin{figure}
\includegraphics[width=\columnwidth]{example}
\caption{An example figure.}
\label{fig:example}
\end{figure}
Figures are inserted in the usual way using a \verb'figure' environment and \verb'\includegraphics'. The example Figure~\ref{fig:example} was generated using the code:
\begin{verbatim}
\begin{figure}
\includegraphics[width=\columnwidth]{example}
\caption{An example figure.}
\label{fig:example}
\end{figure}
\end{verbatim}
\begin{table}
\caption{An example table.}
\label{tab:example}
\begin{tabular}{lcc}
\hline
Star & Mass & Luminosity\\
& $M_{\sun}$ & $L_{\sun}$\\
\hline
Sun & 1.00 & 1.00\\
$\alpha$~Cen~A & 1.10 & 1.52\\
$\epsilon$~Eri & 0.82 & 0.34\\
\hline
\end{tabular}
\end{table}
The example Table~\ref{tab:example} was generated using the code:
\begin{verbatim}
\begin{table}
\caption{An example table.}
\label{tab:example}
\begin{tabular}{lcc}
\hline
Star & Mass & Luminosity\\
& $M_{\sun}$ & $L_{\sun}$\\
\hline
Sun & 1.00 & 1.00\\
$\alpha$~Cen~A & 1.10 & 1.52\\
$\epsilon$~Eri & 0.82 & 0.34\\
\hline
\end{tabular}
\end{table}
\end{verbatim}
\subsection{Captions and placement}
Captions go \emph{above} tables but \emph{below} figures, as in the examples above.
The \LaTeX\ float placement commands \verb'[htbp]' are intentionally disabled.
Layout of figures and tables will be adjusted by the publisher during the production process, so authors should not concern themselves with placement to avoid disappointment and wasted effort.
Simply place the \LaTeX\ code close to where the figure or table is first mentioned in the text and leave exact placement to the publishers.
By default a figure or table will occupy one column of the page.
To produce a wider version which covers both columns, use the \verb'figure*' or \verb'table*' environment.
If a figure or table is too long to fit on a single page it can be split into several parts.
Create an additional figure or table which uses \verb'\contcaption{}' instead of \verb'\caption{}'.
This will automatically correct the numbering and add `\emph{continued}' at the start of the caption.
\begin{table}
\contcaption{A table continued from the previous one.}
\label{tab:continued}
\begin{tabular}{lcc}
\hline
Star & Mass & Luminosity\\
& $M_{\sun}$ & $L_{\sun}$\\
\hline
$\tau$~Cet & 0.78 & 0.52\\
$\delta$~Pav & 0.99 & 1.22\\
$\sigma$~Dra & 0.87 & 0.43\\
\hline
\end{tabular}
\end{table}
Table~\ref{tab:continued} was generated using the code:
\begin{verbatim}
\begin{table}
\contcaption{A table continued from the previous one.}
\label{tab:continued}
\begin{tabular}{lcc}
\hline
Star & Mass & Luminosity\\
& $M_{\sun}$ & $L_{\sun}$\\
\hline
$\tau$~Cet & 0.78 & 0.52\\
$\delta$~Pav & 0.99 & 1.22\\
$\sigma$~Dra & 0.87 & 0.43\\
\hline
\end{tabular}
\end{table}
\end{verbatim}
To produce a landscape figure or table, use the \verb'pdflscape' package and the \verb'landscape' environment.
The landscape Table~\ref{tab:landscape} was produced using the code:
\begin{verbatim}
\begin{landscape}
\begin{table}
\caption{An example landscape table.}
\label{tab:landscape}
\begin{tabular}{cccccccccc}
\hline
Header & Header & ...\\
Unit & Unit & ...\\
\hline
Data & Data & ...\\
Data & Data & ...\\
...\\
\hline
\end{tabular}
\end{table}
\end{landscape}
\end{verbatim}
Unfortunately this method will force a page break before the table appears.
More complicated solutions are possible, but authors shouldn't worry about this.
\begin{landscape}
\begin{table}
\caption{An example landscape table.}
\label{tab:landscape}
\begin{tabular}{cccccccccc}
\hline
Header & Header & Header & Header & Header & Header & Header & Header & Header & Header\\
Unit & Unit & Unit & Unit & Unit & Unit & Unit & Unit & Unit & Unit \\
\hline
Data & Data & Data & Data & Data & Data & Data & Data & Data & Data\\
Data & Data & Data & Data & Data & Data & Data & Data & Data & Data\\
Data & Data & Data & Data & Data & Data & Data & Data & Data & Data\\
Data & Data & Data & Data & Data & Data & Data & Data & Data & Data\\
Data & Data & Data & Data & Data & Data & Data & Data & Data & Data\\
Data & Data & Data & Data & Data & Data & Data & Data & Data & Data\\
Data & Data & Data & Data & Data & Data & Data & Data & Data & Data\\
Data & Data & Data & Data & Data & Data & Data & Data & Data & Data\\
\hline
\end{tabular}
\end{table}
\end{landscape}
\section{References and citations}
\subsection{Cross-referencing}
The usual \LaTeX\ commands \verb'\label{}' and \verb'\ref{}' can be used for cross-referencing within the same paper.
We recommend that you use these whenever relevant, rather than writing out the section or figure numbers explicitly.
This ensures that cross-references are updated whenever the numbering changes (e.g. during revision) and provides clickable links (if available in your compiler).
It is best to give each section, figure and table a logical label.
For example, Table~\ref{tab:mathssymbols} has the label \verb'tab:mathssymbols', whilst section~\ref{sec:packages} has the label \verb'sec:packages'.
Add the label \emph{after} the section or caption command, as in the examples in sections~\ref{sec:sections} and \ref{sec:fig_table}.
Enter the cross-reference with a non-breaking space between the type of object and the number, like this: \verb'see Figure~\ref{fig:example}'.
The \verb'\autoref{}' command can be used to automatically fill out the type of object, saving on typing.
It also causes the link to cover the whole phrase rather than just the number, but for that reason is only suitable for single cross-references rather than ranges.
For example, \verb'\autoref{tab:journal_abbr}' produces \autoref{tab:journal_abbr}.
\subsection{Citations}
\label{sec:cite}
MNRAS uses the Harvard -- author (year) -- citation style, e.g. \citet{author2013}.
This is implemented in \LaTeX\ via the \verb'natbib' package, which in turn is included via the \verb'usenatbib' package option (see section~\ref{sec:options}), which should be used in all papers.
Each entry in the reference list has a `key' (see section~\ref{sec:ref_list}) which is used to generate citations.
There are two basic \verb'natbib' commands:
\begin{description}
\item \verb'\citet{key}' produces an in-text citation: \citet{author2013}
\item \verb'\citep{key}' produces a bracketed (parenthetical) citation: \citep{author2013}
\end{description}
Citations will include clickable links to the relevant entry in the reference list, if supported by your \LaTeX\ compiler.
\defcitealias{smith2014}{Paper~I}
\begin{table*}
\caption{Common citation commands, provided by the \texttt{natbib} package.}
\label{tab:natbib}
\begin{tabular}{lll}
\hline
Command & Output & Note\\
\hline
\verb'\citet{key}' & \citet{smith2014} & \\
\verb'\citep{key}' & \citep{smith2014} & \\
\verb'\citep{key,key2}' & \citep{smith2014,jones2015} & Multiple papers\\
\verb'\citet[table 4]{key}' & \citet[table 4]{smith2014} & \\
\verb'\citep[see][figure 7]{key}' & \citep[see][figure 7]{smith2014} & \\
\verb'\citealt{key}' & \citealt{smith2014} & For use with manual brackets\\
\verb'\citeauthor{key}' & \citeauthor{smith2014} & If already cited in close proximity\\
\verb'\defcitealias{key}{Paper~I}' & & Define an alias (doesn't work in floats)\\
\verb'\citetalias{key}' & \citetalias{smith2014} & \\
\verb'\citepalias{key}' & \citepalias{smith2014} & \\
\hline
\end{tabular}
\end{table*}
There are a number of other \verb'natbib' commands which can be used for more complicated citations.
The most commonly used ones are listed in Table~\ref{tab:natbib}.
For full guidance on their use, consult the \verb'natbib' documentation\footnote{\url{http://www.ctan.org/pkg/natbib}}.
If a reference has several authors, \verb'natbib' will automatically use `et al.' if there are more than two authors. However, if a paper has exactly three authors, MNRAS style is to list all three on the first citation and use `et al.' thereafter. If you are using \bibtex\ (see section~\ref{sec:ref_list}) then this is handled automatically. If not, the \verb'\citet*{}' and \verb'\citep*{}' commands can be used at the first citation to include all of the authors.
\subsection{The list of references}
\label{sec:ref_list}
It is possible to enter references manually using the usual \LaTeX\ commands, but we strongly encourage authors to use \bibtex\ instead.
\bibtex\ ensures that the reference list is updated automatically as references are added or removed from the paper, puts them in the correct format, saves on typing, and the same reference file can be used for many different papers -- saving time hunting down reference details.
An MNRAS \bibtex\ style file, \verb'mnras.bst', is distributed as part of this package.
The rest of this section will assume you are using \bibtex.
References are entered into a separate \verb'.bib' file in standard \bibtex\ formatting.
This can be done manually, or there are several software packages which make editing the \verb'.bib' file much easier.
We particularly recommend \textsc{JabRef}\footnote{\url{http://jabref.sourceforge.net/}}, which works on all major operating systems.
\bibtex\ entries can be obtained from the NASA Astrophysics Data System\footnote{\label{foot:ads}\url{http://adsabs.harvard.edu}} (ADS) by clicking on `Bibtex entry for this abstract' on any entry.
Simply copy this into your \verb'.bib' file or into the `BibTeX source' tab in \textsc{JabRef}.
Each entry in the \verb'.bib' file must specify a unique `key' to identify the paper, the format of which is up to the author.
Simply cite it in the usual way, as described in section~\ref{sec:cite}, using the specified key.
Compile the paper as usual, but add an extra step to run the \texttt{bibtex} command.
Consult the documentation for your compiler or latex distribution.
Correct formatting of the reference list will be handled by \bibtex\ in almost all cases, provided that the correct information was entered into the \verb'.bib' file.
Note that ADS entries are not always correct, particularly for older papers and conference proceedings, so may need to be edited.
If in doubt, or if you are producing the reference list manually, see the MNRAS instructions to authors$^{\ref{foot:itas}}$ for the current guidelines on how to format the list of references.
\section{Appendices and online material}
To start an appendix, simply place the \verb'\appendix' command before the next \verb'\section{}'.
This will automatically adjust the section headings, figures, tables, and equations to reflect the fact that they are part of an appendix.
It is only necessary to enter the \verb'\appendix' command once -- everything after that command is in an appendix.
Remember that appendices should be placed \textit{after} the list of references.
Unlike other astronomy class files, there are no special commands for online material.
If your paper has any online material, it should be placed in a separate file.
See our instructions to authors$^{\ref{foot:itas}}$ for guidance.
\section{Packages and custom commands}
\label{sec:packages}
\subsection{Additional packages}
Sometimes authors need to include additional \LaTeX\ packages, which provide extra features.
For example, the \verb'bm' package provides extra bold maths symbols, whilst the \verb'pdflscape' package adds support for landscape pages.
Packages can be included by adding the \verb'\usepackage{}' command to the preamble of the document (not the main body).
Please \emph{only include packages which are actually used in the paper}, and include a comment to explain what each one does.
This will assist the typesetters.
If you are using \texttt{mnras\_template.tex}, it includes a specific section for this purpose, near the start of the file with the header 'authors - place your own packages here'.
For example, to include \verb'pdflscape', use:
\begin{verbatim}
\usepackage{pdflscape} % Landscape pages
\end{verbatim}
Consult the documentation for that package for instructions on how to use the additional features.
\subsection{Custom commands}
Authors should avoid duplicating or redefining commands which are already available in \LaTeX\ or \verb'mnras.cls'.
However it may sometimes be necessary to introduce a custom command e.g. as a shortcut while writing the paper.
Please \emph{only include commands which are actually used in the paper}, and include a comment to explain what each one does.
This will assist the typesetters.
Use \verb'\newcommand', \emph{not} \verb'\def', as this will avoid accidentally overwriting existing commands.
Place custom commands in the preamble of the document (not the main body).
If you are using \texttt{mnras\_template.tex}, it includes a specific section for this purpose, near the start of the file with the header 'authors - place your own commands here'.
As an example, a shortcut for the unit \kms can be defined like this:
\begin{verbatim}
\newcommand{\kms}{\,km\,s$^{-1}$} % kilometres per second
\end{verbatim}
Velocities can then be written as e.g. \verb'2.3\kms' which produces 2.3\kms.
Similar shortcuts can be used for frequently quoted object designations.
\section*{Acknowledgements}
% Entry for the table of contents, for this guide only
\addcontentsline{toc}{section}{Acknowledgements}
This guide replaces an earlier one originally prepared by Cambridge University Press (CUP) in 1994, and last updated in 2002 by Blackwell Publishing.
Some code segments are reproduced from, and some examples are based upon, that guide.
The authors were: A.~Woollatt, M.~Reed, R.~Mulvey, K.~Matthews, D.~Starling, Y.~Yu, A.~Richardson (all CUP), and Penny~Smith, N.~Thompson and Gregor~Hutton (all Blackwell), whose work is gratefully acknowledged.
The accompanying \bibtex\ style file was written by John Sleath, Tim Jenness and Norman Gray, without whom \bibtex\ support would not have been possible.
Some special symbols in tables~\ref{tab:anysymbols}--\ref{tab:mathssymbols} were taken from the Springer Verlag \textit{Astronomy \& Astrophysics} \LaTeX\ class, with their permission.
KTS thanks Nelson Beebe (University of Utah) for helpful advice regarding CTAN.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section*{Data Availability}
The inclusion of a Data Availability Statement is a requirement for articles published in MNRAS. Data Availability Statements provide a standardised format for readers to understand the availability of data underlying the research results described in the article. The statement may refer to original data generated in the course of the study or to third-party data analysed in the article. The statement should describe and provide means of access, where possible, by linking to the data or providing the required accession numbers for the relevant databases or DOIs.
%%%%%%%%%%%%%%%%%%%% REFERENCES %%%%%%%%%%%%%%%%%%
% The best way to enter references is to use BibTeX:
%\bibliographystyle{mnras}
%\bibliography{example} % if your bibtex file is called example.bib
% Alternatively you could enter them by hand, like this:
\begin{thebibliography}{99}
\bibitem[\protect\citeauthoryear{Author}{2013}]{author2013}
Author A.~N., 2013, Journal of Improbable Astronomy, 1, 1
\bibitem[\protect\citeauthoryear{Jones}{2015}]{jones2015}
Jones C.~D., 2015, Journal of Interesting Stuff, 17, 198
\bibitem[\protect\citeauthoryear{Smith}{2014}]{smith2014}
Smith A.~B., 2014, The Example Journal, 12, 345 (Paper I)
\end{thebibliography}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%% APPENDICES %%%%%%%%%%%%%%%%%%%%%
\appendix
\section{Journal abbreviations}
\label{sec:abbreviations}
Abbreviations for cited journals can be accessed using the commands listed in table~\ref{tab:journal_abbr}.
Although some of these may appear to be outdated or rarely cited, they have been selected to be compatible with the \bibtex\ output by the NASA Astrophysics Data System$^{\ref{foot:ads}}$, commands used by other astronomy journals, and with additional entries for journals with non-standard abbreviations in MNRAS.
For journals which are not on this list, see our instructions to authors$^{\ref{foot:itas}}$ for guidance on how to abbreviate titles.
\begin{table*}
\caption{Commands for abbreviated journal names, see appendix~\ref{sec:abbreviations}.}
\label{tab:journal_abbr}
\begin{tabular}{@{}l@{\:}l@{\:}l@{}} % manual @ spacing to prevent this being too wide for a page
\hline
Command & Output & Journal name\\
\hline
\verb'\aap' or \verb'\astap' & \aap & Astronomy and Astrophysics$^a$\\
\verb'\aapr' & \aapr & The Astronomy and Astrophysics Review\\
\verb'\aaps' & \aaps & Astronomy and Astrophysics Supplement Series\\
\verb'\actaa' & \actaa & Acta Astronomica\\
\verb'\afz' & \afz & Astrofizika\\
\verb'\aj' & \aj & The Astronomical Journal\\
\verb'\ao' or \verb'\applopt' & \ao & Applied Optics\\
\verb'\aplett' & \aplett & Astrophysics Letters\\
\verb'\apj' & \apj & The Astrophysical Journal\\
\verb'\apjl' or \verb'\apjlett' & \apjl & The Astrophysical Journal Letters$^a$\\
\verb'\apjs' or \verb'\apjsupp' & \apjs & The Astrophysical Journal Supplement Series\\
\verb'\apss' & \apss & Astrophysics and Space Science\\
\verb'\araa' & \araa & Annual Review of Astronomy and Astrophysics\\
\verb'\arep' & \arep & Astronomy Reports$^b$\\
\verb'\aspc' & \aspc & Astronomical Society of the Pacific Conference Series\\
\verb'\azh' & \azh & Astronomicheskii Zhurnal$^c$\\
\verb'\baas' & \baas & Bulletin of the American Astronomical Society\\
\verb'\bac' & \bac & Bulletin of the Astronomical Institutes of Czechoslovakia\\
\verb'\bain' & \bain & Bull. Astron. Inst. Netherlands\\
\verb'\caa' & \caa & Chinese Astronomy and Astrophysics\\
\verb'\cjaa' & \cjaa & Chinese Journal of Astronomy and Astrophysics\\
\verb'\fcp' & \fcp & Fundamentals of Cosmic Physics\\
\verb'\gca' & \gca & Geochimica Cosmochimica Acta\\
\verb'\grl' & \grl & Geophysics Research Letters\\
\verb'\iaucirc' & \iaucirc & International Astronomical Union Circulars\\
\verb'\icarus' & \icarus & Icarus\\
\verb'\japa' & \japa & Journal of Astrophysics and Astronomy\\
\verb'\jcap' & \jcap & Journal of Cosmology and Astroparticle Physics\\
\verb'\jcp' & \jcp & Journal of Chemical Physics\\
\verb'\jgr' & \jgr & Journal of Geophysics Research\\
\verb'\jqsrt' & \jqsrt & Journal of Quantitative Spectroscopy and Radiative Transfer\\
\verb'\jrasc' & \jrasc & Journal of the Royal Astronomical Society of Canada\\
\verb'\memras' & \memras & Memoirs of the Royal Astronomical Society\\
\verb'\memsai' & \memsai & Memoire della Societa Astronomica Italiana\\
\verb'\mnassa' & \mnassa & Monthly Notes of the Astronomical Society of Southern Africa\\
\verb'\mnras' & \mnras & Monthly Notices of the Royal Astronomical Society$^a$\\
\verb'\na' & \na & New Astronomy\\
\verb'\nar' & \nar & New Astronomy Review\\
\verb'\nat' & \nat & Nature\\
\verb'\nphysa' & \nphysa & Nuclear Physics A\\
\verb'\pra' & \pra & Physical Review A: Atomic, molecular, and optical physics\\
\verb'\prb' & \prb & Physical Review B: Condensed matter and materials physics\\
\verb'\prc' & \prc & Physical Review C: Nuclear physics\\
\verb'\prd' & \prd & Physical Review D: Particles, fields, gravitation, and cosmology\\
\verb'\pre' & \pre & Physical Review E: Statistical, nonlinear, and soft matter physics\\
\verb'\prl' & \prl & Physical Review Letters\\
\verb'\pasa' & \pasa & Publications of the Astronomical Society of Australia\\
\verb'\pasp' & \pasp & Publications of the Astronomical Society of the Pacific\\
\verb'\pasj' & \pasj & Publications of the Astronomical Society of Japan\\
\verb'\physrep' & \physrep & Physics Reports\\
\verb'\physscr' & \physscr & Physica Scripta\\
\verb'\planss' & \planss & Planetary and Space Science\\
\verb'\procspie' & \procspie & Proceedings of the Society of Photo-Optical Instrumentation Engineers\\
\verb'\rmxaa' & \rmxaa & Revista Mexicana de Astronomia y Astrofisica\\
\verb'\qjras' & \qjras & Quarterly Journal of the Royal Astronomical Society\\
\verb'\sci' & \sci & Science\\
\verb'\skytel' & \skytel & Sky and Telescope\\
\verb'\solphys' & \solphys & Solar Physics\\
\verb'\sovast' & \sovast & Soviet Astronomy$^b$\\
\verb'\ssr' & \ssr & Space Science Reviews\\
\verb'\zap' & \zap & Zeitschrift fuer Astrophysik\\
\hline
\multicolumn{3}{l}{$^a$ Letters are designated by an L at the start of the page number, not in the journal name}\\
\multicolumn{3}{l}{\footnotesize$^b$ In 1992 the English translation of this journal changed its name from Soviet Astronomy to Astronomy Reports}\\
\multicolumn{3}{l}{\footnotesize$^c$ Including the English translation Astronomy Letters}\\
\end{tabular}
\end{table*}
\clearpage % to avoid the long table breaking up the formatting examples
\section{Advanced formatting examples}
\label{sec:advanced}
Sometimes formatting doesn't behave exactly as expected when used in titles or section headings, and must be modified to obtain the correct appearance.
Generally the publishers can fix these problems during the typesetting process after a paper is accepted, but authors may wish to adjust these themselves to minimise the possibility of errors and/or for the benefit of the refereeing process.
Below are some examples of output, followed by the \LaTeX\ code which produces them.
Most mathematics and text formatting works as expected, but some commands might not be the correct size, bold or italic.
If so they can be finessed by hand, as in the bold mathematics here:
\boxit{\huge\bf \textit{Herschel} observations of galaxies at $\bm{\delta > 60\degr}$}
\begin{verbatim}
\title{\textit{Herschel} observations of galaxies at
$\bm{\delta > 60\degr}$}
\end{verbatim}
Most fonts do not provide bold and italic versions of small capitals, so the \verb'\ion{}{}' command doesn't produce the expected output in headings.
The effect has to be `faked' using font size commands, remembering that the running head is a different style:
\boxit{\huge\bf Abundances in H\,{\Large \textbf{II}} regions}
\begin{verbatim}
\title
[Abundances in H\,{\normalsize \textit{II}} regions]
{Abundances in H\,{\Large \textbf{II}} regions}
\end{verbatim}
Complex mathematics can cause problems with links, so might require adding a less formatted short version of the heading:
\boxit{\bf 2\quad FINDING Mg\,{\sevensize II} ABSORBERS AT $\bm{z > 2}$}
\begin{verbatim}
\section
[Finding Mg II absorbers at z > 2]
{Finding M\lowercase{g}\,{\sevensize II} absorbers
at $\lowercase{\bm{z > 2}}$}
\end{verbatim}
Using square brackets in headings can cause additional linking problems, which are solved by wrapping them in \{\textellipsis\}:
\boxit{\bf 2.1\quad [C\,{\sevensize II}] 158$\bmath{\umu}$m emission}
\begin{verbatim}
\subsection
[{[C II] 158$\umu$m emission}]
{[C\,{\sevensize II}] 158$\bmath{\umu}$m
emission}
\end{verbatim}
Use \verb'\text{}' (not \verb'\rm') for non-variables in mathematics, which preserves the formatting of the surrounding text.
For the same reasons, use \verb'\textit{}' for italics (not \verb'\it').
\boxit{\bf 3.1\quad Measuring $\bm{T}_\text{eff}$ from \textit{Gaia} photometry}
\begin{verbatim}
\subsection{Measuring $\bm{T}_\text{eff}$ from
\textit{Gaia} photometry}
\end{verbatim}
\section{Additional commands for editors only}
The following commands are available for the use of editors and production staff only.
They should not be used (or modified in the template) by authors.
\begin{description}
\item \verb'\maketitle' inserts the title, authors and institution list in the correct formatting.
\item \verb'\nokeywords' tidies up the spacing if there are no keywords, but authors should always enter at least one.
\item \verb'\volume{}' sets the volume number (default is 000)
\item \verb'\pagerange{}' sets the page range. The standard template generates this automatically, starting from 1.
\item \verb'\bsp' adds the `This paper has been typeset\textellipsis' comment at the end of the paper.
The command name refers to Blackwell Science Publishing, who were the publishers at the time when MNRAS began accepting \LaTeX\ submissions in 1993.
\item \verb'\mniiiauth{}' used by the \bibtex\ style to handle MNRAS style for citing papers with three authors. It should not be used manually.
\item \verb'\eprint{}' used by the \bibtex\ style for citing arXiv eprints.
\item \verb'\doi{}' used by the \bibtex\ style for citing Digital Object Identifiers.
\end{description}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Don't change these lines
\bsp % typesetting comment
\label{lastpage}
\end{document}
% End of mnras_guide.tex
% mnras_template.tex
%
% LaTeX template for creating an MNRAS paper
%
% v3.3 released April 2024
% (version numbers match those of mnras.cls)
%
% Copyright (C) Royal Astronomical Society 2015
% Authors:
% Keith T. Smith (Royal Astronomical Society)
% Change log
%
% v3.3 April 2024
% Updated \pubyear to print the current year automatically
% v3.2 July 2023
% Updated guidance on use of amssymb package
% v3.0 May 2015
% Renamed to match the new package name
% Version number matches mnras.cls
% A few minor tweaks to wording
% v1.0 September 2013
% Beta testing only - never publicly released
% First version: a simple (ish) template for creating an MNRAS paper
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Basic setup. Most papers should leave these options alone.
\documentclass[fleqn,usenatbib]{mnras}
% MNRAS is set in Times font. If you don't have this installed (most LaTeX
% installations will be fine) or prefer the old Computer Modern fonts, comment
% out the following line
% Depending on your LaTeX fonts installation, you might get better results with one of these:
%\usepackage{mathptmx}
%\usepackage{txfonts}
% Use vector fonts, so it zooms properly in on-screen viewing software
% Don't change these lines unless you know what you are doing
\usepackage[T1]{fontenc}
\usepackage{amsfonts}
\usepackage{algorithm}
\usepackage{algpseudocode}
% Allow "Thomas van Noord" and "Simon de Laguarde" and alike to be sorted by "N" and "L" etc. in the bibliography.
% Write the name in the bibliography as "\VAN{Noord}{Van}{van} Noord, Thomas"
\DeclareRobustCommand{\VAN}[3]{#2}
\let\VANthebibliography\thebibliography
\def\thebibliography{\DeclareRobustCommand{\VAN}[3]{##3}\VANthebibliography}
%%%%% AUTHORS - PLACE YOUR OWN PACKAGES HERE %%%%%
% Only include extra packages if you really need them. Avoid using amssymb if newtxmath is enabled, as these packages can cause conflicts. newtxmath covers the same math symbols while producing a consistent Times New Roman font. Common packages are:
\usepackage{graphicx} % Including figure files
\usepackage{amsmath} % Advanced maths commands
\usepackage{cleveref}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%% AUTHORS - PLACE YOUR OWN COMMANDS HERE %%%%%
% Please keep new commands to a minimum, and use \newcommand not \def to avoid
% overwriting existing commands. Example:
%\newcommand{\pcm}{\,cm$^{-2}$} % per cm-squared
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%% TITLE PAGE %%%%%%%%%%%%%%%%%%%
% Title of the paper, and the short title which is used in the headers.
% Keep the title short and informative.
\title[Accelerated NS with $\beta$-flows]{Accelerated nested sampling with $\beta$-flows for gravitational waves}
% The list of authors, and the short list which is used in the headers.
% If you need two or more lines of authors, add an extra line using \newauthor
\author[]{
Metha Prathaban,$^{1,2,3}$\thanks{E-mail: myp23@cam.ac.uk}
Harry Bevins,$^{1,2}$
Will Handley,$^{1,2,4}$
\\
% List of institutions
$^{1}$Kavli Institute for Cosmology, Madingley Road, Cambridge CB3 0HA, UK\\
$^{2}$Astrophysics Group, Cavendish Laboratory, J.J. Thomson Avenue, Cambridge CB3 0HE, UK\\
$^{3}$Pembroke College, Trumpington Street, Cambridge CB2 1RF, UK \\
$^{4}$Gonville \& Caius College, Trinity Street, Cambridge CB2 1TA, UK
}
% These dates will be filled out by the publisher
\date{Accepted XXX. Received YYY; in original form ZZZ}
% Prints the current year, for the copyright statements etc. To achieve a fixed year, replace the expression with a number.
\pubyear{\the\year{}}
% Don't change these lines
\begin{document}
\label{firstpage}
\pagerange{\pageref{firstpage}--\pageref{lastpage}}
\maketitle
% Abstract of the paper
\begin{abstract}
There is an ever-growing need in the gravitational wave community for fast and reliable inference methods, accompanied by an informative error bar. Nested sampling satisfies the last two requirements, but its computational cost can become prohibitive when using the most accurate waveform models. In this paper, we demonstrate the acceleration of nested sampling using a technique called posterior repartitioning. This method leverages nested sampling's unique ability to separate prior and likelihood contributions at the algorithmic level. Specifically, we define a `repartitioned prior' informed by the posterior from a low-resolution run. To construct this repartitioned prior, we use a $\beta$-flow, a novel type of conditional normalizing flow designed to better learn deep tail probabilities. $\beta$-flows are trained on the entire nested sampling run and conditioned on an inverse temperature $\beta$. Applying our methods to simulated and real binary black hole mergers, we demonstrate how they can reduce the number of likelihood evaluations required for convergence by up to an order of magnitude, enabling faster model comparison and parameter estimation. Furthermore, we highlight the robustness of using $\beta$-flows over standard normalizing flows to accelerate nested sampling. Notably, $\beta$-flows successfully recover the same posteriors and evidences as traditional nested sampling, even in cases where standard normalizing flows fail.
% We exploit the unique ability of nested sampling to distinguish between prior and likelihood at the algorithm level, by defining a `helper prior' using information about the posterior from a low resolution run. We define this helper prior using a $\beta$-flow, a novel type of conditional normalizing flow. Designed to better learn deep tail probabilities, they are trained on the entire nested sampling run and conditioned on an inverse temperature $\beta$. Applying our methods to simulated and real binary black hole mergers, we demonstrate how they can reduce the number of likelihood evaluations required for convergence by up to an order of magnitude, enabling faster model comparison and parameter estimation. We also highlight the robustness of using $\beta$-flows over standard normalizing flows to accelerate nested sampling. Notably, $\beta$-flows successfully recover the same posteriors and evidences as traditional nested sampling, even in cases where standard normalizing flows fail.
\end{abstract}
% Select between one and six entries from the list of approved keywords.
% Don't make up new ones.
\begin{keywords}
keyword1 -- keyword2 -- keyword3
\end{keywords}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%% BODY OF PAPER %%%%%%%%%%%%%%%%%%
\section{Introduction}
Nested sampling (NS)~\citep{Skilling2006NS} is a Bayesian inference tool widely used across the physical sciences, including in the analysis of gravitational wave data~\citep{Ashton2022NSReview, Thrane_2019, Veitch_LAL, bilby_paper}. Unlike many Bayesian inference algorithms that focus solely on approximating the posterior distribution from a given likelihood and prior, nested sampling first evaluates the Bayesian evidence. This evidence, obtained by evaluating an integral over the parameter space, is essential for model comparison and tension quantification. Samples from the normalized posterior can then be drawn as a byproduct of this calculation.
While the ability to compute evidences is a key advantage, nested sampling can be slower than alternative posterior samplers, such as Metropolis-Hastings~\citep{metropolis1953, hastings1970}. This challenge is particularly pronounced in gravitational wave data analysis, where the use of high-fidelity waveform models or models incorporating additional physics can make likelihood evaluations prohibitively expensive. Consequently, reducing the wall-time for inference has been the focus of significant research efforts~\citep{likelihood_reweighting, DINGO, ROQ}.
% ; through the estimation of this integral over the parameter space, samples on the normalized posterior can then be drawn.
% Though calculating the evidence is vital for model comparison and tension quantification, nested sampling algorithms can be slower than alternative posterior samplers, such as Metropolis-Hastings~\citep{metropolis1953, hastings1970}. This is especially important for the analysis of gravitational wave data, where waveform models which are more accurate or incorporate additional physics can be prohibitively expensive to evaluate, leading to significant research efforts to bring down the wall-time of inference~\citep{likelihood_reweighting, DINGO, ROQ}.
Several methods have been proposed to accelerate the core NS algorithm~\citep{Supernest, DynamicNS}, with one promising solution being posterior repartitioning (PR)~\citep{PR1}. Originally introduced to solve the problem of unrepresentative priors, this approach takes advantage of NS's unique ability to distinguish between the prior and the likelihood, by sampling from the prior, $\pi$, subject to the hard likelihood constraint, $\mathcal{L}$. Other techniques, such as Hamiltonian Monte Carlo~\citep{HMC1, HMC2} and Metropolis-Hastings, are only sensitive to the product of the two. PR works by redistributing parts of the likelihood into the prior that NS sees, thereby reducing the number of iterations of the algorithm required for convergence~\citep{Supernest}. The main difficulty lies in defining the optimal prior for this purpose.
% There have been several historical attempts at accelerating the core nested sampling algorithm~\citep{Supernest, DynamicNS}, and one promising solution is posterior repartitioning (PR)~\citep{PR1}. Originally proposed to solve the problem of unrepresentative priors, this approach exploits the uniqueness of nested sampling in distinguishing between the prior and the likelihood, by sampling from the prior, $\pi$, subject to the hard likelihood constraint, $\mathcal{L}$. Other techniques, such as Hamiltonian Monte Carlo~\citep{HMC1, HMC2} and Metropolis-Hastings, are only sensitive to the product of the two. This property of nested sampling can be used to our advantage by moving portions of the likelihood into the prior that nested sampling sees, such that fewer iterations of the algorithm are required~\citep{Supernest}. The difficulty is in choosing the optimal prior.
Normalizing flows (NFs) offer a promising approach to addressing this. These versatile generative modelling tools have been widely adopted in the scientific community for tasks ranging from performing efficient joint analyses~\citep{Bevins2022margarine1, Bevins2023margarine2} to evaluating Bayesian statistics like the Kullback-Leibler divergence in a marginal framework~\citep{Bevins2023margarine2, Pochinda2023Constraints, Gessey-Jones2024Constraints}, as region samplers in the nested sampling algorithm~\citep{Williams2021Nessai}, as proposals for importance sampling and MCMC methods~\citep{Papamakarios2015ImportanceSampling, Paige2016SMC, Matthews2022SMC} and as a foundation for Simulation Based Inference~\citep{Fan2012NLE, Papamakarios2016NPE}, among others.
Importantly, they can also be used to define non-trivial priors~\citep{Alsing2022anyprior, Bevins2023margarine2}, making them ideal candidates for use as repartitioned priors in PR to speed up NS. Central to the success of this application of normalizing flows, and indeed of all the above applications, is the accuracy of the flow in representing the distribution it aims to learn. In this paper, we will demonstrate empirically that the accuracy of commonly used normalizing flow architectures is often poor in the tails of the distribution. We introduce $\beta$-flows, which are trained on the whole nested sampling run and conditioned on an inverse temperature $\beta$, analogous to the inverse temperature in statistical mechanics. Since an NS run passes through the deep tails of the posterior, $\beta$-flows are able to learn the tails of target distributions more accurately. We show that replacing standard normalizing flows with $\beta$-flows can lead to improvements in the runtime and robustness of PR-accelerated NS.
In the following section, we lay out the necessary background. We then introduce $\beta$-flows and describe the methodology used in our analyses in Section~\ref{methods}, and present and discuss our results in Section~\ref{results}. Finally, conclusions are presented in Section~\ref{conclusions}.
\section{Background}
Section~\ref{sec:NS_intro} provides a brief overview of the key concepts of nested sampling and establishes notation. For a more detailed review, readers are directed to~\cite{Skilling2006NS} and~\cite{Ashton2022NSReview} for general information on NS, and to~\cite{Polychord1} for specifics about \textsc{PolyChord}, the NS implementation used in this work. Sections~\ref{sec:runtime} and~\ref{sec:PR} provide background on the runtime of NS and outline posterior repartitioning, introducing key aspects that extend beyond the standard nested sampling framework.
\subsection{Nested sampling and Bayesian inference}\label{sec:NS_intro}
The nested sampling algorithm, first proposed by~\cite{Skilling2006NS}, is a technique whose primary goal is to calculate the evidence term in Bayes' theorem. Given some model $\mathcal{M}$ and observed data $D$, Bayes' theorem enables us to relate the posterior probability of a set of parameters $\theta$ to the likelihood, $\mathcal{L}$, of $D$ given $\theta$ and the prior probability, $\pi$, of $\theta$ given $\mathcal{M}$
\begin{equation}
P(\theta | D, \mathcal{M}) = \frac{P(D | \theta, \mathcal{M}) P(\theta | \mathcal{M})}{P(D | \mathcal{M})} = \frac{\mathcal{L}(D | \theta)\pi(\theta)}{\mathcal{Z}}.
\end{equation}
In general, the evidence, $\mathcal{Z}$, is a many dimensional integral over the parameter space:
% The explicit dependence on the model will henceforth be dropped for notational conciseness. In gravitational wave analyses, where the observed data are modelled as the sum of a compact binary coalescence signal and a noise component, the likelihood is typically expressed in a manner similar to~\cite{Thrane_2019} or~\cite{TemperedImportanceSampling} as:
% \begin{equation}
% \mathcal{L}(D | \theta) = \frac{1}{|\sqrt{2\pi\Sigma_n}|} \textrm{exp}\left(-\frac{1}{2} (D - \mu(\theta))^T \Sigma_n^{-1} (D - \mu(\theta)) \right),
% \end{equation}
% where $\mu(\theta)$ is a template for the gravitational wave strain given $\theta$. $\Sigma_n$ is the detector noise, which is assumed to be stationary and characterized as a zero-mean Gaussian with a known covariance estimated from the power spectral density~\citep{Veitch_LAL}.
% The evidence, $\mathcal{Z} = P(D)$, represents the probability of the data within the assumed model; it is not merely a normalization constant for the posterior, but plays an instrumental role in model comparison and tension quantification. In general, the evidence is a many dimensional integral over the parameter space:
\begin{equation}
\mathcal{Z} = \int \mathcal{L}(\theta) \pi(\theta) d\theta.
\end{equation}
The innovation of NS is in transforming this into a one dimensional problem, by defining the integral in terms of the fractional prior volume enclosed by a given iso-likelihood contour at $\mathcal{L}^\ast$ in the parameter space:
\begin{equation}
X(\mathcal{L}^\ast) = \int_{\mathcal{L}>\mathcal{L}^\ast} \pi(\theta)d\theta.
\end{equation}
In this way, the integral may be written as:
\begin{equation}\label{evidenceX}
\mathcal{Z} = \int \mathcal{L}(X) dX.
\end{equation}
The NS algorithm begins by populating the prior with a set of `live points'. At each iteration $i$, the live point with the lowest likelihood is deleted, and a new live point is sampled from the prior with the constraint that its likelihood, $\mathcal{L}$, must be higher than that of the deleted point, $\mathcal{L}^\ast$. The algorithm terminates once a predetermined stopping criterion is satisfied, at which point the evidence may be estimated as a weighted sum over the deleted, or `dead', points; the weights correspond to the fractional prior volumes of the `shells' enclosed between successive dead points, $w_i = \Delta X_i = X_{i-1} - X_i$:
% The NS algorithm begins by populating the prior with a set `live points'. At each iteration $i$, the live point with the lowest likelihood is deleted, and a new live point is sampled from the prior with the constraint that its likelihood, $\mathcal{L}$, must be higher than that of the deleted point, $\mathcal{L}^\ast$. At each step, the prior is compressed by a factor $t$ such that $X_{i+1} = t X_i$, where $t$ is the largest of N random numbers from \texttt{Uniform[0,1]} and $X_i$ is the fractional prior volume occupied by the live points at iteration $i$. The algorithm terminates once some set stopping criterion is satisfied, at which point the evidence may be estimated as a weighted sum over the deleted, or `dead', points; the weights correspond to the fractional prior volumes of the `shells' enclosed between successive dead points, $w_i = \Delta X_i = X_{i-1} - X_i$.
\begin{equation}\label{evidence_weightedsum}
\mathcal{Z} = \sum_{\textrm{dead points}} \mathcal{L}_i w_i.
\end{equation}
The posterior weights of the dead points are given by
\begin{equation}\label{dead_posteriorweight}
p_i = \frac{w_i \mathcal{L}_i}{\mathcal{Z}}.
\end{equation}
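For illustration, a deliberately naive \textsc{python} sketch of the basic loop described above is given below; the rejection step and the simple stopping rule are pedagogical stand-ins for the far more efficient constrained samplers used by implementations such as \textsc{PolyChord}:
\begin{verbatim}
import numpy as np

def nested_sampling(log_likelihood, prior_sample,
                    n_live=100, tol=1e-3):
    # Toy nested sampler (illustrative only).
    live = prior_sample(n_live)      # populate the prior with live points
    live_logL = np.array([log_likelihood(t) for t in live])
    logZ, X_prev = -np.inf, 1.0
    while True:
        i = int(np.argmin(live_logL))        # lowest-likelihood point dies
        X = X_prev * n_live / (n_live + 1.0) # mean compression per iteration
        logw = np.log(X_prev - X)            # shell volume w_i = X_{i-1} - X_i
        logZ = np.logaddexp(logZ, live_logL[i] + logw)
        while True:                          # naive rejection: draw until L > L*
            theta = prior_sample(1)[0]
            if log_likelihood(theta) > live_logL[i]:
                break
        live[i], live_logL[i] = theta, log_likelihood(theta)
        X_prev = X
        # stop once the evidence left in the live points is negligible
        if live_logL.max() + np.log(X) < np.log(tol) + logZ:
            return logZ
\end{verbatim}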
\begin{figure}\vspace{-50pt}
\centering
\def\svgwidth{1.2\columnwidth}
\input{nested_sampling.pdf_tex}
\vspace{-260pt}
\caption{Schematic of a nested sampling run. Each dead point defines an iso-likelihood contour in the parameter space (left), which then encloses a certain fractional prior volume (right). As the points compress towards the peak of the likelihood, they enclose smaller and smaller fractional volumes.}
\end{figure}
\subsection{Runtime and acceleration of NS}\label{sec:runtime}
The nested sampling algorithm typically terminates when the estimated evidence remaining in the live points is below some set fraction of the accumulated evidence so far. The total convergence time may be expressed as~\citep{Supernest}:
\begin{equation}\label{runtime}
T \propto T_\mathcal{L} \times f_\textrm{sampler} \times D_{\textrm{KL}} \times n_\textrm{live},
\end{equation}
where $n_\textrm{live}$ is the number of live points, $T_\mathcal{L}$ is the time taken for a single likelihood evaluation, $f_\textrm{sampler}$ encapsulates the average number of calls to the likelihood function to choose a new live point, dependent on the sampler implementation, and $D_\mathrm{KL}$ is the Kullback-Leibler divergence, representing the amount of compression from prior to posterior. This is defined as:
\begin{equation}
\mathcal{D}_{\textrm{KL}} = \int \mathcal{P}(\theta) \ln \frac{\mathcal{P}(\theta)}{\pi(\theta)}\, d\theta.
\end{equation}
Historically in gravitational wave analyses, much of the efforts in bringing down the wall-time for inference has focused on the $T_\mathcal{L}$ term, which involves developing faster waveform models through various approximations~\citep{TL_phenomD, IMRPhenomXPHM, TL_fastmodel, TL_ROQreview, TL_multibandinterpolation, TL_relativebinning}. Meanwhile, the nested sampling community has emphasized developing samplers which reduce the $f_{\textrm{sampler}}$ term~\citep{Polychord1, Polychord2, NS_multinest, NS_multinest2, NS_cosmonest, NS_cosmonest2, NS_dynesty, NS_dypolychord, NS_ultranest, Williams2021Nessai, NS1, NS2, NS3, NS4, NS5, NS6, NS7, NS8, NS9, NS10, NS11, NS12}. The aim of this paper is to accelerate NS by taking advantage of the runtime's dependence on the KL divergence term.
The KL divergence is particularly important because it appears again in the uncertainty of the accumulated evidence. We may express the uncertainty in $\log\mathcal{Z}$ as
\begin{equation}\label{uncertainty}
\sigma_{\log\mathcal{Z}} \propto \sqrt{D_\textrm{KL} / n_\textrm{live}}.
\end{equation}
For a fixed uncertainty $\sigma$, $n_\mathrm{live}$ is directly proportional to $D_\mathrm{KL}$: a lower KL divergence allows for fewer live points, further reducing the time to convergence without sacrificing precision. In this sense, the precision-normalized runtime of NS has a quadratic dependence on the KL divergence. Thus, an effective way to accelerate NS is to reduce the amount of compression from prior to posterior.
In practice, one way to achieve this is to first perform a low resolution pass of NS to identify roughly the region of the parameter space where the posterior lies. Then, a narrower box prior can be set in this region for the high resolution pass. The tighter prior used in the second pass reduces the KL divergence between the prior and posterior. However, since the prior has changed, the evidence from the second pass will not be the desired evidence. For simple box priors, this can be corrected after the run by multiplying the second pass's evidence by the ratio of the prior volumes to recover the original evidence. For more details and an application of this method, see, for example,~\cite{anstey}.
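As a simple illustration of this volume correction: if the original prior is uniform over a volume $V$ and the second pass instead uses a uniform prior over a smaller volume $\tilde{V} \subset V$ that still contains essentially all of the posterior mass, then the tighter prior is just the original one rescaled by $V/\tilde{V}$ inside $\tilde{V}$, so the two evidences are related by
\begin{equation}
\mathcal{Z} \approx \tilde{\mathcal{Z}} \times \frac{\tilde{V}}{V}.
\end{equation}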
This method can be further improved by training a normalizing flow (NF) on the rough posterior from the low resolution pass and using this as the new prior for the high resolution pass, instead of a simple box. NFs are generative models which transform a base distribution onto a more complex one by learning a series of invertible mappings between the two. For further details on normalizing flows, readers are referred to~\cite{NF_review} for an introduction and review of the current methods, and to~\cite{Bevins2022margarine1, Bevins2023margarine2} for details on \textsc{margarine}, the \textsc{python} package used to train the normalizing flows in this work.
However, when using the output of trained flows as the new proposal, it is no longer trivial to correct the evidence exactly. Other techniques must be employed to address this issue.
\subsection{Posterior repartitioning}\label{sec:PR}
Many sampling algorithms, such as Metropolis-Hastings~\citep{metropolis1953, hastings1970} and Hamiltonian Monte Carlo~\citep{HMC1, HMC2}, are sensitive only to the product of the likelihood and prior\footnote{This is known as the `unnormalized posterior' and is in fact the joint distribution. It is this joint distribution that is used, for example, in the Metropolis acceptance ratio.}. Nested sampling on the other hand, in ``sampling from the prior, $\pi$, subject to the hard likelihood constraint, $\mathcal{L}$'', uniquely distinguishes between the two~\citep{Supernest}. Given that the evidence and posterior only depend on $\mathcal{L} \times \pi$, it follows that we are free to repartition the prior and likelihood that nested sampling sees in any way, as long as their product remains the same:
\begin{align}
& \tilde{\mathcal{L}}(\theta) \tilde{\pi}(\theta) = \mathcal{L}(\theta) \pi(\theta) \\
&\implies \tilde{\mathcal{Z}} = \int \tilde{\mathcal{L}}(\theta)\tilde{\pi}(\theta) d\theta = \int \mathcal{L}(\theta) \pi (\theta) d\theta = \mathcal{Z}; \\
&\implies\tilde{\mathcal{P}}(\theta) = \frac{\tilde{\mathcal{L}}(\theta)\tilde{\pi}(\theta)}{\tilde{\mathcal{Z}}} = \frac{\mathcal{L}(\theta) \pi (\theta)}{ \mathcal{Z}} = \mathcal{P}(\theta).
\end{align}
This concept of `posterior repartitioning' (PR) was originally introduced by~\cite{PR1, PR2} as a way to tackle problems where the prior may be unrepresentative. They pioneered a specific implementation of this called `power posterior repartitioning' (PPR), where the original prior is raised to a power $\beta$, with $\beta$ treated as a hyperparameter that is sampled over during the run. This new adaptive prior can then widen itself at runtime if the original prior was indeed unrepresentative. Although conceived for the purposes of robustness, the same fundamental ideas can be applied to speed up NS. As explained in Section~\ref{sec:runtime}, the inference time depends on the amount of compression between prior and posterior. Hence, moving portions of the likelihood into the nested sampling prior such that it is closer to the posterior means a smaller KL divergence and a faster run. Crucially, because the product of the likelihood and prior remains the same, we obtain the correct evidences directly, bypassing the need to correct them by a prior volume factor as in~\cite{anstey}. These techniques have been applied in~\cite{Supernest} to accelerate NS, although not with $\beta$-flows.
\section{Methods}\label{methods}
Putting the above pieces together, we can accelerate NS by running a low resolution pass first, training an NF on this, and then using the NF as the prior for a second, higher resolution run. We also alter the likelihood for this second run, in accordance with PR, so that
\begin{align}\label{PR_prior}
\pi^\ast & = \mathrm{NF}(\theta) \\
\mathcal{L}^\ast & = \frac{\mathcal{L}(\theta)\pi(\theta)}{\mathrm{NF}(\theta)}\label{PR_likelihood},
\end{align}
where $\mathrm{NF}(\theta)$ is the probability of $\theta$ predicted by the NF and $\mathcal{L}$ and $\pi$ represent the original likelihood and prior respectively.
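In code, this repartitioning amounts to nothing more than wrapping the original log-likelihood and log-prior. Below is a minimal sketch, in which \texttt{flow\_log\_prob} is a placeholder for whatever trained density estimator is used (an illustrative interface, not that of \textsc{margarine} or \textsc{bilby}):
\begin{verbatim}
def make_repartitioned(log_L, log_pi, flow_log_prob):
    # Return the repartitioned pair (log pi*, log L*), with
    # pi* = NF(theta) and L* = L(theta) pi(theta) / NF(theta),
    # so that L* x pi* = L x pi and the evidence is unchanged.
    def log_prior_star(theta):
        return flow_log_prob(theta)

    def log_likelihood_star(theta):
        return log_L(theta) + log_pi(theta) - flow_log_prob(theta)

    return log_prior_star, log_likelihood_star
\end{verbatim}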
We have found empirically that in many cases this method provides significant speedups compared with normal NS, with results that are in excellent agreement with the latter. Occasionally, however, the NF will learn a distribution which is narrower than the target `true' posterior. In these instances, sampling from the NF can become very inefficient and, in extreme cases, may provide biased results. This is because the peaks of the repartitioned likelihood can lie `deep' in the tails of the repartitioned prior. Even in more typical cases, the amount of acceleration provided by this method depends heavily on how well the flow has learned the posterior distribution provided by the low resolution pass of NS. For the number of dimensions that are involved in most gravitational wave problems, NFs can perform poorly at this density estimation task, especially in the tails of the distribution (see Figure~\ref{fig:NF_vs_lsbi}). This can severely limit the acceleration produced by this method for many realistic GW use cases.
\begin{figure}
\centering
\includegraphics{figures/NF_LSBI_comparison_v2.pdf}
\caption{We evaluate the performance of normalizing flows on a mixture model, comprised of five Gaussians combined with unequal weights, as the number of dimensions increases. We generate samples from the mixture model in the full 14 dimensions using the package \textsc{lsbi}~\citep{lsbi_paper, lsbi_github} and drop the required number of columns to get samples in lower dimensions. We then train a normalizing flow using \textsc{margarine} on each set of samples, and compare the true log probability with the log probability predicted by the NF (blue). The black dashed line shows where the points would sit if the two perfectly matched. We also fit a five component Gaussian mixture model to each set of samples using \textsc{lsbi} and plot the log probability predictions of this too (orange). Since this model is in theory capable of fitting the distribution exactly, it could be taken to represent an upper bound on how well the task of density estimation can be performed in practice on this example. In lower dimensions, the NF performs well, albeit with slightly more scatter compared to the \textsc{lsbi} result. By $n=10$, however, the NF exhibits a significant decline in performance compared to the \textsc{lsbi} fit, with the most severe deterioration in the tails of the distribution. By $n=14$, both fits perform poorly. The arrows indicate that there are points which lie outside the plot area. The full code to reproduce this plot, including details of how the mixture model was generated, can be found at~\protect\cite{zenodo}.}
\label{fig:NF_vs_lsbi}
\end{figure}
In this paper, we attempt to address these issues by replacing classic normalizing flows with what we christen $\beta$-flows.
\subsection{$\beta$-flows and the connection with statistical mechanics}\label{statmech}
There is an analogy to be made between the nested sampling algorithm and statistical mechanics~\citep{statmech_analogy}. In particular, the Bayesian evidence may be related to the partition function, if we consider the parameters $\theta$ to describe the microstate of a system with potential energy equal to the negative log-likelihood. The density of states may be expressed as:
\begin{equation}
g(E) = \int \delta[E - E(\theta)] \pi(\theta) d\theta,
\end{equation}
where the prior is interpreted as the distribution of all possible states. An iso-likelihood contour at $\mathcal{L}^\ast$ then corresponds to an energy limit $\epsilon = -\ln\mathcal{L}^\ast$. We can then see that the fractional prior volume, $X$, is simply the cumulative density of states, expressed as a function of energy rather than likelihood:
\begin{equation}
X(\epsilon) = \int_{E(\theta) < \epsilon} \pi(\theta)d\theta = \int_{-\infty}^\epsilon g(E)dE.
\end{equation}
The partition function at inverse canonical temperature $\beta$ may be rewritten as:
\begin{align}
Z(\beta) & = \int e^{-\beta E}g(E)\, dE = \int e^{\beta \ln\mathcal{L}(\theta)} \pi(\theta)\, d\theta \notag \\
& = \int \mathcal{L}(\theta)^\beta \pi(\theta)\, d\theta = \int \mathcal{L}(X)^\beta\, dX. \label{eq:partition}
\end{align}
This inverse temperature ranges from $\beta = 0$, corresponding to an integral over the prior, to $\beta = 1$, recovering the Bayesian evidence integral from equation~\ref{evidenceX}. Though nested sampling is not thermal, it can simulate any temperature~\citep{Skilling2006NS}, meaning the partition function may be evaluated at any $\beta$ after the run (Figure~\ref{fig:temperature}).
\begin{figure}
\centering
\includegraphics{figures/figure3_beta_v2.pdf}
\caption{Nested sampling can emulate any temperature. The posterior has an inverse temperature of $\beta=1$ and the prior has an inverse temperature of $\beta=0$. In-between temperatures represent intermediate distributions. This is illustrated first on a more straightforward case where the posterior is a Gaussian and the prior is uniform (top panel). As $\beta$ decreases from $1$ to $0$, the distribution widens. The bottom panel shows the two-dimensional $1\sigma$ contours recovered from a simulated binary black hole merger for the luminosity distance, $d_L$, and the zenith angle between the total angular momentum and the line of sight, $\theta_\mathrm{JN}$. The posterior samples are re-weighted according to equation~\ref{beta_weights} to generate the distributions at various temperatures. Between $\beta=0.1$ and $\beta=0.2$, the distribution begins to split into two modes; in the statistical mechanics analogy, this is akin to a phase transition at the critical temperature.}
\label{fig:temperature}
\end{figure}
Generating samples at any inverse temperature involves modifying the posterior weights of the dead points from equation~\ref{dead_posteriorweight} to
\begin{equation}\label{beta_weights}
p_i(\beta) = \frac{w_i \mathcal{L}_i^\beta}{\mathcal{Z}(\beta)}.
\end{equation}
$\mathcal{Z}(\beta)$ is evaluated from equation~\ref{eq:partition}. This functionality is provided by the package \textsc{anesthetic}~\citep{anesthetic}.
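Written out in plain \textsc{numpy}, rather than through the \textsc{anesthetic} interface, this reweighting is simply:
\begin{verbatim}
import numpy as np

def beta_weights(logL, logw, beta):
    # p_i(beta) = w_i L_i^beta / Z(beta), computed in log space;
    # logL and logw hold the dead points' log-likelihoods and
    # log shell volumes from the nested sampling run.
    logp = logw + beta * logL                # log(w_i L_i^beta)
    logZ_beta = np.logaddexp.reduce(logp)    # log Z(beta)
    return np.exp(logp - logZ_beta)          # weights summing to one
\end{verbatim}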
Typically, normalizing flows (NFs) are trained only on the posterior samples, drawn from the $\beta=1$ distribution. As such, any information about the posterior and underlying likelihood functions encapsulated in the $\beta < 1$ intermediate distributions is discarded. The idea of $\beta$-flows is to incorporate this additional tail information to better learn the posterior.
\subsection{Training $\beta$-flows}
The goal is to learn a target distribution $\mathcal{P}(\theta)$ conditioned on the inverse temperature $\beta$ for samples from a NS run. We use conditional normalizing flows to transform samples from the multivariate base distribution $z \sim \mathcal{N}(0,1)$ onto $\mathcal{P}(\theta | \beta)$, where $\theta$ are drawn from the low resolution nested sampling run, with weights given by equation~\ref{beta_weights}. For any bijective transformation $f_\phi$, we can calculate the probability of a set of samples given $\beta$ by
\begin{equation}
P_\phi(\theta|\beta) = \mathcal{N}(f_\phi(\theta,\beta)|\mu=0, \sigma=1) \left|\frac{df_\phi(\theta,\beta)}{d\theta}\right|.
\end{equation}
$\phi$ are the parameters of the neural network. We parameterize $f_\phi$ as a conditional masked auto-regressive (MAF) flow and train on a weighted reverse KL divergence~\citep{Bevins2023margarine2, Alsing2022anyprior}:
\begin{equation}
\mathbb{L} = - \frac{1}{\sum_i p_i(\beta)} \sum_i p_i(\beta) \log P_\phi (\theta_i|\beta).
\end{equation}
We give the network samples weighted by various sets of $p(\beta)$, where $\beta$ ranges from $0$ to $1$. The training data therefore consists of $\{\theta, p(\beta), \beta\}$, in contrast to normal NFs, where we train with $\{\theta, p(\beta=1)\}$.
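Schematically, the training set and loss can be assembled as in the sketch below; \texttt{log\_q} would be supplied by the conditional flow itself, whose architecture is not reproduced here, and the interface is hypothetical rather than that of \textsc{margarine}:
\begin{verbatim}
import numpy as np

def training_data(theta, logL, logw, betas):
    # Build the {theta, p(beta), beta} triples: the same dead points
    # are reused at every beta, only their weights p_i(beta) change.
    batches = []
    for beta in betas:
        logp = logw + beta * logL                    # log(w_i L_i^beta)
        p = np.exp(logp - np.logaddexp.reduce(logp)) # p_i(beta)
        batches.append((theta, np.full(len(theta), beta), p))
    return batches

def weighted_loss(log_q, p_beta):
    # Monte Carlo estimate of the weighted negative log-probability
    # -(1/sum_i p_i(beta)) sum_i p_i(beta) log P_phi(theta_i | beta).
    return -np.sum(p_beta * log_q) / np.sum(p_beta)
\end{verbatim}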
As $\beta$ increases from $0$ to $1$, the KL divergence between the weighted dead points and the prior increases non-linearly: the maximum occurs at $\beta=1$, but the most rapid change happens at low $\beta$. This divergence is estimated from the weighted samples as
\begin{equation}\label{DKL}
\mathcal{D}_{\textrm{KL}} = \frac{1}{\sum_i p_i(\beta)} \sum_i p_i(\beta) \log \frac{P(\theta_i | \beta)}{\pi(\theta_i)}.
\end{equation}
As such, instead of building the training data from $\beta$ values drawn uniformly from $[0,1]$, we define a $\beta$ schedule such that the change in KL divergence between subsequent sets of weighted dead points is constant. We first choose a fixed number of $\beta$ values to train on, and then calculate the exact $\beta$s between $0$ and $1$ that give equally spaced KL divergences.
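One plausible way to construct such a schedule, not necessarily identical in detail to the implementation used here, is to evaluate the KL divergence of equation~\ref{DKL} on a fine grid of $\beta$ values and then interpolate back onto equally spaced divergence targets:
\begin{verbatim}
import numpy as np

def beta_schedule(logL, logw, n_beta=20, grid_size=1000):
    # Return beta values whose weighted dead-point distributions are
    # approximately equally spaced in KL divergence from the prior.
    grid = np.linspace(0.0, 1.0, grid_size)
    log_prior_w = logw - np.logaddexp.reduce(logw)   # normalized prior weights
    dkl = np.empty(grid_size)
    for j, beta in enumerate(grid):
        logp = logw + beta * logL
        logp -= np.logaddexp.reduce(logp)            # log p_i(beta)
        dkl[j] = np.sum(np.exp(logp) * (logp - log_prior_w))
    targets = np.linspace(0.0, dkl[-1], n_beta)      # equally spaced KL targets
    return np.interp(targets, dkl, grid)             # assumes dkl is monotonic
\end{verbatim}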
% we define a $\beta$ schedule such that the change in KL divergence between subsequent sets of weighted dead points is constant. We choose a fixed number of $\beta$ values between $0$ and $1$ and calculate which values give equal KL divergence
Once a $\beta$-flow has been trained on the samples from the low resolution first pass of NS, we then use this as a proposal for the high resolution pass. The flow can emulate not only the $\beta=1$ posterior, but also the intermediate distributions at any $0 \le \beta \le 1$. We treat $\beta$ as a hyperparameter, similar to the approach in~\cite{PR2} (though $\beta$ has a different meaning here), and sample over it during the high resolution run. Therefore, if the $\beta=1$ distribution is too narrow compared to the `true' posterior, the proposal can widen itself adaptively at runtime. The repartitioned prior and likelihood functions become
\begin{align}\label{PR_prior_beta}
\pi^\ast & = P(\theta | \beta) \\
\mathcal{L}^\ast & = \frac{\mathcal{L}(\theta)\pi(\theta)}{P(\theta | \beta)}\label{PR_likelihood_beta},
\end{align}
where this time the repartitioned prior and likelihood depend on $\beta$ (though the final evidences and posteriors will not).
\section{Results and discussion}\label{results}
In the following section, we present the results of applying the methods described above to both a simulated binary black hole (BBH) signal and a real event from the third Gravitational-Wave Transient Catalogue (GWTC-3). For each analysis, we first perform a low resolution pass of NS using \textsc{bilby}~\citep{bilby_paper}, with a slightly modified version (see Appendix~\ref{sec:appendix}) of the built-in \textsc{PolyChord} sampler~\citep{Polychord1, Polychord2}. Next, we train both a standard normalizing flow using \textsc{margarine}~\citep{Bevins2022margarine1, Bevins2023margarine2} and a $\beta$-flow, with code adapted from \textsc{margarine}, on the weighted posterior samples. Each of these trained flows is then used in turn as the repartitioned prior in a second pass of NS, where the likelihood is also repartitioned according to equation~\ref{PR_likelihood}. In this second pass, we use the same number of live points as in the first pass to facilitate a direct comparison between methods. However, in typical applications, a higher resolution pass would be used at this stage. All runs employ the \texttt{IMRPhenomXPHM} waveform model~\citep{IMRPhenomXPHM} and, unless otherwise specified, the standard BBH priors implemented in \textsc{bilby}. Plots are generated using \textsc{anesthetic}~\citep{anesthetic}.
\subsection{Injections}
We first demonstrate the method on a simulated BBH merger injected into Gaussian noise. We assume a two-detector configuration and inject the signal with the \texttt{IMRPhenomXPHM} waveform model. The binary has chirp mass $\mathcal{M}=28\,M_\odot$ and mass ratio $q=0.8$. The spins are non-aligned, with an effective spin parameter $\chi_{\textrm{eff}}=0.27$, and the source is located at a luminosity distance $d_L = 2000$\,Mpc. The network matched-filter signal-to-noise ratio (SNR) is $\rho_{\textrm{mf}} = 14.8$, and we show the posterior distributions obtained for this signal from a standard nested sampling run in Figures~\ref{fig:intrinsic_simulated} and~\ref{fig:eextrinsic_simulated}.
For the first step of our method, we perform a low resolution NS run with $n_{\textrm{live}}=200$; this is a much lower number of live points than is typically used in standard 15-parameter gravitational wave analyses, but is still high enough resolution to capture the main features and modes of the posterior. We then use the weighted samples from this run to train both an NF and a $\beta$-flow. The relative performances are shown in Figure~\ref{fig:NF_betaflow_simulated}, where the predicted probabilities from the flows are compared to the posterior probabilities given by NS. Both flows exhibit a fairly large scatter about the target probabilities, typical for a 15-dimensional problem, but the $\beta$-flow performs noticeably better than the NF, particularly in the tails of the distribution.
Each flow is then used as the updated prior for a PR NS run, also with $n_{\textrm{live}}=200$, and the evidences and posteriors obtained from this run are compared to those from standard NS analyses with the same number of live points. Figure~\ref{fig:logZ_simulated} shows the log evidence distributions obtained from each PR run and from the original low resolution pass of NS. The results are in excellent agreement, with the error bars on $\log\mathcal{Z}$ being tighter for both the PR runs compared to normal NS, despite using the same number of live points, as predicted by equation~\ref{uncertainty}. We also compare the posteriors obtained from each method, which are plotted in Figures~\ref{fig:intrinsic_simulated} and~\ref{fig:eextrinsic_simulated} and again show good agreement between the methods.
Table~\ref{tab:simulated_speedup} outlines the relative acceleration provided by each flow compared to normal NS. For a fixed uncertainty in $\log\mathcal{Z}$, given that $n_\textrm{live} \propto D_\mathrm{KL}$, we may rewrite equation~\ref{runtime} as
\begin{equation}
T \propto T_\mathcal{L} \times f_\textrm{sampler} \times \mathcal{D}_{\textrm{KL}}^2.
\end{equation}
Then, the precision-normalized acceleration of the PR run may be approximated as
\begin{equation}
\frac{T^{\textrm{normal NS}}}{T^{\textrm{PR NS}}} = \left( \frac{\mathcal{D}_{\mathrm{KL}}^{\textrm{normal NS}}}{\mathcal{D}_\mathrm{KL}^{\textrm{PR NS}}} \right)^2.
\end{equation}
Using PR in conjunction with a trained $\beta$-flow led to almost an order of magnitude improvement in the runtime (see Figure~\ref{fig:truncation}). In this instance, the NF performs similarly well to the $\beta$-flow, indicating that the NF has learned a wide enough distribution to avoid sampling inefficiencies in the PR run.
It is important to note at this stage that the quoted speedup factors are calculated purely based on the number of iterations that would be required for a precision-normalized PR run. They do not take into account the changes to $T_\mathcal{L}$, the time for a single likelihood evaluation, from including the flows in the likelihood. The $\beta$-flow took longer to evaluate than the NF we used. This also means that for analyses using a waveform model like \texttt{IMRPhenomXPHM}, $T_\mathcal{L}$ increases by such a factor that we do not recommend using $\beta$-flows in their current form in these cases. This point is addressed further in the conclusions, including a discussion of future work to speed up the evaluation of our $\beta$-flows, but for now, we intend for the methods presented in this paper to be used in analyses where the evaluation of the gravitational wave likelihood is of comparable cost to the evaluation of the $\beta$-flow. We also note that, strictly speaking, the speedup factors should include the time it takes to perform the original low resolution NS run, but in the typical case where the second pass of NS uses a much larger number of live points, this cost will not contribute significantly to the overall runtime.
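For concreteness, the precision-normalized speedups quoted in Table~\ref{tab:simulated_speedup} follow from a short calculation, sketched below as a back-of-the-envelope version of the procedure described in the table caption:
\begin{verbatim}
def precision_normalised_speedup(n_iter_ns, sigma_ns,
                                 n_iter_pr, sigma_pr):
    # Scale the PR run to the log-Z uncertainty of the normal NS run
    # (sigma^2 is proportional to D_KL / n_live, and N_iter scales
    # with n_live), then compare iteration counts.
    n_iter_pr_matched = n_iter_pr * (sigma_pr / sigma_ns) ** 2
    return n_iter_ns / n_iter_pr_matched

# Simulated event, PR NS with a normalizing flow:
# 8186 iterations at sigma = 0.352 versus 4663 at sigma = 0.177
print(precision_normalised_speedup(8186, 0.352, 4663, 0.177))  # ~7
\end{verbatim}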
\begin{figure}
\centering
\includegraphics{figures/intrinsic_parameters_simulated.pdf}
\caption{The posteriors obtained on some intrinsic parameters (chirp mass $\mathcal{M}$, mass ratio $q$ and effective spin parameter $\chi_\mathrm{eff}$) from standard NS are compared to those obtained using PR with normalizing flows or $\beta$-flows. The results are consistent, showing both the PR methods have managed to recover the same answers as normal NS.}
\label{fig:intrinsic_simulated}
\end{figure}
\begin{figure}
\centering
\includegraphics{figures/extrinsic_parameters_simulated.pdf}
\caption{Similarly to Figure~\ref{fig:intrinsic_simulated}, the posteriors on the extrinsic parameters, the luminosity distance and inclination, from the two methods are compared. Again, the results are comparable, with the PR NS methods able to achieve this with far fewer likelihood evaluations. The $\beta$-flow method gives less posterior weight in the second mode and more posterior weight in the first mode than the normal NS run, but this could occur from two separate normal NS runs too, due to the stochasticity of NS~\citep{Adam, Polychord1}. This stochasticity is quantified by the $\log\mathcal{Z}$ error bars that \textsc{PolyChord} outputs for individual clusters.}
\label{fig:eextrinsic_simulated}
\end{figure}
\begin{figure}
\centering
\includegraphics{figures/NF_vs_betaflow_simulated_v2.pdf}
\caption{We compare how well both the typical normalizing flow (NF) and the $\beta$-flow (evaluated at $\beta=1$) have learned the rough posterior from the low resolution pass of NS. If the flows have learned the posterior perfectly, the points should lie on the black dashed line. The arrow indicates that there are points which lie below the axes. The $\beta$-flow predictions display much less scatter about this line, showing that the extra tail information from the NS temperature has indeed enabled the flow to learn the posterior better. Although the scatter on the NF seems large, this is an empirically typical performance on a 15-dimensional problem.}
\label{fig:NF_betaflow_simulated}
\end{figure}
\begin{figure}
\centering
\includegraphics{figures/logZ_simulated.pdf}
\caption{The $\log\mathcal{Z}$ estimates obtained from normal NS, posterior repartitioned NS with a normalizing flow, and posterior-repartitioned NS with a $\beta$-flow are compared. All of the runs are performed with $n_{\textrm{live}}=200$ for easier comparison. The estimates are all consistent with each other, but both the PR runs have smaller error bars, as expected. }
\label{fig:logZ_simulated}
\end{figure}
\begin{figure}
\centering
\includegraphics{figures/logZ_convergence_simulated.pdf}
\caption{During a normal NS run, the evidence is accumulated as the live points compress towards the peak of the likelihood. The total evidence estimate for normal NS (black) becomes stable late into the run, only after the live points occupy a very small fraction of the prior volume. Because the updated prior for the posterior repartitioned runs is roughly the posterior from the low resolution pass of NS, most of the evidence has already been accumulated very early on in the run. We keep running until the total evidence estimates for the accelerated runs (blue and red) have stabilized. This happens much earlier than for normal NS, and the live points typically still occupy a significant fraction of the prior volume.}
\label{fig:truncation}
\end{figure}
\begin{table}
\centering
\begin{tabular}{|c|c|c|c|c|}
\hline
type & $n_\textrm{live}$ & $N_\textrm{iter}$ & $\sigma_{\log\mathcal{Z}}$ & speedup \\
\hline
normal NS & 200 & 8186 & 0.352 & - \\
PR NS w/ NF & 200 & 4663 & 0.177 & $\times 7$\\
PR NS w/ $\beta$-flow& 200 & 3252 & 0.179 & $\times 9$ \\
\hline
\end{tabular}
\caption{For the simulated event, results of the runs comparing normal NS to posterior-repartitioned NS (PR NS) are shown. $N_\textrm{iter}$ is the total number of iterations, $i$, of the algorithm that were performed, and is proportional to the number of likelihood evaluations. Both the run using a typical normalizing flow and the run using a $\beta$-flow finish significantly sooner than normal NS. The final column shows the \textbf{precision-normalized} speedup, calculated by using equation~\ref{uncertainty} to work out how many live points we would need to run with in order to match the $\log\mathcal{Z}$ uncertainty of the normal NS run, and then scaling $N_{\textrm{iter}}$ proportionally.}
\label{tab:simulated_speedup}
\end{table}
\subsection{Real Data}
We demonstrate the above methods on the real event, GW191222\_033537 (henceforth GW191222) from GWTC-3. As before, we perform a low resolution pass of NS on which we train both flows. This time, however, we use $350$ live points. The posterior for this event is more complex and has more multi-modality than the simulated example above, so we give the flows more samples to train on in order to give them a better chance of learning these features accurately.
As shown in Figure~\ref{fig:GW191222_NF_v_betaflow}, once again the $\beta$-flow is able to learn the rough posterior from the NS run more accurately, and is better at predicting deep tail probabilities than the NF. However, both flows exhibit a wider spread than before at the highest log probability values, and there is a `tail' of under-predictions for certain samples from the peak of the posterior. This is indicative of the fact that the full multi-modality of the NS posterior has not been captured by either flow, though the NF does perform significantly worse. This is key to understanding the final results.
\begin{figure}
\centering
\includegraphics{figures/NF_vs_betaflow_GW191222_v2.pdf}
\caption{The $\beta$-flow once again performs better at predicting the log probability given by NS. This time, both flows have a larger spread at higher log probabilities and a `tail' of points below the black dashed line. Again, the arrow indicates that there are points which lie below the axes. The NF heavily under-predicts the posterior probability of certain samples, which is indicative of the fact that it has failed to capture the multi-modality of the rough posterior. }
\label{fig:GW191222_NF_v_betaflow}
\end{figure}
To properly verify whether we have recovered the correct posteriors for this real event, we compare our posteriors from the accelerated methods to those from a higher resolution ($n_\mathrm{live} = 2000$) standard NS run. Since the NF does not learn the multi-modality of the posterior well enough, it sets the proposal for the PR run such that certain modes are only included in the prior with very low probabilities. This leads to a biasing of the final posteriors, shown in Figures~\ref{fig:GW191222_intrinsic} and~\ref{fig:GW191222_extrinsic}. The $\beta$-flow also doesn't fully learn the multi-modality of the posterior, but since it acts as an adaptive prior at runtime, able to draw samples from the distribution at any inverse temperature, it does not completely cut off important regions of the parameter space in the same way the NF does. Looking at the posteriors in Figure~\ref{fig:GW191222_extrinsic}, we can indeed see that the $\beta=1$ distribution was too narrow and excluded regions of the parameter space with non-negligible posterior weight. Otherwise, we would expect to see a roughly uniform posterior on $\beta$, but instead we see that $\beta=1$ has a low posterior probability.
The evidence calculated by PR NS using the NF also reflects this bias (Figure~\ref{fig:GW191222_logZ}). The results are incompatible with those from normal NS, which is another sign that regions of the parameter space with significant posterior weight were missed because the updated prior was too narrow. Once again, because the $\beta$-flow can emulate any temperature, it is robust to these issues and gives results as reliable as those for the simulated example, despite a poorer performance at the posterior density estimation.
Since the $\beta$-flow did not learn the posterior at $\beta=1$ as well as for the simulated case, the speedup given by using this flow as the updated prior was not as large (Figure~\ref{fig:GW191222_truncation}). The exact acceleration provided by PR NS is very sensitive to the accuracy of the density estimation. However, the precision-normalized runtime was still twice as fast as for normal NS and, importantly, we demonstrate the robustness of this method in giving reliable evidences and posteriors, even when the density estimation is relatively poor quality. The worst case scenario of using PR NS with $\beta$-flows is that we get correct evidences and posteriors which take the same amount of time as normal NS (since for a very poor $\beta$-flow we would sample preferentially from the $\beta=0$ distribution, which is the original NS prior). The same cannot be said for PR NS with NFs, however, and the results in this section give an example where this method breaks down completely. For this reason, we recommend using $\beta$-flows in place of NFs when implementing posterior repartitioning.
\begin{figure}
\centering
\includegraphics{figures/intrinsic_parameters_GW191222.pdf}
\caption{Unlike for the previous simulated signal, using the trained NF as the proposal for PR NS has led to biased results. The NF learned the posterior from the low resolution run poorly, and without the ability to widen itself at runtime, this has produced incorrect posteriors and evidence. The $\beta$-flow is robust to this issue as the proposal is over all values of $\beta$. This means that even if the learned flow is too narrow or has not learned the multi-modality sufficiently well, it can still adapt the proposal at runtime and, in the worst case scenario, samples will simply be drawn from the original prior ($\beta$=0). The posterior on $\beta$ can tell us how poor a proposal the $\beta=1$ learned distribution was.}
\label{fig:GW191222_intrinsic}
\end{figure}
\begin{figure}
\centering
\includegraphics{figures/extrinsic_parameters_GW191222.pdf}
\caption{The multi-modality in the extrinsic parameters has caused a biasing effect for PR NS with the NF, since the NF did not learn all modes properly. The posterior on $\beta$, the inverse temperature, for the $\beta$-flow run is also included. If the flow learned the rough posterior well, we would expect to see a uniform posterior on $\beta$. The low posterior probability at $\beta=1$ indicates that the $\beta$-flow had to widen itself at runtime due to the $\beta=1$ distribution being unsuitable as a prior.}
\label{fig:GW191222_extrinsic}
\end{figure}
\begin{figure}
\centering
\includegraphics{figures/logZ_GW191222.pdf}
\caption{The NF learned the rough posterior from the low resolution run poorly, insufficiently capturing its multi-modality. This has led to a biasing of the final evidences and posteriors, since the proposal from the NF cannot widen itself like the $\beta$-flow can. The $\beta$-flow not only learned the distribution from the first pass of NS better, but also enabled an adaptive proposal at runtime, ensuring robustness against such biases. }
\label{fig:GW191222_logZ}
\end{figure}
\begin{table}
\centering
\begin{tabular}{|c|c|c|c|c|}
\hline
type & $n_\textrm{live}$ & $N_\textrm{iter}$ & $\sigma_{\log\mathcal{Z}}$ & speedup\\
\hline
normal NS & 350 & 10445 & 0.211 & - \\
PR NS w/ $\beta$-flow& 350 & 7995 & 0.171 & $\times 2$ \\
\hline
\end{tabular}
\caption{Normal NS is compared to the PR NS method for real event GW191222. PR NS with the $\beta$-flow is twice as fast as normal NS for a precision-normalized run. This is a smaller speedup than for the simulated example, and this is driven by the fact that the $\beta$-flow was not able to learn the rough posterior from pass 1 as accurately. PR NS with the NF is not shown here; although it was also quicker than normal NS, it gave incorrect posteriors and evidences due to the biased proposal.}
\label{tab:GW191222_speedup}
\end{table}
\begin{figure}
\centering
\includegraphics{figures/logZ_convergence_GW191222.pdf}
\caption{PR NS with the $\beta$-flow terminates before normal NS with the same number of live points. The precision-normalized speedup is less than for the simulated example, but is still a factor of two faster. We don't show the equivalent line for the NF because it failed to correctly recover the evidence and posterior.}
\label{fig:GW191222_truncation}
\end{figure}
\section{Conclusions}\label{conclusions}
In this paper, we outline how posterior repartitioning using normalizing flows can accelerate nested sampling. While we demonstrate these methods with \textsc{PolyChord}, this is a general acceleration technique applicable to a variety of nested sampling algorithms, and does not inherently rely on machine learning to be effective. Bringing together previous work~\citep{PR1, PR2, Supernest, Bevins2022margarine1, Bevins2023margarine2, Alsing2022anyprior}, we demonstrate this method on realistic gravitational wave examples. However, there are a few drawbacks of using traditional normalizing flow architectures in posterior repartitioned nested sampling. Firstly, the amount of acceleration provided by PR NS is highly dependent on the success of the flow in learning the posterior distribution provided by the low resolution nested sampling run. In particular, the more successful the flow is at learning the deep tail probabilities, the sooner we can terminate the high resolution PR run. However, we empirically show that the accuracy of commonly used NF architectures is often poor in the tails of the target distribution, especially as the dimensionality increases. Furthermore, if the distribution learned by the flow is too narrow compared to the true posterior, this can lead to sampling inefficiencies, making the problem harder, and in the worst case scenario can give biased results. We show a real GW case where this occurs.
In order to mitigate these issues, we introduce $\beta$-flows, which are conditional normalizing flows trained on nested samples and conditioned on inverse temperature, $\beta$.
$\beta$-flows are shown to be better at predicting deep tail probabilities than traditional normalizing flows, as they have access to intermediate distributions between prior and posterior during training, as opposed to just the posterior samples. Additionally, $\beta$-flows can emulate not just the target posterior distribution itself, which corresponds to $\beta=1$, but also any of these intermediate distributions. At runtime, we sample over different values of $\beta$, meaning that if the $\beta=1$ distribution learned by the flow is indeed too narrow, the repartitioned prior can adaptively widen itself at runtime to mitigate sampling inefficiencies and biases. For the same case on which normal normalizing flows fail, we show that replacing normalizing flows with $\beta$-flows fixes the problem and is thus robust against potential pitfalls.
One current disadvantage of $\beta$-flows is that, because the flow has to store and evaluate more weights and biases, they take significantly longer to evaluate than more typical normalizing flows. For evaluating the probability of a single sample, they take about $100$\,ms, $100$ times slower than the NF trained using \textsc{margarine}. This limitation could be ameliorated in a few ways. Firstly, the $\beta$-flow could be implemented in \textsc{jax}, which could significantly reduce this cost. Moreover, NFs and $\beta$-flows are designed to evaluate batches of samples at once, and so this cost does not scale linearly with the number of samples. For a set of $10,000$ samples, the $\beta$-flows only take twice as long to evaluate them as for a single sample, and only take $4$ times as long as a flow using \textsc{margarine}. Therefore, if we could implement PR within a nested sampling algorithm which can properly make use of this property of normalizing flows, the cost to evaluate the $\beta$-flow would become negligible. Both of these are promising avenues for future work on this topic, and would make the methods presented in this paper suitable for a wider range of likelihoods. In their current form, they can still be worth implementing in cases where the likelihood itself is of comparable computational cost to the flows.
Currently, the method requires nested samples from the exact likelihood we want to use in our final analysis in order to train the flows. Future work could involve adapting the methodology to enable the $\beta$-flow to learn an approximate distribution, perhaps from a cheaper waveform model, and then use this as a proposal for the high-resolution run. This has synergies with likelihood reweighting~\citep{likelihood_reweighting} and tempered importance sampling~\citep{tempered_likelihood_reweighting}. $\beta$-flows also have a connection with continuous normalizing flows (CNFs) and diffusion models, where there is a natural user-tunable parameter akin to $\beta$~\citep{CNF}. Future work could explore this link, as well as the use of CNFs in conjunction with posterior repartitioning.
\section*{Acknowledgements}
MP was supported by the Harding Distinguished Postgraduate Scholars Programme (HDPSP). WH was supported by a Royal Society University Research Fellowship. HTJB acknowledges support from the Kavli Institute for Cosmology, Cambridge, the Kavli Foundation and of St Edmunds College, Cambridge.
This work was performed using the Cambridge Service for Data Driven Discovery (CSD3), part of which is operated by the University of Cambridge Research Computing on behalf of the STFC DiRAC HPC Facility (www.dirac.ac.uk). The DiRAC component of CSD3 was funded by BEIS capital funding via STFC capital grants ST/P002307/1 and ST/R002452/1 and STFC operations grant ST/R00689X/1. DiRAC is part of the National e-Infrastructure.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section*{Data Availability}
All the data used in this analysis, including the relevant nested sampling dataframes, can be obtained from~\cite{zenodo}. We include a notebook with all the code to reproduce the plots in this paper. We also include an example~\textsc{python} file to show how to implement posterior repartitioning in~\textsc{bilby}, with instructions on how to modify the~\textsc{bilby} source code. The code we used for training the $\beta$-flows in this paper is publicly available and can be found at~\cite{betaflows_github}. The modified version of \textsc{PolyChord} used to perform these analyses will also be publicly released.
%%%%%%%%%%%%%%%%%%%% REFERENCES %%%%%%%%%%%%%%%%%%
% The best way to enter references is to use BibTeX:
\bibliographystyle{mnras}
\bibliography{example} % if your bibtex file is called example.bib
\appendix
\section{Termination conditions for NS}\label{sec:appendix}
Posterior repartitioned NS has slightly different properties to normal NS. This means that the usual termination condition that is used for the latter is too cautious for the former. Nested sampling compresses live points exponentially towards the peak of the likelihood function. As they close in on the peak, the likelihood values begin to saturate ($\mathcal{L}_i \rightarrow \mathcal{L}_{\textrm{peak}}$) and the fractional volumes become very small ($X_i \rightarrow 0$)~\citep{Keeton_2011}. As such, beyond a certain point there are diminishing returns for performing further iterations of the algorithm.
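% [Editor's note -- standard result included as a reminder, not part of the original source:]
% with $n_\mathrm{live}$ live points, the expected log fractional prior volume after $k$ iterations is
\begin{equation}
    \langle \ln X_k \rangle = -\frac{k}{n_\mathrm{live}}, \qquad X_k \approx e^{-k/n_\mathrm{live}},
\end{equation}
% which is the exponential compression described above~\citep{Skilling2006NS}.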
\begin{figure}
\centering
\includegraphics{figures/live_vs_accum.pdf}
\caption{As described in~\protect\cite{Keeton_2011}, the total evidence estimate varies throughout a typical NS run. In typical NS (top panel), the accumulated evidence before we reach the bulk of the posterior is very low, due to small likelihood values. When the live points enter the posterior bulk, this accumulated evidence steadily increases, until the likelihoods saturate and the fractional prior volume changes become negligible. The estimate of the evidence remaining in the live points is much more unstable, and is initially dominated by a single live point with the highest weight, $w_i \mathcal{L}_i$. It spikes and falls rapidly as a new live point is found which temporarily dominates the live evidence, and hence the total evidence estimate also changes. This total evidence estimate usually only becomes stable once the fractional evidence remaining in the live points is small, making this a robust proxy for the stopping criterion in normal NS. When doing posterior repartitioning, however, the total evidence estimate may stabilize before the live evidence fraction has fallen by the required amount (bottom panel). In these cases, the algorithm may continue for many more iterations without any additional benefit. Here, the usual termination condition is too cautious and should be framed directly in terms of the total evidence estimate instead.}
\label{fig:Keeton}
\end{figure}
At each iteration $k$, the estimated total evidence is the sum of the accumulated evidence and the estimated evidence remaining in the live points.
\begin{equation}
\mathcal{Z}_\mathrm{tot} = \mathcal{Z}_\mathrm{dead} + \mathcal{Z}_\mathrm{live} \approx \sum_{i=1}^{k} \mathcal{L}_i(X_{i-1} - X_i) + \bar{\mathcal{L}}_\textrm{live} X_k.
\end{equation}
$\bar{\mathcal{L}}_\textrm{live}$ represents the average likelihood of the live points at iteration $k$, and $X_k$ is the remaining fractional volume.
Figure~\ref{fig:Keeton} shows the evolution of each of these terms as a function of the iteration number. Initially, since the deleted points have not yet reached the bulk of the posterior, the total accumulated evidence is very small due to low likelihoods. Once the bulk of the posterior is reached, the accumulated evidence builds up rapidly as the likelihood increases, until the likelihood flattens out near the peak and the fractional volume changes become negligible. At this point, the accumulated evidence saturates.
The estimated live evidence is very unstable to begin with. It is usually dominated by a single live point which lies in the posterior, and rises sharply when a new live point is found which temporarily becomes the main contributor, falling again as the fractional prior volume decreases. Once the live points are completely contained within the bulk of the posterior, the estimated live evidence begins to fall smoothly, unless previously missed modes are found. The total evidence is also unstable at the beginning, dominated by the live evidence, but starts to become stable once we enter the posterior bulk. Ideally, we would terminate our run once this estimated total evidence has become completely stable and does not change significantly as we perform further iterations of the algorithm.
In most cases, a proxy for this is to stop when the estimated live evidence is some very small fraction of the total accumulated evidence, and this is the default termination condition in many popular NS implementations~\citep{Ashton2022NSReview}. In the specific case of posterior repartitioning, however, this is perhaps too cautious a stopping criterion. In the extreme case where our trained flow has perfectly learned the posterior distribution, we could terminate our high-resolution PR run almost immediately: although performing further iterations of the algorithm would increase the accumulated evidence and decrease the live evidence, it would make no difference to the total evidence estimate. Even in the case where the flow has imperfectly learned the posterior, much of the discrepancy is likely to be in the tails of the distribution. As such, the total evidence estimate would still likely stabilize well before the live evidence fraction falls below the usual threshold (see e.g. Figure~\ref{fig:truncation}). As a result, in the above analyses we modified \textsc{PolyChord} to set the termination condition for the run in terms of the estimated total evidence directly, instead of the live evidence fraction. For normal NS, this resulted in a very similar end point to the default condition for all the examples we ran.
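% [Editor's note -- illustrative sketch; the modified criterion is described above only in words,
% so the exact form below is an assumption:] the default proxy terminates when
\begin{equation}
    \frac{\mathcal{Z}_\mathrm{live}}{\mathcal{Z}_\mathrm{dead}} < \epsilon,
\end{equation}
% whereas a condition framed directly in terms of the total evidence estimate might instead
% terminate once that estimate has stabilised, e.g.
\begin{equation}
    \left| \ln \mathcal{Z}_\mathrm{tot}^{(k)} - \ln \mathcal{Z}_\mathrm{tot}^{(k-m)} \right| < \epsilon'
\end{equation}
% for some window of $m$ iterations and tolerance $\epsilon'$.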
% Alternatively you could enter them by hand, like this:
% This method is tedious and prone to error if you have lots of references
%\begin{thebibliography}{99}
%\bibitem[\protect\citeauthoryear{Author}{2012}]{Author2012}
%Author A.~N., 2013, Journal of Improbable Astronomy, 1, 1
%\bibitem[\protect\citeauthoryear{Others}{2013}]{Others2013}
%Others S., 2012, Journal of Interesting Stuff, 17, 198
%\end{thebibliography}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%% APPENDICES %%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Don't change these lines
\bsp % typesetting comment
\label{lastpage}
\end{document}
% End of mnras_template.tex
```
4. **Bibliographic Information:**
```bbl
\begin{thebibliography}{}
\makeatletter
\relax
\def\mn@urlcharsother{\let\do\@makeother \do\$\do\&\do\#\do\^\do\_\do\%\do\~}
\def\mn@doi{\begingroup\mn@urlcharsother \@ifnextchar [ {\mn@doi@} {\mn@doi@[]}}
\def\mn@doi@[#1]#2{\def\@tempa{#1}\ifx\@tempa\@empty \href {http://dx.doi.org/#2} {doi:#2}\else \href {http://dx.doi.org/#2} {#1}\fi \endgroup}
\def\mn@eprint#1#2{\mn@eprint@#1:#2::\@nil}
\def\mn@eprint@arXiv#1{\href {http://arxiv.org/abs/#1} {{\tt arXiv:#1}}}
\def\mn@eprint@dblp#1{\href {http://dblp.uni-trier.de/rec/bibtex/#1.xml} {dblp:#1}}
\def\mn@eprint@#1:#2:#3:#4\@nil{\def\@tempa {#1}\def\@tempb {#2}\def\@tempc {#3}\ifx \@tempc \@empty \let \@tempc \@tempb \let \@tempb \@tempa \fi \ifx \@tempb \@empty \def\@tempb {arXiv}\fi \@ifundefined {mn@eprint@\@tempb}{\@tempb:\@tempc}{\expandafter \expandafter \csname mn@eprint@\@tempb\endcsname \expandafter{\@tempc}}}
\bibitem[\protect\citeauthoryear{Albert}{Albert}{2020}]{NS12}
Albert J.~G., 2020, JAXNS: a high-performance nested sampling package based on JAX (\mn@eprint {arXiv} {2012.15286}), \url {https://arxiv.org/abs/2012.15286}
\bibitem[\protect\citeauthoryear{{Alsing} \& {Handley}}{{Alsing} \& {Handley}}{2021}]{Alsing2022anyprior}
{Alsing} J., {Handley} W., 2021, \mn@doi [\mnras] {10.1093/mnrasl/slab057}, \href {https://ui.adsabs.harvard.edu/abs/2021MNRAS.505L..95A} {505, L95}
\bibitem[\protect\citeauthoryear{Anstey, de Lera~Acedo \& Handley}{Anstey et~al.}{2021}]{anstey}
Anstey D., de Lera~Acedo E., Handley W., 2021, \mn@doi [Monthly Notices of the Royal Astronomical Society] {10.1093/mnras/stab1765}, 506, 2041
\bibitem[\protect\citeauthoryear{Ashton et~al.}{Ashton et~al.}{2019}]{bilby_paper}
Ashton G., et~al., 2019, \mn@doi [Astrophys. J. Suppl.] {10.3847/1538-4365/ab06fc}, 241, 27
\bibitem[\protect\citeauthoryear{{Ashton} et~al.,}{{Ashton} et~al.}{2022}]{Ashton2022NSReview}
{Ashton} G., et~al., 2022, \mn@doi [Nature Reviews Methods Primers] {10.1038/s43586-022-00121-x}, \href {https://ui.adsabs.harvard.edu/abs/2022NRvMP...2...39A} {2, 39}
\bibitem[\protect\citeauthoryear{Baldock, Bernstein, Salerno, Pártay \& Csányi}{Baldock et~al.}{2017}]{NS2}
Baldock R. J.~N., Bernstein N., Salerno K.~M., Pártay L.~B., Csányi G., 2017, \mn@doi [Physical Review E] {10.1103/physreve.96.043311}, 96
\bibitem[\protect\citeauthoryear{Barbary}{Barbary}{}]{NS6}
Barbary K., nestle: Pure Python, MIT-licensed implementation of nested sampling algorithms for evaluating Bayesian evidence, \url {https://github.com/kbarbary/nestle.git}
\bibitem[\protect\citeauthoryear{{Bevins}, {Handley}, {Lemos}, {Sims}, {de Lera Acedo} \& {Fialkov}}{{Bevins} et~al.}{2022}]{Bevins2022margarine1}
{Bevins} H., {Handley} W., {Lemos} P., {Sims} P., {de Lera Acedo} E., {Fialkov} A., 2022, \mn@doi [arXiv e-prints] {10.48550/arXiv.2207.11457}, \href {https://ui.adsabs.harvard.edu/abs/2022arXiv220711457B} {p. arXiv:2207.11457}
\bibitem[\protect\citeauthoryear{{Bevins}, {Handley}, {Lemos}, {Sims}, {de Lera Acedo}, {Fialkov} \& {Alsing}}{{Bevins} et~al.}{2023}]{Bevins2023margarine2}
{Bevins} H. T.~J., {Handley} W.~J., {Lemos} P., {Sims} P.~H., {de Lera Acedo} E., {Fialkov} A., {Alsing} J., 2023, \mn@doi [MNRAS] {10.1093/mnras/stad2997}, \href {https://ui.adsabs.harvard.edu/abs/2023MNRAS.526.4613B} {526, 4613}
\bibitem[\protect\citeauthoryear{Bevins et~al.}{Bevins et~al.}{2024}]{betaflows_github}
Bevins H., et~al., 2024, beta-flows, \url {https://github.com/htjb/beta-flows.git}
\bibitem[\protect\citeauthoryear{Brewer, Pártay \& Csányi}{Brewer et~al.}{2010}]{NS3}
Brewer B.~J., Pártay L.~B., Csányi G., 2010, Diffusive Nested Sampling (\mn@eprint {arXiv} {0912.2380}), \url {https://arxiv.org/abs/0912.2380}
\bibitem[\protect\citeauthoryear{Buchner}{Buchner}{2021}]{NS_ultranest}
Buchner J., 2021, UltraNest -- a robust, general purpose Bayesian inference engine (\mn@eprint {arXiv} {2101.09604}), \url {https://arxiv.org/abs/2101.09604}
\bibitem[\protect\citeauthoryear{Chen, Hobson, Das \& Gelderblom}{Chen et~al.}{2018}]{PR1}
Chen X., Hobson M., Das S., Gelderblom P., 2018, Improving the efficiency and robustness of nested sampling using posterior repartitioning (\mn@eprint {arXiv} {1803.06387}), \url {https://arxiv.org/abs/1803.06387}
\bibitem[\protect\citeauthoryear{Chen, Feroz \& Hobson}{Chen et~al.}{2022}]{PR2}
Chen X., Feroz F., Hobson M., 2022, Bayesian posterior repartitioning for nested sampling (\mn@eprint {arXiv} {1908.04655}), \url {https://arxiv.org/abs/1908.04655}
\bibitem[\protect\citeauthoryear{Corsaro \& Ridder}{Corsaro \& Ridder}{2015}]{NS5}
Corsaro E., Ridder J.~D., 2015, \mn@doi [EPJ Web of Conferences] {10.1051/epjconf/201510106019}, 101, 06019
\bibitem[\protect\citeauthoryear{Dax, Green, Gair, Macke, Buonanno \& Sch\"olkopf}{Dax et~al.}{2021}]{DINGO}
Dax M., Green S.~R., Gair J., Macke J.~H., Buonanno A., Sch\"olkopf B., 2021, \mn@doi [Phys. Rev. Lett.] {10.1103/PhysRevLett.127.241103}, 127, 241103
\bibitem[\protect\citeauthoryear{Duane, Kennedy, Pendleton \& Roweth}{Duane et~al.}{1987}]{HMC1}
Duane S., Kennedy A.~D., Pendleton B.~J., Roweth D., 1987, Physics letters B, 195, 216
\bibitem[\protect\citeauthoryear{{Fan}, {Nott} \& {Sisson}}{{Fan} et~al.}{2012}]{Fan2012NLE}
{Fan} Y., {Nott} D.~J., {Sisson} S.~A., 2012, \mn@doi [arXiv e-prints] {10.48550/arXiv.1212.1479}, \href {https://ui.adsabs.harvard.edu/abs/2012arXiv1212.1479F} {p. arXiv:1212.1479}
\bibitem[\protect\citeauthoryear{Feroz \& Hobson}{Feroz \& Hobson}{2008}]{NS_multinest}
Feroz F., Hobson M.~P., 2008, \mn@doi [Monthly Notices of the Royal Astronomical Society] {10.1111/j.1365-2966.2007.12353.x}, 384, 449–463
\bibitem[\protect\citeauthoryear{Feroz, Hobson \& Bridges}{Feroz et~al.}{2009}]{NS_multinest2}
Feroz F., Hobson M.~P., Bridges M., 2009, \mn@doi [Monthly Notices of the Royal Astronomical Society] {10.1111/j.1365-2966.2009.14548.x}, 398, 1601–1614
\bibitem[\protect\citeauthoryear{Field et~al.}{Field et~al.}{2023}]{ROQ}
Field S.~E., et~al., 2023, \mn@doi [Physical Review D] {10.1103/PhysRevD.108.123025}, 108, 123025
\bibitem[\protect\citeauthoryear{{Gessey-Jones}, {Pochinda}, {Bevins}, {Fialkov}, {Handley}, {de Lera Acedo}, {Singh} \& {Barkana}}{{Gessey-Jones} et~al.}{2024}]{Gessey-Jones2024Constraints}
{Gessey-Jones} T., {Pochinda} S., {Bevins} H.~T.~J., {Fialkov} A., {Handley} W.~J., {de Lera Acedo} E., {Singh} S., {Barkana} R., 2024, \mn@doi [MNRAS] {10.1093/mnras/stae512}, \href {https://ui.adsabs.harvard.edu/abs/2024MNRAS.529..519G} {529, 519}
\bibitem[\protect\citeauthoryear{Habeck}{Habeck}{2015}]{statmech_analogy}
Habeck M., 2015. pp 121--129, \mn@doi{10.1063/1.4905971}
\bibitem[\protect\citeauthoryear{Handley}{Handley}{2019}]{anesthetic}
Handley W., 2019, \mn@doi [Journal of Open Source Software] {10.21105/joss.01414}, 4, 1414
\bibitem[\protect\citeauthoryear{Handley, Hobson \& Lasenby}{Handley et~al.}{2015a}]{Polychord2}
Handley W.~J., Hobson M.~P., Lasenby A.~N., 2015a, \mn@doi [Monthly Notices of the Royal Astronomical Society: Letters] {10.1093/mnrasl/slv047}, 450, L61–L65
\bibitem[\protect\citeauthoryear{Handley, Hobson \& Lasenby}{Handley et~al.}{2015b}]{Polychord1}
Handley W.~J., Hobson M.~P., Lasenby A.~N., 2015b, \mn@doi [Monthly Notices of the Royal Astronomical Society] {10.1093/mnras/stv1911}, 453, 4385–4399
\bibitem[\protect\citeauthoryear{Handley et~al.}{Handley et~al.}{2023b}]{lsbi_github}
Handley W., et~al., 2023b, lsbi, \url {https://github.com/handley-lab/lsbi.git}
\bibitem[\protect\citeauthoryear{Handley et~al.}{Handley et~al.}{2023a}]{lsbi_paper}
Handley W., et~al., 2023a, In preparation
\bibitem[\protect\citeauthoryear{Hastings}{Hastings}{1970}]{hastings1970}
Hastings W.~K., 1970, \mn@doi [Biometrika] {10.1093/biomet/57.1.97}, 57, 97
\bibitem[\protect\citeauthoryear{Higson}{Higson}{2018}]{NS_dypolychord}
Higson E., 2018, \mn@doi [Journal of Open Source Software] {10.21105/joss.00965}, 3, 965
\bibitem[\protect\citeauthoryear{Higson, Handley, Hobson \& Lasenby}{Higson et~al.}{2018}]{DynamicNS}
Higson E., Handley W., Hobson M., Lasenby A., 2018, \mn@doi [Statistics and Computing] {10.1007/s11222-018-9844-0}, 29, 891–913
\bibitem[\protect\citeauthoryear{Keeton}{Keeton}{2011}]{Keeton_2011}
Keeton C.~R., 2011, \mn@doi [Monthly Notices of the Royal Astronomical Society] {10.1111/j.1365-2966.2011.18474.x}, 414, 1418–1426
\bibitem[\protect\citeauthoryear{Kester \& Mueller}{Kester \& Mueller}{2021}]{NS11}
Kester D., Mueller M., 2021, BayesicFitting, a PYTHON Toolbox for Bayesian Fitting and Evidence Calculation (\mn@eprint {arXiv} {2109.11976}), \url {https://arxiv.org/abs/2109.11976}
\bibitem[\protect\citeauthoryear{Khan, Husa, Hannam, Ohme, Pürrer, Forteza \& Bohé}{Khan et~al.}{2016}]{TL_phenomD}
Khan S., Husa S., Hannam M., Ohme F., Pürrer M., Forteza X.~J., Bohé A., 2016, \mn@doi [Physical Review D] {10.1103/physrevd.93.044007}, 93
\bibitem[\protect\citeauthoryear{Kobyzev, Prince \& Brubaker}{Kobyzev et~al.}{2021}]{NF_review}
Kobyzev I., Prince S.~J., Brubaker M.~A., 2021, \mn@doi [IEEE Transactions on Pattern Analysis and Machine Intelligence] {10.1109/tpami.2020.2992934}, 43, 3964–3979
\bibitem[\protect\citeauthoryear{Krishna, Vijaykumar, Ganguly, Talbot, Biscoveanu, George, Williams \& Zimmerman}{Krishna et~al.}{2023}]{TL_relativebinning}
Krishna K., Vijaykumar A., Ganguly A., Talbot C., Biscoveanu S., George R.~N., Williams N., Zimmerman A., 2023, Accelerated parameter estimation in Bilby with relative binning (\mn@eprint {arXiv} {2312.06009}), \url {https://arxiv.org/abs/2312.06009}
\bibitem[\protect\citeauthoryear{{Matthews}, {Arbel}, {Rezende} \& {Doucet}}{{Matthews} et~al.}{2022}]{Matthews2022SMC}
{Matthews} A. G.~D.~G., {Arbel} M., {Rezende} D.~J., {Doucet} A., 2022, \mn@doi [arXiv e-prints] {10.48550/arXiv.2201.13117}, \href {https://ui.adsabs.harvard.edu/abs/2022arXiv220113117M} {p. arXiv:2201.13117}
\bibitem[\protect\citeauthoryear{Metropolis, Rosenbluth, Rosenbluth, Teller \& Teller}{Metropolis et~al.}{1953}]{metropolis1953}
Metropolis N., Rosenbluth A.~W., Rosenbluth M.~N., Teller A.~H., Teller E., 1953, \mn@doi [Journal of Chemical Physics] {10.1063/1.1699114}, 21, 1087
\bibitem[\protect\citeauthoryear{Morrás, Nuño~Siles \& García-Bellido}{Morrás et~al.}{2023}]{TL_ROQreview}
Morrás G., Nuño~Siles J.~F., García-Bellido J., 2023, \mn@doi [Physical Review D] {10.1103/physrevd.108.123025}, 108
\bibitem[\protect\citeauthoryear{Moss}{Moss}{2020}]{NS10}
Moss A., 2020, \mn@doi [Monthly Notices of the Royal Astronomical Society] {10.1093/mnras/staa1469}, 496, 328–338
\bibitem[\protect\citeauthoryear{Mukherjee, Parkinson \& Liddle}{Mukherjee et~al.}{2006}]{NS_cosmonest}
Mukherjee P., Parkinson D., Liddle A.~R., 2006, \mn@doi [The Astrophysical Journal] {10.1086/501068}, 638, L51–L54
\bibitem[\protect\citeauthoryear{Neal}{Neal}{2011}]{HMC2}
Neal R.~M., 2011, Handbook of Markov Chain Monte Carlo, 2, 2
\bibitem[\protect\citeauthoryear{Ormondroyd et~al.}{Ormondroyd et~al.}{2024}]{Adam}
Ormondroyd A., et~al., 2024, In preparation
\bibitem[\protect\citeauthoryear{{Paige} \& {Wood}}{{Paige} \& {Wood}}{2016}]{Paige2016SMC}
{Paige} B., {Wood} F., 2016, \mn@doi [arXiv e-prints] {10.48550/arXiv.1602.06701}, \href {https://ui.adsabs.harvard.edu/abs/2016arXiv160506376P} {p. arXiv:1602.06701}
\bibitem[\protect\citeauthoryear{Papamakarios \& Murray}{Papamakarios \& Murray}{2015}]{Papamakarios2015ImportanceSampling}
Papamakarios G., Murray I., 2015, Distilling intractable generative models
\bibitem[\protect\citeauthoryear{{Papamakarios} \& {Murray}}{{Papamakarios} \& {Murray}}{2016}]{Papamakarios2016NPE}
{Papamakarios} G., {Murray} I., 2016, \mn@doi [arXiv e-prints] {10.48550/arXiv.1605.06376}, \href {https://ui.adsabs.harvard.edu/abs/2016arXiv160506376P} {p. arXiv:1605.06376}
\bibitem[\protect\citeauthoryear{Parkinson, Mukherjee \& Liddle}{Parkinson et~al.}{2006}]{NS_cosmonest2}
Parkinson D., Mukherjee P., Liddle A.~R., 2006, \mn@doi [Physical Review D] {10.1103/physrevd.73.123523}, 73
\bibitem[\protect\citeauthoryear{Payne, Talbot \& Thrane}{Payne et~al.}{2019}]{likelihood_reweighting}
Payne E., Talbot C., Thrane E., 2019, \mn@doi [Physical Review D] {10.1103/physrevd.100.123017}, 100
\bibitem[\protect\citeauthoryear{Petrosyan \& Handley}{Petrosyan \& Handley}{2022}]{Supernest}
Petrosyan A., Handley W., 2022, SuperNest: accelerated nested sampling applied to astrophysics and cosmology, \mn@doi{10.48550/arXiv.2212.01760}
\bibitem[\protect\citeauthoryear{{Pochinda} et~al.,}{{Pochinda} et~al.}{2023}]{Pochinda2023Constraints}
{Pochinda} S., et~al., 2023, \mn@doi [arXiv e-prints] {10.48550/arXiv.2312.08095}, \href {https://ui.adsabs.harvard.edu/abs/2023arXiv231208095P} {p. arXiv:2312.08095}
\bibitem[\protect\citeauthoryear{Prathaban, Bevins \& Handley}{Prathaban et~al.}{2024}]{zenodo}
Prathaban M., Bevins H., Handley W., 2024, Accelerated nested sampling with $\beta$-flows for gravitational waves, \mn@doi{10.5281/zenodo.14198699}, \url {https://doi.org/10.5281/zenodo.14198699}
\bibitem[\protect\citeauthoryear{Pratten et~al.,}{Pratten et~al.}{2021}]{IMRPhenomXPHM}
Pratten G., et~al., 2021, \mn@doi [Physical Review D] {10.1103/physrevd.103.104056}, 103
\bibitem[\protect\citeauthoryear{Saleh, Zimmerman, Chen \& Ghattas}{Saleh et~al.}{2024}]{tempered_likelihood_reweighting}
Saleh B., Zimmerman A., Chen P., Ghattas O., 2024, Tempered Multifidelity Importance Sampling for Gravitational Wave Parameter Estimation (\mn@eprint {arXiv} {2405.19407}), \url {https://arxiv.org/abs/2405.19407}
\bibitem[\protect\citeauthoryear{Skilling}{Skilling}{2006}]{Skilling2006NS}
Skilling J., 2006, \mn@doi [Bayesian Analysis] {10.1214/06-BA127}, 1, 833
\bibitem[\protect\citeauthoryear{Smith, Field, Blackburn, Haster, Pürrer, Raymond \& Schmidt}{Smith et~al.}{2016}]{TL_fastmodel}
Smith R., Field S.~E., Blackburn K., Haster C.-J., Pürrer M., Raymond V., Schmidt P., 2016, \mn@doi [Physical Review D] {10.1103/physrevd.94.044031}, 94
\bibitem[\protect\citeauthoryear{Speagle}{Speagle}{2020}]{NS_dynesty}
Speagle J.~S., 2020, \mn@doi [Monthly Notices of the Royal Astronomical Society] {10.1093/mnras/staa278}, 493, 3132–3158
\bibitem[\protect\citeauthoryear{Thrane \& Talbot}{Thrane \& Talbot}{2019}]{Thrane_2019}
Thrane E., Talbot C., 2019, \mn@doi [Publications of the Astronomical Society of Australia] {10.1017/pasa.2019.2}, 36
\bibitem[\protect\citeauthoryear{Tong, Fatras, Malkin, Huguet, Zhang, Rector-Brooks, Wolf \& Bengio}{Tong et~al.}{2024}]{CNF}
Tong A., Fatras K., Malkin N., Huguet G., Zhang Y., Rector-Brooks J., Wolf G., Bengio Y., 2024, Improving and generalizing flow-based generative models with minibatch optimal transport (\mn@eprint {arXiv} {2302.00482}), \url {https://arxiv.org/abs/2302.00482}
\bibitem[\protect\citeauthoryear{Trassinelli}{Trassinelli}{2017}]{NS1}
Trassinelli M., 2017, \mn@doi [] {10.1016/j.nimb.2017.05.030}, 408, 301–312
\bibitem[\protect\citeauthoryear{Trassinelli}{Trassinelli}{2019}]{NS7}
Trassinelli M., 2019, \mn@doi [Proceedings] {10.3390/proceedings2019033014}, 33
\bibitem[\protect\citeauthoryear{Trassinelli \& Ciccodicola}{Trassinelli \& Ciccodicola}{2020}]{NS8}
Trassinelli M., Ciccodicola P., 2020, \mn@doi [Entropy] {10.3390/e22020185}, 22
\bibitem[\protect\citeauthoryear{Veitch et~al.,}{Veitch et~al.}{2015a}]{Veitch_LAL}
Veitch J., et~al., 2015a, \mn@doi [Physical Review D] {10.1103/physrevd.91.042003}, 91
\bibitem[\protect\citeauthoryear{Veitch et~al.,}{Veitch et~al.}{2015b}]{NS4}
Veitch J., et~al., 2015b, \mn@doi [Physical Review D] {10.1103/physrevd.91.042003}, 91
\bibitem[\protect\citeauthoryear{Veitch et~al.,}{Veitch et~al.}{2024}]{NS9}
Veitch J., et~al., 2024, johnveitch/cpnest: v0.11.7, \mn@doi{10.5281/zenodo.12801702}, \url {https://doi.org/10.5281/zenodo.12801702}
\bibitem[\protect\citeauthoryear{Vinciguerra, Veitch \& Mandel}{Vinciguerra et~al.}{2017}]{TL_multibandinterpolation}
Vinciguerra S., Veitch J., Mandel I., 2017, \mn@doi [Classical and Quantum Gravity] {10.1088/1361-6382/aa6d44}, 34, 115006
\bibitem[\protect\citeauthoryear{{Williams}, {Veitch} \& {Messenger}}{{Williams} et~al.}{2021}]{Williams2021Nessai}
{Williams} M.~J., {Veitch} J., {Messenger} C., 2021, \mn@doi [Phys. Rev. D] {10.1103/PhysRevD.103.103006}, \href {https://ui.adsabs.harvard.edu/abs/2021PhRvD.103j3006W} {103, 103006}
\makeatother
\end{thebibliography}
```
5. **Author Information:**
- Lead Author: {'name': 'Metha Prathaban'}
- Full Authors List:
```yaml
Metha Prathaban:
phd:
start: 2022-10-01
supervisors:
- Will Handley
thesis: null
partiii:
start: 2020-10-01
end: 2021-06-01
supervisors:
- Will Handley
thesis: Evidence for a Palindromic Universe
original_image: images/originals/metha_prathaban.png
image: /assets/group/images/metha_prathaban.jpg
links:
Harding Scholar: https://www.hardingscholars.fund.cam.ac.uk/metha-prathaban-2022-cohort
GitHub: https://github.com/mrosep
Harry Bevins:
coi:
start: 2023-10-01
thesis: null
phd:
start: 2019-10-01
end: 2023-03-31
supervisors:
- Will Handley
- Eloy de Lera Acedo
- Anastasia Fialkov
thesis: A Machine Learning-enhanced Toolbox for Bayesian 21-cm Data Analysis and
Constraints on the Astrophysics of the Early Universe
original_image: images/originals/harry_bevins.jpeg
image: /assets/group/images/harry_bevins.jpg
links:
Webpage: https://htjb.github.io/
GitHub: https://github.com/htjb
ADS: https://ui.adsabs.harvard.edu/search/q=author%3A%22Bevins%2C%20H.%20T.%20J.%22&sort=date%20desc%2C%20bibcode%20desc&p_=0
Publons: https://publons.com/researcher/5239833/harry-bevins/
destination:
2023-04-01: Postdoc in Cambridge (Eloy)
2023-10-01: Cambridge Kavli Fellowship
Will Handley:
pi:
start: 2020-10-01
thesis: null
postdoc:
start: 2016-10-01
end: 2020-10-01
thesis: null
phd:
start: 2012-10-01
end: 2016-09-30
supervisors:
- Anthony Lasenby
- Mike Hobson
thesis: 'Kinetic initial conditions for inflation: theory, observation and methods'
original_image: images/originals/will_handley.jpeg
image: /assets/group/images/will_handley.jpg
links:
Webpage: https://willhandley.co.uk
```
This YAML file provides a concise snapshot of an academic research group. It lists members by name along with their academic roles—ranging from Part III and summer projects to MPhil, PhD, and postdoctoral positions—with corresponding dates, thesis topics, and supervisor details. Supplementary metadata includes image paths and links to personal or departmental webpages. A dedicated "coi" section profiles senior researchers, highlighting the group’s collaborative mentoring network and career trajectories in cosmology, astrophysics, and Bayesian data analysis.
====================================================================================
Final Output Instructions
====================================================================================
- Combine all data sources to create a seamless, engaging narrative.
- Follow the exact Markdown output format provided at the top.
- Do not include any extra explanation, commentary, or wrapping beyond the specified Markdown.
- Validate that every bibliographic reference with a DOI or arXiv identifier is converted into a Markdown link as per the examples.
- Validate that every Markdown author link corresponds to a link in the author information block.
- Before finalizing, confirm that no LaTeX citation commands or other undesired formatting remain.
- Before finalizing, confirm that the link to the paper itself [2411.17663](https://arxiv.org/abs/2411.17663) is featured in the first sentence.
Generate only the final Markdown output that meets all these requirements.
{% endraw %}