# Second star to the right and straight on ’til morning—Astrophysics white papers

What will be the next big thing in astronomy? One of the hard things about research is that you often don’t know what you will discover before you embark on an investigation. An idea might work out, or it might not, or along the way you might discover something unexpected which is far more interesting. As you might imagine, this can make laying definite plans difficult…

However, it is important to have plans for research. While you might not be sure of the outcome, it is necessary to weigh the risks and rewards associated with the probable results before you invest your time and taxpayers’ money!

To help with planning and prioritising, researchers in astrophysics often pull together white papers [bonus note]. These are sketches of ideas for future research, arguing why you think they might be interesting. These can then be discussed within the community to help shape the direction of the field. If other scientists find the paper convincing, you can build support which helps push for funding. If there are gaps in the logic, others can point these out to ave you heading the wrong way. This type of consensus building is especially important for large experiments or missions—you don’t want to spend a billion dollars on something unless you’re really sure it is a good idea and lots of people agree.

I have been involved with a few white papers recently. Here are some key ideas for where research should go.

### Ground-based gravitational-wave detectors: The next generation

We’ve done some awesome things with Advanced LIGO and Advanced Virgo. In just a couple of years we have revolutionized our understanding of binary black holes. That’s not bad. However, our current gravitational-wave observatories are limited in what they can detect. What amazing things could we achieve with a new generation of detectors?

It can take decades to develop new instruments, therefore it’s important to start thinking about them early. Obviously, what we would most like is an observatory which can detect everything, but that’s not feasible. In this white paper, we pick the questions we most want answered, and see what the requirements for a new detector would be. A design which satisfies these specifications would therefore be a solid choice for future investment.

Binary black holes are the perfect source for ground-based detectors. What do we most want to know about them?

1. How many mergers are there, and how does the merger rate change over the history of the Universe? We want to know how binary black holes are made. The merger rate encodes lots of information about how to make binaries, and comparing how this evolves compared with the rate at which the Universe forms stars, will give us a deeper understanding of how black holes are made.
2. What are the properties (masses and spins) of black holes? The merger rate tells us some things about how black holes form, but other properties like the masses, spins and orbital eccentricity complete the picture. We want to make precise measurements for individual systems, and also understand the population.
3. Where do supermassive black holes come from? We know that stars can collapse to produce stellar-mass black holes. We also know that the centres of galaxies contain massive black holes. Where do these massive black holes come from? Do they grow from our smaller black holes, or do they form in a different way? Looking for intermediate-mass black holes in the gap in-between will tells us whether there is a missing link in the evolution of black holes.

The detection horizon (the distance to which sources can be detected) for Advanced LIGO (aLIGO), its upgrade A+, and the proposed Cosmic Explorer (CE) and Einstein Telescope (ET). The horizon is plotted for binaries with equal-mass, nonspinning components. Adapted from Hall & Evans (2019).

What can we do to answer these questions?

1. Increase sensitivity! Advanced LIGO and Advanced Virgo can detect a $30 M_\odot + 30 M_\odot$ binary out to a redshift of about $z \approx 1$. The planned detector upgrade A+ will see them out to redshift $z \approx 2$. That’s pretty impressive, it means we’re covering 10 billion years of history. However, the peak in the Universe’s star formation happens at around $z \approx 2$, so we’d really like to see beyond this in order to measure how the merger rate evolves. Ideally we would see all the way back to cosmic dawn at $z \approx 20$ when the Universe was only 200 million years old and the first stars light up.
2. Increase our frequency range! Our current detectors are limited in the range of frequencies they can detect. Pushing to lower frequencies helps us to detect heavier systems. If we want to detect intermediate-mass black holes of $100 M_\odot$ we need this low frequency sensitivity. At the moment, Advanced LIGO could get down to about $10~\mathrm{Hz}$. The plot below shows the signal from a $100 M_\odot + 100 M_\odot$ binary at $z = 10$. The signal is completely undetectable at $10~\mathrm{Hz}$.

The gravitational wave signal from the final stages of inspiral, merger and ringdown of a two 100 solar mass black holes at a redshift of 10. The signal chirps up in frequency. The colour coding shows parts of the signal above different frequencies. Part of Figure 2 of the Binary Black Holes White Paper.

3. Increase sensitivity and frequency range! Increasing sensitivity means that we will have higher signal-to-noise ratio detections. For these loudest sources, we will be able to make more precise measurements of the source properties. We will also have more detections overall, as we can survey a larger volume of the Universe. Increasing the frequency range means we can observe a longer stretch of the signal (for the systems we currently see). This means it is easier to measure spin precession and orbital eccentricity. We also get to measure a wider range of masses. Putting the improved sensitivity and frequency range together means that we’ll get better measurements of individual systems and a more complete picture of the population.

How much do we need to improve our observatories to achieve our goals? To quantify this, lets consider the boost in sensitivity relative to A+, which I’ll call $\beta_\mathrm{A+}$. If the questions can be answered with $\beta_\mathrm{A+} = 1$, then we don’t need anything beyond the currently planned A+. If we need a slightly larger $\beta_\mathrm{A+}$, we should start investigating extra ways to improve the A+ design. If we need much larger $\beta_\mathrm{A+}$, we need to think about new facilities.

The plot below shows the boost necessary to detect a binary (with equal-mass nonspinning components) out to a given redshift. With a boost of $\beta_\mathrm{A+} = 10$ (blue line) we can survey black holes around $10 M_\odot$$30 M_\odot$ across cosmic time.

The boost factor (relative to A+) $\beta_\mathrm{A+}$ needed to detect a binary with a total mass $M$ out to redshift $z$. The binaries are assumed to have equal-mass, nonspinning components. The colour scale saturates at $\log_{10} \beta_\mathrm{A+} = 4.5$. The blue curve highlights the reach at a boost factor of $\beta_\mathrm{A+} = 10$. The solid and dashed white lines indicate the maximum reach of Cosmic Explorer and the Einstein Telescope, respectively. Part of Figure 1 of the Binary Black Holes White Paper.

The plot above shows that to see intermediate-mass black holes, we do need to completely overhaul the low-frequency sensitivity. What do we need to detect a $100 M_\odot + 100 M_\odot$ binary at $z = 10$? If we parameterize the noise spectrum (power spectral density) of our detector as $S_n(f) = S_{10}(f/10~\mathrm{Hz})^\alpha$ with a lower cut-off frequency of $f_\mathrm{min}$, we can investigate the various possibilities. The plot below shows the possible combinations of parameters which meet of requirements.

Requirements on the low-frequency noise power spectrum necessary to detect an optimally oriented intermediate-mass binary black hole system with two 100 solar mass components at a redshift of 10. Part of Figure 2 of the Binary Black Holes White Paper.

To build up information about the population of black holes, we need lots of detections. Uncertainties scale inversely with the square root of the number of detections, so you would expect few percent uncertainty after 1000 detections. If we want to see how the population evolves, we need these many per redshift bin! The plot below shows the number of detections per year of observing time for different boost factors. The rate starts to saturate once we detect all the binaries in the redshift range. This is as good as you’ll ever going to get.

Expected rate of binary black hole detections $R_\mathrm{det}$ per redshift bin as a function of A+ boost factor $\beta_\mathrm{A+}$ for three redshift bins. The merging binaries are assumed to be uniformly distributed with a constant merger rate roughly consistent with current observations: the solid line is about the current median, while the dashed and dotted lines are roughly the 90% bounds. Figure 3 of the Binary Black Holes White Paper.

Looking at the plots above, it is clear that A+ is not going to satisfy our requirements. We need something with a boost factor of $\beta_\mathrm{A+} = 10$: a next-generation observatory. Both the Cosmic Explorer and Einstein Telescope designs do satisfy our goals.

Title: Deeper, wider, sharper: Next-generation ground-based gravitational-wave observations of binary black holes
arXiv:
1903.09220 [astro-ph.HE]
Theme music: Daft Punk

### Extreme mass ratio inspirals are awesome

We have seen gravitational waves from a stellar-mass black hole merging with another stellar-mass black hole, can we observe a stellar-mass black hole merging with a massive black hole? Yes, these are a perfect source for a space-based gravitational wave observatory. We call these systems extreme mass-ratio inspirals (or EMRIs, pronounced em-rees, for short) [bonus note].

Having such an extreme mass ratio, with one black hole much bigger than the other, gives EMRIs interesting properties. The number of orbits over the course of an inspiral scales with the mass ratio: the more extreme the mass ratio, the more orbits there are. Each of these gives us something to measure in the gravitational wave signal.

A short section of an orbit around a spinning black hole. While inspirals last for years, this would represent only a few hours around a black hole of mass $M = 10^6 M_\odot$. The position is measured in terms of the gravitational radius $r_\mathrm{g} = GM/c^2$. The innermost stable orbit for this black hole would be about $r_\mathrm{g} = 2.3$. Part of Figure 1 of the EMRI White Paper.

As EMRIs are so intricate, we can make exquisit measurements of the source properties. These will enable us to:

• Measure the masses of both black holes to better than 10% precision
• Reconstruct the mass distribution of massive black holes out to redshift $z \gtrsim 4$
• Measure massive black hole spins to a precision of better than 0.001, giving us an insight into how they formed
• Perform precision tests of the no-hair theorem describing black holes in general relativity, and test alternative theories of gravity in the strong-field regime
• Cross-correlate locations with galaxy catalogues to measure the expansion of the Universe

Event rates for EMRIs are currently uncertain: there could be just one per year or thousands. From the rate we can figure out the details of what is going in in the nuclei of galaxies, and what types of objects you find there.

With EMRIs you can unravel mysteries in astrophysics, fundamental physics and cosmology.

Have we sold you that EMRIs are awesome? Well then, what do we need to do to observe them? There is only one currently planned mission which can enable us to study EMRIs: LISA. To maximise the science from EMRIs, we have to support LISA.

As an aspiring scientist, Lisa Simpson is a strong supporter of the LISA mission. Credit: Fox

Title: The unique potential of extreme mass-ratio inspirals for gravitational-wave astronomy
arXiv:
1903.03686 [astro-ph.HE]
Theme music: Muse

### Bonus notes

#### White paper vs journal article

Since white papers are proposals for future research, they aren’t as rigorous as usual academic papers. They are really attempts to figure out a good question to ask, rather than being answers. White papers are not usually peer reviewed before publication—the point is that you want everybody to comment on them, rather than just one or two anonymous referees.

Whilst white papers aren’t quite the same class as journal articles, they do still contain some interesting ideas, so I thought they still merit a blog post.

#### Recycling

I have blogged about EMRIs before, so I won’t go into too much detail here. It was one of my former blog posts which inspired the LISA Science Team to get in touch to ask me to write the white paper.

# Accuracy of inference on the physics of binary evolution from gravitational-wave observations

Gravitational-wave astronomy lets us observing binary black holes. These systems, being made up of two black holes, are pretty difficult to study by any other means. It has long been argued that with this new information we can unravel the mysteries of stellar evolution. Just as a palaeontologist can discover how long-dead animals lived from their bones, we can discover how massive stars lived by studying their black hole remnants. In this paper, we quantify how much we can really learn from this black hole palaeontology—after 1000 detections, we should pin down some of the most uncertain parameters in binary evolution to a few percent precision.

### Life as a binary

There are many proposed ways of making a binary black hole. The current leading contender is isolated binary evolution: start with a binary star system (most stars are in binaries or higher multiples, our lonesome Sun is a little unusual), and let the stars evolve together. Only a fraction will end with black holes close enough to merge within the age of the Universe, but these would be the sources of the signals we see with LIGO and Virgo. We consider this isolated binary scenario in this work [bonus note].

Now, you might think that with stars being so fundamentally important to astronomy, and with binary stars being so common, we’d have the evolution of binaries figured out by now. It turns out it’s actually pretty messy, so there’s lots of work to do. We consider constraining four parameters which describe the bits of binary physics which we are currently most uncertain of:

• Black hole natal kicks—the push black holes receive when they are born in supernova explosions. We now the neutron stars get kicks, but we’re less certain for black holes [bonus note].
• Common envelope efficiency—one of the most intricate bits of physics about binaries is how mass is transferred between stars. As they start exhausting their nuclear fuel they puff up, so material from the outer envelope of one star may be stripped onto the other. In the most extreme cases, a common envelope may form, where so much mass is piled onto the companion, that both stars live in a single fluffy envelope. Orbiting inside the envelope helps drag the two stars closer together, bringing them closer to merging. The efficiency determines how quickly the envelope becomes unbound, ending this phase.
• Mass loss rates during the Wolf–Rayet (not to be confused with Wolf 359) and luminous blue variable phases–stars lose mass through out their lives, but we’re not sure how much. For stars like our Sun, mass loss is low, there is enough to gives us the aurora, but it doesn’t affect the Sun much. For bigger and hotter stars, mass loss can be significant. We consider two evolutionary phases of massive stars where mass loss is high, and currently poorly known. Mass could be lost in clumps, rather than a smooth stream, making it difficult to measure or simulate.

We use parameters describing potential variations in these properties are ingredients to the COMPAS population synthesis code. This rapidly (albeit approximately) evolves a population of stellar binaries to calculate which will produce merging binary black holes.

The question is now which parameters affect our gravitational-wave measurements, and how accurately we can measure those which do?

Binary black hole merger rate at three different redshifts $z$ as calculated by COMPAS. We show the rate in 30 different chirp mass bins for our default population parameters. The caption gives the total rate for all masses. Figure 2 of Barrett et al. (2018)

### Gravitational-wave observations

For our deductions, we use two pieces of information we will get from LIGO and Virgo observations: the total number of detections, and the distributions of chirp masses. The chirp mass is a combination of the two black hole masses that is often well measured—it is the most important quantity for controlling the inspiral, so it is well measured for low mass binaries which have a long inspiral, but is less well measured for higher mass systems. In reality we’ll have much more information, so these results should be the minimum we can actually do.

We consider the population after 1000 detections. That sounds like a lot, but we should have collected this many detections after just 2 or 3 years observing at design sensitivity. Our default COMPAS model predicts 484 detections per year of observing time! Honestly, I’m a little scared about having this many signals…

For a set of population parameters (black hole natal kick, common envelope efficiency, luminous blue variable mass loss and Wolf–Rayet mass loss), COMPAS predicts the number of detections and the fraction of detections as a function of chirp mass. Using these, we can work out the probability of getting the observed number of detections and fraction of detections within different chirp mass ranges. This is the likelihood function: if a given model is correct we are more likely to get results similar to its predictions than further away, although we expect their to be some scatter.

If you like equations, the from of our likelihood is explained in this bonus note. If you don’t like equations, there’s one lurking in the paragraph below. Just remember, that it can’t see you if you don’t move. It’s OK to skip the equation.

To determine how sensitive we are to each of the population parameters, we see how the likelihood changes as we vary these. The more the likelihood changes, the easier it should be to measure that parameter. We wrap this up in terms of the Fisher information matrix. This is defined as

$\displaystyle F_{ij} = -\left\langle\frac{\partial^2\ln \mathcal{L}(\mathcal{D}|\left\{\lambda\right\})}{\partial \lambda_i \partial\lambda_j}\right\rangle$,

where $\mathcal{L}(\mathcal{D}|\left\{\lambda\right\})$ is the likelihood for data $\mathcal{D}$ (the number of observations and their chirp mass distribution in our case), $\left\{\lambda\right\}$ are our parameters (natal kick, etc.), and the angular brackets indicate the average over the population parameters. In statistics terminology, this is the variance of the score, which I think sounds cool. The Fisher information matrix nicely quantifies how much information we can lean about the parameters, including the correlations between them (so we can explore degeneracies). The inverse of the Fisher information matrix gives a lower bound on the covariance matrix (the multidemensional generalisation of the variance in a normal distribution) for the parameters $\left\{\lambda\right\}$. In the limit of a large number of detections, we can use the Fisher information matrix to estimate the accuracy to which we measure the parameters [bonus note].

We simulated several populations of binary black hole signals, and then calculate measurement uncertainties for our four population uncertainties to see what we could learn from these measurements.

### Results

Using just the rate information, we find that we can constrain a combination of the common envelope efficiency and the Wolf–Rayet mass loss rate. Increasing the common envelope efficiency ends the common envelope phase earlier, leaving the binary further apart. Wider binaries take longer to merge, so this reduces the merger rate. Similarly, increasing the Wolf–Rayet mass loss rate leads to wider binaries and smaller black holes, which take longer to merge through gravitational-wave emission. Since the two parameters have similar effects, they are anticorrelated. We can increase one and still get the same number of detections if we decrease the other. There’s a hint of a similar correlation between the common envelope efficiency and the luminous blue variable mass loss rate too, but it’s not quite significant enough for us to be certain it’s there.

Fisher information matrix estimates for fractional measurement precision of the four population parameters: the black hole natal kick $\sigma_\mathrm{kick}$, the common envelope efficiency $\alpha_\mathrm{CE}$, the Wolf–Rayet mass loss rate $f_\mathrm{WR}$, and the luminous blue variable mass loss rate $f_\mathrm{LBV}$. There is an anticorrealtion between $f_\mathrm{WR}$ and $\alpha_\mathrm{CE}$, and hints at a similar anticorrelation between $f_|mathrm{LBV}$ and $\alpha_\mathrm{CE}$. We show 1500 different realisations of the binary population to give an idea of scatter. Figure 6 of Barrett et al. (2018)

Adding in the chirp mass distribution gives us more information, and improves our measurement accuracies. The fraction uncertainties are about 2% for the two mass loss rates and the common envelope efficiency, and about 5% for the black hole natal kick. We’re less sensitive to the natal kick because the most massive black holes don’t receive a kick, and so are unaffected by the kick distribution [bonus note]. In any case, these measurements are exciting! With this type of precision, we’ll really be able to learn something about the details of binary evolution.

Measurement precision for the four population parameters after 1000 detections. We quantify the precision with the standard deviation estimated from the Fisher inforamtion matrix. We show results from 1500 realisations of the population to give an idea of scatter. Figure 5 of Barrett et al. (2018)

The accuracy of our measurements will improve (on average) with the square root of the number of gravitational-wave detections. So we can expect 1% measurements after about 4000 observations. However, we might be able to get even more improvement by combining constraints from other types of observation. Combining different types of observation can help break degeneracies. I’m looking forward to building a concordance model of binary evolution, and figuring out exactly how massive stars live their lives.

arXiv: 1711.06287 [astro-ph.HE]
Journal: Monthly Notices of the Royal Astronomical Society; 477(4):4685–4695; 2018
Favourite dinosaur: Professor Science

### Bonus notes

#### Channel selection

In practise, we will need to worry about how binary black holes are formed, via isolated evolution or otherwise, before inferring the parameters describing binary evolution. This makes the problem more complicated. Some parameters, like mass loss rates or black hole natal kicks, might be common across multiple channels, while others are not. There are a number of ways we might be able to tell different formation mechanisms apart, such as by using spin measurements.

#### Kick distribution

We model the supernova kicks $v_\mathrm{kick}$ as following a Maxwell–Boltzmann distribution,

$\displaystyle p(v_\mathrm{kick}) = \sqrt{\frac{2}{\pi}} \frac{v_\mathrm{kick}^2}{\sigma_\mathrm{kick}^3} \exp\left(\frac{-v_\mathrm{kick}^2}{2\sigma_\mathrm{kick}^2}\right)$,

where $\sigma_\mathrm{kick}$ is the unknown population parameter. The natal kick received by the black hole $v^*_\mathrm{kick}$ is not the same as this, however, as we assume some of the material ejected by the supernova falls back, reducing the over kick. The final natal kick is

$v^*_\mathrm{kick} = (1-f_\mathrm{fb})v_\mathrm{kick}$,

where $f_\mathrm{fb}$ is the fraction that falls back, taken from Fryer et al. (2012). The fraction is greater for larger black holes, so the biggest black holes get no kicks. This means that the largest black holes are unaffected by the value of $\sigma_\mathrm{kick}$.

#### The likelihood

In this analysis, we have two pieces of information: the number of detections, and the chirp masses of the detections. The first is easy to summarise with a single number. The second is more complicated, and we consider the fraction of events within different chirp mass bins.

Our COMPAS model predicts the merger rate $\mu$ and the probability of falling in each chirp mass bin $p_k$ (we factor measurement uncertainty into this). Our observations are the the total number of detections $N_\mathrm{obs}$ and the number in each chirp mass bin $c_k$ ($N_\mathrm{obs} = \sum_k c_k$). The likelihood is the probability of these observations given the model predictions. We can split the likelihood into two pieces, one for the rate, and one for the chirp mass distribution,

$\mathcal{L} = \mathcal{L}_\mathrm{rate} \times \mathcal{L}_\mathrm{mass}$.

For the rate likelihood, we need the probability of observing $N_\mathrm{obs}$ given the predicted rate $\mu$. This is given by a Poisson distribution,

$\displaystyle \mathcal{L}_\mathrm{rate} = \exp(-\mu t_\mathrm{obs}) \frac{(\mu t_\mathrm{obs})^{N_\mathrm{obs}}}{N_\mathrm{obs}!}$,

where $t_\mathrm{obs}$ is the total observing time. For the chirp mass likelihood, we the probability of getting a number of detections in each bin, given the predicted fractions. This is given by a multinomial distribution,

$\displaystyle \mathcal{L}_\mathrm{mass} = \frac{N_\mathrm{obs}!}{\prod_k c_k!} \prod_k p_k^{c_k}$.

These look a little messy, but they simplify when you take the logarithm, as we need to do for the Fisher information matrix.

When we substitute in our likelihood into the expression for the Fisher information matrix, we get

$\displaystyle F_{ij} = \mu t_\mathrm{obs} \left[ \frac{1}{\mu^2} \frac{\partial \mu}{\partial \lambda_i} \frac{\partial \mu}{\partial \lambda_j} + \sum_k\frac{1}{p_k} \frac{\partial p_k}{\partial \lambda_i} \frac{\partial p_k}{\partial \lambda_j} \right]$.

Conveniently, although we only need to evaluate first-order derivatives, even though the Fisher information matrix is defined in terms of second derivatives. The expected number of events is $\langle N_\mathrm{obs} \rangle = \mu t_\mathrm{obs}$. Therefore, we can see that the measurement uncertainty defined by the inverse of the Fisher information matrix, scales on average as $N_\mathrm{obs}^{-1/2}$.

For anyone worrying about using the likelihood rather than the posterior for these estimates, the high number of detections [bonus note] should mean that the information we’ve gained from the data overwhelms our prior, meaning that the shape of the posterior is dictated by the shape of the likelihood.

#### Interpretation of the Fisher information matrix

As an alternative way of looking at the Fisher information matrix, we can consider the shape of the likelihood close to its peak. Around the maximum likelihood point, the first-order derivatives of the likelihood with respect to the population parameters is zero (otherwise it wouldn’t be the maximum). The maximum likelihood values of $N_\mathrm{obs} = \mu t_\mathrm{obs}$ and $c_k = N_\mathrm{obs} p_k$ are the same as their expectation values. The second-order derivatives are given by the expression we have worked out for the Fisher information matrix. Therefore, in the region around the maximum likelihood point, the Fisher information matrix encodes all the relevant information about the shape of the likelihood.

So long as we are working close to the maximum likelihood point, we can approximate the distribution as a multidimensional normal distribution with its covariance matrix determined by the inverse of the Fisher information matrix. Our results for the measurement uncertainties are made subject to this approximation (which we did check was OK).

Approximating the likelihood this way should be safe in the limit of large $N_\mathrm{obs}$. As we get more detections, statistical uncertainties should reduce, with the peak of the distribution homing in on the maximum likelihood value, and its width narrowing. If you take the limit of $N_\mathrm{obs} \rightarrow \infty$, you’ll see that the distribution basically becomes a delta function at the maximum likelihood values. To check that our $N_\mathrm{obs} = 1000$ was large enough, we verified that higher-order derivatives were still small.

Michele Vallisneri has a good paper looking at using the Fisher information matrix for gravitational wave parameter estimation (rather than our problem of binary population synthesis). There is a good discussion of its range of validity. The high signal-to-noise ratio limit for gravitational wave signals corresponds to our high number of detections limit.