Accuracy of inference on the physics of binary evolution from gravitational-wave observations

Gravitational-wave astronomy lets us observe binary black holes. These systems, being made up of two black holes, are pretty difficult to study by any other means. It has long been argued that with this new information we can unravel the mysteries of stellar evolution. Just as a palaeontologist can discover how long-dead animals lived from their bones, we can discover how massive stars lived by studying their black hole remnants. In this paper, we quantify how much we can really learn from this black hole palaeontology—after 1000 detections, we should pin down some of the most uncertain parameters in binary evolution to a few percent precision.

Life as a binary

There are many proposed ways of making a binary black hole. The current leading contender is isolated binary evolution: start with a binary star system (most stars are in binaries or higher multiples; our lonesome Sun is a little unusual), and let the stars evolve together. Only a fraction will end with black holes close enough to merge within the age of the Universe, but these would be the sources of the signals we see with LIGO and Virgo. We consider this isolated binary scenario in this work [bonus note].

Now, you might think that with stars being so fundamentally important to astronomy, and with binary stars being so common, we’d have the evolution of binaries figured out by now. It turns out it’s actually pretty messy, so there’s lots of work to do. We consider constraining four parameters which describe the bits of binary physics that we are currently most uncertain of:

  • Black hole natal kicks—the push black holes receive when they are born in supernova explosions. We know that neutron stars get kicks, but we’re less certain for black holes [bonus note].
  • Common envelope efficiency—one of the most intricate bits of physics about binaries is how mass is transferred between stars. As they start exhausting their nuclear fuel they puff up, so material from the outer envelope of one star may be stripped onto the other. In the most extreme cases, a common envelope may form, where so much mass is piled onto the companion that both stars live in a single fluffy envelope. Orbiting inside the envelope helps drag the two stars closer together, bringing them closer to merging. The efficiency determines how quickly the envelope becomes unbound, ending this phase.
  • Mass loss rates during the Wolf–Rayet (not to be confused with Wolf 359) and luminous blue variable phases: stars lose mass throughout their lives, but we’re not sure how much. For stars like our Sun, mass loss is low: there is enough to give us the aurora, but it doesn’t affect the Sun much. For bigger and hotter stars, mass loss can be significant. We consider two evolutionary phases of massive stars where mass loss is high, and currently poorly known. Mass could be lost in clumps, rather than a smooth stream, making it difficult to measure or simulate.

We use parameters describing potential variations in these properties as ingredients for the COMPAS population synthesis code. This rapidly (albeit approximately) evolves a population of stellar binaries to calculate which will produce merging binary black holes.

The question now is: which parameters affect our gravitational-wave measurements, and how accurately can we measure those which do?

Merger rate with redshift and chirp mass

Binary black hole merger rate at three different redshifts z as calculated by COMPAS. We show the rate in 30 different chirp mass bins for our default population parameters. The caption gives the total rate for all masses. Figure 2 of Barrett et al. (2018)

Gravitational-wave observations

For our deductions, we use two pieces of information we will get from LIGO and Virgo observations: the total number of detections, and the distributions of chirp masses. The chirp mass is a combination of the two black hole masses that is often well measured—it is the most important quantity for controlling the inspiral, so it is well measured for low mass binaries which have a long inspiral, but is less well measured for higher mass systems. In reality we’ll have much more information, so these results should be the minimum we can actually do.

We consider the population after 1000 detections. That sounds like a lot, but we should have collected this many detections after just 2 or 3 years observing at design sensitivity. Our default COMPAS model predicts 484 detections per year of observing time! Honestly, I’m a little scared about having this many signals…

For a set of population parameters (black hole natal kick, common envelope efficiency, luminous blue variable mass loss and Wolf–Rayet mass loss), COMPAS predicts the number of detections and the fraction of detections as a function of chirp mass. Using these, we can work out the probability of getting the observed number of detections and fraction of detections within different chirp mass ranges. This is the likelihood function: if a given model is correct, we are more likely to get results close to its predictions than results far from them, although we expect there to be some scatter.

If you like equations, the form of our likelihood is explained in this bonus note. If you don’t like equations, there’s one lurking in the paragraph below. Just remember that it can’t see you if you don’t move. It’s OK to skip the equation.

To determine how sensitive we are to each of the population parameters, we see how the likelihood changes as we vary these. The more the likelihood changes, the easier it should be to measure that parameter. We wrap this up in terms of the Fisher information matrix. This is defined as

\displaystyle F_{ij} = -\left\langle\frac{\partial^2\ln \mathcal{L}(\mathcal{D}|\left\{\lambda\right\})}{\partial \lambda_i \partial\lambda_j}\right\rangle,

where \mathcal{L}(\mathcal{D}|\left\{\lambda\right\}) is the likelihood for data \mathcal{D} (the number of observations and their chirp mass distribution in our case), \left\{\lambda\right\} are our parameters (natal kick, etc.), and the angular brackets indicate the average over possible realisations of the data. In statistics terminology, this is the variance of the score, which I think sounds cool. The Fisher information matrix nicely quantifies how much information we can learn about the parameters, including the correlations between them (so we can explore degeneracies). The inverse of the Fisher information matrix gives a lower bound on the covariance matrix (the multidimensional generalisation of the variance in a normal distribution) for the parameters \left\{\lambda\right\}. In the limit of a large number of detections, we can use the Fisher information matrix to estimate the accuracy to which we measure the parameters [bonus note].
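To make the “variance of the score” interpretation concrete, here is a quick numerical sketch for the simplest ingredient of our problem, the Poisson-distributed number of detections. The rate of 484 per year is the default-model prediction quoted above; the two-year observing time and the sample size are just choices for illustration:

```python
import numpy as np

rng = np.random.default_rng(42)

mu, t_obs = 484.0, 2.0  # rate from the default model; a 2-year observation is assumed
n_samples = rng.poisson(mu * t_obs, size=200_000)

# Score: derivative of the Poisson log-likelihood ln L = -mu*t + N*ln(mu*t) - ln N!
# with respect to mu, which is -t + N/mu
score = -t_obs + n_samples / mu

# Fisher information is the variance of the score; analytically F = t_obs/mu
F_mc = score.var()
F_analytic = t_obs / mu
print(F_mc, F_analytic)  # the two agree closely
```

The Monte Carlo variance of the score matches the analytic t_\mathrm{obs}/\mu, as it should.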

We simulated several populations of binary black hole signals, and then calculated measurement uncertainties for our four population parameters to see what we could learn from these measurements.


Using just the rate information, we find that we can constrain a combination of the common envelope efficiency and the Wolf–Rayet mass loss rate. Increasing the common envelope efficiency ends the common envelope phase earlier, leaving the binary further apart. Wider binaries take longer to merge, so this reduces the merger rate. Similarly, increasing the Wolf–Rayet mass loss rate leads to wider binaries and smaller black holes, which take longer to merge through gravitational-wave emission. Since the two parameters have similar effects, they are anticorrelated. We can increase one and still get the same number of detections if we decrease the other. There’s a hint of a similar correlation between the common envelope efficiency and the luminous blue variable mass loss rate too, but it’s not quite significant enough for us to be certain it’s there.

Correlations between population parameters

Fisher information matrix estimates for fractional measurement precision of the four population parameters: the black hole natal kick \sigma_\mathrm{kick}, the common envelope efficiency \alpha_\mathrm{CE}, the Wolf–Rayet mass loss rate f_\mathrm{WR}, and the luminous blue variable mass loss rate f_\mathrm{LBV}. There is an anticorrelation between f_\mathrm{WR} and \alpha_\mathrm{CE}, and hints at a similar anticorrelation between f_\mathrm{LBV} and \alpha_\mathrm{CE}. We show 1500 different realisations of the binary population to give an idea of scatter. Figure 6 of Barrett et al. (2018)

Adding in the chirp mass distribution gives us more information, and improves our measurement accuracies. The fractional uncertainties are about 2% for the two mass loss rates and the common envelope efficiency, and about 5% for the black hole natal kick. We’re less sensitive to the natal kick because the most massive black holes don’t receive a kick, and so are unaffected by the kick distribution [bonus note]. In any case, these measurements are exciting! With this type of precision, we’ll really be able to learn something about the details of binary evolution.

Standard deviation of measurements of population parameters

Measurement precision for the four population parameters after 1000 detections. We quantify the precision with the standard deviation estimated from the Fisher information matrix. We show results from 1500 realisations of the population to give an idea of scatter. Figure 5 of Barrett et al. (2018)

The accuracy of our measurements will improve (on average) with the square root of the number of gravitational-wave detections. So we can expect 1% measurements after about 4000 observations. However, we might be able to get even more improvement by combining constraints from other types of observation. Combining different types of observation can help break degeneracies. I’m looking forward to building a concordance model of binary evolution, and figuring out exactly how massive stars live their lives.
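The square-root scaling is easy to play with. A tiny helper, just rearranging \sigma \propto N^{-1/2} (the numbers are the ones from the paragraph above):

```python
def detections_needed(sigma_now, n_now, sigma_target):
    # Uncertainty scales as N^(-1/2), so N_target = N_now * (sigma_now / sigma_target)^2
    return n_now * (sigma_now / sigma_target) ** 2

print(detections_needed(0.02, 1000, 0.01))  # 2% at 1000 detections -> 1% needs about 4000
```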

arXiv: 1711.06287 [astro-ph.HE]
Journal: Monthly Notices of the Royal Astronomical Society; 477(4):4685–4695; 2018
Favourite dinosaur: Professor Science

Bonus notes

Channel selection

In practice, we will need to worry about how binary black holes are formed, via isolated evolution or otherwise, before inferring the parameters describing binary evolution. This makes the problem more complicated. Some parameters, like mass loss rates or black hole natal kicks, might be common across multiple channels, while others are not. There are a number of ways we might be able to tell different formation mechanisms apart, such as by using spin measurements.

Kick distribution

We model the supernova kicks v_\mathrm{kick} as following a Maxwell–Boltzmann distribution,

\displaystyle p(v_\mathrm{kick}) = \sqrt{\frac{2}{\pi}}  \frac{v_\mathrm{kick}^2}{\sigma_\mathrm{kick}^3} \exp\left(\frac{-v_\mathrm{kick}^2}{2\sigma_\mathrm{kick}^2}\right),

where \sigma_\mathrm{kick} is the unknown population parameter. The natal kick received by the black hole v^*_\mathrm{kick} is not the same as this, however, as we assume some of the material ejected by the supernova falls back, reducing the overall kick. The final natal kick is

v^*_\mathrm{kick} = (1-f_\mathrm{fb})v_\mathrm{kick},

where f_\mathrm{fb} is the fraction that falls back, taken from Fryer et al. (2012). The fraction is greater for larger black holes, so the biggest black holes get no kicks. This means that the largest black holes are unaffected by the value of \sigma_\mathrm{kick}.
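Putting the two equations above together, here is a minimal sketch of drawing natal kicks. The 250 km/s scale and the fallback fraction are placeholder values for illustration, not numbers from the paper; the sampling trick uses the fact that a Maxwell–Boltzmann speed is the magnitude of an isotropic 3D Gaussian velocity:

```python
import numpy as np

rng = np.random.default_rng(0)
sigma_kick = 250.0  # km/s; a placeholder value for the population parameter

# Magnitude of an isotropic 3D Gaussian velocity reproduces p(v_kick) above
v_kick = np.linalg.norm(rng.normal(0.0, sigma_kick, size=(100_000, 3)), axis=1)

# Fallback reduces the kick; f_fb -> 1 for the heaviest black holes (no kick at all)
f_fb = 0.9  # hypothetical fallback fraction for a fairly massive black hole
v_natal = (1.0 - f_fb) * v_kick

# Mean of a Maxwell-Boltzmann distribution is 2*sigma*sqrt(2/pi)
print(v_kick.mean(), 2 * sigma_kick * np.sqrt(2 / np.pi))
```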

The likelihood

In this analysis, we have two pieces of information: the number of detections, and the chirp masses of the detections. The first is easy to summarise with a single number. The second is more complicated, and we consider the fraction of events within different chirp mass bins.

Our COMPAS model predicts the merger rate \mu and the probability of falling in each chirp mass bin p_k (we factor measurement uncertainty into this). Our observations are the total number of detections N_\mathrm{obs} and the number in each chirp mass bin c_k (N_\mathrm{obs} = \sum_k c_k). The likelihood is the probability of these observations given the model predictions. We can split the likelihood into two pieces, one for the rate, and one for the chirp mass distribution,

\mathcal{L} = \mathcal{L}_\mathrm{rate} \times \mathcal{L}_\mathrm{mass}.

For the rate likelihood, we need the probability of observing N_\mathrm{obs} given the predicted rate \mu. This is given by a Poisson distribution,

\displaystyle \mathcal{L}_\mathrm{rate} = \exp(-\mu t_\mathrm{obs}) \frac{(\mu t_\mathrm{obs})^{N_\mathrm{obs}}}{N_\mathrm{obs}!},

where t_\mathrm{obs} is the total observing time. For the chirp mass likelihood, we need the probability of getting a number of detections in each bin, given the predicted fractions. This is given by a multinomial distribution,

\displaystyle \mathcal{L}_\mathrm{mass} = \frac{N_\mathrm{obs}!}{\prod_k c_k!} \prod_k p_k^{c_k}.

These look a little messy, but they simplify when you take the logarithm, as we need to do for the Fisher information matrix.
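Taking that logarithm, the two pieces combine into something compact. Here is a sketch of the full log-likelihood; the bin probabilities and counts in the usage line are made up for illustration:

```python
import numpy as np
from math import lgamma

def log_likelihood(mu, p, t_obs, counts):
    # Rate term: Poisson probability of the total number of detections
    n_obs = counts.sum()
    log_l_rate = -mu * t_obs + n_obs * np.log(mu * t_obs) - lgamma(n_obs + 1)
    # Mass term: multinomial probability of the per-bin counts
    log_l_mass = (lgamma(n_obs + 1)
                  - sum(lgamma(c + 1) for c in counts)
                  + (counts * np.log(p)).sum())
    return log_l_rate + log_l_mass

p = np.array([0.6, 0.3, 0.1])       # hypothetical chirp-mass bin probabilities
counts = np.array([590, 310, 100])  # hypothetical observed counts per bin
print(log_likelihood(500.0, p, 2.0, counts))
```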

When we substitute in our likelihood into the expression for the Fisher information matrix, we get

\displaystyle F_{ij} = \mu t_\mathrm{obs} \left[ \frac{1}{\mu^2} \frac{\partial \mu}{\partial \lambda_i} \frac{\partial \mu}{\partial \lambda_j}  + \sum_k\frac{1}{p_k} \frac{\partial p_k}{\partial \lambda_i} \frac{\partial p_k}{\partial \lambda_j} \right].
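Since a population synthesis code only gives \mu and p_k numerically, the derivatives in this expression would be evaluated by finite differences in practice. Here is a rough sketch, with a toy model standing in for COMPAS; everything about the toy model is invented, and only the Fisher matrix formula comes from the expression above:

```python
import numpy as np

def fisher_matrix(predict, lam0, t_obs, eps=1e-6):
    """Fisher matrix via first-order finite differences.
    predict(lam) -> (mu, p): detection rate and chirp-mass bin probabilities."""
    mu0, p0 = predict(lam0)
    n = len(lam0)
    dmu = np.zeros(n)
    dp = np.zeros((n, len(p0)))
    for i in range(n):
        lam = lam0.copy()
        lam[i] += eps
        mu_i, p_i = predict(lam)
        dmu[i] = (mu_i - mu0) / eps
        dp[i] = (p_i - p0) / eps
    # The bracketed expression above: rate term plus chirp-mass term
    F = mu0 * t_obs * (np.outer(dmu, dmu) / mu0**2
                       + sum(np.outer(dp[:, k], dp[:, k]) / p0[k]
                             for k in range(len(p0))))
    return F

def toy_predict(lam):
    # Invented stand-in for COMPAS: both parameters suppress the merger rate,
    # and the second one also shifts the chirp-mass distribution
    alpha, f_wr = lam
    mu = 500.0 / (alpha * f_wr)
    p = np.array([f_wr / (1.0 + f_wr), 1.0 / (1.0 + f_wr)])
    return mu, p

F = fisher_matrix(toy_predict, np.array([1.0, 1.0]), 2.0)
cov = np.linalg.inv(F)
print(np.sqrt(np.diag(cov)))  # rough 1-sigma uncertainties on the two parameters
print(cov[0, 1])              # negative: the two parameters are anticorrelated
```

In this toy both parameters push the rate the same way, so their estimates come out anticorrelated, the same flavour of degeneracy as the \alpha_\mathrm{CE}–f_\mathrm{WR} one discussed above.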

Conveniently, we only need to evaluate first-order derivatives, even though the Fisher information matrix is defined in terms of second derivatives. The expected number of events is \langle N_\mathrm{obs} \rangle = \mu t_\mathrm{obs}. Therefore, we can see that the measurement uncertainty, defined by the inverse of the Fisher information matrix, scales on average as N_\mathrm{obs}^{-1/2}.

For anyone worrying about using the likelihood rather than the posterior for these estimates, the high number of detections [bonus note] should mean that the information we’ve gained from the data overwhelms our prior, meaning that the shape of the posterior is dictated by the shape of the likelihood.

Interpretation of the Fisher information matrix

As an alternative way of looking at the Fisher information matrix, we can consider the shape of the likelihood close to its peak. Around the maximum likelihood point, the first-order derivatives of the likelihood with respect to the population parameters are zero (otherwise it wouldn’t be the maximum). The maximum likelihood values of N_\mathrm{obs} = \mu t_\mathrm{obs} and c_k = N_\mathrm{obs} p_k are the same as their expectation values. The second-order derivatives are given by the expression we have worked out for the Fisher information matrix. Therefore, in the region around the maximum likelihood point, the Fisher information matrix encodes all the relevant information about the shape of the likelihood.

So long as we are working close to the maximum likelihood point, we can approximate the distribution as a multidimensional normal distribution with its covariance matrix determined by the inverse of the Fisher information matrix. Our results for the measurement uncertainties are made subject to this approximation (which we did check was OK).

Approximating the likelihood this way should be safe in the limit of large N_\mathrm{obs}. As we get more detections, statistical uncertainties should reduce, with the peak of the distribution homing in on the maximum likelihood value, and its width narrowing. If you take the limit of N_\mathrm{obs} \rightarrow \infty, you’ll see that the distribution basically becomes a delta function at the maximum likelihood values. To check that our N_\mathrm{obs} = 1000 was large enough, we verified that higher-order derivatives were still small.

Michele Vallisneri has a good paper looking at using the Fisher information matrix for gravitational wave parameter estimation (rather than our problem of binary population synthesis). There is a good discussion of its range of validity. The high signal-to-noise ratio limit for gravitational wave signals corresponds to our high number of detections limit.



The missing link for black holes

There has been some recent excitement about the claimed identification of a 400-solar-mass black hole. A team of scientists have recently published a letter in the journal Nature where they show how X-ray measurements of a source in the nearby galaxy M82 can be interpreted as originating from a black hole with a mass of around 400 times the mass of the Sun—from now on I’ll use M_\odot as shorthand for the mass of the Sun (one solar mass). This particular X-ray source is peculiarly bright and has long been suspected to be a black hole with a mass around 100 M_\odot to 1000 M_\odot. If the result is confirmed, then it is the first definite detection of an intermediate-mass black hole, or IMBH for short. But why is this exciting?

Mass of black holes

In principle, a black hole can have any mass. To form a black hole you just need to squeeze mass down into a small enough space. For something the mass of the Earth, you need to squeeze down to a radius of about 9 mm, and for something about the mass of the Sun, you need to squeeze to a radius of about 3 km. Black holes are pretty small! Most of the time, things don’t collapse to form black holes because the materials they are made of are more than strong enough to counter-balance their own gravity.
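Those radii come straight from the Schwarzschild radius, r_\mathrm{s} = 2GM/c^2. A quick check (standard constants, nothing specific to the letter being discussed):

```python
# Schwarzschild radius r_s = 2*G*M/c^2: squeeze a mass inside this radius
# and it becomes a black hole
G = 6.674e-11  # gravitational constant, m^3 kg^-1 s^-2
c = 2.998e8    # speed of light, m/s

def schwarzschild_radius(mass_kg):
    return 2 * G * mass_kg / c**2

print(schwarzschild_radius(5.972e24))  # Earth's mass: about 9 mm
print(schwarzschild_radius(1.989e30))  # Sun's mass: about 3 km
```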


These innocent-looking marshmallows could collapse down to form black holes if they were squeezed down to a size of about 10^{-29} m. The only thing stopping this is the incredible strength of marshmallow when compared to gravity.

Stellar-mass black holes

Only very massive things, where gravitational forces are immense, collapse down to black holes. This happens when the most massive stars reach the end of their lifetimes. Stars are kept puffy because they are hot. They are made of plasma where all their constituent particles are happily whizzing around and bouncing into each other. This can continue to happen while the star is undergoing nuclear fusion, which provides the energy to keep things hot. At some point this fuel runs out, and then the core of the star collapses. What happens next depends on the mass of the core.

The least massive stars (like our own Sun) will collapse down to become white dwarfs. In white dwarfs, the force of gravity is balanced by electrons. Electrons are rather anti-social and dislike sharing the same space with each other (a concept known as the Pauli exclusion principle, which is a consequence of their exchange symmetry), hence they put up a bit of a fight when squeezed together. The electrons can balance the gravitational force for masses up to about 1.4 M_\odot, known as the Chandrasekhar mass.

After that they get squeezed together with protons and we are left with a neutron star. Neutron stars are much like giant atomic nuclei. The force of gravity is now balanced by the neutrons who, like electrons, don’t like to share space, but are less easy to bully than the electrons. The maximum mass of a neutron star is not exactly known, but we think it’s somewhere between 2 M_\odot and 3 M_\odot. After this, nothing can resist gravity and you end up with a black hole of a few times the mass of the Sun.

Collapsing stars produce the imaginatively named stellar-mass black holes, as they are about the same mass as stars. Stars lose a lot of mass during their lifetime, so the mass of a newly born black hole is less than the original mass of the star that formed it. The maximum mass of stellar-mass black holes is determined by the maximum size of stars. We have good evidence for stellar-mass black holes, for example from looking at X-ray binaries, where we see a hot disc of material swirling around the black hole.

Massive black holes

We also have evidence for another class of black holes: massive black holes, MBHs to their friends, or, if trying to sound extra cool, supermassive black holes. These may be 10^5 M_\odot to 10^9 M_\odot. The strongest evidence comes from our own galaxy, where we can see stars in the centre of the galaxy orbiting something so small and heavy it can only be a black hole.

We think that there is an MBH at the centre of pretty much every galaxy, like there’s a hazelnut at the centre of a Ferrero Rocher (in this analogy, I guess the Nutella could be delicious dark matter). From the masses we’ve measured, the properties of these black holes are correlated with the properties of their surrounding galaxies: bigger galaxies have bigger MBHs. The most famous of these correlations is the M–sigma relation, between the mass of the black hole (M) and the velocity dispersion, the range of orbital speeds, of stars surrounding it (the Greek letter sigma, \sigma). These correlations tell us that the evolution of the galaxy and its central black hole are linked somehow; this could be just because of their shared history, or through some extra feedback too.

MBHs can grow by accreting matter (swallowing up clouds of gas or stars that stray too close) or by merging with other MBHs (we know galaxies merge). The rather embarrassing problem, however, is that we don’t know what the MBHs have grown from. There are really huge MBHs already present in the early Universe (they power quasars), so MBHs must be able to grow quickly. Did they grow from regular stellar-mass black holes or some form of super black hole that formed from a giant star that doesn’t exist today? Did lots of stellar-mass black holes collide to form a seed, or did material just accrete quickly? Did the initial black holes come from somewhere other than stars, perhaps as leftovers from the Big Bang? We don’t have the data to tell where MBHs came from yet (gravitational waves could be useful for this).

Intermediate-mass black holes

However MBHs grew, it is generally agreed that we should be able to find some intermediate-mass black holes: black holes which haven’t yet grown enough to become MBHs. These might be found in dwarf galaxies, or maybe in globular clusters (giant collections of stars that formed together), perhaps even in the centre of galaxies orbiting an MBH. Finding some IMBHs will hopefully tell us about how MBHs formed (and so, possibly, about how galaxies formed too).

IMBHs have proved elusive. They are difficult to spot compared to their bigger brothers and sisters. Not finding any might mean we’d need to rethink our ideas of how MBHs formed, and try to find a way for them to either be born about a million times the mass of the Sun, or be guaranteed to grow that big. The finding of the first IMBH tells us that things are more like common sense would dictate: black holes can come in the expected range of masses (phew!). We now need to identify some more to learn about their properties as a population.

In conclusion, black holes can come in a range of masses. We know about the smaller stellar-mass ones and the bigger massive black holes. We suspect that the bigger ones grow from smaller ones, and we now have some evidence for the existence of the hypothesised intermediate-mass black holes. Whatever their size though, black holes are awesome, and they shouldn’t worry about their weight.