# Accuracy of inference on the physics of binary evolution from gravitational-wave observations

Gravitational-wave astronomy lets us observe binary black holes. These systems, being made up of two black holes, are pretty difficult to study by any other means. It has long been argued that with this new information we can unravel the mysteries of stellar evolution. Just as a palaeontologist can discover how long-dead animals lived from their bones, we can discover how massive stars lived by studying their black hole remnants. In this paper, we quantify how much we can really learn from this black hole palaeontology: after 1000 detections, we should pin down some of the most uncertain parameters in binary evolution to a few percent precision.

### Life as a binary

There are many proposed ways of making a binary black hole. The current leading contender is isolated binary evolution: start with a binary star system (most stars are in binaries or higher multiples, our lonesome Sun is a little unusual), and let the stars evolve together. Only a fraction will end with black holes close enough to merge within the age of the Universe, but these would be the sources of the signals we see with LIGO and Virgo. We consider this isolated binary scenario in this work [bonus note].

Now, you might think that with stars being so fundamentally important to astronomy, and with binary stars being so common, we’d have the evolution of binaries figured out by now. It turns out it’s actually pretty messy, so there’s lots of work to do. We consider constraining four parameters which describe the bits of binary physics which we are currently most uncertain of:

• Black hole natal kicks: the push black holes receive when they are born in supernova explosions. We know that neutron stars get kicks, but we’re less certain for black holes [bonus note].
• Common envelope efficiency: one of the most intricate bits of physics about binaries is how mass is transferred between stars. As they start exhausting their nuclear fuel they puff up, so material from the outer envelope of one star may be stripped onto the other. In the most extreme cases, a common envelope may form, where so much mass is piled onto the companion that both stars live in a single fluffy envelope. Orbiting inside the envelope helps drag the two stars closer together, bringing them closer to merging. The efficiency determines how quickly the envelope becomes unbound, ending this phase.
• Mass loss rates during the Wolf–Rayet (not to be confused with Wolf 359) and luminous blue variable phases: stars lose mass throughout their lives, but we’re not sure how much. For stars like our Sun, mass loss is low: there is enough to give us the aurora, but it doesn’t affect the Sun much. For bigger and hotter stars, mass loss can be significant. We consider two evolutionary phases of massive stars where mass loss is high, and currently poorly known. Mass could be lost in clumps, rather than a smooth stream, making it difficult to measure or simulate.

We use parameters describing potential variations in these properties as ingredients to the COMPAS population synthesis code. This rapidly (albeit approximately) evolves a population of stellar binaries to calculate which will produce merging binary black holes.

The question is now which parameters affect our gravitational-wave measurements, and how accurately we can measure those which do?

Binary black hole merger rate at three different redshifts $z$ as calculated by COMPAS. We show the rate in 30 different chirp mass bins for our default population parameters. The caption gives the total rate for all masses. Figure 2 of Barrett et al. (2018)

### Gravitational-wave observations

For our deductions, we use two pieces of information we will get from LIGO and Virgo observations: the total number of detections, and the distributions of chirp masses. The chirp mass is a combination of the two black hole masses that is often well measured—it is the most important quantity for controlling the inspiral, so it is well measured for low mass binaries which have a long inspiral, but is less well measured for higher mass systems. In reality we’ll have much more information, so these results should be the minimum we can actually do.
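For reference, the chirp mass has a standard closed form, $\mathcal{M} = (m_1 m_2)^{3/5}/(m_1 + m_2)^{1/5}$. A minimal Python sketch of this (the function name and the example masses are purely illustrative, not taken from the paper):

```python
def chirp_mass(m1, m2):
    """Chirp mass of a binary with component masses m1 and m2 (same units)."""
    return (m1 * m2) ** 0.6 / (m1 + m2) ** 0.2


# Illustrative example: two 30 solar-mass black holes
print(chirp_mass(30.0, 30.0))
```

For an equal-mass binary this reduces to $m / 2^{1/5}$, so the chirp mass is a little smaller than either component mass.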

We consider the population after 1000 detections. That sounds like a lot, but we should have collected this many detections after just 2 or 3 years observing at design sensitivity. Our default COMPAS model predicts 484 detections per year of observing time! Honestly, I’m a little scared about having this many signals…

For a set of population parameters (black hole natal kick, common envelope efficiency, luminous blue variable mass loss and Wolf–Rayet mass loss), COMPAS predicts the number of detections and the fraction of detections as a function of chirp mass. Using these, we can work out the probability of getting the observed number of detections and fraction of detections within different chirp mass ranges. This is the likelihood function: if a given model is correct we are more likely to get results similar to its predictions than further away, although we expect there to be some scatter.

If you like equations, the form of our likelihood is explained in this bonus note. If you don’t like equations, there’s one lurking in the paragraph below. Just remember that it can’t see you if you don’t move. It’s OK to skip the equation.

To determine how sensitive we are to each of the population parameters, we see how the likelihood changes as we vary these. The more the likelihood changes, the easier it should be to measure that parameter. We wrap this up in terms of the Fisher information matrix. This is defined as

$\displaystyle F_{ij} = -\left\langle\frac{\partial^2\ln \mathcal{L}(\mathcal{D}|\left\{\lambda\right\})}{\partial \lambda_i \partial\lambda_j}\right\rangle$,

where $\mathcal{L}(\mathcal{D}|\left\{\lambda\right\})$ is the likelihood for data $\mathcal{D}$ (the number of observations and their chirp mass distribution in our case), $\left\{\lambda\right\}$ are our parameters (natal kick, etc.), and the angular brackets indicate the average over possible realisations of the data for a given set of population parameters. In statistics terminology, this is the variance of the score, which I think sounds cool. The Fisher information matrix nicely quantifies how much information we can learn about the parameters, including the correlations between them (so we can explore degeneracies). The inverse of the Fisher information matrix gives a lower bound on the covariance matrix (the multidimensional generalisation of the variance in a normal distribution) for the parameters $\left\{\lambda\right\}$. In the limit of a large number of detections, we can use the Fisher information matrix to estimate the accuracy to which we measure the parameters [bonus note].
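As a rough illustration of how a Fisher matrix turns into measurement uncertainties and correlations, here is a small NumPy sketch. The 2×2 matrix is made up for the example and the function name is hypothetical; neither comes from the paper.

```python
import numpy as np


def uncertainties_from_fisher(fisher):
    """Return (standard deviations, correlation matrix) from a Fisher matrix."""
    cov = np.linalg.inv(fisher)            # Cramér–Rao lower bound on the covariance
    sigma = np.sqrt(np.diag(cov))          # one-sigma uncertainty on each parameter
    corr = cov / np.outer(sigma, sigma)    # normalise to get correlation coefficients
    return sigma, corr


# Made-up Fisher matrix with a positive off-diagonal term
fisher = np.array([[4.0, 2.0],
                   [2.0, 3.0]])
sigma, corr = uncertainties_from_fisher(fisher)
print(sigma)
print(corr[0, 1])
```

In this toy case the positive off-diagonal Fisher term (two parameters that push the observables the same way) produces a negative correlation coefficient, which is the kind of degeneracy the Fisher matrix lets us explore.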

We simulated several populations of binary black hole signals, and then calculated measurement uncertainties for our four population parameters to see what we could learn from these measurements.

### Results

Using just the rate information, we find that we can constrain a combination of the common envelope efficiency and the Wolf–Rayet mass loss rate. Increasing the common envelope efficiency ends the common envelope phase earlier, leaving the binary further apart. Wider binaries take longer to merge, so this reduces the merger rate. Similarly, increasing the Wolf–Rayet mass loss rate leads to wider binaries and smaller black holes, which take longer to merge through gravitational-wave emission. Since the two parameters have similar effects, they are anticorrelated. We can increase one and still get the same number of detections if we decrease the other. There’s a hint of a similar correlation between the common envelope efficiency and the luminous blue variable mass loss rate too, but it’s not quite significant enough for us to be certain it’s there.

Fisher information matrix estimates for fractional measurement precision of the four population parameters: the black hole natal kick $\sigma_\mathrm{kick}$, the common envelope efficiency $\alpha_\mathrm{CE}$, the Wolf–Rayet mass loss rate $f_\mathrm{WR}$, and the luminous blue variable mass loss rate $f_\mathrm{LBV}$. There is an anticorrelation between $f_\mathrm{WR}$ and $\alpha_\mathrm{CE}$, and hints of a similar anticorrelation between $f_\mathrm{LBV}$ and $\alpha_\mathrm{CE}$. We show 1500 different realisations of the binary population to give an idea of scatter. Figure 6 of Barrett et al. (2018)

Adding in the chirp mass distribution gives us more information, and improves our measurement accuracies. The fractional uncertainties are about 2% for the two mass loss rates and the common envelope efficiency, and about 5% for the black hole natal kick. We’re less sensitive to the natal kick because the most massive black holes don’t receive a kick, and so are unaffected by the kick distribution [bonus note]. In any case, these measurements are exciting! With this type of precision, we’ll really be able to learn something about the details of binary evolution.

Measurement precision for the four population parameters after 1000 detections. We quantify the precision with the standard deviation estimated from the Fisher information matrix. We show results from 1500 realisations of the population to give an idea of scatter. Figure 5 of Barrett et al. (2018)

The accuracy of our measurements will improve (on average) with the square root of the number of gravitational-wave detections. So we can expect 1% measurements after about 4000 observations. However, we might be able to get even more improvement by combining constraints from other types of observation. Combining different types of observation can help break degeneracies. I’m looking forward to building a concordance model of binary evolution, and figuring out exactly how massive stars live their lives.
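The square-root scaling makes it easy to estimate how many detections a target precision would need. A small sketch, assuming the uncertainty scales exactly as $N^{-1/2}$ (the function name is illustrative):

```python
def detections_needed(sigma_ref, n_ref, sigma_target):
    """Detections needed to reach sigma_target, assuming sigma scales as N**-0.5."""
    return n_ref * (sigma_ref / sigma_target) ** 2


# Going from 2% precision at 1000 detections to 1% precision
print(detections_needed(0.02, 1000, 0.01))
```

Halving the uncertainty costs four times as many detections, which recovers the figure of about 4000 quoted above.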

arXiv: 1711.06287 [astro-ph.HE]
Journal: Monthly Notices of the Royal Astronomical Society; 477(4):4685–4695; 2018
Favourite dinosaur: Professor Science

### Bonus notes

#### Channel selection

In practice, we will need to worry about how binary black holes are formed, via isolated evolution or otherwise, before inferring the parameters describing binary evolution. This makes the problem more complicated. Some parameters, like mass loss rates or black hole natal kicks, might be common across multiple channels, while others are not. There are a number of ways we might be able to tell different formation mechanisms apart, such as by using spin measurements.

#### Kick distribution

We model the supernova kicks $v_\mathrm{kick}$ as following a Maxwell–Boltzmann distribution,

$\displaystyle p(v_\mathrm{kick}) = \sqrt{\frac{2}{\pi}} \frac{v_\mathrm{kick}^2}{\sigma_\mathrm{kick}^3} \exp\left(\frac{-v_\mathrm{kick}^2}{2\sigma_\mathrm{kick}^2}\right)$,

where $\sigma_\mathrm{kick}$ is the unknown population parameter. The natal kick received by the black hole $v^*_\mathrm{kick}$ is not the same as this, however, as we assume some of the material ejected by the supernova falls back, reducing the overall kick. The final natal kick is

$v^*_\mathrm{kick} = (1-f_\mathrm{fb})v_\mathrm{kick}$,

where $f_\mathrm{fb}$ is the fraction that falls back, taken from Fryer et al. (2012). The fraction is greater for larger black holes, so the biggest black holes get no kicks. This means that the largest black holes are unaffected by the value of $\sigma_\mathrm{kick}$.
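A quick way to get a feel for this kick model is to sample from it. The sketch below draws Maxwell–Boltzmann speeds as the length of a vector of three independent Gaussian velocity components, then applies the fallback reduction. The fallback fractions and the value of $\sigma_\mathrm{kick}$ are placeholder numbers for illustration only; they are not the Fryer et al. (2012) prescription, and this is not the COMPAS implementation.

```python
import numpy as np


def sample_natal_kicks(sigma_kick, f_fb, rng):
    """Draw Maxwell-Boltzmann kick speeds and reduce them by the fallback fraction."""
    f_fb = np.asarray(f_fb, dtype=float)
    # The speed is the norm of three independent Gaussian velocity components
    components = rng.normal(0.0, sigma_kick, size=(f_fb.size, 3))
    v_kick = np.linalg.norm(components, axis=1)
    return (1.0 - f_fb) * v_kick


rng = np.random.default_rng(0)
f_fb = np.array([0.0, 0.5, 1.0])   # hypothetical fallback fractions
kicks = sample_natal_kicks(250.0, f_fb, rng)   # sigma_kick in km/s (illustrative)
print(kicks)
```

The last entry, with complete fallback, always comes out as zero whatever $\sigma_\mathrm{kick}$ is, which is why the heaviest black holes carry no information about the kick distribution.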

#### The likelihood

In this analysis, we have two pieces of information: the number of detections, and the chirp masses of the detections. The first is easy to summarise with a single number. The second is more complicated, and we consider the fraction of events within different chirp mass bins.

Our COMPAS model predicts the detection rate $\mu$ and the probability of falling in each chirp mass bin $p_k$ (we factor measurement uncertainty into this). Our observations are the total number of detections $N_\mathrm{obs}$ and the number in each chirp mass bin $c_k$ ($N_\mathrm{obs} = \sum_k c_k$). The likelihood is the probability of these observations given the model predictions. We can split the likelihood into two pieces, one for the rate, and one for the chirp mass distribution,

$\mathcal{L} = \mathcal{L}_\mathrm{rate} \times \mathcal{L}_\mathrm{mass}$.

For the rate likelihood, we need the probability of observing $N_\mathrm{obs}$ given the predicted rate $\mu$. This is given by a Poisson distribution,

$\displaystyle \mathcal{L}_\mathrm{rate} = \exp(-\mu t_\mathrm{obs}) \frac{(\mu t_\mathrm{obs})^{N_\mathrm{obs}}}{N_\mathrm{obs}!}$,

where $t_\mathrm{obs}$ is the total observing time. For the chirp mass likelihood, we need the probability of getting a number of detections in each bin, given the predicted fractions. This is given by a multinomial distribution,

$\displaystyle \mathcal{L}_\mathrm{mass} = \frac{N_\mathrm{obs}!}{\prod_k c_k!} \prod_k p_k^{c_k}$.

These look a little messy, but they simplify when you take the logarithm, as we need to do for the Fisher information matrix.
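To make the logarithm concrete, here is a minimal sketch of the log-likelihood built from the Poisson and multinomial pieces above, using only the Python standard library. The function and argument names are my own, and it assumes $p_k > 0$ in every bin that contains detections.

```python
import math


def log_likelihood(mu, p, counts, t_obs):
    """ln L = ln L_rate (Poisson) + ln L_mass (multinomial)."""
    n_obs = sum(counts)
    expected = mu * t_obs
    # Poisson term: exp(-mu t) (mu t)^N / N!
    ln_rate = -expected + n_obs * math.log(expected) - math.lgamma(n_obs + 1)
    # Multinomial term: N! / prod(c_k!) * prod(p_k^c_k)
    ln_mass = math.lgamma(n_obs + 1) - sum(math.lgamma(c + 1) for c in counts)
    ln_mass += sum(c * math.log(pk) for c, pk in zip(counts, p) if c > 0)
    return ln_rate + ln_mass


# Toy example: rate of 2 per unit time, two equally likely bins, one event in each
print(log_likelihood(2.0, [0.5, 0.5], [1, 1], 1.0))
```

Using `math.lgamma` for the factorials keeps the calculation stable for large counts. Note that the $N_\mathrm{obs}!$ terms from the two pieces cancel in the sum.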

When we substitute in our likelihood into the expression for the Fisher information matrix, we get

$\displaystyle F_{ij} = \mu t_\mathrm{obs} \left[ \frac{1}{\mu^2} \frac{\partial \mu}{\partial \lambda_i} \frac{\partial \mu}{\partial \lambda_j} + \sum_k\frac{1}{p_k} \frac{\partial p_k}{\partial \lambda_i} \frac{\partial p_k}{\partial \lambda_j} \right]$.

Conveniently, we only need to evaluate first-order derivatives, even though the Fisher information matrix is defined in terms of second derivatives. The expected number of events is $\langle N_\mathrm{obs} \rangle = \mu t_\mathrm{obs}$. Therefore, we can see that the measurement uncertainty, defined by the inverse of the Fisher information matrix, scales on average as $N_\mathrm{obs}^{-1/2}$.
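The expression for $F_{ij}$ translates almost directly into code. The sketch below evaluates it with NumPy for a toy model with two parameters and three chirp mass bins. All the numbers are invented for illustration (in practice the derivatives would come from population synthesis runs), and the function name is hypothetical.

```python
import numpy as np


def fisher_matrix(mu, dmu, p, dp, t_obs):
    """F_ij = mu t_obs [ dmu_i dmu_j / mu^2 + sum_k dp_ki dp_kj / p_k ].

    dmu has shape (n_params,); dp has shape (n_bins, n_params).
    """
    dmu = np.asarray(dmu, dtype=float)
    p = np.asarray(p, dtype=float)
    dp = np.asarray(dp, dtype=float)
    rate_term = np.outer(dmu, dmu) / mu**2
    mass_term = dp.T @ (dp / p[:, None])
    return mu * t_obs * (rate_term + mass_term)


# Invented toy inputs: each column of dp sums to zero because sum_k p_k = 1
mu, dmu = 484.0, [100.0, -50.0]
p = [0.2, 0.5, 0.3]
dp = [[0.10, 0.00],
      [-0.10, 0.05],
      [0.00, -0.05]]

sigma_1yr = np.sqrt(np.diag(np.linalg.inv(fisher_matrix(mu, dmu, p, dp, 1.0))))
sigma_4yr = np.sqrt(np.diag(np.linalg.inv(fisher_matrix(mu, dmu, p, dp, 4.0))))
print(sigma_4yr / sigma_1yr)
```

Quadrupling the observing time (and so the expected number of detections) halves both uncertainties, which is the $N_\mathrm{obs}^{-1/2}$ scaling in action.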

For anyone worrying about using the likelihood rather than the posterior for these estimates, the high number of detections [bonus note] should mean that the information we’ve gained from the data overwhelms our prior, meaning that the shape of the posterior is dictated by the shape of the likelihood.

#### Interpretation of the Fisher information matrix

As an alternative way of looking at the Fisher information matrix, we can consider the shape of the likelihood close to its peak. Around the maximum likelihood point, the first-order derivatives of the likelihood with respect to the population parameters are zero (otherwise it wouldn’t be the maximum). The maximum likelihood values of $N_\mathrm{obs} = \mu t_\mathrm{obs}$ and $c_k = N_\mathrm{obs} p_k$ are the same as their expectation values. The second-order derivatives are given by the expression we have worked out for the Fisher information matrix. Therefore, in the region around the maximum likelihood point, the Fisher information matrix encodes all the relevant information about the shape of the likelihood.
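For anyone who wants to see the algebra, here is a sketch of where the second derivatives come from, starting from the Poisson and multinomial likelihoods above. I write $\partial_i \equiv \partial/\partial\lambda_i$ as shorthand (my notation, not the paper’s). Up to terms that do not depend on the parameters,

$\displaystyle \ln \mathcal{L} = -\mu t_\mathrm{obs} + N_\mathrm{obs} \ln(\mu t_\mathrm{obs}) + \sum_k c_k \ln p_k + \mathrm{const}$,

so that

$\displaystyle \partial_i \partial_j \ln \mathcal{L} = -t_\mathrm{obs}\, \partial_i \partial_j \mu + N_\mathrm{obs} \left[ \frac{\partial_i \partial_j \mu}{\mu} - \frac{\partial_i \mu\, \partial_j \mu}{\mu^2} \right] + \sum_k c_k \left[ \frac{\partial_i \partial_j p_k}{p_k} - \frac{\partial_i p_k\, \partial_j p_k}{p_k^2} \right]$.

Substituting $N_\mathrm{obs} = \mu t_\mathrm{obs}$ and $c_k = \mu t_\mathrm{obs} p_k$, the second derivatives of $\mu$ cancel, and the second derivatives of $p_k$ sum to zero because $\sum_k p_k = 1$. What remains is minus the Fisher information matrix, built only from first derivatives.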

So long as we are working close to the maximum likelihood point, we can approximate the distribution as a multidimensional normal distribution with its covariance matrix determined by the inverse of the Fisher information matrix. Our results for the measurement uncertainties are made subject to this approximation (which we did check was OK).

Approximating the likelihood this way should be safe in the limit of large $N_\mathrm{obs}$. As we get more detections, statistical uncertainties should reduce, with the peak of the distribution homing in on the maximum likelihood value, and its width narrowing. If you take the limit of $N_\mathrm{obs} \rightarrow \infty$, you’ll see that the distribution basically becomes a delta function at the maximum likelihood values. To check that our $N_\mathrm{obs} = 1000$ was large enough, we verified that higher-order derivatives were still small.

Michele Vallisneri has a good paper looking at using the Fisher information matrix for gravitational wave parameter estimation (rather than our problem of binary population synthesis). There is a good discussion of its range of validity. The high signal-to-noise ratio limit for gravitational wave signals corresponds to our high number of detections limit.

# Science with the space-based interferometer LISA. V. Extreme mass-ratio inspirals

The space-based observatory LISA will detect gravitational waves from massive black holes (giant black holes residing in the centres of galaxies). One particularly interesting signal will come from the inspiral of a regular stellar-mass black hole into a massive black hole. These are called extreme mass-ratio inspirals (or EMRIs, pronounced emries, to their friends) [bonus note]. We have never observed such a system. This means that there’s a lot we have to learn about them. In this work, we systematically investigated the prospects for observing EMRIs. We found that even though there’s a wide range in predictions for what EMRIs we will detect, they should be a safe bet for the LISA mission.

Artistic impression of the spacetime for an extreme-mass-ratio inspiral, with a smaller stellar-mass black hole orbiting a massive black hole. This image is mandatory when talking about extreme-mass-ratio inspirals. Credit: NASA

### LISA & EMRIs

My previous post discussed some of the interesting features of EMRIs. Because of the extreme difference in masses of the two black holes, it takes a long time for them to complete their inspiral. We can measure tens of thousands of orbits, which allows us to make wonderfully precise measurements of the source properties (if we can accurately pick out the signal from the data). Here, we’ll examine exactly what we could learn with LISA from EMRIs [bonus note].

First we build a model to investigate how many EMRIs there could be.  There is a lot of astrophysics which we are currently uncertain about, which leads to a large spread in estimates for the number of EMRIs. Second, we look at how precisely we could measure properties from the EMRI signals. The astrophysical uncertainties are less important here—we could get a revolutionary insight into the lives of massive black holes.

### The number of EMRIs

To build a model of how many EMRIs there are, we need a few different inputs:

1. The population of massive black holes
2. The distribution of stellar clusters around massive black holes
3. The range of orbits of EMRIs

We examine each of these in turn, building a more detailed model than has previously been constructed for EMRIs.

We currently know little about the population of massive black holes. This means we’ll discover lots when we start measuring signals (yay), but it’s rather inconvenient now, when we’re trying to predict how many EMRIs there are (boo). We take two different models for the mass distribution of massive black holes. One is based upon a semi-analytic model of massive black hole formation, the other is at the pessimistic end allowed by current observations. The semi-analytic model predicts massive black hole spins around 0.98, but we also consider spins being uniformly distributed between 0 and 1, and spins of 0. This gives us a picture of the bigger black hole, now we need the smaller.

Observations show that the masses of massive black holes are correlated with their surrounding cluster of stars—bigger black holes have bigger clusters. We consider four different versions of this trend: Gültekin et al. (2009); Kormendy & Ho (2013); Graham & Scott (2013), and Shankar et al. (2016). The stars and black holes about a massive black hole should form a cusp, with the density of objects increasing towards the massive black hole. This is great for EMRI formation. However, the cusp is disrupted if two galaxies (and their massive black holes) merge. This tends to happen—it’s how we get bigger galaxies (and black holes). It then takes some time for the cusp to reform, during which time, we don’t expect as many EMRIs. Therefore, we factor in the amount of time for which there is a cusp for massive black holes of different masses and spins.

That’s a nice galaxy you have there. It would be a shame if it were to collide with something… Hubble image of The Mice. Credit: ACS Science & Engineering Team.

Given a cusp about a massive black hole, we then need to know how often an EMRI forms. Simulations give us a starting point. However, these only consider a snap-shot, and we need to consider how things evolve with time. As stellar-mass black holes inspiral, the massive black hole will grow in mass and the surrounding cluster will become depleted. Both these effects are amplified because for each inspiral, there’ll be many more stars or stellar-mass black holes which will just plunge directly into the massive black hole. We therefore need to limit the number of EMRIs so that we don’t have an unrealistically high rate. We do this by adding in a couple of feedback factors, one to cap the rate so that we don’t deplete the cusp quicker than new objects will be added to it, and one to limit the maximum amount of mass the massive black hole can grow from inspirals and plunges. This gives us an idea for the total number of inspirals.

Finally, we calculate the orbits that EMRIs will be on. We again base this upon simulations, and factor in how the spin of the massive black hole affects the distribution of orbital inclinations.

Putting all the pieces together, we can calculate the population of EMRIs. We now need to work out how many LISA would be able to detect. This means we need models for the gravitational-wave signal. Since we are simulating a large number, we use a computationally inexpensive analytic model. We know that this isn’t too accurate, but we consider two different options for setting the end of the inspiral (where the smaller black hole finally plunges) which should bound the true range of results.

Number of EMRIs for different size massive black holes in different astrophysical models. M1 is our best estimate, the others explore variations on this. M11 and M12 are designed to cover the extremes, being the most pessimistic and optimistic combinations. The solid and dashed lines are for two different signal models (AKK and AKS), which are designed to give an indication of potential variation. They agree where the massive black hole is not spinning (M10 and M11). The range of masses is similar for all models, as it is set by the sensitivity of LISA. We can detect higher mass systems assuming the AKK signal model as it includes extra inspiral close to highly spinning black holes: for the heaviest black holes, this is the only part of the signal at high enough frequency to be detectable. Figure 8 of Babak et al. (2017).

Allowing for all the different uncertainties, we find that there should be somewhere between 1 and 4200 EMRIs detected per year. (The model we used when studying transient resonances predicted about 250 per year, albeit with a slightly different detector configuration, which is fairly typical of all the models we consider here). This range is encouraging. The lower end means that EMRIs are a pretty safe bet, we’d be unlucky not to get at least one over the course of a multi-year mission (LISA should have at least four years observing). The upper end means there could be lots—we might actually need to worry about them forming a background source of noise if we can’t individually distinguish them!
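To put a number on “unlucky”, we can treat detections as a Poisson process, which is my simplifying assumption here rather than something taken from the paper. The chance of at least one detection is then one minus the chance of none:

```python
import math


def prob_at_least_one(rate_per_year, years):
    """P(N >= 1) for a Poisson process with the given detection rate."""
    return 1.0 - math.exp(-rate_per_year * years)


# Most pessimistic rate (1 per year) over a four-year mission
print(prob_at_least_one(1.0, 4.0))
```

Even at the pessimistic end of the range, this gives about a 98% chance of at least one detection over four years.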

### EMRI measurements

Having shown that EMRIs are a good LISA source, we now need to consider what we could learn by measuring them.

We estimate the precision with which we will be able to measure parameters using the Fisher information matrix. The Fisher matrix measures how sensitive our observations are to changes in the parameters (the more sensitive we are, the better we should be able to measure that parameter). It should give a lower bound on the actual measurement uncertainty, and approximate the uncertainty well in the high signal-to-noise (loud signal) limit. The combination of our use of the Fisher matrix and our approximate signal models means our results will not be perfect estimates of real performance, but they should give an indication of the typical size of measurement uncertainties.

Given that we measure a huge number of cycles from the EMRI signal, we can make really precise measurements of the mass and spin of the massive black hole, as these parameters control the orbital frequencies. Below are plots for the typical measurement precision from our Fisher matrix analysis. The orbital eccentricity is measured to similar accuracy, as it influences the range of orbital frequencies too. We also get pretty good measurements of the mass of the smaller black hole, as this sets how quickly the inspiral proceeds (how quickly the orbital frequencies change). EMRIs will allow us to do precision astronomy!

Distribution of (one standard deviation) fractional uncertainties for measurements of the  massive black hole (redshifted) mass $M_z$. Results are shown for the different astrophysical models, and for the different signal models.  The astrophysical model has little impact on the uncertainties. M4 shows a slight difference as it assumes heavier stellar-mass black holes. The results with the two signal models agree when the massive black hole is not spinning (M10 and M11). Otherwise, measurements are more precise with the AKK signal model, as this includes extra signal from the end of the inspiral. Part of Figure 11 of Babak et al. (2017).

Distribution of (one standard deviation) uncertainties for measurements of the massive black hole spin $a$. The results mirror those for the masses above. Part of Figure 11 of Babak et al. (2017).

Now, before you get too excited that we’re going to learn everything about massive black holes, there is one confession I should make. In the plot above I show the measurement accuracy for the redshifted mass of the massive black hole. The cosmological expansion of the Universe causes gravitational waves to become stretched to lower frequencies in the same way light is (this makes visible light more red, hence the name). The measured frequency is $f_z = f/(1 + z)$ where $f$ is the frequency emitted, and $z$ is the redshift ($z= 0$ for a nearby source, and is larger for further away sources). Lower frequency gravitational waves correspond to higher mass systems, so it is often convenient to work with the redshifted mass, the mass corresponding to the signal you measure if you ignore redshifting. The redshifted mass of the massive black hole is $M_z = (1+z)M$ where $M$ is the true mass. To work out the true mass, we need the redshift, which means we need to measure the distance to the source.

Distribution of (one standard deviation) fractional uncertainties for measurements of the luminosity distance $D_\mathrm{L}$. The signal model is not as important here, as the uncertainty only depends on how loud the signal is. Part of Figure 12 of Babak et al. (2017).

The plot above shows the fractional uncertainty on the distance. We don’t measure this too well, as it is determined from the amplitude of the signal, rather than its frequency components. The situation is much as for LIGO. The larger uncertainties on the distance will dominate the overall uncertainty on the black hole masses. We won’t be getting all these to fractions of a percent. However, that doesn’t mean we can’t still figure out what the distribution of masses looks like!
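A short sketch shows how the redshift uncertainty feeds into the true mass. It uses $M = M_z/(1+z)$ and standard first-order error propagation, assuming small and independent errors on $M_z$ and $z$. The input numbers are invented for illustration, and converting a distance uncertainty into a redshift uncertainty (which needs a cosmological model) is left out.

```python
import math


def source_frame_mass(m_z, z):
    """True mass M from the redshifted mass M_z = (1 + z) M."""
    return m_z / (1.0 + z)


def fractional_mass_uncertainty(frac_m_z, sigma_z, z):
    """Fractional uncertainty on M, assuming small, independent errors."""
    return math.hypot(frac_m_z, sigma_z / (1.0 + z))


# Invented example: M_z known to 0.01%, redshift z = 1 known to +/- 0.05
print(source_frame_mass(2.0e6, 1.0))
print(fractional_mass_uncertainty(1.0e-4, 0.05, 1.0))
```

With these made-up inputs the answer is essentially set by the redshift term alone: the exquisite precision on $M_z$ barely matters once we convert to the true mass.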

One of the really exciting things we can do with EMRIs is check that the signal matches our expectations for a black hole in general relativity. Since we get such an excellent map of the spacetime of the massive black hole, it is easy to check for deviations. In general relativity, everything about the black hole is fixed by its mass and spin (often referred to as the no-hair theorem). Using the measured EMRI signal, we can check if this is the case. One convenient way of doing this is to describe the spacetime of the massive object in terms of a multipole expansion. The first (most important) term gives the mass, and the next term the spin. The third term (the quadrupole) is set by the first two, so if we can measure it, we can check if it is consistent with the expected relation. We estimated how precisely we could measure a deviation in the quadrupole. Fortunately, for this consistency test, all factors from redshifting cancel out, so we can get really detailed results, as shown below. Using EMRIs, we’ll be able to check for really small differences from general relativity!
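For a Kerr black hole, the expected relation is that the mass quadrupole is $-J^2/M$ in geometric units ($G = c = 1$), where $M$ is the mass and $J$ the spin angular momentum. The sketch below checks a measured quadrupole against this. The fractional deviation defined here is just one simple way to express a mismatch, chosen for illustration; it is not necessarily the parametrisation used in the paper, and it is undefined for a non-spinning black hole.

```python
def kerr_quadrupole(mass, spin_angular_momentum):
    """Kerr mass quadrupole -J^2 / M in geometric units (G = c = 1)."""
    return -spin_angular_momentum**2 / mass


def quadrupole_deviation(measured_quadrupole, mass, spin_angular_momentum):
    """Fractional departure of a measured quadrupole from the Kerr value."""
    kerr = kerr_quadrupole(mass, spin_angular_momentum)
    return measured_quadrupole / kerr - 1.0


# Illustrative values: unit mass, J = 0.9, measured quadrupole 1% too large
expected = kerr_quadrupole(1.0, 0.9)
print(quadrupole_deviation(1.01 * expected, 1.0, 0.9))
```

A result consistent with zero would support the no-hair theorem; a significant non-zero value would mean the object is not the black hole general relativity describes.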

Distribution of (one standard deviation) uncertainties for deviations in the quadrupole moment of the massive object spacetime $\mathcal{Q}$. Results are similar to the mass and spin measurements. Figure 13 of Babak et al. (2017).

In summary: EMRIs are awesome. We’re not sure how many we’ll detect with LISA, but we’re confident there will be some, perhaps a couple of hundred per year. From the signals we’ll get new insights into the masses and spins of black holes. This should tell us something about how they, and their surrounding galaxies, evolved. We’ll also be able to do some stringent tests of whether the massive objects are black holes as described by general relativity. It’s all pretty exciting, for when LISA launches, which is currently planned for about 2034…

One of the most valuable traits a student or soldier can have: patience. Credit: Sony/Marvel

arXiv: 1703.09722 [gr-qc]
Journal: Physical Review D; 95(10):103012; 2017
Conference proceedings: 1704.00009 [astro-ph.GA] (from when work was still in-progress)
Estimated number of Marvel films before LISA launch: 48 (starting with Ant-Man and the Wasp)

### Bonus notes

#### Hyphenation

Is it “extreme-mass-ratio inspiral”, “extreme mass-ratio inspiral” or “extreme mass ratio inspiral”? All are used in the literature. This is one of the advantages of using “EMRI”. The important thing is that we’re talking about inspirals that have a mass ratio which is extreme. For this paper, we used “extreme mass-ratio inspiral”, but when I first started my PhD, I was first introduced to “extreme-mass-ratio inspirals”, so they are always stuck that way in my mind.

I think hyphenation is a bit of an art, and there’s no definitive answer here, just like there isn’t for superhero names, where you can have Iron Man, Spider-Man or Iceman.

#### Science with LISA

This paper is part of a series looking at what LISA could tell us about different gravitational wave sources. So far, this series covers

1. Massive black hole binaries
2. Cosmological phase transitions
3. Standard sirens (for measuring the expansion of the Universe)
4. Inflation
5. Extreme-mass-ratio inspirals

You’ll notice there’s a change in the name of the mission from eLISA to LISA part-way through, as things have evolved. (Or devolved?) I think the main take-away so far is that the cosmology group is the most enthusiastic.

# Importance of transient resonances in extreme-mass-ratio inspirals

Extreme-mass-ratio inspirals (EMRIs for short) are a promising source for the planned space-borne gravitational-wave observatory LISA. To detect and analyse them we need accurate models for the signals, which are exquisitely intricate. In this paper, we investigated a feature, transient resonances, which has not previously been included in our models. They are difficult to incorporate, but can have a big impact on the signal. Fortunately, we find that we can still detect the majority of EMRIs, even without including resonances. Phew!

### EMRIs and orbits

EMRIs are a beautiful gravitational wave source. They occur when a stellar-mass black hole slowly inspirals into a massive black hole (as found in the centre of galaxies). The massive black hole can be tens of thousands or millions of times more massive than the stellar-mass black hole (hence extreme mass ratio). This means that the inspiral is slow, so we can potentially measure tens of thousands of orbits. This is both the blessing and the curse of EMRIs. The huge number of cycles means that we can closely follow the inspiral, and build a detailed map of the massive black hole’s spacetime. EMRIs will give us precision measurements of the properties of massive black holes. However, to do this, we need to be able to find the EMRI signals in the data, which means we need models that can match the signals over all these cycles. Analysing EMRIs is a huge challenge.

EMRI orbits are complicated. At any moment, the orbit can be described by three orbital frequencies: one for radial (in/out) motion $\Omega_r$, one for polar (north/south if we think of the spin of the massive black hole like the rotation of the Earth) motion $\Omega_\theta$ and one for axial (around in the east/west direction) motion $\Omega_\phi$. As gravitational waves are emitted, and the orbit shrinks, these frequencies evolve. The animation above, made by Steve Drasco, illustrates the evolution of an EMRI. Every so often, you can see the pattern freeze: the orbit stays in a constant shape (although this still rotates). This is a transient resonance. Two of the orbital frequencies become commensurate (so we might have 3 north/south cycles and 2 in/out cycles over the same period [bonus note]); this is the resonance. However, because the frequencies are still evolving, we don’t stay locked like this forever, which is why the resonance is transient. To calculate an EMRI, you need to know how the orbital frequencies evolve.

The evolution of an EMRI is slow: the time taken to inspiral is much longer than the time taken to complete one orbit. Therefore, we can usually split the problem of calculating the trajectory of an EMRI into two parts. On short timescales, we can consider orbits as having fixed frequencies. On long timescales, we can calculate the evolution by averaging over many orbits. You might see the problem with this: around resonances, this averaging breaks down. Whereas normally averaging over many orbits means averaging over a complicated trajectory that hits pretty much all possible points in the orbital range, on resonance, you just average over the same bit again and again. On resonance, terms which usually average to zero can become important. Éanna Flanagan and Tanja Hinderer first pointed out that around resonances the usual scheme (referred to as the adiabatic approximation) doesn’t work.

A non-resonant EMRI orbit in three dimensions (left) and two dimensions (right), ignoring the rotation in the axial direction. A non-resonant orbit will eventually fill the $r$–$\theta$ plane. Credit: Rob Cole

For comparison, a resonant EMRI orbit. A 2:3 resonance traces the same parts of the $r$–$\theta$ plane over and over. Credit: Rob Cole
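To get a feel for the difference between the two figures, here is a minimal Python sketch (not from the paper). It treats the radial and polar motion as two phases advancing at constant, made-up frequencies, chops the phase plane into cells, and counts the fraction of cells the orbit visits. The function name `coverage`, the grid size and the frequency values are all illustrative assumptions; a real EMRI has slowly evolving frequencies and a much more complicated orbit.

```python
import numpy as np

def coverage(omega_r, omega_theta, n_cells=20, t_max=2000.0, n_steps=200_000):
    """Fraction of cells in the (radial phase, polar phase) plane visited by the orbit."""
    t = np.linspace(0.0, t_max, n_steps)
    # Both phases advance linearly in time and wrap around after each full cycle.
    phase_r = (omega_r * t) % (2.0 * np.pi)
    phase_theta = (omega_theta * t) % (2.0 * np.pi)
    # Convert each phase to a cell index, guarding against the upper edge.
    i = np.minimum((phase_r / (2.0 * np.pi) * n_cells).astype(int), n_cells - 1)
    j = np.minimum((phase_theta / (2.0 * np.pi) * n_cells).astype(int), n_cells - 1)
    visited = np.zeros((n_cells, n_cells), dtype=bool)
    visited[i, j] = True
    return visited.mean()

# Commensurate frequencies: 3 polar cycles for every 2 radial cycles.
resonant = coverage(2.0, 3.0)
# Incommensurate frequencies: the ratio is irrational.
non_resonant = coverage(2.0, 3.0 * np.sqrt(1.1))

print(f"resonant orbit visits {resonant:.0%} of cells")
print(f"non-resonant orbit visits {non_resonant:.0%} of cells")
```

The resonant orbit keeps retracing one closed curve, so it only ever touches a small fraction of the cells, while the non-resonant orbit gradually visits essentially all of them. This is why averaging over many orbits is safe away from resonance and breaks down on it.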

Around a resonance, the evolution will be enhanced or decreased a little relative to the standard adiabatic evolution. We get a kick. This is only small, but because we observe EMRIs for so many orbits, a small difference can grow to become a significant difference later on. Does this mean that we won’t be able to detect EMRIs with our standard models? This was a concern, so back at the end of my PhD I began to investigate [bonus note]. The first step is to understand the size of the kick.

A jump in the orbital energy across a 2:3 resonance. The plot shows the difference between the approximate adiabatic evolution and the instantaneous evolution including the resonance. The thickness of the blue line is from oscillations on the orbital timescale which is too short to resolve here. The dotted red line shows the fitted size of the jump. Time is measured in terms of the resonance time $\tau_\mathrm{res}$ which is defined below. Figure 4 of Berry et al. (2016).

### Resonance kicks

If there were no gravitational waves, the orbit would not evolve, it would be fixed. The orbit could then be described by a set of constants of motion. The most commonly used when describing orbits about black holes are the energy, angular momentum and Carter constant. For the purposes of this blog, we’ll not worry too much about what these constants are, we’ll just consider some constant $I$.

The resonance kick is a change in this constant $\Delta I$. What should this depend on? There are three ingredients. First, the rate of change of this constant $F$ on the resonant orbit. Second, the time spent on resonance $\tau_\mathrm{res}$. The bigger these are, the bigger the size of the jump. Therefore,

$|\Delta I| \propto F \tau_\mathrm{res}$.

However, the jump could be positive or negative. This depends upon the relative phase of the radial and polar motion [bonus note]: for example, do they both reach their maximum point at the same time, or does one lag behind the other? We’ll call this relative phase $q$. By varying $q$ we can get our resonant trajectory to go through any possible point in space. Therefore, averaging over $q$ should get us back to the adiabatic approximation: the average value of $\Delta I$ must be zero. To complete our picture for the jump, we need a periodic function of the phase,

$\Delta I = F \tau_\mathrm{res} f(q)$,

with $\langle f(q) \rangle_q = 0$. Now that we know what the pieces are, we can try to figure out each of them in turn.
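As a toy illustration of this formula, the sketch below evaluates $\Delta I = F \tau_\mathrm{res} f(q)$ over a full cycle of the phase. The choice $f(q) = \sin q$ is purely an assumption for illustration (the real function comes out of the involved maths in the paper), and the values of `flux` and `tau_res` are made up. The point is only that any zero-mean periodic $f$ gives jumps of either sign which cancel on average.

```python
import numpy as np

def resonance_jump(flux, tau_res, q, phase_function=np.sin):
    """Jump Delta I = F * tau_res * f(q) for a relative phase q in radians.

    phase_function stands in for f(q); np.sin is an illustrative placeholder.
    """
    return flux * tau_res * phase_function(q)

flux = 1e-6     # hypothetical rate of change F of the constant I
tau_res = 50.0  # hypothetical time spent on resonance

# Sample the relative phase uniformly over one full cycle.
q = np.linspace(0.0, 2.0 * np.pi, 1000, endpoint=False)
jumps = resonance_jump(flux, tau_res, q)

print("largest jump:", jumps.max())
print("most negative jump:", jumps.min())
print("phase-averaged jump:", jumps.mean())
```

The largest jump has size $F \tau_\mathrm{res}$, and the phase average vanishes up to rounding error, which is the statement that averaging over $q$ recovers the adiabatic evolution.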

The rate of change $F$ is proportional to the mass ratio $\eta \ll 1$: the smaller the stellar-mass black hole is relative to the massive one, the smaller $F$ is. The exact details depend upon gravitational self-force calculations, which we’ll skip over, as they’re pretty hard, but they are the same for all orbits (resonant or not).

We can think of the resonance timescale either as the time for the orbital frequencies to drift apart or the time for the orbit to start filling the space again (so that it’s safe to average). The two pictures yield the same answer—there’s a fuller explanation in Section III A of the paper. To define the resonance timescale, it is useful to define the frequency $\Omega = n_r \Omega_r - n_\theta \Omega_\theta$, which is zero exactly on resonance. If this is evolving at rate $\dot{\Omega}$, then the resonance timescale is

$\displaystyle \tau_\mathrm{res} = \left[\frac{2\pi}{\dot{\Omega}}\right]^{1/2}$.

This bridges the two timescales that usually define EMRIs: the short orbital timescale $T$ and the long evolution timescale $\tau_\mathrm{ev}$:

$T \sim \eta^{1/2} \tau_\mathrm{res} \sim \eta \tau_\mathrm{ev}$.
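To see how the three timescales stack up, here is a rough numerical sketch. It assumes an order-of-magnitude estimate that is not spelled out above: the frequency $\Omega$ is of order $2\pi/T$ and changes by an amount comparable to itself over the evolution time, so $\dot{\Omega} \sim (2\pi/T)/\tau_\mathrm{ev}$. The mass ratio and the units are made up.

```python
import math

def resonance_time(omega_dot):
    """Resonance timescale tau_res = sqrt(2 pi / omega_dot)."""
    return math.sqrt(2.0 * math.pi / omega_dot)

eta = 1e-5          # hypothetical mass ratio
T = 1.0             # orbital timescale, in arbitrary units
tau_ev = T / eta    # evolution timescale, from T ~ eta * tau_ev

# Assumed order-of-magnitude estimate: a frequency of size 2 pi / T
# drifts by roughly its own size over the evolution time.
omega_dot = (2.0 * math.pi / T) / tau_ev

tau_res = resonance_time(omega_dot)

print(f"T       = {T:.3g}")
print(f"tau_res = {tau_res:.3g}")
print(f"tau_ev  = {tau_ev:.3g}")
print(f"eta^(1/2) * tau_res = {math.sqrt(eta) * tau_res:.3g}")
```

With these assumptions $\tau_\mathrm{res} = \sqrt{T \tau_\mathrm{ev}}$, the geometric mean of the short and long timescales, so $\eta^{1/2} \tau_\mathrm{res}$ comes back to $T$ as in the scaling relation.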

To find the form of $f(q)$, we need to do some quite involved maths (given in Appendix B of the paper) [bonus note]. This works by treating the evolution far from resonance as depending upon two independent times (effectively defining $T$ and $\tau_\mathrm{ev}$), and then matching the evolution close to resonance using an expansion in terms of a different time (something like $\tau_\mathrm{res}$). The solution shows that the jump depends sensitively upon the phase $q$ at resonance, which makes jumps extremely difficult to calculate.

We numerically evaluated the size of kicks for different orbits and resonances. We found a number of trends. First, higher-order resonances (those with larger $n_r$ and $n_\theta$) have smaller jumps than lower-order ones. This makes sense, as higher-order resonances come closer to covering all the points in the space, and so are more like averaging over the entire space. Second, jumps are larger for higher eccentricity orbits. This also makes sense, as you can’t have resonances for circular (zero eccentricity) orbits as there’s no radial frequency, so the size of the jumps must tend to zero. We’ll see that these two points are important when it comes to observational consequences of transient resonances.

### Astrophysical EMRIs

Now we’ve figured out the impact of passing through a transient resonance, let’s look at what this means for detecting EMRIs. The jump can mean that the evolution post-resonance can soon become out of phase with that pre-resonance. We can’t match both parts with the same adiabatic template. This could significantly hamper our prospects for detection, as we’re limited to the bits of signal we can pick up between resonances.

We created an astrophysical population of simulated EMRIs. We used numerical simulations to estimate a plausible population of massive black holes and distribution of stellar-mass black holes inspiralling into them. We then used adiabatic models to see how many LISA (or eLISA as it was called at the time) could potentially detect. We found there were ~510 EMRIs detectable (with a signal-to-noise ratio of 15 or above) for a two-year mission.

We then calculated how much the signal-to-noise ratio would be reduced by passing through transient resonances. The plot below shows the distribution of signal-to-noise ratio for the original population, ignoring resonances, and then after factoring in the reduction. There are now ~490 detectable EMRIs, a loss of 4%. We can still detect the majority of EMRIs!

Distribution of signal-to-noise ratios for EMRIs. In blue (solid outline), we have the results ignoring transient resonances. In orange (dashed outline), we have the distribution including the reduction due to resonance jumps. Events falling below 15 are deemed to be undetectable. Figure 10 of Berry et al. (2016).

We were worried about the impact of transient resonances, and we know that jumps can cause EMRIs to become undetectable, so why aren’t we seeing a big effect in our population? The answer lies in the trends we saw earlier. Jumps are large for low-order resonances with high eccentricities. These were the ones first highlighted, as they are obviously the most important. However, low-order resonances are only encountered really close to the massive black hole. This means late in the inspiral, after we have already accumulated lots of signal-to-noise ratio. Losing a little bit of signal right at the end doesn’t hurt detectability too much. On top of this, gravitational wave emission efficiently damps down eccentricity. Orbits typically have low eccentricities by the time they hit low-order resonances, meaning that the jumps are actually quite small. Although small jumps lead to some mismatch, we can still use our signal templates without jumps. Therefore, resonances don’t hamper us (too much) in finding EMRIs!

This may seem like a happy ending, but it is not the end of the story. While we can detect EMRIs, we still need to be able to accurately infer their source properties. Features not included in our signal templates (like jumps), could bias our results. For example, it might be that we can better match a jump by using a template for a different black hole mass or spin. However, if we include jumps, these extra features could give us extra precision in our measurements. The question of what jumps could mean for parameter estimation remains to be answered.

arXiv: 1608.08951 [gr-qc]
Journal: Physical Review D; 94(12):124042(24); 2016
Conference proceedings: 1702.05481 [gr-qc] (only 2 pages—ideal for emergency journal club presentations)
Favourite jumpers: Woolly, Mario, Kangaroos

### Bonus notes

When discussing resonances, and their impact on orbital evolution, we’ll only care about $\Omega_r$–$\Omega_\theta$ resonances. Resonances with $\Omega_\phi$ are not important because the spacetime is axisymmetric. The equations are exactly identical for all values of the axial angle $\phi$, so it doesn’t matter where you are (or if you keep cycling over the same spot) for the evolution of the EMRI.

This, however, doesn’t mean that $\Omega_\phi$ resonances aren’t interesting. They can lead to small kicks to the binary, because you are preferentially emitting gravitational waves in one direction. For EMRIs these are negligibly small, but for more equal mass systems, they could have some interesting consequences as pointed out by Maarten van de Meent.

#### Extra time

I’m grateful to the Cambridge Philosophical Society for giving me some extra funding to work on resonances. If you’re a Cambridge PhD student, make sure to become a member so you can take advantage of the opportunities they offer.

#### Calculating jumps

The theory of how to evolve through a transient resonance was developed by Kevorkian and coauthors. I spent a long time studying these calculations before working up the courage to attempt them myself. There are a few technical details which need to be adapted for the case of EMRIs. I finally figured everything out while in Warsaw Airport, coming back from a conference. It was the most I had ever felt like a real physicist.

Transient resonances remind me of Spirographs. Thanks Frinkiac

# Hierarchical analysis of gravitational-wave measurements of binary black hole spin–orbit misalignments

Gravitational waves allow us to infer the properties of binary black holes (two black holes in orbit about each other), but can we use this information to figure out how the black holes and the binary form? In this paper, we show that measurements of the black holes’ spins can help us work this out, but probably not until we have at least 100 detections.

### Black hole spins

Black holes are described by their masses (how much they bend spacetime) and their spins (how much they drag spacetime to rotate about them). The orientation of the spins relative to the orbit of the binary could tell us something about the history of the binary [bonus note].

We considered four different populations of spin–orbit alignments to see if we could tell them apart with gravitational-wave observations:

1. Aligned—matching the idealised example of isolated binary evolution. This stands in for the case where misalignments are small, which might be the case if material blown off during a supernova ends up falling back and being swallowed by the black hole.
2. Isotropic—matching the expectations for dynamically formed binaries.
3. Equal misalignments at birth—this would be the case if the spins and orbit were aligned before the second supernova, which then tilted the plane of the orbit. (As the binary inspirals, the spins wobble around, so the two misalignment angles won’t always be the same).
4. Both spins misaligned by supernova kicks, assuming that the stars were aligned with the orbit before exploding. This gives a more general scatter of unequal misalignments, but typically the primary (bigger and first forming) black hole is more misaligned.

These give a selection of possible spin alignments. For each, we assumed that the spin magnitude was the same and had a value of 0.7. This seemed like a sensible idea when we started this study [bonus note], but is now towards the upper end of what we expect for binary black holes.

### Hierarchical analysis

To measure the properties of the population we need to perform a hierarchical analysis: there are two layers of inference, one for the individual binaries, and one for the population.

From a gravitational wave signal, we infer the properties of the source using Bayes’ theorem. Given the data $d_\alpha$, we want to know the probability that the parameters $\mathbf{\Theta}_\alpha$ have different values, which is written as $p(\mathbf{\Theta}_\alpha|d_\alpha)$. This is calculated using

$\displaystyle p(\mathbf{\Theta}_\alpha|d_\alpha) = \frac{p(d_\alpha | \mathbf{\Theta}_\alpha) p(\mathbf{\Theta}_\alpha)}{p(d_\alpha)}$,

where $p(d_\alpha | \mathbf{\Theta}_\alpha)$ is the likelihood, which we can calculate from our knowledge of the noise in our gravitational wave detectors, $p(\mathbf{\Theta}_\alpha)$ is the prior on the parameters (what we would have guessed before we had the data), and the normalisation constant $p(d_\alpha)$ is called the evidence. We’ll use the evidence again in the next layer of inference.

Our prior on the parameters should actually depend upon what we believe about the astrophysical population. It is different if we believed that Model 1 were true (when we’d only consider aligned spins) than for Model 2. Therefore, we should really write

$\displaystyle p(\mathbf{\Theta}_\alpha|d_\alpha, \lambda) = \frac{p(d_\alpha | \mathbf{\Theta}_\alpha,\lambda) p(\mathbf{\Theta}_\alpha|\lambda)}{p(d_\alpha|\lambda)}$,

where $\lambda$ denotes which model we are considering.

This is an important point to remember: if you are using our LIGO results to test your theory of binary formation, you need to remember to correct for our choice of prior. We try to pick non-informative priors (priors that don’t make strong assumptions about the physics of the source), but this doesn’t mean that they match what would be expected from your model.

We are interested in the probability distribution for the different models: how many binaries come from each. Given a set of different observations $\{d_\alpha\}$, we can work this out using another application of Bayes’ theorem (yay)

$\displaystyle p(\mathbf{\lambda}|\{d_\alpha\}) = \frac{p(\{d_\alpha\} | \mathbf{\lambda}) p(\mathbf{\lambda})}{p(\{d_\alpha\})}$,

where $p(\{d_\alpha\} | \mathbf{\lambda})$ is just all the evidences for the individual events (given that model) multiplied together, $p(\mathbf{\lambda})$ is our prior for the different models, and $p(\{d_\alpha\})$ is another normalisation constant.
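The second layer of inference can be sketched in a few lines of Python for the simplest case, where we ask which single model produced all the events. The evidence values below are invented for illustration, and the function name `model_posterior` is hypothetical; the paper’s analysis infers the fraction of binaries from each model, which is more involved than this sketch. Working with logarithms avoids numerical underflow when many evidences are multiplied together.

```python
import numpy as np

def model_posterior(log_evidences, log_prior=None):
    """Posterior probability of each model given per-event log evidences.

    log_evidences has shape (n_events, n_models): entry [a, k] is
    log p(d_a | model k). The prior defaults to equal odds for all models.
    """
    log_evidences = np.asarray(log_evidences, dtype=float)
    n_models = log_evidences.shape[1]
    if log_prior is None:
        log_prior = np.full(n_models, -np.log(n_models))
    # Multiplying evidences is the same as summing their logarithms.
    log_post = log_evidences.sum(axis=0) + log_prior
    # Subtract the maximum before exponentiating for numerical stability.
    log_post -= log_post.max()
    post = np.exp(log_post)
    return post / post.sum()

# Made-up evidences for three events under two models.
log_z = np.log([[2.0, 1.0],
                [3.0, 1.0],
                [1.0, 2.0]])
posterior = model_posterior(log_z)
print(posterior)
```

Here the evidences multiply to 6 for the first model and 2 for the second, so with equal priors the first model ends up with three quarters of the posterior probability.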

Now knowing how to go from a set of observations to the probability distribution on the different channels, let’s give it a go!

#### Results

To test our approach we made a set of mock gravitational wave measurements. We generated signals from binaries for each of our four models, and analysed these as we would for real signals (using LALInference). This is rather computationally expensive, and we wanted a large set of events to analyse, so using these results as a guide, we created a larger catalogue of approximate distributions for the inferred source parameters $p(\mathbf{\Theta}_\alpha|d_\alpha)$. We then fed these through our hierarchical analysis. The GIF below shows how measurements of the fraction of binaries from each population tighten up as we get more detections: the true fraction is marked in blue.

Probability distribution for the fraction of binaries from each of our four spin misalignment populations for different numbers of observations. The blue dot marks the true fraction: an equal fraction from all four channels.

The plot shows that we do zoom in towards the true fraction of events from each model as the number of events increases, but there are significant degeneracies between the different models. Notably, it is difficult to tell apart Models 1 and 3, as both have strong support for both spins being nearly aligned. Similarly, there is a degeneracy between Models 2 and 4 as both allow for the two spins to have very different misalignments (and for the primary spin, which is the better measured one, to be quite significantly misaligned).

This means that we should be able to distinguish aligned from misaligned populations (we estimated that as few as 5 events would be needed to distinguish the case that all events came from either Model 1  or Model 2 if those were the only two allowed possibilities). However, it will be more difficult to distinguish different scenarios which only lead to small misalignments from each other, or disentangle whether there is significant misalignment due to big supernova kicks or because binaries are formed dynamically.

The uncertainty of the fraction of events from each model scales roughly with the inverse square root of the number of observations, so it may be slow progress making these measurements. I’m not sure which will come first: knowing the answer to how binary black holes form, or knowing who will sit on the Iron Throne.
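To put numbers on why progress is slow, here is a deliberately idealised sketch in which the uncertainty shrinks as one over the square root of the number of events. It assumes every event could be assigned to a model with no ambiguity, so the uncertainty is just the binomial standard error. That is optimistic given the degeneracies described above, so only the scaling with the number of events should be taken seriously, not the absolute values. The function name is hypothetical.

```python
import math

def fraction_uncertainty(true_fraction, n_events):
    """Binomial standard error on a fraction estimated from n_events.

    Idealised: assumes each event is classified perfectly.
    """
    return math.sqrt(true_fraction * (1.0 - true_fraction) / n_events)

for n_events in (10, 100, 1000, 10000):
    sigma = fraction_uncertainty(0.25, n_events)
    print(f"{n_events:>6d} events: uncertainty ~ {sigma:.4f}")
```

Every factor of 10 improvement in the uncertainty costs a factor of 100 in the number of detections.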

arXiv: 1703.06873 [astro-ph.HE]
Journal: Monthly Notices of the Royal Astronomical Society; 471(3):2801–2811; 2017
Birmingham science summary: Hierarchical analysis of gravitational-wave measurements of binary black hole spin–orbit misalignment (by Simon)
If you like this you might like: Farr et al. (2017), Talbot & Thrane (2017), Vitale et al. (2017), Trifirò et al. (2016), Minogue (2000)

### Bonus notes

#### Spin misalignments and formation histories

If you have two stars forming in a binary together, you’d expect them to be spinning in roughly the same direction, rotating the same way as they go round in their orbit (like our Solar System). This is because they all formed from the same cloud of swirling gas and dust. Furthermore, if two stars are to form a black hole binary that we can detect gravitational waves from, they need to be close together. This means that there can be tidal forces which gently tug the stars to align their rotation with the orbit. As they get older, stars puff up, meaning that if you have a close-by neighbour, you can share outer layers. This transfer of material will tend to align the rotation too. Adding this all together, if you have an isolated binary of stars, you might expect that when they collapse down to become black holes, their spins are aligned with each other and the orbit.

Unfortunately, real astrophysics is rarely so clean. Even if the stars were initially rotating the same way as each other, that doesn’t mean that their black hole remnants will do the same. This depends upon how the star collapses. Massive stars explode as supernovae, blasting off their outer layers while their cores collapse down to form black holes. Escaping material could carry away angular momentum, meaning that the black hole is spinning in a different direction to its parent star, or material could be blasted off asymmetrically, giving the new black hole a kick. This would change the plane of the binary’s orbit, misaligning the spins.

Alternatively, the binary could be formed dynamically. Instead of two stars living their lives together, we could have two stars (or black holes) come close enough together to form a binary. This is likely to happen in regions where there’s a high density of stars, such as a globular cluster. In this case, since the binary has been randomly assembled, there’s no reason for the spins to be aligned with each other or the orbit. For dynamically assembled binaries, all spin–orbit misalignments are equally probable.

This project was led by Simon Stevenson. It was one of the first things we started working on at the beginning of his PhD. He has now graduated, and is off to start a new exciting life as a postdoc in Australia. We got a little distracted by other projects, most notably analysing the first detections of gravitational waves. Simon spent a lot of time developing the COMPAS population code, a code to simulate the evolution of binaries. Looking back, it’s impressive how far he’s come. This paper used a simple approximation to estimate the masses of our black holes: we called it the Post-it note model, as we wrote it down on a single Post-it. Now Simon’s writing papers including the complexities of common-envelope evolution in order to explain LIGO’s actual observations.

# Going the distance: Mapping host galaxies of LIGO and Virgo sources in three dimensions using local cosmography and targeted follow-up

GW150914 claimed the title of many firsts: it was the first direct observation of gravitational waves, the first observation of a binary black hole system, the first observation of two black holes merging, the first time we’ve tested general relativity in such extreme conditions… However, there are still many firsts for gravitational-wave astronomy yet to come (hopefully, some to be accompanied by cake). One of the most sought after is the first signal to have a clear electromagnetic counterpart: a glow in some part of the spectrum of light (from radio to gamma-rays) that we can observe with telescopes.

Identifying a counterpart is challenging, as it is difficult to accurately localise a gravitational-wave source. Electromagnetic observers must cover a large area of sky before any counterparts fade. Then, if something is found, it can be hard to determine if that is from the same source as the gravitational waves, or something else…

To help the search, it helps to have as much information as possible about the source. Especially useful is the distance to the source. This can help you plan where to look. For nearby sources, you can cross-reference with galaxy catalogues, and perhaps pick out the biggest galaxies as the most likely locations for the source [bonus note]. Distance can also help plan your observations: you might want to start with regions of the sky where the source would be closer and so easiest to spot, or you may want to prioritise points where it is further and so you’d need to observe longer to detect it (I’m not sure there’s a best strategy, it depends on the telescope and the amount of observing time available). In this paper we describe a method to provide easy-to-use distance information, which could be supplied to observers to help their search for a counterpart.

### Going the distance

This work is the first spin-off from the First 2 Years trilogy of papers, which looked at sky localization and parameter estimation for binary neutron stars in the first two observing runs of the advanced-detector era. Binary neutron star coalescences are prime candidates for electromagnetic counterparts as we think there should be a big explosion as they merge. I was heavily involved in the last two papers of the trilogy, but this study was led by Leo Singer: I think I mostly annoyed Leo by being a stickler when it came to writing up the results.

Three-dimensional localization showing the 20%, 50%, and 90% credible levels for a typical two-detector early Advanced LIGO event. The Earth is shown at the centre, marked by $\oplus$. The true location is marked by the cross. Leo poetically described this as looking like the seeds of the jacaranda tree, and less poetically as potato chips. Figure 1 of Singer et al. (2016).

The idea is to provide a convenient means of sharing a 3D localization for a gravitational wave source. The full probability distribution is rather complicated, but it can be made more manageable if you break it up into pixels on the sky. Since astronomers need to decide where to point their telescopes, breaking up the 3D information along different lines of sight, should be useful for them.

Each pixel covers a small region of the sky, and along each line of sight, the probability distribution for distance $D$ can be approximated using an ansatz

$\displaystyle p(D|\mathrm{data}) \propto D^2\exp\left[-\frac{(D - \mu)^2}{2\sigma^2}\right]$,

where $\mu$ and $\sigma$ are calculated for each pixel individually. The form of this ansatz can be understood because the posterior probability distribution is proportional to the product of the prior and the likelihood. Our prior is that sources are uniformly distributed in volume, which means $\propto D^2$, and the likelihood can often be well approximated as a Gaussian distribution, which gives the other piece [bonus note].
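For a single line of sight, the ansatz is simple enough to evaluate on a grid. The sketch below takes the Gaussian factor to have mean $\mu$ and standard deviation $\sigma$ (so the exponent is $-(D-\mu)^2/2\sigma^2$) and normalises numerically. The values of `mu` and `sigma` are made up, and `distance_posterior` is a hypothetical helper, not part of BAYESTAR or the data release.

```python
import numpy as np

def distance_posterior(distance, mu, sigma):
    """Normalised p(D | data) proportional to D^2 exp[-(D - mu)^2 / (2 sigma^2)]."""
    unnormalised = distance**2 * np.exp(-0.5 * ((distance - mu) / sigma) ** 2)
    # Normalise with the trapezium rule so the density integrates to one.
    norm = np.sum(0.5 * (unnormalised[1:] + unnormalised[:-1]) * np.diff(distance))
    return unnormalised / norm

distance = np.linspace(0.0, 400.0, 4001)  # distance grid in Mpc
mu, sigma = 100.0, 30.0                   # made-up values for one pixel
pdf = distance_posterior(distance, mu, sigma)

peak = distance[np.argmax(pdf)]
print(f"most probable distance: {peak:.1f} Mpc")
```

The most probable distance comes out larger than $\mu$: the $D^2$ volume prior pushes the peak outwards, because there is more volume (and so more potential sources) further away.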

The ansatz doesn’t always fit perfectly, but it performs well on average. Considering the catalogue of binary neutron star signals used in the earlier papers, we find that roughly 50% of the time sources are found within the 50% credible volume, 90% are found in the 90% volume, etc.

The 3D localization is easy to calculate, and Leo has worked out a cunning way to evaluate the ansatz with BAYESTAR, our rapid sky localization code, meaning that we can produce it on minute time-scales. This means that observers should have something to work with straight-away, even if we’ll need to wait a while for the full, final results. We hope that this will improve prospects for finding counterparts—some potential examples are sketched out in the penultimate section of the paper.

If you are interested in trying out the 3D information, there is a data release and the supplement contains a handy Python tutorial. We are hoping that the Collaboration will use the format for alerts for LIGO and Virgo’s upcoming observing run (O2).

arXiv: 1603.07333 [astro-ph.HE]; 1605.04242 [astro-ph.IM]
Journal: Astrophysical Journal Letters; 829(1):L15(7); 2016; Astrophysical Journal Supplement Series; 226(1):10(8); 2016
Data release: Going the distance
Favourite crisp flavour: Salt & vinegar
Favourite jacaranda: Jacaranda mimosifolia

### Bonus notes

#### Catalogue shopping

The Event’s source has a luminosity distance of around 250–570 Mpc. This is sufficiently distant that galaxy catalogues are incomplete and not much use when it comes to searching. GW151226 and LVT151012 have similar problems, being at around the same distance or even further.

#### The gravitational-wave likelihood

For the professionals interested in understanding more about the shape of the likelihood, I’d recommend Cutler & Flanagan (1994). This is a fantastic paper which contains many clever things [bonus bonus note]. This work is really the foundation of gravitational-wave parameter estimation. From it, you can see how the likelihood can be approximated as a Gaussian. The uncertainty can then be evaluated using Fisher matrices. Many studies have been done using Fisher matrices, but it is important to check that this is a valid approximation, as nicely explained in Vallisneri (2008). I ran into a case when it didn’t during my PhD.

#### Mergin’

As a reminder that smart people make mistakes, Cutler & Flanagan have a typo in the title of the arXiv posting of their paper. This is probably the most important thing to take away from this paper.

# Parameter estimation on gravitational waves from neutron-star binaries with spinning components

In gravitational-wave astronomy, some parameters are easier to measure than others. We are sensitive to properties which change the form of the wave, but sometimes the effect of changing one parameter can be compensated by changing another. We call this a degeneracy. In signals for coalescing binaries (two black holes or neutron stars inspiralling together), there is a degeneracy between the masses and spins. In this recently published paper, we look at what this means for observing binary neutron star systems.

### History

This paper has been something of an albatross, and I’m extremely pleased that we finally got it published. I started working on it when I began my post-doc at Birmingham in 2013. Back then I was sharing an office with Ben Farr, and together with others in the Parameter Estimation Group, we were thinking about the prospect of observing binary neutron star signals (which we naively thought were the most likely) in LIGO’s first observing run.

One reason that this work took so long is that binary neutron star signals can be computationally expensive to analyse [bonus note]. The signal slowly chirps up in frequency, and can take up to a minute to sweep through the range of frequencies LIGO is sensitive to. That gives us a lot of gravitational wave to analyse. (For comparison, GW150914 lasted 0.2 seconds). We need to calculate waveforms to match to the observed signals, and these can be especially complicated when accounting for the effects of spin.

A second reason is that shortly after submitting the paper in August 2015, we got a little distracted.

This paper was the third of a trilogy looking at measuring the properties of binary neutron stars. I’ve written about the previous instalment before. We knew that getting the final results for binary neutron stars, including all the important effects like spin, would take a long time, so we planned to follow up any detections in stages. A probable sky location can be computed quickly, then we can have a first try at estimating other parameters like masses using waveforms that don’t include spin, then we go for the full results with spin. The quicker results would be useful for astronomers trying to find any explosions that coincided with the merger of the two neutron stars. The first two papers looked at results from the quicker analyses (especially at sky localization); in this one we check what effect neglecting spin has on measurements.

### What we did

We analysed a population of 250 binary neutron star signals (these are the same as the ones used in the first paper of the trilogy). We used what was our best guess for the sensitivity of the two LIGO detectors in the first observing run (which was about right).

The simulated neutron stars all have small spins of less than 0.05 (where 0 is no spin, and 1 would be the maximum spin of a black hole). We expect neutron stars in these binaries to have spins of about this range. The maximum observed spin (for a neutron star not in a binary neutron star system) is around 0.4, and we think neutron stars should break apart for spins of 0.7. However, since we want to keep an open mind regarding neutron stars, when measuring spins we considered spins all the way up to 1.

### What we found

Our results clearly showed the effect of the mass–spin degeneracy. The degeneracy increases the uncertainty for both the spins and the masses.

Even though the true spins are low, we find that across the 250 events, the median 90% upper limit on the spin of the more massive (primary) neutron star is 0.70, and the 90% limit on the less massive (secondary) neutron star is 0.86. We learn practically nothing about the spin of the secondary, but a little more about the spin of the primary, which is more important for the inspiral. Measuring spins is hard.

The effect of the mass–spin degeneracy for mass measurements is shown in the plot below. Here we show a random selection of events. The banana-shaped curves are the 90% probability intervals. They are narrow because we can measure a particular combination of masses, the chirp mass, really well. The mass–spin degeneracy determines how long the banana is. If we restrict the range of spins, we explore less of the banana (and potentially introduce an offset in our results).

Rough outlines for 90% credible regions for component masses for a random assortment of signals. The circles show the true values. The coloured lines indicate the extent of the distribution with different limits on the spins. The grey area is excluded by our convention on masses $m_1 \geq m_2$. Figure 5 from Farr et al. (2016).

Although you can’t see it in the plot above, including spin also increases the uncertainty in the chirp mass. The plots below show the standard deviation (a measure of the width of the posterior probability distribution), divided by the mean for several mass parameters. This gives a measure of the fractional uncertainty in our measurements. We show the chirp mass $\mathcal{M}_\mathrm{c}$, the mass ratio $q = m_2/m_1$ and the total mass $M = m_1 + m_2$, where $m_1$ and $m_2$ are the masses of the primary and secondary neutron stars respectively. The uncertainties are smaller for louder signals (higher signal-to-noise ratio). If we neglect the spin, the true chirp mass can lie outside the posterior distribution: on average, it is about 5 standard deviations from the mean. If we include spin, the offset is just 0.7 standard deviations (there’s still some offset as we’re allowing for spins all the way up to 1).

Fractional statistical uncertainties in chirp mass (top), mass ratio (middle) and total mass (bottom) estimates as a function of network signal-to-noise ratio for both the fully spinning analysis and the quicker non-spinning analysis. The lines indicate approximate power-law trends to guide the eye. Figure 2 of Farr et al. (2016).
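As a rough illustration of the fractional uncertainty described above (standard deviation divided by mean), here is a minimal Python sketch. The posterior samples are toy Gaussian draws invented for the example, not results from the paper, and the function names are my own. Real posteriors are strongly correlated, so the chirp mass comes out far better measured than this toy suggests.

```python
import numpy as np


def chirp_mass(m1, m2):
    """Chirp mass from the two component masses."""
    return (m1 * m2) ** 0.6 / (m1 + m2) ** 0.2


def fractional_uncertainty(samples):
    """Sample standard deviation divided by the sample mean."""
    samples = np.asarray(samples, dtype=float)
    return samples.std(ddof=1) / samples.mean()


# Toy posterior samples in solar masses (assumed for illustration only).
rng = np.random.default_rng(0)
a = rng.normal(1.5, 0.10, 2000)
b = rng.normal(1.3, 0.10, 2000)
m1, m2 = np.maximum(a, b), np.minimum(a, b)  # enforce the convention m1 >= m2

for name, values in [
    ("chirp mass", chirp_mass(m1, m2)),
    ("mass ratio", m2 / m1),
    ("total mass", m1 + m2),
]:
    print(f"{name}: fractional uncertainty {fractional_uncertainty(values):.3f}")
```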

We need to allow for spins when measuring binary neutron star masses in order to explore the full range of possible masses.

Sky localization and distance, however, are not affected by the spins here. This might not be the case for sources which are more rapidly spinning, but assuming that binary neutron stars do have low spin, we are safe using the easier-to-calculate results. This is good news for astronomers who need to know promptly where to look for explosions.

arXiv: 1508.05336 [astro-ph.HE]
Journal: Astrophysical Journal; 825(2):116(10); 2016
Authorea [bonus note]: Parameter estimation on gravitational waves from neutron-star binaries with spinning components
Conference proceedings: Early Advanced LIGO binary neutron-star sky localization and parameter estimation
Favourite albatross: Wilbur

### Bonus notes

#### How long?

The plot below shows how long it took to analyse each of the binary neutron star signals.

Distribution of run times for binary neutron star signals. Low-latency sky localization is done with BAYESTAR; medium-latency non-spinning parameter estimation is done with LALInference and TaylorF2 waveforms, and high-latency fully spinning parameter estimation is done with LALInference and SpinTaylorT4 waveforms. The LALInference results are for 2000 posterior samples. Figure 9 from Farr et al. (2016).

BAYESTAR provides a rapid sky localization, taking less than ten seconds. This is handy for astronomers who want to catch a flash caused by the merger before it fades.

Estimates for the other parameters are computed with LALInference. How long this takes to run depends on which waveform you are using and how many samples from the posterior probability distribution you want (the more you have, the better you can map out the shape of the distribution). Here we show times for 2000 samples, which is enough to get a rough idea (we collected ten times more for GW150914 and friends). Collecting twice as many samples takes (roughly) twice as long. Prompt results can be obtained with a waveform that doesn’t include spin (TaylorF2), these take about a day at most.

For this work, we considered results using a waveform which included the full effects of spin (SpinTaylorT4). These take about twenty times longer than the non-spinning analyses. The maximum time was 172 days. I have a strong suspicion that the computing time cost more than my salary.

Waiting for LALInference runs to finish gives you some time to practise hobbies. This is a globe knitted by Hannah. The two LIGO sites are marked in red, and a typical gravitational-wave sky localization is stitched on.

In order to get these results, we had to add check-pointing to our code, so we could stop it and restart it; we encountered a new type of error in the software which manages jobs running on our clusters, and Hannah Middleton and I got several angry emails from cluster admins (who are wonderful people) for having too many jobs running.

In comparison, analysing GW150914, LVT151012 and GW151226 was a breeze. Grudgingly, I have to admit that getting everything sorted out for this study made us reasonably well prepared for the real thing. Although, I’m not looking forward to that first binary neutron star signal…

#### Authorea

Authorea is an online collaborative writing service. It allows people to work together on documents, editing text, adding comments, and chatting with each other. By the time we came to write up the paper, Ben was no longer in Birmingham, and many of our coauthors are scattered across the globe. Ben thought Authorea might be useful for putting together the paper.

Writing was easy, and the ability to add comments on the text was handy for getting feedback from coauthors. The chat was good for quickly sorting out issues like plots. Overall, I was quite pleased, up to the point we wanted to get the final document. Extracting a nicely formatted PDF was awkward. For this I switched to using the Github back-end. On reflection, a simple git repo, plus a couple of Skype calls might have been a smoother way of writing, at least for a standard journal article.

Authorea promises to be an open way of producing documents, and allows for others to comment on papers. I don’t know if anyone’s looked at our Authorea article. For astrophysics, most people use the arXiv, which is free to everyone, and I’m not sure if there’s enough appetite for interaction (beyond the occasional email to authors) to motivate people to look elsewhere. At least, not yet.

In conclusion, I think Authorea is a nice idea, and I would try out similar collaborative online writing tools again, but I don’t think I can give it a strong recommendation for your next paper unless you have a particular idea in mind of how to make the most of it.

# Testing general relativity using golden black-hole binaries

Binary black hole mergers are the ultimate laboratory for testing gravity. The gravitational fields are strong, and things are moving at close to the speed of light. These extreme conditions are exactly where we expect our theories could break down, which is why we were so excited by detecting gravitational waves from black hole coalescences. To accompany the first detection of gravitational waves, we performed several tests of Einstein’s theory of general relativity (it passed). This paper outlines the details of one of the tests, one that can be extended to include future detections to put Einstein’s theory to the toughest scrutiny.

One of the difficulties of testing general relativity is what do you compare it to? There are many alternative theories of gravity, but only a few of these have been studied thoroughly enough to give a concrete idea of what a binary black hole merger should look like. Even if general relativity comes out on top when compared to one alternative model, it doesn’t mean that another (perhaps one we’ve not thought of yet) can be ruled out. We need ways of looking for something odd, something which hints that general relativity is wrong, but doesn’t rely on any particular alternative theory of gravity.

The test suggested here is a consistency test. We split the gravitational-wave signal into two pieces, a low frequency part and a high frequency part, and then try to measure the properties of the source from the two parts. If general relativity is correct, we should get answers that agree; if it’s not, and there’s some deviation in the exact shape of the signal at different frequencies, we can get different answers. One way of thinking about this test is imagining that we have two experiments, one where we measure lower frequency gravitational waves and one where we measure higher frequencies, and we are checking to see if their results agree.

To split the waveform, we use a frequency around that of the last stable circular orbit: about the point that the black holes stop orbiting about each other and plunge together and merge [bonus note]. For GW150914, we used 132 Hz, which is about the same as the C an octave below middle C (a little before time zero in the simulation below). This cut roughly splits the waveform into the low frequency inspiral (where the two black holes are orbiting each other), and the higher frequency merger (where the two black holes become one) and ringdown (where the final black hole settles down).

We are fairly confident that we understand what goes on during the inspiral. This is similar physics to where we’ve been testing gravity before, for example by studying the orbits of the planets in the Solar System. The merger and ringdown are more uncertain, as we’ve never before probed these strong and rapidly changing gravitational fields. It therefore seems like a good idea to check the two independently [bonus note].

We use our parameter estimation codes on the two pieces to infer the properties of the source, and we compare the values for the mass $M_f$ and spin $\chi_f$ of the final black hole. We could use other sets of parameters, but this pair compactly sums up the properties of the final black hole and is easy to explain. We look at the difference between the estimated values for the mass and spin, $\Delta M_f$ and $\Delta \chi_f$. If general relativity is a good match to the observations, then we expect everything to match up, and $\Delta M_f$ and $\Delta \chi_f$ to be consistent with zero. They won’t be exactly zero because we have noise in the detector, but hopefully zero will be within the uncertainty region [bonus note]. An illustration of the test is shown below, including one of the tests we did to show that it does spot when general relativity is not correct.

Results from the consistency test. The top panels show the outlines of the 50% and 90% credible levels for the low frequency (inspiral) part of the waveform, the high frequency (merger–ringdown) part, and the entire (inspiral–merger–ringdown, IMR) waveform. The bottom panels show the fractional difference between the high and low frequency results. If general relativity is correct, we expect the distribution to be consistent with $(0,0)$, indicated by the cross (+). The left panels show a general relativity simulation, and the right panels show a waveform from a modified theory of gravity. Figure 1 of Ghosh et al. (2016).
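The comparison step of the consistency test can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the samples are toy Gaussian draws, the differences are normalised by the mean of the two estimates (my choice of convention, as is the sign), and the credible level of $(0,0)$ is computed by approximating the distribution of differences as a two-dimensional Gaussian rather than using the full posterior.

```python
import numpy as np


def fractional_differences(inspiral, post_inspiral):
    """Fractional differences for paired (M_f, chi_f) samples.

    Both inputs have shape (n, 2). The two posteriors are independent, so
    pairing samples row by row is fine if each set is in random order.
    """
    inspiral = np.asarray(inspiral, dtype=float)
    post_inspiral = np.asarray(post_inspiral, dtype=float)
    mean = 0.5 * (inspiral + post_inspiral)
    return (inspiral - post_inspiral) / mean


def credible_level_of_origin(deltas):
    """Credible level at which (0, 0) sits, in a 2D Gaussian approximation."""
    deltas = np.asarray(deltas, dtype=float)
    offset = deltas.mean(axis=0)
    cov = np.cov(deltas, rowvar=False)
    d2 = offset @ np.linalg.solve(cov, offset)  # squared Mahalanobis distance
    return 1.0 - np.exp(-0.5 * d2)  # probability enclosed within that distance


# Toy samples (assumed): both halves of the signal agree on the truth.
rng = np.random.default_rng(1)
truth = np.array([62.0, 0.67])
inspiral = truth + rng.normal(0.0, [4.0, 0.08], size=(5000, 2))
post = truth + rng.normal(0.0, [5.0, 0.10], size=(5000, 2))

deltas = fractional_differences(inspiral, post)
print(f"(0,0) lies at the {100 * credible_level_of_origin(deltas):.1f}% credible level")
```

A small credible level means $(0,0)$ sits near the peak of the distribution; a value close to 100% would mean the two halves of the signal disagree.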

A convenient feature of using $\Delta M_f$ and $\Delta \chi_f$ to test agreement with relativity is that you can combine results from multiple observations. By averaging over lots of signals, you can reduce the uncertainty from noise. This allows you to pin down whether or not things really are consistent, and spot smaller deviations (we could get precision of a few percent after about 100 suitable detections). I look forward to seeing how this test performs in the future!
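To see why combining detections helps, here is a minimal sketch assuming each detection gives an independent Gaussian measurement of the same fractional deviation. The inverse-variance weighting and the per-event uncertainty of 0.3 are my own stand-ins for illustration, not necessarily how the paper combines its posteriors.

```python
import numpy as np


def combine_gaussian(means, sigmas):
    """Inverse-variance weighted combination of independent Gaussian estimates."""
    means = np.asarray(means, dtype=float)
    weights = 1.0 / np.asarray(sigmas, dtype=float) ** 2
    combined_mean = np.sum(weights * means) / np.sum(weights)
    combined_sigma = 1.0 / np.sqrt(np.sum(weights))
    return combined_mean, combined_sigma


rng = np.random.default_rng(2)
sigma_single = 0.3  # assumed uncertainty on the deviation from one event

for n_events in (1, 10, 100):
    # Simulated measurements scattered about zero (general relativity correct).
    measured = rng.normal(0.0, sigma_single, n_events)
    mean, sigma = combine_gaussian(measured, np.full(n_events, sigma_single))
    print(f"{n_events:3d} events: combined uncertainty {sigma:.3f}")
```

With equal uncertainties, the combined uncertainty shrinks as one over the square root of the number of events.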

arXiv: 1602.02453 [gr-qc]
Journal: Physical Review D; 94(2):021101(6); 2016
Favourite golden thing: Golden syrup sponge pudding

### Bonus notes

#### Review

I became involved in this work as a reviewer. The LIGO Scientific Collaboration is a bit of a stickler when it comes to checking its science. We had to check that the test was coded up correctly, that the results made sense, and that calculations done and written up for GW150914 were all correct. Since most of the team are based in India [bonus note], this involved some early morning telecons, but it all went smoothly.

One of our checks was that the test wasn’t sensitive to the exact frequency used to split the signal. If you change the frequency cut, the results from the two sections do change. If you lower the frequency, then there’s less of the low frequency signal and the measurement uncertainties from this piece get bigger. Conversely, there’ll be more signal in the high frequency part and so we’ll make a more precise measurement of the parameters from this piece. However, the overall results where you combine the two pieces stay about the same. You get best results when there’s a roughly equal balance between the two pieces, but you don’t have to worry about getting the cut exactly on the innermost stable orbit.

#### Golden binaries

In order for the test to work, we need the two pieces of the waveform to both be loud enough to allow us to measure parameters using them. Signals of this type are referred to as golden. Earlier work on tests of general relativity using golden binaries has been done by Hughes & Menou (2005), and Nakano, Tanaka & Nakamura (2015). GW150914 was a golden binary, but GW151226 and LVT151012 were not, which is why we didn’t repeat this test for them.

#### GW150914 results

For The Event, we ran this test, and the results are consistent with general relativity being correct. The plots below show the estimates for the final mass and spin (here denoted $a_f$ rather than $\chi_f$), and the fractional difference between the two measurements. The point $(0,0)$ is at the 28% credible level. This means that if general relativity is correct, we’d expect a deviation at least this large to occur about 72% of the time due to noise fluctuations. It wouldn’t take a particularly rare realisation of noise to cause the assumed true value of $(0,0)$ to be found at this probability level, so we’re not too suspicious that something is amiss with general relativity.

Results from the consistency test for The Event. The top panels show final mass and spin measurements from the low frequency (inspiral) part of the waveform, the high frequency (post-inspiral) part, and the entire (IMR) waveform. The bottom panel shows the fractional difference between the high and low frequency results. If general relativity is correct, we expect the distribution to be consistent with $(0,0)$, indicated by the cross. Figure 3 of the Testing General Relativity Paper.

### The authors

Abhirup Ghosh and Archisman Ghosh were two of the leads of this study. They are both A. Ghosh at the same institution, which caused some confusion when compiling the LIGO Scientific Collaboration author list. I think at one point one of them (they can argue which) was removed as someone thought there was a mistaken duplication. To avoid confusion, they now have their full names used. This is a rare distinction on the Discovery Paper (I’ve spotted just two others). The academic tradition of using first initials plus second name is poorly adapted to names which don’t fit the typical western template, so we should be more flexible.

# Inference on gravitational waves from coalescences of stellar-mass compact objects and intermediate-mass black holes

I love collecting things; there’s something extremely satisfying about completing a set. I suspect that this is one of the alluring features of Pokémon: you’ve gotta catch ’em all. The same is true of black hole hunting. Currently, we know of stellar-mass black holes which are a few times the mass of our Sun, up to a few tens of the mass of our Sun (the black holes of GW150914 are the biggest yet to be observed), and we know of supermassive black holes, which are ten thousand to ten billion times the mass of our Sun. However, we are missing intermediate-mass black holes which lie in the middle. We have Charmander and Charizard, but where is Charmeleon? The elusive ones are always the most satisfying to capture.

Adorable black hole (available for adoption). I’m sure this could be a Pokémon. It would be a Dark type. Not that I’ve given it that much thought…

Intermediate-mass black holes have evaded us so far. We’re not even sure that they exist, although that would raise questions about how you end up with the supermassive ones (you can’t just feed the stellar-mass ones lots of rare candy). Astronomers have suggested that you could spot intermediate-mass black holes in globular clusters by the impact of their gravity on the motion of other stars. However, this effect would be small, and near impossible to conclusively spot. Another way (which I’ve discussed before) would be to look at ultra luminous X-ray sources, which could be from a disc of material spiralling into the black hole. However, it’s difficult to be certain that we understand the source properly and that we’re not misclassifying it. There could be one sure-fire way of identifying intermediate-mass black holes: gravitational waves.

The frequency of gravitational waves depends upon the mass of the binary. More massive systems produce lower frequencies. LIGO is sensitive to the right range of frequencies for stellar-mass black holes. GW150914 chirped up to the pitch of a guitar’s open B string (just below middle C). Supermassive black holes produce gravitational waves at too low a frequency for LIGO (a space-based detector would be perfect for these). We might just be able to detect signals from intermediate-mass black holes with LIGO.
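To get a feel for how frequency scales with mass, here is a back-of-the-envelope Python sketch. It uses the textbook gravitational-wave frequency at the innermost stable circular orbit of a non-spinning (Schwarzschild) black hole, $f = c^3/(6^{3/2} \pi G M)$, as a rough marker for where the inspiral ends. That choice is my own simplification, not a calculation from the paper: real binaries have spin and comparable masses, so treat the numbers as order-of-magnitude only. The example masses are likewise just illustrative.

```python
import math

GM_SUN = 1.32712440018e20  # gravitational parameter of the Sun, m^3 s^-2
C = 299_792_458.0  # speed of light, m/s


def isco_gw_frequency(total_mass_msun):
    """Approximate gravitational-wave frequency (Hz) at the Schwarzschild
    innermost stable circular orbit for a given total mass in solar masses."""
    return C**3 / (6**1.5 * math.pi * GM_SUN * total_mass_msun)


for label, mass in [
    ("stellar-mass binary", 60.0),
    ("intermediate-mass binary", 300.0),
    ("supermassive binary", 1.0e6),
]:
    print(f"{label} ({mass:g} solar masses): ~{isco_gw_frequency(mass):.3g} Hz")
```

The frequency is inversely proportional to the mass, which is why heavier systems slide out of the bottom of LIGO’s sensitive band.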

In a recent paper, a group of us from Birmingham looked at what we could learn from gravitational waves from the coalescence of an intermediate-mass black hole and a stellar-mass black hole [bonus note].  We considered how well you would be able to measure the masses of the black holes. After all, to confirm that you’ve found an intermediate-mass black hole, you need to be sure of its mass.

The signals are extremely short: we can only detect the last bit of the two black holes merging together and settling down as a final black hole. Therefore, you might think there’s not much information in the signal, and we won’t be able to measure the properties of the source. We found that this isn’t the case!

We considered a set of simulated signals, and analysed these with our parameter-estimation code [bonus note]. Below are a couple of plots showing the accuracy to which we can infer a couple of different mass parameters for binaries of different masses. We show the accuracy of measuring the chirp mass $\mathcal{M}$ (a much beloved combination of the two component masses which we are usually able to pin down precisely) and the total mass $M_\mathrm{total}$.

Measured chirp mass for systems of different total masses. The shaded regions show the 90% credible interval and the dashed lines show the true values. The mass ratio $q$ is the mass of the stellar-mass black hole divided by the mass of the intermediate-mass black hole. Figure 1 of Haster et al. (2016).

Measured total mass for systems of different total masses. The shaded regions show the 90% credible interval and the dashed lines show the true values. Figure 2 of Haster et al. (2016).

For the lower mass systems, we can measure the chirp mass quite well. This is because we get a little information from the part of the gravitational wave from when the two components are inspiralling together. However, we see less and less of this as the mass increases, and we become more and more uncertain of the chirp mass.

The total mass isn’t as accurately measured as the chirp mass at low masses, but we see that the accuracy doesn’t degrade at higher masses. This is because we get some constraints on its value from the post-inspiral part of the waveform.

We found that the transition from having better fractional accuracy on the chirp mass to having better fractional accuracy on the total mass happened when the total mass was around 200–250 solar masses. This was assuming final design sensitivity for Advanced LIGO. We currently don’t have as good sensitivity at low frequencies, so the transition will happen at lower masses: GW150914 is actually in this transition regime (the chirp mass is measured a little better).

Given our uncertainty on the masses, when can we conclude that there is an intermediate-mass black hole? If we classify black holes with masses more than 100 solar masses as intermediate mass, then we’ll be able to claim a discovery with 95% probability if the source has a black hole of at least 130 solar masses. The plot below shows our inferred probability of there being an intermediate-mass black hole as we increase the black hole’s mass (there’s little chance of falsely identifying a lower mass black hole).

Probability that the larger black hole is over 100 solar masses (our cut-off mass for intermediate-mass black holes $M_\mathrm{IMBH}$). Figure 7 of Haster et al. (2016).
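Given posterior samples for the larger black hole’s mass, the probability plotted above is just the fraction of samples above the cut-off. Here is a minimal sketch; the Gaussian toy posteriors and their width of 15 solar masses are invented for illustration and are not the paper’s posteriors.

```python
import numpy as np

M_IMBH = 100.0  # cut-off mass for an intermediate-mass black hole, solar masses


def probability_imbh(primary_mass_samples, threshold=M_IMBH):
    """Fraction of posterior samples with the larger mass above the threshold."""
    samples = np.asarray(primary_mass_samples, dtype=float)
    return float(np.mean(samples > threshold))


rng = np.random.default_rng(3)
for true_mass in (90.0, 110.0, 130.0):
    # Toy posterior (assumed): Gaussian centred on the true mass.
    samples = rng.normal(true_mass, 15.0, 5000)
    print(f"true mass {true_mass:.0f}: P(IMBH) = {probability_imbh(samples):.2f}")
```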

Gravitational-wave observations could lead to a concrete detection of intermediate-mass black holes if they exist and merge with another black hole. However, LIGO’s low frequency sensitivity is important for detecting these signals. If detector commissioning goes to plan and we are lucky enough to detect such a signal, we’ll finally be able to complete our set of black holes.

arXiv: 1511.01431 [astro-ph.HE]
Journal: Monthly Notices of the Royal Astronomical Society; 457(4):4499–4506; 2016
Birmingham science summary: Inference on gravitational waves from coalescences of stellar-mass compact objects and intermediate-mass black holes (by Carl)
Other collectables: Breakthrough, Gruber, Shaw, Kavli

### Bonus notes

#### Jargon

The coalescence of an intermediate-mass black hole and a stellar-mass object (black hole or neutron star) has typically been known as an intermediate mass-ratio inspiral (an IMRI). This is similar to the name for the coalescence of a supermassive black hole and a stellar-mass object: an extreme mass-ratio inspiral (an EMRI). However, my colleague Ilya has pointed out that with LIGO we don’t really see much of the intermediate-mass black hole and the stellar-mass black hole inspiralling together; instead we see the merger and ringdown of the final black hole. Therefore, he prefers the name intermediate mass-ratio coalescence (or IMRAC). It’s a better description of the signal we measure, but the acronym isn’t as good.

#### Parameter-estimation runs

The main parameter-estimation analysis for this paper was done by Zhilu, a summer student. This is notable for two reasons. First, it shows that useful research can come out of a summer project. Second, our parameter-estimation code installed and ran so smoothly that even an undergrad with no previous experience could get some useful results. This made us optimistic that everything would work perfectly in the upcoming observing run (O1). Unfortunately, a few improvements were made to the code before then, and we were back to the usual level of fun in time for The Event.

# Parameter estimation for binary neutron-star coalescences with realistic noise during the Advanced LIGO era

The first observing run (O1) of Advanced LIGO is nearly here, and with it the prospect of the first direct detection of gravitational waves. That’s all wonderful and exciting (far more exciting than a custard cream or even a chocolate digestive), but there’s a lot to be done to get everything ready. Aside from remembering to vacuum the interferometer tubes and polish the mirrors, we need to see how the data analysis will work out. After all, having put so much effort into the detector, it would be a shame if we couldn’t do any science with it!

### Parameter estimation

Since joining the University of Birmingham team, I’ve been busy working on trying to figure out how well we can measure things using gravitational waves. I’ve been looking at binary neutron star systems. We expect binary neutron star mergers to be the main source of signals for Advanced LIGO. We’d like to estimate how massive the neutron stars are, how fast they’re spinning, how far away they are, and where in the sky they are. Just published is my first paper on how well we should be able to measure things. This took a lot of hard work from a lot of people, so I’m pleased it’s all done. I think I’ve earnt a celebratory biscuit. Or two.

When we see something that looks like it could be a gravitational wave, we run code to analyse the data and try to work out the properties of the signal. Working out some properties is a bit trickier than others. Sadly, we don’t have an infinite number of computers, so it means it can take a while to get results. Much longer than the time to eat a packet of Jaffa Cakes…

The fastest algorithm we have for binary neutron stars is BAYESTAR. This takes the same time as maybe eating one chocolate finger. Perhaps two, if you’re not worried about the possibility of choking. BAYESTAR is fast as it only estimates where the source is coming from. It doesn’t try to calculate a gravitational-wave signal and match it to the detector measurements, instead it just looks at numbers produced by the detection pipeline—the code that monitors the detectors and automatically flags whenever something interesting appears. As far as I can tell, you give BAYESTAR this information and a fresh cup of really hot tea, and it uses Bayes’ theorem to work out how likely it is that the signal came from each patch of the sky.

To work out further details, we need to know what a gravitational-wave signal looks like and then match this to the data. This is done using a different algorithm, which I’ll refer to as LALInference. (As names go, this isn’t as cool as SKYNET). This explores parameter space (hopping between different masses, distances, orientations, etc.), calculating waveforms and then working out how well they match the data, or rather how likely it is that we’d get just the right noise in the detector to make the waveform fit what we observed. We then use another liberal helping of Bayes’ theorem to work out how probable those particular parameter values are.
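Written out, Bayes’ theorem for this problem takes its standard form. The symbols are my own labels rather than notation from the paper: $\theta$ stands for the source parameters (masses, distance, orientation and so on) and $d$ for the detector data.

$\displaystyle p(\theta | d) = \frac{p(d | \theta)\, p(\theta)}{p(d)}$.

Here $p(d | \theta)$ is the likelihood (how probable it is to get just the right noise for that waveform to fit the data), $p(\theta)$ is the prior on the parameters, $p(\theta | d)$ is the posterior we are after, and $p(d)$ is a normalisation constant that does not depend on $\theta$.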

It’s rather difficult to work out the waveforms, but some are easier than others. One of the things that makes things trickier is adding in the spins of the neutron stars. If you made a batch of biscuits at the same time you started a LALInference run, they’d still be good by the time a non-spinning run finished. With a spinning run, the biscuits might not be quite so appetising: I generally prefer more chocolate than penicillin on my biscuits. We’re working on speeding things up (if only to prevent increased antibiotic resistance).

In this paper, we were interested in what you could work out quickly, while there’s still a chance to catch any explosion that might accompany the merging of the neutron stars. We think that short gamma-ray bursts and kilonovae might be caused when neutron stars merge and collapse down to a black hole. (I find it mildly worrying that we don’t know what causes these massive explosions). To follow up on a gravitational-wave detection, you need to be able to tell telescopes where to point to see something and manage this while there’s still something that’s worth seeing. This means that using spinning waveforms in LALInference is right out; we just use BAYESTAR and the non-spinning LALInference analysis.

### What we did

To figure out what we could learn from binary neutron stars, we generated a large catalogue of fake signals, and then ran the detection and parameter-estimation codes on this to see how they worked. This has been done before in The First Two Years of Electromagnetic Follow-Up with Advanced LIGO and Virgo which has a rather delicious astrobites write-up. Our paper is the sequel to this (and features most of the same cast). One of the differences is that The First Two Years assumed that the detectors were perfectly behaved and had lovely Gaussian noise. In this paper, we added in some glitches. We took some real data™ from initial LIGO’s sixth science run and stretched this so that it matches the sensitivity Advanced LIGO is expected to have in O1. This process is called recolouring [bonus note]. We now have fake signals hidden inside noise with realistic imperfections, and can treat it exactly as we would real data. We ran it through the detection pipeline, and anything which was flagged as probably being a signal (we used a false alarm rate of once per century) was analysed with the parameter-estimation codes. We looked at how well we could measure the sky location and distance of the source, and the masses of the neutron stars. It’s all good practice for O1, when we’ll be running this analysis on any detections.

### What we found

1. The flavour of noise (recoloured or Gaussian) makes no difference to how well we can measure things on average.
2. Sky-localization in O1 isn’t great, typically hundreds of square degrees (the median 90% credible region is $632~\mathrm{deg^2}$); for comparison, the Moon is about a fifth of a square degree. This’ll make things interesting for the people with telescopes.

Probability of a gravitational-wave signal coming from different points on the sky. The darker the red, the higher the probability. The star indicates the true location. This is one of the worst localized events from our study for O1. You can find more maps in the data release (including 3D versions); this is Figure 6 of Berry et al. (2015).

3. BAYESTAR does just as well as LALInference, despite being about 2000 times faster.

Sky localization (the size of the patch of the sky that we’re 90% sure contains the source location) varies with the signal-to-noise ratio (how loud the signal is). The approximate best fit is $\log_{10}(\mathrm{CR}_{0.9}/\mathrm{deg^2}) \approx -2 \log_{10}(\varrho) +5.06$, where $\mathrm{CR}_{0.9}$ is the 90% sky area and $\varrho$ is the signal-to-noise ratio. The results for BAYESTAR and LALInference agree, as do the results with Gaussian and recoloured noise. This is Figure 9 of Berry et al. (2015).

4. We can’t measure the distance too well: the median 90% credible interval divided by the true distance (which gives something like twice the fractional error) is 0.85.
5. Because we don’t include the spins of the neutron stars, we introduce some error into our mass measurements. The chirp mass, a combination of the individual masses that we’re most sensitive to [bonus note], is still reliably measured (the median offset is 0.0026 of the mass of the Sun, which is tiny), but we’ll have to wait for the full spinning analysis for individual masses.

Fraction of events with difference between the mean estimated and true chirp mass smaller than a given value. There is an error because we are not including the effects of spin, but this is small. Again, the type of noise makes little difference. This is Figure 15 of Berry et al. (2015).
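The approximate best fit for sky localization quoted above is easy to evaluate for a few signal-to-noise ratios. This little sketch just plugs numbers into that fit; the function name and the example signal-to-noise ratios are my own choices, and the fit is only a rough trend, not a prediction for any individual event.

```python
import math


def sky_area_90(snr):
    """Approximate 90% credible sky area in square degrees from the quoted fit:
    log10(CR_0.9 / deg^2) = -2 log10(snr) + 5.06."""
    return 10.0 ** (-2.0 * math.log10(snr) + 5.06)


for snr in (10, 12, 20, 30):
    print(f"signal-to-noise ratio {snr}: ~{sky_area_90(snr):.0f} square degrees")
```

Because the area scales as the inverse square of the signal-to-noise ratio, doubling the loudness shrinks the patch of sky by a factor of four.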

There’s still some work to be done before O1, as we need to finish up the analysis with waveforms that include spin. In the mean time, our results are all available online for anyone to play with.

arXiv: 1411.6934 [astro-ph.HE]
Journal: Astrophysical Journal; 804(2):114(24); 2015
Data release: The First Two Years of Electromagnetic Follow-Up with Advanced LIGO and Virgo
Favourite colour: Blue. No, yellow…

### Notes

The colour of noise: Noise is called white if it doesn’t have any frequency dependence. We made ours by taking some noise with initial LIGO’s frequency dependence (coloured noise), removing the frequency dependence (making it white), and then adding in the frequency dependence of Advanced LIGO (recolouring it).

The chirp mass: Gravitational waves from a binary system depend upon the masses of the components; we’ll call these $m_1$ and $m_2$. The chirp mass is a combination of these that we can measure really well, as it determines the most significant parts of the shape of the gravitational wave. It’s given by

$\displaystyle \mathcal{M} = \frac{m_1^{3/5} m_2^{3/5}}{(m_1 + m_2)^{1/5}}$.

We get lots of good information on the chirp mass. Unfortunately, this isn’t too useful for turning back into the individual masses. For that we need extra information, for example the mass ratio $m_2/m_1$. We can get this from less dominant parts of the waveform, but it’s not typically measured as precisely as the chirp mass, so we’re often left with big uncertainties.
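To see why the chirp mass alone doesn't give you the individual masses, here is a small Python sketch. The function names are hypothetical (they are not from any analysis code), and I've taken the mass ratio to be $q = m_2/m_1 \leq 1$. Rearranging the formula above gives $m_1 = \mathcal{M}(1+q)^{1/5}/q^{3/5}$ and $m_2 = q m_1$.

```python
def chirp_mass(m1, m2):
    """Chirp mass from the component masses (same units in and out)."""
    return (m1 * m2) ** 0.6 / (m1 + m2) ** 0.2


def component_masses(mchirp, q):
    """Component masses from the chirp mass and mass ratio q = m2/m1.

    Assumes 0 < q <= 1, so that m1 is the heavier object.
    """
    if not 0.0 < q <= 1.0:
        raise ValueError("mass ratio q = m2/m1 must satisfy 0 < q <= 1")
    m1 = mchirp * (1.0 + q) ** 0.2 / q ** 0.6
    return m1, q * m1


# The same chirp mass is compatible with quite different pairs of masses.
mchirp = chirp_mass(1.4, 1.4)
for q in (1.0, 0.8, 0.6):
    m1, m2 = component_masses(mchirp, q)
    print(f"q = {q:.1f}: m1 = {m1:.2f}, m2 = {m2:.2f} (solar masses)")
```

Every line printed has exactly the same chirp mass, so a precise chirp-mass measurement still leaves a whole family of possible component masses until the mass ratio is pinned down.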

# Gravitational-wave sensitivity curves

Differing weights and differing measures—
the LORD detests them both. — Proverbs 20:10

As a New Year’s resolution, I thought I would try to write a post on each paper I have published. (I might try to go back and talk about my old papers too, but that might be a little too optimistic.)  Handily, I have a paper that was published in Classical & Quantum Gravity on Thursday, so let’s get on with it, and hopefully 2015 will deliver those hoverboards soon.

This paper was written in collaboration with my old officemates, Chris Moore and Rob Cole, and originates from my time in Cambridge. We were having a weekly group meeting (surreptitiously eating cake, as you’re not meant to eat in the new meeting rooms) and discussing what to do for the upcoming open afternoon. Posters are good as you can use them to decorate your office afterwards, so we decided on making one on gravitational-wave astronomy. Gravitational waves come in a range of frequencies, just like light (electromagnetic radiation). You can observe different systems with different frequencies, but you need different instruments to do so. For light, the range is from high-frequency gamma rays (observed with satellites like Fermi) to low-frequency radio waves (observed with telescopes like those at Jodrell Bank or Arecibo), with visible light (observed with Hubble or your own eyes) in the middle. Gravitational waves also have a spectrum: ground-based detectors like LIGO measure the higher frequencies, pulsar timing arrays measure the lower frequencies, and space-borne detectors like eLISA measure stuff in the middle. We wanted a picture that showed the range of each instrument and the sources they could detect, but we couldn’t find a good up-to-date one. Chris is not one to be put off by a challenge (especially if it’s a source of procrastination), so he decided to have a go at making one himself. How hard could it be? We never made that poster, but we did end up with a paper.

When talking about gravitational-wave detectors, you normally use a sensitivity curve. This shows how sensitive it is at a given frequency: you plot a graph with the sensitivity curve on, and then plot the spectrum of the source you’re interested in on the same graph. If your source is above the sensitivity curve, you can detect it (yay), but if it lies below it, then you can’t pick it out from the noise (boo). Making a plot with lots of sensitivity curves on sounds simple: you look up the details for lots of detectors, draw them together and add a few sources. However, there are lots of different conventions for how you actually measure sensitivity, and they’re frequently muddled up! We were rather confused by the whole thing, but eventually (after the open afternoon had flown by), we figured things out and made our picture. So we wouldn’t forget, we wrote up the different conventions, why you might want to use each, and how to convert between them; these notes became the paper. We also thought it would be handy to have a website where you could make your own plot, picking which detectors and sources you wanted to include. Rob also likes a challenge (especially if it’s a source of procrastination), so he set about making such a thing. I think it turned out rather well!

That’s the story of the paper. It explains different conventions for characterising gravitational-wave detectors and sources, and gives some examples. If you’d actually like to know some of the details, I’ll give a little explanation now; if not, just have a look at the pretty plots below (or, if looking for your own source of procrastination, have a go at Space Time Quest, a game where you try to build the most sensitive detector).

There are three common conventions in use for sensitivity-curve plots: the characteristic strain, the amplitude spectral density and the energy density.

You might wonder why we don’t just directly use the amplitude of the wave. Gravitational waves are a stretching and squashing of spacetime, so you can characterise how much they stretch and squeeze things and use that to describe the size of your waves. The sensitivity of your detector is then how much various sources of noise cause a similar wibbling. The amplitude of the wave is really, really small, so it’s difficult to detect, but if you consider observations over a time interval instead of just one moment, it’s easier to spot a signal: hints that there might be a signal add up until you’re certain that it’s there. The characteristic strain is a way of modifying the amplitude to take into account how we add up the signal. It’s especially handy because, if you make a log–log plot (such that the space between 1 and 10 is the same as between 10 and 100, etc.), the area between the characteristic strain of your source and the detector sensitivity curve gives you a measure of the signal-to-noise ratio, which tells you how loud (how detectable) a signal is.

Gravitational-wave sensitivity-curve plot using characteristic strain. The area between the detector’s curve and the top of the box for a source indicates how loud that signal would be.
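To make the "area on a log–log plot" idea a bit more concrete, here is a rough Python sketch. It assumes the usual relation $\varrho^2 = \int [h_c(f)/h_n(f)]^2 \,\mathrm{d}(\ln f)$, where $h_c$ is the characteristic strain of the source and $h_n$ is that of the detector noise; the function name and the flat toy curves are made up purely for illustration and don't correspond to any real detector or source.

```python
import numpy as np


def snr_from_characteristic_strain(freq, h_c, h_n):
    """Signal-to-noise ratio from source and noise characteristic strains.

    Integrates (h_c / h_n)^2 over ln(f) with the trapezoidal rule.
    """
    log_f = np.log(freq)
    integrand = (h_c / h_n) ** 2
    rho_squared = np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(log_f))
    return np.sqrt(rho_squared)


# Toy example (not a real detector): a source sitting a factor of 10
# above a flat noise curve across one decade in frequency.
freq = np.logspace(1, 2, 200)        # 10 Hz to 100 Hz
h_n = np.full_like(freq, 1e-22)      # noise characteristic strain
h_c = np.full_like(freq, 1e-21)      # source characteristic strain
rho = snr_from_characteristic_strain(freq, h_c, h_n)
print(f"Signal-to-noise ratio: {rho:.1f}")
```

The integral is over the squared ratio of the curves rather than literally the area between them, so treat the picture as a quick visual guide: the higher the source sits above the curve, and the wider the range of frequencies over which it does so, the louder the signal.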

The characteristic strain is handy for quickly working out how loud a signal is, but it’s not directly related to anything we measure. The noise in a detector is usually described by its power spectral density or PSD. This tells you how much wibbling there is on average. Actually, it tells you the average amount of wibbling squared. The square root of the PSD is the amplitude spectral density or ASD. This gives a handy indication of the sensitivity of your detector, which is actually related to what you measure.

Gravitational-wave sensitivity-curve plot using the square root of the power spectral density (the amplitude spectral density).
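Converting between these conventions is only a line or two of code. The sketch below assumes the standard relation for detector noise, $h_n(f) = \sqrt{f S_n(f)}$, where $S_n(f)$ is the PSD; the function names and the flat toy PSD are hypothetical and just there to show the bookkeeping.

```python
import numpy as np


def asd_from_psd(psd):
    """Amplitude spectral density (Hz^-1/2) from the PSD (Hz^-1)."""
    return np.sqrt(psd)


def noise_characteristic_strain(freq, psd):
    """Dimensionless noise characteristic strain, h_n = sqrt(f * S_n)."""
    return np.sqrt(freq * psd)


# Toy flat PSD (not a real detector) sampled at a few frequencies.
freq = np.logspace(1, 3, 5)          # 10 Hz to 1000 Hz
psd = np.full_like(freq, 1e-46)      # Hz^-1

asd = asd_from_psd(psd)
h_n = noise_characteristic_strain(freq, psd)
for f, a, h in zip(freq, asd, h_n):
    print(f"f = {f:7.1f} Hz: ASD = {a:.1e} Hz^-1/2, h_n = {h:.1e}")
```

Note the extra factor of $\sqrt{f}$: the same detector curve has a different shape (and different units) depending on which convention you plot, which is exactly how things get muddled.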

The PSD is tied to the detector, but isn’t too relevant to the actual waves. An interesting property of the waves is how much energy they carry. We talk about this in terms of the energy density, the energy per unit volume. Cosmologists love this, and to make things easy for themselves, they like to divide energy densities by the amount that would make the Universe flat. (If you’ve ever wondered what astrophysicists mean when they say the Universe is about 70% dark energy and about 25% dark matter, they’re using these quantities.) To make things even simpler, they like to multiply this quantity by something related to the Hubble constant (which measures the expansion rate of the Universe), as this means things don’t change if you tweak the numbers describing how the Universe evolves. What you’re left with is a quantity $\Omega h_{100}^2$ that is really convenient if you’re a cosmologist, but a pain for anyone else. It does have the advantage of making the pulsar timing arrays look more sensitive though.

Gravitational-wave sensitivity-curve plot using the energy density that cosmologists love. The proper name of the plotted quantity is the critical energy density per logarithmic frequency interval multiplied by the reduced Hubble constant squared. I prefer Bob.
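For anyone who wants to get from a characteristic strain to the cosmologists' quantity, here is a minimal Python sketch. It assumes the standard relation $\Omega h_{100}^2 = \frac{2\pi^2}{3 H_{100}^2} f^2 h_c^2$, where $H_{100} = 100~\mathrm{km\,s^{-1}\,Mpc^{-1}}$; the function names are my own, and the value used for a megaparsec is approximate.

```python
import math

MPC_IN_M = 3.0857e22                 # one megaparsec in metres (approximate)
H100 = 100.0e3 / MPC_IN_M            # 100 km/s/Mpc expressed in 1/s


def omega_h2_from_characteristic_strain(freq, h_c):
    """Energy density (times h_100^2) from characteristic strain.

    freq is in Hz and h_c is dimensionless.
    """
    return (2.0 * math.pi ** 2 / (3.0 * H100 ** 2)) * freq ** 2 * h_c ** 2


def omega_h2_from_psd(freq, psd):
    """Same quantity for detector noise, using h_n = sqrt(f * S_n)."""
    return omega_h2_from_characteristic_strain(freq, math.sqrt(freq * psd))


# Illustrative numbers only: the same characteristic strain at two frequencies.
for f in (1.0e-8, 1.0e2):
    value = omega_h2_from_characteristic_strain(f, 1.0e-15)
    print(f"f = {f:.0e} Hz, h_c = 1e-15: Omega h^2 = {value:.1e}")
```

The factor of $f^2$ is what flatters the pulsar timing arrays: at their very low frequencies, a given characteristic strain corresponds to a much smaller energy density than it would in the LIGO band.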

We hope that the paper will be useful for people (like us), who can never remember what the conventions are (and why). There’s nothing new (in terms of results) in this paper, but I think it’s the first time all this material has been collected together in one place. If you ever need to make a poster about gravitational waves, I know where you can find a good picture.

arXiv: 1408.0740 [gr-qc]
Journal: Classical & Quantum Gravity; 32(1):015014(25); 2015
Website: Gravitational Wave Sensitivity Curve Plotter
Procrastination score: TBC