Dirichlet Process Gaussian-mixture model: An application to localizing coalescing binary neutron stars with gravitational-wave observations

Where do gravitational waves like GW170817 come from? Using our network of detectors, we cannot pinpoint a source, but we can make a good estimate—the amplitude of the signal tells us about the distance; the time delay between the signal arriving at different detectors, and the relative amplitudes of the signal in different detectors, tell us about the sky position (see the excellent video by Leo Singer below).

In this paper we look at full three-dimensional localization of gravitational-wave sources; we import a (rather cunning) technique from computer vision to construct a probability distribution for the source’s location, and then explore how well we could localise a set of simulated binary neutron stars. Knowing the source location enables lots of cool science. First, it aids direct follow-up observations with non-gravitational-wave observatories, searching for electromagnetic or neutrino counterparts. It’s especially helpful if you can cross-reference with galaxy catalogues to find the most probable source locations (this technique was used to find the kilonova associated with GW170817). Even without finding a counterpart, knowing the most probable host galaxy helps us figure out how the source formed (have lots of stars been born recently, or are all the stars old?), and allows us to measure the expansion of the Universe. Having a reliable technique to reconstruct source locations is useful!

This was a fun paper to write [bonus note]. I’m sure it will be valuable, both for showing how to perform this type of reconstruction of a multi-dimensional probability density, and for its implications for source localization and follow-up of gravitational-wave signals. I go into details of both below, first discussing our statistical model (this is a bit technical), then looking at our results for a set of binary neutron stars (which have implications for hunting for counterparts).

Dirichlet process Gaussian mixture model

When we analyse gravitational-wave data to infer the source properties (location, masses, etc.), we map out parameter space with a set of samples: a list of points in the parameter space, with more samples around more probable locations and fewer in less probable ones. These samples encode everything about the probability distribution for the different parameters; we just need to extract it…

For our application, we want a nice smooth probability density. How do we convert a bunch of discrete samples to a smooth distribution? The simplest thing is to bin the samples. However, picking the right bin size is difficult, and becomes much harder in higher dimensions. Another popular option is to use kernel density estimation. This is better at ensuring smooth results, but you now have to worry about the size of your kernels.
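
To make the comparison concrete, here is a minimal sketch (not code from the paper) of the kernel density estimation approach, with mock three-dimensional posterior samples standing in for real ones:

# A fixed-kernel KDE, the baseline approach described above. The sample
# locations and scales here are made up for illustration.
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(42)
samples = rng.normal(loc=[100.0, -50.0, 20.0], scale=[15.0, 15.0, 25.0], size=(5000, 3))

kde = gaussian_kde(samples.T)          # scipy expects shape (n_dimensions, n_samples)
print(kde([100.0, -50.0, 20.0]))       # probability density at a trial location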

Our approach is in essence to use a kernel density estimate, but to learn the size and position of the kernels (as well as the number) from the data as an extra layer of inference. The “Gaussian mixture model” part of the name refers to the kernels—we use several different Gaussians. The “Dirichlet process” part refers to how we assign their properties (their means and standard deviations). What I really like about this technique, as opposed to the usual rule-of-thumb approaches used for kernel density estimation,  is that it is well justified from a theoretical point of view.

I hadn’t come across a Dirichlet process before. Section 2 of the paper is a walkthrough of how I built up an understanding of this mathematical object, and it contains lots of helpful references if you’d like to dig deeper.

In our application, you can think of the Dirichlet process as being a probability distribution for probability distributions. We want a probability distribution describing the source location. Given our samples, we infer what this looks like. We could put all the probability into one big Gaussian, or we could put it into lots of little Gaussians. The Gaussians could be wide or narrow or a mix. The Dirichlet distribution allows us to assign probabilities to each configuration of Gaussians; for example, if our samples are all in the northern hemisphere, we probably want Gaussians centred around there, rather than in the southern hemisphere.

The resulting probability distribution for the source location is quick to evaluate at any given point. This means we can rapidly produce a list of the most probable source galaxies—extremely handy if you need to know where to point a telescope before a kilonova fades away (or someone else finds it).
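
As an illustration of the idea (the paper uses its own implementation, linked below as 3d_volume), scikit-learn’s variational Dirichlet process Gaussian mixture can play the same role; the posterior samples and galaxy coordinates here are invented:

import numpy as np
from sklearn.mixture import BayesianGaussianMixture

rng = np.random.default_rng(0)
# Placeholder data: 3D posterior samples (Mpc) and candidate galaxy positions.
samples = rng.normal(loc=[100.0, -50.0, 20.0], scale=[15.0, 15.0, 25.0], size=(5000, 3))
galaxy_xyz = rng.uniform(low=-200.0, high=200.0, size=(1000, 3))

dpgmm = BayesianGaussianMixture(
    n_components=20,                     # upper bound; surplus components get ~zero weight
    weight_concentration_prior_type="dirichlet_process",
    covariance_type="full",
    max_iter=500,
).fit(samples)

# Rank candidate host galaxies by the log probability density at their positions.
log_density = dpgmm.score_samples(galaxy_xyz)
print(np.argsort(log_density)[::-1][:10])   # indices of the ten most probable hosts

The number of components is only an upper limit; the Dirichlet process prior switches off the kernels it does not need.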

Gravitational-wave localization

To verify our technique works, and to develop an intuition for three-dimensional localizations, we studied a set of simulated binary neutron star signals created for the First 2 Years trilogy of papers. This data set is now well studied; it illustrates performance in what we anticipated to be the first two observing runs of the advanced detectors, which turned out to be not too far from the truth. We have previously looked at three-dimensional localizations for these signals using a super rapid approximation.

The plots below show how well we could localise our binary neutron star sources. Specifically, the plots show the size of the volume which has a 90% probability of containing the source versus the signal-to-noise ratio (the loudness) of the signal. Typically, volumes are 10^4–10^5~\mathrm{Mpc}^3, which is about 10^{68}–10^{69} Olympic swimming pools. Such a volume would contain something like 100–1000 galaxies.

Volume versus signal-to-noise ratio

Localization volume as a function of signal-to-noise ratio. The top panel shows results for two-detector observations: the LIGO-Hanford and LIGO-Livingston (HL) network, similar to the first observing run, and the LIGO and Virgo (HLV) network, similar to the second observing run. The bottom panel shows all observations for the HLV network, including those with all three detectors, which are colour coded by the fraction of the total signal-to-noise ratio from Virgo. In both panels, there are fiducial lines scaling inversely with the sixth power of the signal-to-noise ratio. Adapted from Figure 4 of Del Pozzo et al. (2018).

Looking at the results in detail, we can learn a number of things:

  1. The localization volume is roughly inversely proportional to the sixth power of the signal-to-noise ratio [bonus note]. Loud signals are localized much better than quieter ones!
  2. The localization dramatically improves when we have three-detector observations. The extra detector improves the sky localization, which reduces the localization volume.
  3. To get the benefit of the extra detector, the source needs to be close enough that all the detectors could get a decent amount of the signal-to-noise ratio. In our case, Virgo is the least sensitive, and we see that the best localizations are when it has a fair share of the signal-to-noise ratio.
  4. Considering the cases where we only have two detectors, localization volumes get bigger at a given signal-to-noise ratio as the detectors get more sensitive. This is because we can detect sources at greater distances.

Putting all these bits together, I think in the future, when we have lots of detections, it would make most sense to prioritise following up the loudest signals. These are the best localised, and will also be the brightest since they are the closest, meaning there’s the greatest potential for actually finding a counterpart. As the sensitivity of the detectors improves, it’s only going to get more difficult to find a counterpart to a typical gravitational-wave signal, as sources will be further away and less well localized. However, having more sensitive detectors also means that we are more likely to have a really loud signal, which should be really well localized.

Banana vs cucumber

Left: Localization (yellow) with a network of two low-sensitivity detectors. The sky location is uncertain, but we know the source must be nearby. Right: Localization (green) with a network of three high-sensitivity detectors. We have good constraints on the source location, but it could now be at a much greater range of distances. Not to scale.

Using our localization volumes as a guide, you would only need to search one galaxy to find the true source in about 7% of cases with a three-detector network similar to that at the end of our second observing run. Similarly, in 23% of cases you would only need to search ten galaxies. It might be possible to get even better performance by considering which galaxies are most probable because they are the biggest or the most likely to produce merging binary neutron stars. This is definitely a good approach to follow.

Three-dimensional localization with galaxy catalogue

Galaxies within the 90% credible volume of an example simulated source, colour coded by probability. The galaxies are from the GLADE Catalog; incompleteness in the plane of the Milky Way causes the missing wedge of galaxies. The true source location is marked by a cross [bonus note]. Part of Figure 5 of Del Pozzo et al. (2018).

arXiv: 1801.08009 [astro-ph.IM]
Journal: Monthly Notices of the Royal Astronomical Society; 479(1):601–614; 2018
Code: 3d_volume
Buzzword bingo: Interdisciplinary (we worked with computer scientist Tom Haines); machine learning (the inference involving our Dirichlet process Gaussian mixture model); multimessenger astronomy (as our results are useful for following up gravitational-wave signals in the search for counterparts)

Bonus notes

Writing

We started writing this paper back before the first observing run of Advanced LIGO. We had a pretty complete draft on Friday 11 September 2015. We just needed to gather together a few extra numbers and polish up the figures and we’d be done! At 10:50 am on Monday 14 September 2015, we made our first detection of gravitational waves. The paper was put on hold. The pace of discoveries over the coming years meant we never quite found enough time to get it together—I’ve rewritten the introduction a dozen times. This is a shame, as it meant that this study came out much later than our other three-dimensional localization study, but it’s extremely satisfying to have it done at last. The delay has the advantage of justifying one of my favourite acknowledgement sections.

Sixth power

We find that the localization volume \Delta V is inversely proportional to the sixth power of the signal-to-noise ratio \varrho. This is what you would expect. The localization volume depends upon the angular uncertainty on the sky \Delta \Omega, the distance to the source D, and the distance uncertainty \Delta D,

\Delta V \sim D^2 \Delta \Omega \Delta D.

Typically, the uncertainty on a parameter (like the masses) scales inversely with the signal-to-noise ratio. This is the case for the logarithm of the distance, which means

\displaystyle \frac{\Delta D}{D} \propto \varrho^{-1}.

The uncertainty in the sky location (being two dimensional) scales inversely with the square of the signal-to-noise ratio,

\Delta \Omega \propto \varrho^{-2}.

The signal-to-noise ratio itself is inversely proportional to the distance to the source (sources further away are quieter). Therefore, putting everything together gives

\Delta V \propto \varrho^{-6}.
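
As a quick sanity check of this scaling (the fiducial numbers here are just for illustration), the volume shrinks rapidly as signals get louder:

# Toy illustration: relative to a signal with signal-to-noise ratio 12
# localized to an assumed 3e4 Mpc^3, volumes scale as the inverse sixth power.
V_ref, rho_ref = 3e4, 12.0
for rho in (12.0, 20.0, 30.0):
    print(rho, V_ref * (rho_ref / rho) ** 6)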

Treasure

We all know that treasure is marked by a cross. In the case of a binary neutron star merger, dense material ejected from the neutron stars will decay to heavy elements like gold and platinum, so there is definitely a lot of treasure at the source location.


Accuracy of inference on the physics of binary evolution from gravitational-wave observations

Gravitational-wave astronomy lets us observe binary black holes. These systems, being made up of two black holes, are pretty difficult to study by any other means. It has long been argued that with this new information we can unravel the mysteries of stellar evolution. Just as a palaeontologist can discover how long-dead animals lived from their bones, we can discover how massive stars lived by studying their black hole remnants. In this paper, we quantify how much we can really learn from this black hole palaeontology—after 1000 detections, we should pin down some of the most uncertain parameters in binary evolution to a few percent precision.

Life as a binary

There are many proposed ways of making a binary black hole. The current leading contender is isolated binary evolution: start with a binary star system (most stars are in binaries or higher multiples, our lonesome Sun is a little unusual), and let the stars evolve together. Only a fraction will end with black holes close enough to merge within the age of the Universe, but these would be the sources of the signals we see with LIGO and Virgo. We consider this isolated binary scenario in this work [bonus note].

Now, you might think that with stars being so fundamentally important to astronomy, and with binary stars being so common, we’d have the evolution of binaries figured out by now. It turns out it’s actually pretty messy, so there’s lots of work to do. We consider constraining four parameters which describe the bits of binary physics which we are currently most uncertain of:

  • Black hole natal kicks—the push black holes receive when they are born in supernova explosions. We know that neutron stars get kicks, but we’re less certain for black holes [bonus note].
  • Common envelope efficiency—one of the most intricate bits of physics about binaries is how mass is transferred between stars. As they start exhausting their nuclear fuel they puff up, so material from the outer envelope of one star may be stripped onto the other. In the most extreme cases, a common envelope may form, where so much mass is piled onto the companion that both stars live in a single fluffy envelope. Orbiting inside the envelope helps drag the two stars closer together, bringing them closer to merging. The efficiency determines how quickly the envelope becomes unbound, ending this phase.
  • Mass loss rates during the Wolf–Rayet (not to be confused with Wolf 359) and luminous blue variable phases–stars lose mass throughout their lives, but we’re not sure how much. For stars like our Sun, mass loss is low: there is enough to give us the aurora, but it doesn’t affect the Sun much. For bigger and hotter stars, mass loss can be significant. We consider two evolutionary phases of massive stars where mass loss is high, and currently poorly known. Mass could be lost in clumps, rather than a smooth stream, making it difficult to measure or simulate.

We use parameters describing potential variations in these properties as ingredients to the COMPAS population synthesis code. This rapidly (albeit approximately) evolves a population of stellar binaries to calculate which will produce merging binary black holes.

The question now is: which parameters affect our gravitational-wave measurements, and how accurately can we measure those which do?

Merger rate with redshift and chirp mass

Binary black hole merger rate at three different redshifts z as calculated by COMPAS. We show the rate in 30 different chirp mass bins for our default population parameters. The caption gives the total rate for all masses. Figure 2 of Barrett et al. (2018)

Gravitational-wave observations

For our deductions, we use two pieces of information we will get from LIGO and Virgo observations: the total number of detections, and the distributions of chirp masses. The chirp mass is a combination of the two black hole masses that is often well measured—it is the most important quantity for controlling the inspiral, so it is well measured for low mass binaries which have a long inspiral, but is less well measured for higher mass systems. In reality we’ll have much more information, so these results should be the minimum we can actually do.

We consider the population after 1000 detections. That sounds like a lot, but we should have collected this many detections after just 2 or 3 years observing at design sensitivity. Our default COMPAS model predicts 484 detections per year of observing time! Honestly, I’m a little scared about having this many signals…

For a set of population parameters (black hole natal kick, common envelope efficiency, luminous blue variable mass loss and Wolf–Rayet mass loss), COMPAS predicts the number of detections and the fraction of detections as a function of chirp mass. Using these, we can work out the probability of getting the observed number of detections and fraction of detections within different chirp mass ranges. This is the likelihood function: if a given model is correct, we are more likely to get results similar to its predictions than far from them, although we expect there to be some scatter.

If you like equations, the form of our likelihood is explained in this bonus note. If you don’t like equations, there’s one lurking in the paragraph below. Just remember that it can’t see you if you don’t move. It’s OK to skip the equation.

To determine how sensitive we are to each of the population parameters, we see how the likelihood changes as we vary these. The more the likelihood changes, the easier it should be to measure that parameter. We wrap this up in terms of the Fisher information matrix. This is defined as

\displaystyle F_{ij} = -\left\langle\frac{\partial^2\ln \mathcal{L}(\mathcal{D}|\left\{\lambda\right\})}{\partial \lambda_i \partial\lambda_j}\right\rangle,

where \mathcal{L}(\mathcal{D}|\left\{\lambda\right\}) is the likelihood for data \mathcal{D} (the number of observations and their chirp mass distribution in our case), \left\{\lambda\right\} are our parameters (natal kick, etc.), and the angular brackets indicate the average over the population parameters. In statistics terminology, this is the variance of the score, which I think sounds cool. The Fisher information matrix nicely quantifies how much information we can learn about the parameters, including the correlations between them (so we can explore degeneracies). The inverse of the Fisher information matrix gives a lower bound on the covariance matrix (the multidimensional generalisation of the variance in a normal distribution) for the parameters \left\{\lambda\right\}. In the limit of a large number of detections, we can use the Fisher information matrix to estimate the accuracy to which we measure the parameters [bonus note].
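
If you’d like to see the “variance of the score” statement in action, here is a toy check (not part of the paper’s analysis) using a simple Poisson counting experiment, where the Fisher information is known analytically:

# Toy check: for N ~ Poisson(lam * t), the variance of d(ln L)/d(lam) at the
# true rate equals the Fisher information t/lam, and the maximum-likelihood
# estimator N/t saturates the inverse-Fisher (Cramer-Rao) bound.
import numpy as np

lam, t = 484.0, 2.0                      # assumed rate per year and years of observing
rng = np.random.default_rng(1)
N = rng.poisson(lam * t, size=200_000)   # many repeated experiments

score = -t + N / lam                     # derivative of the log-likelihood at the true rate
fisher = t / lam                         # analytic Fisher information
print(score.var(), fisher)               # these two agree
print((N / t).var(), 1.0 / fisher)       # estimator variance matches the bound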

We simulated several populations of binary black hole signals, and then calculated measurement uncertainties for our four population parameters to see what we could learn from these measurements.

Results

Using just the rate information, we find that we can constrain a combination of the common envelope efficiency and the Wolf–Rayet mass loss rate. Increasing the common envelope efficiency ends the common envelope phase earlier, leaving the binary further apart. Wider binaries take longer to merge, so this reduces the merger rate. Similarly, increasing the Wolf–Rayet mass loss rate leads to wider binaries and smaller black holes, which take longer to merge through gravitational-wave emission. Since the two parameters have similar effects, they are anticorrelated. We can increase one and still get the same number of detections if we decrease the other. There’s a hint of a similar correlation between the common envelope efficiency and the luminous blue variable mass loss rate too, but it’s not quite significant enough for us to be certain it’s there.

Correlations between population parameters

Fisher information matrix estimates for fractional measurement precision of the four population parameters: the black hole natal kick \sigma_\mathrm{kick}, the common envelope efficiency \alpha_\mathrm{CE}, the Wolf–Rayet mass loss rate f_\mathrm{WR}, and the luminous blue variable mass loss rate f_\mathrm{LBV}. There is an anticorrelation between f_\mathrm{WR} and \alpha_\mathrm{CE}, and a hint of a similar anticorrelation between f_\mathrm{LBV} and \alpha_\mathrm{CE}. We show 1500 different realisations of the binary population to give an idea of scatter. Figure 6 of Barrett et al. (2018)

Adding in the chirp mass distribution gives us more information, and improves our measurement accuracies. The fractional uncertainties are about 2% for the two mass loss rates and the common envelope efficiency, and about 5% for the black hole natal kick. We’re less sensitive to the natal kick because the most massive black holes don’t receive a kick, and so are unaffected by the kick distribution [bonus note]. In any case, these measurements are exciting! With this type of precision, we’ll really be able to learn something about the details of binary evolution.

Standard deviation of measurements of population parameters

Measurement precision for the four population parameters after 1000 detections. We quantify the precision with the standard deviation estimated from the Fisher information matrix. We show results from 1500 realisations of the population to give an idea of scatter. Figure 5 of Barrett et al. (2018)

The accuracy of our measurements will improve (on average) with the square root of the number of gravitational-wave detections. So we can expect 1% measurements after about 4000 observations. However, we might be able to get even more improvement by combining constraints from other types of observation. Combining different types of observation can help break degeneracies. I’m looking forward to building a concordance model of binary evolution, and figuring out exactly how massive stars live their lives.

arXiv: 1711.06287 [astro-ph.HE]
Journal: Monthly Notices of the Royal Astronomical Society; 477(4):4685–4695; 2018
Favourite dinosaur: Professor Science

Bonus notes

Channel selection

In practice, we will need to worry about how binary black holes are formed, via isolated evolution or otherwise, before inferring the parameters describing binary evolution. This makes the problem more complicated. Some parameters, like mass loss rates or black hole natal kicks, might be common across multiple channels, while others are not. There are a number of ways we might be able to tell different formation mechanisms apart, such as by using spin measurements.

Kick distribution

We model the supernova kicks v_\mathrm{kick} as following a Maxwell–Boltzmann distribution,

\displaystyle p(v_\mathrm{kick}) = \sqrt{\frac{2}{\pi}}  \frac{v_\mathrm{kick}^2}{\sigma_\mathrm{kick}^3} \exp\left(\frac{-v_\mathrm{kick}^2}{2\sigma_\mathrm{kick}^2}\right),

where \sigma_\mathrm{kick} is the unknown population parameter. The natal kick received by the black hole v^*_\mathrm{kick} is not the same as this, however, as we assume some of the material ejected by the supernova falls back, reducing the overall kick. The final natal kick is

v^*_\mathrm{kick} = (1-f_\mathrm{fb})v_\mathrm{kick},

where f_\mathrm{fb} is the fraction that falls back, taken from Fryer et al. (2012). The fraction is greater for larger black holes, so the biggest black holes get no kicks. This means that the largest black holes are unaffected by the value of \sigma_\mathrm{kick}.
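
Here is a minimal sketch of this kick prescription; the fallback fractions below are random placeholders rather than the Fryer et al. (2012) values used in the paper:

# Draw supernova kicks from a Maxwell-Boltzmann distribution and scale them by
# the fallback fraction to get the natal kicks the black holes actually receive.
import numpy as np

sigma_kick = 250.0                       # km/s, the population parameter (assumed value)
rng = np.random.default_rng(2)

# A Maxwell-Boltzmann speed is the magnitude of an isotropic 3D Gaussian velocity.
v_kick = np.linalg.norm(rng.normal(scale=sigma_kick, size=(10_000, 3)), axis=1)

f_fb = rng.uniform(0.0, 1.0, size=v_kick.size)   # placeholder fallback fractions
v_natal = (1.0 - f_fb) * v_kick                  # kick received by the black hole
print(v_natal.mean())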

The likelihood

In this analysis, we have two pieces of information: the number of detections, and the chirp masses of the detections. The first is easy to summarise with a single number. The second is more complicated, and we consider the fraction of events within different chirp mass bins.

Our COMPAS model predicts the merger rate \mu and the probability of falling in each chirp mass bin p_k (we factor measurement uncertainty into this). Our observations are the total number of detections N_\mathrm{obs} and the number in each chirp mass bin c_k (N_\mathrm{obs} = \sum_k c_k). The likelihood is the probability of these observations given the model predictions. We can split the likelihood into two pieces, one for the rate, and one for the chirp mass distribution,

\mathcal{L} = \mathcal{L}_\mathrm{rate} \times \mathcal{L}_\mathrm{mass}.

For the rate likelihood, we need the probability of observing N_\mathrm{obs} given the predicted rate \mu. This is given by a Poisson distribution,

\displaystyle \mathcal{L}_\mathrm{rate} = \exp(-\mu t_\mathrm{obs}) \frac{(\mu t_\mathrm{obs})^{N_\mathrm{obs}}}{N_\mathrm{obs}!},

where t_\mathrm{obs} is the total observing time. For the chirp mass likelihood, we need the probability of getting the observed number of detections in each bin, given the predicted fractions. This is given by a multinomial distribution,

\displaystyle \mathcal{L}_\mathrm{mass} = \frac{N_\mathrm{obs}!}{\prod_k c_k!} \prod_k p_k^{c_k}.

These look a little messy, but they simplify when you take the logarithm, as we need to do for the Fisher information matrix.
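
For the curious, here is a small sketch of evaluating this likelihood, with made-up numbers standing in for the COMPAS predictions and the observed counts:

# log L = log L_rate + log L_mass: a Poisson term for the total number of
# detections and a multinomial term for the counts in each chirp-mass bin.
import numpy as np
from scipy.stats import multinomial, poisson

def log_likelihood(mu, p, t_obs, counts):
    n_obs = counts.sum()
    log_l_rate = poisson.logpmf(n_obs, mu * t_obs)
    log_l_mass = multinomial.logpmf(counts, n=n_obs, p=p)
    return log_l_rate + log_l_mass

counts = np.array([400, 450, 150])                  # invented counts in 3 bins
p_model = np.array([0.4, 0.45, 0.15])               # invented predicted fractions
print(log_likelihood(mu=484.0, p=p_model, t_obs=2.07, counts=counts))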

When we substitute our likelihood into the expression for the Fisher information matrix, we get

\displaystyle F_{ij} = \mu t_\mathrm{obs} \left[ \frac{1}{\mu^2} \frac{\partial \mu}{\partial \lambda_i} \frac{\partial \mu}{\partial \lambda_j}  + \sum_k\frac{1}{p_k} \frac{\partial p_k}{\partial \lambda_i} \frac{\partial p_k}{\partial \lambda_j} \right].

Conveniently, we only need to evaluate first-order derivatives, even though the Fisher information matrix is defined in terms of second derivatives. The expected number of events is \langle N_\mathrm{obs} \rangle = \mu t_\mathrm{obs}. Therefore, we can see that the measurement uncertainty defined by the inverse of the Fisher information matrix scales on average as N_\mathrm{obs}^{-1/2}.
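
A sketch of how one might evaluate this expression numerically, with a toy stand-in for the population-synthesis model and central finite differences for the derivatives (none of the dependences below are the real COMPAS ones), looks something like this:

# A toy stand-in for a population-synthesis prediction of the merger rate and
# the chirp-mass bin probabilities; the dependences are invented, not COMPAS's.
import numpy as np

def model(lam):
    kick, alpha_ce, f_wr, f_lbv = lam
    mu = 484.0 * alpha_ce / f_wr                      # made-up rate dependence
    raw = np.array([1.0 + 0.3 * alpha_ce,             # made-up chirp-mass bin weights
                    1.0 + 0.5 * f_wr,
                    1.0 + f_lbv,
                    1.0 + 0.001 * kick,
                    1.0])
    return mu, raw / raw.sum()

def fisher_matrix(lam, t_obs, eps=1e-4):
    """F_ij = mu t [ (1/mu^2) dmu_i dmu_j + sum_k (1/p_k) dp_ik dp_jk ]."""
    lam = np.asarray(lam, dtype=float)
    mu, p = model(lam)
    dmu = np.zeros(lam.size)
    dp = np.zeros((lam.size, p.size))
    for i in range(lam.size):
        step = np.zeros_like(lam)
        step[i] = eps * lam[i]
        mu_hi, p_hi = model(lam + step)
        mu_lo, p_lo = model(lam - step)
        dmu[i] = (mu_hi - mu_lo) / (2.0 * step[i])
        dp[i] = (p_hi - p_lo) / (2.0 * step[i])
    return mu * t_obs * (np.outer(dmu, dmu) / mu**2 + dp @ (dp / p).T)

lam = [250.0, 1.0, 1.0, 1.5]              # sigma_kick (km/s), alpha_CE, f_WR, f_LBV
F = fisher_matrix(lam, t_obs=2.07)        # ~2 years of observing gives ~1000 detections
sigma = np.sqrt(np.diag(np.linalg.inv(F)))
print(sigma / np.abs(lam))                # fractional measurement uncertainties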

For anyone worrying about using the likelihood rather than the posterior for these estimates, the high number of detections [bonus note] should mean that the information we’ve gained from the data overwhelms our prior, meaning that the shape of the posterior is dictated by the shape of the likelihood.

Interpretation of the Fisher information matrix

As an alternative way of looking at the Fisher information matrix, we can consider the shape of the likelihood close to its peak. Around the maximum likelihood point, the first-order derivatives of the likelihood with respect to the population parameters are zero (otherwise it wouldn’t be the maximum). The maximum likelihood values of N_\mathrm{obs} = \mu t_\mathrm{obs} and c_k = N_\mathrm{obs} p_k are the same as their expectation values. The second-order derivatives are given by the expression we have worked out for the Fisher information matrix. Therefore, in the region around the maximum likelihood point, the Fisher information matrix encodes all the relevant information about the shape of the likelihood.

So long as we are working close to the maximum likelihood point, we can approximate the distribution as a multidimensional normal distribution with its covariance matrix determined by the inverse of the Fisher information matrix. Our results for the measurement uncertainties are made subject to this approximation (which we did check was OK).

Approximating the likelihood this way should be safe in the limit of large N_\mathrm{obs}. As we get more detections, statistical uncertainties should reduce, with the peak of the distribution homing in on the maximum likelihood value, and its width narrowing. If you take the limit of N_\mathrm{obs} \rightarrow \infty, you’ll see that the distribution basically becomes a delta function at the maximum likelihood values. To check that our N_\mathrm{obs} = 1000 was large enough, we verified that higher-order derivatives were still small.

Michele Vallisneri has a good paper looking at using the Fisher information matrix for gravitational wave parameter estimation (rather than our problem of binary population synthesis). There is a good discussion of its range of validity. The high signal-to-noise ratio limit for gravitational wave signals corresponds to our high number of detections limit.

 

Science with the space-based interferometer LISA. V. Extreme mass-ratio inspirals

The space-based observatory LISA will detect gravitational waves from massive black holes (giant black holes residing in the centres of galaxies). One particularly interesting signal will come from the inspiral of a regular stellar-mass black hole into a massive black hole. These are called extreme mass-ratio inspirals (or EMRIs, pronounced emries, to their friends) [bonus note]. We have never observed such a system. This means that there’s a lot we have to learn about them. In this work, we systematically investigated the prospects for observing EMRIs. We found that even though there’s a wide range in predictions for what EMRIs we will detect, they should be a safe bet for the LISA mission.

EMRI spacetime

Artistic impression of the spacetime for an extreme-mass-ratio inspiral, with a smaller stellar-mass black hole orbiting a massive black hole. This image is mandatory when talking about extreme-mass-ratio inspirals. Credit: NASA

LISA & EMRIs

My previous post discussed some of the interesting features of EMRIs. Because of the extreme difference in masses of the two black holes, it takes a long time for them to complete their inspiral. We can measure tens of thousands of orbits, which allows us to make wonderfully precise measurements of the source properties (if we can accurately pick out the signal from the data). Here, we’ll examine exactly what we could learn with LISA from EMRIs [bonus note].

First we build a model to investigate how many EMRIs there could be.  There is a lot of astrophysics which we are currently uncertain about, which leads to a large spread in estimates for the number of EMRIs. Second, we look at how precisely we could measure properties from the EMRI signals. The astrophysical uncertainties are less important here—we could get a revolutionary insight into the lives of massive black holes.

The number of EMRIs

To build a model of how many EMRIs there are, we need a few different inputs:

  1. The population of massive black holes
  2. The distribution of stellar clusters around massive black holes
  3. The range of orbits of EMRIs

We examine each of these in turn, building a more detailed model than has previously been constructed for EMRIs.

We currently know little about the population of massive black holes. This means we’ll discover lots when we start measuring signals (yay), but it’s rather inconvenient now, when we’re trying to predict how many EMRIs there are (boo). We take two different models for the mass distribution of massive black holes. One is based upon a semi-analytic model of massive black hole formation; the other is at the pessimistic end allowed by current observations. The semi-analytic model predicts massive black hole spins around 0.98, but we also consider spins being uniformly distributed between 0 and 1, and spins of 0. This gives us a picture of the bigger black hole; now we need the smaller.

Observations show that the masses of massive black holes are correlated with their surrounding cluster of stars—bigger black holes have bigger clusters. We consider four different versions of this trend: Gültekin et al. (2009); Kormendy & Ho (2013); Graham & Scott (2013), and Shankar et al. (2016). The stars and black holes about a massive black hole should form a cusp, with the density of objects increasing towards the massive black hole. This is great for EMRI formation. However, the cusp is disrupted if two galaxies (and their massive black holes) merge. This tends to happen—it’s how we get bigger galaxies (and black holes). It then takes some time for the cusp to reform, during which time, we don’t expect as many EMRIs. Therefore, we factor in the amount of time for which there is a cusp for massive black holes of different masses and spins.

Colliding galaxies

That’s a nice galaxy you have there. It would be a shame if it were to collide with something… Hubble image of The Mice. Credit: ACS Science & Engineering Team.

Given a cusp about a massive black hole, we then need to know how often an EMRI forms. Simulations give us a starting point. However, these only consider a snap-shot, and we need to consider how things evolve with time. As stellar-mass black holes inspiral, the massive black hole will grow in mass and the surrounding cluster will become depleted. Both these effects are amplified because for each inspiral, there’ll be many more stars or stellar-mass black holes which will just plunge directly into the massive black hole. We therefore need to limit the number of EMRIs so that we don’t have an unrealistically high rate. We do this by adding in a couple of feedback factors, one to cap the rate so that we don’t deplete the cusp quicker than new objects will be added to it, and one to limit the maximum amount of mass the massive black hole can grow from inspirals and plunges. This gives us an idea for the total number of inspirals.

Finally, we calculate the orbits that EMRIs will be on. We again base this upon simulations, and factor in how the spin of the massive black hole affects the distribution of orbital inclinations.

Putting all the pieces together, we can calculate the population of EMRIs. We now need to work out how many LISA would be able to detect. This means we need models for the gravitational-wave signal. Since we are simulating a large number, we use a computationally inexpensive analytic model. We know that this isn’t too accurate, but we consider two different options for setting the end of the inspiral (where the smaller black hole finally plunges) which should bound the true range of results.

Number of detected EMRIs

Number of EMRIs for different size massive black holes in different astrophysical models. M1 is our best estimate, the others explore variations on this. M11 and M12 are designed to cover the extremes, being the most pessimistic and optimistic combinations. The solid and dashed lines are for two different signal models (AKK and AKS), which are designed to give an indication of potential variation. They agree where the massive black hole is not spinning (M10 and M11). The range of masses is similar for all models, as it is set by the sensitivity of LISA. We can detect higher mass systems assuming the AKK signal model as it includes extra inspiral close to highly spinning black holes: for the heaviest black holes, this is the only part of the signal at high enough frequency to be detectable. Figure 8 of Babak et al. (2017).

Allowing for all the different uncertainties, we find that there should be somewhere between 1 and 4200 EMRIs detected per year. (The model we used when studying transient resonances predicted about 250 per year, albeit with a slightly different detector configuration, which is fairly typical of all the models we consider here). This range is encouraging. The lower end means that EMRIs are a pretty safe bet, we’d be unlucky not to get at least one over the course of a multi-year mission (LISA should have at least four years observing). The upper end means there could be lots—we might actually need to worry about them forming a background source of noise if we can’t individually distinguish them!

EMRI measurements

Having shown that EMRIs are a good LISA source, we now need to consider what we could learn by measuring them.

We estimate the precision with which we will be able to measure parameters using the Fisher information matrix. The Fisher matrix measures how sensitive our observations are to changes in the parameters (the more sensitive we are, the better we should be able to measure that parameter). It gives a lower bound on the actual measurement uncertainty, and should approximate it well in the high signal-to-noise (loud signal) limit. The combination of our use of the Fisher matrix and our approximate signal models means our results will not be perfect estimates of real performance, but they should give an indication of the typical size of measurement uncertainties.

Given that we measure a huge number of cycles from the EMRI signal, we can make really precise measurements of the mass and spin of the massive black hole, as these parameters control the orbital frequencies. Below are plots for the typical measurement precision from our Fisher matrix analysis. The orbital eccentricity is measured to similar accuracy, as it influences the range of orbital frequencies too. We also get pretty good measurements of the mass of the smaller black hole, as this sets how quickly the inspiral proceeds (how quickly the orbital frequencies change). EMRIs will allow us to do precision astronomy!

EMRI redshifted mass measurements

Distribution of (one standard deviation) fractional uncertainties for measurements of the  massive black hole (redshifted) mass M_z. Results are shown for the different astrophysical models, and for the different signal models.  The astrophysical model has little impact on the uncertainties. M4 shows a slight difference as it assumes heavier stellar-mass black holes. The results with the two signal models agree when the massive black hole is not spinning (M10 and M11). Otherwise, measurements are more precise with the AKK signal model, as this includes extra signal from the end of the inspiral. Part of Figure 11 of Babak et al. (2017).

EMRI spin measurements

Distribution of (one standard deviation) uncertainties for measurements of the massive black hole spin a. The results mirror those for the masses above. Part of Figure 11 of Babak et al. (2017).

Now, before you get too excited that we’re going to learn everything about massive black holes, there is one confession I should make. In the plot above I show the measurement accuracy for the redshifted mass of the massive black hole. The cosmological expansion of the Universe causes gravitational waves to become stretched to lower frequencies in the same way light is (this makes visible light more red, hence the name). The measured frequency is f_z = (1 + z)f where f is the frequency emitted, and z is the redshift (z= 0 for a nearby source, and is larger for further away sources). Lower frequency gravitational waves correspond to higher mass systems, so it is often convenient to work with the redshifted mass, the mass corresponding to the signal you measure if you ignore redshifting. The redshifted mass of the massive black hole is M_z = (1+z)M where M is the true mass. To work out the true mass, we need the redshift, which means we need to measure the distance to the source.
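
To make the redshifting concrete, here is a minimal sketch of the conversion, assuming an astropy Planck15 cosmology and invented measured values:

# Invert the measured luminosity distance for the redshift, then convert the
# measured (redshifted) mass into the source-frame (true) mass.
import astropy.units as u
from astropy.cosmology import Planck15, z_at_value

D_L = 3.0 * u.Gpc                    # measured luminosity distance (assumed)
M_z = 1.0e6 * u.Msun                 # measured redshifted mass (assumed)

z = z_at_value(Planck15.luminosity_distance, D_L)
M = M_z / (1.0 + z)                  # true, source-frame mass
print(z, M)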

EMRI luminosity distance measurement

Distribution of (one standard deviation) fractional uncertainties for measurements of the luminosity distance D_\mathrm{L}. The signal model is not as important here, as the uncertainty only depends on how loud the signal is. Part of Figure 12 of Babak et al. (2017).

The plot above shows the fractional uncertainty on the distance. We don’t measure this too well, as it is determined from the amplitude of the signal, rather than its frequency components. The situation is much as for LIGO. The larger uncertainties on the distance will dominate the overall uncertainty on the black hole masses. We won’t be getting all these to fractions of a percent. However, that doesn’t mean we can’t still figure out what the distribution of masses looks like!

One of the really exciting things we can do with EMRIs is check that the signal matches our expectations for a black hole in general relativity. Since we get such an excellent map of the spacetime of the massive black hole, it is easy to check for deviations. In general relativity, everything about the black hole is fixed by its mass and spin (often referred to as the no-hair theorem). Using the measured EMRI signal, we can check if this is the case. One convenient way of doing this is to describe the spacetime of the massive object in terms of a multipole expansion. The first (most important) term gives the mass, and the next term the spin. The third term (the quadrupole) is set by the first two, so if we can measure it, we can check if it is consistent with the expected relation. We estimated how precisely we could measure a deviation in the quadrupole. Fortunately, for this consistency test, all factors from redshifting cancel out, so we can get really detailed results, as shown below. Using EMRIs, we’ll be able to check for really small differences from general relativity!
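
For reference (this is the standard Kerr relation rather than the paper’s exact parametrisation), in geometric units (G = c = 1) the quadrupole moment of a black hole with mass M and dimensionless spin a is

\mathcal{Q}_\mathrm{Kerr} = -a^2 M^3,

so a measured quadrupole differing from this value would signal a departure from the black holes of general relativity.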

EMRI measurement of bumpy black hole spacetime

Distribution of (one standard deviation) uncertainties for deviations in the quadrupole moment of the massive object spacetime \mathcal{Q}. Results are similar to the mass and spin measurements. Figure 13 of Babak et al. (2017).

In summary: EMRIs are awesome. We’re not sure how many we’ll detect with LISA, but we’re confident there will be some, perhaps a couple of hundred per year. From the signals we’ll get new insights into the masses and spins of black holes. This should tell us something about how they, and their surrounding galaxies, evolved. We’ll also be able to do some stringent tests of whether the massive objects are black holes as described by general relativity. It’s all pretty exciting for when LISA launches, which is currently planned for around 2034…

Sometimes, it leads to very little, and it seems like it's not worth it, and you wonder why you waited so long for something so disappointing

One of the most valuable traits a student or soldier can have: patience. Credit: Sony/Marvel

arXiv: 1703.09722 [gr-qc]
Journal: Physical Review D; 95(10):103012; 2017
Conference proceedings: 1704.00009 [astro-ph.GA] (from when work was still in-progress)
Estimated number of Marvel films before LISA launch: 48 (starting with Ant-Man and the Wasp)

Bonus notes

Hyphenation

Is it “extreme-mass-ratio inspiral”, “extreme mass-ratio inspiral” or “extreme mass ratio inspiral”? All are used in the literature. This is one of the advantages of using “EMRI”. The important thing is that we’re talking about inspirals that have a mass ratio which is extreme. For this paper, we used “extreme mass-ratio inspiral”, but when I first started my PhD, I was introduced to “extreme-mass-ratio inspirals”, so they are forever stuck that way in my mind.

I think hyphenation is a bit of an art, and there’s no definitive answer here, just like there isn’t for superhero names, where you can have Iron Man, Spider-Man or Iceman.

Science with LISA

This paper is part of a series looking at what LISA could tell us about different gravitational wave sources. So far, this series covers

  1. Massive black hole binaries
  2. Cosmological phase transitions
  3. Standard sirens (for measuring the expansion of the Universe)
  4. Inflation
  5. Extreme-mass-ratio inspirals

You’ll notice there’s a change in the name of the mission from eLISA to LISA part-way through, as things have evolved. (Or devolved?) I think the main take-away so far is that the cosmology group is the most enthusiastic.

Importance of transient resonances in extreme-mass-ratio inspirals

Extreme-mass-ratio inspirals (EMRIs for short) are a promising source for the planned space-borne gravitational-wave observatory LISA. To detect and analyse them we need accurate models for the signals, which are exquisitely intricate. In this paper, we investigated a feature, transient resonances, which has not previously been included in our models. They are difficult to incorporate, but can have a big impact on the signal. Fortunately, we find that we can still detect the majority of EMRIs, even without including resonances. Phew!

EMRIs and orbits

EMRIs are a beautiful gravitational wave source. They occur when a stellar-mass black hole slowly inspirals into a massive black hole (as found in the centre of galaxies). The massive black hole can be tens of thousands or millions of times more massive than the stellar-mass black hole (hence extreme mass ratio). This means that the inspiral is slow—we can potentially measure tens of thousands of orbits. This is both the blessing and the curse of EMRIs. The huge number of cycles means that we can closely follow the inspiral, and build a detailed map of the massive black hole’s spacetime. EMRIs will give us precision measurements of the properties of massive black holes. However, to do this, we need to be able to find the EMRI signals in the data; we need models which can match the signals over all these cycles. Analysing EMRIs is a huge challenge.

 

EMRI orbits are complicated. At any moment, the orbit can be described by three orbital frequencies: one for radial (in/out) motion \Omega_r, one for polar (north/south if we think of the spin of the massive black hole like the rotation of the Earth) motion \Omega_\theta and one for axial (around in the east/west direction) motion \Omega_\phi. As gravitational waves are emitted, and the orbit shrinks, these frequencies evolve. The animation above, made by Steve Drasco, illustrates the evolution of an EMRI. Every so often, we can see the pattern freeze—the orbit stays in a constant shape (although this still rotates). This is a transient resonance. Two of the orbital frequencies become commensurate (so we might have 3 north/south cycles and 2 in/out cycles over the same period [bonus note])—this is the resonance. However, because the frequencies are still evolving, we don’t stay locked like this forever—which is why the resonance is transient. To calculate an EMRI, you need to know how the orbital frequencies evolve.

The evolution of an EMRI is slow—the time taken to inspiral is much longer than the time taken to complete one orbit. Therefore, we can usually split the problem of calculating the trajectory of an EMRI into two parts. On short timescales, we can consider orbits as having fixed frequencies. On long timescales, we can calculate the evolution by averaging over many orbits. You might see the problem with this—around resonances, this averaging breaks down. Whereas normally averaging over many orbits means averaging over a complicated trajectory that hits pretty much all possible points in the orbital range, on resonance, you just average over the same bit again and again. On resonance, terms which usually average to zero can become important. Éanna Flanagan and Tanja Hinderer first pointed out that around resonances the usual scheme (referred to as the adiabatic approximation) doesn’t work.

A non-resonant orbit

A non-resonant EMRI orbit in three dimensions (left) and two dimensions (right), ignoring the rotation in the axial direction. A non-resonant orbit will eventually fill the r\theta plane. Credit: Rob Cole

A 2:3 resonance

For comparison, a resonant EMRI orbit. A 2:3 resonance traces the same parts of the r\theta plane over and over. Credit: Rob Cole

Around a resonance, the evolution will be enhanced or decreased a little relative to the standard adiabatic evolution. We get a kick. This is only small, but because we observe EMRIs for so many orbits, a small difference can grow to become a significant difference later on. Does this mean that we won’t be able to detect EMRIs with our standard models? This was a concern, so back at the end of my PhD I began to investigate [bonus note]. The first step is to understand the size of the kick.

Jump for 2:3 resonance

A jump in the orbital energy across a 2:3 resonance. The plot shows the difference between the approximate adiabatic evolution and the instantaneous evolution including the resonance. The thickness of the blue line is from oscillations on the orbital timescale which is too short to resolve here. The dotted red line shows the fitted size of the jump. Time is measured in terms of the resonance time \tau_\mathrm{res} which is defined below. Figure 4 of Berry et al. (2016).

Resonance kicks

If there were no gravitational waves, the orbit would not evolve, it would be fixed. The orbit could then be described by a set of constants of motion. The most commonly used when describing orbits about black holes are the energy, angular momentum and Carter constant. For the purposes of this blog, we’ll not worry too much about what these constants are, we’ll just consider some constant I.

The resonance kick is a change in this constant \Delta I. What should this depend on? There are three ingredients. First, the rate of change of this constant F on the resonant orbit. Second, the time spent on resonance \tau_\mathrm{res}. The bigger these are, the bigger the size of the jump. Therefore,

|\Delta I| \propto F \tau_\mathrm{res}.

However, the jump could be positive or negative. This is where the third ingredient comes in: the relative phase of the radial and polar motion [bonus note]—for example, do they both reach their maximum point at the same time, or does one lag behind the other? We’ll call this relative phase q. By varying q we can get our resonant trajectory to go through any possible point in space. Therefore, averaging over q should get us back to the adiabatic approximation: the average value of \Delta I must be zero. To complete our picture for the jump, we need a periodic function of the phase,

\Delta I = F \tau_\mathrm{res} f(q),

with \langle f(q) \rangle_q = 0. Now we know the form of the jump, we can try to figure out what each of the pieces is.
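
As a toy picture (the real f(q) comes from the involved calculation described below, not a single sine), you can see how a phase-dependent jump averages to zero:

# Made-up numbers for the rate of change and resonance time, with f(q) = sin(q)
# standing in for the true phase dependence.
import numpy as np

F_rate, tau_res = 1e-6, 1e4
q = np.linspace(0.0, 2.0 * np.pi, 1000, endpoint=False)
delta_I = F_rate * tau_res * np.sin(q)

print(delta_I.min(), delta_I.max())    # individual jumps can be negative or positive
print(delta_I.mean())                  # averaging over the phase gives (almost exactly) zero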

The rate of change F is proportional to the mass ratio \eta \ll 1: the smaller the stellar-mass black hole is relative to the massive one, the smaller F is. The exact details depend upon gravitational self-force calculations, which we’ll skip over, as they’re pretty hard, but they are the same for all orbits (resonant or not).

We can think of the resonance timescale either as the time for the orbital frequencies to drift apart or the time for the orbit to start filling the space again (so that it’s safe to average). The two pictures yield the same answer—there’s a fuller explanation in Section III A of the paper. To define the resonance timescale, it is useful to define the frequency \Omega = n_r \Omega_r - n_\theta \Omega_\theta, which is zero exactly on resonance. If this is evolving at rate \dot{\Omega}, then the resonance timescale is

\displaystyle \tau_\mathrm{res} = \left[\frac{2\pi}{\dot{\Omega}}\right]^{1/2}.

This bridges the two timescales that usually define EMRIs: the short orbital timescale T and the long evolution timescale \tau_\mathrm{ev}:

T \sim \eta^{1/2} \tau_\mathrm{res} \sim \eta \tau_\mathrm{ev}.
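
Plugging in some illustrative numbers (an orbital period of an hour and a mass ratio of 10^{-5}, both chosen purely for the sake of example) shows how the resonance timescale sits between the two:

# T ~ eta^(1/2) tau_res ~ eta tau_ev, so tau_res ~ T / sqrt(eta) and tau_ev ~ T / eta.
import numpy as np

eta = 1e-5                       # mass ratio (assumed)
T_orb = 3600.0                   # s, orbital timescale (assumed)

tau_res = T_orb / np.sqrt(eta)   # resonance timescale
tau_ev = T_orb / eta             # inspiral (evolution) timescale
print(T_orb, tau_res, tau_ev)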

To find the form of f(q), we need to do some quite involved maths (given in Appendix B of the paper) [bonus note]. This works by treating the evolution far from resonance as depending upon two independent times (effectively defining T and \tau_\mathrm{ev}), and then matching the evolution close to resonance using an expansion in terms of a different time (something like \tau_\mathrm{res}). The solution shows that the jump depends sensitively upon the phase q at resonance, which makes jumps extremely difficult to calculate.

We numerically evaluated the size of kicks for different orbits and resonances. We found a number of trends. First, higher-order resonances (those with larger n_r and n_\theta) have smaller jumps than lower-order ones. This makes sense, as higher-order resonances come closer to covering all the points in the space, and so are more like averaging over the entire space. Second, jumps are larger for higher eccentricity orbits. This also makes sense, as you can’t have resonances for circular (zero eccentricity) orbits, as there’s no radial frequency, so the size of the jumps must tend to zero. We’ll see that these two points are important when it comes to observational consequences of transient resonances.

Astrophysical EMRIs

Now we’ve figured out the impact of passing through a transient resonance, let’s look at what this means for detecting EMRIs. The jump can mean that the evolution post-resonance can soon become out of phase with that pre-resonance. We can’t match both parts with the same adiabatic template. This could significantly hamper our prospects for detection, as we’re limited to the bits of signal we can pick up between resonances.

We created an astrophysical population of simulated EMRIs. We used numerical simulations to estimate a plausible population of massive black holes and distribution of stellar-mass black holes inspiralling into them. We then used adiabatic models to see how many LISA (or eLISA as it was called at the time) could potentially detect. We found there were ~510 EMRIs detectable (with a signal-to-noise ratio of 15 or above) for a two-year mission.

We then calculated how much the signal-to-noise ratio would be reduced by passing through transient resonances. The plot below shows the distribution of signal-to-noise ratio for the original population, ignoring resonances, and then after factoring in the reduction. There are now ~490 detectable EMRIs, a loss of 4%. We can still detect the majority of EMRIs!

Signal-to-noise ratio distribution

Distribution of signal-to-noise ratios for EMRIs. In blue (solid outline), we have the results ignoring transient resonances. In orange (dashed outline), we have the distribution including the reduction due to resonance jumps. Events falling below 15 are deemed to be undetectable. Figure 10 of Berry et al. (2016).

We were worried about the impact of transient resonances: we know that jumps can cause EMRIs to become undetectable, so why aren’t we seeing a big effect in our population? The answer lies in the trends we saw earlier. Jumps are large for low-order resonances with high eccentricities. These were the ones first highlighted, as they are obviously the most important. However, low-order resonances are only encountered really close to the massive black hole. This means late in the inspiral, after we have already accumulated lots of signal-to-noise ratio. Losing a little bit of signal right at the end doesn’t hurt detectability too much. On top of this, gravitational wave emission efficiently damps down eccentricity. Orbits typically have low eccentricities by the time they hit low-order resonances, meaning that the jumps are actually quite small. Although small jumps lead to some mismatch, we can still use our signal templates without jumps. Therefore, resonances don’t hamper us (too much) in finding EMRIs!

This may seem like a happy ending, but it is not the end of the story. While we can detect EMRIs, we still need to be able to accurately infer their source properties. Features not included in our signal templates (like jumps), could bias our results. For example, it might be that we can better match a jump by using a template for a different black hole mass or spin. However, if we include jumps, these extra features could give us extra precision in our measurements. The question of what jumps could mean for parameter estimation remains to be answered.

arXiv: 1608.08951 [gr-qc]
Journal: Physical Review D; 94(12):124042(24); 2016
Conference proceedings: 1702.05481 [gr-qc] (only 2 pages—ideal for emergency journal club presentations)
Favourite jumpers: Woolly, Mario, Kangaroos

Bonus notes

Radial and polar only

When discussing resonances, and their impact on orbital evolution, we’ll only care about \Omega_r\Omega_\theta resonances. Resonances with \Omega_\phi are not important because the spacetime is axisymmetric. The equations are exactly identical for all values of the axial angle \phi, so it doesn’t matter where you are (or if you keep cycling over the same spot) for the evolution of the EMRI.

This, however, doesn’t mean that \Omega_\phi resonances aren’t interesting. They can lead to small kicks to the binary, because you are preferentially emitting gravitational waves in one direction. For EMRIs these are negligibly small, but for more equal mass systems, they could have some interesting consequences as pointed out by Maarten van de Meent.

Extra time

I’m grateful to the Cambridge Philosophical Society for giving me some extra funding to work on resonances. If you’re a Cambridge PhD student, make sure to become a member so you can take advantage of the opportunities they offer.

Calculating jumps

The theory of how to evolve through a transient resonance was developed by Kevorkian and coauthors. I spent a long time studying these calculations before working up the courage to attempt them myself. There are a few technical details which need to be adapted for the case of EMRIs. I finally figured everything out while in Warsaw Airport, coming back from a conference. It was the most I had ever felt like a real physicist.

No you won't

Transient resonances remind me of Spirographs. Thanks Frinkiac

Hierarchical analysis of gravitational-wave measurements of binary black hole spin–orbit misalignments

Gravitational waves allow us to infer the properties of binary black holes (two black holes in orbit about each other), but can we use this information to figure out how the black holes and the binary form? In this paper, we show that measurements of the black holes’ spins can help us figure this out, but probably not until we have at least 100 detections.

Black hole spins

Black holes are described by their masses (how much they bend spacetime) and their spins (how much they drag spacetime to rotate about them). The orientation of the spins relative to the orbit of the binary could tell us something about the history of the binary [bonus note].

We considered four different populations of spin–orbit alignments to see if we could tell them apart with gravitational-wave observations:

  1. Aligned—matching the idealised example of isolated binary evolution. This stands in for the case where misalignments are small, which might be the case if material blown off during a supernova ends up falling back and being swallowed by the black hole.
  2. Isotropic—matching the expectations for dynamically formed binaries.
  3. Equal misalignments at birth—this would be the case if the spins and orbit were aligned before the second supernova, which then tilted the plane of the orbit. (As the binary inspirals, the spins wobble around, so the two misalignment angles won’t always be the same).
  4. Both spins misaligned by supernova kicks, assuming that the stars were aligned with the orbit before exploding. This gives a more general scatter of unequal misalignments, but typically the primary (bigger and first forming) black hole is more misaligned.

These give a selection of possible spin alignments. For each, we assumed that the spin magnitude was the same and had a value of 0.7. This seemed like a sensible idea when we started this study [bonus note], but is now towards the upper end of what we expect for binary black holes.

Hierarchical analysis

To measure the properties of the population we need to perform a hierarchical analysis: there are two layers of inference, one for the individual binaries, and one for the population.

From a gravitational wave signal, we infer the properties of the source using Bayes’ theorem. Given the data d_\alpha, we want to know the probability that the parameters \mathbf{\Theta}_\alpha have different values, which is written as p(\mathbf{\Theta}_\alpha|d_\alpha). This is calculated using

\displaystyle p(\mathbf{\Theta}_\alpha|d_\alpha) = \frac{p(d_\alpha | \mathbf{\Theta}_\alpha) p(\mathbf{\Theta}_\alpha)}{p(d_\alpha)},

where p(d_\alpha | \mathbf{\Theta}_\alpha) is the likelihood, which we can calculate from our knowledge of the noise in our gravitational wave detectors, p(\mathbf{\Theta}_\alpha) is the prior on the parameters (what we would have guessed before we had the data), and the normalisation constant p(d_\alpha) is called the evidence. We’ll use the evidence again in the next layer of inference.

Our prior on the parameters should actually depend upon what we believe about the astrophysical population. It is different if we believed that Model 1 were true (when we’d only consider aligned spins) than for Model 2. Therefore, we should really write

\displaystyle p(\mathbf{\Theta}_\alpha|d_\alpha, \lambda) = \frac{p(d_\alpha | \mathbf{\Theta}_\alpha,\lambda) p(\mathbf{\Theta}_\alpha|\lambda)}{p(d_\alpha|\lambda)},

where  \lambda denotes which model we are considering.

This is an important point to remember: if you are using our LIGO results to test your theory of binary formation, you need to remember to correct for our choice of prior. We try to pick non-informative priors—priors that don’t make strong assumptions about the physics of the source—but this doesn’t mean that they match what would be expected from your model.

We are interested in the probability distribution for the different models: how many binaries come from each. Given a set of different observations \{d_\alpha\}, we can work this out using another application of Bayes’ theorem (yay)

\displaystyle p(\mathbf{\lambda}|\{d_\alpha\}) = \frac{p(\{d_\alpha\} | \mathbf{\lambda}) p(\mathbf{\lambda})}{p(\{d_\alpha\})},

where p(\{d_\alpha\} | \mathbf{\lambda}) is just all the evidences for the individual events (given that model) multiplied together, p(\mathbf{\lambda}) is our prior for the different models, and p(\{d_\alpha\}) is another normalisation constant.
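
If you like to see things concretely, here is a minimal sketch of this kind of hierarchical calculation (not our actual pipeline). It treats the population as a mixture of the four models, uses made-up per-event evidences Z[i, k] = p(d_i|model k), puts a flat Dirichlet prior on the mixing fractions, and weights prior draws by the mixture likelihood:

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up per-event evidences Z[i, k] = p(d_i | model k); in practice these
# come from the parameter-estimation runs for each detected binary.
n_events, n_models = 40, 4
Z = rng.lognormal(mean=0.0, sigma=1.0, size=(n_events, n_models))

# Flat Dirichlet prior on the mixing fractions f (non-negative, summing to 1).
n_draws = 100_000
fractions = rng.dirichlet(np.ones(n_models), size=n_draws)

# Mixture likelihood for each draw: prod_i sum_k f_k Z[i, k], in log space.
log_like = np.log(fractions @ Z.T).sum(axis=1)

# Weight the prior draws by the likelihood to get the posterior on the fractions.
weights = np.exp(log_like - log_like.max())
weights /= weights.sum()
print("Posterior mean fraction from each model:", weights @ fractions)
```

With real evidences in place of the random numbers, the spread of these weighted fractions is what tightens up as more events are added.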

Now that we know how to go from a set of observations to the probability distribution for the different channels, let’s give it a go!

Results

To test our approach, we made a set of mock gravitational-wave measurements. We generated signals from binaries for each of our four models, and analysed these as we would for real signals (using LALInference). This is rather computationally expensive, and we wanted a large set of events to analyse, so using these results as a guide, we created a larger catalogue of approximate distributions for the inferred source parameters p(\mathbf{\Theta}_\alpha|d_\alpha). We then fed these through our hierarchical analysis. The GIF below shows how measurements of the fraction of binaries from each population tighten up as we get more detections: the true fraction is marked in blue.

Fraction of binaries from each of the four models

Probability distribution for the fraction of binaries from each of our four spin misalignment populations for different numbers of observations. The blue dot marks the true fraction: an equal fraction from all four channels.

The plot shows that we do zoom in towards the true fraction of events from each model as the number of events increases, but there are significant degeneracies between the different models. Notably, it is difficult to tell apart Models 1 and 3, as both have strong support for both spins being nearly aligned. Similarly, there is a degeneracy between Models 2 and 4 as both allow for the two spins to have very different misalignments (and for the primary spin, which is the better measured one, to be quite significantly misaligned).

This means that we should be able to distinguish aligned from misaligned populations (we estimated that as few as 5 events would be needed to distinguish the case that all events came from either Model 1  or Model 2 if those were the only two allowed possibilities). However, it will be more difficult to distinguish different scenarios which only lead to small misalignments from each other, or disentangle whether there is significant misalignment due to big supernova kicks or because binaries are formed dynamically.

The uncertainty on the fraction of events from each model scales roughly inversely with the square root of the number of observations, so it may be slow progress making these measurements. I’m not sure whether we’ll know the answer to how binary black holes form, or who will sit on the Iron Throne first.

arXiv: 1703.06873 [astro-ph.HE]
Journal: Monthly Notices of the Royal Astronomical Society; 471(3):2801–2811; 2017
Birmingham science summary: Hierarchical analysis of gravitational-wave measurements of binary black hole spin–orbit misalignment (by Simon)
If you like this you might like: Farr et al. (2017), Talbot & Thrane (2017), Vitale et al. (2017), Trifirò et al. (2016), Minogue (2000)

Bonus notes

Spin misalignments and formation histories

If you have two stars forming in a binary together, you’d expect them to be spinning in roughly the same direction, rotating the same way as they go round in their orbit (like our Solar System). This is because they all formed from the same cloud of swirling gas and dust. Furthermore, if two stars are to form a black hole binary that we can detect gravitational waves from, they need to be close together. This means that there can be tidal forces which gently tug the stars to align their rotation with the orbit. As they get older, stars puff up, meaning that if you have a close-by neighbour, you can share outer layers. This transfer of material will tend to align the rotation too. Adding this all together, if you have an isolated binary of stars, you might expect that when they collapse down to become black holes, their spins are aligned with each other and the orbit.

Unfortunately, real astrophysics is rarely so clean. Even if the stars were initially rotating the same way as each other, that doesn’t mean that their black hole remnants will do the same. This depends upon how the star collapses. Massive stars explode as supernovae, blasting off their outer layers while their cores collapse down to form black holes. Escaping material could carry away angular momentum, meaning that the black hole is spinning in a different direction to its parent star, or material could be blasted off asymmetrically, giving the new black hole a kick. This would change the plane of the binary’s orbit, misaligning the spins.

Alternatively, the binary could be formed dynamically. Instead of two stars living their lives together, we could have two stars (or black holes) come close enough together to form a binary. This is likely to happen in regions where there’s a high density of stars, such as a globular cluster. In this case, since the binary has been randomly assembled, there’s no reason for the spins to be aligned with each other or the orbit. For dynamically assembled binaries, all spin–orbit misalignments are equally probable.

Slow and steady

This project was led by Simon Stevenson. It was one of the first things we started working on at the beginning of his PhD. He has now graduated, and is off to start a new exciting life as a postdoc in Australia. We got a little distracted by other projects, most notably analysing the first detections of gravitational waves. Simon spent a lot of time developing the COMPAS population code, a code to simulate the evolution of binaries. Looking back, it’s impressive how far he’s come. This paper used a simple approximation to estimate the masses of our black holes: we called it the Post-it note model, as we wrote it down on a single Post-it. Now Simon’s writing papers including the complexities of common-envelope evolution in order to explain LIGO’s actual observations.

Going the distance: Mapping host galaxies of LIGO and Virgo sources in three dimensions using local cosmography and targeted follow-up

GW150914 claimed the title of many firsts—it was the first direct observation of gravitational waves, the first observation of a binary black hole system, the first observation of two black holes merging, the first time we’ve tested general relativity in such extreme conditions… However, there are still many firsts for gravitational-wave astronomy yet to come (hopefully, some to be accompanied by cake). One of the most sought after is the first signal to have a clear electromagnetic counterpart—a glow in some part of the spectrum of light (from radio to gamma-rays) that we can observe with telescopes.

Identifying a counterpart is challenging, as it is difficult to accurately localise a gravitational-wave source. Electromagnetic observers must cover a large area of sky before any counterparts fade. Then, if something is found, it can be hard to determine if it is from the same source as the gravitational waves, or something else…

To aid the search, it helps to have as much information as possible about the source. Especially useful is the distance to the source. This can help you plan where to look. For nearby sources, you can cross-reference with galaxy catalogues, and perhaps pick out the biggest galaxies as the most likely locations for the source [bonus note]. Distance can also help plan your observations: you might want to start with regions of the sky where the source would be closer and so easier to spot, or you may want to prioritise points where it is further and so you’d need to observe longer to detect it (I’m not sure there’s a best strategy; it depends on the telescope and the amount of observing time available). In this paper we describe a method to provide easy-to-use distance information, which could be supplied to observers to help their search for a counterpart.

Going the distance

This work is the first spin-off from the First 2 Years trilogy of papers, which looked at sky localization and parameter estimation for binary neutron stars in the first two observing runs of the advanced-detector era. Binary neutron star coalescences are prime candidates for electromagnetic counterparts as we think there should be an explosion as they merge. I was heavily involved in the last two papers of the trilogy, but this study was led by Leo Singer: I think I mostly annoyed Leo by being a stickler when it came to writing up the results.

3D localization with the two LIGO detectors

Three-dimensional localization showing the 20%, 50%, and 90% credible levels for a typical two-detector early Advanced LIGO event. The Earth is shown at the centre, marked by \oplus. The true location is marked by the cross. Leo poetically described this as looking like the seeds of the jacaranda tree, and less poetically as potato chips. Figure 1 of Singer et al. (2016).

The idea is to provide a convenient means of sharing a 3D localization for a gravitational-wave source. The full probability distribution is rather complicated, but it can be made more manageable if you break it up into pixels on the sky. Since astronomers need to decide where to point their telescopes, breaking up the 3D information along different lines of sight should be useful for them.

Each pixel covers a small region of the sky, and along each line of sight, the probability distribution for distance D can be approximated using an ansatz

\displaystyle p(D|\mathrm{data}) \propto D^2\exp\left[-\frac{(D - \mu)^2}{2\sigma^2}\right],

where \mu and \sigma are calculated for each pixel individually. The form of this ansatz can be understood as the posterior probability distribution being proportional to the product of the prior and the likelihood. Our prior is that sources are uniformly distributed in volume, which gives the \propto D^2 factor, and the likelihood can often be well approximated as a Gaussian distribution, which gives the other piece [bonus note].
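
As a concrete illustration, here is a small sketch evaluating the ansatz along a single line of sight (the \mu and \sigma values are made up; in practice they are read per pixel from the localization data products, which also store the probability for each patch of sky):

```python
import numpy as np

def distance_ansatz(D, mu, sigma):
    """Unnormalised ansatz p(D|data) proportional to D^2 exp[-(D - mu)^2 / (2 sigma^2)]."""
    return D**2 * np.exp(-0.5 * ((D - mu) / sigma) ** 2)

# Hypothetical values for a single sky pixel (distances in Mpc).
mu, sigma = 180.0, 45.0
D = np.linspace(0.0, 500.0, 5001)
dD = D[1] - D[0]

p = distance_ansatz(D, mu, sigma)
p /= p.sum() * dD  # normalise numerically on the grid

# e.g. probability that the source lies within 200 Mpc along this line of sight
print(p[D <= 200.0].sum() * dD)
```

Normalising numerically like this is enough for quick estimates, such as deciding how deep to observe along a given line of sight.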

The ansatz doesn’t always fit perfectly, but it performs well on average. Considering the catalogue of binary neutron star signals used in the earlier papers, we find that roughly 50% of the time sources are found within the 50% credible volume, 90% are found in the 90% volume, etc. We looked at a more sophisticated means of constructing the localization volume in a companion paper.

The 3D localization is easy to calculate, and Leo has worked out a cunning way to evaluate the ansatz with BAYESTAR, our rapid sky-localization code, meaning that we can produce it on minute time-scales. This means that observers should have something to work with straight-away, even if we’ll need to wait a while for the full, final results. We hope that this will improve prospects for finding counterparts—some potential examples are sketched out in the penultimate section of the paper.

If you are interested in trying out the 3D information, there is a data release and the supplement contains a handy Python tutorial. We are hoping that the Collaboration will use the format for alerts for LIGO and Virgo’s upcoming observing run (O2).

arXiv: 1603.07333 [astro-ph.HE]; 1605.04242 [astro-ph.IM]
Journal: Astrophysical Journal Letters; 829(1):L15(7); 2016; Astrophysical Journal Supplement Series; 226(1):10(8); 2016
Data release: Going the distance
Favourite crisp flavour: Salt & vinegar
Favourite jacaranda: Jacaranda mimosifolia

Bonus notes

Catalogue shopping

The Event’s source has a luminosity distance of around 250–570 Mpc. This is sufficiently distant that galaxy catalogues are incomplete and not much use when it comes to searching. GW151226 and LVT151012 have similar problems, being at around the same distance or even further.

The gravitational-wave likelihood

For the professionals interested in understanding more about the shape of the likelihood, I’d recommend Cutler & Flanagan (1994). This is a fantastic paper which contains many clever things [bonus bonus note]. This work is really the foundation of gravitational-wave parameter estimation. From it, you can see how the likelihood can be approximated as a Gaussian. The uncertainty can then be evaluated using Fisher matrices. Many studies have been done using Fisher matrices, but it is important to check that this is a valid approximation, as nicely explained in Vallisneri (2008). I ran into a case where it didn’t hold during my PhD.

Mergin’

As a reminder that smart people make mistakes, Cutler & Flanagan have a typo in the title of the arXiv posting of their paper. This is probably the most important thing to take away from this paper.

Parameter estimation on gravitational waves from neutron-star binaries with spinning components

In gravitational-wave astronomy, some parameters are easier to measure than others. We are sensitive to properties which change the form of the wave, but sometimes the effect of changing one parameter can be compensated by changing another. We call this a degeneracy. In signals from coalescing binaries (two black holes or neutron stars inspiralling together), there is a degeneracy between the masses and spins. In this recently published paper, we look at what this means for observing binary neutron star systems.

History

This paper has been something of an albatross, and I’m extremely pleased that we finally got it published. I started working on it when I began my post-doc at Birmingham in 2013. Back then I was sharing an office with Ben Farr, and together with others in the Parameter Estimation Group, we were thinking about the prospect of observing binary neutron star signals (which we naively thought were the most likely) in LIGO’s first observing run.

One reason that this work took so long is that binary neutron star signals can be computationally expensive to analyse [bonus note]. The signal slowly chirps up in frequency, and can take up to a minute to sweep through the range of frequencies LIGO is sensitive to. That gives us a lot of gravitational wave to analyse. (For comparison, GW150914 lasted 0.2 seconds). We need to calculate waveforms to match to the observed signals, and these can be especially complicated when accounting for the effects of spin.

A second reason is that shortly after submitting the paper in August 2015, we got a little distracted…

This paper was the third of a trilogy looking at measuring the properties of binary neutron stars. I’ve written about the previous instalment before. We knew that getting the final results for binary neutron stars, including all the important effects like spin, would take a long time, so we planned to follow up any detections in stages. A probable sky location can be computed quickly, then we can have a first try at estimating other parameters like masses using waveforms that don’t include spin, then we go for the full results with spin. The quicker results would be useful for astronomers trying to find any explosions that coincided with the merger of the two neutron stars. The first two papers looked at results from the quicker analyses (especially at sky localization); in this one we check what effect neglecting spin has on measurements.

What we did

We analysed a population of 250 binary neutron star signals (these are the same as the ones used in the first paper of the trilogy). We used what was our best guess for the sensitivity of the two LIGO detectors in the first observing run (which was about right).

The simulated neutron stars all have small spins of less than 0.05 (where 0 is no spin, and 1 would be the maximum spin of a black hole). We expect neutron stars in these binaries to have spins in about this range. The maximum observed spin (for a neutron star not in a binary neutron star system) is around 0.4, and we think neutron stars should break apart for spins of 0.7. However, since we want to keep an open mind regarding neutron stars, when measuring spins we considered spins all the way up to 1.

What we found

Our results clearly showed the effect of the mass–spin degeneracy. The degeneracy increases the uncertainty for both the spins and the masses.

Even though the true spins are low, we find that across the 250 events, the median 90% upper limit on the spin of the more massive (primary) neutron star is 0.70, and the 90% limit on the less massive (secondary) neutron star is 0.86. We learn practically nothing about the spin of the secondary, but a little more about the spin of the primary, which is more important for the inspiral. Measuring spins is hard.
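
For reference, a 90% upper limit like these is just the 90th percentile of the posterior samples. A toy sketch with made-up samples:

```python
import numpy as np

rng = np.random.default_rng(1)
# Stand-in posterior samples for a spin magnitude in [0, 1].
spin_samples = rng.beta(1.1, 2.0, size=8000)
print("90% upper limit on the spin:", round(np.percentile(spin_samples, 90), 2))
```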

The effect of the mass–spin degeneracy for mass measurements is shown in the plot below. Here we show a random selection of events. The banana-shaped curves are the 90% probability intervals. They are narrow because we can measure a particular combination of masses, the chirp mass, really well. The mass–spin degeneracy determines how long the banana is. If we restrict the range of spins, we explore less of the banana (and potentially introduce an offset in our results).

Neutron star mass distributions

Rough outlines of the 90% credible regions for component masses for a random assortment of signals. The circles show the true values. The coloured lines indicate the extent of the distribution with different limits on the spins. The grey area is excluded by our convention on masses m_1 \geq m_2. Figure 5 from Farr et al. (2016).

Although you can’t see it in the plot above, including spin also increases the uncertainty in the chirp mass. The plots below show the standard deviation (a measure of the width of the posterior probability distribution), divided by the mean, for several mass parameters. This gives a measure of the fractional uncertainty in our measurements. We show the chirp mass \mathcal{M}_\mathrm{c}, the mass ratio q = m_2/m_1 and the total mass M = m_1 + m_2, where m_1 and m_2 are the masses of the primary and secondary neutron stars respectively. The uncertainties are smaller for louder signals (higher signal-to-noise ratio). If we neglect the spin, the true chirp mass can lie outside the posterior distribution: on average it is about 5 standard deviations from the mean, but if we include spin, the offset is just 0.7 standard deviations from the mean (there’s still some offset as we’re allowing for spins all the way up to 1).

Mass measurements for binary neutron stars with and without spin

Fractional statistical uncertainties in chirp mass (top), mass ratio (middle) and total mass (bottom) estimates as a function of network signal-to-noise ratio for both the fully spinning analysis and the quicker non-spinning analysis. The lines indicate approximate power-law trends to guide the eye. Figure 2 of Farr et al. (2016).

We need to allow for spins when measuring binary neutron star masses in order to explore the possible range of masses.

Sky localization and distance, however, are not affected by the spins here. This might not be the case for sources which are more rapidly spinning, but assuming that binary neutron stars do have low spin, we are safe using the easier-to-calculate results. This is good news for astronomers who need to know promptly where to look for explosions.

arXiv: 1508.05336 [astro-ph.HE]
Journal: Astrophysical Journal; 825(2):116(10); 2016
Authorea [bonus note]: Parameter estimation on gravitational waves from neutron-star binaries with spinning components
Conference proceedings: Early Advanced LIGO binary neutron-star sky localization and parameter estimation
Favourite albatross: Wilbur

Bonus notes

How long?

The plot below shows how long it took to analyse each of the binary neutron star signals.

Run time for different analyses of binary neutron stars

Distribution of run times for binary neutron star signals. Low-latency sky localization is done with BAYESTAR; medium-latency non-spinning parameter estimation is done with LALInference and TaylorF2 waveforms, and high-latency fully spinning parameter estimation is done with LALInference and SpinTaylorT4 waveforms. The LALInference results are for 2000 posterior samples. Figure 9 from Farr et al. (2016).

BAYESTAR provides a rapid sky localization, taking less than ten seconds. This is handy for astronomers who want to catch a flash caused by the merger before it fades.

Estimates for the other parameters are computed with LALInference. How long this takes to run depends on which waveform you are using and how many samples from the posterior probability distribution you want (the more you have, the better you can map out the shape of the distribution). Here we show times for 2000 samples, which is enough to get a rough idea (we collected ten times more for GW150914 and friends). Collecting twice as many samples takes (roughly) twice as long. Prompt results can be obtained with a waveform that doesn’t include spin (TaylorF2); these take about a day at most.

For this work, we considered results using a waveform which included the full effects of spin (SpinTaylorT4). These take about twenty times longer than the non-spinning analyses. The maximum time was 172 days. I have a strong suspicion that the computing time cost more than my salary.

Gravitational-wave arts and crafts

Waiting for LALInference runs to finish gives you some time to practise hobbies. This is a globe knitted by Hannah. The two LIGO sites are marked in red, and a typical gravitational-wave sky localization is stitched on.

In order to get these results, we had to add check-pointing to our code, so we could stop it and restart it; we encountered a new type of error in the software which manages jobs running on our clusters, and Hannah Middleton and I got several angry emails from cluster admins (who are wonderful people) for having too many jobs running.

In comparison, analysing GW150914, LVT151012 and GW151226 was a breeze. Grudgingly, I have to admit that getting everything sorted out for this study made us reasonably well prepared for the real thing. Although, I’m not looking forward to that first binary neutron star signal…

Authorea

Authorea is an online collaborative writing service. It allows people to work together on documents, editing text, adding comments, and chatting with each other. By the time we came to write up the paper, Ben was no longer in Birmingham, and many of our coauthors are scattered across the globe. Ben thought Authorea might be useful for putting together the paper.

Writing was easy, and the ability to add comments on the text was handy for getting feedback from coauthors. The chat was good for quickly sorting out issues like plots. Overall, I was quite pleased, up to the point we wanted to get the final document. Extracting a nicely formatted PDF was awkward. For this I switched to using the GitHub back-end. On reflection, a simple git repo, plus a couple of Skype calls, might have been a smoother way of writing, at least for a standard journal article.

Authorea promises to be an open way of producing documents, and allows for others to comment on papers. I don’t know if anyone’s looked at our Authorea article. For astrophysics, most people use the arXiv, which is free to everyone, and I’m not sure if there’s enough appetite for interaction (beyond the occasional email to authors) to motivate people to look elsewhere. At least, not yet.

In conclusion, I think Authorea is a nice idea, and I would try out similar collaborative online writing tools again, but I don’t think I can give it a strong recommendation for your next paper unless you have a particular idea in mind of how to make the most of it.

Testing general relativity using golden black-hole binaries

Binary black hole mergers are the ultimate laboratory for testing gravity. The gravitational fields are strong, and things are moving at close to the speed of light. These extreme conditions are exactly where we expect our theories could break down, which is why we were so excited by detecting gravitational waves from black hole coalescences. To accompany the first detection of gravitational waves, we performed several tests of Einstein’s theory of general relativity (it passed). This paper outlines the details of one of the tests, one that can be extended to include future detections to put Einstein’s theory to the toughest scrutiny.

One of the difficulties of testing general relativity is deciding what to compare it to. There are many alternative theories of gravity, but only a few of these have been studied thoroughly enough to give a concrete idea of what a binary black hole merger should look like. Even if general relativity comes out on top when compared to one alternative model, it doesn’t mean that another (perhaps one we’ve not thought of yet) can be ruled out. We need ways of looking for something odd, something which hints that general relativity is wrong, but doesn’t rely on any particular alternative theory of gravity.

The test suggested here is a consistency test. We split the gravitational-wave signal into two pieces, a low frequency part and a high frequency part, and then try to measure the properties of the source from the two parts. If general relativity is correct, we should get answers that agree; if it’s not, and there’s some deviation in the exact shape of the signal at different frequencies, we can get different answers. One way of thinking about this test is imagining that we have two experiments, one where we measure lower frequency gravitational waves and one where we measure higher frequencies, and we are checking to see if their results agree.

To split the waveform, we use a frequency around that of the last stable circular orbit: about the point that the black holes stop orbiting about each other and plunge together and merge [bonus note]. For GW150914, we used 132 Hz, which is about the same as the C an octave below middle C (a little before time zero in the simulation below). This cut roughly splits the waveform into the low frequency inspiral (where the two black hole are orbiting each other), and the higher frequency merger (where the two black holes become one) and ringdown (where the final black hole settles down).

We are fairly confident that we understand what goes on during the inspiral. This is similar physics to where we’ve been testing gravity before, for example by studying the orbits of the planets in the Solar System. The merger and ringdown are more uncertain, as we’ve never before probed these strong and rapidly changing gravitational fields. It therefore seems like a good idea to check the two independently [bonus note].

We use our parameter estimation codes on the two pieces to infer the properties of the source, and we compare the values for the mass M_f and spin \chi_f of the final black hole. We could use other sets of parameters, but this pair compactly sums up the properties of the final black hole and is easy to explain. We look at the difference between the estimated values for the mass and spin, \Delta M_f and \Delta \chi_f. If general relativity is a good match to the observations, then we expect everything to match up, and \Delta M_f and \Delta \chi_f to be consistent with zero. They won’t be exactly zero because we have noise in the detector, but hopefully zero will be within the uncertainty region [bonus note]. An illustration of the test is shown below, including one of the tests we did to show that it does spot when general relativity is not correct.

Consistency test results

Results from the consistency test. The top panels show the outlines of the 50% and 90% credible levels for the low frequency (inspiral) part of the waveform, the high frequency (merger–ringdown) part, and the entire (inspiral–merger–ringdown, IMR) waveform. The bottom panel shows the fractional difference between the high and low frequency results. If general relativity is correct, we expect the distribution to be consistent with (0,0), indicated by the cross (+). The left panels show a general relativity simulation, and the right panel shows a waveform from a modified theory of gravity. Figure 1 of Ghosh et al. (2016).
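
To make the bottom panel of a plot like this concrete, here is a rough sketch of the calculation using mock Gaussian posteriors for the two measurements (the real analysis uses full posterior samples and more careful density estimation, so treat this only as an illustration of the idea):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 20_000

# Mock posteriors for the final mass (solar masses) and spin from the two analyses.
Mf_insp, chi_insp = rng.normal(68.0, 4.0, n), rng.normal(0.67, 0.06, n)
Mf_post, chi_post = rng.normal(66.0, 6.0, n), rng.normal(0.65, 0.09, n)

# Fractional differences between the two measurements (as in the bottom panels).
dMf = (Mf_insp - Mf_post) / (0.5 * (Mf_insp + Mf_post))
dchi = (chi_insp - chi_post) / (0.5 * (chi_insp + chi_post))

# Credible level at which (0, 0) sits: probability mass in histogram bins whose
# density exceeds the density at the origin.
H, xedges, yedges = np.histogram2d(dMf, dchi, bins=60, density=True)
ix = np.searchsorted(xedges, 0.0) - 1
iy = np.searchsorted(yedges, 0.0) - 1
cell_area = (xedges[1] - xedges[0]) * (yedges[1] - yedges[0])
credible_level = H[H > H[ix, iy]].sum() * cell_area
print(f"(0, 0) lies at about the {100 * credible_level:.0f}% credible level")
```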

A convenient feature of using \Delta M_f and \Delta \chi_f to test agreement with relativity, is that you can combine results from multiple observations. By averaging over lots of signals, you can reduce the uncertainty from noise. This allows you to pin down whether or not things really are consistent, and spot smaller deviations (we could get precision of a few percent after about 100 suitable detections). I look forward to seeing how this test performs in the future!

arXiv: 1602.02453 [gr-qc]
Journal: Physical Review D; 94(2):021101(6); 2016
Favourite golden thing: Golden syrup sponge pudding

Bonus notes

Review

I became involved in this work as a reviewer. The LIGO Scientific Collaboration is a bit of a stickler when it comes to checking its science. We had to check that the test was coded up correctly, that the results made sense, and that calculations done and written up for GW150914 were all correct. Since most of the team are based in India [bonus note], this involved some early morning telecons, but it all went smoothly.

One of our checks was that the test wasn’t sensitive to the exact frequency used to split the signal. If you change the frequency cut, the results from the two sections do change. If you lower the frequency, then there’s less of the low frequency signal and the measurement uncertainties from this piece get bigger. Conversely, there’ll be more signal in the high frequency part and so we’ll make a more precise measurement of the parameters from this piece. However, the overall results where you combine the two pieces stay about the same. You get the best results when there’s a roughly equal balance between the two pieces, but you don’t have to worry about getting the cut exactly on the innermost stable orbit.

Golden binaries

In order for the test to work, we need the two pieces of the waveform to both be loud enough to allow us to measure parameters using them. These types of signals are referred to as golden. Earlier work on tests of general relativity using golden binaries has been done by Hughes & Menou (2005), and Nakano, Tanaka & Nakamura (2015). GW150914 was a golden binary, but GW151226 and LVT151012 were not, which is why we didn’t repeat this test for them.

GW150914 results

For The Event, we ran this test, and the results are consistent with general relativity being correct. The plots below show the estimates for the final mass and spin (here denoted a_f rather than \chi_f), and the fractional difference between the two measurements. The point (0,0) is at the 28% credible level. This means that if general relativity is correct, we’d expect a deviation at least this large to occur around 72% of the time due to noise fluctuations. It wouldn’t take a particularly rare realisation of noise for the assumed true value of (0,0) to be found at this probability level, so we’re not too suspicious that something is amiss with general relativity.

GW150914 consistency test results

Results from the consistency test for The Event. The top panels show the final mass and spin measurements from the low frequency (inspiral) part of the waveform, the high frequency (post-inspiral) part, and the entire (IMR) waveform. The bottom panel shows the fractional difference between the high and low frequency results. If general relativity is correct, we expect the distribution to be consistent with (0,0), indicated by the cross. Figure 3 of the Testing General Relativity Paper.

The authors

Abhirup Ghosh and Archisman Ghosh were two of the leads of this study. They are both A. Ghosh at the same institution, which caused some confusion when compiling the LIGO Scientific Collaboration author list. I think at one point one of them (they can argue which) was removed as someone thought there was a mistaken duplication. To avoid confusion, they now have their full names used. This is a rare distinction on the Discovery Paper (I’ve spotted just two others). The academic tradition of using first initials plus second name is poorly adapted to names which don’t fit the typical western template, so we should be more flexible.

Inference on gravitational waves from coalescences of stellar-mass compact objects and intermediate-mass black holes

I love collecting things, there’s something extremely satisfying about completing a set. I suspect that this is one of the alluring features of Pokémon—you’ve gotta catch ’em all. The same is true of black hole hunting. Currently, we know of stellar-mass black holes which are a few times the mass of our Sun, up to a few tens of the mass of our Sun (the black holes of GW150914 are the biggest yet to be observed), and we know of supermassive black holes, which are ten thousand to ten billion times the mass our Sun. However, we are missing intermediate-mass black holes which lie in the middle. We have Charmander and Charizard, but where is Charmeleon? The elusive ones are always the most satisfying to capture.

Knitted black hole

Adorable black hole (available for adoption). I’m sure this could be a Pokémon. It would be a Dark type. Not that I’ve given it that much thought…

Intermediate-mass black holes have evaded us so far. We’re not even sure that they exist, although their absence would raise questions about how you end up with the supermassive ones (you can’t just feed the stellar-mass ones lots of rare candy). Astronomers have suggested that you could spot intermediate-mass black holes in globular clusters by the impact of their gravity on the motion of other stars. However, this effect would be small, and near impossible to conclusively spot. Another way (which I’ve discussed before) would be to look at ultraluminous X-ray sources, which could be from a disc of material spiralling into the black hole. However, it’s difficult to be certain that we understand the source properly and that we’re not misclassifying it. There could be one sure-fire way of identifying intermediate-mass black holes: gravitational waves.

The frequency of gravitational waves depends upon the mass of the binary: more massive systems produce lower frequencies. LIGO is sensitive to the right range of frequencies for stellar-mass black holes. GW150914 chirped up to the pitch of a guitar’s open B string (just below middle C). Supermassive black holes produce gravitational waves at too low a frequency for LIGO (a space-based detector would be perfect for these). We might just be able to detect signals from intermediate-mass black holes with LIGO.
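
To get a feel for the numbers, the gravitational-wave frequency at the innermost stable circular orbit of a non-spinning (Schwarzschild) black hole scales inversely with the total mass, roughly 4400 Hz times (M_Sun/M). This is only a rough guide (spins and the merger–ringdown shift things), but it shows why intermediate-mass systems sit near the bottom of LIGO’s band and supermassive ones fall below it:

```python
import math

G = 6.674e-11    # m^3 kg^-1 s^-2
c = 2.998e8      # m s^-1
MSUN = 1.989e30  # kg

def f_isco_gw(total_mass_msun):
    """GW frequency (Hz) at the Schwarzschild ISCO, roughly where the signal ends."""
    return c**3 / (6**1.5 * math.pi * G * total_mass_msun * MSUN)

for m in (3, 65, 300, 1e6):
    print(f"M = {m:>9} Msun -> f_ISCO ~ {f_isco_gw(m):10.4f} Hz")
```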

In a recent paper, a group of us from Birmingham looked at what we could learn from gravitational waves from the coalescence of an intermediate-mass black hole and a stellar-mass black hole [bonus note].  We considered how well you would be able to measure the masses of the black holes. After all, to confirm that you’ve found an intermediate-mass black hole, you need to be sure of its mass.

The signals are extremely short: we only can detect the last bit of the two black holes merging together and settling down as a final black hole. Therefore, you might think there’s not much information in the signal, and we won’t be able to measure the properties of the source. We found that this isn’t the case!

We considered a set of simulated signals, and analysed these with our parameter-estimation code [bonus note]. Below are a couple of plots showing the accuracy to which we can infer two different mass parameters for binaries of different masses. We show the accuracy of measuring the chirp mass \mathcal{M} (a much beloved combination of the two component masses which we are usually able to pin down precisely) and the total mass M_\mathrm{total}.

Measurement of chirp mass

Measured chirp mass for systems of different total masses. The shaded regions show the 90% credible interval and the dashed lines show the true values. The mass ratio q is the mass of the stellar-mass black hole divided by the mass of the intermediate-mass black hole. Figure 1 of Haster et al. (2016).

Measurement of total mass

Measured total mass for systems of different total masses. The shaded regions show the 90% credible interval and the dashed lines show the true values. Figure 2 of Haster et al. (2016).

For the lower mass systems, we can measure the chirp mass quite well. This is because we get a little information from the part of the gravitational wave from when the two components are inspiralling together. However, we see less and less of this as the mass increases, and we become more and more uncertain of the chirp mass.

The total mass isn’t as accurately measured as the chirp mass at low masses, but we see that the accuracy doesn’t degrade at higher masses. This is because we get some constraints on its value from the post-inspiral part of the waveform.

We found that the transition from having better fractional accuracy on the chirp mass to having better fractional accuracy on the total mass happened when the total mass was around 200–250 solar masses. This was assuming final design sensitivity for Advanced LIGO. We currently don’t have as good sensitivity at low frequencies, so the transition will happen at lower masses: GW150914 is actually in this transition regime (the chirp mass is measured a little better).

Given our uncertainty on the masses, when can we conclude that there is an intermediate-mass black hole? If we classify black holes with masses of more than 100 solar masses as intermediate mass, then we’ll be able to claim a discovery with 95% probability if the source has a black hole of at least 130 solar masses. The plot below shows our inferred probability of there being an intermediate-mass black hole as we increase the black hole’s mass (there’s little chance of falsely identifying a lower mass black hole).

Intermediate-mass black hole probability

Probability that the larger black hole is over 100 solar masses (our cut-off mass for intermediate-mass black holes M_\mathrm{IMBH}). Figure 7 of Haster et al. (2016).
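
Computationally, this probability is simple: it is just the fraction of posterior samples with the larger black hole’s mass above the cut-off. A toy sketch with made-up samples:

```python
import numpy as np

rng = np.random.default_rng(3)
# Made-up posterior samples for the mass of the larger black hole (solar masses).
m1_samples = rng.normal(140.0, 25.0, size=10_000)

# Probability of an intermediate-mass black hole, using the 100 solar mass cut-off.
print("P(m1 > 100 Msun) =", np.mean(m1_samples > 100.0))
```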

Gravitational-wave observations could lead to a concrete detection of intermediate mass black holes if they exist and merge with another black hole. However, LIGO’s low frequency sensitivity is important for detecting these signals. If detector commissioning goes to plan and we are lucky enough to detect such a signal, we’ll finally be able to complete our set of black holes.

arXiv: 1511.01431 [astro-ph.HE]
Journal: Monthly Notices of the Royal Astronomical Society; 457(4):4499–4506; 2016
Birmingham science summary: Inference on gravitational waves from coalescences of stellar-mass compact objects and intermediate-mass black holes (by Carl)
Other collectables: Breakthrough, Gruber, Shaw, Kavli

Bonus notes

Jargon

The coalescence of an intermediate-mass black hole and a stellar-mass object (black hole or neutron star) has typically been known as an intermediate mass-ratio inspiral (an IMRI). This is similar to the name for the coalescence of a supermassive black hole and a stellar-mass object: an extreme mass-ratio inspiral (an EMRI). However, my colleague Ilya has pointed out that with LIGO we don’t really see much of the intermediate-mass black hole and the stellar-mass black hole inspiralling together; instead we see the merger and ringdown of the final black hole. Therefore, he prefers the name intermediate mass-ratio coalescence (or IMRAC). It’s a better description of the signal we measure, but the acronym isn’t as good.

Parameter-estimation runs

The main parameter-estimation analysis for this paper was done by Zhilu, a summer student. This is notable for two reasons. First, it shows that useful research can come out of a summer project. Second, our parameter-estimation code installed and ran so smoothly that even an undergrad with no previous experience could get some useful results. This made us optimistic that everything would work perfectly in the upcoming observing run (O1). Unfortunately, a few improvements were made to the code before then, and we were back to the usual level of fun in time for The Event.

Parameter estimation for binary neutron-star coalescences with realistic noise during the Advanced LIGO era

The first observing run (O1) of Advanced LIGO is nearly here, and with it the prospect of the first direct detection of gravitational waves. That’s all wonderful and exciting (far more exciting than a custard cream or even a chocolate digestive), but there’s a lot to be done to get everything ready. Aside from remembering to vacuum the interferometer tubes and polish the mirrors, we need to see how the data analysis will work out. After all, having put so much effort into the detector, it would be shame if we couldn’t do any science with it!

Parameter estimation

Since joining the University of Birmingham team, I’ve been busy working on trying to figure out how well we can measure things using gravitational waves. I’ve been looking at binary neutron star systems. We expect binary neutron star mergers to be the main source of signals for Advanced LIGO. We’d like to estimate how massive the neutron stars are, how fast they’re spinning, how far away they are, and where in the sky they are. Just published is my first paper on how well we should be able to measure things. This took a lot of hard work from a lot of people, so I’m pleased it’s all done. I think I’ve earnt a celebratory biscuit. Or two.

When we see something that looks like it could be a gravitational wave, we run code to analyse the data and try to work out the properties of the signal. Working out some properties is a bit trickier than others. Sadly, we don’t have an infinite number of computers, so it means it can take a while to get results. Much longer than the time to eat a packet of Jaffa Cakes…

The fastest algorithm we have for binary neutron stars is BAYESTAR. This takes the same time as maybe eating one chocolate finger. Perhaps two, if you’re not worried about the possibility of choking. BAYESTAR is fast as it only estimates where the source is coming from. It doesn’t try to calculate a gravitational-wave signal and match it to the detector measurements, instead it just looks at numbers produced by the detection pipeline—the code that monitors the detectors and automatically flags whenever something interesting appears. As far as I can tell, you give BAYESTAR this information and a fresh cup of really hot tea, and it uses Bayes’ theorem to work out how likely it is that the signal came from each patch of the sky.

To work out further details, we need to know what a gravitational-wave signal looks like and then match this to the data. This is done using a different algorithm, which I’ll refer to as LALInference. (As names go, this isn’t as cool as SKYNET). This explores parameter space (hopping between different masses, distances, orientations, etc.), calculating waveforms and then working out how well they match the data, or rather how likely it is that we’d get just the right noise in the detector to make the waveform fit what we observed. We then use another liberal helping of Bayes’ theorem to work out how probable those particular parameter values are.

It’s rather difficult to work out the waveforms, but some are easier than others. One of the things that makes things trickier is adding in the spins of the neutron stars. If you made a batch of biscuits at the same time you started a LALInference run, they’d still be good by the time a non-spinning run finished. With a spinning run, the biscuits might not be quite so appetising—I generally prefer more chocolate than penicillin on my biscuits. We’re working on speeding things up (if only to prevent increased antibiotic resistance).

In this paper, we were interested in what you could work out quickly, while there’s still a chance to catch any explosion that might accompany the merging of the neutron stars. We think that short gamma-ray bursts and kilonovae might be caused when neutron stars merge and collapse down to a black hole. (I find it mildly worrying that we don’t know what causes these massive explosions). To follow up on a gravitational-wave detection, you need to be able to tell telescopes where to point to see something, and manage this while there’s still something that’s worth seeing. This means that using spinning waveforms in LALInference is right out; we just use BAYESTAR and the non-spinning LALInference analysis.

What we did

To figure out what we could learn from binary neutron stars, we generated a large catalogue of fake signals, and then ran the detection and parameter-estimation codes on this to see how they worked. This has been done before in The First Two Years of Electromagnetic Follow-Up with Advanced LIGO and Virgo which has a rather delicious astrobites write-up. Our paper is the sequel to this (and features most of the same cast). One of the differences is that The First Two Years assumed that the detectors were perfectly behaved and had lovely Gaussian noise. In this paper, we added in some glitches. We took some real data™ from initial LIGO’s sixth science run and stretched this so that it matches the sensitivity Advanced LIGO is expected to have in O1. This process is called recolouring [bonus note]. We now have fake signals hidden inside noise with realistic imperfections, and can treat it exactly as we would real data. We ran it through the detection pipeline, and anything which was flagged as probably being a signal (we used a false alarm rate of once per century) was analysed with the parameter-estimation codes. We looked at how well we could measure the sky location and distance of the source, and the masses of the neutron stars. It’s all good practice for O1, when we’ll be running this analysis on any detections.

What we found

  1. The flavour of noise (recoloured or Gaussian) makes no difference to how well we can measure things on average.
  2. Sky-localization in O1 isn’t great, typically hundreds of square degrees (the median 90% credible region is 632 deg²); for comparison, the Moon is about a fifth of a square degree. This’ll make things interesting for the people with telescopes.

    Sky localization map for O1.

    Probability of a gravitational-wave signal coming from different points on the sky. The darker the red, the higher the probability. The star indicates the true location. This is one of the worst localized events from our study for O1. You can find more maps in the data release (including 3D versions); this is Figure 6 of Berry et al. (2015).

  3. BAYESTAR does just as well as LALInference, despite being about 2000 times faster (a quick sketch using the empirical sky-area fit quoted below is given after this list).

    Sky localization for binary neutron stars during O1.

    Sky localization (the size of the patch of the sky that we’re 90% sure contains the source location) varies with the signal-to-noise ratio (how loud the signal is). The approximate best fit is \log_{10}(\mathrm{CR}_{0.9}/\mathrm{deg^2}) \approx -2 \log_{10}(\varrho) +5.06, where \mathrm{CR}_{0.9} is the 90% sky area and \varrho is the signal-to-noise ratio. The results for BAYESTAR and LALInference agree, as do the results with Gaussian and recoloured noise. This is Figure 9 of Berry et al. (2015).

  4. We can’t measure the distance too well: the median 90% credible interval divided by the true distance (which gives something like twice the fractional error) is 0.85.
  5. Because we don’t include the spins of the neutron stars, we introduce some error into our mass measurements. The chirp mass, a combination of the individual masses that we’re most sensitive to [bonus note], is still reliably measured (the median offset is 0.0026 of the mass of the Sun, which is tiny), but we’ll have to wait for the full spinning analysis for individual masses.

    Mean offset in chirp-mass estimates when not including the effects of spin.

    Fraction of events with difference between the mean estimated and true chirp mass smaller than a given value. There is an error because we are not including the effects of spin, but this is small. Again, the type of noise makes little difference. This is Figure 15 of Berry et al. (2015).
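
As promised above, here is a quick sketch using the empirical fit from Figure 9 to get a ballpark 90% sky area for a given signal-to-noise ratio (it is only a trend line, so expect plenty of scatter about it):

```python
import numpy as np

def sky_area_90(snr):
    """Approximate 90% credible area (deg^2) from the quoted fit:
    log10(CR_0.9 / deg^2) ~ -2 log10(rho) + 5.06."""
    return 10 ** (-2.0 * np.log10(snr) + 5.06)

for snr in (10, 12, 20, 30):
    print(f"SNR {snr:>2}: ~{sky_area_90(snr):7.0f} deg^2")
```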

There’s still some work to be done before O1, as we need to finish up the analysis with waveforms that include spin. In the mean time, our results are all available online for anyone to play with.

arXiv: 1411.6934 [astro-ph.HE]
Journal: Astrophysical Journal; 804(2):114(24); 2015
Data release: The First Two Years of Electromagnetic Follow-Up with Advanced LIGO and Virgo
Favourite colour: Blue. No, yellow…

Notes

The colour of noise: Noise is called white if it doesn’t have any frequency dependence. We made ours by taking some noise with initial LIGO’s frequency dependence (coloured noise), removing the frequency dependence (making it white), and then adding in the frequency dependence of Advanced LIGO (recolouring it).
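
A bare-bones sketch of the idea in the frequency domain (assuming the strain time series and the old/new one-sided PSDs are already in hand; real pipelines also deal with windowing, PSD estimation and band edges):

```python
import numpy as np

def recolour(strain, psd_old, psd_new):
    """Whiten the data with its original PSD, then re-colour it with the new one.

    strain:   time series (array)
    psd_old:  one-sided PSD of the original data on the rFFT frequency grid
    psd_new:  target one-sided PSD on the same grid
    """
    strain_f = np.fft.rfft(strain)
    recoloured_f = strain_f * np.sqrt(psd_new / psd_old)
    return np.fft.irfft(recoloured_f, n=len(strain))
```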

The chirp mass: Gravitational waves from a binary system depend upon the masses of the components, which we’ll call m_1 and m_2. The chirp mass is a combination of these that we can measure really well, as it determines the most significant parts of the shape of the gravitational wave. It’s given by

\displaystyle \mathcal{M} = \frac{m_1^{3/5} m_2^{3/5}}{(m_1 + m_2)^{1/5}}.

We get lots of good information on the chirp mass; unfortunately, this isn’t too useful for turning back into the individual masses. For that we need extra information, for example the mass ratio m_2/m_1. We can get this from less dominant parts of the waveform, but it’s not typically measured as precisely as the chirp mass, so we’re often left with big uncertainties.
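
If you want to play with this, here is a tiny sketch of the chirp mass and of how, together with a mass ratio, it gets you back to the component masses (which is exactly why a poorly measured mass ratio leaves big uncertainties on m_1 and m_2):

```python
def chirp_mass(m1, m2):
    """Chirp mass M_c = (m1 m2)^(3/5) / (m1 + m2)^(1/5)."""
    return (m1 * m2) ** 0.6 / (m1 + m2) ** 0.2

def component_masses(mc, q):
    """Invert (M_c, q = m2/m1 <= 1) back to the component masses (m1, m2)."""
    m1 = mc * (1.0 + q) ** 0.2 / q ** 0.6
    return m1, q * m1

mc = chirp_mass(1.40, 1.40)
print("Chirp mass:", round(mc, 3))
# Rather different component masses share the same chirp mass, which is why we
# need the (poorly measured) mass ratio to pin down m_1 and m_2 individually.
print(component_masses(mc, 1.0), component_masses(mc, 0.5))
```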