# An introduction to LIGO–Virgo data analysis

LIGO and Virgo make their data open for anyone to try analysing [bonus note]. If you’re a student looking for a project, a teacher planning a class activity, or a scientist working on a paper, this data is waiting for you to use. Understanding how to analyse the data can be tricky. In this post, I’ll share some of the resources made by LIGO and Virgo to help introduce gravitational-wave analysis. These papers together should give you a good grounding in how to get started working with gravitational-wave data.

If you’d like a more in-depth understanding, I’d recommend visiting your local library for Michele Maggiore’s Gravitational Waves: Volume 1.

#### The Data Analysis Guide

Title: A guide to LIGO-Virgo detector noise and extraction of transient gravitational-wave signals
arXiv: 1908.11170 [gr-qc]
Journal: Classical & Quantum Gravity; 37(5):055002(54); 2020
Tutorial notebook: GitHub; Google Colab; Binder
Code repository: Data Guide
LIGO science summary: A guide to LIGO-Virgo detector noise and extraction of transient gravitational-wave signals

It took many decades to develop the technology necessary to build gravitational-wave detectors. Similarly, gravitational-wave data analysis has developed over many decades—I’d say LIGO analysis was really kicked off in the early 1990s by Kip Thorne’s group. There are now hundreds of papers on various aspects of gravitational-wave analysis. If you are new to the area, where should you start? Don’t panic! For the binary sources discovered so far, this Data Analysis Guide has you covered.

More details: The Data Analysis Guide

#### The GWOSC Paper

Title: Open data from the first and second observing runs of Advanced LIGO and Advanced Virgo
arXiv: 1912.11716 [gr-qc]
Journal: SoftwareX; 13:100658(20); 2021
Website: Gravitational Wave Open Science Center
LIGO science summary: Open data from the first and second observing runs of Advanced LIGO and Advanced Virgo

Data from the LIGO and Virgo detectors is released by the Gravitational Wave Open Science Center (GWOSC, pronounced, unfortunately, as it is spelt). If you want to try analysing our delicious data yourself, either searching for signals or studying the signals we have found, GWOSC is the place to start. This paper outlines how these data are produced, going from our laser interferometers to your hard-drive. The paper specifically looks at the data released for our first and second observing runs (O1 and O2); however, GWOSC also hosts data from the initial detectors’ fifth science run (S5) and sixth science run (S6), and will be updated with new data in the future.

If you do use data from GWOSC, please remember to say thank you.

More details: The GWOSC Paper

I thought I saw a 2! Credit: Fox

### The Data Analysis Guide

Synopsis: Data Analysis Guide
Read this if: You want an introduction to signal analysis
Favourite part: This is a great resource for new students [bonus note]

Gravitational-wave detectors measure ripples in spacetime. They record a simple time series of the stretching and squeezing of space as a gravitational wave passes. Well, they measure that, plus a whole lot of noise. Most of the time it is just noise. How do we go from this time series to discoveries about the Universe’s black holes and neutron stars? This paper gives the outline. It covers (in order):

1. An introduction to observations at the time of writing
2. The basics of LIGO and Virgo data—what it is that we analyse
3. The basics of detector noise—how we describe sources of noise in our data
4. Fourier analysis—how we go from time series to looking at the data as a function of frequency, which is the most natural way to analyse that data
5. Time–frequency analysis and stationarity—how we check the stability of data from our detectors
6. Detector calibration and data quality—how we make sure we have good quality data
7. The noise model and likelihood—how we use our understanding of the noise, under the assumption of it being stationary, to work out the likelihood of different signals being in the data
8. Signal detection—how we identify times in the data which have a transient signal present
9. Inferring waveform and physical parameters—how we estimate the parameters of the source of a gravitational wave
10. Residuals around GW150914—a consistency check that we have understood the noise surrounding our first detection

The paper works through things thoroughly, and I would encourage you to work through it if you are interested.

I won’t summarise everything here; I want to focus on the (roughly undergraduate-level) foundations of how we do our analysis in the frequency domain. My discussion of the GWOSC Paper goes into more detail on the basics of LIGO and Virgo data, and some details on calibration and data quality. I’ll leave talking about residuals to this bonus note, as it involves a long tangent and me needing to lie down for a while.

#### Fourier analysis

The signal our detectors measure is a time series $d(t)$. This may just contain noise, $d(t) = n(t)$, or it may also contain a signal, $d(t) = n(t) + h(t)$.

There are many sources of noise for our detectors. The different sources can affect different frequencies. If we assume that the noise is stationary, so that its properties don’t change with time, we can simply describe the properties of the noise with the power spectral density $S_n(f)$. On average we expect the noise at a given frequency to be zero, but with it fluctuating up and down with a variance given by the power spectral density. We typically approximate the noise as Gaussian, such that

$n(f) \sim \mathcal{N}(0; S_n(f)/2)$,

where we use $\mathcal{N}(\mu; \sigma^2)$ to represent a normal distribution with mean $\mu$ and standard deviation $\sigma$ (so variance $\sigma^2$). The approximations of stationary and Gaussian noise are good most of the time. The noise does vary over time, but is usually effectively stationary over the durations we look at for a signal. The noise is also mostly Gaussian except for glitches. These are taken into account when we search for signals, but we’ll ignore them for now. The statistical description of the noise in terms of the power spectral density allows us to understand our data, but this understanding comes as a function of frequency: we must transform our time-domain data to frequency-domain data.

To go from $d(t)$ to $d(f)$ we can use a Fourier transform. Fourier transforms are a way of converting a function of one variable into a function of a reciprocal variable—in the case of time you convert to frequency. Fourier transforms encode all the information of the original function, so it is possible to convert back and forth as you like. Really, a Fourier transform is just another way of looking at the same function.

The Fourier transform is defined as

$d(f) = \mathcal{F}_f\left\{d(t)\right\} = \int_{-\infty}^{\infty} d(t) \exp(-2\pi i f t) \,\mathrm{d}t$.

Now, from this you might notice a problem when it comes to real data analysis, namely that the integral is defined over an infinite amount of time. We don’t have that much data. Instead, we only have a short period.
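For sampled data of finite length, the integral becomes a sum over samples. The sketch below is an illustration rather than anything taken from the Data Analysis Guide: it assumes NumPy, an arbitrary 1024 Hz sampling rate, and a Gaussian pulse chosen because its continuous Fourier transform is known in closed form. It checks that the time step multiplied by the fast Fourier transform approximates $d(f)$, and that transforming back recovers the original time series.

```python
import numpy as np

fs = 1024.0                      # sampling rate in Hz (illustrative choice)
dt = 1.0 / fs
t = np.arange(0.0, 4.0, dt)      # 4 s of samples starting at t = 0

# Gaussian pulse centred in the segment. Its continuous Fourier transform
# has magnitude sigma * sqrt(2 pi) * exp(-2 pi^2 sigma^2 f^2).
sigma = 0.05
d_t = np.exp(-0.5 * ((t - 2.0) / sigma) ** 2)

# Riemann-sum version of the integral:
# d(f_k) ~ dt * sum_j d(t_j) exp(-2 pi i f_k t_j)
d_f = dt * np.fft.rfft(d_t)
freqs = np.fft.rfftfreq(len(t), d=dt)

analytic = sigma * np.sqrt(2.0 * np.pi) * np.exp(-2.0 * (np.pi * sigma * freqs) ** 2)
max_error = np.max(np.abs(np.abs(d_f) - analytic))

# No information is lost: the inverse transform recovers the time series.
round_trip = np.fft.irfft(d_f / dt, n=len(t))
```

The pulse here has died away long before the ends of the segment, so nothing is lost by cutting the data off. Real detector data do not behave so politely at the edges.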

We could recast the integral above over a shorter time if instead of taking the Fourier transform of $d(t)$, we take the Fourier transform of $d(t) \times w(t)$ where $w(t)$ is some window function which goes to zero outside of the time interval we are looking at. What we end up with is a convolution of the function we want with the Fourier transform of the window function,

$\mathcal{F}_f\left\{d(t)w(t)\right\} = d(f) \ast w(f)$.

It is important to pick a window function which minimises the distortion to the signal that we want. If we just take a tophat (also known as a boxcar or rectangular, possibly on account of its infamous criminal background) function which abruptly cuts off the data at the ends of the time interval, we find that $w(f)$ is a sinc function. This is not a good thing, as it leads to all sorts of unwanted correlations between different frequencies, commonly known as spectral leakage. A much better choice is a function which smoothly tapers to zero at the edges. Using a tapering window, we lose a little data at the edges (we need to be careful choosing the length of the data analysed), but we can avoid the significant nastiness of spectral leakage. A tapering window function should always be used. Our finite-time Fourier transform is then a good approximation to the exact $d(f)$.
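Spectral leakage is easy to demonstrate with a toy example. The sketch below assumes NumPy and uses a hand-written Tukey (cosine-tapered) window; the window choice, the taper fraction of 0.2 and the test tone at 100.5 Hz are all illustrative assumptions, not the settings used in the paper. It compares how much of a pure tone ends up hundreds of hertz away from its true frequency with and without tapering.

```python
import numpy as np

def tukey(n, alpha=0.2):
    """Flat-topped window whose cosine tapers cover a fraction alpha of the samples."""
    window = np.ones(n)
    ntaper = int(alpha * n / 2)
    ramp = 0.5 * (1.0 - np.cos(np.pi * np.arange(ntaper) / ntaper))
    window[:ntaper] = ramp              # smooth rise from zero
    window[n - ntaper:] = ramp[::-1]    # smooth fall back to zero
    return window

fs = 4096.0
n = 4096                                 # 1 s of data, so bins are 1 Hz apart
t = np.arange(n) / fs
d_t = np.sin(2.0 * np.pi * 100.5 * t)    # tone that falls between two bins

freqs = np.fft.rfftfreq(n, d=1.0 / fs)
far_away = (freqs > 300.0) & (freqs < 500.0)

def leakage(window):
    """Largest amplitude far from the tone, relative to the peak amplitude."""
    amplitude = np.abs(np.fft.rfft(d_t * window))
    return amplitude[far_away].max() / amplitude.max()

leak_boxcar = leakage(np.ones(n))   # abrupt cut-off at the edges
leak_tukey = leakage(tukey(n))      # smooth taper at the edges
```

In this toy setup the tapered window should suppress the far-away leakage by a couple of orders of magnitude relative to the boxcar, at the cost of down-weighting the first and last tenth of the data.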

Data processing to reveal GW150914. The top panel shows raw Hanford data. The second panel shows a window function being applied. The third panel shows the data after being whitened. This cleans up the data, making it easier to pick out the signal from all the low frequency noise. The bottom panel shows the whitened data after a bandpass filter is applied to pick out the signal. We don’t use the bandpass filter in our analysis (it is just for illustration), but the other steps reflect how we treat our data. Figure 2 of the Data Analysis Guide.

Now we have our data in the frequency domain, it is simple enough to compare the data to the expected noise at a given frequency. If we measure something loud at a frequency with lots of noise we should be less surprised than if we measure something loud at a frequency which is usually quiet. This is kind of like how someone shouting is less startling at a rock concert than in a library. The appropriate way to weight is to divide by the square root of the power spectral density, $d_\mathrm{w}(f) \propto d(f)/[S_n(f)]^{1/2}$. This is known as whitening. Whitened data should have equal amplitude fluctuations at all frequencies, allowing for easy comparisons.
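As a rough sketch of whitening in practice, the code below estimates the power spectral density by averaging windowed periodograms (Welch's method, which is an assumption here rather than a statement of what the LIGO and Virgo pipelines do), then divides the Fourier-transformed data by its square root. It assumes NumPy; the coloured noise is a toy made by smoothing white noise, and the function names `welch_psd`, `whiten` and `tukey` are hypothetical helpers written for this example.

```python
import numpy as np

def tukey(n, alpha=0.1):
    """Flat-topped window whose cosine tapers cover a fraction alpha of the samples."""
    window = np.ones(n)
    ntaper = int(alpha * n / 2)
    ramp = 0.5 * (1.0 - np.cos(np.pi * np.arange(ntaper) / ntaper))
    window[:ntaper] = ramp
    window[n - ntaper:] = ramp[::-1]
    return window

def welch_psd(x, fs, nperseg):
    """One-sided PSD from averaged Hann-windowed periodograms (no overlap)."""
    window = np.hanning(nperseg)
    nseg = len(x) // nperseg
    segments = x[: nseg * nperseg].reshape(nseg, nperseg) * window
    power = np.abs(np.fft.rfft(segments, axis=1)) ** 2
    # The factor 2 folds negative frequencies onto positive ones. It is applied
    # to every bin (DC and Nyquist included) to keep the normalisation uniform.
    psd = 2.0 * power.mean(axis=0) / (fs * np.sum(window ** 2))
    return np.fft.rfftfreq(nperseg, d=1.0 / fs), psd

def whiten(x, fs, psd_freqs, psd):
    """Taper, divide by sqrt(S_n(f)), and scale so white noise has unit variance."""
    dt = 1.0 / fs
    x_f = np.fft.rfft(x * tukey(len(x)))
    freqs = np.fft.rfftfreq(len(x), d=dt)
    asd = np.sqrt(np.interp(freqs, psd_freqs, psd))
    return np.fft.irfft(x_f / asd * np.sqrt(2.0 * dt), n=len(x))

rng = np.random.default_rng(0)
fs = 4096.0
white = rng.normal(size=2 ** 16)
# Toy coloured noise: smoothing white noise boosts the low frequencies.
coloured = np.convolve(white, 0.9 ** np.arange(50), mode="same")

psd_freqs, psd = welch_psd(coloured, fs, nperseg=1024)
whitened = whiten(coloured, fs, psd_freqs, psd)

# Look away from the tapered edges when checking the result.
central = whitened[len(whitened) // 4 : 3 * len(whitened) // 4]
```

With this normalisation, the central part of `whitened` should have a standard deviation close to one, which is the property that makes whitened data convenient for checking Gaussianity.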

Now we understand the statistical properties of noise we can do some analysis! We can start by testing our assumption that the data are stationary and Gaussian by checking that after whitening we get the expected distribution. We can also define the likelihood of obtaining the data $d(t)$ given a model of a gravitational-wave signal $h(t)$, as the properties of the noise mean that $d(f) - h(f) \sim \mathcal{N}(0; S_n(f)/2)$. Combining the likelihood for each individual frequency gives the overall likelihood

$\displaystyle p(d|h) \propto \exp\left[-\int_{-\infty}^{\infty} \frac{|d(f) - h(f)|^2}{S_n(f)} \mathrm{d}f \right]$.

This likelihood is at the heart of parameter estimation, as we can work out the probability of there being a signal with a given set of parameters. The Data Analysis Guide goes through many different analyses (including parameter estimation) and demonstrates how to check that noise is nice and Gaussian.
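The likelihood integral can be turned into a sum over frequency bins. The sketch below assumes NumPy and a deliberately simple setup: white Gaussian noise with a known, flat power spectral density and a sine-Gaussian burst standing in for a signal. The `sine_gaussian` model and its parameters are hypothetical, and because the toy noise is white and the burst vanishes at the edges, the tapering window is left out to keep the example short.

```python
import numpy as np

def log_likelihood(d_t, h_t, psd, fs):
    """ln p(d|h) up to a constant: -2 * df * sum_k |d_k - h_k|^2 / S_n(f_k).

    The factor 2 folds the integral over negative frequencies onto the
    positive ones (DC and Nyquist bins are treated the same way for simplicity).
    """
    dt = 1.0 / fs
    df = fs / len(d_t)
    residual_f = dt * np.fft.rfft(d_t - h_t)
    return -2.0 * df * np.sum(np.abs(residual_f) ** 2 / psd)

def sine_gaussian(t, amplitude, f0, t0=1.0, tau=0.05):
    """Hypothetical toy signal: a sinusoid under a Gaussian envelope."""
    envelope = np.exp(-0.5 * ((t - t0) / tau) ** 2)
    return amplitude * envelope * np.sin(2.0 * np.pi * f0 * (t - t0))

rng = np.random.default_rng(1)
fs = 1024.0
t = np.arange(0.0, 2.0, 1.0 / fs)

noise = rng.normal(size=len(t))                   # unit-variance white noise
psd = np.full(len(t) // 2 + 1, 2.0 / fs)          # its one-sided PSD
data = noise + sine_gaussian(t, amplitude=2.0, f0=100.0)

logl_true = log_likelihood(data, sine_gaussian(t, 2.0, 100.0), psd, fs)
logl_wrong = log_likelihood(data, sine_gaussian(t, 2.0, 150.0), psd, fs)
logl_zero = log_likelihood(data, np.zeros(len(t)), psd, fs)
```

The template that matches the injected signal should give the largest value of the three. Parameter estimation amounts to exploring how this number changes as the signal parameters are varied.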

Distribution of residuals for 4 seconds of data around GW150914 after subtracting the maximum likelihood waveform. The residuals are the whitened Fourier amplitudes, and they should be consistent with a unit Gaussian. The residuals follow the expected distribution and show no sign of non-Gaussianity. Figure 14 of the Data Analysis Guide.

#### Homework

The Data Analysis Guide contains much more material on gravitational-wave data analysis. If you wanted to delve further, there are many excellent papers cited. Favourites of mine include Finn (1992); Finn & Chernoff (1993); Cutler & Flanagan (1994); Flanagan & Hughes (1998); Allen (2005), and Allen et al. (2012). I would also recommend the tutorials available from GWOSC and the lectures from the Open Data Workshops.

### The GWOSC Paper

Synopsis: GWOSC Paper
Read this if: You want to analyse our gravitational wave data
Favourite part: All the cool projects done with this data

You’re now up-to-speed with some ideas of how to analyse gravitational-wave data, you’ve made yourself a fresh cup of really hot tea, you’re ready to get to work! All you need are the data—this paper explains where this comes from.

#### Data production

The first step in getting gravitational-wave data is the easy one. You need to design a detector, convince science agencies to invest something like half a billion dollars in building one, then spend 40 years carefully researching the necessary technology and putting it all together as part of an international collaboration of hundreds of scientists, engineers and technicians, before painstakingly commissioning the instrument and operating it. For your convenience, we have done this step for you, but do feel free to try it yourself at home.

Gravitational-wave detectors like Advanced LIGO are built around an interferometer: they have two arms at right angles to each other, and we bounce lasers up and down them to measure their length. A passing gravitational wave will change the length of one arm relative to the other. This changes the time taken to travel along one arm compared to the other. Hence, when the two bits of light reach the output of the interferometer, they’ll have a different phase: where normally one light wave would have a peak, it’ll have a trough. This change in phase will change how light from the two arms combines. When no gravitational wave is present, the light interferes destructively, almost cancelling out so that the output is dark. We measure the brightness of light at the output which tells us about how the length of the arms changes.

We want our detector to measure the gravitational-wave strain. That is the fractional change in length of the arms,

$\displaystyle h(t) = \frac{\Delta L(t)}{L}$,

where $\Delta L = L_x - L_y$ is the relative difference in the length of the two arms, and $L$ is the usual arm length. Since we love jargon in LIGO & Virgo, we’ll often refer to the strain as HOFT (as you would read $h(t)$ as h of t; it took me years to realise this) or DARM (differential arm measurement).
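As a rough illustration of the scale involved (these numbers are not taken from the paper): LIGO’s arms are 4 km long, and the peak strain of GW150914 was about $10^{-21}$, so the change in arm length to be measured is

$\displaystyle \Delta L = h L \approx 10^{-21} \times 4 \times 10^{3}\,\mathrm{m} = 4 \times 10^{-18}\,\mathrm{m}$,

which is a few hundred times smaller than the diameter of a proton.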

The actual output of the detector is the voltage from a photodiode measuring the intensity of the light. It is necessary to make careful calibration of the detectors. In theory this is simple: we change the position of the mirrors at the end of the arms and see how the output changes. In practice, it is very difficult. The GW150914 Calibration Paper goes into details for O1; more up-to-date descriptions are given in Cahillane et al. (2017) for LIGO and Acernese et al. (2018) for Virgo. The calibration of the detectors can drift over time, and improving the calibration is one of the things we do between originally taking the data and releasing the final data.

The data are only calibrated between 10 Hz and 5 kHz, so don’t trust the data outside of that frequency range.

The next stage of our data’s journey is going through detector characterisation and data quality checks. In addition to measuring gravitational-wave strain, we record many other data channels: about 200,000 per detector. These measure all sorts of things, from the internal state of the instrument, to monitoring the physical environment around the detectors. These auxiliary channels are used to check the data quality. In some cases, an auxiliary channel will record a source of noise, like scattered light or the mains power frequency, allowing us to clean up our strain data by subtracting out this noise. In other cases, an auxiliary channel can act as a witness to a glitch in our detector, identifying when it is misbehaving so that we know not to trust that part of the data. The GW150914 Detector Characterisation Paper goes into details of how we check potential detections. In doing data quality checks we are careful to only use the auxiliary channels which record something which would be independent of a passing gravitational wave.

We have 4 flags for data quality:

1. DATA: All clear. Certified fresh. Eat as much as you like.
2. CAT1: A critical problem with the instrument. Data from these times are likely to be a dumpster fire of noise. We do not use them in our analyses, and they are currently excluded from our public releases. About 1.7% of Hanford data and 1.0% of time from Livingston was flagged with CAT1 in O1. In O2, we got this down to 0.001% for Hanford, 0.003% for Livingston and 0.05% for Virgo.
3. CAT2: Some activity in an auxiliary channel (possibly the electric boogaloo monitor) which has a well understood correlation with the measured strain channel. You would therefore expect to find some form of glitchiness in the data.
4. CAT3: There is some correlation in an auxiliary channel and the strain channel which is not understood. We’re not currently using this flag, but it’s kept as an option.

It’s important to verify the data quality before starting your analysis. You don’t want to get excited to discover a completely new form of gravitational wave only to realise that it’s actually some noise from nearby logging. Remember, if a tree falls in the forest and no-one is around, LIGO will still know.

To test our systems, we also occasionally perform a signal injection: we move the mirrors to simulate a signal. This is useful for calibration and for testing analysis algorithms. We don’t perform injections very often (they get in the way of looking for real signals), but these times are flagged. Just as for data quality flags, it is important to check for injections before analysing a stretch of data.
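Checking flags before an analysis usually comes down to finding the stretches of time where every check you care about has passed. The sketch below assumes the flags arrive as one integer bitmask per second, with a set bit meaning that second passed the corresponding check. The bit positions are placeholders invented for this example, so look up the real layout in the metadata of the file you are using. The same function could be applied to an injection mask.

```python
# Hypothetical bit positions, for illustration only.
DATA_BIT = 0   # data are available
CAT1_BIT = 1   # passed the CAT1 check
CAT2_BIT = 2   # passed the CAT2 check

def good_segments(mask, required_bits):
    """Return (start, stop) index pairs of contiguous seconds where all required bits are set.

    `stop` is exclusive, so (0, 3) covers seconds 0, 1 and 2.
    """
    need = sum(1 << bit for bit in required_bits)
    segments = []
    start = None
    for i, value in enumerate(mask):
        ok = (value & need) == need
        if ok and start is None:
            start = i                      # a good stretch begins
        elif not ok and start is not None:
            segments.append((start, i))    # a good stretch ends
            start = None
    if start is not None:
        segments.append((start, len(mask)))
    return segments

# Nine seconds of made-up flags: 7 = all three bits set, 3 = CAT2 failed, 0 = no data.
mask = [7, 7, 7, 3, 3, 7, 7, 0, 7]
strict = good_segments(mask, [DATA_BIT, CAT1_BIT, CAT2_BIT])
loose = good_segments(mask, [DATA_BIT, CAT1_BIT])
```

Requiring more checks to pass gives shorter, cleaner segments, so which bits to demand depends on how sensitive your analysis is to glitches.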

Having passed through all these checks, the data are ready to analyse!

Excited Data. Credit: Paramount

#### Accessing the data

After our data have been lovingly prepared, they are served up in two data formats:

• Hierarchical Data Format HDF, which is a popular data storage format, as it easily allows for metadata and multiple data sets (like the important data quality flags) to be packaged together.
• Gravitational Wave Frame GWF, which is the standard format we use internally. Veteran gravitational-wave scientists often get a far-away haunted look when you bring up how the specifications for this file format were decided. It’s best not to mention unless you are also buying them a stiff drink.

In these files, you will find $h(t)$ sampled at either 4096 Hz or 16384 Hz (both are available). Pick the sampling rate you need depending upon the frequency range you are interested in: the 4096 Hz data are good for up to 1.7 kHz, while the 16384 Hz data are good to the limit of the calibration range at 5 kHz.
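The choice of sampling rate can be written down as a small lookup. This is only a sketch of the rule of thumb quoted above; the function name `pick_sample_rate` and the dictionary are hypothetical, and the frequency limits are simply the two numbers from the text.

```python
# Highest trustworthy frequency (Hz) for each available sampling rate (Hz),
# using the limits quoted in the text.
USABLE_UP_TO_HZ = {4096: 1700.0, 16384: 5000.0}

def pick_sample_rate(f_max_hz):
    """Return the lowest sampling rate whose usable band reaches f_max_hz."""
    for rate in sorted(USABLE_UP_TO_HZ):
        if f_max_hz <= USABLE_UP_TO_HZ[rate]:
            return rate
    raise ValueError("frequency is above the calibrated range of the data")

rate_for_binary = pick_sample_rate(500.0)    # 4096
rate_for_high_f = pick_sample_rate(3000.0)   # 16384
```

Choosing the lower rate when it suffices keeps files four times smaller, which adds up quickly when analysing long stretches of data.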

Files can be downloaded from the GWOSC website. If you want to download a large amount, it is recommended to use the CernVM-FS distributed file system.

To check when the gravitational-wave detectors were observing, you can use the Timeline search.

Screenshot of the GWOSC Timeline showing observing from the fifth science run (S5) of the initial detector era through to the second observing run (O2) of the advanced detector era. Bars show observing of GEO 600 (G1), Hanford (H1 and H2), Livingston (L1) and Virgo (V1). Hanford initially had two detectors housed within its site; the plan in the advanced detector era is to install the equipment as LIGO India instead.

#### Try this at home

Having gone through all these details, you should now know what our data is, over what ranges it can be analysed, and how to get access to it. Your cup of tea has also probably gone cold. Why not make yourself a new one, and have a couple of biscuits as reward too. You deserve it!

To help you on your way in starting analysing the data, GWOSC has a set of tutorials (and don’t forget the Data Analysis Guide), and a collection of open source software. Have fun, and remember, it’s never aliens.

### Bonus notes

#### Release schedule

The current policy is that data are released:

1. In a chunk surrounding an event at time of publication of that event. This enables the new detection to be analysed by anyone. We typically release about an hour of data around an event.
2. 18 months after the end of the run. This time gives us a chance to properly calibrate the data, check the data quality, and then run the analyses we are committed to. A lot of work goes into producing gravitational wave data!

Start marking your calendars now for the release of O3 data.

#### Summer studenting

In summer 2019, while we were finishing up on the Data Analysis Guide, I gave it to one of my summer students Andrew Kim as an introduction. Andrew was working on gravitational-wave data analysis, so I hoped that he’d find it useful. He ended up working through the draft notebook made to accompany the paper and making a number of useful suggestions! He ended up as an author on the paper because of these contributions, which was nice.

#### The conspiracy of residuals

The Data Analysis Guide is an extremely useful paper. It explains many details of gravitational-wave analysis. The detections made by LIGO and Virgo over the last few years have increased the interest in analysing gravitational waves, making it the perfect time to write such an article. However, that’s not really what motivated us to write it.

In 2017, a paper appeared on the arXiv making claims of suspicious correlations in our LIGO data around GW150914. Could this call into question the very nature of our detection? No. The paper has two serious flaws.

1. The first argument in the paper was that there were suspicious phase correlations in the data. This is because the authors didn’t window their data before Fourier transforming.
2. The second argument was that the residuals presented in Figure 1 of the GW150914 Discovery Paper contain a correlation. This is true, but these residuals aren’t actually the results of how we analyse the data. The point of Figure 1 was to show that you don’t need our fancy analysis to see the signal—you can spot it by eye. Unfortunately, doing things by eye isn’t perfect, and this imperfection was picked up on.

The first flaw is a rookie mistake—pretty much everyone does it at some point. I did it starting out as a first-year PhD student, and I’ve run into it with all the undergraduates I’ve worked with writing their own analyses. The authors of this paper are rookies in gravitational-wave analysis, so they shouldn’t be judged too harshly for falling into this trap, and it is something so simple I can’t blame the referee of the paper for not thinking to ask. Any physics undergraduate who has met Fourier transforms (the second year of my degree) should grasp the mistake—it’s not something esoteric you need to be an expert in quantum gravity to understand.

The second flaw is something which could have been easily avoided if we had been more careful in the GW150914 Discovery Paper. We could have easily aligned the waveforms properly, or more clearly explained that the treatment used for Figure 1 is not what we actually do. However, we did write many other papers explaining what we did do, so we were hardly being secretive. While Figure 1 was not perfect, it was not wrong—it might not be what you might hope for, but it is described correctly in the text, and none of the LIGO–Virgo results depend on the figure in any way.

Recovered gravitational waveforms from our analysis of GW150914. The grey line shows the data whitened by the noise spectrum. The dark band shows our estimate for the waveform without assuming a particular source. The light bands show results if we assume it is a binary black hole (BBH) as predicted by general relativity. This plot more accurately represents how we analyse gravitational-wave data. Figure 6 of the GW150914 Parameter Estimation Paper.

Both mistakes are easy to fix. They are at the level of “Oops, that’s embarrassing! Give me 10 minutes. OK, that looks better”. Unfortunately, that didn’t happen.

The paper regrettably got picked up by science blogs, and caused quite a flutter. There were demands that LIGO and Virgo publicly explain ourselves. This was difficult—the Collaboration is set up to do careful science, not handle a PR disaster. One of the problems was that we didn’t want to be seen to be policing the use of our data. We can’t check that every paper ever using our data does everything perfectly. We don’t have time, and that probably wouldn’t encourage people to use our data if they knew any mistake would be pulled up by this 1000 person collaboration. A second problem was that getting anything approved as an official Collaboration document takes ages—getting consensus amongst so many people isn’t always easy. What would you do—would you want to be the faceless Collaboration persecuting the helpless, plucky scientists trying to check results?

There were private communications between people in the Collaboration and the authors. It took us a while to isolate the sources of the problems. In the meantime, pressure was mounting for an official™ response. It’s hard to justify why your analysis is correct by gesturing to a stack of a dozen papers—people don’t have time to dig through all that (I actually sent links to 16 papers to a science journalist who contacted me back in July 2017). Our silence may have been perceived as arrogance or guilt.

It was decided that we would put out an unofficial response. Ian Harry had been communicating with the authors, and wrote up his notes which Sean Carroll kindly shared on his blog. Unfortunately, this didn’t really make anyone too happy. The authors of the paper weren’t happy that something was shared via such an informal medium; the post is too technical for the general public to appreciate, and there was a minor typo in the accompanying code (since fixed) which was seized upon. It became necessary to write a formal paper.

Peer review will save the children! Credit: Fox

We did continue to try to explain the errors to the authors. I have colleagues who spent many hours in a room in Copenhagen trying to explain the mistakes. However, little progress was made, and it was not a fun time™. I can imagine at this point that the authors of the paper were sufficiently angry not to want to listen, which is a shame.

Now that the Data Analysis Guide is published, everyone will be satisfied, right? A refereed journal article should quash all fears, surely? Sadly, I doubt this will be the case. I expect these doubts will keep circulating for years. After all, there are those who still think vaccines cause autism. Fortunately, not believing in gravitational waves won’t kill any children. If anyone asks though, you can tell them that any doubts on LIGO’s analysis have been quashed, and that vaccines cause adults!

For a good account of the back and forth, Natalie Wolchover wrote a nice article in Quanta, and for a more acerbic view, try Mark Hannam’s blog.

# Classifying the unknown: Discovering novel gravitational-wave detector glitches using similarity learning

Gravity Spy is an awesome project that combines citizen science and machine learning to classify glitches in LIGO and Virgo data. Glitches are short bursts of noise in our detectors which make analysing our data more difficult. Some glitches have known causes, others are more mysterious. Classifying glitches into different types helps us better understand their properties, and in some cases track down their causes and eliminate them! In this paper, led by Scotty Coughlin, we demonstrated the effectiveness of a new tool which our citizen scientists can use to identify new glitch classes.

### The Gravity Spy project

Gravitational-wave detectors are complicated machines. It takes a lot of engineering to achieve the required accuracy needed to observe gravitational waves. Most of the time, our detectors perform well. The background noise in our detectors is easy to understand and model. However, our detectors are also subject to glitches, unusual  (sometimes extremely loud and complicated) noise that doesn’t fit the usual properties of noise. Glitches are short, they only appear in a small fraction of the total data, but they are common. This makes detection and analysis of gravitational-wave signals more difficult. Detection is tricky because you need to be careful to distinguish glitches from signals (and possibly glitches and signals together), and understanding the signal is complicated as we may need to model a signal and a glitch together [bonus note]. Understanding glitches is essential if gravitational-wave astronomy is to be a success.

To understand glitches, we need to be able to classify them. We can search for glitches by looking for loud pops, whooshes and splats in our data. The task is then to spot similarities between them. Once we have a set of glitches of the same type, we can examine the state of the instruments at these times. In the best cases, we can identify the cause, and then work to improve the detectors so that this no longer happens. Other times, we might not be able to find the source, but we can find one of the monitors in our detectors which acts as a witness to the glitch. Then we know that if something appears in that monitor, we expect a glitch of a particular form. This might mean that we throw away that bit of data, or perhaps we can use the witness data to subtract out the glitch. Since glitches are so common, classifying them is a huge amount of work. It is too much for our detector characterisation experts to do by hand.

There are two cunning options for classifying large numbers of glitches:

1. Get a computer to do it. The difficulty is teaching a computer to identify the different classes. Machine-learning algorithms can do this, if they are properly trained. Training can require a large training set, and careful validation, so the process is still labour intensive.
2. Get lots of people to help. The difficulty here is getting non-experts up-to-speed on what to look for, and then checking that they are doing a good job. Crowdsourcing classifications is something citizen scientists can do, but we will need a large number of dedicated volunteers to tackle the full set of data.

The idea behind Gravity Spy is to combine the two approaches. We start with a small training set from our detector characterization experts, and train a machine-learning algorithm on them. We then ask citizen scientists (thanks Zooniverse) to classify the glitches. We start them off with glitches for which the machine-learning algorithm is confident in its classification; these should be easy to identify. As citizen scientists get more experienced, they level up and start tackling more difficult glitches. The citizen scientists validate the classifications of the machine-learning algorithm, and provide a larger training set (especially helpful for the rarer glitch classes) for it. We can then happily apply the machine-learning algorithm to classify the full data set [bonus note].

How Gravity Spy works: the interconnection of machine-learning classification and citizen-scientist classification. The similarity search is used to identify glitches similar to one which do not fit into current classes. Figure 2 of Coughlin et al. (2019).

I especially like the levelling-up system in Gravity Spy. I think it helps keep citizen scientists motivated, as it both prevents them from being overwhelmed when they start and helps them see their own progress. I am currently Level 4.

Gravity Spy works using images of the data. We show spectrograms, plots of how loud the output of the detectors is at different frequencies at different times. A gravitational wave from a binary would show a chirp structure, starting at lower frequencies and sweeping up.

Spectrogram showing the upward-sweeping chirp of gravitational wave GW170104 as seen in Gravity Spy. I correctly classified this as a Chirp.
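A basic spectrogram can be built by chopping the data into short segments and Fourier transforming each one. The sketch below assumes NumPy and uses a toy signal whose frequency rises linearly with time; it is not a real binary waveform, and Gravity Spy’s own images are made with more sophisticated tools. The `spectrogram` helper is written just for this example.

```python
import numpy as np

def spectrogram(x, fs, nperseg):
    """Power in Hann-windowed, non-overlapping segments.

    Rows of the returned array are time slices; columns are frequencies.
    """
    nseg = len(x) // nperseg
    segments = x[: nseg * nperseg].reshape(nseg, nperseg) * np.hanning(nperseg)
    power = np.abs(np.fft.rfft(segments, axis=1)) ** 2
    times = (np.arange(nseg) + 0.5) * nperseg / fs   # segment mid-points
    freqs = np.fft.rfftfreq(nperseg, d=1.0 / fs)
    return times, freqs, power

fs = 1024.0
t = np.arange(0.0, 2.0, 1.0 / fs)
# Toy chirp: the frequency rises linearly from 50 Hz to 250 Hz over 2 s.
chirp = np.sin(2.0 * np.pi * (50.0 * t + 50.0 * t ** 2))

times, freqs, power = spectrogram(chirp, fs, nperseg=256)
# The loudest frequency in each time slice traces out the upward sweep.
ridge = freqs[np.argmax(power, axis=1)]
```

Shorter segments give finer time resolution but coarser frequency resolution, which is the trade-off at the heart of any time–frequency plot.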

### New glitches

The Gravity Spy system works smoothly. However, it is set up to work with a fixed set of glitch classes. We may be missing new glitch classes, either because they are rare and hadn’t been spotted by our detector characterization team, or because we changed something in our detectors and a new class arose (we expect this to happen as we tune up the detectors between observing runs). We can add more classes for our citizen scientists and machine-learning algorithm to use, but how do we spot new classes in the first place?

Our citizen scientists managed to identify a few new glitches by spotting things which didn’t fit into any of the classes. These get put in the None-of-the-Above class. Occasionally, you’ll come across similar looking glitches, and by collecting a few of these together, build a new class. The Paired Dove and Helix classes were identified early on by our citizen scientists this way; my favourite suggested new class is the Falcon [bonus note]. The difficulty is finding a large number of examples of a new class—you might only recognise a common feature after going past a few examples, backtracking to find the previous examples is hard, and you just have to keep working until you are lucky enough to be given more of the same.

Example Helix (left) and Paired Dove glitches. These classes were identified by Gravity Spy citizen scientists. Helix glitches are related to hiccups in the auxiliary lasers used to calibrate the detectors by pushing on the mirrors. Paired Dove glitches are related to motion of the beamsplitter in the interferometer. Adapted from Figure 8 of Zevin et al. (2017).

To help our citizen scientists find new glitches, we created a similarity search. Having found an interesting glitch, you can search for similar examples, and quickly put together a collection of your new class. The video below shows how it works. The thing we had to work out was how to define similar.

#### Transfer learning

Our machine-learning algorithm only knows about the classes we tell it about. It then works out the features which distinguish the different classes, and which are common to glitches of the same class. Working in this feature space, glitches form clusters of different classes.

Visualisation showing the clustering of different glitches in the Gravity Spy feature space. Each point is a different glitch from our training set. The feature space has more than three dimensions: this visualisation was made using a technique which preserves the separation and clustering of different and similar points. Figure 1 of Coughlin et al. (2019).

For our similarity search, our idea was to measure distances in feature space [bonus note for experts]. This should work well if our current set of classes have a wide enough set of features to capture the characteristics of the new class; however, it won’t be effective if the new class is completely different, so that its unique features are not recognised. As an analogy, imagine that you had an algorithm which classified M&M’s by colour. It would probably do well if you asked it to distinguish a new colour, but would probably do poorly if you asked it to distinguish peanut butter filled M&M’s as they are identified by flavour, which is not a feature it knows about. The strategy of using what a machine learning algorithm learnt about one problem to tackle a new problem is known as transfer learning, and we found this strategy worked well for our similarity search.

### Raven Pecks and Water Jets

To test our similarity search, we applied it to two glitch classes not in the Gravity Spy set:

1. Raven Peck glitches are caused by thirsty ravens pecking ice built up along nitrogen vent lines outside of the Hanford detector. Raven Pecks look like horizontal lines in spectrograms, similar to other Gravity Spy glitch classes (like the Power Line, Low Frequency Line and 1080 Line). The similarity search should therefore do a good job, as we should be able to recognise its important features.
2. Water Jet glitches were caused by local seismic noise at the Hanford detector, which caused loud bands that disturbed the input laser optics. These glitches are found between 4 January and 28 May 2017, over which time there are 26,871 total glitches in Gravity Spy. The Water Jet glitch doesn’t have anything to do with water; it is named based on its appearance (like a fountain, not a weasel). Its features are subtle, and unlike other classes, so we would expect this to be difficult for our similarity search to handle.

These glitches appeared in the data from the second observing run. Raven Pecks appeared between 14 April and 9 August 2017, and Water Jets appeared between 4 January and 28 May 2017. Over these intervals there are a total of 13,513 and 26,871 Gravity Spy glitches of all types, so even if you knew exactly when to look, you have a large number to search through to find examples.

Example Raven Peck (left) and Water Jet (right) glitches. These classes of glitch are not included in the usual Gravity Spy scheme. Adapted from Figure 3 of Coughlin et al. (2019).

We tested using our machine-learning feature space for the similarity search against simpler approaches: using the raw difference in pixels, and using a principal component analysis to create a feature space. Results are shown in the plots below. These show the fraction of glitches we want returned by the similarity search versus the total number of glitches rejected. Ideally, we would want to reject all the glitches except the ones we want, so the search would return 100% of the wanted classes and reject almost 100% of the total set. However, the actual results will depend on the adopted threshold for the similarity search: if we’re very strict we’ll reject pretty much everything, and only get the most similar glitches of the class we want; if we are too accepting, we get everything back, regardless of class. The plots can be read as increasing the range of the similarity search (becoming less strict) as you go left to right.

Performance of the similarity search for Raven Peck (left) and Water Jet (right) glitches: the fraction of known glitches of the desired class that have a higher similarity score (compared to an example of that glitch class) than a given percentage of the full data set. Results are shown for three different ways of defining similarity: the DIRECT machine-learning algorithm feature space (thick line), a principal component analysis (medium line) and a comparison of pixels (thin line). Adapted from Figure 3 of Coughlin et al. (2019).
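If it helps to see how a curve like this is built, here is a small sketch. Given a similarity score for every glitch and a flag saying which ones belong to the class we want, it throws away the least similar fraction of the full set and reports how much of the wanted class survives. The scores and labels are made up for illustration, and the function name is my own.

```python
def returned_vs_rejected(scores, is_wanted, fraction_rejected):
    """Fraction of wanted glitches kept after rejecting the least similar
    `fraction_rejected` of the full set."""
    # Indices ordered from least similar to most similar
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    n_reject = round(fraction_rejected * len(scores))
    kept = order[n_reject:]
    n_wanted = sum(is_wanted)
    return sum(is_wanted[i] for i in kept) / n_wanted

# Hypothetical similarity scores (higher means more similar to our example)
scores = [0.95, 0.91, 0.15, 0.40, 0.88, 0.22, 0.35, 0.10, 0.55, 0.05]
is_wanted = [True, True, False, False, True, False, False, False, True, False]

for fraction in (0.0, 0.6, 0.8, 0.9):
    kept = returned_vs_rejected(scores, is_wanted, fraction)
    print(f"reject {fraction:.0%} of all glitches -> keep {kept:.0%} of wanted class")
```

In this toy data set the wanted glitches all score highly, so we can reject 60% of everything and still keep all of them; a poor similarity measure would start losing wanted glitches much sooner.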

For the Raven Peck, the similarity search always performs well. We have 50% of Raven Pecks returned while rejecting 99% of the total set of glitches, and we can get the full set while rejecting 92% of the total set! The performance is pretty similar between the different ways of defining feature space. Raven Pecks are easy to spot.

Water Jets are more difficult. When we have 50% of Water Jets returned by the search, our machine-learning feature space can still reject almost all glitches. The simpler approaches do much worse, and will only reject about 30% of the full data set. To get the full set of Water Jets we would need to loosen the similarity search so that it only rejects 55% of the full set using our machine-learning feature space; for the simpler approaches we’d basically get the full set of glitches back. They do not do a good job at narrowing down the hunt for glitches. Despite our suspicion that our machine-learning approach would struggle, it still seems to do a decent job [bonus note for experts].

### Do try this at home

Having developed and tested our similarity search tool, it is now live. Citizen scientists can use it to hunt down new glitch classes. Several new glitch classes have been identified in data from LIGO and Virgo’s (currently ongoing) third observing run. If you are looking for a new project, why not give it a go yourself? (Or get your students to give it a go, I’ve had some reasonable results with high-schoolers). There is the real possibility that your work could help us with the next big gravitational-wave discovery.

arXiv: arXiv:1903.04058 [astro-ph.IM]
Journal: Physical Review D; 99(8):082002(8); 2019
Websites: Gravity Spy; Gravity Spy Tools
Gravity Spy blog: Introducing Gravity Spy Tools
Current stats: Gravity Spy has 15,500 registered users, who have made 4.4 million glitch classifications, leading to 200,000 successfully identified glitches.

### Bonus notes

#### Signals and glitches

The best example of a gravitational wave overlapping a glitch is GW170817. The glitch meant that the signal in the LIGO Livingston detector wasn’t immediately recognised. Fortunately, the signal in the Hanford detector was easy to spot. The glitch was analysed and categorised in Gravity Spy. It is a simple glitch, so it wasn’t too difficult to remove from the data. As our detectors become more sensitive, so that detections become more frequent, we expect that signals overlapping with glitches will become a more common occurrence. Unless we can eliminate glitches, it is only a matter of time before we get a glitch that prevents us from analysing an important signal.

In the third observing run of LIGO and Virgo, we send out automated alerts when we have a new gravitational-wave candidate. Astronomers can then pounce into action to see if they can spot anything coinciding with the source. It is important to quickly check the state of the instruments to ensure we don’t have a false alarm. To help with this, a data quality report is automatically prepared, containing many diagnostics. The classification from the Gravity Spy algorithm is one of many pieces of information included. It is the one I check first.

#### The Falcon

Excellent Gravity Spy moderator EcceruElme suggested a new glitch class Falcon. This suggestion was followed up by Oli Patane, who found that all the examples identified occurred between 6:30 am and 8:30 am on 20 June 2017 in the Hanford detector. The instrument was misbehaving at the time. To solve this, the detector was taken out of observing mode and relocked (the equivalent of switching it off and on again). Since this glitch class was only found in this one 2-hour window, we’ve not added it as a class. I love how it was possible to identify this problematic stretch of time using only Gravity Spy images (which don’t identify when they are from). I think this could be the seed of a good detective story. The Hanfordese Falcon?

Examples of the proposed Falcon glitch class, illustrating the key features (and where the name comes from). This new glitch class was suggested by Gravity Spy citizen scientist EcceruElme.

#### Distance measure

We chose a cosine distance to measure similarity in feature space. We found this worked better than a Euclidean metric, possibly because for identifying classes it is more important to have the right mix of features than to know how significant the individual features are. However, we didn’t do a systematic investigation of the optimal means of measuring similarity.
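As a quick illustration of the difference between the two metrics, here is a sketch with made-up three-dimensional feature vectors (the real feature space has many more dimensions, and these numbers are purely hypothetical). The cosine distance only cares about the direction of a vector, that is the mix of features, whereas the Euclidean distance also cares about its length.

```python
import math

def cosine_distance(a, b):
    """1 - cos(angle between a and b): zero when the feature mix is identical."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def euclidean_distance(a, b):
    """Straight-line distance, sensitive to overall feature strength."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical feature vectors for three glitches
reference = [1.0, 2.0, 0.0]
same_mix_louder = [3.0, 6.0, 0.0]  # same ratio of features, three times stronger
different_mix = [2.0, 1.0, 1.0]    # similar strength, different ratio of features

for name, vec in [("same mix, louder", same_mix_louder),
                  ("different mix", different_mix)]:
    print(name,
          "cosine:", round(cosine_distance(reference, vec), 3),
          "euclidean:", round(euclidean_distance(reference, vec), 3))
```

Here the cosine distance says the louder glitch with the same mix of features is the closer match, while the Euclidean distance prefers the glitch with the different mix, which is the behaviour described above.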

#### Retraining the neural net

We tested the performance of the machine-learning feature space in the similarity search after modifying properties of our machine-learning algorithm. The algorithm we are using is a deep multiview convolutional neural net. We switched the activation function in the fully connected layer of the net, trying tanh and leakyReLU. We also varied the number of training rounds and the number of pairs of similar and dissimilar images that are drawn from the training set each round. We found that there was little variation in results. We found that leakyReLU performed a little better than tanh, possibly because it covers a larger dynamic range, and so can allow for cleaner separation of similar and dissimilar features. The number of training rounds and pairs makes negligible difference, possibly because the classes are sufficiently distinct that you don’t need many inputs to identify the basic features to tell them apart. Overall, our results appear robust. The machine-learning approach works well for the similarity search.
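To see what a larger dynamic range means in practice, here is a tiny comparison of the two activation functions. The negative slope of 0.01 is a common default and is an assumption on my part; it is not necessarily the value used in the Gravity Spy network.

```python
import math

def leaky_relu(x, negative_slope=0.01):
    """Leaky rectified linear unit: identity for x >= 0, small slope below."""
    return x if x >= 0 else negative_slope * x

# tanh squashes everything into (-1, 1); leaky ReLU keeps growing with its input
for x in (-2.0, 0.5, 2.0, 5.0, 10.0):
    print(f"x = {x:5.1f}   tanh = {math.tanh(x):.6f}   leaky ReLU = {leaky_relu(x):.2f}")
```

For large inputs tanh saturates, so inputs of 5 and 10 become almost indistinguishable, while the leaky rectified linear unit still tells them apart.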

# GW150914—The papers

In 2015 I made a resolution to write a blog post for each paper I had published. In 2016 I’ll have to break this because there are too many to keep up with. A suite of papers were prepared to accompany the announcement of the detection of GW150914 [bonus note], and in this post I’ll give an overview of these.

### The papers

As well as the Discovery Paper published in Physical Review Letters [bonus note], there are 12 companion papers. All the papers are listed below in order of arXiv posting. My favourite is the Parameter Estimation Paper.

Subsequently, we have produced additional papers on GW150914, describing work that wasn’t finished in time for the announcement. The most up-to-date results are currently given in the O2 Catalogue Paper.

#### 0. The Discovery Paper

Title: Observation of gravitational waves from a binary black hole merger
arXiv:
1602.03837 [gr-qc]
Journal:
Physical Review Letters; 116(6):061102(16); 2016
LIGO science summary:
Observation of gravitational waves from a binary black hole merger

This is the central paper that announces the observation of gravitational waves. There are three discoveries which are described here: (i) the direct detection of gravitational waves, (ii) the existence of stellar-mass binary black holes, and (iii) that the black holes and gravitational waves are consistent with Einstein’s theory of general relativity. That’s not too shabby in under 11 pages (if you exclude the author list). Coming 100 years after Einstein first published his prediction of gravitational waves and Schwarzschild published his black hole solution, this is the perfect birthday present.

More details: The Discovery Paper summary

#### 1. The Detector Paper

Title: GW150914: The Advanced LIGO detectors in the era of first discoveries
arXiv:
1602.03838 [gr-qc]
Journal: Physical Review Letters; 116(13):131103(12); 2016
LIGO science summary: GW150914: The Advanced LIGO detectors in the era of the first discoveries

This paper gives a short summary of how the LIGO detectors work and their configuration in O1 (see the Advanced LIGO paper for the full design). Giant lasers and tiny measurements, the experimentalists do some cool things (even if their paper titles are a little cheesy and they seem to be allergic to error bars).

More details: The Detector Paper summary

#### 2. The Compact Binary Coalescence Paper

Title: GW150914: First results from the search for binary black hole coalescence with Advanced LIGO
arXiv:
1602.03839 [gr-qc]
Journal: Physical Review D; 93(12):122003(21); 2016
LIGO science summary: How we searched for merging black holes and found GW150914

Here we explain how we search for binary black holes and calculate the significance of potential candidates. This is the evidence to back up (i) in the Discovery Paper. We can potentially detect binary black holes in two ways: with searches that use templates, or with searches that look for coherent signals in both detectors without assuming a particular shape. The first type is also used for neutron star–black hole or binary neutron star coalescences, collectively known as compact binary coalescences. This type of search is described here, while the other type is described in the Burst Paper.

This paper describes the compact binary coalescence search pipelines and their results. As well as GW150914 there is also another interesting event, LVT151012. This isn’t significant enough to be claimed as a detection, but it is worth considering in more detail.

More details: The Compact Binary Coalescence Paper summary

#### 3. The Parameter Estimation Paper

Title: Properties of the binary black hole merger GW150914
arXiv:
1602.03840 [gr-qc]
Journal: Physical Review Letters; 116(24):241102(19); 2016
LIGO science summary: The first measurement of a black hole merger and what it means

If you’re interested in the properties of the binary black hole system, then this is the paper for you! Here we explain how we do parameter estimation and how it is possible to extract masses, spins, location, etc. from the signal. These are the results I’ve been most heavily involved with, so I hope lots of people will find them useful! This is the paper to cite if you’re using our best masses, spins, distance or sky maps. The masses we infer are so large we conclude that the system must contain black holes, which is discovery (ii) reported in the Discovery Paper.

More details: The Parameter Estimation Paper summary

#### 4. The Testing General Relativity Paper

Title: Tests of general relativity with GW150914
arXiv:
1602.03841 [gr-qc]
Journal: Physical Review Letters; 116(22):221101(19); 2016
LIGO science summary:
Was Einstein right about strong gravity?

The observation of GW150914 provides a new insight into the behaviour of gravity. We have never before probed such strong gravitational fields or such highly dynamical spacetime. These are the sorts of places you might imagine that we could start to see deviations from the predictions of general relativity. Aside from checking that we understand gravity, we also need to check to see if there is any evidence that our estimated parameters for the system could be off. We find that everything is consistent with general relativity, which is good for Einstein and is also discovery (iii) in the Discovery Paper.

More details: The Testing General Relativity Paper summary

#### 5. The Rates Paper

Title: The rate of binary black hole mergers inferred from Advanced LIGO observations surrounding GW150914
arXiv:
1602.03842 [astro-ph.HE]; 1606.03939 [astro-ph.HE]
Journal: Astrophysical Journal Letters; 833(1):L1(8); 2016; Astrophysical Journal Supplement Series; 227(2):14(11); 2016
LIGO science summary: The first measurement of a black hole merger and what it means

Given that we’ve spotted one binary black hole (plus maybe another with LVT151012), how many more are out there and how many more should we expect to find? We answer this here, although there’s a large uncertainty on the estimates since we don’t know (yet) the distribution of masses for binary black holes.

More details: The Rates Paper summary

#### 6. The Burst Paper

Title: Observing gravitational-wave transient GW150914 with minimal assumptions
arXiv: 1602.03843 [gr-qc]
Journal: Physical Review D; 93(12):122004(20); 2016

What can you learn about GW150914 without having to make the assumptions that it corresponds to gravitational waves from a binary black hole merger (as predicted by general relativity)? This paper describes and presents the results of the burst searches. Since the pipeline which first found GW150914 was a burst pipeline, it seems a little unfair that this paper comes after the Compact Binary Coalescence Paper, but I guess the idea is to first present results assuming it is a binary (since these are tightest) and then see how things change if you relax the assumptions. The waveforms reconstructed by the burst models do match the templates for a binary black hole coalescence.

More details: The Burst Paper summary

#### 7. The Detector Characterisation Paper

Title: Characterization of transient noise in Advanced LIGO relevant to gravitational wave signal GW150914
arXiv: 1602.03844 [gr-qc]
Journal: Classical & Quantum Gravity; 33(13):134001(34); 2016
LIGO science summary:
How do we know GW150914 was real? Vetting a Gravitational Wave Signal of Astrophysical Origin
CQG+ post: How do we know LIGO detected gravitational waves? [featuring awesome cartoons]

Could GW150914 be caused by something other than a gravitational wave: are there sources of noise that could mimic a signal, or ways that the detector could be disturbed to produce something that would be mistaken for a detection? This paper looks at these problems and details all the ways we monitor the detectors and the external environment. We can find nothing that can explain GW150914 (and LVT151012) other than either a gravitational wave or a really lucky random noise fluctuation. I think this paper is extremely important to our ability to claim a detection and I’m surprised it’s not number 2 in the list of companion papers. If you want to know how thorough the Collaboration is in monitoring the detectors, this is the paper for you.

More details: The Detector Characterisation Paper summary

#### 8. The Calibration Paper

Title: Calibration of the Advanced LIGO detectors for the discovery of the binary black-hole merger GW150914
arXiv:
1602.03845 [gr-qc]
Journal: Physical Review D; 95(6):062003(16); 2017
LIGO science summary:
Calibration of the Advanced LIGO detectors for the discovery of the binary black-hole merger GW150914

Completing the triumvirate of instrumental papers with the Detector Paper and the Detector Characterisation Paper, this paper describes how the LIGO detectors are calibrated. There are some cunning control mechanisms involved in operating the interferometers, and we need to understand these to quantify how they affect what we measure. Building a better model for calibration uncertainties is high on the to-do list for improving parameter estimation, so this is an interesting area to watch for me.

More details: The Calibration Paper summary

#### 9. The Astrophysics Paper

Title: Astrophysical implications of the binary black-hole merger GW150914
arXiv:
1602.03846 [astro-ph.HE]
Journal: Astrophysical Journal Letters; 818(2):L22(15); 2016
LIGO science summary:
The first measurement of a black hole merger and what it means

Having estimated source parameters and rate of mergers, what can we say about astrophysics? This paper reviews results related to binary black holes to put our findings in context and also makes statements about what we could hope to learn in the future.

More details: The Astrophysics Paper summary

#### 10. The Stochastic Paper

Title: GW150914: Implications for the stochastic gravitational wave background from binary black holes
arXiv:
1602.03847 [gr-qc]
Journal: Physical Review Letters; 116(13):131102(12); 2016
LIGO science summary: Background of gravitational waves expected from binary black hole events like GW150914

For every loud signal we detect, we expect that there will be many more quiet ones. This paper considers how many quiet binary black hole signals could add up to form a stochastic background. We may be able to see this background as the detectors are upgraded, so we should start thinking about what to do to identify it and learn from it.

More details: The Stochastic Paper summary

#### 11. The Neutrino Paper

Title: High-energy neutrino follow-up search of gravitational wave event GW150914 with ANTARES and IceCube
arXiv:
1602.05411 [astro-ph.HE]
Journal: Physical Review D; 93(12):122010(15); 2016
LIGO science summary: Search for neutrinos from merging black holes

We are interested to see if there’s any other signal that coincides with a gravitational wave signal. We wouldn’t expect something to accompany a black hole merger, but it’s good to check. This paper describes the search for high-energy neutrinos. We didn’t find anything, but perhaps we will in the future (perhaps for a binary neutron star merger).

More details: The Neutrino Paper summary

#### 12. The Electromagnetic Follow-up Paper

Title: Localization and broadband follow-up of the gravitational-wave transient GW150914
arXiv: 1602.08492 [astro-ph.HE]; 1604.07864 [astro-ph.HE]
Journal: Astrophysical Journal Letters; 826(1):L13(8); 2016; Astrophysical Journal Supplement Series; 225(1):8(15); 2016

As well as looking for coincident neutrinos, we are also interested in electromagnetic observations (gamma-ray, X-ray, optical, infra-red or radio). We had a large group of observers interested in following up on gravitational wave triggers, and 25 teams have reported observations. This companion describes the procedure for follow-up observations and discusses sky localisation.

This work is split into a main article and a supplement which goes into more technical details.

More details: The Electromagnetic Follow-up Paper summary

### The Discovery Paper

Synopsis: Discovery Paper
Read this if: You want an overview of The Event
Favourite part: The entire conclusion:

The LIGO detectors have observed gravitational waves from the merger of two stellar-mass black holes. The detected waveform matches the predictions of general relativity for the inspiral and merger of a pair of black holes and the ringdown of the resulting single black hole. These observations demonstrate the existence of binary stellar-mass black hole systems. This is the first direct detection of gravitational waves and the first observation of a binary black hole merger.

The Discovery Paper gives the key science results and is remarkably well written. It seems a shame to summarise it: you should read it for yourself! (It’s free).

### The Detector Paper

Synopsis: Detector Paper
Read this if: You want a brief description of the detector configuration for O1
Favourite part: It’s short!

The LIGO detectors contain lots of cool pieces of physics. This paper briefly outlines them all: the mirror suspensions, the vacuum (the LIGO arms are the largest vacuum envelopes in the world and some of the cleanest), the mirror coatings, the laser optics and the control systems. A full description is given in the Advanced LIGO paper, but the specs there are for design sensitivity (it is also heavy reading). The main difference between the current configuration and that for design sensitivity is the laser power. Currently the circulating power in the arms is $100~\mathrm{kW}$; the plan is to go up to $750~\mathrm{kW}$. This will reduce shot noise, but raises all sorts of control issues, such as how to avoid parametric instabilities.

The noise amplitude spectral density. The curves for the current observations are shown in red (dark for Hanford, light for Livingston). This is around a factor 3 better than in the final run of initial LIGO (green), but still a factor of 3 off design sensitivity (dark blue). The light blue curve shows the impact of potential future upgrades. The improvement at low frequencies is especially useful for high-mass systems like GW150914. Part of Fig. 1 of the Detector Paper.

### The Compact Binary Coalescence Paper

Synopsis: Compact Binary Coalescence Paper
Read this if: You are interested in detection significance or in LVT151012
Favourite part: We might have found a second binary black hole merger

There are two compact binary coalescence searches that look for binary black holes: PyCBC and GstLAL. Both match templates to the data from the detectors to look for anything binary-like; they then calculate the probability that such a match would happen by chance due to a random noise fluctuation (the false alarm probability or p-value [unhappy bonus note]). The false alarm probability isn’t the probability that there is a gravitational wave, but gives a good indication of how surprised we should be to find this signal if there wasn’t one. Here we report the results of both pipelines on the first 38.6 days of data (about 17 days where both detectors were working at the same time).

Both searches use the same set of templates to look for binary black holes [bonus note]. They look for where the same template matches the data from both detectors within a time interval consistent with the travel time between the two. However, the two searches rank candidate events and calculate false alarm probabilities using different methods. Basically, both searches use a detection statistic (the quantity used to rank candidates: higher means less likely to be noise), that is based on the signal-to-noise ratio (how loud the signal is) and a goodness-of-fit statistic. They assess the significance of a particular value of this detection statistic by calculating how frequently this would be obtained if there was just random noise (this is done by comparing data from the two detectors when there is not a coincident trigger in both). Consistency between the two searches gives us greater confidence in the results.

PyCBC’s detection statistic is a reweighted signal-to-noise ratio $\hat{\rho}_c$ which takes into account the consistency of the signal in different frequency bands. You can get a large signal-to-noise ratio from a loud glitch, but this doesn’t match the template across a range of frequencies, which is why this test is useful. The consistency is quantified by a reduced chi-squared statistic. This is used, depending on its value, to weight the signal-to-noise ratio. When it is large (indicating inconsistency across frequency bins), the reweighted signal-to-noise ratio becomes smaller.
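Here is a minimal sketch of the reweighting. The post doesn’t give the formula, so treat the specific form below as an assumption: it is the one usually quoted in the PyCBC search literature, where the signal-to-noise ratio is left alone when the reduced chi-squared is at most 1 and is otherwise divided by $[(1 + (\chi_r^2)^3)/2]^{1/6}$, with the two detectors’ values combined in quadrature. The function names are my own.

```python
import math

def reweighted_snr(snr, reduced_chisq):
    """Down-weight the signal-to-noise ratio when the fit across frequency
    bands is poor (reduced chi-squared above 1)."""
    if reduced_chisq <= 1.0:
        return snr
    return snr / ((1.0 + reduced_chisq ** 3) / 2.0) ** (1.0 / 6.0)

def combined_reweighted_snr(rho_hanford, rho_livingston):
    """Quadrature sum of the single-detector reweighted values."""
    return math.hypot(rho_hanford, rho_livingston)

# A well-fitting trigger keeps its loudness; a glitch-like one is suppressed
print(reweighted_snr(10.0, 0.9))  # good fit: unchanged
print(reweighted_snr(10.0, 4.0))  # poor fit: strongly reduced
```

The point of the design is that a loud glitch with a terrible fit ends up ranked below a quieter trigger that actually looks like the template.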

To calculate the background, PyCBC uses time slides. Data from the two detectors are shifted in time so that any coincidences can’t be due to a real gravitational wave. Seeing how often you get something signal-like then tells you how often you’d expect this to happen due to random noise.
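A toy version of the time-slide idea can be sketched as below. The trigger times are random numbers, the 15 ms coincidence window and 0.1 s shift are assumed values chosen for illustration, and the real pipeline also requires the same template to match in both detectors, which this sketch ignores.

```python
import random

def count_coincidences(times_a, times_b, window):
    """Number of triggers in A with at least one trigger in B within `window` s."""
    return sum(any(abs(ta - tb) <= window for tb in times_b) for ta in times_a)

def time_slide_background(times_a, times_b, window, shift, n_slides, duration):
    """Coincidence counts after sliding detector B by multiples of `shift`.

    Each shift is much larger than the coincidence window, so any surviving
    coincidences must be accidental."""
    counts = []
    for k in range(1, n_slides + 1):
        shifted = [(tb + k * shift) % duration for tb in times_b]
        counts.append(count_coincidences(times_a, shifted, window))
    return counts

random.seed(0)
duration = 10000.0  # seconds of toy data
hanford = [random.uniform(0, duration) for _ in range(50)]
livingston = [random.uniform(0, duration) for _ in range(50)]
# Add one "real" event seen in both detectors at nearly the same time
hanford.append(500.000)
livingston.append(500.007)

window = 0.015  # assumed coincidence window in seconds
zero_lag = count_coincidences(hanford, livingston, window)
background = time_slide_background(hanford, livingston, window,
                                   shift=0.1, n_slides=100, duration=duration)
print("zero-lag coincidences:", zero_lag)
print("mean accidental coincidences per slide:", sum(background) / len(background))
```

Comparing the unshifted (zero-lag) count with the distribution from the slides is what tells you how surprising a candidate is.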

GstLAL calculates the signal-to-noise ratio and a residual after subtracting the template. As a detection statistic, it uses a likelihood ratio $\mathcal{L}$: the probability of finding the particular values of the signal-to-noise ratio and residual in both detectors for signals (assuming signal sources are uniformly distributed isotropically in space), divided by the probability of finding them for noise.

The background from GstLAL is worked out by looking at the likelihood ratio for triggers that only appear in one detector. Since there’s no coincident signal in the other, these triggers can’t correspond to a real gravitational wave. Looking at their distribution tells you how frequently such things happen due to noise, and hence how probable it is for both detectors to see something signal-like at the same time.

The results of the searches are shown in the figure below.

Search results for PyCBC (left) and GstLAL (right). The histograms show the number of candidate events (orange squares) compared to the background. The black line includes GW150914 in the background estimate, the purple removes it (assuming that it is a signal). The further an orange square is above the lines, the more significant it is. Particle physicists like to quote significance in terms of $\sigma$ and for some reason we’ve copied them. The second most significant event (around $2\sigma$) is LVT151012. Fig. 7 from the Compact Binary Coalescence Paper.

GW150914 is the most significant event in both searches (it is the most significant PyCBC event even considering just single-detector triggers). They both find GW150914 with the same template values. The significance is literally off the charts. PyCBC can only calculate an upper bound on the false alarm probability of $< 2 \times 10^{-7}$. GstLAL calculates a false alarm probability of $1.4 \times 10^{-11}$, but this is reaching the level that we have to worry about the accuracy of assumptions that go into this (that the distribution of noise triggers is uniform across templates; if this is not the case, the false alarm probability could be about $10^3$ times larger). Therefore, for our overall result, we stick to the upper bound, which is consistent with both searches. The false alarm probability is so tiny, I don’t think anyone doubts this signal is real.
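If you want to translate a false alarm probability into the $\sigma$ language, one common convention is to ask how far out in the tail of a standard normal distribution you need to go to leave that much probability beyond you. The sketch below assumes the one-sided convention; the function name is mine.

```python
from statistics import NormalDist

def p_value_to_sigma(p_value):
    """Gaussian-equivalent significance for a one-sided tail probability."""
    return NormalDist().inv_cdf(1.0 - p_value)

# False alarm probabilities of the sizes quoted for the searches
for p in (2e-7, 0.02, 0.05):
    print(f"p = {p:g}  ->  about {p_value_to_sigma(p):.1f} sigma")
```

An upper bound on the probability becomes a lower bound on the significance, which is why GW150914 gets quoted as more than about $5\sigma$.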

There is a second event that pops up above the background. This is LVT151012. It is found by both searches. Its signal-to-noise ratio is $9.6$, compared with GW150914’s $24$, so it is quiet. The false alarm probability from PyCBC is $0.02$, and from GstLAL is $0.05$, consistent with what we would expect for such a signal. LVT151012 does not reach the standards we would like to claim a detection, but it is still interesting.

Running parameter estimation on LVT151012, as we did for GW150914, gives beautiful results. If it is astrophysical in origin, it is another binary black hole merger. The component masses are lower, $m_1^\mathrm{source} = 23^{+18}_{-5} M_\odot$ and $m_2^\mathrm{source} = 13^{+4}_{-5} M_\odot$ (the asymmetric uncertainties come from imposing $m_1^\mathrm{source} \geq m_2^\mathrm{source}$); the chirp mass is $\mathcal{M} = 15^{+1}_{-1} M_\odot$. The effective spin, as for GW150914, is close to zero $\chi_\mathrm{eff} = 0.0^{+0.3}_{-0.2}$. The luminosity distance is $D_\mathrm{L} = 1100^{+500}_{-500}~\mathrm{Mpc}$, meaning it is about twice as far away as GW150914’s source. I hope we’ll write more about this event in the future; there are some more details in the Rates Paper.
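As a sanity check on these numbers, the chirp mass is defined as $\mathcal{M} = (m_1 m_2)^{3/5} / (m_1 + m_2)^{1/5}$. Plugging in the median component masses is only a rough check (the median of the chirp-mass distribution need not equal the chirp mass of the medians), but it lands in the right place.

```python
def chirp_mass(m1, m2):
    """Chirp mass in the same units as the component masses."""
    return (m1 * m2) ** 0.6 / (m1 + m2) ** 0.2

# Median source-frame component masses quoted for LVT151012 (solar masses)
print(round(chirp_mass(23.0, 13.0), 1))
```

The result is within the quoted uncertainty of $15 M_\odot$. The chirp mass is much better measured than either component mass because it sets how quickly the inspiral sweeps up in frequency.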

Is it random noise or is it a gravitational wave? LVT151012 remains a mystery. This candidate event is discussed in the Compact Binary Coalescence Paper (where it is found), the Rates Paper (which calculates the probability that it is extraterrestrial in origin), and the Detector Characterisation Paper (where known environmental sources fail to explain it). SPOILERS

### The Parameter Estimation Paper

Synopsis: Parameter Estimation Paper
Read this if: You want to know the properties of GW150914’s source
Favourite part: We inferred the properties of black holes using measurements of spacetime itself!

The gravitational wave signal encodes all sorts of information about its source. Here, we explain how we extract this information  to produce probability distributions for the source parameters. I wrote about the properties of GW150914 in my previous post, so here I’ll go into a few more technical details.

To measure parameters we match a template waveform to the data from the two instruments. The better the fit, the more likely it is that the source had the particular parameters which were used to generate that particular template. Changing different parameters has different effects on the waveform (for example, changing the distance changes the amplitude, while changing the relative arrival times changes the sky position), so we often talk about different pieces of the waveform containing different pieces of information, even though we fit the whole lot at once.
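As a toy illustration of matching a template to data, here is a minimal sketch. It is not the analysis used in the paper: it assumes white Gaussian noise with a known standard deviation, and a made-up sine-Gaussian "waveform" whose only free parameter is its amplitude. All function names and numbers are hypothetical.

```python
import math
import random

def template(times, amplitude, frequency=100.0, width=0.01):
    """Hypothetical sine-Gaussian 'waveform' centred on t = 0."""
    return [amplitude * math.exp(-t**2 / (2 * width**2))
            * math.sin(2 * math.pi * frequency * t) for t in times]

def log_likelihood(data, model, sigma):
    """Gaussian log-likelihood (up to a constant), assuming white noise."""
    return -0.5 * sum((d - m)**2 for d, m in zip(data, model)) / sigma**2

# Simulate 0.1 s of data: a signal of amplitude 2 plus white noise.
random.seed(1)
sigma = 0.5
times = [i / 4096 - 0.05 for i in range(410)]
true_amplitude = 2.0
data = [h + random.gauss(0.0, sigma) for h in template(times, true_amplitude)]

# Evaluate how well each trial template fits on a grid of amplitudes.
trial_amplitudes = [0.1 * k for k in range(41)]  # 0.0 to 4.0
log_likes = [log_likelihood(data, template(times, a), sigma)
             for a in trial_amplitudes]
best_amplitude = trial_amplitudes[log_likes.index(max(log_likes))]
print(f"Best-fitting amplitude: {best_amplitude:.1f}")
```

The better a trial template matches the data, the higher its likelihood. The real analysis does the same kind of comparison, but over many parameters at once, using both detectors and the measured (coloured) noise spectrum.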

The shape of the gravitational wave encodes the properties of the source. This information is what lets us infer parameters. The example signal is GW150914. I made this explainer with Ben Farr and Nutsinee Kijbunchoo for the LIGO Magazine.

The waveform for a binary black hole merger has three fuzzily defined parts: the inspiral (where the two black holes orbit each other), the merger (where the black holes plunge together and form a single black hole) and ringdown (where the final black hole relaxes to its final state). Having waveforms which include all of these stages is a fairly recent development, and we’re still working on efficient ways of including all the effects of the spin of the initial black holes.

We currently have two favourite binary black hole waveforms for parameter estimation:

• The first we refer to as EOBNR, short for its proper name of SEOBNRv2_ROM_DoubleSpin. This is constructed by using some cunning analytic techniques to calculate the dynamics (known as effective-one-body or EOB) and tuning the results to match numerical relativity (NR) simulations. This waveform only includes the effects of spins aligned with the orbital angular momentum of the binary, so it doesn’t allow us to measure the effects of precession (wobbling around caused by the spins).
• The second we refer to as IMRPhenom, short for IMRPhenomPv2. This is constructed by fitting to the frequency dependence of EOB and NR waveforms. The dominant effects of precession are included by twisting up the waveform.

We’re currently working on results using a waveform that includes the full effects of spin, but that is extremely slow (it’s about half done now), so those results won’t be out for a while.

The results from the two waveforms agree really well, even though they’ve been created by different teams using different pieces of physics. This was a huge relief when I was first making a comparison of results! (We had been worried about systematic errors from waveform modelling). The consistency of results is partly because our models have improved and partly because the properties of the source are such that the remaining differences aren’t important. We’re quite confident that most of the parameters are reliably measured!

The component masses are the most important factor for controlling the evolution of the waveform, but we don’t measure the two masses independently. The evolution of the inspiral is dominated by a combination called the chirp mass, and the merger and ringdown are dominated by the total mass. For lighter mass systems, where we get lots of inspiral, we measure the chirp mass really well, and for high mass systems, where the merger and ringdown are the loudest parts, we measure the total mass. GW150914 is somewhere in the middle. The probability distributions for the masses are shown below: we can compensate for one of the component masses being smaller if we make the other larger, as this keeps chirp mass and total mass about the same.

Estimated masses for the two black holes in the binary. Results are shown for the EOBNR waveform and the IMRPhenom: both agree well. The Overall results come from averaging the two. The dotted lines mark the edge of our 90% probability intervals. The sharp diagonal line cut-off in the two-dimensional plot is a consequence of requiring $m_1^\mathrm{source} \geq m_2^\mathrm{source}$.  Fig. 1 from the Parameter Estimation Paper.
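To make the trade-off between the two component masses concrete, here is a small sketch using the standard definition of the chirp mass, $\mathcal{M} = (m_1 m_2)^{3/5} / (m_1 + m_2)^{1/5}$. The pairs of masses in the example are illustrative values chosen for this sketch, not results from the paper.

```python
def chirp_mass(m1, m2):
    """Chirp mass, in the same units as the component masses."""
    return (m1 * m2) ** 0.6 / (m1 + m2) ** 0.2

def total_mass(m1, m2):
    """Total mass of the binary."""
    return m1 + m2

# Two illustrative (hypothetical) pairs of component masses in solar masses:
# making one black hole heavier and the other lighter barely changes
# the chirp mass or the total mass.
for m1, m2 in [(36.0, 29.0), (40.0, 26.0)]:
    print(f"m1 = {m1}, m2 = {m2}: "
          f"chirp mass = {chirp_mass(m1, m2):.1f}, "
          f"total mass = {total_mass(m1, m2):.1f}")
```

As a sanity check, plugging in the median component masses quoted above for LVT151012 (23 and 13 solar masses) gives a chirp mass close to the quoted $15 M_\odot$, although there is no reason for medians of different distributions to match up exactly.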

To work out these masses, we need to take into account the expansion of the Universe. As the Universe expands, it stretches the wavelength of the gravitational waves. The same happens to light: visible light becomes redder, so the phenomenon is known as redshifting (even for gravitational waves). If you don’t take this into account, the masses you measure are too large. To work out how much redshift there is you need to know the distance to the source. The probability distribution for the distance is shown below; we plot the distance together with the inclination, since both of these affect the amplitude of the waves (the source is quietest when we look at it edge-on from the side, and loudest when seen face-on/off from above/below).

Estimated luminosity distance and binary inclination angle. An inclination of $\theta_{JN} = 90^\circ$ means we are looking at the binary (approximately) edge-on. Results are shown for the EOBNR waveform and the IMRPhenom: both agree well. The Overall results come from averaging the two. The dotted lines mark the edge of our 90% probability intervals.  Fig. 2 from the Parameter Estimation Paper.
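The redshift correction itself is simple: the mass measured in the detector frame is larger than the source mass by a factor of $(1 + z)$, where $z$ is the redshift. A minimal sketch follows, where the redshift and mass are made-up round numbers for illustration. Converting a luminosity distance to a redshift needs a cosmological model, which is not included here.

```python
def source_frame_mass(detector_frame_mass, redshift):
    """Undo the cosmological redshifting of a measured mass."""
    return detector_frame_mass / (1.0 + redshift)

# Hypothetical example: a detector-frame mass of 33 solar masses
# at an assumed redshift of 0.1.
print(f"{source_frame_mass(33.0, 0.1):.1f} solar masses")
```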

After the masses, the most important properties for the evolution of the binary are the spins. We don’t measure these too well, but the probability distributions for their magnitudes and orientations from the precessing IMRPhenom model are shown below. Both waveform models agree that the effective spin $\chi_\mathrm{eff}$ (which is a combination of both spins in the direction of the orbital angular momentum) is small. Therefore, either the spins are small or are larger but not aligned (or antialigned) with the orbital angular momentum. The spin of the more massive black hole is the better measured of the two.

Estimated orientation and magnitude of the two component spins from the precessing IMRPhenom model. The magnitude is between 0 and 1, and the spin is perfectly aligned with the orbital angular momentum if the angle is 0. The distribution for the more massive black hole is on the left, and for the smaller black hole on the right. Part of Fig. 5 from the Parameter Estimation Paper.
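The effective spin is the mass-weighted sum of the spin components along the orbital angular momentum, $\chi_\mathrm{eff} = (m_1 a_1 \cos\theta_1 + m_2 a_2 \cos\theta_2)/(m_1 + m_2)$, where $a_i$ are the spin magnitudes and $\theta_i$ the tilt angles. The sketch below shows why a small effective spin does not pin down the individual spins. The input values are hypothetical, chosen only to illustrate the two possibilities.

```python
import math

def effective_spin(m1, m2, a1, a2, tilt1, tilt2):
    """Effective spin; tilts are angles (radians) from the orbital angular momentum."""
    return (m1 * a1 * math.cos(tilt1) + m2 * a2 * math.cos(tilt2)) / (m1 + m2)

# Possibility 1: small spins, perfectly aligned.
small_aligned = effective_spin(36.0, 29.0, 0.05, 0.05, 0.0, 0.0)

# Possibility 2: large spins lying in the orbital plane (tilt of 90 degrees).
large_in_plane = effective_spin(36.0, 29.0, 0.9, 0.9, math.pi / 2, math.pi / 2)

print(f"Small aligned spins:  {small_aligned:.2f}")
print(f"Large in-plane spins: {large_in_plane:.2f}")
```

Both configurations give an effective spin close to zero, which is why the measurement alone cannot tell them apart.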

### The Testing General Relativity Paper

Synopsis: Testing General Relativity Paper
Read this if: You want to know more about the nature of gravity.
Favourite part: Einstein was right! (Or more correctly, we can’t prove he was wrong… yet)

The Testing General Relativity Paper is one of my favourites as it packs a lot of science in. Our first direct detection of gravitational waves and of the merger of two black holes provides a new laboratory to test gravity, and this paper runs through the results of the first few experiments.

Before we start making any claims about general relativity being wrong, we first have to check if there’s any weird noise present. You don’t want to have to rewrite the textbooks just because of an instrumental artifact. After taking out a good guess for the waveform (as predicted by general relativity), we find that the residuals do match what we expect for instrumental noise, so we’re good to continue.

I’ve written about a couple of tests of general relativity in my previous post: the consistency of the inspiral and merger–ringdown parts of the waveform, and the bounds on the mass of the graviton (from evolution of the signal). I’ll cover the others now.

The final part of the signal, where the black hole settles down to its final state (the ringdown), is the place to look to check that the object is a black hole and not some other type of mysterious dark and dense object. It is tricky to measure this part of the signal, but we don’t see anything odd. We can’t yet confirm that the object has all the properties you’d want to pin down that it is exactly a black hole as predicted by general relativity; we’re going to have to wait for a louder signal for this. This test is especially poignant, as Steven Detweiler, who pioneered a lot of the work calculating the ringdown of black holes, died a week before the announcement.

We can allow terms in our waveform (here based on the IMRPhenom model) to vary and see which values best fit the signal. If there is evidence for differences compared with the predictions from general relativity, we would have evidence for needing an alternative. Results for this analysis are shown below for a set of different waveform parameters $\hat{p}_i$: the $\varphi_i$ parameters determine the inspiral, the $\alpha_i$ parameters determine the merger–ringdown and the $\beta_i$ parameters cover the intermediate regime. If the deviation $\delta \hat{p}_i$ is zero, the value coincides with the value from general relativity. The plot shows what would happen if you allow all the variables to vary at once (the multiple results) and if you tried just that parameter on its own (the single results).

Probability distributions for waveform parameters. The single analysis only varies one parameter, the multiple analysis varies all of them, and the J0737-3039 result is the existing bound from the double pulsar. A deviation of zero is consistent with general relativity. Fig. 7 from the Testing General Relativity Paper.

Overall the results look good. Some of the single results are centred away from zero, but we think that this is just a random fluctuation caused by noise (we’ve seen similar behaviour in tests, so don’t panic yet). It’s not surprising that $\varphi_3$, $\varphi_4$ and $\varphi_{5l}$ all show this behaviour, as they are sensitive to similar noise features. These measurements are much tighter than from any test we’ve done before, except for the measurement of $\varphi_0$ which is better measured from the double pulsar (since we have lots and lots of orbits of that measured).

The final test is to look for additional polarizations of gravitational waves. These are predicted in several alternative theories of gravity. Unfortunately, because we only have two detectors which are pretty much aligned we can’t say much, at least without knowing for certain the location of the source. Extra detectors will be useful here!

In conclusion, we have found no evidence to suggest we need to throw away general relativity, but future events will help us to perform new and stronger tests.

### The Rates Paper

Synopsis: Rates Paper
Read this if: You want to know how often binary black holes merge (and how many we’ll detect)
Favourite part: There’s a good chance we’ll have ten detections by the end of our second observing run (O2)

Before September 14, we had never seen a binary stellar-mass black hole system. We were therefore rather uncertain about how many we would see. We had predictions based on simulations of the evolution of stars and their dynamical interactions. These said we shouldn’t be too surprised if we saw something in O1, but that we shouldn’t be surprised if we didn’t see anything for many years either. We weren’t really expecting to see a black hole system so soon (the smart money was on a binary neutron star). However, we did find a binary black hole, and this happened right at the start of our observations! What do we now believe about the rate of mergers?

To work out the rate, you first need to count the number of events you have detected and then work out how sensitive you are to the population of signals (how many could you see out of the total).

Counting detections sounds simple: we have GW150914 without a doubt. However, what about all the quieter signals? If you have 100 events each with a 1% probability of being real, then even though you can’t say with certainty that any one is an actual signal, you would expect one to be so. We want to work out how many events are real and how many are due to noise. Handily, trying to tell apart different populations of things when you’re not certain about individual members is a common problem in astrophysics (where it’s often difficult to go and check what something actually is), so there exists a probabilistic framework for doing this.

Using the expected number of real and noise events for a given detection statistic (as described in the Compact Binary Coalescence Paper), we count the number of detections and as a bonus, get a probability that each event is of astrophysical origin. There are two events with more than a 50% chance of being real: GW150914, where the probability is close to 100%, and LVT151012, where the probability is 84% based on GstLAL and 91% based on PyCBC.
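The idea behind the probability of astrophysical origin can be sketched with a two-component mixture: at a given value of the detection statistic, compare how many real events you expect there with how many noise events. This is a simplified illustration with fixed, made-up inputs. In the actual analysis, the expected counts are themselves uncertain and are inferred from the data.

```python
def probability_astrophysical(signal_count, noise_count,
                              signal_density, noise_density):
    """Probability that an event is real, for a two-component mixture.

    signal_count, noise_count: expected total numbers of real and noise events.
    signal_density, noise_density: normalised densities of the detection
    statistic for each population, evaluated at the event's value.
    """
    signal_term = signal_count * signal_density
    noise_term = noise_count * noise_density
    return signal_term / (signal_term + noise_term)

# Hypothetical loud event: noise almost never produces this statistic.
print(probability_astrophysical(2.0, 1000.0, 0.1, 1e-9))

# Counting: 100 events that each have a 1% chance of being real
# add up to about one expected real event.
expected_real = sum([0.01] * 100)
print(expected_real)
```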

By injecting lots of fake signals into some data and running our detection pipelines, we can work out how sensitive they are (in effect, how far away can they find particular types of sources). For a given number of detections, the more sensitive we are, the lower the actual rate of mergers should be (for lower sensitivity we would miss more, while there’s no hiding for higher sensitivity).
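In its simplest form, the rate estimate is the number of real events divided by the sensitive time–volume $\langle VT \rangle$, which is what the injection campaign measures. The sketch below uses hypothetical numbers purely to show the scaling. The paper's estimates come from a full probabilistic analysis that accounts for uncertainties.

```python
def merger_rate(expected_signal_count, sensitive_volume_time):
    """Merger rate in Gpc^-3 yr^-1, given <VT> in Gpc^3 yr."""
    return expected_signal_count / sensitive_volume_time

# Hypothetical values: the same number of detections implies a lower
# rate if the search is sensitive to a larger time-volume.
low_sensitivity_rate = merger_rate(2.0, 0.02)
high_sensitivity_rate = merger_rate(2.0, 0.04)
print(low_sensitivity_rate, high_sensitivity_rate)
```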

There is one final difficulty in working out the total number of binary black hole mergers: we need to know the distribution of masses, because our sensitivity depends on this. However, we don’t yet know this as we’ve only seen GW150914 and (maybe) LVT151012. Therefore, we try three possibilities to get an idea of what the merger rate could be.

1. We assume that binary black holes are either like GW150914 or like LVT151012. Given that these are our only possible detections at the moment, this should give a reasonable estimate. A similar approach has been used for estimating the population of binary neutron stars from pulsar observations [bonus note].
2. We assume that the distribution of masses is flat in the logarithm of the masses. This probably gives more heavy black holes than in reality (and so a lower merger rate).
3. We assume that black holes follow a power law like the initial masses of stars. This probably gives too many low mass black holes (and so a higher merger rate).

The estimated merger rates (number of binary black hole mergers per volume per time) are then: 1. $83^{+168}_{-63}~\mathrm{Gpc^{-3}\,yr^{-1}}$; 2. $61^{+124}_{-48}~\mathrm{Gpc^{-3}\,yr^{-1}}$, and 3. $200^{+400}_{-160}~\mathrm{Gpc^{-3}\,yr^{-1}}$. There is a huge scatter, but the flat and power-law rates hopefully bound the true value.

We’ll pin down the rate better after a few more detections. How many more should we expect to see? Using the projected sensitivity of the detectors over our coming observing runs, we can work out the probability of making $N$ more detections. This is shown in the plot below. It looks like there’s about a 10% chance of not seeing anything else in O1, but we’re confident that we’ll have 10 more by the end of O2, and 35 more by the end of O3! I may need to lie down…

The percentage chance of making 0, 10, 35 and 70 more detections of binary black holes as time goes on and detector sensitivity improves (based upon our data so far). This is a simplified version of part of Fig. 3 of the Rates Paper taken from the science summary.
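If detections arrive independently at a fixed average rate, the number of detections follows a Poisson distribution. The sketch below uses this to relate a chance of seeing nothing to an expected number of detections. It is a simplification: it assumes the expected number is known exactly, whereas the rate is still quite uncertain, so it will not reproduce the curves in the paper's figure.

```python
import math

def poisson_pmf(count, mean):
    """Probability of exactly `count` events when `mean` are expected."""
    return math.exp(-mean) * mean**count / math.factorial(count)

def prob_at_least(count, mean):
    """Probability of `count` or more events when `mean` are expected."""
    return 1.0 - sum(poisson_pmf(k, mean) for k in range(count))

# A 10% chance of zero further detections corresponds to this expected number:
expected_detections = -math.log(0.10)
print(f"Expected further detections: {expected_detections:.2f}")
print(f"Chance of at least one more: {prob_at_least(1, expected_detections):.2f}")
```

A 10% chance of seeing nothing corresponds to expecting about 2.3 further detections under this simple model.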

### The Burst Paper

Synopsis: Burst Paper
Read this if: You want to check what we can do without a waveform template
Favourite part: You don’t need a template to make a detection

When discussing what we can learn from gravitational wave astronomy, you can almost guarantee that someone will say something about discovering the unexpected. Whenever we’ve looked at the sky in a new band of the electromagnetic spectrum, we found something we weren’t looking for: pulsars for radio, gamma-ray bursts for gamma-rays, etc. Can we do the same in gravitational wave astronomy? There may well be signals we weren’t anticipating out there, but will we be able to detect them? The burst pipelines have our back here, at least for short signals.

The burst search pipelines, like their compact binary coalescence partners, assign candidate events a detection statistic and then work out a probability associated with being a false alarm caused by noise. The difference is that the burst pipelines try to find a wider range of signals.

There are three burst pipelines described: coherent WaveBurst (cWB), which famously first found GW150914; omicron–LALInferenceBurst (oLIB), and BayesWave, which follows up on cWB triggers.

As you might guess from the name, cWB looks for a coherent signal in both detectors. It looks for excess power (indicating a signal) in a time–frequency plot, and then classifies candidates based upon their structure. There’s one class for blip glitches and resonance lines (see the Detector Characterisation Paper), these are all thrown away as noise; one class for chirp-like signals that increase in frequency with time, this is where GW150914 was found, and one class for everything else. cWB’s detection statistic $\eta_c$ is something like a signal-to-noise ratio constructed based upon the correlated power in the detectors. The value for GW150914 was $\eta_c = 20$, which is higher than for any other candidate. The false alarm probability (or p-value), folding in all three search classes, is $2\times 10^{-6}$, which is pretty tiny, even if not as significant as for the tailored compact binary searches.

The oLIB search has two stages. First it makes a time–frequency plot and looks for power coincident between the two detectors. Likely candidates are then followed up by matching a sine–Gaussian wavelet to the data, using a similar algorithm to the one used for parameter estimation. Its detection statistic is something like a likelihood ratio for the signal versus noise. It calculates a false alarm probability of about $2\times 10^{-6}$ too.

BayesWave fits a variable number of sine–Gaussian wavelets to the data. This can model both a signal (when the wavelets are the same for both detectors) and glitches (when the wavelets are independent). This is really clever, but is too computationally expensive to be left running on all the data. Therefore, it follows up on things highlighted by cWB, potentially increasing their significance. Its detection statistic is the Bayes factor comparing the signal and glitch models. It estimates the false alarm probability to be about $7 \times 10^{-7}$ (which agrees with the cWB estimate if you only consider chirp-like triggers).
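For reference, a sine–Gaussian wavelet is just a sinusoid with a Gaussian envelope. The sketch below uses one common parametrisation (central time, central frequency, quality factor, amplitude and phase). The exact conventions used inside oLIB and BayesWave may differ, so treat the function and its argument names as assumptions.

```python
import math

def sine_gaussian(t, amplitude, t0, f0, quality, phase=0.0):
    """Sine-Gaussian wavelet evaluated at time t (seconds).

    The envelope width is set by the quality factor: tau = Q / (2 pi f0),
    so a larger Q means more cycles inside the envelope.
    """
    tau = quality / (2.0 * math.pi * f0)
    dt = t - t0
    envelope = math.exp(-dt**2 / tau**2)
    return amplitude * envelope * math.cos(2.0 * math.pi * f0 * dt + phase)

# Sample a hypothetical 100 Hz wavelet with Q = 10 at 4096 Hz.
samples = [sine_gaussian(i / 4096 - 0.05, 1.0, 0.0, 100.0, 10.0)
           for i in range(410)]
print(f"Peak value: {max(samples):.2f}")
```

Adding several of these together, each with its own parameters, gives a flexible model that can fit short signals without assuming they come from a binary.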

None of the searches find LVT151012. However, as this is a quiet, lower mass binary black hole, I think that this is not necessarily surprising.

cWB and BayesWave also output a reconstruction of the waveform. Reassuringly, this does look like binary black hole coalescence!

Gravitational waveforms from our analyses of GW150914. The wiggly grey lines are the data from Hanford (top) and Livingston (bottom); these are analysed coherently. The plots show waveforms whitened by the noise power spectral density. The dark band shows the waveform reconstructed by BayesWave without assuming that the signal is from a binary black hole (BBH). The light bands show the distribution of BBH template waveforms that were found to be most probable from our parameter-estimation analysis. The two techniques give consistent results: the match between the two models is $94^{+2}_{-3}\%$. Fig. 6 of the Parameter Estimation Paper.

The paper concludes by performing some simple fits to the reconstructed waveforms. For this, you do have to assume that the signal came from a binary black hole. They find parameters roughly consistent with those from the full parameter-estimation analysis, which is a nice sanity check of our results.

### The Detector Characterisation Paper

Synopsis: Detector Characterisation Paper
Read this if: You’re curious if something other than a gravitational wave could be responsible for GW150914 or LVT151012
Favourite part: Mega lightning bolts can cause correlated noise

The output from the detectors that we analyse for signals is simple. It is a single channel that records the strain. To monitor instrumental behaviour and environmental conditions the detector characterisation team record over 200,000 other channels. These measure everything from the alignment of the optics through ground motion to incidence of cosmic rays. Most of the data taken by LIGO is to monitor things which are not gravitational waves.

This paper examines all the potential sources of noise in the LIGO detectors, how we monitor them to ensure they are not confused for a signal, and the impact they could have on estimating the significance of events in our searches. It is amazingly thorough work.

There are lots of potential noise sources for LIGO. Uncorrelated noise sources happen independently at both sites, therefore they can only be mistaken for a gravitational wave if by chance two occur at the right time. Correlated noise sources affect both detectors, and so could be more confusing for our searches, although there’s no guarantee that they would cause a disturbance that looks anything like a binary black hole merger.

Sources of uncorrelated noise include:

• Ground motion caused by earthquakes or ocean waves. These create wibbling which can affect the instruments, even though they are well isolated. This is usually at low frequencies (below $0.1~\mathrm{Hz}$ for earthquakes, although it can be higher if the epicentre is near), unless there is motion in the optics around (which can couple to cause higher frequency noise). There is a network of seismometers to measure earthquakes at both sites. There were two magnitude 2.1 earthquakes within 20 minutes of GW150914 (one off the coast of Alaska, the other south-west of Seattle), but both produced ground motion that is ten times too small to impact the detectors. There was some low frequency noise in Livingston at the time of LVT151012 which is associated with a period of bad ocean waves. However, there is no evidence that these could be converted to the frequency range associated with the signal.
• People moving around near the detectors can also cause vibrational or acoustic disturbances. People are kept away from the detectors while they are running and accelerometers, microphones and seismometers monitor the environment.
• Modulation of the lasers at $9~\mathrm{MHz}$ and $45~\mathrm{MHz}$ is done to monitor and control several parts of the optics. There is a fault somewhere in the system which means that there is a coupling to the output channel and we get noise across $10~\mathrm{Hz}$ to $2~\mathrm{kHz}$, which is where we look for compact binary coalescences. Rai Weiss suggested shutting down the instruments to fix the source of this and delaying the start of observations—it’s a good job we didn’t. Periods of data where this fault occurs are flagged and not included in the analysis.
• Blip transients are a short glitch that occurs for unknown reasons. They’re quite mysterious. They are at the right frequency range ($30~\mathrm{Hz}$ to $250~\mathrm{Hz}$) to be confused with binary black holes, but don’t have the right frequency evolution. They contribute to the background of noise triggers in the compact binary coalescence searches, but are unlikely to be the cause of GW150914 or LVT151012 since they don’t have the characteristic chirp shape.

A time–frequency plot of a blip glitch in LIGO-Livingston. Blip glitches are the right frequency range to be confused with binary coalescences, but don’t have the chirp-like structure. Blips are symmetric in time, whereas binary coalescences sweep up in frequency. Fig. 3 of the Detector Characterisation Paper.

Correlated noise can be caused by:

• Electromagnetic signals which can come from lightning, solar weather or radio communications. This is measured by radio receivers and magnetometers, and it’s extremely difficult to produce a signal that is strong enough to have any impact on the detectors’ output. There was one strong (peak current of about $500~\mathrm{kA}$) lightning strike in the same second as GW150914 over Burkina Faso. However, the magnetic disturbances were at least a thousand times too small to explain the amplitude of GW150914.
• Cosmic ray showers can cause electromagnetic radiation and particle showers. The particle flux becomes negligible after a few kilometres, so it’s unlikely that both Livingston and Hanford would be affected, but just in case there is a cosmic ray detector at Hanford. It has seen nothing suspicious.

All the monitoring channels give us a lot of insight into the behaviour of the instruments. Times which can be identified as having especially bad noise properties (where the noise could influence the measured output), or where the detectors are not working properly, are flagged and not included in the search analyses. Applying these vetoes means that we can’t claim a detection when we know something else could mimic a gravitational wave signal, but it also helps us clean up our background of noise triggers. This has the impact of increasing the significance of the triggers which remain (since there are fewer false alarms they could be confused with). For example, if we leave the bad period in, the PyCBC false alarm probability for LVT151012 goes up from $0.02$ to $0.14$. The significance of GW150914 is so great that we don’t really need to worry about the effects of vetoes.
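To see why vetoes change the significance, here is a toy sketch. It assumes the false alarm probability is set by the rate of background triggers louder than the candidate, treated as a Poisson process. The trigger list, the times and the exact formula are all hypothetical simplifications. The real searches estimate their background quite differently in detail.

```python
import math

def count_louder(triggers, candidate_statistic, apply_vetoes):
    """Count background triggers louder than the candidate.

    Each trigger is (detection_statistic, falls_in_vetoed_time).
    """
    return sum(1 for statistic, vetoed in triggers
               if statistic > candidate_statistic
               and not (apply_vetoes and vetoed))

def false_alarm_probability(louder_count, background_time, observing_time):
    """Chance of at least one louder noise event during the observation."""
    rate = louder_count / background_time
    return 1.0 - math.exp(-rate * observing_time)

# Made-up background triggers; several loud ones fall in a bad period.
background = [(8.1, False), (9.9, True), (10.3, True),
              (7.5, False), (9.8, False), (8.8, True)]
candidate = 9.7
background_time = 100.0  # hypothetical years of effective background
observing_time = 1.0     # hypothetical years of observation

fap_with_vetoes = false_alarm_probability(
    count_louder(background, candidate, apply_vetoes=True),
    background_time, observing_time)
fap_without_vetoes = false_alarm_probability(
    count_louder(background, candidate, apply_vetoes=False),
    background_time, observing_time)
print(fap_with_vetoes, fap_without_vetoes)
```

Leaving the bad period in adds loud noise triggers to the background, so the same candidate looks less significant.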

At the time of GW150914 the detectors were running well, the data around the event are clean, and nothing in any of the auxiliary channels records anything which could have caused the event. The only source of a correlated signal which has not been ruled out is a gravitational wave from a binary black hole merger. The time–frequency plots of the measured strains are shown below, and it’s easy to pick out the chirps.

Time–frequency plots for GW150914 as measured by Hanford (left) and Livingston (right). These show the characteristic increase in frequency with time of the chirp of a binary merger. The signal is clearly visible above the noise. Fig. 10 of the Detector Characterisation Paper.

The data around LVT151012 are significantly less stationary than around GW150914. There was an elevated noise transient rate around this time. This is probably due to extra ground motion caused by ocean waves. This low frequency noise is clearly visible in the Livingston time–frequency plot below. There is no evidence that this gets converted to higher frequencies though. None of the detector characterisation results suggest that LVT151012 was caused by a noise artifact.

Time–frequency plots for LVT151012 as measured by Hanford (left) and Livingston (right). You can see the characteristic increase in frequency with time of the chirp of a binary merger, but this is mixed in with noise. The scale is reduced compared with for GW150914, which is why noise features appear more prominent. The band at low frequency in Livingston is due to ground motion; this is not present in Hanford. Fig. 13 of the Detector Characterisation Paper.

If you’re curious about the state of the LIGO sites and their array of sensors, you can see more about the physical environment monitors at pem.ligo.org.

### The Calibration Paper

Synopsis: Calibration Paper
Read this if: You like control engineering or precision measurement
Favourite part: Not only are the LIGO detectors sensitive enough to feel the push from a beam of light, they are so sensitive that you have to worry about where on the mirrors you push

We want to measure the gravitational wave strain, the change in length across our detectors caused by a passing gravitational wave. What we actually record is the intensity of laser light at the output of our interferometer. (The output should be dark when the strain is zero, and the intensity increases when the interferometer is stretched or squashed). We need a way to convert intensity to strain, and this requires careful calibration of the instruments.

The calibration is complicated by the control systems. The LIGO instruments are incredibly sensitive, and maintaining them in a stable condition requires lots of feedback systems. These can impact how the strain is transduced into the signal read out by the interferometer. A schematic of how $\Delta L_\mathrm{free}$, the change in the length of the arms that there would be without control systems, is changed into the measured strain $h$ is shown below. The calibration pipeline builds models to correct for the effects of the control system to provide an accurate model of the true gravitational wave strain.

Model for how a differential arm length caused by a gravitational wave $\Delta L_\mathrm{free}$ or a photon calibration signal $x_\mathrm{T}^\mathrm{(PC)}$ is converted into the measured signal $h$. Fig. 2 from the Calibration Paper.

To measure the different responses of the system, the calibration team make several careful measurements. The primary means is using photon calibration: an auxiliary laser is used to push the mirrors and the response is measured. The spots where the lasers are pointed are carefully chosen to minimise distortion to the mirrors caused by pushing on them. A secondary means is to use actuators which are parts of the suspension system to excite the system.

As a cross-check, we can also use two auxiliary green lasers to measure changes in length using either a frequency modulation or their wavelength. These are similar approaches to those used in initial LIGO. These do give consistent results with the other methods, but they are not as accurate.

Overall, the uncertainty in the calibration of the amplitude of the strain is less than $10\%$ between $20~\mathrm{Hz}$ and $1~\mathrm{kHz}$, and the uncertainty in phase calibration is less than $10^\circ$. These are the values that we use in our parameter-estimation runs. However, the calibration uncertainty actually varies as a function of frequency, with some ranges having much less uncertainty. We’re currently working on implementing a better model for the uncertainty, which may improve our measurements. Fortunately, the masses aren’t too affected by the calibration uncertainty, but sky localization is, so we might get some gain here. We’ll hopefully produce results with updated calibration in the near future.
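One simple way to picture a calibration error is as a frequency-dependent rescaling and phase shift of the true strain in the frequency domain. The sketch below applies a single amplitude and phase error to one complex strain value. This is an assumed, simplified model for illustration, not the model used in the parameter-estimation runs.

```python
import cmath
import math

def apply_calibration_error(true_strain, amplitude_error, phase_error_deg):
    """Distort a complex frequency-domain strain value.

    amplitude_error: fractional error (0.1 means 10% too large).
    phase_error_deg: phase error in degrees.
    """
    phase_error = math.radians(phase_error_deg)
    return true_strain * (1.0 + amplitude_error) * cmath.exp(1j * phase_error)

# Worst case within the quoted bounds: 10% in amplitude, 10 degrees in phase.
measured = apply_calibration_error(1.0 + 0.0j, 0.10, 10.0)
print(abs(measured), math.degrees(cmath.phase(measured)))
```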

### The Astrophysics Paper

Synopsis: Astrophysics Paper
Read this if: You are interested in how binary black holes form
Favourite part: We might be able to see similar mass binary black holes with eLISA before they merge in the LIGO band [bonus note]

This paper puts our observations of GW150914 in context with regards to existing observations of stellar-mass black holes and theoretical models for binary black hole mergers. Although it doesn’t explicitly mention LVT151012, most of the conclusions would be just as applicable to its source, if it is real. I expect there will be rapid development of the field now, but if you want to catch up on some background reading, this paper is the place to start.

The paper contains lots of references to good papers to delve into. It also highlights the main conclusions we can draw in italics, so it’s easy to skim through if you want a summary. I discussed the main astrophysical conclusions in my previous post. We will know more about binary black holes and their formation when we get more observations, so I think it is a good time to get interested in this area.

### The Stochastic Paper

Synopsis: Stochastic Paper
Read this if: You like stochastic backgrounds
Favourite part: We might detect a background in the next decade

A stochastic gravitational wave background could be created by an incoherent superposition of many signals. In pulsar timing, they are looking for a background from many merging supermassive black holes. Could we have a similar thing from stellar-mass black holes? The loudest signals, like GW150914, are resolvable: they stand out from the background. However, for every loud signal, there will be many quiet signals, and the ones below our detection threshold could form a background. Since we’ve found that binary black hole mergers are probably plentiful, the background may be at the high end of previous predictions.

The background from stellar-mass black holes is different than the one from supermassive black holes because the signals are short. While the supermassive black holes produce an almost constant hum throughout your observations, stellar-mass black hole mergers produce short chirps. Instead of having lots of signals that overlap in time, we have a popcorn background, with one arriving on average every 15 minutes. This might allow us to do some different things when it comes to detection, but for now, we just use the standard approach.
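Whether a background is a constant hum or popcorn comes down to the duty cycle: the typical duration of a signal divided by the typical time between arrivals. A rough sketch is below, using the 15 minute interval quoted above. The one second signal duration is an assumed round number for illustration, not a value from the paper.

```python
def duty_cycle(signal_duration, mean_interval):
    """Average number of signals present at any instant (same time units)."""
    return signal_duration / mean_interval

# One merger every 15 minutes, each assumed to last about a second in band.
popcorn = duty_cycle(1.0, 15.0 * 60.0)
print(f"Duty cycle: {popcorn:.4f}")
```

A duty cycle much less than one means the signals rarely overlap (popcorn), while a duty cycle much greater than one means many signals are always present at once (a continuous hum).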

This paper calculates the energy density of gravitational waves from binary black holes, excluding the contribution from signals loud enough to be detected. This is done for several different models. The standard (fiducial) model assumes parameters broadly consistent with those of GW150914’s source, plus a particular model for the formation of merging binaries. There are then variations on the model for formation, considering different time delays between formation and merger, and adding in lower mass systems consistent with LVT151012. All these models are rather crude, but give an idea of potential variations in the background. Hopefully more realistic distributions will be considered in the future. There is some change between models, but this is within the (considerable) statistical uncertainty, so predictions seem robust.

Different models for the stochastic background of binary black holes. This is plotted in terms of energy density. The red band indicates the uncertainty on the fiducial model. The dashed line indicates the sensitivity of the LIGO and Virgo detectors after several years at design sensitivity. Fig. 2 of the Stochastic Paper.

After a couple of years at design sensitivity we may be able to make a confident detection of the stochastic background. The background from binary black holes is more significant than we expected.

If you’re wondering whether we could see other types of backgrounds, such as one of cosmological origin, then the background due to binary black holes could make detection more difficult. In effect, it acts as another source of noise, masking the other background. However, we may be able to distinguish the different backgrounds by measuring their frequency dependencies (we expect them to have different slopes), if they are loud enough.
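To make the point about slopes concrete, here is a minimal sketch that fits the logarithmic slope of two toy backgrounds. It assumes each background is a pure power law in frequency. The $2/3$ slope for inspiralling binaries is the standard low-frequency result rather than something quoted in this post, the flat background is just an illustrative stand-in for a cosmological one, and the amplitudes and frequency grid are made up.

```python
import math

def fit_log_slope(freqs, omegas):
    """Least-squares slope of log(energy density) against log(frequency)."""
    xs = [math.log(f) for f in freqs]
    ys = [math.log(o) for o in omegas]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    num = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    den = sum((x - x_mean) ** 2 for x in xs)
    return num / den

# Hypothetical frequency grid (Hz) and amplitudes, chosen only for illustration.
freqs = [10.0 * 1.2 ** i for i in range(10)]
binary_background = [1e-9 * (f / 25.0) ** (2.0 / 3.0) for f in freqs]  # assumed f^(2/3)
flat_background = [1e-9 for _ in freqs]                                # assumed flat

binary_slope = fit_log_slope(freqs, binary_background)
flat_slope = fit_log_slope(freqs, flat_background)
print(f"binary slope: {binary_slope:.3f}, flat slope: {flat_slope:.3f}")
```

With real data the measured energy density would be noisy, so the fitted slopes would come with uncertainties, which is why the backgrounds need to be loud enough.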

### The Neutrino Paper

Synopsis: Neutrino Paper
Read this if: You really like high energy neutrinos
Favourite part: We’re doing astronomy with neutrinos and gravitational waves—this is multimessenger astronomy without any form of electromagnetic radiation

There are multiple detectors that can look for high energy neutrinos. Currently, LIGO–Virgo observations are being followed up by searches from ANTARES and IceCube. Both of these are Cherenkov detectors: they look for flashes of light created by fast moving particles, not the neutrinos themselves, but things they’ve interacted with. ANTARES searches the waters of the Mediterranean while IceCube uses the ice of Antarctica.

Within 500 seconds either side of the time of GW150914, ANTARES found no neutrinos and IceCube found three. These results are consistent with background levels (you would expect on average less than one and 4.4 neutrinos over that time from the two respectively). Additionally, none of the IceCube neutrinos are consistent with the sky localization of GW150914 (even though the sky area is pretty big). There is no sign of a neutrino counterpart, which is what we were expecting.
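As a rough check of what "consistent with background" means here, the sketch below treats the counts as Poisson distributed. That is my simplifying assumption rather than the analysis used in the Neutrino Paper. The IceCube expectation of 4.4 is taken from the text; for ANTARES the text only says "less than one", so the code uses 1.0 as an upper bound.

```python
import math

def poisson_pmf(k, mean):
    """Probability of exactly k events for a Poisson process with the given mean."""
    return math.exp(-mean) * mean ** k / math.factorial(k)

def poisson_cdf(k, mean):
    """Probability of k or fewer events."""
    return sum(poisson_pmf(i, mean) for i in range(k + 1))

icecube_expected = 4.4   # background expectation quoted in the text
icecube_observed = 3     # neutrinos found within +/- 500 s

# Neither tail is small, so three events is unremarkable for a mean of 4.4.
p_at_most_observed = poisson_cdf(icecube_observed, icecube_expected)
p_at_least_observed = 1.0 - poisson_cdf(icecube_observed - 1, icecube_expected)

# ANTARES: the expectation is below one, so 1.0 is an assumed upper bound.
# A smaller mean would make seeing zero events even more likely.
antares_upper_mean = 1.0
p_zero_antares = poisson_pmf(0, antares_upper_mean)

print(f"IceCube: P(N <= 3) = {p_at_most_observed:.2f}, P(N >= 3) = {p_at_least_observed:.2f}")
print(f"ANTARES: P(N = 0) is at least {p_zero_antares:.2f}")
```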

Subsequent non-detections have been reported by KamLAND, the Pierre Auger Observatory, Super-Kamiokande, Borexino and NOvA.

### The Electromagnetic Follow-up Paper

Synopsis: Electromagnetic Follow-up Paper
Read this if: You are interested in the search for electromagnetic counterparts
Favourite part: So many people were involved in this work that not only do we have to abbreviate the list of authors (Abbott, B.P. et al.), but we should probably abbreviate the list of collaborations too (LIGO Scientific & Virgo Collaboration et al.)

This is the last of the set of companion papers to be released—it took a huge amount of coordinating because of all the teams involved. The paper describes how we released information about GW150914. This should not be typical of how we will do things going forward (i) because we didn’t have all the infrastructure in place on September 14 and (ii) because it was the first time we had something we thought was real.

The first announcement was sent out on September 16, and this contained sky maps from the Burst codes cWB and LIB. In the future, we should be able to send out automated alerts with a few minutes latency.

For the first alert, we didn’t have any results which assumed the source was a binary, as the searches which issue triggers at low latency were only looking for lower mass systems which would contain a neutron star. I suspect we’ll be reprioritising things going forward. The first information we shared about the potential masses for the source was shared on October 3. Since this was the first detection, everyone was cautious about triple-checking results, which caused the delay. Revised false alarm rates including results from GstLAL and PyCBC were sent out October 20.

The final sky maps were shared January 13. This is when we’d about finished our own reviews and knew that we would be submitting the papers soon [bonus note]. Our best sky map is the one from the Parameter Estimation Paper. You might expect it to be more constraining than the results from the burst pipelines since it uses a proper model for the gravitational waves from a binary black hole. This is the case if we ignore calibration uncertainty (which is not yet included in the burst codes): then the 50% area is $48~\mathrm{deg}^2$ and the 90% area is $150~\mathrm{deg}^2$. However, including calibration uncertainty, the sky areas are $150~\mathrm{deg}^2$ and $590~\mathrm{deg}^2$ at 50% and 90% probability respectively. Calibration uncertainty has the largest effect on sky area. All the sky maps agree that the source is in some region of the annulus set by the time delay between the two detectors.
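To see where the annulus comes from, here is a minimal sketch of the geometry. A fixed difference in arrival time puts the source on a cone around the line joining the two detectors, with opening angle $\theta$ satisfying $\cos\theta = c\,\Delta t_\mathrm{HL}/D$. The numbers are assumptions not quoted in this post: a Hanford–Livingston separation $D$ of about 3002 km, a measured delay of about 6.9 ms (the value reported in the Discovery Paper), and a symmetric 0.5 ms uncertainty used purely for illustration.

```python
import math

SPEED_OF_LIGHT_KM_S = 299792.458

def ring_opening_angle_deg(delay_s, baseline_km):
    """Angle between the detector baseline and the source direction for a given delay."""
    cos_theta = SPEED_OF_LIGHT_KM_S * delay_s / baseline_km
    if abs(cos_theta) > 1.0:
        raise ValueError("delay exceeds the light travel time between the detectors")
    return math.degrees(math.acos(cos_theta))

baseline_km = 3002.0     # assumed Hanford-Livingston separation
delay_s = 6.9e-3         # assumed arrival-time difference for GW150914
delay_error_s = 0.5e-3   # illustrative symmetric uncertainty

max_delay_ms = 1e3 * baseline_km / SPEED_OF_LIGHT_KM_S
centre_deg = ring_opening_angle_deg(delay_s, baseline_km)
inner_deg = ring_opening_angle_deg(delay_s + delay_error_s, baseline_km)
outer_deg = ring_opening_angle_deg(delay_s - delay_error_s, baseline_km)

print(f"maximum possible delay: {max_delay_ms:.1f} ms")
print(f"ring centred at {centre_deg:.1f} deg, spanning {inner_deg:.1f} to {outer_deg:.1f} deg")
```

The timing alone only gives the ring; it is the extra information in the signal amplitude and phase at each detector that picks out part of it.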

The different sky maps for GW150914 in an orthographic projection. The contours show the 90% region for each algorithm. The faint circles show lines of constant time delay $\Delta t_\mathrm{HL}$ between the two detectors. BAYESTAR rapidly computes sky maps for binary coalescences, but it needs the output of one of the detection pipelines to run, and so was not available at low latency. The LALInference map is our best result. All the sky maps are available as part of the data release. Fig. 2 of the Electromagnetic Follow-up Paper.

A timeline of events is shown below. There were follow-up observations across the electromagnetic spectrum from gamma-rays and X-rays through the optical and near infra-red to radio.

Timeline for observations of GW150914. The top (grey) band shows information about gravitational waves. The second (blue) band shows high-energy (gamma- and X-ray) observations. The third and fourth (green) bands show optical and near infra-red observations respectively. The bottom (red) band shows radio observations. Fig. 1 from the Electromagnetic Follow-up Paper.

Observations have been reported (via GCN notices) by

Together they cover an impressive amount of the sky as shown below. Many targeted the Large Magellanic Cloud before they knew the source was a binary black hole.

Footprints of observations compared with the 50% and 90% areas of the initially distributed (cWB: thick lines; LIB: thin lines) sky maps, also in orthographic projection. The all-sky observations are not shown. The grey background is the Galactic plane. Fig. 3 of the Electromagnetic Follow-up Paper.

Additional observations have been done using archival data by XMM-Newton and AGILE.

We don’t expect any electromagnetic counterpart to a binary black hole. No-one found anything with the exception of Fermi GBM, which found a weak signal that may be coincident. More work is required to figure out if this is genuine (the statistical analysis looks OK, but sometimes you do have a false alarm). It would be a surprise if it is, so most people are sceptical. However, I think this will make people more interested in following up on our next binary black hole signal!

### Bonus notes

#### Naming The Event

GW150914 is the name we have given to the signal detected by the two LIGO instruments. The “GW” is short for gravitational wave (not galactic worm), and the numbers give the date the wave reached the detectors (2015 September 14). It was originally known as G184098, its ID in our database of candidate events (most circulars sent to and from our observer partners use this ID). That was universally agreed to be terrible to remember. We tried to think of a good nickname for the event, but failed to, so rather by default, it has informally become known as The Event within the Collaboration. I think this is fitting given its significance.

LVT151012 is the name of the most significant candidate after GW150914. It doesn’t reach our criteria to claim detection (a false alarm rate of less than once per century), which is why it’s not GW151012. The “LVT” is short for LIGO–Virgo trigger. It took a long time to settle on this and up until the final week before the announcement it was still going by G197392. Informally, it was known as The Second Monday Event, as it too was found on a Monday. You’ll have to wait for us to finish looking at the rest of the O1 data to see if the Monday trend continues. If it does, it could have serious repercussions for our understanding of Garfield.

Following the publication of the O2 Catalogue Paper, LVT151012 was upgraded to GW151012, and we decided to get rid of the LVT class as it was rather confusing.

#### Publishing in Physical Review Letters

Several people have asked me if the Discovery Paper was submitted to Science or Nature. It was not. The decision that any detection would be submitted to Physical Review was made ahead of the run. As far as I am aware, there was never much debate about this. Physical Review had been good about publishing all our non-detections and upper limits, so it only seemed fair that they got the discovery too. You don’t abandon your friends when you strike it rich. I am glad that we submitted to them.

Gaby González, the LIGO Spokesperson, contacted the editors of Physical Review Letters ahead of submission to let them know of the anticipated results. They then started to line up some referees to give confidential and prompt reviews.

The initial plan was to submit on January 19, and we held a Collaboration-wide tele-conference to discuss the science. There were a few more things still to do, so the paper was submitted on January 21, following another presentation (and a long discussion of whether a number should be a six or a two) and a vote. The vote was overwhelmingly in favour of submission.

We got the referee reports back on January 27, although they were circulated to the Collaboration the following day. This was a rapid turnaround! From their comments, I suspect that Referee A may be a particle physicist who has dealt with similar claims of first detection—they were most concerned about statistical significance; Referee B seemed like a relativist—they made comments about the effect of spin on measurements, knew about waveforms and even historical papers on gravitational waves, and I would guess that Referee C was an astronomer involved with pulsars—they mentioned observations of binary pulsars potentially claiming the title of first detection and were also curious about sky localization. While I can’t be certain who the referees were, I am certain that I have never had such positive reviews before! Referee A wrote

The paper is extremely well written and clear. These results are obviously going to make history.

Referee B wrote

This paper is a major breakthrough and a milestone in gravitational science. The results are overall very well presented and its suitability for publication in Physical Review Letters is beyond question.

and Referee C wrote

It is an honor to have the opportunity to review this paper. It would not be an exaggeration to say that it is the most enjoyable paper I’ve ever read. […] I unreservedly recommend the paper for publication in Physical Review Letters. I expect that it will be among the most cited PRL papers ever.

I suspect I will never have such emphatic reviews again [happy bonus note][unhappy bonus note].

Publishing in Physical Review Letters seems to have been a huge success. So much so that their servers collapsed under the demand, despite them adding two more in anticipation. In the end they had to quintuple their number of servers to keep up with demand. There were 229,000 downloads from their website in the first 24 hours. Many people remarked that it was good that the paper was freely available. However, we always make our papers public on the arXiv or via LIGO’s Document Control Center [bonus bonus note], so there should never be a case where you miss out on reading a LIGO paper!

#### Publishing the Parameter Estimation Paper

The reviews for the Parameter Estimation Paper were also extremely positive. Referee A, who had some careful comments on clarifying notation, wrote

This is a beautiful paper on a spectacular result.

Referee B, who commendably did some back-of-the-envelope checks, wrote

The paper is also very well written, and includes enough background that I think a decent fraction of it will be accessible to non-experts. This, together with the profound nature of the results (first direct detection of gravitational waves, first direct evidence that Kerr black holes exist, first direct evidence that binary black holes can form and merge in a Hubble time, first data on the dynamical strong-field regime of general relativity, observation of stellar mass black holes more massive than any observed to date in our galaxy), makes me recommend this paper for publication in PRL without hesitation.

Referee C, who made some suggestions to help a non-specialist reader, wrote

This is a generally excellent paper describing the properties of LIGO’s first detection.

Physical Review Letters were also kind enough to publish this paper open access without charge!

#### Publishing the Rates Paper

It wasn’t all clear sailing getting the companion papers published. Referees did give papers the thorough checking that they deserved. The most difficult review was of the Rates Paper. There were two referees, one astrophysics, one statistics. The astrophysics referee was happy with the results and made a few suggestions to clarify or further justify the text. The statistics referee had more serious complaints…

There are five main things which I think made the statistics referee angry. First, the referee objected to our terminology

While overall I’ve been impressed with the statistics in LIGO papers, in one respect there is truly egregious malpractice, but fortunately easy to remedy. It concerns incorrectly using the term “false alarm probability” (FAP) to refer to what statisticians call a p-value, a deliberately vague term (“false alarm rate” is similarly misused). […] There is nothing subtle or controversial about the LIGO usage being erroneous, and the practice has to stop, not just within this paper, but throughout the LIGO collaboration (and as a matter of ApJ policy).

I agree with this. What we call the false alarm probability is not the probability that the detection is a false alarm. It is not the probability that the given signal is noise rather than astrophysical, but instead it is the probability that, if we only had noise, we would get a detection statistic as significant or more so. It might take a minute to realise why those are different. The latter (the one we should call a p-value) is what the search pipelines give us, but it is less useful than the former for actually working out if the signal is real. The probabilities calculated in the Rates Paper that the signal is astrophysical are really what you want.

p-values are often misinterpreted, but most scientists are aware of this, and so are cautious when they come across them
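If the difference between the two probabilities still seems slippery, a toy model may help. Everything in the sketch below is invented for illustration and is not the calculation in the Rates Paper: noise and signal detection statistics are both assumed to follow exponential distributions, and the expected numbers of noise and astrophysical triggers are hypothetical.

```python
import math

# Toy model: every number here is made up for illustration.
NOISE_SCALE = 1.0      # noise statistics fall off as exp(-x / 1)
SIGNAL_SCALE = 5.0     # signal statistics fall off more slowly
NOISE_COUNT = 1000.0   # expected number of noise triggers in the data
SIGNAL_COUNT = 1.0     # expected number of astrophysical triggers

def exp_density(x, scale):
    """Exponential probability density with the given scale."""
    return math.exp(-x / scale) / scale

def p_value(x):
    """Probability that a noise-only trigger has a statistic at least as large as x."""
    return math.exp(-x / NOISE_SCALE)

def p_astro(x):
    """Probability that a trigger with statistic x is astrophysical (Bayes' theorem)."""
    signal = SIGNAL_COUNT * exp_density(x, SIGNAL_SCALE)
    noise = NOISE_COUNT * exp_density(x, NOISE_SCALE)
    return signal / (signal + noise)

for x in (4.0, 8.0, 12.0):
    print(f"x = {x:4.1f}: p-value = {p_value(x):.1e}, P(noise | x) = {1.0 - p_astro(x):.2f}")
```

In this toy, a trigger can have a tiny p-value and still be more likely noise than signal, because there are so many more noise triggers than signals to begin with.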

As a consequence of this complaint, the Collaboration is purging “false alarm probability” from our papers. It is used in most of the companion papers, as they were published before we got this report (and managed to convince everyone that it is important).

Second, we were lacking in references to existing literature

Regarding scholarship, the paper is quite poor. I take it the authors have written this paper with the expectation, or at least the hope, that it would be read […] If I sound frustrated, it’s because I am.

This is fair enough. The referee pointed us to work done on inferring the rate of gamma-ray bursts by Loredo & Wasserman (Part I, Part II, Part III), as well as by Petit, Kavelaars, Gladman & Loredo on trans-Neptunian objects, and we made sure to add as much work as possible in revisions. There’s no excuse for not properly citing useful work!

Third, the referee didn’t understand how we could be certain of the distribution of signal-to-noise ratio $\rho$ without also worrying about the distribution of parameters like the black hole masses. The signal-to-noise ratio is inversely proportional to distance, and we expect sources to be uniformly distributed in volume. Putting these together (and ignoring corrections from cosmology) gives a distribution for signal-to-noise ratio of $p(\rho) \propto \rho^{-4}$ (Schutz 2011). This is sufficiently well known within the gravitational-wave community that we forgot that those outside wouldn’t appreciate it without some discussion. Therefore, it was useful that the referee did point this out.
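For anyone outside the field, the argument can be sketched in a couple of lines. The notation here is my own shorthand rather than that of the Rates Paper: $d$ is the distance to the source, and $k$ is an assumed constant of proportionality that bundles up the masses, orientation and detector sensitivity. Sources spread uniformly in volume have $p(d) \propto d^2$, and $\rho = k/d$, so

$$
p(\rho) = p(d)\left|\frac{\mathrm{d}d}{\mathrm{d}\rho}\right| \propto \left(\frac{k}{\rho}\right)^2 \frac{k}{\rho^2} \propto \rho^{-4}.
$$

The value of $k$ only changes the overall normalisation, not the power law, which is why the shape does not depend on the distribution of masses. Normalising above a detection threshold $\rho_\mathrm{th}$ (again, an assumed symbol) gives

$$
p(\rho) = \frac{3\rho_\mathrm{th}^3}{\rho^4}, \qquad \rho \geq \rho_\mathrm{th}.
$$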

Fourth, the referee thought we had made an error in our approach. They provided an alternative derivation which

if useful, should not be used directly without some kind of attribution

Unfortunately, they were missing some terms in their expressions. When these were added in, their approach reproduced our own (I had a go at checking this myself). Given that we had annoyed the referee on so many other points, it was tricky trying to convince them of this. Most of the time spent responding to the referees was actually working on the referee response and not on the paper.

Finally, the referee was unhappy that we didn’t make all our data public so that they could check things themselves. I think it would be great, and it will happen; it was just too early at the time.

#### LIGO Document Control Center

Papers in the LIGO Document Control Center are assigned a number starting with P (for “paper”) and then several digits. The Discovery Paper’s reference is P150914. I only realised why this was the case on the day of submission.

#### The überbank

The set of templates used in the searches is designed to be able to catch binary neutron stars, neutron star–black hole binaries and binary black holes. It covers component masses from 1 to 99 solar masses, with total masses less than 100 solar masses. The upper cut off is chosen for computational convenience, rather than physical reasons: we do look for higher mass systems in a similar way, but they are easier to confuse with glitches and so we have to be more careful tuning the search. Since the bank of templates is so comprehensive, it is known as the überbank. Although it could find binary neutron stars or neutron star–black hole binaries, we only discuss binary black holes here.

The template bank doesn’t cover the full parameter space; in particular, it assumes that spins are aligned for the two components. This shouldn’t significantly affect its efficiency at finding signals, but gives another reason (together with the coarse placement of templates) why we need to do proper parameter estimation to measure properties of the source.
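The mass limits of the überbank can be written as a simple check. This is only a sketch of the boundaries quoted above, not the code used to build the bank: it ignores spins and the placement of individual templates, and treating the component limits as inclusive is my assumption. The example masses are illustrative; 36 and 29 solar masses are roughly the component masses of GW150914's source.

```python
MIN_COMPONENT_MASS = 1.0   # solar masses
MAX_COMPONENT_MASS = 99.0  # solar masses
MAX_TOTAL_MASS = 100.0     # solar masses

def in_bank_mass_range(mass_1, mass_2):
    """Check whether a binary's masses fall inside the quoted limits of the bank."""
    components_ok = all(
        MIN_COMPONENT_MASS <= m <= MAX_COMPONENT_MASS for m in (mass_1, mass_2)
    )
    return components_ok and (mass_1 + mass_2) < MAX_TOTAL_MASS

examples = {
    "binary neutron star (1.4, 1.4)": (1.4, 1.4),
    "GW150914-like (36, 29)": (36.0, 29.0),
    "too heavy (60, 50)": (60.0, 50.0),
    "too light (0.5, 10)": (0.5, 10.0),
}
for label, (m1, m2) in examples.items():
    print(f"{label}: {in_bank_mass_range(m1, m2)}")
```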

#### Alphabet soup

In the calculation of rates, the probabilistic means for counting sources is known as the FGMC method after its authors (who include two Birmingham colleagues and my former supervisor). The means of calculating rates assuming that the population is divided into one class to match each observation is also named for the initials of its authors as the KKL approach. The combined FGMCKKL method for estimating merger rates goes by the name alphabet soup, as that is much easier to swallow.

#### Multi-band gravitational wave astronomy

The prospect of detecting a binary black hole with a space-based detector and then seeing the same binary merger with ground-based detectors is especially exciting. My officemate Alberto Sesana (who’s not in LIGO) has just written a paper on the promise of multi-band gravitational wave astronomy. Black hole binaries like GW150914 could be spotted by eLISA (if you assume one of the better sensitivities for a detector with three arms). Then a few years to weeks later they merge, and spend their last moments emitting in LIGO’s band. The evolution of some binary black holes is sketched in the plot below.

The evolution of binary black hole mergers (shown in blue). The eLISA and Advanced LIGO sensitivity curves are shown in purple and orange respectively. As the black holes inspiral, they emit gravitational waves at higher frequency, shifting from the eLISA band to the LIGO band (where they merge). The scale at the top gives the approximate time until merger. Fig. 1 of Sesana (2016).
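The time until merger can be estimated with the standard leading-order (quadrupole) formula, $\tau = \frac{5}{256}\left(G\mathcal{M}/c^3\right)^{-5/3}(\pi f)^{-8/3}$, where $f$ is the gravitational-wave frequency and $\mathcal{M}$ is the chirp mass. The sketch below is a rough estimate only: it ignores higher-order corrections and eccentricity, and the chirp mass of 30 solar masses is an assumed, roughly GW150914-like value rather than a number from Sesana's paper.

```python
import math

G_MSUN_OVER_C3_S = 4.925491e-6          # G * (solar mass) / c^3 in seconds
SECONDS_PER_YEAR = 365.25 * 24 * 3600

def time_to_merger_s(frequency_hz, chirp_mass_msun):
    """Leading-order time to merger for a circular binary at a given GW frequency."""
    chirp_mass_s = chirp_mass_msun * G_MSUN_OVER_C3_S
    return (
        (5.0 / 256.0)
        * chirp_mass_s ** (-5.0 / 3.0)
        * (math.pi * frequency_hz) ** (-8.0 / 3.0)
    )

chirp_mass = 30.0  # solar masses; assumed GW150914-like value

for frequency in (0.01, 0.1, 10.0, 35.0):
    tau = time_to_merger_s(frequency, chirp_mass)
    print(f"f = {frequency:6.2f} Hz: {tau:.3g} s ({tau / SECONDS_PER_YEAR:.3g} yr)")
```

The steep $f^{-8/3}$ dependence is why a binary can spend years drifting through the space-based band and then only a fraction of a second in the ground-based one.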

Seeing the signal in two bands can help in several ways. First, it can increase our confidence in detection, potentially picking out signals that we wouldn’t otherwise. Second, it gives us a way to verify the calibration of our instruments. Third, it lets us improve our parameter-estimation precision: eLISA would see thousands of cycles, which lets it pin down the masses to high accuracy, and these results can be combined with LIGO’s measurements of the strong-field dynamics during merger to give a fantastic overall picture of the system. Finally, since eLISA can measure the signal for a considerable time, it can well localise the source, perhaps just to a square degree; since we’ll also be able to predict when the merger will happen, you can point telescopes at the right place ahead of time to look for any electromagnetic counterparts which may exist. Opening up the gravitational wave spectrum is awesome!

#### The LALInference sky map

One of my jobs as part of the Parameter Estimation group was to produce the sky maps from our parameter-estimation runs. This is a relatively simple job of just running our sky area code. I had done it many times while we were collecting our results, so I knew that the final versions were perfectly consistent with everything else we had seen. While I was comfortable with running the code and checking the results, I was rather nervous uploading the results to our database to be shared with our observational partners. I somehow managed to upload three copies by accident. D’oh! Perhaps future historians will someday look back at the records for G184098/GW150914 and wonder what was this idiot Christopher Berry doing? Probably no-one would ever notice, but I know the records are there…