A level subject choices

May 23, 2015 / October 1, 2016 / CPLB

Ofsted have recently published statistics relating to the subject choices of students starting A levels in England in 2013/2014. (For those unfamiliar with A levels, they are the qualifications taken between the ages of 16 and 18. Students usually pick 3–4 subjects for the first year, which is known as AS, and normally slim down to 3 for the second year, A2; university admissions are based upon A level results.) This is part of an effort to understand what drives students to pick different subjects, particularly science. Engaging students in science is a challenge: although many enjoy it or can achieve well in tests, they can struggle to see that it is for them. In Physics, we have a particular problem recruiting girls, which means we are not getting the best mix of people. I was interested in having a look at the subject choices, so I’ve put together a few graphs.

Subject popularity

The most popular subjects at AS level are:

English,
Mathematics,
Psychology.

English and Maths make sense, as they’ll be familiar from previous study and are of general applicability. I was surprised that Psychology came third, since it’ll be a new subject; the top ten consists of subjects familiar from pre-16 education, with the exception of the two social sciences, Psychology and Sociology (8). Physics comes in at number 7, behind both Biology (4) and Chemistry (6). This makes me sad, but at least Physics is still one of the most popular choices.

The distribution of student numbers is shown in the graph below. I’ve not quite figured out what the distribution of student numbers should be, but it’s roughly exponential. There are too many subjects to label individually, so I’ve grouped them roughly by subject area. The main sciences (Biology, Chemistry and Physics) all do rather well, but modern languages are languishing towards the bottom of the list (top is French at 21). The smallest subjects have been grouped together into Other categories; these make up the bottom of the distribution, but in amongst them are Classical studies (29), German (30), and Accounting & finance (31).

Student numbers in the most popular subjects at AS level (in England 2013/2014). Data from A level subject take-up.

Gender differences

The report also lists the numbers of boys and girls taking each subject. I know that Physics is male-dominated, but I didn’t know how this compared to other subjects. To quantify the imbalance, I’m going to define the asymmetry as

$\displaystyle \mathrm{Asymmetry} = \frac{\mathrm{No.\ of\ girls}\ -\ \mathrm{No.\ of\ boys}}{\mathrm{No.\ of\ students}}$ .

This is 0 if there are equal numbers of boys and girls, −1 if a subject is made up entirely of boys, and +1 if it is made up entirely of girls. Overall, more girls than boys are taking A levels, giving a total asymmetry of 0.0977. That’s not great, but we’ll see it’s smaller than is typically the case for individual subjects.
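As a rough sketch of how this quantity behaves, the snippet below computes the asymmetry from head counts and converts an asymmetry back into a girls-to-boys ratio. The function names and the example head counts are made up for illustration; only the 0.0977 figure comes from the data.

```python
def asymmetry(girls, boys):
    """Asymmetry = (girls - boys) / (girls + boys); ranges from -1 to +1."""
    total = girls + boys
    if total == 0:
        raise ValueError("need at least one student")
    return (girls - boys) / total


def girls_per_boy(a):
    """Invert the definition: girls / boys = (1 + a) / (1 - a)."""
    if a >= 1:
        raise ValueError("no boys at all, so the ratio is undefined")
    return (1 + a) / (1 - a)


# Hypothetical head counts, purely to exercise the function.
print(asymmetry(50, 50))  # equal numbers -> 0.0
print(asymmetry(0, 30))   # all boys -> -1.0
print(asymmetry(30, 0))   # all girls -> 1.0

# The overall asymmetry of 0.0977 corresponds to roughly 1.2 girls per boy.
print(round(girls_per_boy(0.0977), 2))
```

The inverse relation follows from rearranging the definition, and it is handy for turning an abstract asymmetry into a ratio that is easier to picture.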

The most male-dominated subjects are:

Computing (−0.8275),
Physics (−0.5446),
Further mathematics (−0.4569).

The most female-dominated subjects are:

Sociology (0.5084),
Art & design (0.4896),
French (0.4531).

We see that Physics is in pretty poor shape, being the second most asymmetric subject overall. However, Computing is really out in a league of its own: there are almost 11 boys for every girl in the subject! That is not healthy. The most balanced subjects are:

Geography (0.0056),
Chemistry (−0.0167),
Government & politics (−0.0761).

These are the only subjects with asymmetries smaller in magnitude than that of the overall population of students (0.0977). The gender balance in Chemistry shows that the Physical sciences don’t need to be male-dominated; however, this could equally reflect the compromise between male-dominated Physics and female-dominated Biology (0.2049).

The graph below plots the number of students taking a subject and its asymmetry. There’s no real trend with student numbers: it’s not the case that it’s easier for smaller subjects to become biased or that it’s easier for larger subjects to develop a reputation.

Scatter plot of the number of students and gender asymmetry of AS subjects (in England 2013/2014). Higher points are more female dominated and lower points are more male dominated. The dashed line indicates gender parity and the dotted line indicates the average for all subjects. Data from A level subject take-up.

Normally, I’d expect there to be scatter in a quantity like asymmetry: some values high, some low, but more clustering in the middle than out in the extremes. Looking at the plot above, this doesn’t seem to be the case. There are relatively few subjects in the middle, but there seem to be two clusters, one at small positive asymmetries and another at small negative asymmetries. I’ve plotted the distribution of subject asymmetries below. To make it clearer to view (and to make a nice smooth, continuous distribution), I’ve smeared out the individual subjects. This means I’m actually plotting the density of subjects per unit of asymmetry, rather than the number of subjects: if you work out the area under the curve, that gives the number of subjects in that range. (For those who care, I’ve convolved with a Gaussian kernel with a standard deviation of 0.1, and made sure to renormalise so that the total area is correct.)
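A minimal sketch of this kind of smoothing is given here; it is not necessarily the code behind the plots. Each subject is replaced by a Gaussian of standard deviation `sigma`. The renormalisation step is an assumption: here each Gaussian is rescaled so that its area within the allowed range of −1 to +1 is exactly one. Only the ten asymmetries quoted in this post are included, so the curve will not match the full data set, and all function names are illustrative.

```python
import math

# Asymmetries quoted in this post (a subset of all the subjects).
ASYMMETRIES = {
    "Computing": -0.8275,
    "Physics": -0.5446,
    "Further mathematics": -0.4569,
    "Government & politics": -0.0761,
    "Chemistry": -0.0167,
    "Geography": 0.0056,
    "Biology": 0.2049,
    "French": 0.4531,
    "Art & design": 0.4896,
    "Sociology": 0.5084,
}


def gaussian_cdf(x, centre, sigma):
    """Cumulative probability of a Gaussian up to x."""
    return 0.5 * (1.0 + math.erf((x - centre) / (sigma * math.sqrt(2.0))))


def truncated_gaussian(x, centre, sigma, low=-1.0, high=1.0):
    """Gaussian kernel rescaled to have unit area on [low, high] (assumed renormalisation)."""
    if not low <= x <= high:
        return 0.0
    pdf = math.exp(-0.5 * ((x - centre) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))
    inside = gaussian_cdf(high, centre, sigma) - gaussian_cdf(low, centre, sigma)
    return pdf / inside


def subject_density(x, centres, sigma):
    """Density of subjects per unit asymmetry at x."""
    return sum(truncated_gaussian(x, c, sigma) for c in centres)


def area(centres, sigma, low=-1.0, high=1.0, steps=4000):
    """Trapezium-rule estimate of the area under the smoothed curve."""
    h = (high - low) / steps
    total = 0.5 * (subject_density(low, centres, sigma) + subject_density(high, centres, sigma))
    for i in range(1, steps):
        total += subject_density(low + i * h, centres, sigma)
    return total * h


centres = list(ASYMMETRIES.values())
for sigma in (0.1, 0.3):
    # The area should come out close to the number of subjects included (10).
    print(sigma, round(area(centres, sigma), 3))
```

Rescaling each kernel separately keeps every subject worth exactly one unit of area, even for those near the edge of the range, such as Computing.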

Smoothed distribution of gender asymmetry for AS subjects (in England 2013/2014). Left is male dominated and right is female dominated. The area under the curve gives the number of subjects. The diamonds mark the locations of individual subjects. Data from A level subject take-up.

It does appear that there are two peaks: one for boys’ subjects and another for girls’. Computing is off on its own as a clear outlier. However, if I turn up the smoothing (using a standard deviation of 0.3), this disappears. This always happens if you smooth too much…

Heavily smoothed distribution of gender asymmetry for AS subjects (in England 2013/2014). Left is male dominated and right is female dominated. The area under the curve gives the number of subjects. The diamonds mark the locations of individual subjects. Data from A level subject take-up.

It looks like this is one of the cases where I should really do things properly and I should come back to look at this again later.

Regardless of whether my suspicion of there being two clusters of subjects is correct, there does appear to be a spectrum of subjects, with some being perceived as for boys and others for girls. This differentiation already exists at age 16, even for subjects like Psychology and Sociology that have not been studied previously. It seems that these stereotypes are ingrained from an earlier age.

Ada, Countess of Lovelace, mathematician and first computer programmer (and superheroine), and Sigmund Freud, neurologist and founder of psychoanalysis. Evidence that there really shouldn’t be divides in Computing, Psychology or any other subject.

Continuation

As well as looking at how many students take AS, we can look at how many continue to A2. The report gives the percentage that continue for both boys and girls. The distribution of all continuation percentages is shown below, again with subjects grouped by area. The average progression across all subjects is 72.7%.

Percentage continuation from AS to A2 for different subjects (in England 2013/2014). The dotted line indicates the average. Data from A level subject take-up.

The top subjects for continuation to A2 are:

Other modern languages (90.4%),
Drama (82.7%),
Media/film/TV studies (81.4%).

Other modern languages is the smallest subject in terms of student numbers, but has the highest continuation: I guess those who opt for it are dedicated to seeing it through. However, there doesn’t seem to be a correlation between student numbers and continuation. English, the most popular subject, comes in just below Media/film/TV studies with 81.2%. The bottom subjects for continuation are:

Other social sciences (45.9%),
Accounting & finance (59.7%),
Computing (61.4%).

I don’t know enough about these subjects to know if there might be a particular reason why just taking them for one year might be useful. In contrast to Other modern languages, German (62.7%), French (64.1%) and Spanish (65.8%) have some of the lowest continuation rates (coming in just above Computing). Physics also does poorly, with only 67.8% continuing, below both Chemistry (71.0%) and Biology (72.2%). For comparison, Further mathematics has 68.3% continuation and Mathematics has 75.4%. I would expect continuation to be lower for subjects that students find more difficult (possibly with the biggest jump from GCSE).

Now, let’s have a look at the difference in progression between the genders. In the figure below, I plot the difference in the percentage progression between boys and girls,

$\mathrm{Difference} = \mathrm{Percent\ girls\ continuing}\ -\ \mathrm{Percent\ boys\ continuing}$ ,

versus the asymmetry. The two quantities show a clear correlation: more girls than boys progress in subjects that are female dominated and vice versa. Gender asymmetry gets worse with progression.

Scatter plot of the gender asymmetry and difference in percentage progression of AS subjects (in England 2013/2014). Left is male dominated and right is female dominated. Higher points have a higher proportion of girls than boys continuing and lower points have a higher proportion of boys than girls continuing. Data from A level subject take-up.

The subjects with the largest differences in continuation are:

Physics (−14%),
Other science (−12%),
Psychology (11%).

That’s a really poor show for Physics. This polarising trend is not surprising. People like to be where they feel they belong. If you’re conspicuously outnumbered, you’re more likely to feel uncomfortable. Data show that girls are more likely to continue with Physics in all-girls schools. Also, as we’ve seen, there seems to be a clustering of boys’ subjects and girls’ subjects, and developing these reputations can make it difficult for people to go against stereotypes. This impacts both how people view themselves and others, potentially impacting perceived competence (e.g., for Physics, Gonsalves 2014a, 2014b). These cultural biases are something we need to work against if we’re going to get the best mix of students (I guess it’s good we have all these Psychologists and Sociologists to help figure this out).

I’d recommend trying the excellent (and adorable) Parable of the Polygons to see how biases can become magnified.

Summary

At A level, some subjects are favoured by boys or by girls. This imbalance gets larger during the transition from AS to A2. Physics is one of the most popular subjects at AS level, but lags behind the other main sciences. It has a poor gender ratio, which notably gets worse going from AS to A2. Physics is (arguably) the most awesome subject, so we should do more to show that it is for everyone. If you’d like to play around with the data (and don’t fancy typing it out yourself), I have it available via Google Drive.

(For disclosure: I took Geography at AS, and Physics, Maths and Further maths at A2).

An introduction to probability: Great expectations

August 16, 2014 / October 25, 2014 / CPLB

We use probabilities a lot in science. Previously, I introduced the concept of probabilities; here I’ll explain the concept of expectation and averages. Expectation and average values are among the most useful statistics that we can construct from a probability distribution. This post contains traces of calculus, but is peanut free.

Expectations

Imagine that we have a discrete set of numeric outcomes, such as the number from rolling a die. We’ll label these as $x_1$ , $x_2$ , etc., or as $x_i$ where the subscript $i$ is used as shorthand to indicate any of the possible outcomes. The probability of the numeric value being a particular $x_i$ is given by $P(x_i)$ . For rolling our die, the outcomes are one to six ( $x_1 =1$ , $x_2 = 2$ , etc.) and the probabilities are

$\displaystyle P(1) = P(2) = P(3) = P(4) = P(5) = P(6) = \frac{1}{6}$ .

The expectation value is the sum of all the possible outcomes multiplied by their respective probabilities,

$\displaystyle \langle x \rangle = \sum_i x_i P(x_i)$ ,

where $\sum_i$ means sum over all values of $i$ (over all outcomes). The expectation value for rolling a die is

$\displaystyle \langle x \rangle = 1 \times P(1) + 2 \times P(2) + 3 \times P(3) + 4 \times P(4) + 5 \times P(5) + 6 \times P(6) = \frac{7}{2}$ .

The expectation value of a distribution is its average, the value you’d expect after many (infinite) repetitions. (Of course, this is not possible in reality, and you’d certainly get RSI, but it is useful for keeping statisticians out of mischief.)
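The die calculation can be checked with a short sketch. The `expectation` function and the dictionary layout are illustrative choices; exact fractions are used so that the answer comes out as 7/2 without rounding.

```python
from fractions import Fraction


def expectation(distribution):
    """Sum of outcome * probability over a {outcome: probability} mapping."""
    if sum(distribution.values()) != 1:
        raise ValueError("probabilities must sum to 1")
    return sum(outcome * probability for outcome, probability in distribution.items())


# A fair six-sided die: each face has probability 1/6.
die = {face: Fraction(1, 6) for face in range(1, 7)}

print(expectation(die))  # 7/2
```

Note that 7/2 is not itself a possible outcome of a roll: the expectation value describes the long-run average, not a result you could ever see on a single throw.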

For a continuous distribution, the expectation value is given by

$\displaystyle \langle x \rangle = \int x p(x) \, \mathrm{d} x$ ,

where $p(x)$ is the probability density function.
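
As a quick worked example (the choice of distribution is purely illustrative), take a uniform distribution between 0 and 1, so that $p(x) = 1$ in that range and zero elsewhere. Then

$\displaystyle \langle x \rangle = \int_0^1 x \times 1 \, \mathrm{d} x = \left[ \frac{x^2}{2} \right]_0^1 = \frac{1}{2}$ ,

which is the midpoint of the range, as you’d expect.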

You can use the expectation value to guide predictions for the outcome. You can never predict with complete accuracy (unless there is only one possible outcome), but you can use knowledge of the probabilities of the various outcomes to inform your predictions.

Imagine that after you buy a large quantity of fudge, for entirely innocent reasons, the owner offers you the chance to play double-or-nothing (you’ll either pay double the bill or nothing, based on some random event). Should you play? Obviously, this depends upon the probability of winning. Let’s say that the probability of winning is $p$ and find out how high it needs to be to be worthwhile. We can use the expectation value to calculate how much we should expect to pay: if this is less than the bill as it stands, it’s worth giving it a go; if the expectation value is larger than the original bill, we should expect to pay more (and so probably shouldn’t play). The expectation value is

$\displaystyle \langle x \rangle = 0 \times p + 2 \times (1 - p) = 2 (1 - p)$ ,

where I’m working in terms of unified fudge currency, which, shockingly, is accepted in very few shops, but has the nice property that your fudge bill is always 1. Anyhow, if $\langle x \rangle$ is less than one, so if $p > 0.5$ , it’s worth playing. If we were tossing a (fair) coin, we’d expect to come out even; if we had to roll a six, we’d expect to pay more.
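To make the fudge gamble concrete, here is a small sketch. It takes $p$ to be the probability of winning, so you pay nothing with probability $p$ and double with probability $1 - p$. The function names are illustrative.

```python
from fractions import Fraction


def expected_bill(p_win):
    """Expected payment for double-or-nothing, in units of the original bill.

    You pay 0 with probability p_win and 2 with probability 1 - p_win.
    """
    if not 0 <= p_win <= 1:
        raise ValueError("a probability must lie between 0 and 1")
    return 0 * p_win + 2 * (1 - p_win)


def worth_playing(p_win):
    """Playing is worthwhile if the expected payment is below the bill of 1."""
    return expected_bill(p_win) < 1


print(expected_bill(Fraction(1, 2)))  # 1   (fair coin: break even)
print(expected_bill(Fraction(1, 6)))  # 5/3 (must roll a six: expect to pay more)
print(worth_playing(Fraction(2, 3)))  # True
```

A coin toss leaves the expected bill unchanged, while needing a six pushes it up to 5/3 of the original.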

The expectation value is the equivalent of the mean. This is the average that people usually think of first. If you have a set of numeric results, you calculate the mean by adding up all of your results and dividing by the total number of results $N$ . Imagine each outcome $x_i$ occurs $n_i$ times, then the mean is

$\displaystyle \bar{x} = \sum_i x_i \frac{n_i}{N}$ .

We can estimate the probability of each outcome as $P(x_i) = n_i/N$ so that $\bar{x} = \langle x \rangle$ .
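The equivalence can be seen in a short sketch, using a made-up set of ten die rolls (the numbers are hypothetical, chosen only for illustration). The mean is computed once in the usual way and once as a probability-weighted sum with $P(x_i) = n_i/N$.

```python
from collections import Counter
from fractions import Fraction

# Hypothetical results from ten die rolls (illustrative, not real data).
rolls = [3, 1, 6, 3, 2, 5, 3, 6, 1, 4]
N = len(rolls)

# The usual mean: add everything up and divide by the number of results.
plain_mean = Fraction(sum(rolls), N)

# The same thing via estimated probabilities P(x_i) = n_i / N.
counts = Counter(rolls)
estimated_p = {x: Fraction(n, N) for x, n in counts.items()}
weighted_mean = sum(x * p for x, p in estimated_p.items())

print(plain_mean, weighted_mean)  # 17/5 17/5
```

The two agree exactly, because grouping identical results before summing only changes the order of the addition.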

Median and mode

Aside from the mean there are two other commonly used averages, the median and the mode. These aren’t quite as commonly used, despite sounding friendlier. With a set of numeric results, the median is the middle result and the mode is the most common result. We can define equivalents for both when dealing with probability distributions.

To calculate the median we find the value where the total probability of being smaller (or bigger) than it is a half: $P(x \leq x_\mathrm{med}) = 0.5$ . This can be done by adding up probabilities until you get a half:

$\displaystyle \sum_{x_i \, \leq \, x_\mathrm{med} } P(x_i) = 0.5$ .

For a continuous distribution this becomes

$\displaystyle \int_{x_\mathrm{low}}^{x_\mathrm{med}} p(x) \, \mathrm{d}x = 0.5$ ,

where $x_\mathrm{low}$ is the lower limit of the distribution. (That’s all the calculus out of the way now, so if you’re not a fan you can relax). The LD₅₀ lethal dose is a median. The median is effectively the middle of the distribution, the point at which you’re equally likely to be higher or lower.

The median is often used as it is not as sensitive as the mean to a few outlying results which are far from the typical values.
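Here is a sketch of both points, with illustrative names and made-up numbers. The first function adds up probabilities until the running total reaches a half. The second part compares the mean and the median for a hypothetical set of results containing one outlier.

```python
import statistics
from fractions import Fraction


def median_of_distribution(distribution):
    """Smallest outcome whose cumulative probability reaches a half."""
    cumulative = 0
    for outcome in sorted(distribution):
        cumulative += distribution[outcome]
        if cumulative >= Fraction(1, 2):
            return outcome
    raise ValueError("probabilities sum to less than a half")


# A fair die: the running total hits exactly 1/2 at a roll of 3.
die = {face: Fraction(1, 6) for face in range(1, 7)}
print(median_of_distribution(die))  # 3

# Hypothetical results with one outlier (illustrative numbers).
results = [1, 2, 3, 4, 100]
print(statistics.mean(results))    # 22: dragged up by the outlier
print(statistics.median(results))  # 3: unaffected by how large the outlier is
```

For the die, the running total is exactly a half at 3, so any value from 3 up to 4 splits the probability evenly; a common convention is to quote the midpoint, 3.5.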

The mode is the value with the largest probability, the most probable outcome

$\displaystyle P(x_\mathrm{mode}) = \max P(x_i)$ .

For a continuous distribution, it is the point which maximises the probability density function

$\displaystyle p(x_\mathrm{mode}) = \max p(x)$ .

The modal value is the most probable outcome, the most likely result, the one to bet on if you only have one chance.
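Finding the mode of a discrete distribution is a one-liner. The loaded die below is a hypothetical example, and the function name is illustrative.

```python
from fractions import Fraction


def mode_of_distribution(distribution):
    """Outcome with the largest probability (the first one found if there is a tie)."""
    return max(distribution, key=distribution.get)


# A hypothetical loaded die that lands on six half of the time.
loaded_die = {
    1: Fraction(1, 10),
    2: Fraction(1, 10),
    3: Fraction(1, 10),
    4: Fraction(1, 10),
    5: Fraction(1, 10),
    6: Fraction(1, 2),
}

print(mode_of_distribution(loaded_die))  # 6
```

For a fair die every outcome is equally probable, so there is no unique mode and this function would simply return the first face it meets.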

Education matters

Every so often, some official, usually an education minister, says something about wanting more than half of students to be above average. This results in much mocking, although seemingly little rise in the standards for education ministers. Having discussed averages ourselves, we can now see if it’s entirely fair to pick on these poor MPs.

The first line of defence is that we should probably specify the distribution we’re averaging. It may well be that they actually meant the average bear. It’s a sad truth that bears perform badly in formal education. Many blame the unfortunate stereotyping resulting from Winnie the Pooh. It might make sense to compare with performance in the past to see if standards are improving. We could imagine that taking the average from the 1400s would indeed show some improvement. For argument’s sake, let’s say that we are indeed talking about the average over this year’s students.

If the average we were talking about was the median, then it would be impossible for more (or fewer) than half of students to do better than average. In that case, it is entirely fair to mock the minister, and possibly to introduce them to the average bear. In this case, a mean bear.

If we were referring to the mode, then it is quite simple for more than half of the students to do better than this. To achieve this we just need a bottom-heavy distribution, a set of results where the most likely outcome is low, but most students do better than this. We might want to question an education system where the aim is to have a large number of students doing poorly though!

Finally, there is the mean; to use the mean, we first have to decide if we are averaging a sensible quantity. For education performance this normally means exam results. Let’s sidestep the issue of whether we want to reduce the output of the education system down to a single number, and consider the properties we want in order to take a sensible average. We want the results to be numeric (check); to be ordered, such that high is good and low is bad (or vice versa) so 42 is better than 41 but not as good as 43 and so on (check); and to be on a linear scale. The last criterion means that performance is directly proportional to the mark: a mark twice as big is twice as good. Most exams I’ve encountered are not like this, but I can imagine that it is possible to define a mark scheme this way. Let’s keep imagining, and assume things are sensible (and perhaps think about kittens and rainbows too… ).

We can construct a distribution where over half of students perform better than the mean. In this case we’d really need a long tail: a few students doing really very poorly. In this case, these few outliers are enough to skew the mean and make everyone else look better by comparison. This might be better than the modal case where we had a glut of students doing badly, as now we can have lots doing nicely. However, it also means that there are a few students who are totally failed by the system (perhaps growing up to become a minister for education), which is sad.
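The three cases can be illustrated with a couple of made-up classes of ten students. The marks are hypothetical and chosen only to show the effect: one class is bottom-heavy, the other has a single student in a long tail.

```python
import statistics


def fraction_above(marks, threshold):
    """Fraction of marks strictly greater than the threshold."""
    return sum(mark > threshold for mark in marks) / len(marks)


# Hypothetical marks: a bottom-heavy class where the most common mark is the lowest.
bottom_heavy = [40, 40, 40, 55, 60, 65, 70, 75, 80, 85]

# Hypothetical marks: one student doing very poorly, everyone else bunched together.
long_tail = [5, 70, 72, 75, 78, 80, 82, 85, 88, 90]

print(fraction_above(bottom_heavy, statistics.mode(bottom_heavy)))  # 0.7 above the mode
print(fraction_above(long_tail, statistics.mean(long_tail)))        # 0.7 above the mean
print(fraction_above(long_tail, statistics.median(long_tail)))      # 0.5 above the median
```

In the long-tailed class, the single mark of 5 pulls the mean down to 72.5, leaving seven of the ten students above it, while exactly half remain above the median.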

In summary, it is possible to have more than 50% of students performing above average, assuming that we are not using the median. It’s therefore unfair to heckle officials with claims of innumeracy. However, for these targets to be met requires lots of students to do badly. This seems like a poor goal. It’s probably better to try to aim for a more sensible distribution with about half of students performing above average, just like you’d expect.