Revised REF rankings tables

In my analysis of the Impact of Impact I showed that, because of the different scatters in the three sub-profiles in REF 2014 (Outputs, Impact and Environment), the effective weights in the Grade Point Averages were not the nominal weights (0.65, 0.2 and 0.15) you might naively expect.  This was especially true in Physics, where Impact had the largest effective weight (equal with Sociology).  It is interesting to ask how the rankings would have looked if each sub-profile had had the same scatter.  Statisticians would do this by calculating standardised variables (i.e. normalising by the scatter before combining).  So, I've recalculated the rankings using standardised variables weighted by the nominal weights.
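As a concrete sketch of this recalculation (the department names and GPAs below are made up for illustration, not REF data), one could standardise each sub-profile GPA before applying the nominal weights:

```python
# Standardise each sub-profile GPA (subtract the mean, divide by the
# standard deviation across departments), then combine with the
# nominal REF weights (0.65, 0.2, 0.15). Illustrative data only.
from statistics import mean, stdev

weights = {"outputs": 0.65, "impact": 0.20, "environment": 0.15}

gpas = {
    "Dept A": {"outputs": 3.1, "impact": 3.6, "environment": 3.0},
    "Dept B": {"outputs": 3.2, "impact": 2.8, "environment": 3.1},
    "Dept C": {"outputs": 3.0, "impact": 3.2, "environment": 3.3},
}

def standardised_score(dept):
    """Combine standardised sub-profile GPAs with the nominal weights."""
    total = 0.0
    for sub, w in weights.items():
        values = [g[sub] for g in gpas.values()]
        mu, sigma = mean(values), stdev(values)
        total += w * (gpas[dept][sub] - mu) / sigma
    return total

ranking = sorted(gpas, key=standardised_score, reverse=True)
```

With equal scatter in every sub-profile this reproduces the nominal-weight ranking; with unequal scatter the two rankings differ, which is exactly the effect tabulated below.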

The ranking table for Physics is below.

So you can see from this that, for example, Sussex is 27th in the standardised table but was 33rd in the original table, and so "lost" 6 positions to the variation in scatter between the different measures (primarily because Impact had a larger scatter).  The big losers in this sense were Portsmouth (−10 positions), Birmingham (−8), King's College London (−7), and QMUL and Sussex (−6 each).

Winners were Glasgow (+10), Surrey (+9), Leeds (+8) and Manchester (+7).

If you want to look at the rankings for any other UoA, they are all included in this spreadsheet: ImpactImpactRankings

One final comment.  It is clear to me from my analyses and discussions with others that the outcomes in Physics in particular were, in absolute terms, generally very good across the board.  The relative differences between departments are actually quite small.  However, it is these small relative differences that drive the league tables.  The league tables are therefore not a very useful measure and should be treated with extreme caution.  As we plan at Sussex for our next REF, I am going to focus on what we can do to continue to demonstrate our excellent research, and to improve on it, while trying to ignore the randomness of the league tables.

UoA #  Original GPA  Original rank  Rank with standardised GPAs  Change in rank (standardised − original)  Institution
9 3.33 2 1 -1 University of Oxford
9 3.33 4 2 -2 Universities of Edinburgh and St Andrews
9 3.35 1 3 2 University of Strathclyde
9 3.3 5 4 -1 Cardiff University
9 3.33 3 5 2 University of Nottingham
9 3.24 10 6 -4 University of Southampton
9 3.22 11 7 -4 University of Warwick
9 3.27 8 8 0 University of Durham
9 3.28 7 9 2 University of Cambridge
9 3.27 9 10 1 Imperial College London
9 3.16 13 11 -2 Heriot-Watt University
9 3.13 15 12 -3 University of Sheffield
9 3.28 6 13 7 University of Manchester
9 3.05 24 14 -10 University of Portsmouth
9 3.1 19 15 -4 Queen’s University Belfast
9 3.07 22 16 -6 Queen Mary University of London
9 3.05 25 17 -8 University of Birmingham
9 3.16 14 18 4 University of Bath
9 3.12 17 19 2 University of Exeter
9 3.16 12 20 8 University of Leeds
9 3.12 16 21 5 University College London
9 3.02 29 22 -7 King’s College London
9 3.06 23 23 0 University of Liverpool
9 3.04 26 24 -2 Lancaster University
9 3.02 27 25 -2 Liverpool John Moores University
9 3.07 21 26 5 University of Bristol
9 2.93 33 27 -6 University of Sussex
9 3.1 18 28 10 University of Glasgow
9 3.1 20 29 9 University of Surrey
9 3 31 30 -1 Royal Holloway, University of London
9 3.02 28 31 3 University of Kent
9 2.93 32 32 0 University of York
9 2.89 34 33 -1 Swansea University
9 2.67 36 34 -2 Keele University
9 3 30 35 5 University of Leicester
9 2.7 35 36 1 University of Hertfordshire
9 2.6 37 37 0 University of Central Lancashire
9 2.53 38 38 0 Loughborough University
9 2.43 39 39 0 University of Huddersfield
9 2.42 40 40 0 Aberystwyth University

Biting the bullet

I’ve finally bowed to peer pressure and dusted down my WordPress blog, set up about four years ago but never used. Thanks to Charlotte and Jillian for their encouragement.


Astronomy and Dementia Diagnosis

Liz Ford and I went to London yesterday to pitch our plans to use astronomical data-analysis techniques to help GPs diagnose dementia.  Fortunately, this coincided with an item on the Today programme in which Professor June Andrews discussed her book “Dementia: The One-Stop Guide: Practical advice for families, professionals, and people living with dementia and Alzheimer’s Disease”. Unfortunately, I don’t think the panel had heard it.


The Impact of Impact

I wrote the following article to explore how Impact in the Research Excellence Framework 2014 (REF 2014) affected the average scores of departments (and hence rankings). This produced a “league table” of how strongly Impact affected different subjects. Some of the information in this article was used in a THE article by Paul Jump, due out at 00:00 on 19 February 2015.  I’ve now also produced ranking tables for each UoA using the standardised weighting I advocate below (see Standardised Rankings).

UoA #  Unit of Assessment  Effective weight on GPA ranking as % (Outputs / Impact / Envir.)
9 Physics 37.9 38.6 23.5
23 Sociology 34.1 38.6 27.3
10 Mathematical Sciences 37.6 37.5 24.9
24 Anthropology and Development Studies 40.2 35.0 24.8
6 Agriculture, Veterinary and Food Science 42.0 33.0 25.0
31 Classics 43.3 32.6 24.0
16 Architecture, Built Environment and Planning 48.6 31.1 20.3
22 Social Work and Social Policy 44.3 31.1 24.7
27 Area Studies 45.8 30.5 23.6
14 Civil and Construction Engineering 49.0 30.2 20.8
32 Philosophy 47.2 30.2 22.7
26 Sport and Exercise Sciences, Leisure and Tourism 50.2 29.7 20.1
36 Communication, Cultural and Media Studies, … 48.4 29.3 22.3
15 General Engineering 47.6 29.1 23.3
25 Education 45.1 29.0 25.9
20 Law 49.3 28.8 21.9
13 Electrical and Electronic Engineering, Metallurgy … 45.9 28.7 25.4
29 English Language and Literature 42.9 28.6 28.5
30 History 47.6 28.5 23.9
1 Clinical Medicine 41.1 28.3 30.6
28 Modern Languages and Linguistics 49.7 28.0 22.3
3 Allied Health Professions, Dentistry, Nursing and … 46.7 27.9 25.4
11 Computer Science and Informatics 51.5 27.9 20.6
17 Geography, Environmental Studies and … 46.9 27.4 25.8
18 Economics and Econometrics 54.3 27.3 18.3
21 Politics and International Studies 48.4 27.0 24.6
34 Art and Design: History, Practice and Theory 50.3 26.9 22.8
5 Biological Sciences 50.7 26.7 22.6
4 Psychology, Psychiatry and Neuroscience 51.1 26.6 22.3
12 Aeronautical, Mechanical, Chemical and … 45.9 26.6 27.5
33 Theology and Religious Studies 48.7 26.6 24.7
7 Earth Systems and Environmental Sciences 51.0 25.6 23.4
19 Business and Management Studies 52.5 24.4 23.0
8 Chemistry 45.8 23.9 30.3
35 Music, Drama, Dance and Performing Arts 56.4 23.5 20.1
2 Public Health, Health Services and Primary Care 56.9 19.6 23.4
Average 47.1 29.0 23.9

Table 1: Effective weights on the rankings of the Grade Point Averages, expressed as a percentage, in each of the three measures (Outputs, Impact and Environment). The effective weight conveys how much the relative position in each sub-profile contributes to the relative position in the Overall GPA; it is the product of the nominal weight (65%, 20% or 15%) and the standard deviation of that sub-profile, normalised so the three weights sum to 100%. The table is ranked by effective weight in the Impact sub-profile.

The Research Excellence Framework 2014 (REF 2014) is the latest assessment of the quality of research in UK universities. These assessments occur roughly every six years and have important direct funding and indirect consequences for individual departments and universities as a whole. A new feature of the 2014 exercise was the introduction of a measure of the socioeconomic Impact that could be attributed to research. Its novelty ensures that it will be the subject of much scrutiny. Many people have explored, and will continue to explore, in detail what the measurements tell us about the extent to which UK research has an impact on wider society; here, however, I tackle the simpler task of looking at the impact of the Impact measurement on the overall assessment of research quality, and hence on league tables.

As Director of Research and Knowledge Exchange for the School of Mathematical and Physical Sciences at the University of Sussex I oversaw the REF 2014 submissions for the Departments of Mathematics, and Physics and Astronomy. On seeing our result for Physics and Astronomy I was initially surprised by how much our poorer ranking in the Impact measurement had affected our overall position. My immediate intuition was that the Impact scores were spread over a larger range and that this caused them to have a greater influence on the overall score than I would have naïvely expected.

For any subject area (Unit of Assessment, UoA) and any individual department, the REF 2014 results are aggregated into a Grade Point Average (GPA) for each of three sub-profiles: Outputs, Impact and Environment. These are then combined with relative weightings of 65%, 20% and 15% to give an Overall GPA. Naïvely one might then expect Impact to affect the Overall rankings quite a lot less than Outputs and a little more than Environment. However, that is not necessarily the case.
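The nominal combination is simply a weighted average. A minimal sketch, with an illustrative set of sub-profile GPAs (not real REF scores):

```python
# Overall GPA from the three sub-profile GPAs, using the nominal
# REF 2014 weights (65% Outputs, 20% Impact, 15% Environment).
def overall_gpa(outputs, impact, environment):
    return 0.65 * outputs + 0.20 * impact + 0.15 * environment

# e.g. a department scoring 3.2, 3.0 and 3.4 on the three sub-profiles:
overall_gpa(3.2, 3.0, 3.4)  # 0.65*3.2 + 0.20*3.0 + 0.15*3.4 = 3.19
```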

To consider the contribution of each component to an aggregate score we need to consider both the weights and the intrinsic spread in each measurement. For example, in many grant or fellowship panels the proposals are graded by a number of panel members and the scores averaged together. It is well known that a panel member who uses the full range of scores (say 1-5) will have a bigger influence on the final rankings than one whose scores span a more conservative range (say 2-4). Suppose one panel member gives candidate A a score of 1 and candidate B a score of 5, while two other panel members each give A a 4 and B a 3; the equally weighted averages are then 3.00 and 3.67. So, even though the three panel members have the same nominal weight, the final ranking of the candidates is that of the minority panel member, whose wider spread of scores gave them a larger effective weight.
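The arithmetic of this panel example in a short snippet:

```python
# Three panellists with equal nominal weight; the first uses the
# full 1-5 range, the other two a conservative 3-4 range.
from statistics import mean

scores_A = [1, 4, 4]  # the wide-range panellist gives A a 1
scores_B = [5, 3, 3]  # ...and B a 5

mean(scores_A)  # 3.00
mean(scores_B)  # 3.67 (to 2 d.p.)
```

Both majority panellists ranked A above B, yet B wins on the average: the wide-range panellist's scores carry a larger effective weight.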

We can estimate these effective weights for each sub-profile in REF 2014. The ranking of a department in one sub-profile (e.g. the GPA for Environment, scored as g_e) is determined by comparison with its peers. We can characterise the performance of the peers by their mean GPA, µ_e, and the variety by the standard deviation, σ_e. The rank of the department is then related to its difference from the mean compared with the deviation of other departments from the mean, i.e. the rank is related to Δ_e = (g_e − µ_e)/σ_e [1]. When the three sub-profiles are combined with nominal weights w_o, w_i, w_e, i.e. g = w_o g_o + w_i g_i + w_e g_e, we can see that Δ, which governs the Overall rank, is proportional to [2] w_o σ_o Δ_o + w_i σ_i Δ_i + w_e σ_e Δ_e. The effective weights (i.e. the impact of each sub-profile on the overall ranking) are thus proportional to the nominal weights scaled by the standard deviation of the measure. If all the standard deviations were the same then the effective weights would equal the nominal weights. However, if the standard deviation in one sub-profile is higher than in another then it will acquire a higher effective weight. These effective weights for each UoA are shown in Table 1.
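The effective-weight calculation behind Table 1 can be sketched as follows (the standard deviations here are illustrative, not the REF values, though they are chosen to give roughly Physics-like weights):

```python
# Effective weight = nominal weight x standard deviation of the
# sub-profile GPAs, renormalised so the three weights sum to 100%.
weights = {"outputs": 0.65, "impact": 0.20, "environment": 0.15}
sigmas  = {"outputs": 0.20, "impact": 0.70, "environment": 0.55}  # illustrative

raw = {sub: w * sigmas[sub] for sub, w in weights.items()}
total = sum(raw.values())
effective = {sub: 100 * v / total for sub, v in raw.items()}
# Impact's small nominal weight is amplified by its large scatter.
```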

It is immediately clear from this table that the effective weights for Outputs are always less than their nominal weight. For two Units of Assessment (UoA 23, Sociology, and UoA 9, Physics) Impact has a higher effective weight than Outputs. The average effective weights are 47%, 29% and 24%, in contrast to the nominal weights of 65%, 20% and 15%. There is also a wide range in the effective weight of Impact, varying by nearly a factor of two, from 19.6% to 38.6%.

So, the variations in scores in Impact (and Environment) are generally much larger than the variations in the scores in Outputs. Why should this be? There are probably a number of factors:

  • Although the same numerical scale is used in each of the sub-profiles, they are measuring different things (e.g. 4* means “Quality that is world-leading in terms of originality, significance and rigour” in the Outputs sub-profile, but means “Outstanding impacts in terms of their reach and significance” in the Impact sub-profile). So a weighted Overall score is somewhat meaningless.
  • There is less variety in the Outputs sub-profile because departments typically select their best research for consideration.
  • There is more variety in the Impact sub-profile because this is the first time this profile has been used, and UoAs didn’t know how best to select or present their best Impact.
  • There is less variation in the Outputs sub-profile because it is based on many more measurements (4 outputs per staff member) than e.g. Impact (roughly 1 impact case study per 10 staff), so it has a smaller “error”.
  • There is more intrinsic variety in, for example, the environments of departments than there is in the research outputs of individual researchers.

Before the guidelines for REF 2014 were finalised there was some discussion about whether the nominal weight for Impact should be 25% or 20%, and there are indications that the weight will increase in the next exercise. This analysis shows that the effective weight is already well above 25% in most cases, but with a wide variety of effective weights across different Units of Assessment. Given this variety, and the different definitions of the criteria used in each sub-profile, policy makers should consider carefully how the Overall profiles are constructed in future. They might want to combine standardised statistics rather than raw statistics.

The most obvious conclusion is that care should be taken in interpreting the published Overall scores and rankings. The different sub-profiles do not have the influence you might expect from the nominal weights and, in particular, Impact has a higher impact.

[1] In statistics Δ is known as a standardised variable.

[2] The constant of proportionality is the standard deviation, σ, of the Overall GPA. This depends both on the standard deviations of the sub-profiles and their correlations. However, the rank order in GPA is the same in either Δ or Δσ so we don’t need to look at the correlations to understand the weights in the ranking.
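A quick check of the point in footnote [2]: scaling every Δ by the same positive constant leaves the rank order unchanged, so the constant of proportionality can be ignored when studying rankings.

```python
# Rank order is invariant under multiplication by a positive constant,
# so sigma in footnote [2] does not affect the rankings.
deltas = [0.7, -1.2, 0.1, 2.3, -0.4]
sigma = 0.31  # any positive constant

def rank_order(xs):
    """Indices of xs sorted from highest to lowest value."""
    return sorted(range(len(xs)), key=lambda i: xs[i], reverse=True)

rank_order(deltas) == rank_order([sigma * d for d in deltas])  # True
```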

Seb Oliver, University of Sussex, 20 January 2015

