Grade expectations: summer 2020
By Mick Walker, CIEA Vice Chair
The publication of GCE A level and GCSE results taken by 16- and 18-year-olds in much of the UK is a nervously anticipated annual event for hundreds of thousands of young people, their parents or guardians, their teachers – and yes, politicians, awarding bodies and regulators. Self-evidently, the results of examinations matter most for the future opportunities open to students; this has always been the case, a fact not lost on teachers, who constantly look to the best interests of their pupils. But since the introduction of national school performance tables in 1992, the results have come under ever-increasing public scrutiny as measures of the performance of teachers and the ranking of schools, even though general and vocational qualifications (and National Curriculum tests, for that matter) were never designed for these purposes. In subsequent years, National Curriculum statutory tests and other qualification outcomes were added to the performance measures, but 1992 is seminal: that is when the purpose of examination and test results stopped being centred on the candidates and became ‘personal’, particularly for teachers and schools.

Over the years, results days in August have attracted ever-increasing attention from the mass media in ways uncommon across the rest of the world. In this country, if results go up, the exams are not tough enough and we have ‘grade inflation’; if results go down, educational standards have fallen and schools have failed; and if results remain broadly the same, no one has improved! The outcomes have become artefacts of politics and news headlines rather than a cause for celebration and reflection. We seem to have a national aversion to offering praise - to anyone - including politicians, regulators, examination boards, examiners and educational institutions.
We’ve also lost trust in just about everyone. We don’t trust teachers’ assessments because their judgments are compromised in a context of high stakes examinations, hence the now dominant reliance on externally set and marked examinations; and teachers don’t trust the marking conducted by examination boards, even though markers are predominantly teachers. Elsewhere, schools have been labelled ‘exam factories’ as they tailor their curriculums to ‘game the system’ rather than offer a ‘broad and balanced’ curriculum. Meanwhile, politicians are singled out for attempting to ‘raise standards’ in a way that suggests everyone else is not. And all the while, we too frequently overlook, or simply don’t understand, that despite our very best efforts, any form of assessment lacks perfection – be it in the classroom or through a public examination.
But that doesn’t stop our collective ambition to design and administer examinations that are as valid, fair and equitable as humanly possible. At any time this is demanding stuff, but 2020 has presented even greater challenges for reasons everyone now understands. The government’s swift decision to cancel examinations in 2020 was the right one, but it exposed the near total reliance on externally set and marked examinations and caught us with our collective assessment trousers down. To be very clear, I am not an advocate of removing externally set and marked examinations and tests from our assessment regime. As Tim Oates rightly pointed out in the CIEA (2020) webinar on the award of qualifications in summer 2020, held on 9th June, examinations offer a standardised, fair and objective assessment, free of any localised bias and not bound by the relationship between the teacher and the pupil. But, and it’s a big but, timed examinations cover only a sample of the subject construct and have their own limitations in terms of validity and reliability: see, for example, the NFER literature review on reliability (2013) and the study on marking consistency published by Ofqual (2018), which shows wide variation across subjects. So in my view, it is at least desirable to include broader measures over longer periods of time. This is where schools and teachers come in. But, and it’s another big but, teachers’ assessments are also unreliable, whether through unconscious bias or the type of conscious bias driven by a range of motivators: results forming part of teachers’ performance management measures, schools seeking to improve their performance table positions at any cost, and the widespread suspicion that ‘teachers elsewhere are inflating their students’ marks’.
Of course this does not apply to all teachers and schools, but when reports such as that of the NAHT Commission on Assessment speak of the lack of trust “…within the profession itself” (2014, p.14), it is disheartening for anyone who holds the ambition to see teachers recognised as consummate professionals. But such recognition has to be earned, and in my opinion, teachers are not adequately educated in the art and science of educational assessment. Take, for example, bias, a topic of much focus when it comes to teachers’ assessments. My own research at the University of Leeds suggests that aspects of educational assessment such as bias – along with validity and reliability – are not adequately covered in ITT courses. And I’m not being over-critical of ITT providers: time constraints and their own lack of educational assessment expertise work against them. So newly qualified teachers become reliant on picking up their educational assessment knowledge from their schools, which inevitably means it is highly variable in quality. But where does that expertise come from? True, practising teachers have years of experience, but do they really possess any depth of study of key aspects of educational assessment? And ‘experience’ tells us nothing about the efficacy of understanding – if you don’t believe me, look at all the dodgy assessment data generated by schools in flight paths and other forms of data abuse, and the sometimes insane marking policies. Indeed, it is activities such as these that give assessment a bad name as generators of unnecessary teacher workload. Over the years, teacher associations have erroneously welcomed reductions in teachers’ involvement in high stakes assessment as ‘reductions in workload’. Teacher workload is a serious issue, but we should eliminate genuinely unnecessary activities – not assessment. Taking teachers out of the assessment process is like taking doctors out of the diagnosis stage of treatment.
Again, I’m not having a pop; professional development in educational assessment is at best sparse and not a formalised condition of membership of the teaching profession, yet assessment is integral to the process of teaching and learning, let alone to marking examinations. This is clearly resolvable. Teachers are bright, motivated individuals, but they have been de-skilled by a system that since 1988 has provided a National Curriculum, associated statutory assessments and ever-increasing externally set and marked examinations and tests. So why would you spend much time on educational assessment and the philosophy of education in an ITT course amidst a list of other possible topics, when the message from successive governments is that the only credible assessment is external assessment – that is, assessment ‘done to you’ from outside?
And now, in 2020, it has come back to bite us. No one was really expecting such a devastating pandemic – a bit like Monty Python’s Spanish Inquisition sketch – but it has happened, and it has exposed the Achilles’ heel of the teaching profession: the lack of educational assessment expertise and, equally worryingly, the lack of trust in teachers’ judgments. In theory, 2020 should offer teachers the opportunity to ‘show their worth’ when it comes to educational assessment through the submission of the estimated grades and rank orders of their students, especially in a context where externally imposed accountability measures based on test and examination outcomes have rightly been suspended by the government. But have they come up with the goods? In short, I very much doubt it, and teachers should prepare themselves for some disappointment come results day – unless, of course, some knew full well they were chancing their arm. But that may need some explaining to disappointed students - and their parents.
The research tells us that teachers’ grade predictions do not generally align with examination results: see, for example, the work by Gill and Benton (2015a & b) or that of Sandra Johnson (2012 & 2013). And if we look at more recent work by FFT Education Datalab (2020), and at the headlines from Ofqual’s 2020 Summer Symposium, the majority of school-based estimated grades are ‘over-optimistic’ – and by quite some distance when one considers key grades such as grade 4 at GCSE and grade B at GCE A level. The data also suggest that levels of optimism differ by centre type, and I’m sure more detailed analysis will become available over the coming months.
This is of course the view at the national level, and no doubt some schools, and some teachers, will be very accurate. But schools holding out for all of their estimated grades to be awarded are likely to be disappointed. Indeed, the estimated grades are at the bottom of the pecking order in the standardisation approach adopted for 2020, with prior attainment, historical data and rank ordering all taking priority. What is key, however, is that teachers’ rank ordering will be maintained by the standardisation model adopted for 2020; and yes, it is very likely that teachers will be within one grade of the final awarded grade and that results will have, overall, ‘gone up’ from previous years. But the International Baccalaureate Diploma results have gone up this year for the first time in three years, and there are still cries of foul play.
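The broad mechanics of such a standardisation step can be sketched in a few lines of Python. This is a deliberately simplified illustration, not Ofqual’s actual model (which also drew on prior attainment and other adjustments); the function name, candidates and grade distribution below are all invented for the example. It shows only the core idea: a centre’s historical grade distribution is imposed on this year’s cohort while the teachers’ rank order is preserved.

```python
# Toy illustration only - NOT Ofqual's 2020 standardisation model.
# It maps a hypothetical centre's historical grade distribution onto
# this year's teacher-supplied rank order, without ever reordering pupils.

def assign_grades(rank_order, historical_distribution):
    """Assign grades to a rank-ordered list of candidates so that the
    grade proportions broadly match a historical distribution.

    rank_order: candidate identifiers, highest-ranked first.
    historical_distribution: (grade, proportion) pairs, best grade first.
    """
    n = len(rank_order)
    results = {}
    i = 0
    cumulative = 0.0
    for grade, proportion in historical_distribution:
        cumulative += proportion
        # Candidates up to this cumulative share of the cohort receive
        # this grade; the teachers' rank order is never disturbed.
        cutoff = round(cumulative * n)
        while i < cutoff:
            results[rank_order[i]] = grade
            i += 1
    # Any rounding remainder falls into the lowest grade.
    while i < n:
        results[rank_order[i]] = historical_distribution[-1][0]
        i += 1
    return results

# Hypothetical centre: four candidates, and historically 25% A, 50% B, 25% C.
grades = assign_grades(
    ["anna", "ben", "cara", "dev"],
    [("A", 0.25), ("B", 0.5), ("C", 0.25)],
)
```

Here the top-ranked candidate receives the top grade and the ranking is never reordered – mirroring the point above that it is the rank order, not the estimated grade, that carries the most weight in the 2020 approach.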
Indeed, teachers are pretty good at rank ordering their pupils – but at class level. Rank ordering across a year group is another matter. Strangely, little is known about how teachers actually go about estimating grades (Gill, 2019), even though predicted grades are used for a variety of purposes: to inform pupils of their likely achievement in public examinations; to act as aspirational goals, as suggested by UCAS (2020); and, up to about 2015, as submissions to awarding bodies to assist in the grade awarding process or to provide a level of reassurance if a candidate missed an examination component. This latter use has clearly fallen by the wayside; in current circumstances it could have been used to analyse centre-based predicted grades against those awarded by the examination boards. However, these are two different assessments: teachers are presumed to generate their predictions from work produced by students throughout the course, whereas an examination focuses on a smaller, one-off sample of the syllabus.
We therefore do need to know more about how teachers and schools actually produce estimated grades, and the work now being undertaken by Ofqual to draw on the summer 2020 experience is very welcome and worthy of support. For those unaware, Ofqual is currently running a survey to understand the different approaches teaching staff and centres used to produce the Centre Assessment Grades and rank orders submitted to awarding organisations earlier this year. It wants to know about the sources of evidence used and how their importance varied across centres, the difficulties teachers faced when completing this task, and how centres tried to minimise the influence of any biases in these grades. Ofqual will also be carrying out in-depth interviews with a sample of teaching staff, allowing it to dig much deeper into the same questions in individual contexts. This research forms part of a larger programme of work, including projects using data analysis to understand any particular equalities issues raised by the extraordinary arrangements this summer. The survey is live until the end of 1st August and is well worth supporting; a link is provided at the end of this article.
So we can, and really should, do better when it comes to teacher-based assessments – not just to support the regime of tests and qualifications, but to improve teaching and learning at every level. That’s what educators are about, and assessment is the fulcrum on which teaching and learning pivot. If there’s no learning, by definition there’s no teaching, and the only proxy we have for judging learning is assessment. In short, we don’t have a common standardisation approach when it comes to high stakes assessments – a point made by Ofqual in justifying its approach this summer, and it is correct. Over the years there have been attempts at developing quite elaborate moderation processes, but where we’ve found such systems lacking we’ve generally cut them out of the frame rather than looking at longer-term improvements. So we do need a common standardisation approach, but any system used by centres has to be of good and proven quality: not an add-on or token gesture used to compile mad and meaningless flight paths and other spurious concoctions of data graphics, and not something driven by a politically biased agenda. The goal has to be driven by educational considerations, which will require deeper professional knowledge and understanding and appropriate quality assurance processes covering educational assessment within every educational institution. This requires training and evidence of the efficacy of non-examination assessment systems. Every educational institution should have a Chartered Educational Assessor, both to provide this level of quality assurance and to act as a point of expertise.
And this is not just to support examinations, but to improve educational assessment throughout the institution – be it in day-to-day use, such as effective questioning in the classroom; in a real understanding of how to design or evaluate assessment instruments for points of transition, for example from one class or one school to another; or for summative purposes at the end of a period or stage of education: all of which can be used to inform future teaching and learning. Such an approach would not only benefit teaching and learning and the validity of high stakes assessments; it would also raise the professional status of teachers – and raise standards of performance.
It needs to be done now. We need better articulation between internal and external assessment – this is not an ‘either/or’, but a matter of building the best possible approach, not the most expedient. The potential for further COVID-19 disruption is real, so if we want a more robust system in 2021, time is of the essence. But this is not the only driver for change. We have the opportunity to use this year’s experience as a stimulus for longer-term change that will benefit learners and lift the professional status of teachers. As the former Chair of the CIEA, Sir John Dunford, recently reminded us in his blog on the future of the examinations system (see references), let’s not waste a crisis.
References
CIEA, 2020. Covid-19 and the award of qualifications in summer 2020. [online]. CIEA webinar, 9th June 2020. Available from: https://www.herts.ac.uk/ciea/ciea-lecture-series
Dunford, J. 2020. The future of the exams system: Moving to a more valid and reliable system [online]. John Dunford Consulting. June 14th 2020. Available from: https://johndunfordconsulting.wordpress.com/blog/
FFT Education Datalab. 2020. GCSE results 2020: A look at the grades proposed by schools. Available from: https://ffteducationdatalab.org.uk/2020/06/gcse-results-2020-a-look-at-the-grades-proposed-by-schools/
Gill, T. 2019. Methods used by teachers to predict final A level grades for their students. Research Matters: A Cambridge Assessment publication. 28, 2-10. Available from: https://www.cambridgeassessment.org.uk/Images/562367-research-matters-28-autumn-2019.pdf
Gill, T. and Benton, T. 2015a. The accuracy of forecast grades for OCR GCSEs in June 2014. Statistics Report Series No.91. Cambridge Assessment. Available from: https://www.cambridgeassessment.org.uk/Images/241265-the-accuracy-of-forecast-grades-for-ocr-gcses-in-june-2014.pdf
Gill, T. and Benton, T. 2015b. The accuracy of forecast grades for OCR A levels in June 2014. Statistics Report Series No. 90. Cambridge, UK: Cambridge Assessment.
Johnson, S. 2012. A focus on teacher assessment reliability in GCSE and GCE. In: Opposs, D. and He, Q., eds. Ofqual’s Reliability Compendium. Ofqual, pp. 365-416.
Johnson, S. 2013. On the reliability of high-stakes teacher assessment, Research Papers in Education, 28:1, 91-105, DOI: 10.1080/02671522.2012.754229. Available from: https://doi.org/10.1080/02671522.2012.754229
NAHT. 2014. Report of the NAHT Commission on Assessment. National Association of Headteachers. Available from: https://www.stem.org.uk/system/files/elibrary-resources/2016/04/Assessment%20commission%20report%20document%20%282%29.pdf
Ofqual, 2018. Research and Analysis, Marking consistency metrics. An update. The Office of Qualifications and Examinations Regulation. (Ofqual). Available from: https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/759207/Marking_consistency_metrics_-_an_update_-_FINAL64492.pdf
UCAS, 2020. Predicted grades – what you need to know. [Online]. The Universities and Colleges Admissions Service. [Accessed 11th January 2020]. Available from: https://www.ucas.com/advisers/managing-applications/predicted-grades-what-you-need-know
Ofqual materials on the arrangements for 2020
- Ofqual video
- Symposium support materials
- Slides from the 2020 symposium
- Ofqual fact sheet
- The Ofqual Survey on approaches to centre based grading and rank-ordering
BBC article on David Laws’ submission to ministers. https://www.bbc.co.uk/news/education-52895640