International benchmarking with CAT4

Introduction

The Cognitive Abilities Test® (CAT4) and Progress Test Series® for English, maths and science are standardised against the UK school population. Many international schools have asked for an international school standardisation of these assessments. However, we feel that:

  • a single representative international standardisation is difficult because international schools are so diverse, for example in curriculum, geography and the number of students with English as an Additional Language (EAL)

and, more importantly,

  • our UK standardisations, based on a diverse student population*, have proven to be relevant and reliable for many international schools.

For the second year in a row we have carried out benchmarking exercises comparing the scores achieved by thousands of international students globally against the original standardisations for CAT4 (~224,000 students). The corresponding results for the Progress Test Series will be released later this year.

Standard Age Score

The Standard Age Score (SAS) is based on the student’s raw score, adjusted for age and scaled to a mean of 100. When comparing SAS values, a difference of less than 3 points is not usually seen as statistically significant.
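As a minimal sketch of that rule of thumb, the Python snippet below compares a mean SAS against a reference value and reports whether the gap reaches 3 points. The function name, the labels it returns and the example values are hypothetical, chosen only for illustration. It applies the informal 3-point guide from the text rather than a formal significance test.

```python
UK_MEAN_SAS = 100.0           # SAS is scaled so that the standardisation mean is 100
MIN_NOTABLE_DIFFERENCE = 3.0  # rule of thumb: gaps under 3 points are not usually significant


def compare_sas(mean_a: float, mean_b: float = UK_MEAN_SAS,
                threshold: float = MIN_NOTABLE_DIFFERENCE) -> str:
    """Describe how mean_a compares with mean_b under the 3-point rule of thumb."""
    difference = mean_a - mean_b
    if abs(difference) < threshold:
        return "similar"
    return "higher" if difference > 0 else "lower"


# Hypothetical example values, not taken from the benchmarking data
print(compare_sas(101.5))  # similar
print(compare_sas(104.0))  # higher
print(compare_sas(96.0))   # lower
```

A gap of exactly 3 points is treated as notable here, since the text only describes differences of less than 3 points as not significant.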

Comparison of UK standardisations and international scores in 2017

The distribution of international scores for CAT4 Level D (Table 1) is typical for international students and is very similar to the UK standardisation profile overall. Results show that international school students tend to score more highly on the quantitative, nonverbal and spatial batteries, but lower on the verbal battery. Figure 1 shows the slight skew to the left on the verbal distribution; Figure 2 shows the slight skew to the right on the nonverbal distribution for CAT4 Level D.

Fig. 1: CAT4 Level D verbal score distribution

Fig. 2: CAT4 Level D nonverbal score distribution

In other words, the typical international class might be expected to be slightly more able overall but have weaker literacy skills. Given the higher level of EAL students in the typical international school, this is not surprising but it is an important point. Low verbal reasoning scores can be indicative of students who may have difficulty in accessing the curriculum.

Table 1

Table 1 shows mean scores for a number of CAT4 levels. The mean scores for some of the levels indicate that the international student benchmarking sample at these ages is more able than that of the UK standardisation (highlighted in green). At level G in particular, this difference is significant.

Verbal scores are lower across all levels, although the difference is not significant. Mean scores across the CAT4 batteries were similar to last year’s, varying by at most 1 point depending on the level.

How can schools use the data?

Plan resourcing at a cohort level

Does one year group appear to be significantly below average or less able than your other year groups? Can you provide additional teaching support for this cohort?
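One way a school might run this check on its own data is sketched below in Python. It assumes SAS values are already grouped by year group and reuses the 3-point rule of thumb as the cut-off. The function name, the data layout and the example scores are all hypothetical, and the cut-off is a judgment call rather than a prescribed method.

```python
def flag_low_cohorts(scores_by_year: dict[str, list[float]],
                     reference: float = 100.0,
                     threshold: float = 3.0) -> dict[str, float]:
    """Return the mean SAS of each year group that sits at least
    `threshold` points below `reference`."""
    flagged = {}
    for year_group, scores in scores_by_year.items():
        cohort_mean = sum(scores) / len(scores)
        if reference - cohort_mean >= threshold:
            flagged[year_group] = round(cohort_mean, 1)
    return flagged


# Hypothetical example data, for illustration only
example = {
    "Year 7": [98, 104, 101, 99],  # mean 100.5
    "Year 8": [92, 97, 95, 96],    # mean 95.0
}
print(flag_low_cohorts(example))  # {'Year 8': 95.0}
```

The same function can compare each year group with the school's own overall mean by passing that value as `reference`.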

Consider the implications for EAL learners

Students with EAL may need literacy support throughout their school career to enable them to fully realise their potential in formal exams and to prepare them for higher education. Look out for students with verbal scores significantly lower than scores achieved in the other batteries.
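The check described above could be sketched as follows, assuming each student record holds one SAS per CAT4 battery. The record layout, the function name and the 10-point gap are hypothetical choices made for illustration; the source does not specify how large a gap should count as significant for an individual student, so a school would need to set its own value.

```python
def flag_verbal_gap(students: list[dict], min_gap: float = 10.0) -> list[str]:
    """Return the names of students whose verbal SAS is at least `min_gap`
    points below the average of their other three battery scores."""
    flagged = []
    for student in students:
        other_mean = (student["quantitative"]
                      + student["nonverbal"]
                      + student["spatial"]) / 3
        if other_mean - student["verbal"] >= min_gap:
            flagged.append(student["name"])
    return flagged


# Hypothetical example records, for illustration only
students = [
    {"name": "Student A", "verbal": 88, "quantitative": 105,
     "nonverbal": 108, "spatial": 102},
    {"name": "Student B", "verbal": 101, "quantitative": 99,
     "nonverbal": 103, "spatial": 100},
]
print(flag_verbal_gap(students))  # ['Student A']
```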

Take advantage of the combination reports

Quickly identify students who are under-performing by using the automatic combination reports for CAT4 and the Progress Test Series.

Set realistic but challenging targets

The CAT4 reports provide a useful range of likely outcomes for each student. Some schools use these as a basis for a discussion with students, helping them take ownership of their targets.

Your point of reference

The original standardisation is a reliable and valid benchmark for international students. Schools may additionally wish to compare their own data to the findings of this benchmarking exercise to see if they reflect some of the nuances seen between the different sections of CAT4 and from year to year with the Progress Test Series.

 

* For example, during our last standardisation exercise, for the Baseline assessment, the EAL ratio was 18%.