The eventing data revolution
Statistics have sometimes had an uneasy relationship with a sport that relies so heavily on instinct. But there’s no doubting the numbers are here to stay.
It all started with a tweet. Watching from beside the course at Badminton, armed only with his phone and an Excel sheet, Diarmuid Byrne typed out a message and pushed it out to an account yet to earn even a single follower.
All eyes were on the action back then as the first riders went out to kick off the five-day festival. Yet before the end of the event in 2015, the industry was abuzz with the pair of relative unknowns using numbers to predict what would happen before it did.
“I was looking at three-year run averages, which I know now aren’t that powerful – but they’re understandable,” recalls Diarmuid of his Nostradamus moment.
“I tweeted, ‘this rider is going to score 41’ and then that rider actually did score 41. But it was one tweet, one score, so I did the same with the next rider and so on. It was pure chance they scored close to what their average had been previously.
“On that very first day, I was tweeting and the commentary box at Badminton direct messaged me and said, ‘who are you and will you come to the commentary box?’. So I went and sat in there, telling them my numbers and what I thought was going to happen.”
This was no wizardry, though. Once he had shown off the simple method he devised with his friend and business partner Sam Watson, Diarmuid’s average scores were suddenly being broadcast much further than an unknown Twitter account could reach.
He soon found himself nestled between two former world champion riders – Lucinda Green and Sir Mark Todd – offering his statistical analysis as part of Badminton’s on-course coverage, which was being listened to across the estate.
What makes it all the more surprising is that Diarmuid had never ridden a horse competitively in his life.
“I was saying, ‘I don’t think this will be a very good score at all’, but I was just looking at their last three scores, which were all bad,” he continues.
“I don’t need world championships and Olympic medals to do that, I just need to say that the recent form hasn’t been that good. On the Sunday I was working with Clare Balding at the BBC, standing next to her, and as every horse would come out, I’d say something like, ‘they’ve been to the venue 10 times and this is the first time they’ve had a sub-40 score’ and then Clare would just repeat it.
“Nobody was actually doing that before. For years, it was accepted that coverage and commentary were based on really nice pictures and what you see in front of you; this adrenaline and excitement. The concept of making this thing a study or sort of academic was so alien at the time.”
And so began eventing’s data revolution.
In the following six years, the power of stats has exploded across the sport, with Diarmuid and Sam’s business, EquiRatings, at the forefront of the transformation.
But the numbers haven’t just been used by the media to enhance the audience’s experience. Since that breakthrough at Badminton, data has been used to aid performance and Olympic selection, as well as to improve rider and course safety.
There had been pockets of data analysis across the sport before, but digital communication and a dedicated force driving the work forward have given this latest wave a far greater reach.
“As a coach, you have a gut feeling or a hunch that you can see a little bit what people are doing – data can sometimes back this up,” says elite coach Yogi Breisner. “I’d been putting stats together myself prior to EquiRatings, doing things like taking the last five dressage scores at major events and putting the highest score and an average. I’d do the same with the cross country and show jumping and that would give me a bit of an idea of what would happen.
“The average score tells you what a rider is most likely to produce and the top score is what they’re capable of doing if everything is right. From a team point of view, you’re more interested in an average than the top, but from an individual, it’s top over average.”
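Yogi's home-made approach, taking a rider's last five scores in a phase and noting the average and the best, can be sketched in a few lines of Python. This is only an illustration: the function name `form_summary`, the sample scores and the `higher_is_better` switch are assumptions. The switch is there because the article does not say whether the scores were percentages (higher is better) or penalties (lower is better).

```python
from statistics import mean

def form_summary(scores, last_n=5, higher_is_better=True):
    """Summarise recent form from a list of scores ordered oldest to newest.

    Returns the average of the most recent `last_n` scores (what the rider
    is most likely to produce) and the best of them (what they are capable
    of when everything goes right).
    """
    recent = scores[-last_n:]
    if not recent:
        raise ValueError("no scores supplied")
    best = max(recent) if higher_is_better else min(recent)
    return {"average": mean(recent), "best": best}

# Hypothetical dressage percentages, oldest first; only the last five count.
summary = form_summary([60, 70, 65, 72, 68, 74])
print(summary)  # {'average': 69.8, 'best': 74}
```

Following Yogi's reasoning, a team selector would weigh the `average` more heavily, while an individual chasing a win would look first at `best`.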
Yogi was already familiar with EquiRatings, thanks to a previous relationship with Sam, who is a top rider himself, and took full advantage of the data provided when it came to preparing Great Britain’s Olympic cohort for Rio 2016.
Using the extensive records on not just Britain’s riders but also their opposition, Yogi says the team were able to identify their own weaknesses much more easily – figuring out how to make the gains they needed to climb up the rankings.
“One of the factors that came up was that some of our opponents – depending on the various riders they were using – were sitting on a 76-77% clear round in the show jumping phase, whereas the British team were sometimes sitting on 32-33%,” Yogi explains.
“This means that, statistically, there could be three clear rounds from a team of four, whereas for the Brits it would be one, possibly two, clear rounds from four riders.
“By using that stat, we could highlight to the riders that we needed to concentrate on working in that area. In Rio, we had four horses show jumping and we had clear rounds – one had time penalties, but we had four clear rounds.”
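The step from a clear-round percentage to "three clear rounds from a team of four" is simple arithmetic, and a short Python sketch makes it explicit. It rests on two simplifying assumptions that are not in the article: every rider on a team shares the same clear-round rate, and each round is independent of the others. The function names are illustrative.

```python
from math import comb

def expected_clears(clear_rate, team_size=4):
    """Expected number of clear rounds when each of `team_size` riders
    jumps clear with probability `clear_rate`."""
    return clear_rate * team_size

def prob_at_least(k, clear_rate, team_size=4):
    """Binomial probability of at least `k` clear rounds from the team."""
    return sum(
        comb(team_size, i) * clear_rate**i * (1 - clear_rate) ** (team_size - i)
        for i in range(k, team_size + 1)
    )

print(round(expected_clears(0.76), 2))  # 3.04
print(round(expected_clears(0.33), 2))  # 1.32
```

A 76% rate works out at roughly three clears from four, and a 33% rate at between one and two, which matches the figures Yogi quotes. `prob_at_least` goes one step further and gives the chance of reaching a target number of clears under the same assumptions.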
The type of statistics produced is constantly developing to discover the most insightful and useful figures, particularly when it comes to performance.
Yet even when providing data for several of the leading Olympic eventing teams, EquiRatings’ big task is finding the sweet spot between being detailed and being too sophisticated.
“It’s been a really interesting journey in terms of the metrics we release, what works and why,” Diarmuid picks up.
“We started off with some basic averages, then moved to some quite complex risk analysis metrics. There was one called the EquiRatings Quality Index (ERQI) and, essentially, it was detailed and driven by algorithms, taking into account hundreds of variables – it was accurate but also hard to explain.
“We went back to simple things, such as Opposition Beaten Percentage (OBP), which was how many people a rider has taken on and how many they’ve beaten. You can like that or dislike that, but it’s fact.
“We’ve gone from very simple to very accurate, back to very simple again. But this time, very simple is very transparent. Right now, the cutting-edge stuff is what’s called Explainable AI, so it’s machine learning, black-box stuff again.
“But the explainable bit is the ‘why’. For example, if I’m ranked 25th in the world, I look at where I would be if I jumped clear at Blenheim. This is the way a lot of companies want their data, across every industry.”
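The Opposition Beaten Percentage that Diarmuid describes, how many rivals a rider has taken on and how many they have beaten, lends itself to a small worked example. The article does not give EquiRatings' exact formula, so the Python below is a sketch under assumptions: each result is a finishing position and a starter count, ties and eliminations are ignored, and the function name and sample results are hypothetical.

```python
def opposition_beaten_percentage(results):
    """Share of opponents beaten across a set of events, as a percentage.

    `results` is a list of (finishing_position, number_of_starters) pairs.
    In each event the rider faces `starters - 1` opponents and beats
    everyone who finished below them, i.e. `starters - position` riders.
    """
    faced = sum(starters - 1 for _, starters in results)
    beaten = sum(starters - position for position, starters in results)
    return 100.0 * beaten / faced if faced else 0.0

# Hypothetical season: 1st of 50, 10th of 40, 25th of 50.
obp = opposition_beaten_percentage([(1, 50), (10, 40), (25, 50)])
print(round(obp, 1))  # 75.9
```

Every number in the calculation can be traced back to a published result, which is the transparency Diarmuid is pointing to when he says "it's fact".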
There are sceptics, though. For traditional riders, nothing can replace the intuition they feel when they’re sat in the saddle, and quite often the numbers they’re presented with only confirm what they already believed about their strengths and weaknesses.
The idea isn’t for numbers to overtake any of the natural instincts, just to sit alongside them. And with variables such as different dressage judges and changing cross country terrain, the best approach seems to be a combination of numbers and touch, even if the numbers already take those variables into account.
The numbers also give fans the context to understand who the favourites are, to see which riders normally struggle at a particular course and to spot an array of records. And it’s this extra interest that’s attracting sponsors, such as Equilume and BETTALIFE, to support the high-performance and milestone data that audiences are lapping up.
But perhaps even more importantly, EquiRatings has driven new safety thresholds that British Eventing uses to identify horses or riders who are putting themselves at risk by moving to a new level – even if they’ve clocked up the prerequisite number of clears to qualify for the standard.
“Poor past performance predicts poor future performance,” Diarmuid says. “We started providing a list of poor performances to the governing body or to risk managers so they knew there was a horse coming through with five falls in the past 10 events. It says, ‘we need to keep an eye on this person’.
“It’s basically just results management, but we turned it into a rating and that became a global risk rating, which is used all over the world. It means there’s now some sort of quality metric in place that allows you to at least recognise poor performers who might be putting themselves at risk.”
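The "results management" idea behind that early list, counting falls over a horse's most recent events, can be illustrated with a minimal Python sketch. This is not the EquiRatings risk rating, whose method the article does not describe. The function names, the ten-event window taken from Diarmuid's example and the caller-supplied threshold are all assumptions.

```python
def recent_fall_rate(fell, window=10):
    """Fraction of the most recent `window` runs that ended in a fall.

    `fell` is a list of booleans ordered oldest to newest.
    """
    recent = fell[-window:]
    return sum(recent) / len(recent) if recent else 0.0

def needs_review(fell, max_fall_rate, window=10):
    """Flag a record whose recent fall rate reaches the caller's threshold."""
    return recent_fall_rate(fell, window) >= max_fall_rate

# Diarmuid's example: five falls in the past 10 events.
history = [True, False, True, False, False, True, True, False, True, False]
print(recent_fall_rate(history))  # 0.5
print(needs_review(history, max_fall_rate=0.3))  # True (0.3 is an arbitrary illustrative threshold)
```

A real rating would presumably weigh far more than falls, but even this crude count shows how a list of poor performances can be turned into a single figure for risk managers to watch.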
So as data’s influence grows in all aspects of the sport, how far could it go?
“Data won’t ever take over completely,” answers Yogi. “You have it in sport for the technical side of things, but it can’t predict everything. Take Formula One, for example, which is an incredibly figures-based sport, but at the end of the day, it’s still only a human being driving the car – the data just helps with your decision-making.”
Like it or lump it, data is definitely here to stay. We’ve seen it in the numbers.