Professor Barry Smyth is a data scientist and AI researcher working in the SFI-funded Insight Centre for Data Analytics. He also runs marathons on the side. A few years ago, Smyth started to combine these interests by using his data science skills to collect and analyse race data from millions of marathon runners around the world. What started as a hobby has now become part of his day job.
Until now, most research on marathons has focused on elite runners. By looking at the data collected by different marathons all over the world, Smyth has been able to answer many questions about the non-elite, everyday runners. Sometimes the data confirmed conventional wisdom – such as that starting too fast leads to slower finish-times – but at other times it led to more surprising findings, such as that runners who sprint to the finish tend to finish more slowly overall. The data showed that women are more disciplined runners than men, pacing their races more evenly and hitting the wall far less often. And, although men are faster than women, this gender gap closes with age.
Analysing data to understand the past was fine, but Smyth knew that his AI expertise also made it possible to predict the future, and so to help runners to train better, run farther, and finish stronger. With this in mind, he has turned his attention to using artificial intelligence to make predictions that help runners to achieve new personal best (PB) finish-times.
He says, “Earlier this year, I was droning on about my latest marathon data analysis to a friend and fellow researcher, and he asked whether it might be possible to use the data to predict a runner’s potential personal best finish-time for some upcoming marathon. To be clear, we were not talking about predicting any old marathon finish-time, based on a recent half-marathon or 10k race time; there are plenty of race calculators available to do this. Rather, we were interested in predicting a challenging but achievable PB time, which a runner might be capable of pushing themselves to achieve.”
It was clear that several things were needed to do this. First, they needed to be able to predict a realistic PB finish-time for a runner, a time that would challenge them, while avoiding the mistake of picking a time that wasn’t achievable.
But predicting a target finish-time was just one part of what was needed. They also needed to recommend a pacing plan so that the runner could be advised about how to pace themselves, throughout the race, to achieve the predicted PB.
Finally, all of this needed to be tailored for the course in question as both finish-times and pacing plans are highly influenced by the twists and turns, ups and downs, of a particular race course.
When Smyth examined data for the London marathon in this manner, he found that participants who could complete the marathon in 150 minutes could improve their personal best by about five minutes. Slower men, those who could finish in four hours, could expect to improve their time by 22 minutes, while women at a similar standard could reasonably aim for a 17-minute jump in their personal best.
Smyth’s predictions use a machine learning technique called case-based reasoning. He makes his PB predictions by adapting the PB times and pacing profiles of runners who are similar to the runner whose PB he wants to predict. Each of these similar runners has a past race that resembles the target runner’s performance, and each went on to achieve a faster time in a later race. Those faster times, and their corresponding pacing profiles, are the basis of the prediction for the target runner.
The more cases he has, the better the predictions are likely to be, and if he wants to make predictions for a different marathon, let’s say Chicago, then he can generate his cases from the race records of Chicago marathoners.
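To make the idea concrete, here is a minimal sketch of case-based reasoning for PB prediction, written in Python. It is an illustration only, not Smyth's actual system: the `Case` structure, the similarity measure, the choice of `k` nearest cases and the way the neighbours' results are adapted are all assumptions made for this example. Each case pairs a runner's earlier race with the faster PB that the same runner achieved later; pacing is given as relative pace per race segment, where 1.0 means the runner's own average pace.

```python
from dataclasses import dataclass
from math import sqrt
from typing import List, Tuple


@dataclass
class Case:
    """One runner's earlier race paired with the PB they later achieved (hypothetical layout)."""
    past_time: float          # finish time in minutes of the earlier, non-PB race
    past_pacing: List[float]  # relative pace per segment in that race (1.0 = average pace)
    pb_time: float            # faster finish time in minutes achieved later
    pb_pacing: List[float]    # relative pace per segment in the PB race


def distance(case: Case, time: float, pacing: List[float]) -> float:
    """Assumed similarity measure: relative finish-time gap plus pacing-profile gap."""
    time_gap = abs(case.past_time - time) / time
    pacing_gap = sqrt(sum((a - b) ** 2 for a, b in zip(case.past_pacing, pacing)))
    return time_gap + pacing_gap


def predict_pb(cases: List[Case], time: float, pacing: List[float],
               k: int = 3) -> Tuple[float, List[float]]:
    """Retrieve the k most similar cases and adapt their PB times and pacing."""
    neighbours = sorted(cases, key=lambda c: distance(c, time, pacing))[:k]
    # Adapt the finish time: apply the neighbours' average improvement ratio.
    mean_ratio = sum(c.pb_time / c.past_time for c in neighbours) / len(neighbours)
    # Adapt the pacing: average the neighbours' PB pacing, segment by segment.
    plan = [
        sum(c.pb_pacing[i] for c in neighbours) / len(neighbours)
        for i in range(len(pacing))
    ]
    return time * mean_ratio, plan


if __name__ == "__main__":
    # Invented example data, for illustration only.
    example_cases = [
        Case(240.0, [0.95, 1.0, 1.0, 1.05], 228.0, [0.98, 1.0, 1.0, 1.02]),
        Case(245.0, [0.94, 0.99, 1.01, 1.06], 232.75, [0.98, 1.0, 1.0, 1.02]),
        Case(150.0, [1.0, 1.0, 1.0, 1.0], 147.0, [1.0, 1.0, 1.0, 1.0]),
    ]
    target_pb, pacing_plan = predict_pb(example_cases, 240.0, [0.95, 1.0, 1.0, 1.05], k=2)
    print(f"Suggested PB: {target_pb:.1f} minutes, pacing plan: {pacing_plan}")
```

Because the cases are simply race records, pointing the same code at a different marathon would only mean building the case list from that race's results.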
Pacing is key to achieving the predicted PB time. Smyth found that it was easier to make accurate predictions for female runners. “Women are more consistent pacers than men and, as such, tend to run more disciplined races,” says Smyth.
He found the same to be true for elite runners. People who run very fast marathons also tend to be disciplined when it comes to pacing, making prediction of their races easier as well.
So right now, the system can suggest challenging but achievable PB times and it can recommend pacing plans to help a runner achieve that time. Smyth is currently working to improve the system’s prediction accuracy. His ambitions don’t stop there, however.
He says, “We are looking at many different ways in which machine learning can be used to help marathon runners and other athletes, not just for PB prediction, but also for injury prevention, for recovery advice, to personalise training plans and so on. The possibilities are quite exciting. It is all about using wearable technology and AI to help people make better decisions. Better decisions about how they exercise, but that’s just the start.”