Monday, November 25, 2013

Predicting Your Duncan Ridge Trail 50km Time

Congratulations to all the DRT finishers this past weekend! I meant to get this post out before the race to give you something to talk about with your friends and family on race day, and maybe even encourage you to shoot for a faster finishing time. But time got away from me.
What follows is an analysis of finishing times for runners who have finished both the StumpJump and the DRT 50km races.
Given how fast you ran the StumpJump, how well should you expect to do in the DRT? Someone recommended adding two hours to your StumpJump time, but does that hold up? Let's look at the data. Taking the finishing times of every runner who has finished both races, and averaging their results if they've run a race more than once, we plot DRT time versus StumpJump time. StumpJump times are on the X axis (independent variable), DRT times on the Y axis (dependent variable). Although the DRT 2013 results are in, let's first see how the two-hour rule stacks up against a linear model built without the 2013 DRT results.

What's a linear model? Simple: try to draw a straight line on a plot that goes through each of your data points. That's a linear model ;) Obviously this doesn't work if your data are not already in a straight line, so instead draw the line through the middle of the oval-shaped cloud of points, as close to as many of them as you can. That's a linear model, too :) Unfortunately, as you'll see below, sometimes the shape of your data is weird. That's when a linear model doesn't really characterize your data, and using one isn't much help.
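For readers who would rather compute the line than eyeball it, here is a minimal sketch of an ordinary least-squares fit in plain Python. It assumes the model was fit by least squares, which this post does not actually state. The function names are my own, and the tiny sample data set is made up for illustration; it is not the race results.

```python
from collections import defaultdict

def average_by_runner(results):
    """Average finishing times (in hours) for runners with several results.

    `results` is a list of (runner_name, hours) pairs.
    """
    times = defaultdict(list)
    for runner, hours in results:
        times[runner].append(hours)
    return {runner: sum(ts) / len(ts) for runner, ts in times.items()}

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Made-up example data (NOT the real race results), times in hours.
stumpjump = average_by_runner([("A", 5.0), ("A", 5.5), ("B", 6.0), ("C", 7.0)])
drt = average_by_runner([("A", 7.0), ("B", 8.0), ("C", 8.5)])

runners = sorted(set(stumpjump) & set(drt))  # only runners with both races
xs = [stumpjump[r] for r in runners]
ys = [drt[r] for r in runners]
slope, intercept = fit_line(xs, ys)
print(f"DRT = {intercept:.2f} + {slope:.2f} * StumpJump (hours)")
```

The add-two-hours rule is the special case where the slope is fixed at 1 and the intercept at 2 hours, so comparing the two lines comes down to comparing those two numbers.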
The two-hour recommendation is the red line, and the linear model is the blue line. Each dot represents a runner who has run both races, with their average StumpJump time on the X axis and their average DRT time on the Y axis. To predict your own DRT finishing time, find your StumpJump time on the X axis, move straight up to either line, then move straight left to the Y axis to read off your predicted DRT time.

So given StumpJump times for all years including 2013, and DRT times for all years except 2013, the linear model predicts a DRT time shorter than StumpJump time plus two hours for anyone slower than about five hours at the StumpJump, and a slightly longer one for faster runners. But our data look funny, and the linear model is probably not a good predictor.

StumpJumpTime TwoHours ModelRec Difference
04:00:00 06:00:00 06:09:43 -09:43
04:30:00 06:30:00 06:34:28 -04:28
05:00:00 07:00:00 06:59:14 00:46
05:30:00 07:30:00 07:23:59 06:01
06:00:00 08:00:00 07:48:45 11:15
06:30:00 08:30:00 08:13:30 16:30
07:00:00 09:00:00 08:38:16 21:44
07:30:00 09:30:00 09:03:01 26:59
08:00:00 10:00:00 09:27:47 32:13
08:30:00 10:30:00 09:52:32 37:28
09:00:00 11:00:00 10:17:18 42:42
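
Because the ModelRec column comes from a single straight line, the line itself can be recovered from any two rows of the table. The sketch below does that with the first and last rows, then finds the StumpJump time at which the line and the two-hour rule agree. The helper names are my own, and the result is only as precise as the table's rounding to the nearest second.

```python
def to_seconds(hms):
    """Convert 'HH:MM:SS' to seconds."""
    h, m, s = (int(part) for part in hms.split(":"))
    return h * 3600 + m * 60 + s

def to_hms(seconds):
    """Convert seconds to 'HH:MM:SS', rounding to the nearest second."""
    seconds = int(round(seconds))
    return f"{seconds // 3600:02d}:{seconds % 3600 // 60:02d}:{seconds % 60:02d}"

# First and last rows of the table above: (StumpJumpTime, ModelRec).
x1, y1 = to_seconds("04:00:00"), to_seconds("06:09:43")
x2, y2 = to_seconds("09:00:00"), to_seconds("10:17:18")

slope = (y2 - y1) / (x2 - x1)
intercept = y1 - slope * x1  # in seconds

def model_drt(stumpjump_hms):
    """Predicted DRT time from the recovered line."""
    return to_hms(intercept + slope * to_seconds(stumpjump_hms))

def two_hour_drt(stumpjump_hms):
    """Predicted DRT time from the add-two-hours rule."""
    return to_hms(to_seconds(stumpjump_hms) + 2 * 3600)

# StumpJump time at which both rules give the same prediction:
#   x + 7200 = intercept + slope * x
crossover = (intercept - 7200) / (1 - slope)

print(f"slope = {slope:.4f}, intercept = {to_hms(intercept)}")
print("model for 06:15:00:", model_drt("06:15:00"))
print("rules agree at StumpJump =", to_hms(crossover))
```

The crossover lands between 04:30:00 and 05:00:00, which matches the sign change in the Difference column between those two rows.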
Now let's look at the DRT 2013 finishers who've also run the StumpJump.

At least for 2013, the two-hour recommendation characterizes the data pretty well, although you could probably get away with adding only 1.5 hours instead of 2.
And finally let's add the DRT 2013 finishers to our original data and see what our linear model does.

It seems to characterize the data a little better, but the data shape still looks funny. Maybe next time I'll investigate a different model. Regardless, here's a prediction table to use while preparing for DRT 2014!

StumpJumpTime TwoHours ModelRec Difference
04:00:00 06:00:00 05:50:40 09:20
04:30:00 06:30:00 06:18:11 11:49
05:00:00 07:00:00 06:45:42 14:18
05:30:00 07:30:00 07:13:13 16:47
06:00:00 08:00:00 07:40:44 19:16
06:30:00 08:30:00 08:08:15 21:45
07:00:00 09:00:00 08:35:46 24:14
07:30:00 09:30:00 09:03:16 26:44
08:00:00 10:00:00 09:30:47 29:13
08:30:00 10:30:00 09:58:18 31:42
09:00:00 11:00:00 10:25:49 34:11
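
If your StumpJump time falls between two rows, you can interpolate linearly between them. Here is a small sketch that does so using the StumpJumpTime and ModelRec columns of the table above. The function names are my own, and the code deliberately refuses to extrapolate outside the 4 to 9 hour range the table covers.

```python
from bisect import bisect_right

# (StumpJumpTime, ModelRec) pairs copied from the table above.
TABLE = [
    ("04:00:00", "05:50:40"), ("04:30:00", "06:18:11"),
    ("05:00:00", "06:45:42"), ("05:30:00", "07:13:13"),
    ("06:00:00", "07:40:44"), ("06:30:00", "08:08:15"),
    ("07:00:00", "08:35:46"), ("07:30:00", "09:03:16"),
    ("08:00:00", "09:30:47"), ("08:30:00", "09:58:18"),
    ("09:00:00", "10:25:49"),
]

def to_seconds(hms):
    """Convert 'HH:MM:SS' to seconds."""
    h, m, s = (int(part) for part in hms.split(":"))
    return h * 3600 + m * 60 + s

def to_hms(seconds):
    """Convert seconds to 'HH:MM:SS', rounding to the nearest second."""
    seconds = int(round(seconds))
    return f"{seconds // 3600:02d}:{seconds % 3600 // 60:02d}:{seconds % 60:02d}"

XS = [to_seconds(x) for x, _ in TABLE]
YS = [to_seconds(y) for _, y in TABLE]

def predict_drt(stumpjump_hms):
    """Linearly interpolate a DRT prediction between the two nearest rows."""
    x = to_seconds(stumpjump_hms)
    if not XS[0] <= x <= XS[-1]:
        raise ValueError("StumpJump time is outside the table's range")
    i = min(bisect_right(XS, x), len(XS) - 1)  # index of the upper bracketing row
    x0, x1 = XS[i - 1], XS[i]
    y0, y1 = YS[i - 1], YS[i]
    return to_hms(y0 + (y1 - y0) * (x - x0) / (x1 - x0))

print(predict_drt("06:15:00"))
```

Since the rows all sit on one line (up to rounding), interpolating between neighbours gives essentially the same answer as evaluating the linear model directly.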
