Race Viz

So I had an idea. And a lot of data. If I took split times, I could get a pretty decent approximation of a skier’s position during the race. Take their wire start time and pace, add the pace to the wire time minute by minute (8:01, 8:02, 8:03 …), and you would have a good idea, for the entire race, of how many people were on the trail at any given point in time and space. I figured I could then take these data and display them graphically. I could break the racers out by wave, kind of like I do with the finishing results, but in real time. I started to get excited.

The issue was that the data analysis was a little beyond my expertise. I had, for each skier, their split times and their intermediate pace times. The next step was to calculate, using a rather extensive Excel formula, their position for every minute of the race (starting with the Elite women at 8:00 a.m. and running 600 minutes until the official timing cutoff at 6:00 p.m.). This gave me a rather large spreadsheet with 6000 rows of data and more than 600 columns. If you’re doing the math at home, that’s 3.6 million data points, most of them multi-byte numbers, meaning that my spreadsheet was not-that-fast with 60 MB of data to sort through. But I did have an estimate of the position of every skier for every minute of the race.
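For anyone who would rather not wrestle with a giant Excel formula, here is a rough Python sketch of the same idea. It is not the formula I used: the function name, the shape of the `splits` list, and the checkpoint distances in the usage note are all made up for illustration, and it assumes a constant pace between checkpoints, with minute 0 being 8:00 a.m.

```python
from bisect import bisect_right


def position_km(minute, start_minute, splits):
    """Estimate how far along the course (km) a skier is at a race minute.

    minute       -- minutes since 8:00 a.m. (assumed clock origin)
    start_minute -- minute at which the skier's wave started
    splits       -- hypothetical list of (distance_km, elapsed_minutes)
                    checkpoints in course order, ending with the finish
    Returns None if the skier has not started yet or has already finished.
    """
    elapsed = minute - start_minute
    if elapsed < 0:
        return None  # wave has not started

    # Prepend the start line so every checkpoint has a previous point.
    points = [(0.0, 0.0)] + list(splits)
    times = [t for _, t in points]
    if elapsed > times[-1]:
        return None  # already across the finish line

    # Find the last checkpoint reached at or before this minute.
    i = bisect_right(times, elapsed) - 1
    if i == len(points) - 1:
        return points[-1][0]  # exactly at the finish

    # Constant pace between checkpoints: linear interpolation.
    d0, t0 = points[i]
    d1, t1 = points[i + 1]
    return d0 + (d1 - d0) * (elapsed - t0) / (t1 - t0)
```

With invented splits of `[(10.0, 30.0), (20.0, 80.0)]`, a skier who started at minute 0 is at 5.0 km at minute 15 and at 15.0 km at minute 55. Run that for every skier and every minute from 0 to 600 and you have the big grid described above.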

A major potential enhancement to these data would be to use actual real-time data from skiers as “control” data for the rest. For instance, right now, we are assuming that a skier with a 4:00 per kilometer pace from the start to OO moves exactly 0.25 kilometers per minute. Obviously, that skier moves much more slowly up hills, and much more quickly down, which is why there are backups in areas (namely, the bottoms of hills) in later waves. There are, for instance, about 20 Birkie tracks on Strava. We could correlate these actual data with our modeled times to show how the trail bunches on uphills and spreads out on downhills. That’s our next iteration, I suppose. (For now, I have to crunch the classic data anyway.)

The next step was doing a count, for any given time, of how many skiers were in each portion of the race course. (I arbitrarily divided it into 250m segments. At 4:00 per kilometer a skier would finish the race in about 3:20 and would cover one segment per minute, so this seemed like a perfectly good level of granularity.) Then a count. Done. I now could pull up a chart for any minute of the race to show the state of the race at that point.
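The counting step is simple enough to sketch in a few lines of Python. Again, this is an illustration and not what my spreadsheet does: the function name and the input (a list of positions in kilometers, with None for skiers who are not on the course) are assumptions.

```python
from collections import Counter

SEGMENT_KM = 0.25  # 250m segments, as described above


def segment_counts(positions_km, segment_km=SEGMENT_KM):
    """Count skiers per course segment for a single minute of the race.

    positions_km -- hypothetical list of positions in km; None means the
                    skier has not started or has already finished
    Returns a Counter mapping segment index (0 = first 250m) to skiers.
    """
    counts = Counter()
    for pos in positions_km:
        if pos is None:
            continue  # skier is not on the trail at this minute
        counts[int(pos // segment_km)] += 1
    return counts
```

For example, `segment_counts([0.1, 0.2, 0.3, None, 2.5])` puts two skiers in segment 0, one in segment 1, and one in segment 10. Dividing a count by 250 gives skiers per meter of trail for that segment.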

This was all well and good, but I was not about to manually enter 600 times and export 600 charts. Nosireebob. I posted on this page asking for help, and Neil Bescamper said he might be able to lend a hand. (Many thanks, Neil!) After teaching this old dog some new tricks (in Excel), we waded through enough Visual Basic to export a chart, increment the time by one minute, and run that in a loop. And somehow using the VB formula meant that my computer used four cores instead of one (thanks, Microsoft, for porting a program and giving it a severe handicap; it’s almost like you hobble it so much that we’ll all go out and buy Windows machines), so it only took an hour to export 600 files.
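The export loop itself is not Excel-specific. Here is a minimal sketch of the same "draw a chart, step forward a minute, repeat" loop in Python, swapping in bare-bones SVG bar charts for Excel charts so it needs nothing beyond the standard library. The function name, the input shape, and the file naming are all my own inventions for this example.

```python
from pathlib import Path


def write_frames(counts_by_minute, out_dir, bar_width=4, scale=2):
    """Write one SVG bar chart per race minute and return the file paths.

    counts_by_minute -- hypothetical list indexed by minute; each entry is
                        a list of skier counts, one per course segment
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    # Use one vertical scale for every frame so the animation does not jump.
    tallest = max((max(c, default=0) for c in counts_by_minute), default=0)
    height = tallest * scale + 10

    paths = []
    for minute, counts in enumerate(counts_by_minute):
        # One rectangle per segment, drawn upward from the bottom edge.
        bars = "".join(
            f'<rect x="{i * bar_width}" y="{height - n * scale}" '
            f'width="{bar_width - 1}" height="{n * scale}"/>'
            for i, n in enumerate(counts)
        )
        width = max(len(counts), 1) * bar_width
        svg = (
            f'<svg xmlns="http://www.w3.org/2000/svg" '
            f'width="{width}" height="{height}">{bars}</svg>'
        )
        path = out / f"frame_{minute:03d}.svg"
        path.write_text(svg)
        paths.append(path)
    return paths
```

Zero-padded file names keep the frames in order when they are handed to whatever program stitches them into a movie.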

So I had 600 png files. And iMovie was crashing when I tried to create a movie (to its credit: I need to free up some hard drive space). So I found a quick free program, converted to jpgs, and was on my way. A little music (which might change, although Beethoven’s Seventh is nothing to sneeze at) and we’re on our way!

Fun!

A few things to note. First, the current race does a pretty good job of staying well spread out. Once the initial congestion clears (each wave pretty quickly settles out into a nice bell curve), there is similar congestion, peaking at 30 to 50 skiers per 250m segment, throughout the race. In other words, no one wave winds up being much more crowded than others. That might lend some credence to the strategy of having bigger early waves: stacking the back waves might lead to more crowding.

Second, 50 skiers per 250 meters. Or one skier every 5 meters. That’s kind of cool.

Third, if you look at the Elite and first waves you notice a lot of pack formation. The top Elite pack for men and women breaks free pretty early in the race, and opens up a significant, kilometer-long gap pretty quickly. The poor chase group isn’t within sight by OO or before, given the sinewy trail.

Anyway, I think this is good stuff. We’re working on the Classic race, too, once I have a chance to muck around with those data.