# Birkie Data: What do you want to see?

So, I went to sleep after my bedtime last night. Once I’d assigned “place in wave” numbers to about half the field (a bit more arduous in Excel than it should be, but not arduous enough to merit figuring out how to write a script) without having fully sorted the data, I realized it was time for bed. The numbers are coming and should be posted soon and, well, a lot of the charts will look pretty similar to last year’s. Why? Because when 8000+ people ski a race, there’s only so much variability in the data. In other words, the fourth wave in 2012 looks like the fourth wave in 2011. Which is a good thing.
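For what it's worth, the "place in wave" step can be short as a script. Here is a minimal sketch in Python, not what was actually used for these numbers (that was Excel). It assumes the results are already loaded as `(bib, wave, finish_seconds)` rows; that layout, the function name `place_in_wave`, and the sample rows are all made up for illustration.

```python
from collections import defaultdict


def place_in_wave(results):
    """Return {bib: place within its wave}, where 1 is the fastest in that wave.

    `results` is a list of (bib, wave, finish_seconds) tuples. This layout is
    an assumption for the sketch, not the format of the real results file.
    """
    by_wave = defaultdict(list)
    for bib, wave, seconds in results:
        by_wave[wave].append((seconds, bib))

    places = {}
    for entries in by_wave.values():
        # Sort each wave by finish time; equal times are ordered by bib number.
        for place, (_seconds, bib) in enumerate(sorted(entries), start=1):
            places[bib] = place
    return places


# Made-up sample rows: (bib, wave, finish time in seconds)
sample = [
    (1001, 1, 9300),
    (1002, 1, 9120),
    (4001, 4, 12600),
    (4002, 4, 11900),
    (4003, 4, 13050),
]
print(place_in_wave(sample))
# {1002: 1, 1001: 2, 4002: 1, 4001: 2, 4003: 3}
```

The input does not need to be sorted first, since each wave is sorted on its own.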

In any case, I want to look at some new things this year. So I’ll ask: what statistics do you want to see? A couple ideas include:

• Start position (measured by time to the timing wire) vs finish position: how well do they correlate by wave?
• Can we measure the overall field fitness in a low-snow year?
• Did the different weather dramatically change any finish times?
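
For the first idea, one way to put a number on "how well do they correlate" is an R-squared value (the square of the Pearson correlation) computed separately for each wave. The sketch below assumes rows of `(wave, start_place, finish_place)`; the row layout, the function names, and the sample numbers are hypothetical.

```python
from collections import defaultdict


def r_squared(xs, ys):
    """Square of the Pearson correlation between two equal-length sequences.

    Raises ZeroDivisionError if either sequence is empty or has no variation.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return (sxy * sxy) / (sxx * syy)


def r_squared_by_wave(rows):
    """Return {wave: R-squared of start place vs. finish place}.

    `rows` is a list of (wave, start_place, finish_place) tuples
    (an assumed layout, not the real timing export).
    """
    starts = defaultdict(list)
    finishes = defaultdict(list)
    for wave, start_place, finish_place in rows:
        starts[wave].append(start_place)
        finishes[wave].append(finish_place)
    return {wave: r_squared(starts[wave], finishes[wave]) for wave in sorted(starts)}


# Made-up places: wave 1 shuffles a little, wave 2 finishes in start order.
rows = [
    (1, 1, 1), (1, 2, 3), (1, 3, 2), (1, 4, 5), (1, 5, 4),
    (2, 1, 1), (2, 2, 2), (2, 3, 3), (2, 4, 4),
]
print(r_squared_by_wave(rows))
```

A value near 1 for a wave would mean start order nearly determines finish order there; a lower value would mean more shuffling on course.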

I’ve opened the comments on this post. Please post anything you’d like to see there (and, if possible, how you’d measure it) and I will see if there’s any way of wrangling the data to test your hypothesis.

Thanks!

## 9 thoughts on “Birkie Data: What do you want to see?”

1. It sounds like there was not a significant drop-off in times this year despite the lack of snow in many areas. I am curious whether the later waves had slower times, which could mean that the less serious skiers did not get a chance to get out as much this year. Or is there any way to tell (without doing a ton of work) whether skiers from Minnesota, southern Wisconsin, Iowa, the Dakotas, etc. finished a lot slower than last year in general?

• That’s a good question; I’ll look into it. I do know that beyond place 100 (before place 100, skiers finish in waves and the data for the last two years is similar) the percent back was significantly lower (better) this year than last. I’ll check it out for the whole race. I know that my % back improved from 22% to 20% even though I trained about half the snow miles that I did last winter.
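
For anyone who wants to follow along, "percent back" here is assumed to mean the usual ski-racing measure: how far behind the winner a finish time is, as a percentage of the winner's time. A minimal sketch with made-up times follows; the function names are hypothetical.

```python
from statistics import median


def percent_back(finish_seconds, winner_seconds):
    """Percent behind the winner: 0.0 for the winner, 20.0 for a time 20% slower."""
    return 100.0 * (finish_seconds - winner_seconds) / winner_seconds


def median_percent_back(finish_times, skip_top=0):
    """Median percent back for a field, optionally ignoring the first `skip_top` places.

    The winner is taken to be the fastest time in `finish_times` (an assumption;
    the real comparison may use a different reference time).
    """
    ordered = sorted(finish_times)
    winner = ordered[0]
    return median(percent_back(t, winner) for t in ordered[skip_top:])


# Made-up example: a 2:00:00 winner and a 2:24:00 finisher.
print(percent_back(8640, 7200))  # 20.0
```

Comparing `median_percent_back(times, skip_top=100)` for two years would be one way to check the "beyond place 100" observation, since it sets aside the front of the race.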

2. A couple things come to mind.

How well are skiers seeded? Take skiers this year, and check against your prior two years of data. Make three groups: those with no Birkie-related race results for 2010 and 2011, those with only Kortie-related race results for 2010 or 2011, and those with one or two full-length results from 2010 or 2011. What does the relative spread by wave look like across the three types?

Next, instead of just looking at finish time by wave, also split by bib type. I’m curious as to how the times look by wave for groups such as the founders, Birch Leggings, etc. I think this may be apparent by groups of bib numbers within wave.

Finally, how were the conditions for skate versus classic skiers this year? Should the conversion factor vary by year? Maybe plot percentile times for 5% to 25% back between skate and classic, for each year. Are the slopes different? We know the demographics are not even between skate and classic, and there tend to be anomalies at the top end, which is why using a non-symmetric trimmed approach might be good.
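
One possible reading of this suggestion, sketched in Python: take the finish time at each percentile from 5 to 25 within each technique and look at the classic-to-skate ratio. If the ratios differ by percentile or by year, a single conversion factor would not fit well. The nearest-rank percentile rule, the function names, and the sample times are assumptions for illustration only.

```python
def time_at_percentile(sorted_times, pct):
    """Nearest-rank percentile: the time of the skier at the top `pct` percent.

    `sorted_times` must be sorted fastest first and must not be empty.
    """
    n = len(sorted_times)
    rank = max(1, (pct * n + 99) // 100)  # integer ceiling of pct * n / 100
    return sorted_times[rank - 1]


def conversion_factors(skate_times, classic_times, percentiles=(5, 10, 15, 20, 25)):
    """Return {percentile: classic time / skate time at that percentile}."""
    skate = sorted(skate_times)
    classic = sorted(classic_times)
    return {
        p: time_at_percentile(classic, p) / time_at_percentile(skate, p)
        for p in percentiles
    }


# Made-up fields: 20 skate times one minute apart, classic exactly 20% slower.
skate_times = [7200 + 60 * i for i in range(20)]
classic_times = [t * 6 // 5 for t in skate_times]
print(conversion_factors(skate_times, classic_times))
```

With this made-up data every ratio is 1.2; with real results the interesting part is how much the ratio drifts across percentiles and between years.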

• Hi Ted,

I don’t have the full results for 2010, so I can’t do those sorts of correlations. Maybe next year. The City of Lakes Loppet does run these types of data, so I don’t think I’d be adding much. They claim their seeding is better than the Birkie’s. But their pool is a little smaller.

As far as bib type goes, that was actually some of the first data I published back in 2010. Scroll down to the first couple of charts. I probably won’t run that data for this year because it doesn’t change too much from year to year and I don’t want to inundate people too much with charts!

The skate vs. classic data is quite different this year from last. I mean, the charts look pretty much the same until you look closely. That will definitely be up soon.

3. Anyone want to guess as to how highly correlated places at the timing wire, Timber Trail, OO and Mosquito Brook are to finishing places? (R-squared values.) Guess away in the comments; we’ll have the data up shortly.

4. R^2 = 0.95, 0.97, 0.98, 0.99

In the data request category, how about looking at the % change in pace between the first and second half of the race by finishing time? The top people speed up a fair amount in the second half. It would be interesting to see if later waves do the same.

• Quite a bit lower from the start wire, actually, and even from the other time points. I’ll have the data up in a bit. Good idea on the split changes, but I am going to have to figure out the format of the timing numbers for that. I think I’d divide the race into five or maybe even six categories: Early Elites (top 50 skiers), Late Elites, Wave 1, Waves 2-4, Waves 5-8 and Waves 9-10 (new skiers) to look at these differences. I might even separate out Wave 3, which includes all over-65s. Maybe Early Elites, Late Elites, Waves 1-2, Wave 3, Waves 4-8, Waves 9-10. Thanks for the suggestions.
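
A rough sketch of the split calculation, assuming the timing numbers come as `H:MM:SS` strings and that the distance to the split point is known. The checkpoint distance and race length in the example are placeholders, not measured course distances, and the function names are hypothetical.

```python
def to_seconds(clock):
    """Convert an 'H:MM:SS' (or 'MM:SS') string to whole seconds."""
    seconds = 0
    for part in clock.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds


def pace_change_pct(split_clock, finish_clock, split_km, total_km):
    """Percent change in pace (seconds per km) from the first part to the second.

    Negative means the skier sped up after the split; positive means slowed down.
    """
    split = to_seconds(split_clock)
    finish = to_seconds(finish_clock)
    first_pace = split / split_km
    second_pace = (finish - split) / (total_km - split_km)
    return 100.0 * (second_pace - first_pace) / first_pace


# Placeholder numbers: a split at 25 km of a 50 km race.
print(to_seconds("2:49:36"))  # 10176
print(pace_change_pct("1:00:00", "1:55:00", split_km=25, total_km=50))
```

Averaging this value within each of the categories above would show whether later waves speed up in the second half the way the top skiers do.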

5. Ari, I must disagree with you on your HEED comment. HEED is the best performing sports drink on the market. You may not like the flavor and it was a bit too warm this year, but for even energy (complex carbohydrate of HEED vs. simple sugars in Gatorade) and electrolyte profile, HEED cannot be beat.

6. Who won the Birkie classic? I know this might be beating a dead horse here after Bibgate 2012, but I can’t help it. I’m sure you noticed the 2 guys from wave 2 classic who clearly had faster times than the 3 guys on the podium. Even with a conservative estimate that it took Chamberlain, Vegard and Murray Carter 60 sec to reach the wire, that would have put them at 2:50:15. That’s about 40 seconds behind the Latvian’s time of 2:49:36.

Shouldn’t these guys be getting some credit? If I had the fastest time in the race, I would want to be considered the winner. As far as I know there is no “you have to win the race from wave 1” rule. Let’s give Janis Melbardis and Odd-Aage Bersvendsen some credit.