
The other day I was looking around for a site for a new calibration course near my home and found the necessary criteria very difficult to satisfy. (I do not live in Ohio!) The length of 300 meters was particularly difficult to find.

A length of 300 meters was established when the Jones counter was virtually the only one in use and the limit of readability of this is only one count or 0.05 rev. The recent use of the calibrated rim particularly with spoke intervals improves readability by a factor of ten. (The calibrated rim can be used with the Jones as well as with electronic counters.) I wondered if a shorter calibration course would give satisfactory results with a calibrated rim. Therefore, when I set up my new 300-meter course, I included a 100-meter mark. Calibrations done on my old 400-meter course and the new ones are shown below. (Spoke intervals were read from a 32-spoke wheel and no SCPF was applied. % difference is that from the old course.)

Course (m)   Rev   Ride 1 (si)   Ride 2 (si)   Rev/km   Counts/km   % diff


Starting and stopping form a greater proportion of a 100-meter course than of longer courses, and there is probably a tendency, as in the above test, for the 100-meter course to produce a slightly larger calibration constant. I have an idea that this might be minimized by having the pedal at its highest point for push-off and by avoiding coasting. In my test the constant from the shorter course was greater than that from the others by only about one third of the SCPF. Therefore, in difficult situations the 100-meter calibration might be used.
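
As a rough sketch of the arithmetic behind a rim-based calibration, the Python below converts whole revolutions plus spoke intervals into revolutions per kilometer. It assumes a 32-spoke wheel, as in the test above. The function name and the sample reading are made up for illustration and are not data from the test.

```python
def rev_per_km(whole_revs, spoke_intervals, course_m, spokes=32):
    """Revolutions per kilometer from a rim reading.

    whole_revs      -- complete wheel revolutions counted on the ride
    spoke_intervals -- extra spoke intervals past the last whole revolution
    course_m        -- calibration course length in meters
    spokes          -- number of spokes on the wheel (assumed 32 here)
    """
    revs = whole_revs + spoke_intervals / spokes
    return revs * 1000.0 / course_m

# Hypothetical reading, NOT data from the test:
# 46 whole revolutions plus 20 spoke intervals on a 100 m course.
print(rev_per_km(46, 20, 100.0))  # 466.25
```

On a 100-meter course one whole spoke interval on a 32-spoke wheel is worth about 0.31 rev/km, which is why interpolating between spokes matters on short courses.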

Shorter Calibration Courses

Older measurers will remember the 1970s and early 1980s, when the minimum length of a calibration course was either a kilometer or ½ mile. As this is a daunting length to lay out, few calibration courses existed. People would drive many miles to and from a calibration course if they could not find one close to the race course.

In October 1985, in Measurement News #13 I asked people to help with an experiment. The purpose of the experiment was to get a handle on the magnitude of “startup wobble” which was thought to affect calibration. Wayne Nicoll, Pete Riegel, Tom McBrayer, Bob Baumel, David Reik, Tom Knight, Kevin Lucas and Paul Christensen responded with data.

Each person was asked to make 5 rides of a calibration course, as follows:

First ride – no stops
Second Ride – 1 intermediate stop
Third Ride – 3 intermediate Stops
Fourth Ride – 5 intermediate stops
Fifth Ride – no stops.

They were asked to do this twice. Counts were recorded only on completion of the full ride. No intermediate counts were recorded.

Results were published in February 1986 Measurement News #15 and are seen below.

To my surprise, the effect of startup wobble was not one-sided. Instead, as calibration length got shorter, scatter increased in both directions. After some discussion among participants and RRTC people, 300 meters or 1000 feet was taken as the new minimum length for a US calibration course.

Original data is in an Excel file. Anyone who wants to try his hand at analysis need only ask me.

I don't think shortening the minimum even further is a good idea.
Last edited by peteriegel

Since there doesn't seem to be any bias in the scatter in your experiment, another possibility would be to use a shorter calibration course but do more calibration rides that get averaged.

Of course the scatter in that experiment is from different riders, not from multiple rides from the same rider.

Might be worth another experiment though.
> Might be worth another experiment though.

Well it took a while but I finally got around to doing that other experiment.

I measured a 300-meter calibration course and included marks at the 100-meter and 200-meter locations. At each of the marks (100m, 200m, 300m) I put 4 pieces of tape just before the mark and 4 pieces of tape just after the mark at random locations. I then measured the distance of each piece of tape from the mark (100m, 200m, 300m) close by. Here's what it all looked like at the 200m mark.

The cones are at the 1st tape piece, the 200m mark, and the last tape piece, just to help me identify everything as I approach.

Now that the course was set up I began my calibration rides. The plan was to ride to each of the 24 tape pieces once, and ride back from each of the tape pieces once, for a total of 48 rides. 16 of those rides would be close to 100m, 16 close to 200m, and 16 close to 300m.

The rides were done in a mostly random order. I would ride to a random piece of tape and would return from one of the other 7 pieces of tape near that mark. This random ordering was all set up ahead of time.

This may seem like an awful lot of trouble to go through, but setting it up this way allowed me to have 3 things in the experiment that would not have been possible otherwise:

1) A fairly large number of trials, 48.

2) Completely blind observations. As I was taking data the numbers had no significance to me, which would not have been the case if I simply repeated 100, 200, and 300m rides.

3) Compensation for the time factor. Because the trials were random, I was able to factor out the effect of time (increasing temperature) in the analysis.
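
For anyone who wants to set up a similar blind, randomized ride order, here is a minimal sketch in Python. It is not the procedure actually used for this experiment, only one way to get the same properties: every tape is ridden to once and ridden back from once, and no return starts from the tape just arrived at. The tape labels 0 to 7 and the function name are made up for illustration.

```python
import random

def ride_plan(marks=(100, 200, 300), tapes_per_mark=8, seed=None):
    """Return a shuffled list of (mark_m, tape_out, tape_back) round trips.

    Each tape near a mark is ridden to exactly once and ridden back from
    exactly once, and no round trip returns from the tape it arrived at.
    """
    if tapes_per_mark < 2:
        raise ValueError("need at least 2 tapes per mark")
    rng = random.Random(seed)
    trips = []
    for mark in marks:
        out_order = list(range(tapes_per_mark))
        rng.shuffle(out_order)
        back_order = out_order[:]
        # Reshuffle until every return tape differs from the arrival tape
        while any(o == b for o, b in zip(out_order, back_order)):
            rng.shuffle(back_order)
        trips.extend((mark, o, b) for o, b in zip(out_order, back_order))
    rng.shuffle(trips)  # mix the three distances together
    return trips

plan = ride_plan(seed=1)
print(len(plan) * 2, "rides")  # 24 round trips = 48 rides
```

Printing the plan ahead of time and carrying it on the bike keeps the rider from choosing the order on the spot.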

The charts below show the PRELIMINARY results. I have sent my data to a statistician friend, and I'm sure the results will change somewhat after she gets a hold of them.

This chart shows the calibration constants calculated from each of the 48 rides. There is significantly more scatter in the shorter calibration rides.

This chart shows the means of the cal constants calculated with the rides at the three distances. It can be seen that the differences are very small, on the order of 1 count/km. I'm guessing these differences are statistically insignificant. For this experiment at least, there was little or no "wobble" effect.

This chart shows the 1-sigma(~67%) and 3-sigma(~97%) errors. For example, you have a 97% chance of your error being less than 0.029% if you ride a 300m calibration course 4 times and average the results. This error drops to 0.02% if you ride it 8 times and average the results.
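
The drop from 4 rides to 8 rides is about what you would expect if the rides are independent and the error of an average shrinks as one over the square root of the number of rides. The sketch below checks that against the numbers quoted above. The independence assumption and the function name are mine, not something read off the chart.

```python
import math

def scaled_error(error_at_n0, n0, n):
    """Scale the error of an n0-ride average to an n-ride average,
    assuming independent rides with equal scatter (error ~ 1/sqrt(n))."""
    return error_at_n0 * math.sqrt(n0 / n)

# 0.029% at 4 rides, scaled to 8 rides
print(round(scaled_error(0.029, 4, 8), 4))  # 0.0205
```

By the same rule, cutting the error in half takes four times as many rides.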

Because the means of the cal constants from the three course distances are nearly the same, I'm guessing the scatter in the individual readings is caused mostly by the fact that you can't really discriminate less than 1/2 count on the Jones, which translates to 5 counts/km for a 100m cal course. Perhaps the use of the marked-up rim, as Neville suggests, for calibration rides might reduce this scatter.

The experiments conducted to date seem to lead to the ideas that:

1) The constant obtained on a short calibration course will be, on the average, about the same as that obtained on a longer one, and,

2) The variability of calibration gets larger as the calibration course gets shorter. A one-count difference on a 100 meter calibration course will have three times the effect on the constant as it would on a 300 meter calibration course.
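
The scaling in point 2 is simple arithmetic, sketched here in Python. The function name is made up for illustration; the half-count reading error is the Jones counter figure mentioned earlier in the thread.

```python
def reading_error_counts_per_km(reading_error_counts, course_m):
    """Effect of a fixed reading error (in counts) on the constant (counts/km)."""
    return reading_error_counts * 1000.0 / course_m

for length_m in (100, 200, 300):
    # Half-count reading error on each course length
    print(length_m, round(reading_error_counts_per_km(0.5, length_m), 2))
# 100 5.0
# 200 2.5
# 300 1.67
```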

While further shortening the length of the calibration course would probably have little effect on the average accuracy of the courses produced, it would increase the variability of their lengths. We would see more deviation from the average, resulting in more courses failing validation and more that are extra long.

When the length of the calibration course was originally reduced from about 1 km to 300 meters, it was in response to a genuine problem. The labor to lay out a kilometer was discouraging. Moreover, straight kilometers were hard to find. Changing to 300 meters solved this.

There is not much layout labor saving from 300 meters to 100 meters. The remaining benefit seems to be that 100 meters is more readily available than 300 meters. In the two decades that have passed since the original change was made I had not heard that it was hard to locate a suitable 300 meters, until Neville brought it up.

The proposed change would have an adverse effect on course accuracy without a great benefit to the ease of measurement. I don’t believe the benefit is worth the cost.
Success in finding a calibration course that meets the flat, straight, paved, 300m criteria depends a lot on where you live. Here in Michigan it is sometimes easy and sometimes not. In the town where I grew up in West Virginia, I doubt any such stretch of road exists!

If a measurer is faced with this issue, he has 3 choices:
1) Use a shorter course
2) Use a course that includes some hill
3) Use a remote course

And the question is, which poison is the one to pick? Personally, I will be wary of choosing #3.

In my experiment, the temperature increased 4F during the 90 minutes it took to do my 48 rides, although the sun was out so the pavement temp probably increased more (it was about half shaded). Here's a plot of the cal constants of all rides as a function of time.

The red line is a least-squares fit and shows that the cal constant changed about 4 counts (0.033%) every 30 minutes. Even if I live close by, I cannot drive to the course, drive the course (which I always do), warm my bike back up, and get started on my measurement in less than 30 minutes. If it ends up taking 60 minutes, the resulting error is 2/3 of the SCPF!
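
Here is the arithmetic behind that last figure as a short Python sketch. It assumes the drift stays linear at the fitted rate, which is only an approximation, and it takes the SCPF as 0.1%. The function name is made up for illustration.

```python
SCPF_PCT = 0.1  # short course prevention factor, 0.1% of course length

def drift_error_pct(elapsed_min, drift_pct_per_30min=0.033):
    """Calibration drift in percent, assuming it stays linear in time."""
    return drift_pct_per_30min * elapsed_min / 30.0

err = drift_error_pct(60)
print(round(err, 3), round(err / SCPF_PCT, 2))  # 0.066 0.66
```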

How much error does a cal course with a hill cause? I don't know. I plan to do another experiment to find out.

The book’s requirement that a calibration course must be flat has been ignored for years. The question of “how flat is flat?” has never been addressed.

Over the same distance and the same riding posture, you will obtain fewer counts riding uphill than downhill, because your weight shifts to the rear, slightly unloading the front wheel. The reverse is true riding downhill.

The effect is the same when wind is present. A headwind produces fewer counts than does a tailwind.

The general sense, I believe, is that if you do two rides in each direction the differences will even out. This remains unproven, but seems plausible. How steep a hill would be needed to make a difference is not known.

I believe we should eliminate or modify the “flat” requirement or reduce it to a “recommendation.” Most people will use a flat course anyway. I believe that this requirement, in practice, has not been enforced at any time.

One way to see how calibration courses vary is to download some cal course maps, and use Google Earth to check them for flatness.

It does my heart good to see Mark and Neville doing some actual experimentation. We spend a lot of time discussing rules rather than measurement science.

In 2003 seven riders gathered in Birmingham, Alabama, to measure the course of the US Men’s Olympic Marathon Trials. The start was east of town at a higher elevation than was the main part of the course, separated from the finish by about 15 km. We wanted a calibration course at each end, one near the start and one near the finish.

The finish was downtown and a flat 300 meters was laid out.

At the start, everything was either high-traffic or had a hill. We elected to use a little-traveled straight side road that had a hill. It was later certified as AL03038BDC.

Checking with Google Earth gives a rise/drop of about 11 meters (a slope of about ±36 m/km) over the 300 meters of the course.

Precal was at 7:30 AM at 77F. The seven riders averaged 3347.66 counts on the 300 meters (11158.9 counts/km). The average downhill ride was 1.54 counts greater than the average uphill ride, or 5.13 counts per kilometer.

Postcal was at 10:15 AM, at 89F. The seven riders averaged 3345.93 counts on the flat 300 meters (11153.10 counts/km). The rides going east differed from the rides going west by 0.28 counts, or 0.93 counts per kilometer.
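
The per-kilometer figures above follow directly from the 300-meter averages. A small Python check, with a made-up function name, reproduces them:

```python
def per_km(counts, course_m=300.0):
    """Convert counts over a calibration course to counts per kilometer."""
    return counts * 1000.0 / course_m

print(round(per_km(3347.66), 1))  # hilly precal average: 11158.9
print(round(per_km(1.54), 2))     # downhill minus uphill: 5.13
print(round(per_km(3345.93), 1))  # flat postcal average: 11153.1
print(round(per_km(0.28), 2))     # east minus west: 0.93
```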

The measurement file is available in Excel for the asking, as is the complete measurement report. Contact me if you want either.
Last edited by peteriegel
I just got back from a trip to Iowa and Michigan and envied measurers there for the grid pattern of roads that results in long, straight stretches. You could no doubt lay out cal. courses that are 3 miles long if you wanted.

Here in New England, it's tougher to find a straight 300m, and once you get away from the shoreline, one that is flat as well. I would say a third of the cal courses I've laid out have some sort of elevation change. My "home course" in fact has a rise in the middle, so the end points are probably about the same elevation, but I gain and lose about 3m on each ride. I feel it mirrors the terrain I usually measure on, and let it go at that.
The cal course in front of my house is a little short of 400m and rises about 5m, most of that in about 150m at one end. The counts going in one direction will vary by 1 or 2 counts from the counts in the other direction.

One day I was calibrating and the counts in the two directions were consistently different by 5 counts. What the heck! I then remembered that I didn't check my tire pressure. It was only about 40psi instead of my usual 55-60. After I pumped up to 55, the difference was back to 1-2 counts.

I think there is a good bit of individual experience to suggest that riding a cal course in the uphill direction will give you a different count than riding it in the downhill direction. But if you ride in both directions and average the results, will it give you the same count as a flat course? My instincts tell me yes, it will, but data is always better than instincts, even when they tell you the same thing.

The interesting thing about this is that if we do find that it causes a bias, the result would be that you should use a hilly cal course if you are going to measure a hilly race course, and a flat cal course if you are going to measure a flat race course. Kind of what Jim was saying, but not the current recommended practice.
Success in finding a calibration course

Mark, your worries would be at an end if you were to adopt the pressure-monitoring method.

All you would ever need is one calibration course. This could be laid out in an ideal location near your home and you could afford to take great pains to get it perfectly right. (For instance, by choosing a cloudy day when the temperature is close to 68 deg.)

After calibrating at the start of measurement, you would not have to worry about distance to the race course, air leaks, bike warm-up, wet-road calibration, temperature changes, or postcalibration. You would always know the precise calibration factor at any instant by simply reading the pressure gauge.
I decided to take Neville's advice about using wheel revolution counts rather than Jones counts during calibration. I already had my wheel marked at every 1/20th of a revolution, so it was a fairly simple matter to include additional marks between those to give me marks at every 1/100th of a revolution.

With the Jones counter the best accuracy I could hope to get was 1/2 count, or about 1/50th of a revolution, and I believe that is what was causing much of the scatter in my calibration experiment. With the marked-up rim I believe I can read to the nearest 1/200th of a revolution, and this should eliminate much of that scatter.

In order to test that theory I repeated my calibration experiment with the marked up rim. After compensating for the increasing temperature, as I described in my previous post, the results of the 48 rides are shown below.

The chart below shows the resulting average calibration constants for the three different length cal courses.

Recall that in the previous experiment the mean for the 100m cal course was also smaller, although it was by a very small amount.

In my mind there are only two plausible explanations for the 100m course giving the smallest cal constant:

1) The difference is extremely small (less than 1 count on a 300m cal course) and isn't real. It is only due to the fact that even in these experiments we don't have enough data.

2) With the 100m cal course you can actually see the mark in the road at the end, so it is easy to follow the same path as the steel tape. With the 200m and 300m courses it is much more difficult to follow the shortest route to the end mark. If this is true it would mean that using a 100m cal course is actually more accurate! I'm not saying that is the case, but if the 100m really does give the smallest cal constant, I see no other explanation for it. Remember that any "wobble effect" would give the opposite result, the 100m giving a larger cal constant.

The chart below shows the 97% probability error for the three different cal course lengths using 4, 6, and 8 calibration rides. It shows this data both for the case where the rider uses Jones counts to record his data and where he uses a marked up rim. For example, if a rider does 4 300m cal rides and records data using his Jones counter, he has a 97% chance that his cal constant will be less than 0.028% away from the 16-ride average (yellow column).

The key thing to notice here is that the use of the marked up rim dramatically reduced the scatter that was seen in the 100m cal rides.

In fact, if we compare the conventional calibration method (4 300m rides using a Jones counter, yellow column) with the marked up rim method (4 100m rides using a marked up rim, light blue column) we see that they have nearly identical error.
What a fascinating discussion, sorry I missed it when it was going on. Kudos to all for your input.
I just returned from measuring 3 courses in Bermuda. I have scoured the island looking for suitable calibration courses, and 250 meters is as long as I have been able to find. I believe there are longer possibilities over by the airport but they would be really inconvenient and I would rather calibrate frequently. I think many places in the US face the same kind of limitation.
I always keep track of spokes when I calibrate (haven't yet converted to decimal rim markings), and by interpolating between spokes I have a precision of 1/360 of a revolution. This is enough to know whether my calibration results are consistent or not. When I notice the differences aren't as consistent as I'd like, I sometimes make an on-the-spot decision to do some more rides. Not very scientific but I think it works pretty well.
Bob Thurston
I also missed this discussion as it was taking place. But now that I've found it, I thought it interesting to bring up still earlier experiments, prior to the 1986 work that Pete discussed. In particular, I did an experiment in April 1983, which was reported in Measurement News #4 (May 1983). Pete summarized my results as follows:

In contrast to the bias illustrated in Mark Neal's results (smaller constant on the smaller cal course), mine showed the opposite bias, which is exactly what you'd expect if there's some wobble in starting and/or stopping the bike. In fact, as I still have a copy of the complete 11 page report that I sent Pete, here is page 5, containing my analysis of the "Calibration-Wobble" effect:

As for my methodology in that experiment, I'd wondered whether I might have supplemented the Jones counter readings with spoke counting (which I sometimes did when using short cal courses). It turns out that I didn't do any spoke counting; I used only the Jones counter, but tried to read it as precisely as possible. I wrote on page 2 of my report: "I always reset the counter to a multiple of 1000 counts before starting the bike... And I read the counter to as small a fraction of a count as possible (often to 1/4 of a count)."

As we know, the minimum cal course length prior to 1987 was 800 meters. Based on my 1983 experiments, I suggested it be reduced to 300 m (as Pete also suggested), partly due to the resolution of the Jones Counter, but also because, due to the theoretical Calibration-Wobble effect, constants would become increasingly inaccurate for cal courses much shorter than 300 m.
Last edited by bobbaumel

Yes, I did multiple rides at each distance. For you, and anyone else interested, I have scanned my complete 1983 report, and you can download it from (this is 12 pages long, and the PDF file is 1.1 MB). My experimental protocol is described on page 2; my raw data displayed on page 3, and a summary of the data (including the numbers plotted) appears on page 4.

In addition to performing this 1983 experiment, I was one of the participants in Pete's 1985 experiment, and in that case, I estimated my "wobble distance" as only around 3 cm, or about half what I'd estimated from my 1983 experiment. Some of this may be random, but some may be explainable by the different experimental protocols. In my 1983 experiment, every ride on a short course was treated as totally independent (every time I stopped the bike, I wrote down counter readings and then spun the wheel to reset the counter to a round number before restarting). In Pete's 1985 experiment, the stops were much briefer: No data was recorded at any of the intermediate points (and there was definitely no resetting of the counter); we just restarted immediately after stopping. It's possible that in this situation, cyclists don't wobble as much when restarting.

It's also interesting that some cyclists (including you, Mark) appear to exhibit a negative wobble effect (smaller constant with decreasing cal course length). How can this be explained? I had assumed that the dominant effect would be wobbling when starting and/or stopping. But perhaps some cyclists proceed for a significant distance when starting and stopping without their full weight over the bike, and maybe in some cases, this dominates the wobble effect. Moving the bike without your full weight over it results in a larger effective wheel circumference; thus, fewer counts.
With increasing precision, perfection from the measurer, reading a Jones counter to 1/4 of a count, a well-practiced rider who wobbles least, fair weather with no crosswinds, and a perfect alignment of the planets, I think you can use much shorter courses.

The point of having a universal method for measuring is that it can be executed by an inexperienced measurer, under less than ideal conditions, using the device to the resolution of the device and not beyond.

We are aware of the unpreventable start and finish wobbles, the whiplash when backing up a Jones counter, and the number of other errors that an inexperienced measurer may miss.

Having a calibration course that is longer than mathematically necessary under perfect conditions helps eliminate a lot of the end errors and captures more of the middle counts, which are hopefully similar to what the rider will do on the real course.

I think there is also a difference in riding when balancing down a 300m course, trying to get started right and then immediately switching to finishing right, rather than doing some heads up riding as one expects to do on the real course.

It takes me a couple of rides of my 1/2 mile cal course to warm up my bike, my tires, and my muscles before the counts start to be consistently repeatable. Maybe this is just my inexperience. If I were working on a 300 meter course I don't think I would ever get to the same stable state as when I am riding a race course.

Now it helps that I am not traveling to do measurements and have the advantage of a 1/2 mile totally straight and flat road. Obviously the cal course does not have to be 1/2 mile long, but at that length I feel very confident that any minor errors, like counter whiplash and bobbles at the start and finish, are multiplied only about 6 times when scaling up to a 5K course.

There is a direct mathematical relationship between the length of the cal course and the final race course. Whatever that ratio is, any errors introduced in riding the cal course will be multiplied by that factor. The shorter the cal course, the more times the error will be multiplied. Yes, the percentage of error stays the same, but it's the start, stop, whiplash, and rounding error that is minimized with a longer cal course. Taken to the logical limit, the best cal course is the same length as the target distance.

Even though I run on very hard high-pressure tires, 100 psi, and warm up myself and the bike before starting on the cal course or the race course, there is still a difference between pre- and post-calibrations. Even very minor changes in the weather can make a measurable difference. I can detect the variations because my cal course is long enough for the variations to have an effect. With a much shorter course the changes would be masked by rounding errors.

Remember, we are trying to perfect a method that, while not foolproof, is going to allow an inexperienced person who follows the method to arrive at a course that will pass verification.

We are under attack from people who want to replace our measurement system with GPS or other ideas. It is not good when the 'Method' we use, in the hands of someone who is not a high priest, leads to compounding errors and brings the methodology into question. It is the same reason we don't allow a measurement done with a yardstick.

I understand the math and sound logic for your argument for a shorter course, but you are working from data derived by a very experienced and careful measurer who is taking extraordinary care to measure and record to a higher precision than the Jones counter reads.

The methodology must be robust, and must be robust in the hands of an inexperienced operator who is following the instructions for the first time. For this I think you don't want the shortest cal course possible.

Maybe we need some stats from first-time measurers, who are doing it on their own for the first time, without the supervision of an old hand who teaches them tricks and points out their errors.
JamesM has a good idea hidden in his post, which I think could be added to the measurer's manual, which first-time measurers should be reading. That point should be that when conducting your calibration rides, you should ride the cal course at least twice in each direction, and until your counts in each direction are within 2 clicks (ideally one click) of the other ride in the same direction.

I believe this will encourage newbies to pay attention to wobble, and will let them know what an acceptable ride-to-ride variation is. Currently, the only instruction is to ride twice in each direction, then average the counts. It says nothing of accuracy and repeatability.
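
To make the suggested rule concrete, here is a minimal sketch in Python of the check a new measurer could run on their own numbers. The 2-count tolerance is the one proposed above. The function names and the sample counts are made up for illustration and are not from any manual.

```python
def rides_agree(counts, tolerance=2.0):
    """True if all rides in one direction fall within `tolerance` counts."""
    return max(counts) - min(counts) <= tolerance

def needs_more_rides(out_counts, back_counts, tolerance=2.0):
    """True if the proposed rule says to keep riding.

    Requires at least two rides in each direction, with the rides in each
    direction agreeing to within `tolerance` counts of each other.
    """
    if len(out_counts) < 2 or len(back_counts) < 2:
        return True
    return not (rides_agree(out_counts, tolerance)
                and rides_agree(back_counts, tolerance))

# Hypothetical counts on a 300 m course
print(needs_more_rides([3347.0, 3348.5], [3346.0, 3346.5]))  # False
print(needs_more_rides([3347.0, 3351.0], [3346.0, 3346.5]))  # True
```

Note that the check compares rides in the same direction only; a steady difference between the two directions, as on a course with a hill, does not trigger it.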

I do, however, disagree with JamesM that 300 m is too short to avoid wobble. I think 300 m is a viable length. I compare my cal counts from multiple rides, and they are very consistent.
