Clock drift

29.02.2012 19:58

When I was processing the raw measurement results from the Munich experiment, I noticed that the absolute timestamps on spectrograms recorded by the two VESNA nodes differed by a seemingly random amount, anywhere from zero to three seconds, and required manual alignment.

The experimental setup was not expected to give a very precise time reference: each VESNA has its own free-running real-time clock, stabilized by a 32.768 kHz quartz crystal, which provided time relative to the start of the measurement. Absolute time, on the other hand, was provided by the two laptops running Linux to which the sensor nodes were sending their data. So, all in all, the accuracy of our timestamps depended on four quartz clocks.

Nevertheless, this result surprised me. I expected these four clocks to drift apart over the three hours of measurements, but that drift should be uniform. I can't explain what could cause the difference between the clocks to change randomly from one experiment to the next.

To rule out any problems on VESNA I rigged up a simple test where I compared the difference between VESNA's real-time clock and the clock provided by the Linux kernel with a running NTP daemon. This is the result of four test runs with two nodes that we were using in Munich:

Results of a clock drift test for two VESNA nodes.

As you can see, the clocks do drift apart and do so more or less linearly (the nodes were turned off for the night before each test to also see whether warm-up affected the drift). Node 117 drifts at around 3 ppm and node 113 at around 9 ppm. This is actually quite bad, since even node 117 would gain a second roughly every four days, even allowing for the fact that the reference here was a laptop that probably isn't a shining example of accuracy either.
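To make the ppm figures more concrete, here is a minimal Python sketch of how a drift rate could be estimated from such a test and converted into "days per second of error". It is not the script I used for the graph above: the function names and the synthetic data are made up for illustration, and it assumes the drift is constant over the run.

```python
def drift_ppm(ref_seconds, offsets_seconds):
    """Least-squares slope of clock offset versus reference time, in ppm.

    ref_seconds: sample times according to the reference clock
    offsets_seconds: (tested clock - reference clock) at those times
    """
    n = len(ref_seconds)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(ref_seconds) / n
    mean_o = sum(offsets_seconds) / n
    num = sum((t - mean_t) * (o - mean_o)
              for t, o in zip(ref_seconds, offsets_seconds))
    den = sum((t - mean_t) ** 2 for t in ref_seconds)
    if den == 0:
        raise ValueError("reference times must not all be equal")
    # Slope is dimensionless (seconds per second); scale it to ppm.
    return num / den * 1e6


def days_per_second_of_error(ppm):
    """Days until a clock with a constant drift of `ppm` is off by one second."""
    return 1.0 / (abs(ppm) * 1e-6) / 86400.0


# Synthetic data (not measurements): one sample per minute for three hours,
# with the tested clock running 3 ppm fast.
times = [60.0 * i for i in range(181)]
offsets = [3e-6 * t for t in times]

print(round(drift_ppm(times, offsets), 3))
print(round(days_per_second_of_error(3), 2))
print(round(days_per_second_of_error(9), 2))
```

At 3 ppm this works out to about 3.9 days per second of error, and at 9 ppm to about 1.3 days.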

The only weird thing is the bump on the graph for node 113, where the drift was positive for around 15 minutes and then turned back to negative. I could not reproduce it in any of the later runs, and it might be due to NTP adjusting the laptop's clock. But even that can't explain the deviations observed in Munich, which are more than ten times what we see here.

While this is of course not conclusive, it does give me some confidence that it might be the PC software that was causing these problems.

Posted by Tomaž | Categories: Digital

Comments

With a 32.768 kHz quartz that has a shunt capacitance of 1 pF, a motional capacitance of 2 fF and a load capacitance of 12.5 pF (you are probably using a Pierce oscillator), the series and parallel resonances are 16.4 Hz apart, which is 500 ppm.

The oscillation frequency of a Pierce oscillator lies between the series and the parallel resonance and depends on the load capacitance. With infinite load capacitance the oscillation frequency lies at the series resonance, while with 0 pF load the frequency equals that of the parallel resonance. Only with the load specified in the datasheet is the frequency exactly 32768 Hz. Now I did my math and found that a 10% error in load capacitance results in roughly 1.5 ppm of frequency error. And variation in load capacitance is caused not only by load capacitor tolerances, but also by stray capacitance, which easily accounts for an extra pF (more than a 10% error).

If one board has a +1.5 ppm error and the other one has a -1.5 ppm error ... there's your 3 ppm. Then again there's ageing (a few ppm per year), calibration tolerance (i.e. how exact those 32768 Hz are at 12.5 pF load) of around 10 or 20 ppm, ... You are walking on thin ice here with ppms. Quartz is not almighty.

Posted by AB

AB, thank you for your very informative comment. I had already learned with my binary clock project that you need a trimmer capacitor to fine-tune the frequency if you want to keep accurate wall clock time over longer periods.

With the microcontroller it's possible to compensate for this drift in software, but here it was well within acceptable limits for such a short experiment. With this test I was only trying to find out if there are any random irregularities in the clock's frequency.
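As a rough illustration of what such software compensation amounts to, here is a minimal Python sketch. The function name is made up, and it assumes the drift rate has been measured beforehand, stays constant, and is positive when the clock runs fast.

```python
def compensate(raw_seconds, drift_ppm):
    """Correct an elapsed-time reading from a clock that runs fast by drift_ppm.

    A clock that is fast by d (as a fraction) reads true * (1 + d),
    so the true elapsed time is the reading divided by (1 + d).
    """
    return raw_seconds / (1.0 + drift_ppm * 1e-6)


# Three hours of true time as seen by a clock that runs 9 ppm fast
# (the sign of the drift is an assumption made for this example).
true_elapsed = 3 * 3600.0
raw_reading = true_elapsed * (1.0 + 9e-6)

print(round(raw_reading - true_elapsed, 4))   # uncorrected error in seconds
print(round(compensate(raw_reading, 9.0), 4))
```

Uncorrected, 9 ppm over a three-hour run amounts to roughly 0.1 seconds.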

Posted by Tomaž

NTP does not set/change the clock. It only slows it down, or makes it go faster.

Changing the clock forwards would create a problem with (for example) scheduled jobs, and changing the clock backwards would create problems with transactions and logging (and many many other things).

Actually, as I found out one unfortunate time, ntpd will do step changes (forward or backward) in the clock if it thinks the clock has drifted so far that just slowing it down or speeding it up would take too long to correct for the error. This is perhaps not stressed enough in the documentation (see -x option).
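To get a feeling for why it steps, here is a small Python sketch of the trade-off. It is a simplification, not ntpd's actual algorithm: it assumes the default step threshold of 128 ms and a maximum slew rate of 500 ppm (0.5 ms per second), and it ignores the time an offset must persist before ntpd acts on it. The function names are made up.

```python
STEP_THRESHOLD_S = 0.128   # assumed ntpd default step threshold
MAX_SLEW_PPM = 500.0       # assumed maximum slew rate, 0.5 ms per second


def min_slew_time(offset_s, max_slew_ppm=MAX_SLEW_PPM):
    """Shortest time in seconds needed to remove offset_s by slewing only."""
    return abs(offset_s) / (max_slew_ppm * 1e-6)


def would_step(offset_s, threshold_s=STEP_THRESHOLD_S):
    """True if the offset exceeds the step threshold in this simplified model."""
    return abs(offset_s) > threshold_s


for offset in (0.05, 1.0, 3.0):
    print(offset, would_step(offset), round(min_slew_time(offset)))
```

Under these assumptions, every second of offset takes at least 2000 seconds to slew away.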

Posted by Tomaž

Correct me if I am wrong, but the positive slope in the red trace corresponds to a drift of +56 ppm. This means that the oscillation frequency is probably still between the series and parallel resonances, which are 500 ppm apart. The bump could be explained by a change in capacitance. A bad solder joint would greatly decrease the load capacitance and move the oscillation frequency toward the parallel resonance (speeding up the quartz and thus causing positive drift).

Wouldn't step changes by the NTP daemon manifest themselves as discrete jumps in the plot?

Posted by AB
