[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: non-monotonic do_gettimeofday?



On Friday 06 September 2002 15.17, Akshay Adhikari wrote:
> here is a modified version of do_gettimeoffset and do_gettimeofday in
> time.c v.1.11, which seems to have drastically reduced the number of time
> warps, possibly because it accounts for unexcecuted timer bottom halves...

Yes, you're right - we didn't handle that, but you missed one case I think,
see below:

> (assumption is that HZ is 100, but that can be changed easily..)

To be clear you can write (1000000/HZ), the compiler will optimise it
so no division will take place at runtime anyway.

> static unsigned long do_slow_gettimeoffset(void)
> {
>         unsigned long count;
>         unsigned long usec_count = 0;
>
>         static unsigned long count_p = LATCH;    /* for the first call
> after boot */
>         static unsigned long jiffies_p = 0;
>
>         unsigned long jiffies_t;
>
>         count = *R_TIMER0_DATA;
>         jiffies_t = jiffies;
>
>         usec_count = cris_timer0_value_us[count];
>         if( jiffies_t == jiffies_p ) {
>                 if( count > count_p ) {
>                         /*printk(KERN_WARNING "warp %ld to %ld\n",
> count_p, count);*/
>                         usec_count += (jiffies_t-wall_jiffies+1)*10000;
>                 }

Here you misses one case were wall_jiffies isn't handled.

>         }
>
>         else
>                 usec_count+= (jiffies_t-wall_jiffies)*10000;

I suggest changing the intire if statement to the following and
allways take care of the wall_jiffies:

         if( jiffies_t == jiffies_p ) {
                 if( count > count_p ) {
                         /*printk(KERN_WARNING "warp %ld to %ld\n",
		 count_p, count);*/
                         usec_count += (1000000/HZ);
                 }
         }

         usec_count+= (jiffies_t-wall_jiffies)*(1000000/HZ);

>         jiffies_p = jiffies_t;
>         count_p = count;
>         return usec_count;
> }
>
> do_gettimeofday is changed to what you had suggested earlier...
> void do_gettimeofday(struct timeval *tv)
> {
>         unsigned long flags;
>
>         save_flags(flags);
>         cli();
>         *tv = xtime;
>         tv->tv_usec += do_gettimeoffset();
>         restore_flags(flags);
>         while (tv->tv_usec >= 1000000) {
>                 tv->tv_usec -= 1000000;
>                 tv->tv_sec++;
>         }
> }
>
> There are still rare occasions where there are time warps, meaning that
> this piece of code may not be correct, but it does seem to be working
> better than the original version... maybe something similar will work for
> the 2.4.19 version? comments / suggestions on why this does not always
> work are welcome....
>
> Thanks,
> akshay

Best regards
/Johan


> On Thu, 5 Sep 2002 johan.adolfsson@xxxxxxx.com wrote:
> > Quoting Akshay Adhikari <akshay@xxxxxxx.com>:
> > > hi,
> > > another couple of questions:
> > >
> > > I have only tried patching do_gettimeofday in v 1.11 as you had
> > > suggested
> > > in your previous mail, I havent completely replaced time.c with the
> > > 2.4.19
> > > version (v 1.14), and the problem of non-monotonic timestamps
> > > persists.
> > >
> > >
> > > In the 2.4.14 version of time.c (v 1.11), the code in do_getttimeofset
> > > refers to two (three according to the
> > > comments surrounding it??) -
> > >
> > > 1.the timer wraps around:
> > > the code
> > > if(jiffies_t == jiffies_p)
> > > {
> > > 	if (count_ > count_p)
> > > }
> > >
> > > tries to takes care of that. Under what conditions does this happen?
> > > is it when we have missed exactly one timer interrupt
> >
> > Not missed, but the timer regiser has wrapped but the timer interrupt
> > hasn't executed and updated jiffies yet.
> >
> > > if so, then instead of adding 1000000/CLOCK_TICK_RATE/2 (assuming
> > > the 19.2kHz timer0, this means roughly 25us...) to the usec according
> > > to
> > > the cris_timer_value0 lookup table , shouldnt we add the lookup
> > > valur for count_p?
> > >
> > > what am I missing here?
> >
> > The current code increases the time with one jiffie instead
> > and uses the new count value
> >
> > > 2. we are after a timer interrupt, but the bottom half has not
> > > executed
> > > yet.
> > >
> > > what part of the code handles this? should this not be checked by
> > > checking
> > > the difference between jiffies and wall_jiffies?
> >
> > Not really sure, and it might be that the comment is a
> > remains from the i386 code.
> >
> > Give the latest version a try.
> >
> > > TIA,
> > > akshay
> >
> > /Johan
> >
> > > On Wed, 4 Sep 2002, Johan Adolfsson wrote:
> > > > On Tuesday 03 September 2002 21.59, Akshay Adhikari wrote:
> > > > > hi,
> > > > > in kernel 2.4.14, I have noticed that sometimes when the CPU load
> > >
> > > is
> > >
> > > > > extremely high, the clock jumps backwards. Here is how it happens:
> > > > > I
> > >
> > > send
> > >
> > > > > UDP packets and receive them back from an echoer, much like
> > > > > ping. The packets have a user space send timestamp using
> > >
> > > gettimeofday,
> > >
> > > > > and _kernel_ space receive timestamps retrieved using the
> > >
> > > SIOCGSTAMP
> > >
> > > > > ioctl.
> > > > >
> > > > >
> > > > > To get high precision receive timestamps, I have replaced the
> > > > > get_fast_time call in netif_rx with the do_gettimeofday call.
> > > > > Sometimes, under high load conditions, the kernel receive
> > > > > timestamps
> > >
> > > are
> > >
> > > > > smaller than the user receive timestamps.
> > > > >
> > > > > Is this the same bug that used to be in 2.4.5? (there was a report
> > >
> > > of
> > >
> > > > > this on the mailing list a while back).
> > > >
> > > > Probbaly not the same, but something nearby. Exactly what version of
> > > >
> > > > linux/arch/cris/kernel/time.c are you using?
> > > > (Check the $Id: tag)
> > > > If you are using version 1.11 there has been changes after that that
> > > > might fix this:
> > > > E.g:
> > > > @@ -124,11 +211,11 @@ void do_gettimeofday(struct timeval *tv)
> > > >         cli();
> > > >         *tv = xtime;
> > > >         tv->tv_usec += do_gettimeoffset();
> > > > -       if (tv->tv_usec >= 1000000) {
> > > > +       restore_flags(flags);
> > > > +       while (tv->tv_usec >= 1000000) {
> > > >                 tv->tv_usec -= 1000000;
> > > >                 tv->tv_sec++;
> > > >         }
> > > > -       restore_flags(flags);
> > > >  }
> > > >
> > > > Other changes in newer version is that we use the prescale timer to
> > >
> > > get
> > >
> > > > an exact 100Hz clock (otherwise it's 64 ppm to fast, as indicated in
> > >
> > > the
> > >
> > > > datasheet), there might be some wrap problem not properly handled
> > > > in the new code, although I haven't had any problems in my tests.
> > > >
> > > > > a related question: do_gettimeofday disables interrupts, right?.
> > >
> > > does this
> > >
> > > > > mean
> > > > > that we could lose timer interrupts (or any other interrupts)
> > > > > simply
> > >
> > > by
> > >
> > > > > calling gettimeofday in a tight loop (and this is a possible reason
> > >
> > > for
> > >
> > > > > the non-monotonic timestamps?), or is the timer interrupt
> > >
> > > non-maskable?
> > >
> > > > It's not non-maskable, so it might be off, but I dought it's off for
> > >
> > > that
> > >
> > > > long - that would indicate some other problem somewhere.
> > > >
> > > > > TIA,
> > > > > akshay
> > > >
> > > > /Johan