ltimer 1.0.1

Another Big Notice At The Top Of The Page
We had a long discussion at Flipcode, and some folks there convinced me that one design decision in ltimer was a bad move: the "recalibration". At some point I'll go back and clean up ltimer so this is turned off by default (or gone completely). In the meantime, I suggest you turn recalibration off, by making the following function call just after calling ltimerStartup():
ltimerSetDefaultOption(LTIMER_OPTION_RECALIBRATION_PERIOD, 0xFFFFffff);

Upgrade Notice
Anyone using ltimer 1.0 is strongly advised to upgrade. See Version History for details.

Overview

ltimer is yet another high-performance timing library. Timing libraries are a staple of realtime games, as they must perform important functions (poll for user input, render a new screen) in a timely fashion. If you're writing a game, chances are you're already using one. The difference is, this one is clever.

Highlights of ltimer:

You can have as many timers as you like.
You can explicitly set the resolution of each timer. The default is milliseconds (1000 per second).
ltimer also provides high-precision sleep functions.
ltimer is self-calibrating, and calibrates itself concurrently with your application. (In other words, it doesn't require a separate calibration step; it "hits the ground running".)
When running in "performance mode", and after it's calibrated, ltimer executes much faster than QueryPerformanceCounter(). Timings range from 3x to 6x faster, depending on your computer. (Before it's calibrated, or when in "safety mode", it's generally 1.1x slower than QueryPerformanceCounter().)
ltimer protects against what I call "retrograde time". (QueryPerformanceCounter() will, on rare occasions, report a time earlier than it did last time. For shame!)
ltimer is Win32-on-Intel only, and requires a Pentium processor or above.

The ltimer web page is here:

http://www.midwinter.com/~larry/programming/ltimer/

And you can download a fresh copy of the source code here:

http://www.midwinter.com/~larry/programming/ltimer/ltimer.zip

How ltimer Works

ltimer uses a two-pronged approach to determining the current time, using both QueryPerformanceCounter() and RDTSC.

QueryPerformanceCounter() is a Win32 API call that reports elapsed time. All it gives you is an arbitrary "current time", in an arbitrary "number of cycles per second" (which in my experience is invariably just over 3.5 million cycles per second). But this is all you need to perform high-precision timing. The biggest downside is that QueryPerformanceCounter() isn't particularly convenient to use... most people wind up writing layers around it. It also seems to be a bit slower than is necessary, as illustrated by ltimer.

RDTSC is an assembly-language instruction, first included on the Intel Pentium, and provided on every new chip since (including AMD processors). It tells you exactly how many clock cycles have occurred since the processor was last powered up. Therefore, the timing resolution varies depending on how fast the local processor is; for instance, a 200 MHz Pentium Pro would generate 200,000,000 cycles per second; my beefy 2.4GHz Pentium 4 clocks in at 2,400,000,000 cycles per second. It is amazingly fast and 100% accurate. The problem is, there's no good way to ask the CPU what the current RDTSC resolution is, and if you don't know the timer's resolution, you can't do real-world timing with it. (For more on RDTSC, read Intel's exhaustive article Using the RDTSC Instruction for Performance Monitoring.)

ltimer combines these two approaches to produce high-performance, high-precision time.

At startup, ltimer notes the current time reported by both QueryPerformanceCounter() and RDTSC.
For an initial "calibration" period (currently two seconds), ltimer uses QueryPerformanceCounter() to determine the current time. It shields you from "retrograde" time by remembering the last returned value, and never returning a value less than that.
After the calibration period is over, ltimer now knows three important things:
1. The number of cycles reported by QueryPerformanceCounter() during the calibration period.
2. The resolution of QueryPerformanceCounter().
3. The number of cycles reported by RDTSC during the calibration period.
It can therefore calculate the important fourth thing: the resolution of RDTSC.
From that point on, it calculates the current time based on RDTSC. It will automatically recalibrate itself every so often (currently every two seconds).
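
The calibration arithmetic and the "retrograde" guard described above can be sketched roughly as follows. This is a minimal illustration, not code from ltimer.cpp: every name here is hypothetical, and floating point is used purely to keep the sketch short.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical names throughout; none of this is taken from ltimer.cpp.
struct CalibrationSample {
    std::uint64_t qpc_delta;      // QueryPerformanceCounter() ticks seen during calibration
    std::uint64_t qpc_resolution; // QueryPerformanceCounter() ticks per second
    std::uint64_t rdtsc_delta;    // RDTSC cycles seen during calibration
};

// The "important fourth thing": RDTSC cycles per second.
// elapsed seconds  = qpc_delta / qpc_resolution
// RDTSC resolution = rdtsc_delta / elapsed seconds
inline double rdtscResolution(const CalibrationSample &s) {
    assert(s.qpc_delta > 0 && s.qpc_resolution > 0);
    const double elapsed_seconds =
        static_cast<double>(s.qpc_delta) / static_cast<double>(s.qpc_resolution);
    return static_cast<double>(s.rdtsc_delta) / elapsed_seconds;
}

// Convert RDTSC cycles (counted since the timer was reset) into ticks
// at the timer's own resolution, e.g. 1000 for milliseconds.
inline std::uint64_t cyclesToTicks(std::uint64_t cycles,
                                   double rdtsc_resolution,
                                   std::uint32_t timer_resolution) {
    assert(rdtsc_resolution > 0.0);
    const double seconds = static_cast<double>(cycles) / rdtsc_resolution;
    return static_cast<std::uint64_t>(seconds * timer_resolution);
}

// Guard against "retrograde" time: remember the last value handed out
// and never hand out anything smaller.
class MonotonicFilter {
public:
    std::uint64_t filter(std::uint64_t raw) {
        if (raw > last_)
            last_ = raw;
        return last_;
    }

private:
    std::uint64_t last_ = 0;
};
```

With a two-second calibration period, a QueryPerformanceCounter() resolution of 3579545 and 4,800,000,000 RDTSC cycles observed, rdtscResolution() comes out at 2,400,000,000 cycles per second.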

How To Use ltimer

Include ltimer.cpp in your project, and include ltimer.h in any files where you want to use ltimer entry points.
When your program starts up, call ltimerStartup().
When your program shuts down, call ltimerShutdown().
To create an ltimer, call ltimerCreate(&timer).
To reset an ltimer to zero, call ltimerReset(ltimer). Timers are automatically reset when created.
To set the resolution of an ltimer, call ltimerSetOption(ltimer, LTIMER_OPTION_TIMER_RESOLUTION, resolution). The resolution is a 32-bit unsigned integer. (There are other tasty options; see ltimer.h for more.)
To determine the current time, call ltimerGetCurrentTime(ltimer). It returns a 64-bit unsigned integer representing the elapsed time since the last ltimerReset() in the current resolution.
To get the last time reported, call ltimerGetLastReportedTime(ltimer). The return value will be exactly the same as the most recent return value from ltimerGetCurrentTime(ltimer).
To sleep for a certain amount of time, call ltimerSleep(ltimer, delta). The delta is expressed in terms of the current resolution; for instance, if the current resolution was 8192 ticks per second, then sleeping for 4096 ticks would mean sleeping for 0.5 seconds.
To sleep until a certain time, call ltimerSleepUntil(timer, time). The time is in terms of the current resolution.
To destroy an ltimer, call ltimerDestroy(&ltimer).
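
Because sleep deltas and reported times are expressed in the timer's current resolution, converting to and from seconds is simple arithmetic. The two helpers below are hypothetical conveniences (they are not part of ltimer.h), shown only to make the 8192-ticks-per-second example concrete.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers; not part of ltimer.h.

// Seconds -> ticks at the given resolution (ticks per second),
// rounded to the nearest tick.
inline std::uint64_t secondsToTicks(double seconds, std::uint32_t resolution) {
    assert(seconds >= 0.0);
    return static_cast<std::uint64_t>(seconds * resolution + 0.5);
}

// Ticks -> seconds at the given resolution (ticks per second).
inline double ticksToSeconds(std::uint64_t ticks, std::uint32_t resolution) {
    assert(resolution > 0);
    return static_cast<double>(ticks) / resolution;
}
```

For a timer whose resolution has been set to 8192, secondsToTicks(0.5, 8192) gives the 4096-tick delta you would pass to ltimerSleep() for a half-second sleep.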

"Performance" Mode Versus "Safety" Mode

Most of the time, ltimer will automatically run in performance mode. This means ltimer will use RDTSC when it can, which results in ltimer being both faster and more accurate. However, there are some circumstances where you might not want to use RDTSC:

ltimer is not directly aware of any fluctuations in a processor's speed. SpeedStep and other power management programs can cause a processor to slow down; so can overheating. In these instances, ltimer's auto-recalibration will slowly shift it up or down to the new CPU speed, but it will likely result in the reported time speeding up / slowing down. It's possible that QueryPerformanceCounter() is immune to these sorts of problems.
ltimer could do funny things on multi-processor machines, which by necessity have more than one RDTSC counter. (On the other hand, I don't know how much RDTSC would vary between two processors; after all, they both got power at about the same time, right?)
ltimer's approach to preventing retrograde motion could cause problems in science fiction. For instance, a program that ran for several hundred years would overflow ltimer's puny 64-bit integers. And a program that travelled back in time would be flummoxed. I recommend you not use ltimer for these sorts of applications.

If you don't want to worry, and don't mind ltimer being slower, you can change your timers to safety mode. Just call ltimerSetOption(ltimer, LTIMER_OPTION_TIMING_MODE, LTIMER_TIMING_MODE_SAFETY) and that timer will always use QueryPerformanceCounter(). If you want to set it as a new default, call ltimerSetDefaultOption(LTIMER_OPTION_TIMING_MODE, LTIMER_TIMING_MODE_SAFETY) and all timers created afterwards will use safety mode.

For your convenience, ltimer automatically switches to safety mode on multi-processor machines. If you only ever call ltimer functions from a single thread, and that thread has affinity for a single CPU, you can explicitly turn on performance mode by calling ltimerSetOption(ltimer, LTIMER_OPTION_TIMING_MODE, LTIMER_TIMING_MODE_PERFORMANCE) (or by setting it as the default before creating your first ltimer).

High-Precision Sleep

ltimer also features high-precision sleep functions. Their goal is threefold:

Do not return until the requested sleep period has passed.
Do not oversleep. I've seen the Win32 API function Sleep() oversleep now and then.
Spend as much of the intervening time as possible actually asleep by calling the Win32 API function Sleep().

The ltimer sleep functions accomplish this with a three-step approach to sleeping. Note that the LTIMER_OPTION_... symbols below represent run-time settable option values, not compile-time constants.

If there are more than LTIMER_OPTION_LONG_SLEEP_THRESHOLD milliseconds left in the sleep period, call Sleep(LTIMER_OPTION_LONG_SLEEP_THRESHOLD / 2).
Otherwise, if there are more than LTIMER_OPTION_SHORT_SLEEP_THRESHOLD milliseconds left in the sleep period, call Sleep(0). This yields the rest of the thread's timeslice back to the Windows scheduler, which it spreads around to other more needy processes.
Otherwise, we're in the final stretch and we don't want to oversleep. The sleep function will run in a tight loop without any Sleep() calls until the sleep period is up.
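
The three-step loop above can be sketched portably with the C++ standard library. This is an illustration of the technique rather than ltimer's code: std::this_thread::sleep_for() and std::this_thread::yield() stand in for Sleep(n) and Sleep(0), std::chrono::steady_clock stands in for ltimerGetCurrentTime(), and the threshold values are made up.

```cpp
#include <cassert>
#include <chrono>
#include <thread>

using Clock = std::chrono::steady_clock;

// Hypothetical stand-ins for LTIMER_OPTION_LONG_SLEEP_THRESHOLD and
// LTIMER_OPTION_SHORT_SLEEP_THRESHOLD; the values are invented for this sketch.
struct SleepThresholds {
    std::chrono::milliseconds long_sleep{10};
    std::chrono::milliseconds short_sleep{2};
};

// Block until `deadline` without returning early, sleeping for real
// while there is plenty of time left and spinning only at the very end.
inline void sleepUntil(Clock::time_point deadline,
                       const SleepThresholds &t = SleepThresholds{}) {
    assert(t.long_sleep >= t.short_sleep);
    for (;;) {
        const auto remaining = deadline - Clock::now();
        if (remaining <= Clock::duration::zero())
            return; // the sleep period is over
        if (remaining > t.long_sleep)
            std::this_thread::sleep_for(t.long_sleep / 2); // step 1: real sleep, half the threshold
        else if (remaining > t.short_sleep)
            std::this_thread::yield(); // step 2: give up the rest of the timeslice
        // step 3: final stretch, loop again without sleeping
    }
}
```

Calling sleepUntil(Clock::now() + std::chrono::milliseconds(30)) spends most of the 30 milliseconds in real sleeps and only the last couple spinning.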

Remarks:

Why does ltimer divide by two when calling Sleep(LTIMER_OPTION_LONG_SLEEP_THRESHOLD / 2)? To avoid oversleeping. If we slept for LTIMER_OPTION_LONG_SLEEP_THRESHOLD milliseconds when there were exactly LTIMER_OPTION_LONG_SLEEP_THRESHOLD + 1 milliseconds left in the sleep period, Sleep() might be naughty and oversleep. Dividing by two means we loop and call Sleep() twice as often, which is really no big deal in the grand scheme of things.
The two sleep functions, ltimerSleep() and ltimerSleepUntil(), both use ltimerGetCurrentTime() internally. If you're relying on ltimerGetLastReportedTime(), be forewarned that its value will change after a sleep function is called.

Reentrancy

ltimer is not particularly reentrant. Making concurrent calls with a single ltimer_t handle would likely result in terrible, awful, no-good very-bad things happening. Even concurrent calls with different ltimer_t handles might cause problems, as there's a little bit of global data that gets written to when using LTIMER_TIMING_MODE_PERFORMANCE. I didn't see this as a problem, as I wrote ltimer for myself and I only ever call ltimer from my main game thread. And I expect most people who'd be interested in ltimer are also writing basically single-threaded applications.

But be warned: if you want to use ltimer in a heavily multithreaded app, you'll have some work to do. For the global data, one approach would be to add mutual exclusion synchronization around the recalibration code. (I'd recommend critical sections, as they're way faster than mutex handles.) Alternatively, you could copy that global data into the ltimer_s struct, resulting in the global data being read-only. A third approach would be to switch to LTIMER_TIMING_MODE_SAFETY.
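
As a rough sketch of the first approach, the shared calibration data can be wrapped so that every read and write happens under a lock. The struct layout and names below are hypothetical (the actual globals in ltimer.cpp may look nothing like this), and std::mutex is used as a portable stand-in for a Win32 critical section.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>

// Hypothetical stand-in for the bit of global data written during recalibration.
struct GlobalCalibration {
    std::uint64_t rdtsc_resolution = 0; // cycles per second; 0 means "not calibrated yet"
    std::uint64_t last_calibration = 0; // when the last recalibration happened
};

// Serializes all access to the calibration data.
class GuardedCalibration {
public:
    // Called by whichever thread performs a recalibration.
    void recalibrate(std::uint64_t resolution, std::uint64_t now) {
        assert(resolution > 0);
        std::lock_guard<std::mutex> lock(mutex_);
        data_.rdtsc_resolution = resolution;
        data_.last_calibration = now;
    }

    // Returns a consistent copy, so callers never see a half-written update.
    GlobalCalibration snapshot() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return data_;
    }

private:
    mutable std::mutex mutex_;
    GlobalCalibration data_;
};
```

Readers work from the copy returned by snapshot(), so the lock is held only for the duration of the copy and not for the whole time calculation.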

As for calling ltimer concurrently with a single handle, my advice is to simply not do it. Your only option to prevent unpredictable results is mutual exclusion, with handle-specific critical sections around all the relevant function bodies. And I'm pretty sure this would introduce error into the reported times.

Trying out ltimer

I've included a test driver for ltimer, which demonstrates both how to use it and how fast it can go. On my 2.4GHz Pentium 4, it produced the following output:

QPC's resolution is 3579545 cycles per second.
 nothing x 1000000: took 0.000001 seconds (3080 machine cycles)
From now on, we'll subtract that handicap.

Forcing ltimer to use its slow method...
  ltimer x 1000000: took 1.339275 seconds (3213350640 machine cycles)
     TGT x 1000000: took 0.061161 seconds (146748676 machine cycles)
     QPC x 1000000: took 1.174217 seconds (2817323432 machine cycles)
   RDTSC x 1000000: took 0.232953 seconds (558932900 machine cycles)

Sleeping to give ltimer time to calibrate, then using its fast method...
  ltimer x 1000000: took 0.228220 seconds (547576416 machine cycles)

When it hasn't warmed up yet, ltimer is 1.141x slower than QPC.
But once it's warmed up, ltimer is 5.145x faster than QPC!
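
The two ratios in the summary lines follow directly from the timings printed above them. The snippet below just redoes that division; the struct and function names are hypothetical and are not part of the test driver.

```cpp
#include <cassert>

// Timings copied from the sample output above, in seconds per million calls.
struct SampleTimings {
    double ltimer_cold = 1.339275; // ltimer before calibration (slow method)
    double qpc         = 1.174217; // QueryPerformanceCounter()
    double ltimer_warm = 0.228220; // ltimer after calibration (fast method)
};

// How many times slower `slow` is than `fast`.
inline double timesSlower(double slow, double fast) {
    assert(fast > 0.0);
    return slow / fast;
}
```

timesSlower(1.339275, 1.174217) is about 1.141 and timesSlower(1.174217, 0.228220) is about 5.145, matching the "slower" and "faster" figures the driver reports.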

If you specify a command-line argument, the first argument is used as the number of iterations to use in the various loops. (The default is one million.) All other arguments are ignored.

Licensing

Here's the license:

/*
** [BEGIN NOTICE]
**
** Copyright (C) 1999-2003 Larry Hastings
**
** This software is provided 'as-is', without any express or implied warranty.
** In no event will the authors be held liable for any damages arising from
** the use of this software.
**
** Permission is granted to anyone to use this software for any purpose,
** including commercial applications, and to alter it and redistribute
** it freely, subject to the following restrictions:
**
** 1. The origin of this software must not be misrepresented; you must not
**    claim that you wrote the original software. If you use this software
**    in a product, an acknowledgment in the product documentation would be
**    appreciated but is not required.
** 2. Altered source versions must be plainly marked as such, and must not be
**    misrepresented as being the original software.
** 3. This notice may not be removed or altered from any source distribution.
**
** The ltimer homepage is here:
**		http://www.midwinter.com/~larry/programming/ltimer/
**
** [END NOTICE]
*/

In non-legalese, my goal was to allow you to do anything you like with the software, except claim that you wrote the original version. If my license prevents you from doing something you'd like to do, contact me (my email address is in the source) and we can discuss it.

Version History

1.0.1: Thursday, June 19th, 2003
Oops. Big bug fixed. On modern super-fast machines, the calibration calculation could overflow (a 64-bit integer!). On my 2.4GHz machine, the overflow hits after just under 36 minutes of running.
1.0: Friday, June 6th, 2003
Initial public release.
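
The "just under 36 minutes" figure in the 1.0.1 entry is consistent with a calculation that multiplies the running RDTSC cycle count by the QueryPerformanceCounter() resolution in an unsigned 64-bit integer. That is an assumption about the old code, not something stated in this document, but the arithmetic works out; the function name below is hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Seconds of running time before (cycles_per_second * t) * multiplier
// no longer fits in an unsigned 64-bit integer.
inline double secondsUntilOverflow(std::uint64_t cycles_per_second,
                                   std::uint64_t multiplier) {
    assert(cycles_per_second > 0 && multiplier > 0);
    const double limit =
        static_cast<double>(std::numeric_limits<std::uint64_t>::max());
    return limit / (static_cast<double>(cycles_per_second) *
                    static_cast<double>(multiplier));
}
```

For a 2,400,000,000 cycles-per-second RDTSC and a QueryPerformanceCounter() resolution of 3579545, this gives roughly 2147 seconds, or about 35.8 minutes.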

Happy timing!

larry