## Thursday, January 29, 2009

### The Vagaries of Time

Most people don't spend much time thinking about time. It flows, we count it off on our clocks and calendars, and there is usually not enough of it.

As a programmer who has worked on applications that deal specifically with time, such as activity reports, I have developed an acute awareness of how complicated time is. I don't just mean the complications of relativity (although I have spent a lot of time trying to understand that as well); I am referring to the fact that the way we measure and mark the flow of time is more complex than most people normally consider.

### Units of Time

As children we learn about how we measure time: 60 seconds in a minute, 60 minutes in an hour, 24 hours in a day, 7 days in a week, 365 days in a year. Then they tell us about the exceptions: every four years is a leap year with 366 days. And the months are irregular to start with, having 28, 30, or 31 days (and February gets a 29th day in leap years).

That's typically where our formal education on the measurement of time stops. But of course that's not the end of the special cases.

Each year has 365 days, except that every four years is a leap year with 366 days, so for example 2008 had 366 days. Except that every 100 years is not a leap year. 1800 was not a leap year, 1900 was not a leap year, and 2000 would not have been a leap year, except that every 400 years is a leap year, so it was. Good thing, too. A lot of programmers in the last few decades of the 1900s didn't even bother to deal with the century as part of the year they stored (leading to the Y2K issue), so it seems unlikely that those who bothered to deal with leap years would have bothered to deal with non-leap centuries. As it was, because of the 400-year exception to the 100-year rule, we didn't have to worry about that. Hopefully by the year 2100, all of this gnarly time measurement stuff will be well debugged and codified into standard libraries that everyone uses, so it won't be a problem.
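The full set of leap-year rules fits in a few lines. Here is a minimal Python sketch; the function names are my own, chosen for illustration.

```python
def is_leap_year(year: int) -> bool:
    """Gregorian rule: every 4th year, except centuries, except every 400th year."""
    if year % 400 == 0:
        return True   # the 400-year exception to the 100-year exception
    if year % 100 == 0:
        return False  # century years are normally not leap years
    return year % 4 == 0


def days_in_year(year: int) -> int:
    return 366 if is_leap_year(year) else 365


for y in (1800, 1900, 2000, 2008, 2100):
    print(y, is_leap_year(y), days_in_year(y))
```

Checking the rules in this order (400, then 100, then 4) keeps each exception visible on its own line, which makes it harder to forget one of them.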

You would think by now everyone would be able to deal with leap years, yet leap-year bugs still happen. On December 31, 2008, Microsoft Zune 30GB MP3 players froze up due to a leap-year bug. Microsoft's official solution: wait until about noon GMT on January 1, let the battery drain, then recharge and turn it back on. You have four years to upgrade.

We all know not all months have the same number of days, and not all years have the same number of days. Are there other units of time that we think of as always being the same length that don't really all have the same length? Of course.

Daylight Saving Time was instituted in the early 20th century. "Fall back, Spring forward." We set our clocks forward by one hour in the spring, and we set them back one hour in the fall. Since the time officially changes in the middle of the night, typically at 2AM when most people are asleep, most of us are not much affected by it, other than losing an hour of sleep in the spring and getting an extra hour of sleep in the fall. And, with every device in our house now sporting a clock, spending that extra hour walking around resetting the time on all those devices. But obviously, this means that there is one day every year that has 23 hours and another day that has 25 hours.

So the years are not all the same length, the months are not all the same length, and the days are not all the same length. But we're good on the other units, right? Nope. It turns out that minutes are not all the same length either.

From 1972 until 1987, the International Time Bureau in Paris was responsible for determining and announcing when we add leap seconds to the year; since 1988 that job has belonged to the International Earth Rotation and Reference Systems Service (IERS). A leap second can be added at two possible times in the year: once in the middle of the year at the end of June 30, and once at the end of the year on New Year's Eve, December 31. 2008 included a leap second on New Year's Eve. If you very carefully synchronized your watch to atomic time shortly before midnight, counted down to the New Year, and celebrated one second after 23:59:59, you were one second early. The last minute of 2008 contained 61 seconds, so the last second was 23:59:60. Did your watch display that time? Mine didn't.

You might not think that a leap second would cause problems. Most of us can ignore it - our timekeeping pieces are not that accurate anyway. But some devices make a point of accounting for that leap second. For example, Unix time is defined as the number of seconds since the start of January 1, 1970, UTC - except that leap seconds did not exist in 1970, so Unix time is not defined to include them. To maintain the correct time on the system clock, a system that cares about this level of accuracy needs to know when those leap seconds are so that it can take one second off the system clock when one happens. Because leap seconds are so rare (even with two leap seconds in a year, that's only one second out of over 15 million), if a device has a problem with leap second calculations it can take a while to track down. The Linux kernel has code to deal with leap seconds, and this code gets executed exactly twice a year, in June and December, to handle the possibility of a leap second. A race condition in that code path would only show up during those two seconds out of the year, making the problem hard to catch and reproduce, but still potentially a serious problem for those few people who happen to experience it on their system.
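To see what "not defined to include them" means in practice, here is a small Python sketch. It measures the last day of 2008 in Unix time and then tries to construct the leap second itself. The constant `LEAP_SECONDS_ON_DAY` and the helper name are assumptions made for the example.

```python
from datetime import datetime, timezone

start_of_day = datetime(2008, 12, 31, 0, 0, 0, tzinfo=timezone.utc)
start_of_next_day = datetime(2009, 1, 1, 0, 0, 0, tzinfo=timezone.utc)

# Unix time counts every day as exactly 86400 seconds.
unix_elapsed = int(start_of_next_day.timestamp()) - int(start_of_day.timestamp())

# The day actually ended with 23:59:60 UTC, so one more second really elapsed.
LEAP_SECONDS_ON_DAY = 1  # assumption for this example: one leap second on 2008-12-31
real_elapsed = unix_elapsed + LEAP_SECONDS_ON_DAY


def can_represent_leap_second() -> bool:
    """Return True if datetime accepts 23:59:60 as a valid time of day."""
    try:
        datetime(2008, 12, 31, 23, 59, 60, tzinfo=timezone.utc)
    except ValueError:
        return False
    return True


print(unix_elapsed, real_elapsed, can_represent_leap_second())
```

Python's `datetime` only accepts seconds from 0 to 59, so 23:59:60 simply cannot be stored in it. Any code that needs the real elapsed time has to carry a leap second table separately.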

Leap seconds can also be negative, so the last minute of June 30 or December 31 might contain only 59 seconds, but this has not yet happened since the start of leap seconds in 1972.

### Daylight Saving Time

I already mentioned how Daylight Saving Time complicates things by giving us some days with 23 hours and some with 25. It also means (in the US) that there is one day each year for which the time 1:30AM is ambiguous because it happens twice, and one day for which the time 2:30AM does not exist. But of course it's worse than that. As a scheme enacted by politicians, there's not a lot of logic to when we switch to and from DST and how they choose those dates. In 1987 Chile delayed changing DST for one day to accommodate a visit by the Pope. At least it's always at 2AM (here in the USA) and always on a Sunday morning, but over the years the lawmakers have messed with the starting and stopping days so many times that - well, you really don't want to have to deal with that. Fortunately, today's operating systems take care of this one vagary pretty well. By now they have even extracted all those changing dates out into configuration files so that, when Congress decides yet again to change the days on which DST starts and ends, your computer can keep track of the new dates with just a new configuration file rather than requiring a whole new version of the operating system.
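The ambiguous and missing times are easy to demonstrate. The sketch below assumes Python 3.9 or later (for the standard `zoneinfo` module) and a time zone database installed on the system. I picked `America/New_York` and the 2008 transition dates as the example, and the helper names are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo

NY = ZoneInfo("America/New_York")  # example zone; any US zone that observes DST would do

# 1:30AM on November 2, 2008 happened twice; `fold` selects which occurrence.
first_pass = datetime(2008, 11, 2, 1, 30, tzinfo=NY)           # fold=0: still daylight time
second_pass = datetime(2008, 11, 2, 1, 30, fold=1, tzinfo=NY)  # fold=1: back on standard time


def exists_locally(naive: datetime, tz: ZoneInfo) -> bool:
    """A wall-clock time that was skipped does not survive a round trip through UTC."""
    round_trip = naive.replace(tzinfo=tz).astimezone(timezone.utc).astimezone(tz)
    return round_trip.replace(tzinfo=None) == naive


def day_length(year: int, month: int, day: int, tz: ZoneInfo) -> timedelta:
    """Elapsed time from local midnight to the next local midnight."""
    start = datetime(year, month, day, tzinfo=tz)
    end = start + timedelta(days=1)  # wall-clock arithmetic: next local midnight
    # Convert to UTC before subtracting so we measure elapsed time, not wall time.
    return end.astimezone(timezone.utc) - start.astimezone(timezone.utc)


print(first_pass.utcoffset() != second_pass.utcoffset())
print(exists_locally(datetime(2008, 3, 9, 2, 30), NY))
print(day_length(2008, 3, 9, NY), day_length(2008, 11, 2, NY))
```

The detail worth noticing is the conversion to UTC inside `day_length`: subtracting two local times in the same zone compares the wall clocks, which would report 24 hours for every day.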

Of course, there are plenty of devices that have clocks in them, many of which we generally don't think about, so it's pretty easy to overlook updating the DST information for those devices when Congress makes a change. In October of 2008 I was in an elevator that would not take me to my floor because it was programmed to stop at that floor only between 7AM and 6PM. I was there at 7:45AM, but under the new DST law, which took effect in 2007, DST ended one week later than it had been ending for many years. The elevator, which had not had its software updated to include the new DST dates, had dutifully moved its clock back by one hour on October 26 although the rest of us would not do so until November 2. Consequently it thought it was 6:45AM and thus not yet time to allow access to my floor.
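The elevator's mistake can be reproduced with a few lines of date arithmetic. This Python sketch assumes the older US rule ended DST on the last Sunday in October and the newer rule ends it on the first Sunday in November; the function names are mine.

```python
from datetime import date, timedelta

SUNDAY = 6  # date.weekday() numbers Monday as 0 and Sunday as 6


def nth_sunday(year: int, month: int, n: int) -> date:
    """The n-th Sunday of the month (n=1 is the first Sunday)."""
    first = date(year, month, 1)
    days_to_sunday = (SUNDAY - first.weekday()) % 7
    return first + timedelta(days=days_to_sunday + 7 * (n - 1))


def last_sunday(year: int, month: int) -> date:
    """The last Sunday of the month, found by stepping back from month end."""
    next_month = date(year + (month == 12), month % 12 + 1, 1)
    last_day = next_month - timedelta(days=1)
    return last_day - timedelta(days=(last_day.weekday() - SUNDAY) % 7)


def us_dst_end(year: int, new_rules: bool) -> date:
    """End of US DST under the newer rule (True) or the older rule (False)."""
    return nth_sunday(year, 11, 1) if new_rules else last_sunday(year, 10)


print(us_dst_end(2008, new_rules=False))  # what the elevator believed
print(us_dst_end(2008, new_rules=True))   # what everyone else used
```

Hard-coding either rule into a device is exactly the trap described above, which is why the dates belong in updatable configuration data rather than in code.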

Even without Congress changing when DST starts and ends, there are other complications. Arizona does not observe DST (except the Navajo Nation, which does). Arizona stays on Mountain Standard Time all year, which means that during the summer its clocks match Pacific Daylight Time. If you are making a phone call from New York to Arizona, you probably want to remember that they are two hours behind you during the winter, but three hours behind you during the summer, so that you don't call too early on a summer morning. And of course countries don't all follow the same schedule, and some countries, including China, India and Japan, don't observe DST. If you work in California and are setting up a business call with a client in London, it is good to know when the time difference is 8 hours and when it is 7 or 9 hours.

### Time Zones

You might think there are 24 time zones in the world. After all, there are 24 hours in the day (except for those 23- and 25-hour days I mentioned above) and each time zone is one hour apart, so that makes 24 time zones. Not quite. Yes, most of the world is divided into 24 time zones, but some people are just not satisfied with that. There are time zones that are 30 minutes different from their neighbors (Adelaide, Kabul, Mumbai, Tehran), and time zones that are 15 minutes different from their neighbors (Chatham Island, Kathmandu). If you write a program that handles time zones but assumes they are always integer multiples of one hour apart, your program won't work correctly for those locations.
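One simple defense is to store offsets in minutes rather than hours. The following Python sketch uses the standard-time UTC offsets of the places mentioned above (ignoring DST, which some of them observe); the table and function names are illustrative only.

```python
from datetime import datetime, timedelta, timezone

# Standard-time offsets from UTC, in minutes. DST is deliberately ignored here.
OFFSET_MINUTES = {
    "Tehran": 3 * 60 + 30,
    "Kabul": 4 * 60 + 30,
    "Mumbai": 5 * 60 + 30,
    "Kathmandu": 5 * 60 + 45,
    "Adelaide": 9 * 60 + 30,
    "Chatham Island": 12 * 60 + 45,
}


def local_time(utc_moment: datetime, place: str) -> datetime:
    """Convert an aware UTC datetime to the fixed offset of the given place."""
    tz = timezone(timedelta(minutes=OFFSET_MINUTES[place]))
    return utc_moment.astimezone(tz)


def is_whole_hour_offset(place: str) -> bool:
    return OFFSET_MINUTES[place] % 60 == 0


noon_utc = datetime(2009, 1, 29, 12, 0, tzinfo=timezone.utc)
for place in OFFSET_MINUTES:
    print(place, local_time(noon_utc, place).strftime("%Y-%m-%d %H:%M"))
```

None of these places would pass `is_whole_hour_offset`, so a schema with an integer "hours from UTC" column cannot represent any of them.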

Fractional hour time shifts and changing offsets between time zones due to differences in DST are not the only problems caused by time zones. If a customer comes to you and says they want daily activity summaries, that may be pretty straightforward for a small company with one location, but what is a "daily summary" for an international company with locations all over the world?

If you are storing dates and times in a database, you might want to use the TIMESTAMP WITH TIME ZONE data type defined in SQL 92. This is supported in Oracle starting in 9i. Microsoft introduced support for timezones in SQL Server 2008 with the `datetimeoffset` data type.

### Internationalization

So what do you need to do in order to deal with world dates and times? Of course you need to store time zone information along with the date and time, and use that when doing such things as date comparisons. Although what exactly a date comparison is can take some thought. For example, you can fly from Tokyo to Seattle and arrive before you left - if you ignore the time zone.
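Here is the Tokyo to Seattle flight as a short Python sketch. The flight times are made up for the example, and I use fixed winter offsets (UTC+9 for Tokyo, UTC-8 for Seattle) instead of a full time zone database.

```python
from datetime import datetime, timedelta, timezone

JST = timezone(timedelta(hours=9), "JST")   # Tokyo
PST = timezone(timedelta(hours=-8), "PST")  # Seattle in winter

# Hypothetical flight: leaves Tokyo at 5PM, lands in Seattle at 9AM the same date.
departure = datetime(2009, 1, 29, 17, 0, tzinfo=JST)
arrival = datetime(2009, 1, 29, 9, 0, tzinfo=PST)

# Comparing only the wall-clock values says we arrived before we left.
naive_says_arrived_first = arrival.replace(tzinfo=None) < departure.replace(tzinfo=None)

# Comparing the time-zone-aware values gives the right order and the real duration.
aware_says_arrived_first = arrival < departure
flight_time = arrival - departure

print(naive_says_arrived_first, aware_says_arrived_first, flight_time)
```

Python refuses to compare a naive datetime with an aware one, which is a useful guard: it forces you to decide which kind of comparison you mean.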

Then there is the date and time format, which is different in different parts of the world. Different formats include different order of time units, different separator characters, different names, and 12 or 24 hour time. For example, in the US, most people would interpret the date 01/02/03 as January 2, 2003, but most Europeans would interpret it as February 1, 2003, and people in other countries might interpret it as February 3, 2001 or a few other possibilities. If you are writing a program to be used in multiple countries, you should avoid that format. Many modern language environments, such as Java, provide support for locales. If you use this language feature consistently and properly, along with the rest of the tools of internationalization, that will take care of most of the issues due to the different formatting of dates and times in different countries.
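To make the 01/02/03 ambiguity concrete, this small Python sketch parses the same string with three different format conventions. The labels in the dictionary are mine.

```python
from datetime import datetime

text = "01/02/03"

# The same characters, read with three different field orders.
FORMATS = {
    "US (month/day/year)": "%m/%d/%y",
    "European (day/month/year)": "%d/%m/%y",
    "year/month/day": "%y/%m/%d",
}

parsed = {label: datetime.strptime(text, fmt).date() for label, fmt in FORMATS.items()}

for label, value in parsed.items():
    print(f"{label}: {value.isoformat()}")
```

All three parses succeed without complaint, so nothing in the data tells you that you picked the wrong convention.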

For internal use, when representing dates and times as a string, such as in data files for import/export, you should use the ISO-8601 standard format, which looks like this: `2009-03-22T15:30:40-08`. Remember to include the time zone.
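As a rough illustration, Python's standard library can write and read this format directly. Note that `isoformat()` spells the offset out as `-08:00` rather than the shorter `-08`; both forms are allowed by the standard.

```python
from datetime import datetime, timedelta, timezone

offset = timezone(timedelta(hours=-8))
moment = datetime(2009, 3, 22, 15, 30, 40, tzinfo=offset)

text = moment.isoformat()
print(text)  # 2009-03-22T15:30:40-08:00

# Reading it back yields an aware datetime equal to the original.
parsed = datetime.fromisoformat(text)

# Because the offset travels with the value, converting to UTC is unambiguous.
as_utc = parsed.astimezone(timezone.utc).isoformat()
print(as_utc)  # 2009-03-22T23:30:40+00:00
```

If you exchange files with other programs, it is worth checking which ISO-8601 variants their parsers accept, since not every parser handles every allowed form.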

### Calendars

Americans and most of the western world use the Gregorian calendar, and while a lot of Americans know that Chinese New Year is not on January 1st, I suspect that that is as much knowledge as most Americans have about calendars. However, besides the Gregorian calendar, which is now the internationally accepted civil calendar, there are a number of other calendars currently in use, including Chinese, Hebrew and Indian. If you happen to be writing an application that has to deal with those calendars, you will likely have the joy of writing conversion methods to and from the Gregorian calendar.

Then there are yet more calendars that are not in common use. Some of these are pretty wild by our reckoning.

### History

Different lengths for time units, Daylight Saving Time, time zones, and internationalization - what a hassle. But wait, there's more! If you are writing a program that deals with historical dates, there are a number of other little problems you have to deal with.

If you are dealing with dates going back to, say, 1500, you will have to deal with the switch from the Julian calendar to the Gregorian calendar. The only difference between the two is how leap year is calculated: when Julius Caesar created the Julian calendar, he defined a leap year as occurring once every four years. As I discussed above, the current rules for leap years, which are part of the Gregorian calendar, now include exceptions every 100 and 400 years, exceptions which were never part of the Julian calendar. That means the Julian calendar drifted by about three days every 400 years, and after a millennium and a half it was off by nearly two weeks. Pope Gregory decreed in 1582 that leap years would be calculated with the new 100 and 400 year exceptions, and also that the date would be corrected to account for the accumulated extra leap days. (The Church really wanted the calendar to be right so that they could know they were celebrating their holy days such as Easter on the right day.) So, according to the Pope's decree, October 4, 1582 (in the Julian calendar) was followed by October 15, 1582 (in the new Gregorian calendar). If you happen to be programming in Java, you are in luck: the GregorianCalendar class properly models this discontinuity, along with handling the Julian model of leap year before that date and the Gregorian model after that date.

But dealing with dates and times is never as easy as it first appears. Not everyone switched from the Julian to the Gregorian calendar at the same time. The English and the Americans, suspicious of the Roman Catholic empire, did not switch from the Julian to the Gregorian calendars until the middle of the 18th century. So when somebody tells you about something that happened on some date in 1700, it would be useful to know where it happened, as that would give you a good idea whether the historian recording that event was using the Julian or the Gregorian calendar. George Washington was born on February 11, because when he was born the American colonies used the Julian calendar; but as marked on the Gregorian calendar, which we now use, the day was February 22, so that's the day we now call Washington's birthday.
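If your language does not model the switch for you, the conversion can go through the Julian Day Number, a continuous count of days that both calendars can be mapped onto. The Python sketch below uses the well-known integer formula for Julian calendar dates and lets `datetime.date` (which is proleptic Gregorian) handle the other side. The function names are mine, and the Washington example counts the year 1732 from January 1.

```python
from datetime import date

# Difference between a Julian Day Number and Python's date ordinal
# (ordinal 1 is January 1 of year 1 in the proleptic Gregorian calendar).
JDN_MINUS_ORDINAL = 1721425


def julian_calendar_to_jdn(year: int, month: int, day: int) -> int:
    """Julian Day Number of a date given in the Julian calendar."""
    a = (14 - month) // 12   # 1 for January and February, else 0
    y = year + 4800 - a      # shift so the year starts in March
    m = month + 12 * a - 3   # March = 0, ..., February = 11
    return day + (153 * m + 2) // 5 + 365 * y + y // 4 - 32083


def julian_to_gregorian(year: int, month: int, day: int) -> date:
    """Convert a Julian calendar date to the same day in the Gregorian calendar."""
    return date.fromordinal(julian_calendar_to_jdn(year, month, day) - JDN_MINUS_ORDINAL)


# The day after Julian October 4, 1582 would have been Julian October 5.
print(julian_to_gregorian(1582, 10, 5))
# Washington's birthday as recorded in the colonies at the time.
print(julian_to_gregorian(1732, 2, 11))
```

This only answers "which day was it"; deciding which calendar a given historical source used is still up to you.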

If you get all the way back to year zero, you should know that there was no year zero: the year before 1 CE (formerly referred to as AD) was the year 1 BCE (formerly BC). So while there are ten years between January 1, 1995 and January 1, 2005, there are only nine years between January 1, 5 BCE and January 1, 5 CE. Unless you are using astronomical year numbering, which does have a year zero.
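A common way to get the arithmetic right is to convert to astronomical year numbers first, where 1 BCE becomes year 0 and 5 BCE becomes year -4. A minimal Python sketch, with hypothetical function names:

```python
def astronomical_year(year: int, era: str) -> int:
    """Map a historical year to astronomical numbering: 1 BCE -> 0, 2 BCE -> -1."""
    if year < 1:
        raise ValueError("historical years start at 1; there is no year zero")
    if era == "CE":
        return year
    if era == "BCE":
        return 1 - year
    raise ValueError(f"unknown era: {era!r}")


def years_between(start: tuple, end: tuple) -> int:
    """Whole years from January 1 of `start` to January 1 of `end`."""
    return astronomical_year(*end) - astronomical_year(*start)


print(years_between((1995, "CE"), (2005, "CE")))
print(years_between((5, "BCE"), (5, "CE")))
```

Once the years are on a number line with a zero in it, plain subtraction works again.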

But by the time you get back that far, you have to start dealing with all sorts of interesting other calendars, compared to which the Gregorian calendar, with all of its special cases mentioned above, is a model of pristine simplicity. The Gregorian calendar was a reform of the Julian calendar, which was in turn a reform of the Roman calendar, after the "years of confusion" when the older Roman calendar got messed up because the rules were kind of complicated and nobody was minding the store (due to a couple of wars). And of course the historians of the time were not using the same year numbers as we now use (perhaps you have heard the anecdote about the coin collector who tries to pass off a coin dated "6 BC").

The ancients did not use our calendar, and they did not use our times either. In the Parable of the Workers in the book of Matthew, when a workman shows up at the eleventh hour, he is not getting there just before lunch. At that time they divided the day and the night each into twelve hours, so getting there at the eleventh hour was near the end of the day. More to our point, there were 12 hours in a day (i.e. 12 hours of daylight) in both summer and winter, so an hour of day in summer was significantly longer than an hour in winter.

When you start dealing with historical dates and times you have to think about precision and accuracy. We know the time for some recent events down to the second. The further back in time we go, the less accuracy we have in our knowledge of when an event occurred. If we are trying to create a database that contains events of all ages, we should also be able to specify the precision to which we want to store the time of the event so that it matches the accuracy to which we think we know that time. If you are storing dates in a database using a date/time data type, it probably does not include the ability to specify a precision. The ISO-8601 date/time standard allows omitting lesser units, such as specifying only the year and month, and it defines a way to specify a duration, which could be used as a way of specifying a precision estimate for a date. If you are going to store historical date values in a database, you might want to consider using that format, or at least think about the issue.
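One possible way to carry the precision along with the value is a small type whose month and day are optional. This Python sketch is only one design among many: the class `PartialDate` and its methods are invented for illustration, and it is limited to the years that `datetime.date` supports.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional, Tuple


@dataclass(frozen=True)
class PartialDate:
    """A date known only to year, year-month, or year-month-day precision."""
    year: int
    month: Optional[int] = None
    day: Optional[int] = None

    def __post_init__(self) -> None:
        if self.day is not None and self.month is None:
            raise ValueError("a day requires a month")

    def iso(self) -> str:
        """ISO-8601 text with the unknown lesser units omitted."""
        parts = [f"{self.year:04d}"]
        if self.month is not None:
            parts.append(f"{self.month:02d}")
        if self.day is not None:
            parts.append(f"{self.day:02d}")
        return "-".join(parts)

    def span(self) -> Tuple[date, date]:
        """Earliest and latest calendar dates consistent with what is known."""
        first = date(self.year, self.month or 1, self.day or 1)
        if self.day is not None:
            return first, first
        if self.month is None:
            return first, date(self.year, 12, 31)
        next_month = date(self.year + (self.month == 12), self.month % 12 + 1, 1)
        return first, next_month - timedelta(days=1)


print(PartialDate(1732).iso(), PartialDate(1732, 2).iso(), PartialDate(1732, 2, 22).iso())
print(PartialDate(1732, 2).span())
```

Storing the `iso()` text keeps the precision visible in the database, and `span()` gives you a range to use when sorting or comparing events of different precision.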

In general there is a conflict between the modern trend of measuring and recording when things happened to high precision versus the lower precision and uncertain accuracy of knowing when things happened in the past. This makes it difficult to have a unified timeline that can represent both very old and very new events. Given that ancient historians did not use the same calendars as we do, and in general did not measure time as accurately as we do, it can be pretty hard to figure out when something happened.

### Relativity

I mentioned Relativity at the beginning of this article. You might think it has no practical effect on our use of time in the real world, but like leap seconds, there are a few applications that have to deal with it.

To refresh your memory, according to Einstein's Theory of Special Relativity, when an object travels close to the speed of light, its clock slows down relative to an observer at rest, an effect known as time dilation. Strictly speaking, a clock on an object moving at any non-zero velocity will be slowed down relative to an object at rest, but at the speeds most of us are used to dealing with, that slowdown is imperceptible. In addition, Einstein's Theory of General Relativity predicts that an object deeper in a gravitational well will experience time dilation relative to one not as deep in the well. As with time dilation due to high speed, time dilation due to being in the Earth's gravity well is a very small effect. But both of these effects have been measured by atomic clocks carried by commercial jets, and sure enough, the clocks ran at a different rate on the jets than they would have run at rest on the ground: they ran faster because they were not as deep in the Earth's gravitational well, an effect which was somewhat countered but not overcome by the amount the clocks ran slower due to velocity time dilation.

After learning that years, days, and minutes are not all the same length of time, you might have taken solace in the thought that the second, defined according to atomic vibrations (as "the duration of 9,192,631,770 cycles of microwave light absorbed or emitted by the hyperfine transition of cesium-133 atoms in their ground state undisturbed by external fields") always has the same length of time; but given relativistic clock changes and the need to keep things synchronized, sometimes even the length of a second needs to be adjusted to make the clock work correctly in today's world of microsecond synchronization.

So what kind of application needs this kind of accuracy? Here's one example: GPS, the Global Positioning System. GPS relies on extremely accurate clocks to ensure that positions can accurately be measured. The satellites that comprise the GPS system are, like the jets carrying the atomic clocks, moving fast enough and orbiting high enough above the surface of the Earth for relativistic effects to become measurable. Without compensating for these effects, the clocks in the satellites would be off by 38 microseconds each day. While this does not sound like much, a radio signal travels a long way in that time: those 38 microseconds translate to an error of over 7 miles (about 11 km) in one day. Clearly, making that relativistic correction is essential in order to make the system useful.
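The distance figure is just the clock error multiplied by the speed of light, since GPS measures distance by timing radio signals. A quick Python check, taking the commonly quoted daily drift of 38 microseconds as the input (the constant and function names are mine):

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458  # exact, by definition of the meter
CLOCK_DRIFT_S_PER_DAY = 38e-6         # uncorrected relativistic drift, about 38 microseconds
METERS_PER_MILE = 1609.344


def ranging_error_m(clock_error_s: float) -> float:
    """Distance a radio signal covers during the given clock error."""
    return clock_error_s * SPEED_OF_LIGHT_M_PER_S


error_m = ranging_error_m(CLOCK_DRIFT_S_PER_DAY)
error_km = error_m / 1000
error_miles = error_m / METERS_PER_MILE

print(f"{error_km:.1f} km, {error_miles:.1f} miles per day")
```

The error also accumulates, so after a week without correction the position would be off by several times that amount.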

### Pondering Time

Imagine that you are creating a reporting program for a user. Here are some of the interesting questions you might want to think about.

If you produce a report that shows daily activity, with each day listed on the X axis, do you space all of the days evenly even across Daylight Saving Time changes, or do you make those special days a little bit shorter or longer in the graph? What about when the user zooms in on one month, or one week, or two days? Does your graphing package even support uneven lengths of days? How do you tell it that you want unevenly spaced tick marks and major labels on the X axis?

The same question can be asked regarding months, which you might want to display as the same size when viewing a couple of years' worth of data, but perhaps want to display as different sizes when showing two months' worth of data.

If a user is running a report that allows specifying a starting time and an ending time, and he wants one that is one hour long, from the first 1:30AM to the second 1:30AM on the one 25-hour day of the year, how does he specify that?

If a user in a multi-time-zone company wants a report with a summary of sales for one day, what records do you collect? Do you arbitrarily pick a reference time zone and collect all sales transactions that happened from midnight to midnight as measured in that time zone? Do you use the time zone of the user requesting the report? The time zone of the company headquarters? Do you use the time zone of each sales transaction and collect all transactions with a local time between midnight and midnight? What if the transactions are on the internet? How do you decide what the "local" time zone is for each transaction?
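To show how much the answer matters, here is a Python sketch that totals the same three sales two ways: by a single reference time zone and by each transaction's own local date. The sales data, the fixed offsets, and the function name are all made up for the example.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone

UTC = timezone.utc
TOKYO = timezone(timedelta(hours=9))
SEATTLE = timezone(timedelta(hours=-8))

# Hypothetical sales: (moment stored in UTC, time zone of the store, amount)
SALES = [
    (datetime(2009, 1, 29, 23, 30, tzinfo=UTC), TOKYO, 100),   # Jan 30, 8:30AM in Tokyo
    (datetime(2009, 1, 29, 2, 0, tzinfo=UTC), SEATTLE, 250),   # Jan 28, 6:00PM in Seattle
    (datetime(2009, 1, 29, 12, 0, tzinfo=UTC), UTC, 75),       # Jan 29, noon in London
]


def daily_totals(sales, reference_tz=None):
    """Sum amounts per calendar date.

    With reference_tz, every sale is dated in that one zone;
    without it, each sale is dated in its own store's zone.
    """
    totals = defaultdict(int)
    for moment, store_tz, amount in sales:
        tz = store_tz if reference_tz is None else reference_tz
        totals[moment.astimezone(tz).date()] += amount
    return dict(totals)


print(daily_totals(SALES, reference_tz=UTC))  # one day holds everything
print(daily_totals(SALES))                    # three different days
```

Neither result is wrong; they answer different questions, which is why the choice has to come from the customer rather than from the programmer.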

### Time To Summarize

I have wandered over a lot of territory in this discussion, yet still left out all sorts of details that might be significant if you are writing a program that deals with dates and times, particularly if you are dealing with anything before a couple hundred years ago. If you find yourself in that position, or if you are simply interested in learning about gnarly details (like people starting their year on days other than January 1, or when various countries switched from Julian to Gregorian, or which countries still use other calendars), I recommend you dive in to any of the links above and start exploring.

When I first started having to deal with this stuff, there was much less support for dealing with all of these annoyances. Now, the operating systems handle DST date changes easily, the language runtimes support multiple calendars and internationalization, the databases have a datatype that includes timezones, and you can much more easily come up to speed on all these issues by using resources on the Internet. It's all so easy now! Well, maybe not so easy, but it is easier than it used to be.

You may never need to know most of these details, but at least now you know about them and hopefully will remember to review these details when you start widening the scope of your dates and times.

After all of the above, what can we briefly say about time? You could spend a lifetime learning about details, but there is not much that can easily be generalized. So what we know pretty much boils down to this: it flows, we mark and measure it in our usual human hodge-podge manner, and there is usually not enough of it.