Leap Seconds and Their Effects on Site Reliability

Jayaraj Jayaraman
2 min read · Mar 4, 2022


If you say a day is exactly 86400 seconds long, you are wrong!

Did you know that the Earth spun faster in 2020 and recorded the 28 shortest days since 1960, each lasting less than 86399.999 seconds?

Over the long term, however, the Earth’s rotation is slowing down. The main reason is tidal friction, which alone would lengthen the day by 2.3 ms/century. Individual events also play a part: the 2004 Indian Ocean earthquake is thought to have shortened the day by 2.68 microseconds.

If you are developing applications that must be very precise about dates and times, you should know about leap seconds!

A leap second is a one-second adjustment that is occasionally applied to UTC to accommodate the difference between precise time (International Atomic Time, TAI) and imprecise observed solar time (UT1), which varies due to irregularities in the Earth’s rotation.

UTC, which is widely used for international timekeeping and in computing, ticks at the same rate as TAI and would consequently run ahead of solar time unless it were adjusted as needed. Leap seconds are that adjustment: they keep UTC within 0.9 seconds of UT1.
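The relationship between TAI and UTC can be sketched as a lookup of the whole-second offset between them. This is a minimal illustration, not a complete implementation: the table below is deliberately partial (only the three most recent leap seconds), and the names `TAI_MINUS_UTC` and `tai_minus_utc` are made up for this example. A real system would load a maintained leap second list instead.

```python
from datetime import date

# Partial table (an assumption of this sketch, not a complete list):
# (first UTC day on which the offset applies, TAI - UTC in seconds).
TAI_MINUS_UTC = [
    (date(2012, 7, 1), 35),  # after the June 30, 2012 leap second
    (date(2015, 7, 1), 36),  # after the June 30, 2015 leap second
    (date(2017, 1, 1), 37),  # after the December 31, 2016 leap second
]


def tai_minus_utc(day: date) -> int:
    """Return TAI - UTC in seconds for a UTC day covered by the table."""
    offset = None
    for start, seconds in TAI_MINUS_UTC:  # entries are in ascending date order
        if day >= start:
            offset = seconds
    if offset is None:
        raise ValueError("date precedes this partial table")
    return offset


offset_2016 = tai_minus_utc(date(2016, 12, 31))  # 36
offset_2022 = tai_minus_utc(date(2022, 3, 4))    # 37
```

The table goes stale whenever a new leap second is announced, which is exactly the maintenance burden that makes leap seconds awkward for software.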

The irregularity and unpredictability of UTC leap seconds are problematic in several areas, especially computing.

Here are some issues in the computing field caused by the insertion or removal of leap seconds.

  • The elapsed time in seconds between two given UTC dates can come out wrong unless the calculation consults an up-to-date table of leap seconds.
  • Some time distribution systems announce leap seconds only at the last minute, and some do not announce them at all. Clocks that are not regularly synchronized can miss a leap second.
  • Not all clocks implement leap seconds in the same manner. Unix time commonly repeats 23:59:59 or adds a 23:59:60 timestamp. NTP freezes time during the leap second.
  • The textual representation of a leap second is defined as “23:59:60”. Not all programs recognize this format, and those that do not cannot deal with such input.
  • Linux assigns to the leap second the counter value of the preceding 23:59:59 second (59–59–0 sequence), while other OSs assign to the leap second the counter value of the next 00:00:00 second (59–0–0 sequence). There is no standard governing this sequence, which explains flaws in time-critical systems that rely on timestamped values.
  • Following the June 30, 2012 leap second, Reddit, Inc. (Apache Cassandra), Mozilla (Hadoop), Qantas, and various sites running Linux faced production issues.
  • During the December 31, 2016 leap second, Cloudflare’s DNS resolver incorrectly calculated a negative number when subtracting two timestamps obtained from Go’s time.Now() function, which at the time used only a real-time clock source.
  • The Intercontinental Exchange, parent body to 7 clearing houses and 11 stock exchanges including the NYSE, ceased operations for 61 minutes at the time of the June 30, 2015 leap second.
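Three of the pitfalls above can be reproduced with Python's standard library. The sketch below is illustrative only, and the helper names (`naive_elapsed_seconds`, `parse_utc`, `timed`) are hypothetical. It relies on two documented behaviours: the `datetime` module does not support leap seconds, and `time.monotonic()` cannot go backwards.

```python
import time
from datetime import datetime, timezone


def naive_elapsed_seconds(start: datetime, end: datetime) -> float:
    """Subtract two UTC datetimes; datetime arithmetic ignores leap seconds."""
    return (end - start).total_seconds()


def parse_utc(text: str):
    """Return a UTC datetime, or None if the text is rejected (e.g. a :60 second)."""
    try:
        parsed = datetime.strptime(text, "%Y-%m-%d %H:%M:%S")
    except ValueError:
        return None
    return parsed.replace(tzinfo=timezone.utc)


def timed(func) -> float:
    """Measure how long func() takes using a monotonic clock."""
    start = time.monotonic()
    func()
    return time.monotonic() - start


# A leap second (23:59:60) was inserted at the end of June 30, 2015.
before = datetime(2015, 6, 30, 23, 59, 59, tzinfo=timezone.utc)
after = datetime(2015, 7, 1, 0, 0, 1, tzinfo=timezone.utc)

# Reports 2.0, although three SI seconds passed (59 -> 60 -> 00 -> 01).
naive = naive_elapsed_seconds(before, after)

# The leap second itself cannot be represented as a datetime, so this is None.
leap = parse_utc("2015-06-30 23:59:60")

# Durations taken from a monotonic clock are never negative.
duration = timed(lambda: sum(range(1000)))
```

For measuring intervals, preferring a monotonic clock over wall-clock timestamps avoids the kind of negative result described in the Cloudflare bullet, regardless of how the operating system steps or repeats the leap second.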

On the solutions front, instead of inserting a leap second at the end of the day, Google servers implement a “leap smear”, extending each second slightly over a 24-hour period. Amazon released an NTP service for EC2 instances which performs leap smearing.
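As a rough sketch of the idea, a smear can be modelled as spreading the extra second evenly across the window. The linear shape, the constants, and the function names below are assumptions made for illustration. They are not taken from Google's or Amazon's implementations, which may place and shape the window differently.

```python
SMEAR_WINDOW_S = 86_400  # smeared clock seconds in the 24-hour window
LEAP_S = 1               # one inserted leap second to absorb


def smeared_second_length(window: int = SMEAR_WINDOW_S, leap: int = LEAP_S) -> float:
    """Length, in SI seconds, of one smeared clock second (linear smear)."""
    return (window + leap) / window


def absorbed_leap_fraction(si_elapsed: float,
                           window: int = SMEAR_WINDOW_S,
                           leap: int = LEAP_S) -> float:
    """Seconds of the leap second already absorbed after si_elapsed SI seconds.

    The window spans (window + leap) SI seconds; values outside it are clamped.
    """
    total = window + leap
    clamped = min(max(si_elapsed, 0.0), float(total))
    return leap * clamped / total


stretch = smeared_second_length()              # about 1.0000115741 SI seconds
halfway = absorbed_leap_fraction(43_200.5)     # 0.5 at the middle of the window
finished = absorbed_leap_fraction(86_401)      # 1.0 once the window ends
```

Under these assumptions each smeared second is only about 11.6 microseconds longer than an SI second, so clocks never show 23:59:60 and never repeat or freeze a second.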

Read more at: https://lnkd.in/guNhDswy
