The fastest “Is Leap Year” functions

Showcasing techniques from various algorithm designers

11 December 2025
Updated: 18 December 2025

Welcome back to the fast date series, where we showcase techniques to squeeze every last nanosecond out of standard date library functions.

In this article I give an overview and explanation of various functions that people have developed to calculate whether a year is a leap year. This type of calculation is a fundamental date library utility, it is used all over the place: validating and parsing dates, determining the Nth last weekday of a month, handling month overflow/underflow when adjusting the day, etc.

I will also outline my own technique that I have developed, which matches the speed of the usual recommended algorithm (Drepper-Neri-Schneider) but using a slightly different idea.

Sections:

Speed comparison at a glance

Relative Speeds of Fastest Leap Year AlgorithmsAs tested on Intel x64, Snapdragon ARM and Apple M4 Pro processors
(smaller numbers = faster)
See benchmark section
for further information.

Textbook
(Branchless)
2-6×
Drepper-Neri-Schneider
(Signed)
(2023)
Drepper-Neri-Schneider
(Unsigned)
(2023)~0.9×
Falk
Hüffner
(Restricted range)
(2025)~0.85×

Drepper-Neri-Schneider

For the longest time, leap years were determined by variations of the standard “textbook” formula:

The Textbook Formula
  1. is_leap = year % 4 == 0 && (year % 100 != 0 || year % 400 == 0)

The above is quite slow, as every call must compute two costly modulus functions.

It is important to note that this whole line often compiles to a branch-free formula, where each equality is always calculated and then boolean-compared, which, on many platforms, is faster than using branches, which can halt the pipeline when mis-predicted.

Branched versions have also been used in many libraries in the past, however with the advent of very fast leap year functions, branched versions are no longer the best choice.

The textbook formula was improved significantly by Cassio Neri and Lorenz Schneider:

Neri-Schneider (Version 1)
  1. bool is_cen = (year % 100 == 0)
  2. bool leap = is_cen ? (year % 16 == 0) : (year % 4 == 0)

The brilliant insight here is that all parts of the formula are going to run each time in a branchless version, so we might as well check for divisibility by 100 first. If it's divisible by 100 then checking for divisibility by 4 is redundant. In such century-years, the divisibility check by 16 will accurately match centuries that are a multiple of 400, but not other centuries.

This overall technique saves an "expensive" non-power-of-2 modulus call. Such modulus calls are a lot more expensive than they look. In order to compute year % 100, a compiler will, in the worst case, calculate: year - year / 100 * 100. That is, two chained expensive operations. On the other hand, % 16 is a cheap single-cycle operation.

This formula was further improved via collaboration with Ulrich Drepper:

Drepper-Neri-Schneider (version 2)
  1. bool is_cen = (year % 100 == 0)
  2. bool leap = (year & (is_cen ? 15 : 3) == 0)

Drepper's idea was to take the year and the equality check outside of the ternary. This reduces the number of steps required, as a computer would otherwise need to perform the modulus and equality check twice, and send the results into the ternary.

Note: The line: & (is_cen ? 15 : 3) is a bitwise equivalent of % (is_cen ? 16 : 4). This change was due to Neri observing that gcc was emitting an expensive idiv assembly instruction instead of a cheap bitwise operation. From my testing, MSVC also seems to compile % to more slow code than & for signed integers, so until compilers have improved, this more obscure variant will be appropriate.

Neri presented several different variants in his calendar algorithms repository , however eventually settled upon a final improvement (first described on 15 Nov 2023 in a commit to libstdc++):

Drepper-Neri-Schneider (version 3)
  1. bool is_cen = (year % 25 == 0)
  2. bool leap = (year & (is_cen ? 15 : 3) == 0)

You can see that the century line is checking for divisibility by 25 rather than by 100. The reason for this is that modern compilers are able to compile % 25 more efficiently than % 100. This will yield false-positives for the century check, but they do not matter, as they are cancelled out in the next step.

That is to say, this first step achieves the following:

  • Categorisation of all years that are divisible by 100 one way.
  • Categorisation of all other years that are divisible by 4 another way.
  • And importantly: For all other years, categorisation does not matter.

The modulus check by 25 being fast is a feature of many compilers, however MSVC still uses outdated techniques to compile this, and therefore if you want the maximum possible speed on all compilers, you will need to hand-code the specific transformation that is applied:

Drepper-Neri-Schneider (version 3 - manually unrolled)
  1. const int32 A = -1030792151
  2. const uint32 B = 85899345
  3. const uint32 C = 171798691
  4. bool is_cen = uint32(year * A + B) < C
  5. bool is_leap = (year & (is_cen ? 15 : 3)) == 0

Falk Hüffner's Algorithm

On 15 May 2025, Falk Hüffner published an incredibly brief 3-instruction function to compute the leap year check over a restricted range of years: 0 ➜ 102499:

Falk Hüffner's Algorithm (Unsigned 32-bit)
  1. bool is_leap = (uint32(year * 1073750999) & 3221352463) <= 126976

In the words of Mr Hüffner: “How does this work? The answer is surprisingly complex.”.

Interestingly, it was found by setting up the framework of the algorithm and using an automated solver to find the constants. Hüffner does a good job of describing the algorithm in his blog post .

A solution was also found in 64-bit, providing a range of years: 0 ➜ 5965232499. This conveniently exceeds the unsigned 32-bit range, so can be used as a general purpose leap year check for libraries that only deal with 32-bit positive dates.

Falk Hüffner's Algorithm (Unsigned 64-bit)
  1. const uint64 f = 4611686019114582671
  2. const uint64 m = 13835058121854156815
  3. const uint64 t = 66571993088
  4. bool is_leap = (uint64(year * f) & m) <= t

The performance gain of this algorithm is modest, so the disadvantages of obscurity and restricted range/sign mean this is not generally recommended in all cases. My testing indicates that an attempt to make this algorithm work for signed dates by using an initial constant addition removes most (if not all) of the speed gains on x64, but retains the speed on ARM.

If a library only needs to provide accurate results over the signed 16-bit range (-32,768 to 32,767) then using the 32-version with a positive offset bias is likely optimal. 82 * 400 = 32800 is an appropriate bias as it is the smallest multiple of 400 exceeding 2^15.

Such an algorithm might look like the following:

Given:   year (signed 16-bit integer)   —   Then:

Falk Hüffner's Algorithm (Signed 16-bit)
  1. // Falk Hüffner Algorithm with bias for supporting negative dates
  2. const uint32 MUL = 1073750999
  3. const uint32 CUTOFF = 126976
  4. const uint32 MASK = 3221352463
  5. const uint32 BIAS = 82 * 400
  6. uint32 low = (int32(year) + BIAS) * MUL
  7. bool is_leap = (low & MASK) <= CUTOFF

My 32-Bit Algorithm

Upon the first release of this article, I was only aware of the "version 2" of the Drepper-Neri-Schneider full range leap year algorithm, and thought that I had developed a new faster full-range one. Shortly afterwards I was informed of these other recent developments since 2023 — so, it turns out my attempt is equally as fast as the pre-existing "(version 3)" of Drepper-Neri-Schneider, and of course slower than Falk Hüffner's algorithm.

My attempt, while being obscure, is at least slightly interesting so I'll still document it here:

Given:   year (signed 32-bit integer)   —   Then:

My fuzzy-century Leap Year function (32-bit only):See in C++
  1. const uint32 CEN_MUL = (1 << 32) / 100 + 1   // 32-bit reciprocal of 100   (42,949,673)
  2. const uint32 CEN_CUTOFF = CEN_MUL * 4        // Over-estimated cutoff      (171,798,692)
  3. const uint32 CEN_BIAS = CEN_MUL / 2 * 100     // Large multiple of 100y  (2,147,483,600)
  4. uint32 low = (year + CEN_BIAS) * CEN_MUL     // Lower 32-bits
  5. bool maybe_cen = low < CEN_CUTOFF            // Year is likely divisible by 100
  6. bool is_leap = (year & (maybe_cen ? 15 : 3)) == 0
  • Note: For an unsigned version, the algorithm is the same but use CEN_BIAS = 0.

The general approach here is to accept false positives in our century check, in the same vein as the % 25 idea. Instead of false positives being equally spread out 3 times per century (year 25, 50 and 75), they are false positives around the century mark.

Using the navigation buttons and scrollbar, you can scroll through all 4-billion+ possible 32-bit input values, and see the computation yourself. The scrollbar will snap to centuries, as that is where the important computation steps are found. As you can see, each century the value of low < CEN_CUTOFF evaluates to true, and nearby false positives (“Wrong (but OK)”) are never leap years.

yearCentury?Leap?A = uint32(
year + CEN_BIAS)
(2,147,483,600)
low = u32(
 A * CEN_MUL)
(42,949,673)
low < CEN_CUTOFF
(< 171,798,692)
Correct?
i.e. is a Cent.
0005NoNo2,147,483,605300,647,709falseCorrect
0004NoLeap2,147,483,604257,698,036falseCorrect
0003NoNo2,147,483,603214,748,363falseCorrect
0002NoNo2,147,483,602171,798,690trueWrong (but OK)
0001NoNo2,147,483,601128,849,017trueWrong (but OK)
0000CenturyLeap2,147,483,60085,899,344trueCorrect
-0001NoNo2,147,483,59942,949,671trueWrong (but OK)
-0002NoNo2,147,483,5984,294,967,294falseCorrect
-0003NoNo2,147,483,5974,252,017,621falseCorrect
-0004NoLeap2,147,483,5964,209,067,948falseCorrect
-0005NoNo2,147,483,5954,166,118,275falseCorrect

The reason this algorithm works is due to 232 being very close to a multiple of 100:

2^32 / 100 = 42,949,672.96

As you can see above, the fractional part of 0.96 is only 4% away from a whole number. That “error” aligns perfectly with the required tolerance, where we can allow the century check to be accurate within ±4 years.

On the other hand, 16-bit and 64-bit do not have as nice fixed-point reciprocals, so this technique is restricted to 32-bit only:

  • 2^16 / 100 = 655.36
  • 2^64 / 100 = 184,467,440,737,095,516.16

Benchmark results

The below table shows the number of milliseconds taken to process a large number of dates.
The number of dates processed differed per platform due to memory constraints (stated per row), but the relative times ratios noted in bold are comparable across the board.

The benchmarks were run 5 times on each platform, with the median result selected for each algorithm.

Importantly: The Drepper-Neri-Schneider algorithm that is benchmarked is the manually unrolled version, given that MSVC does not compile the % 25 version efficiently. If the % 25 version is used, it seems to cause a slowdown on these Windows machines by around 15-40%.

No meaningful difference in speed was detected on Apple Silicon for Signed vs Unsigned.

Algorithm:TextbookDrepper-Neri
-Schneider
(Signed)
Drepper-Neri
-Schneider
(Unsigned)
Falk Hüffner
RangeFullFullFull0 to 102,499
Dell Inspiron 13-5378 (Windows 10 22H2)
Intel Core i3-7100 @ 2.4 GHz
Compiler: MSVC 19.44 (x64)
Iterations: 100M
4.86x
(952.009)
1.00x
(195.983)
0.88x
(171.64)
0.79x
(155.082)
Lenovo IdeaPad Slim 5 (Windows 11 24H2)
Snapdragon X126100 @ 2.956 GHz
Compiler: MSVC 19.44 (ARM64)
Iterations: 300M
3.28x
(1011.72)
1.00x
(308.811)
0.92x
(284.355)
0.89x
(275.108)
MacBook Pro 2024 (MacOS 15.6.1)
Apple M4 Pro
Compiler: Apple clang 17.0.0
Iterations: 500M
6.29x
(702.742)
1.00x
(111.675)
1.00x
(112.196)
0.86x
(95.885)
MacBook Pro 2020 (MacOS 14.3.1)
Intel Core i5 @ 2 GHz
Compiler: Apple clang 15.0.0
Iterations: 500M
2.22x
(452.112)
1.00x
(203.38)
0.94x
(190.249)
0.89x
(181.542)

The same C++ file is used for verifying the range as well as benchmarking (although it still needs an update to reflect the changes to this article).


Special thank you to Cassio Neri for providing me with a detailed account of how his algorithm was developed. This article could not have been written without his input.


In the next article, yet to be published, we will explore how the calculation of "is leap year" can be made almost for "free", when combined with other calendar related algorithms that already need to compute the number of centuries that have elapsed. The technique explored will involve analysing both the high and low 32 bits resulting from the century calculation step, with a particular focus on ordinal dates, as used by the popular "Time" Rust library. If you are particularly interested, you can preview that new algorithm here .

If you found this interesting, you should follow me on X to get notified when I publish more date and algorithm related articles.