Skip to main content

Number Theory and Modular Arithmetic

Number theory studies the properties of integers. It is the mathematical foundation of modern cryptography (RSA, Diffie-Hellman), hashing, and a large class of competitive programming problems.

Divisibility

a divides b (written a | b) means b = a × k for some integer k. Equivalently, b mod a = 0.

def divides(a, b):
    return b % a == 0

Divisors of n: all positive integers that divide n. For n = 12: 12.

Naively finding all divisors is O(n). Smarter: check only up to √n — if d divides n and d ≤ √n, then n/d also divides n.

def divisors(n):
    divs = []
    for i in range(1, int(n**0.5) + 1):
        if n % i == 0:
            divs.append(i)
            if i != n // i:
                divs.append(n // i)
    return sorted(divs)

O(√n) — a crucial optimisation.

Prime Numbers

A prime is an integer > 1 divisible only by 1 and itself. There are infinitely many primes (Euclid's proof: assume finitely many, multiply them all and add 1 — the result is divisible by none of them, contradiction).

Primality test: check divisibility by all integers up to √n.

def is_prime(n):
    if n < 2: return False
    if n < 4: return True
    if n % 2 == 0 or n % 3 == 0: return False
    i = 5
    while i * i <= n:
        if n % i == 0 or n % (i + 2) == 0:
            return False
        i += 6
    return True

O(√n).

Sieve of Eratosthenes: find all primes up to N in O(N log log N).

def sieve(n):
    is_prime = [True] * (n + 1)
    is_prime[0] = is_prime[1] = False
    for i in range(2, int(n**0.5) + 1):
        if is_prime[i]:
            for j in range(i*i, n + 1, i):
                is_prime[j] = False
    return [i for i in range(2, n + 1) if is_prime[i]]

GCD and LCM

Greatest Common Divisor (GCD): the largest integer dividing both a and b.

Euclidean algorithm: gcd(a, b) = gcd(b, a mod b), base case gcd(a, 0) = a. O(log(min(a,b))).

def gcd(a, b):
    while b:
        a, b = b, a % b
    return a

# Or:
from math import gcd

Least Common Multiple: lcm(a, b) = a × b / gcd(a, b).

Bézout's identity: there exist integers x, y such that ax + by = gcd(a, b). The extended Euclidean algorithm finds x and y — essential for computing modular inverses.

Modular Arithmetic

a mod m (or a % m) is the remainder when a is divided by m. The result is always in [0, m-1].

17 mod 5 = 2   (17 = 3×5 + 2)
-3 mod 5 = 2   (in mathematics; Python also gives 2)

Congruence: a ≡ b (mod m) means m | (a - b). They leave the same remainder when divided by m.

Arithmetic in Modular World

Addition, subtraction, and multiplication distribute over mod:

(a + b) mod m = ((a mod m) + (b mod m)) mod m
(a × b) mod m = ((a mod m) × (b mod m)) mod m

This is crucial for preventing integer overflow when computing large numbers:

MOD = 10**9 + 7   # common modulus in competitive programming

result = 1
for i in range(1, n + 1):
    result = (result * i) % MOD   # factorial mod MOD, never overflows

Modular Exponentiation

Computing aⁿ mod m naively is O(n) multiplications. Fast exponentiation (binary exponentiation) does it in O(log n):

def pow_mod(base, exp, mod):
    result = 1
    base %= mod
    while exp > 0:
        if exp % 2 == 1:
            result = (result * base) % mod
        base = (base * base) % mod
        exp //= 2
    return result

# Python has this built in:
pow(base, exp, mod)   # O(log exp)

RSA encryption and decryption are essentially modular exponentiation.

Modular Inverse

The modular inverse of a mod m is a number x such that (a × x) ≡ 1 (mod m). It is the modular analogue of division.

It exists if and only if gcd(a, m) = 1 (a and m are coprime).

Fermat's Little Theorem: if p is prime and gcd(a, p) = 1:

aᵖ⁻¹  1 (mod p)

Therefore: a⁻¹ ≡ aᵖ⁻² (mod p). Compute using fast exponentiation.

def mod_inverse(a, p):
    # p must be prime
    return pow(a, p - 2, p)

# Modular division:
# (a / b) mod p = (a × b⁻¹) mod p = (a × pow(b, p-2, p)) mod p

This enables computing combinations mod p efficiently — critical in competitive programming:

def comb_mod(n, k, p):
    if k > n: return 0
    num = 1
    den = 1
    for i in range(k):
        num = (num * (n - i)) % p
        den = (den * (i + 1)) % p
    return (num * pow(den, p - 2, p)) % p

Chinese Remainder Theorem (CRT)

Given a system of congruences:

x  a₁ (mod m₁)
x  a₂ (mod m₂)
...
x  aₖ (mod mₖ)

If the moduli m₁, m₂, ..., mₖ are pairwise coprime, there exists a unique solution x mod (m₁ × m₂ × ... × mₖ).

CRT appears in: RSA key generation, calendar calculations, competitive programming, and distributed systems (hashing across multiple tables).

Number Theory in Hashing

The choice of hash table size as a prime is not arbitrary. If m is prime, fewer keys collide: for a hash function h(k) = k mod m, if m is prime, the collision pattern has better uniformity than composite moduli.

Polynomial rolling hash (used in Rabin-Karp string matching):

def rolling_hash(s, base=31, mod=10**9 + 7):
    h = 0
    for ch in s:
        h = (h * base + ord(ch)) % mod
    return h

The modulus is typically a large prime to minimise collisions.

Key Takeaways

  • Check divisibility up to √n for O(√n) factorisation. Sieve of Eratosthenes finds all primes up to N in O(N log log N).
  • GCD via the Euclidean algorithm runs in O(log(min(a,b))).
  • Modular arithmetic: addition and multiplication distribute over mod. Use this to prevent overflow.
  • Fast exponentiation: aⁿ mod m in O(log n) using repeated squaring — the core of RSA.
  • Fermat's little theorem: a modular inverse of a (mod prime p) = aᵖ⁻² mod p.
  • Prime moduli improve hash uniformity; polynomial rolling hash uses modular arithmetic for string matching.