Welcome to your-notes.
∴
lorem ipsum dolor sit amet
— your-name
Welcome to Number Theory and Cryptography notes.
∴
I completed the course by taking detailed notes and summarizing critical concepts for future reference.
University of California San Diego
Instructor: Michael Levin
— emreaslan —
Numbers
Most of the concepts we learn in math seem pretty far removed from our everyday lives. We might think, "Who really needs to know about calculus or trigonometry?" But the truth is, even the most basic mathematical ideas have their roots in real-world problems. Let's go back to the very beginning and see how it all started...
Once upon a time, in a big, dark cave, there lived a caveman named Grog. Grog had a bunch of sheep. He wanted to know how many sheep he had, so he started counting, And just like that, he discovered counting numbers or natural numbers (). Pretty cool, huh?
But then, a sneaky caveman named Ug came along and stole some of Grog’s sheep! Ug told Grog that he needed to give him 3 more sheep. Grog was upset, but he started thinking. If he wanted to have 3 sheep left, he needed to have 6 sheep in total to give 3 to Ug. This is how Grog discovered negative numbers () and integers ()
Later, Grog had 3 delicious berries. He wanted to share them with his family of 4. But wait, he couldn’t divide 3 berries into 4 equal parts! That’s when Grog discovered rational numbers (fractions). Even though he couldn’t share them equally, he learned that sometimes, numbers can be broken into smaller pieces.
Number theory is all about studying whole numbers and how they work together. It started out with practical uses, but as it got more complex, it seemed less useful in real life.

After interesting researches about cryptography and electric fiels, number theory, even the really complicated parts, is super important for modern cryptography. And cryptography is a big deal in our lives. It's what keeps our emails, messages, online shopping, and the whole internet safe.
Divisibility
When we say that a number is divisible by number , we mean that can be divided by without leaving a remainder. In other words, is multiple of .
Example:
is divisible by because divided by equals without any remainder. is not divisible by because 15 divided by leaves a remainder of .
Formal Definition:
We can write this mathematically as:
If is divisible by , we can write as multiplied by another integer . In symbols: .
Equivalently, we can use the vertical line notation: . This means b divides a.
Example:
is divisible by because . Therefore, we can also write .
Checking Divisibility in Python:
In Python, we can use the modulo operator (%) to check if a number is divisible by another. If the remainder is , then the first number is divisible by the second.
print(15 % 3 == 0) # Output: True
print(15 % 4 == 0) # Output: False
Properties of Divisibility:
-
: If divides and , then divides .
: Since divides , there is such that . Similarly, there is such that . Then
-
: If , then for any integer , .
: Since , we have that for some . We can multiply both sides of equality by : . By defination, this means that is divisible by .
Quiz - Division by 101
Question : How many -digit non-negative numbers are there that have remainder when divided by 101? Here we assume that -digit and -digit numbers are also -digit, they just start with .
Answer :
All numbers having the remainder when divided by have the form for integer . The number given by this expression is non-negative and -digit for between and , inclusive.
These properties are essential for understanding divisibility and proving various mathematical theorems.
As you can see, divisibility is a fundamental concept in mathematics that has practical applications in various fields, from computer science to cryptography.
Remainders
In our previous exploration of integer division, we encountered situations where one integer couldn't be perfectly divided by another. This led us to the concept of divisibility. Now, let's delve deeper into the mechanics of integer division when it's not exact.
Defination
Given two integers, and (where is positive), the division with remainder of by is a pair of integers, (quotient) and (remainder), satisfying the following conditions:
Think of as a collection of objects. We want to divide these objects into groups of size . The quotient, , represents the number of complete groups we can form, while the remainder, , is the number of objects left over after forming the groups.
-
, So, and
-
, So, and
-
, So, and (in this case, a is divisible by b, ).
def divide_with_remainder(a, b):
"Divides a by b and returns the quotient and remainder."
q = a // b
r = a % b
return q, r
# Examples
a, b = 15, 4
q, r = divide_with_remainder(a, b)
print(f"{a} = {q} * {b} + {r}")
Algorithms
If we consider numbers that have the same remainder when divided by specific number, we can observe interesting patterns. For example, numbers that leave a remainder of when divided by have the form:
where is any integer. This pattern can be extended to any divisor and any remainder. For example, this pattern creates own group by given . For positive we get sequence For negative we get sequence
Overall, this is our group would be like this:

Exercise
Question : Is it true that for any four integers there are two of them whose difference is divisible by 3?
See answer ▼
Answer : A
There are only three possible remainders when dividing by .So, among four integers there are two with the same remainder when divided by . The difference of these two numbers is always divisible by .
Conclusion
Division with remainder is a fundamental concept in number theory that allows us to understand the relationships between integers and their divisibility properties.
It has applications in various areas of mathematics and computer science.
Modular Arithmetic
What is the remainder when this expression is divided by 3?
It seems like it will take some time to calculate the result of the operation. Maybe, we choose another way to solve this. But first, we must examine modular artihmetic deeply.
Definition : We say that two numbers and are congruent modulo if they have the same remainder when divided by . This donete by
In other words, a number is congruent modulo to all numbers for integer . In particular, if is remainder of when divided by then .
Key Lemmas and Properties
Addition
- If , then for any .
- If and , then .
Multiplication
- If , then for any .
- If and , then .
These lemmas essentially state that we can perform addition and multiplication on congruent numbers without changing their congruence modulo .
Example
To find the remainder of when divided by , we can replace each number with its equivalent modulo :
Then, the sum becomes:
Therefore, the remainder is 1.
Now we are ready to solve the problem that asked us to calculate the remainder of when divided by .
To compute the remainder, we can substitute each number by their remainder when divided by 3. As we have seen, this change does not affect the remainders of the results of arithmetic operations. This gives us
This simplifies computation a lot. This way, the computer does not consume much computing power.
Modular Subtraction and Division
Modular Subtraction
Modular subtraction is a straightforward operation within the realm of modular arithmetic. It follows intuitively from the properties of modular addition.
Definition: Given two integers and , and modulus , the modular subtraction of from modulo is defined as finding an integer such that:
This is equivalent to finding a number that, when added to , gives modulo .
Example:
Find such that . We can calculate directly: . Since , we have .
Modular subtraction is closed under the set of integers modulo m. That is, the result of modular subtraction is always an integer between 0 and m-1.
Modular Division
Modular division is more complex than modular subtraction and is not always defined.
Definition: Given two integers and , and modulus , the modular division of by modulo is defined as finding an integer such that:
This is often called the modular multiplicative inverse of modulo .
Not Always Defined
Unlike modular addition and subtraction, modular division is not always defined. For example, consider the equation There is no integer that satisfies this equation.
Is there a way when the division is possible and when it is not? We will figure this out soon!
The things turn out to be rather complicated. On the one hand, it is bad: it is nice when the computations and calculations are simple. On the other hand, this is good: complicated things are crucial for cryptography.
Questions
Question-1 : What is the remainder of when divided by ?
See answer ▼
Answer : 1
We can substitute each number by a congruent modulo and the remainder will remain the same. Substituting each number by or we get
Question-2 : What is the remainder of when divided by ?
See answer ▼
Answer : 4
We can substitute each number by a congruent modulo 5 and the remainder will remain the same. Substituting each number by or we get
Question-3 : What are the last two digits of the number ?
The number is huge, we do not want to compute it. Instead, we can use remainders. Note that the number consisting of the last two digits is just the remainder after the division by . Thus, we are actually interested in the remainder of the number when divided by .
Note that . This allows us to simplify the computation a lot:
Thus, the remainder of when divided by is and these are the two last digits of .
In the computation above we used that is just multiplied by itself times. We then use that in the multiplication modulo some number we can substitute the numbers by their congruent.
Note that one cannot just substitute any number in any expression by its congruent. For example, we cannot substitute by We have only proved that we can substitute numbers by their congruents in additions and multiplications, and we should be careful to use only these properties.
Question-4 : Is the number divisible by ?
To solve this problem it is enough to compute the remainder of the number after the division by : the number is divisible by if the remainder is .
But how can we compute the remainder fast? We can try the same approach we used for divisibility tests. Observe that
Now we represented our number as an arithmetic expression and we can use modular arithmetic. Note that
Thus, for any positive
Applying this to our number, we get
Now
Thus, the remainder of when divided by is and is not divisible by .
Note, that the same argument can be applied in a general setting, giving the following lemma.
Question-5 : What is the remainder of the number when divided by ?
See answer ▼
Answer : 1
we are interested in the remainder of the product of by itself times. Since congruence is preserved under multiplication, we can substitute each number by a congruent number. Note that , so we have
Question-6 : Find a remainder such that ?
We must multiply by such an integer that when we divide the resulting number by , gives the remainder.
The first positive number that gives a remainder of is . However, we cannot get by multiplying a positive number by . Let's try the number sequence starting from and increasing by . the quotient of every number divisible by in this group is our answer. The smallest correct answer for is .
Question-7 : Find a remainder such that ?
Again, try the way just before. But first of all, we can minimize this problem to
Now, We must multiply by such an integer that when we divide the resulting number by , gives the remainder. We create possible group like this: However, we cannot get by multiplying a positive number by . and are the numbers divisible by at the beginning of the group and these constitute our answers. The smallest value will be .
Of course, we can create correct answer group like this:
Greatest Common Divisor
The greatest common divisor (GCD) of two integers, and , is the largest positive integer that divides both A and B without leaving a remainder. It's a fundamental concept in number theory with applications in various fields, including cryptography and computer science.
Key Properties of the GCD
- Divisibility: Any common divisor of and must also divide their GCD.
- Uniqueness: For any given and , there exists a unique GCD.
- Non-Trivial Case: If and are not both zero, their GCD is always positive.
The Naive Approach: Inefficient Brute Force
A straightforward method to compute the GCD is to iteratively check all possible divisors from the minimum of and down to .
However, this approach becomes extremely inefficient for large numbers, as the number of iterations grows linearly with the minimum value.
You can try this python code on your computer. First try small numbers you see solution in miliseconds but if you try large numbers you can hear the sound of the computer's fans 😁
def gcd(a, b):
assert a >= 0 and b >= 0 and a + b > 0
if a == 0 or b == 0:
return max(a, b)
for d in range(min(a, b), 0, -1):
if a % d == 0 and b % d == 0:
return d
return 1
print(gcd(0, 1))
print(gcd(24, 16))
# The following call would take too long
#print(gcd(790933790547, 1849639579327))
Motivation for an Efficient Algorithm
The need for an efficient GCD algorithm arises from its applications in computationally intensive tasks. For example, in cryptography, the GCD is used in algorithms like the RSA encryption scheme. A slow GCD algorithm can significantly impact the performance of these cryptographic systems.
The Euclidean Algorithm: A More Efficient Approach
The Euclidean algorithm is a classical method for efficiently computing the greatest common divisor (GCD) of two numbers. It is based on the principle that the GCD of two numbers also divides their difference. This algorithm is not only simple but also significantly faster than the naive approach of checking all possible divisors.
In the next lesson, we will delve into the Euclidean algorithm in detail. We will explore its mathematical foundation, step-by-step procedure, and various optimizations that can be applied to further enhance its performance. Additionally, we will implement the algorithm in code and discuss its applications in different computational tasks, including cryptography and number theory.
Euclid's Algorithm
Euclid's algorithm, a cornerstone of number theory, provides an efficient method for calculating the greatest common divisor (GCD) of two integers. This ancient algorithm, attributed to the Greek mathematician Euclid, has found widespread applications in various fields, from cryptography to computer science.
The algorithm's foundation lies in the following lemma:
Euclid's Lemma: An integer divides both integers and if and only if divides and .
This lemma can be intuitively understood by considering a visual representation. If and can be divided into equal-sized groups of , then their difference, , can also be divided into the same-sized groups.
Conversely, if and can be divided into groups of , then can be constructed by combining these groups, implying that is also divisible by .
def gcd(a, b):
assert a >= 0 and b >= 0 and a + b > 0
while a > 0 and b > 0:
if a >= b:
a = a - b
else:
b = b - a
return max(a, b)
print(gcd(24, 16))
print(gcd(790933790547, 1849639579327))
# The following call would take too long
# print(gcd(790933790548, 2))
This implementation directly follows the definition of the GCD by repeatedly subtracting the smaller number from the larger one until one of them becomes zero.
However, for large numbers, this approach can be inefficient due to the potentially large number of subtractions. In this example, we see lots of subtractions.
Why the Modulo Operator % is More Efficient
A more efficient approach involves using the modulo operator (%) to find the remainder when one number is divided by another.
This is because the modulo operation directly gives us the "leftover" part after division, eliminating the need for repeated subtractions.
Consider the following revised implementation:
def gcd_efficient(a, b):
while b != 0:
a, b = b, a % b
return a
This recursive version is more concise and efficient than the iterative one. Here's a breakdown of why the modulo operator is more efficient:
-
Direct Calculation: The modulo operator calculates the remainder in a single operation, avoiding the need for multiple subtractions.
-
Reduced Iterations: By directly jumping to the remainder, the algorithm reduces the number of iterations required to find the GCD, especially for large numbers.
-
Quicker Convergence: The modulo operation ensures that the numbers involved in the calculation become smaller more rapidly, leading to faster convergence to the GCD.
To illustrate the difference in efficiency, consider the following example:
Calculating the GCD of and using repeated subtraction would involve subtracting from the larger number billions of times. However, using the modulo operator, the algorithm would quickly determine that the remainder is , and hence the GCD is .
While both the repeated subtraction and modulo-based approaches are correct, the modulo-based implementation is significantly more efficient, especially for larger numbers.
By directly calculating the remainder, the modulo operator reduces the number of iterations required and leads to a faster convergence to the GCD. Therefore, the modulo-based approach is the preferred method for implementing Euclid's algorithm in practice.
The Iterative Process
Euclid's algorithm employs a repetitive process to determine the GCD:
-
Initialization: Start with two non-negative integers, and .
-
Remainder Calculation: If a is greater than , replace a with the remainder of divided by . Otherwise, replace with the remainder of divided by .
-
Termination: Repeat step until one of the numbers becomes zero. The non-zero number at this point is the GCD of the original and .
A Step-by-Step Example
Let's illustrate Euclid's algorithm with a concrete example:
Example: Find the GCD of and .
Initial values: ,
-
Iteration: % . New values: ,
-
Iteration: % . New values: ,
-
Iteration: % . The algorithm terminates.
The GCD of and is .
The Recursive Approach
While the iterative approach is intuitive, Euclid's algorithm can also be implemented recursively:
def gcd_recursive(a, b):
if b == 0:
return a
else:
return gcd_recursive(b, a % b)
The number of iterations required by Euclid's algorithm is bounded by the logarithm of the larger input number. This makes it extremely efficient, even for very large numbers.
In practice, the algorithm typically converges much faster than this theoretical upper bound.
Applications
Euclid's algorithm has numerous applications, including:
-
Diophantine Equations: Solving equations involving integers.
-
Continued Fractions: Generating continued fraction representations of real numbers.
-
Modular Arithmetic: Computing modular inverses and solving linear congruences.
-
Cryptography: Underlying various cryptographic algorithms, such as RSA.
Conclusion
Euclid's algorithm is a fundamental algorithm in number theory, providing an efficient and elegant solution for calculating the greatest common divisor of two integers.
Its applications extend to various fields, making it an essential tool in mathematics and computer science.
Extended Euclid's Algorithm
Introduction
The Extended Euclidean Algorithm is a fundamental algorithm in number theory that not only computes the greatest common divisor (GCD) of two integers but also provides coefficients that express the GCD as a linear combination of these integers.
This additional information is invaluable in various mathematical and cryptographic applications.
The Standard Euclidean Algorithm
Before delving into the extended version, let's briefly review the standard Euclidean Algorithm. It's a method for finding the GCD of two non-negative integers.
The algorithm repeatedly applies the division algorithm to obtain a sequence of remainders until a remainder of is reached. The last non-zero remainder is the GCD.
Example:
Let's apply the Euclidean algorithm to find the GCD of and :
-
Step 1:
-
Step 2:
-
Step 3:
Since the remainder is , the GCD is the last non-zero remainder, which is .
In summary, the Euclidean algorithm is a simple yet efficient method for finding the GCD of two numbers.
It involves repeated division and remainder calculations until a remainder of is reached. The last non-zero remainder is the GCD.
Extending the Algorithm
The Extended Euclidean Algorithm builds upon the standard algorithm by introducing two auxiliary sequences of integers. These sequences, often denoted as and , are calculated at each step of the algorithm and are used to express the current remainder as a linear combination of the original two numbers.
Formal Description:
Given two non-negative integers and , the Extended Euclidean Algorithm calculates integers , , and such that:
Our main goal is to see which and value multiplication produces the number pair whose gcd we know. We are actually putting Euclid's operations in reverse order.
Example:
Given:
(as calculated using the standard Euclidean algorithm)
Goal:
Find integers and such that
Steps:
-
Step 1: So,
-
Step 2:
-
Step 3: So,
Rearranging to match the desired form:
Final Solution:
Therefore, for the equation , the values of and are:
So, can be expressed as:
The Algorithm
def extended_gcd(a, b):
assert a >= b and b >= 0 and a + b > 0
if b == 0:
d, x, y = a, 1, 0
else:
(d, p, q) = extended_gcd(b, a % b)
x = q
y = p - q * (a // b)
assert a % d == 0 and b % d == 0
assert d == a * x + b * y
return (d, x, y)
Explanation
-
Function Definition:
extended_gcd(a, b): This defines a function that takes two non-negative integers, and , as input and returns a tuple containing the greatest common divisor (GCD) d, and the coefficients and such that .
-
Preconditions:
assert a >= b and b >= 0 and a + b > 0: Ensures that the input numbers are valid for the algorithm. a must be greater than or equal to b, both must be non-negative, and their sum must be greater than .
-
Base Case:
if b == 0: d, x, y = a, 1, 0: If b is 0, the GCD is , and the coefficients are and , respectively. This is the base case of the recursion.
-
Recursive Case:
(d, p, q) = extended_gcd(b, a % b): Recursively calls the function with b and the remainder of divided by . This is the core of the Euclidean algorithm.
x = q: The coefficient is assigned the value of from the recursive call.
y = p - q * (a // b): The coefficient is calculated using the values of , , and the integer division of by . This step is crucial for expressing the GCD as a linear combination of and .
-
Postconditions:
assert a % d == 0 and b % d == 0: Verifies that the calculated GCD divides both and .
assert d == a _ x + b _ y: Ensures that the calculated GCD can be expressed as a linear combination of and using the coefficients and .
-
Return Value:
return (d, x, y): Returns a tuple containing the GCD and the coefficients and .
Applications
The Extended Euclidean Algorithm has numerous applications, including:
-
Modular Arithmetic:
-
Finding modular inverses: Given two relatively prime integers and , the algorithm can be used to find an integer such that . This is crucial in cryptography for operations like decryption.
-
Solving linear Diophantine equations: Equations of the form can be solved efficiently using the algorithm.
-
-
Cryptography:
-
RSA algorithm: The Extended Euclidean Algorithm is used to compute the private key in the RSA cryptosystem.
-
Diffie-Hellman key exchange: It is used to compute the shared secret key.
-
-
Number theory:
- Continued fractions: The algorithm can be used to find the continued fraction representation of a rational number.
Conclusion
The Extended Euclidean Algorithm is a fundamental algorithm with far-reaching implications in number theory and cryptography.
Its ability to compute the GCD and provide coefficients for a linear combination makes it an essential tool for solving various mathematical problems.
By understanding the principles behind this algorithm, one can gain a deeper appreciation for its applications in modern cryptography and computer science.
Applications
Least Common Multiple
The least common multiple () is a fundamental concept in number theory, with applications ranging from simple arithmetic to advanced algorithms.
In essence, the of two integers is the smallest positive integer that is divisible by both numbers.
This concept is crucial in various mathematical operations, especially when dealing with fractions and rational numbers.
Definition and Properties
Formally, the least common multiple of two non-zero integers, and , denoted as , is the smallest positive integer that is divisible by both and . It is undefined if either or is zero.
Key properties of the include:
-
Commutativity:
-
Associativity:
-
Distributivity:
-
Relationship with Greatest Common Divisor ():
Applications of the
The finds numerous applications in various mathematical and computational contexts:
-
Fractions: When adding or subtracting fractions, the lcm of the denominators is used to find a common denominator.
-
Modular Arithmetic: The is used to determine the least common period of periodic functions in modular arithmetic.
-
Scheduling Problems: In scheduling problems, the is used to find the earliest time at which two or more events can occur simultaneously.
-
Number Theory: The is a fundamental concept in number theory, used in various theorems and algorithms.
Algorithms for Computing the
Several algorithms can be used to compute the of two integers:
Brute Force Method
The simplest approach is to iterate through all integers from to the product of and , checking if each integer is divisible by both and . The first such integer is the .
def lcm_naive(a, b):
max_num = max(a, b)
while True:
if max_num % a == 0 and max_num % b == 0:
return max_num
max_num += 1
Using the
A more efficient method is to use the relationship between the and the :
The and of two numbers are closely related. The equation states that the product of the and of two numbers is equal to the product of the two numbers.
When we multiply the and , we are essentially "canceling out" the common factors and ending up with the product of the original numbers.
def gcd(a, b):
#Calculates the GCD of two numbers using the Euclidean algorithm.
while b != 0:
a, b = b, a % b
return a
def lcm_efficient(a, b):
#Calculates the LCM of two numbers using the GCD.
return (a * b) // gcd(a, b)
Why is this more efficient?
-
Naive approach: Simple to understand but can be very slow for large numbers, especially if the is much larger than the inputs.
-
Euclidean algorithm approach: More complex but significantly faster, especially for larger numbers. It leverages a well-established algorithm for calculating the and a direct formula to compute the .
In conclusion, while the naive approach is straightforward, the Euclidean algorithm-based approach is generally preferred for its efficiency and elegance. The relationship between and provides a powerful tool for calculating the in a computationally efficient manner.
Conclusion
The least common multiple is a fundamental concept in number theory with widespread applications. Understanding its properties and algorithms for computing it is essential for various mathematical operations and problem-solving tasks.
Diophantine Theorem
Diophantine equations, named after the ancient Greek mathematician Diophantus of Alexandria, are polynomial equations with integer coefficients that seek integer solutions.
A Simple Example: The Candy Problem
You have a $ bill and want to buy $ candies. The cashier only has $ coins. How many $ bills should you give, and how many $ coins will you receive in return?
We can represent this situation with the equation:
where is the number of $ bills and is the number of $ coins.
A solution to this equation is and , meaning you give the cashier $ bills and receive $ coins in change.
The Case of
This equation is more complex, but it still follows the same principle of seeking integer solutions. One solution is and .
However, what's intriguing is that this equation has infinitely many solutions. This means there are countless pairs of and that satisfy the equation.
The Puzzle of Solvability
Not all Diophantine equations have solutions. The equation , for instance, has no integer solutions. This begs the question: What factors determine whether a Diophantine equation has a solution?
Now we will explorer how we can solve this theorem with the help of the Euclid algorithm.
Solving Diophantine Equations with Euclid's Algorithm
The simplest form of a Diophantine equation is the linear equation:
where , , and are integers. A fundamental theorem states that this equation has a solution if and only if the greatest common divisor of and divides .
The extended Euclidean algorithm is a powerful tool for finding solutions to linear Diophantine equations. This algorithm not only computes the greatest common divisor of two integers but also finds coefficients that express the greatest common divisor as a linear combination of the two integers.
Let's we see this in example:
Example:
Do not forget! The main porpuse ise finding and .
-
Find the gcd:
-
Use the extended Euclidean algorithm:
-
Multiply by c: To get , we multiply both sides of the previous equation by 7, obtaining So, one solution is
-
Find all solutions: All solutions can be expressed as where is any integer.
Consider the equation
Here, the greatest common divisor of and is .
However, does not divide evenly. Therefore, this equation has no integer solutions.
To determine if a Diophantine equation has a solution, we find the greatest common divisor of the coefficients of the variables. If this greatest common divisor does not divide the constant term evenly, then the equation has no integer solutions.
Modular Division
Modular division is a mathematical operation that involves finding the remainder when a number is divided by another number called the modulus.
Multiple Inverse
In modular arithmetic, a modular inverse of a number a modulo is number such that:
This means that when you multiply and and then divide by , the remainder is . Think of it like finding a number that "undoes" the multiplication by a within a specific modular system.
Let's find the modular inverse of modulo . Possible modular inverses
So, the modular inverse of modulo is .
Key points to remember:
- Not every number has a modular inverse for every modulus.
- The Euclidean algorithm is a common method for finding modular inverses.
- Modular inverses are essential for solving linear congruences and other problems in modular arithmetic.
Uniqueness of Inverses
In modular arithmetic, not all numbers have a unique inverse. This means that two different numbers can have the same inverse when working with a specific modulus.
Example:
Let's consider the modulus .
The inverse of modulo is because
The inverse of modulo is also because
As you can see, both and have the same inverse () modulo .
This phenomenon occurs due to the cyclical nature of modular arithmetic, where numbers wrap around within a fixed range.
Existence of Inverses
In modular arithmetic, not all numbers have a modular inverse. The existence of an inverse depends on the relationship between the number and the modulus.
A number a has a modular inverse modulo if and only if and are coprime. This means that their greatest common divisor is .
Why is coprimality important?
When and are coprime, there exists a linear combination of and that equals . This linear combination can be used to find the modular inverse of .
Example:
Let's consider the modulus .
-
The number is coprime with because their is .
Therefore, has a modular inverse modulo .
-
The number is not coprime with because their is .
Therefore, does not have a modular inverse modulo .
Extended Euclidean Algorithm
The Extended Euclidean Algorithm is a powerful tool for calculating modular inverses. It not only finds the greatest common divisor of two numbers but also expresses the as a linear combination of them. This linear combination is crucial for determining the modular inverse.
Delving Deeper with Examples
I want to describe this just numbers and equations.
What is the ?
First we must write and in the form of the extended Euclidean algorithm as
We have reached the remainder , now let's substitute it in reverse and write it.
First we leave alone
If we take of both sides of the equation
Wait, what is that? Isn't this equation the same as our question? We clearly see that is .
In fact, we are using the extended Euclidean algorithm in all the steps in this example and the answer we have at the end is.
Integer Factorization
Number theory is not only an old and beautiful branch of mathematics, but is also (surprise!) practically useful in an everyday sense. When you pay with a credit card or connect to a website, cryptographic protocols using number-theoretic tools operate behind the scenes.
In this chapter, we discuss some of these tools.
The most popular number-theoretic cryptographic protocol, RSA (invented by Rivest, Shamir and Adleman and published around 1977) is based on our ability to generate large prime numbers and on our inability to factor large integers in a reasonable time.
We will discuss this algorithm in detail later. Currently, it is under attack because quantum computers may be able to factor large integers at some point in the future.
That said, the danger is (for now) mostly theoretical, and the RSA protocol continues to be widely used. It is an exciting time to be learning number theory, when advanced cryptographic attacks need to be deterred and many potential replacement protocols (thought to be more secure against quantum attacks) are rooted in number-theoretic concepts.
Prime Numbers
You can easily arrange balls in three rows and four columns

This worked because . In other words, because can be represented as a product of two smaller numbers, and .
The question to ask is: "How many groups of apples are possible?"
Note that we do not include trivial decompositions like (a rectangle with one row or one column).
On the other hand, does not have any non-trivial decompositions (only and ), meaning it is prime.
Definition
A prime number is a positive integer that cannot be represented as the product of two smaller positive integers.
A number is prime if it does not have positive integer divisors except for and .
Indeed, if is composite where and are non-trivial divisors, then and are both smaller than , making it impossible for either one to be equal to . And if a number has some non-trivial divisor , then for some positive integer we must have by the definition of a divisor, and neither nor can be equal to , so both must be smaller than .
Naive Approach - Trial Divison
The most straightforward method to determine if a number is prime is through trial division. We divide the number by all integers from up to its square root. If none of these divisions result in a remainder of , the number is prime.
def is_prime_naive(num):
"""Checks if a number is prime using trial division."""
if num <= 1:
return False
for i in range(2, int(num**0.5) + 1):
if num % i == 0:
return False
return True
The provided Python function, is_prime_naive, efficiently determines whether a given integer is a prime number. It first handles the base case where the number is less than or equal to , which is not prime. The function then iterates through potential divisors from up to the square root of the number.
If any divisor divides the number evenly, it's not prime and the function returns False. If the loop completes without finding a divisor, the number must be prime, and the function returns True. This optimization of checking only up to the square root is based on the mathematical property that if a number has a divisor greater than its square root, it must also have a corresponding divisor less than its square root.
Factoring
Every integer can be represented as the product of primes: there exists a positive integer and primes such that .
Existence
What happens if is prime?
In this case, and . So the phrase "product of primes" should not be taken too literally: in this case there is only one (prime) number in the "product" and the product is equal to this number. (One could even allow by considering an "empty product" with no factors that is equal to , but we will not go that far.)
Note also that we do not require the to all be distinct: for example,
Prove the Theorem
Let us look at this statement from the algorithmic point of view: given some , how can we find a prime decomposition (existence of which we want to prove)?
If is prime, we already know that this decomposition consists of only. What if is composite (not prime)?
By definition then, for some where . How can we then find the decomposition of ?
It is easy: just combine the decompositions of and , forming a long product from the two shorter ones. Both and are smaller that , so we may assume (by induction in our proof, or a recursive call in our algorithm) that for and the decompositions exist and can be found.
# Finds the minimal divisor>1 of the given integer m>1
def min_divisor(m):
for d in range(2, m + 1):
if m % d == 0:
return d
# optimization:
if d * d > m:
return m
def is_prime(m):
return m == min_divisor(m)
def factoring(m):
if is_prime(m):
return [m]
else:
divisor = min_divisor(m)
factors = factoring(m // divisor)
factors.append(divisor)
return factors
The numbers of the form were considered long ago by Pierre Fermat (1607--1665, Wikipedia), famous for the Last Fermat theorem; we will see the other result of him later in this chapter. He noticed that are all primes and conjectured that all subsequent numbers are also prime.
Only later Leonhard Euler (1707--1783,Wikipedia) discovered the factorization of in 1732, and more than a century after that was needed to find the factorization of . Now our simple program gives these factorization in seconds. (But do not forget two optimization lines, otherwise it would take much longer.)
For the next Fermat numbers one should use more advanced factorization algorithms. There is a module sympy in python that provides function factorint that factors and in minutes, but Fermat numbers grow really fast (and only few more factorizations are known, even if we use the most advanced factorization tools).
Unique Factoring
Decomposition of an integer into a product of prime factors is essentially unique. The word essentially here means that we ignore the ordering of the factors: they can be permuted in any way and we still get the same decomposition. For example,
First, let us note that we can group identical factors in the decomposition. In this we get a product of the form
Here all are different, and all are positive integers (the number of occurrences of in the factorization).
The number is called the multiplicity of a prime in . If does not appear in the decomposition, we say that the multiplicity of in equals zero. (This sounds natural since we may add fictional term in the factoring.)
Divisors and Multiplicity
Complete the statement: an integer is a divisor of an integer if and only if the multiplicity of for every prime .
The last part can be filled as follows: if the multiplicity of in does not exceed the multiplicity of in , for every prime . Indeed, imagine that is a divisor of , i.e., that for some . Then the (unique) factorization of is obtained by concatenating the factorizations of and . Therefore, the multiplicity of every prime in is the sum of its multiplicities in and , and is greater or equal than the multiplicity of in .
On the other hand, if every prime appears in the factorization of at least as many times as for , then we can add missing factors to to obtain . If is the product of these missing factors, then , so divides .
Problem
Consider all the positive divisors of . How many of them do exist (including trivial divisiors and )?
As the previous problem shows, all the divisors have the form where and . For we get , for and we get . We may list all of them systematically:
We do not need to write them all explicitly; we have only to count them. Each of five powers of (i.e., ) is combined with four powers of (i.e., ), so we get combinations that lead (as we know) to divisors.
Question : How many positive divisors has ?
See answer ▼
Answer : 3696
Question : What is the minimal positive number that has exactly divisors?
See answer ▼
Answer : 144
We must choose the smallest prime numbers and large numbers for this.
Chinese Remainder Theorem
When dividing numbers by (say) , we get remainders
this repeats all the time. In general, and have the same remainder (, their difference is divisible by ).
What if we consider the remainders for two different modulo at the same time?
First of all, we choose a pair like and for dividing a number for example .
-
The remainders modulo (i.e., and ) alternate (even values of alternate with odd values of ).
-
The remainders modulo (i.e., ) appear in a -loop ()
-
When we come to six, both cycles return to their original position (since is divisible both by and , and then the pairs start to repeat, thus forming a loop of length ).
-
All six possible combinations of remainders (two possible remainders modulo combined with three possible remainders modulo ) appear in this sequence (once per a -loop).
Let us look at the column for and : we see a cycle of length in this column formed by pairs - and, indeed, is divisible both by and , so we get again the pair . Note that not all combinations of remainders appear in the cycle: for example, pair does not appear.
Can you explain why this pair does not appear: why there are no m such that and ?
Again, the reason is simple: numbers of the form are even, while numbers of the form are odd.
To understand the behavior of remainders, it is useful to look at bigger examples, and draw the pairs of remainders on a coordinate plane. This can be done with the following code (do not worry if you do not understand the options, they are needed to make nice plots).
import matplotlib.pyplot as plt
a, b = 10, 5
n = a * b
plt.plot([i % a for i in range(n)], [i % b for i in range(n)],
color='green', linestyle='dashed', linewidth=3,
marker='o', markerfacecolor='blue', markersize=12)
plt.axis('square')
plt.xlim(-1, a)
plt.ylim(-1, b)
plt.xlabel(f'm mod {a}')
plt.ylabel(f'm mod {b}')
plt.savefig('crt-10-5.png')

In the first picture, the remainder modulo determines the remainder modulo . Indeed, if , then is divisible by , so .
A similar picture for and looks different:

Here we see that all pairs of remainders are possible. Why? This question is answered by the Chinese remainder theorem.
Theorem
Let be coprime. Then the system of equations
has a unique solution for modulo .
The reverse direction is trivial: given we can reduce modulo and modulo to obtain two equations of the above form.
The following is a general construction to find a solution to a system of congruences using the Chinese remainder theorem:
-
Compute
-
For each compute
-
For each compute using Euclid's extended algorithm ( exists since are pairwise coprime).
-
The integer is a solution to the system of congruences, and is the unique solution modulo .
When a system contains a relatively small number of congruences, an efficient process exists to apply the Chinese remainder theorem.
Solve the system of congruences
Take all the yi and construct the integer: We can prepare our situation to this: We wrote instead for inverse For find to general solve inverses of values For : We can use extended euclid algortihm find all values like this: and combine together:
Question-1 :
A box contains gold coins.
If the coins are equally divided among six friends, four coins are left over.
If the coins are equally divided among five friends, three coins are left over.
If the box holds the smallest number of coins that meets these two conditions, how many coins are left when equally divided among seven friends?
See answer ▼

If the coins are equally divided among six friends, four coins are left over.
If the coins are equally divided among five friends, three coins are left over.
If the box holds the smallest number of coins that meets these two conditions, how many coins are left when equally divided among seven friends?
Answer : 0
First of all, mathematize the question. So, We can prepare our situation For and : general equation: we found gold in our box. Now divide among seven friends Opps, zero coins left after divided our friends :)
Example
Find , if possible, such that
Solution First note that has an inverse modulo , namely . So we can write the first equivalence as . Hence, we can multiply every side of equation with inverse of . and we reached simple equation do it same process for other We can revise the question again and solve it like the previous ones. I left this to you :)
Question-1 :
What are the last two digits of ?
See answer ▼
Answer : 49
Observe that and . Then by the Chinese remainder theorem, the value is in correspondence with the solutions to the simultaneous congruences Now, Then the Chinese remainder theorem gives the value Therefore, the last two digits of are . Note that the above system of congruences is obtained for any odd exponent of , so the solution using the Chinese remainder theorem also gives that the last two digits of are for any positive odd value of .
Fast Modular Exponentiation
Our next results (Fermat's little theorem and its generalization, Euler's totient theorem) deal with modular exponentiation, i.e., with expressions of type where are integers. But before stating and proving this result, let us discuss the purely algorithmic question: how to compute reasonably fast?
How would you compute and ?
Computing this in python directly gives the following result:
print((314 ** 271) % 123)
print((314159265358 ** 2718281828) % 123456789)
# 38
# it will be crashed...
If you try this code on your computer, you can see that your computer is struggling. Because we know that pc uses which methot. This method consists of taking the exponent of the number and then taking its modulo.
The right way to compute modular exponentiation in python is to use the built-in pow method. The following code produces the result in the blink of an eye. We will learn what is going on under the hood below.
print((314 ** 271) % 123)
print((314159265358 ** 2718281828) % 123456789)
# 38
# 32073907
Problem
Compute .
Of course, starting with computing is a bad idea. Note that , and hence the answer is .
Problem
Prove that is divisible by .
We know . If we write instead of . And it is equal to zero. So, it's divisible by .
Problem
Find .
Since , answer is .
Now we can compute , , , ... sequentially, multiplying each time by and keeping only the remainder modulo . In this way we do not need to store large numbers (only numbers smaller than ). Does this approach work for our example ?
The size of numbers is no more a problem, but we need to perform multiplications --- too many. Can we do better? This is a general question for computing powers (that makes sense not only for modular computations).
Let be some number. Can you compute in less than 63 multiplications (that would be needed if we compute , , , ..., sequentially)?
For this problem the answer is especially easy to guess, since and repeated squaring helps:
In this way the exponents grow faster ( instead of at every step), and we use only multiplications.
In general, to compute for we need multiplications.
For we may multiply by . Since we have also computed previous powers of 2, we may use that and (note that ). Similar trick can be used in the general.
def fast_modular_exponentiation(b, e, m):
assert m > 0 and e >= 0
if e == 0:
return 1
if e == 1:
return b
if e % 2 == 0:
return fast_modular_exponentiation((b * b) % m, e // 2, m)
else:
return (fast_modular_exponentiation(b, e - 1, m) * b) % m
print(fast_modular_exponentiation(123456789222, 456788, 598723))
# 32073907
The program implementing this idea computes in a fraction of a second.
Problem
Prove that the number of multiplications when computing (for ) both in this program and in the method described above (using binary notation) is the same: where counting the bits and ones in we use the binary representation of .
It turns out that this is not exactly the minimal number of multiplications needed to compute . For example, for we get 6 multiplications with both our approaches, while one can compute it in 5 multiplications:
See Wikipedia article about addition chains that minimize the number of multiplications.
Fermat's Little Theorem
Introduction
Fermat's Little Theorem, a fundamental result in number theory, provides a powerful tool for understanding the behavior of integers modulo a prime number. It has far-reaching implications in various fields, including cryptography, computer science, and pure mathematics.
The Theorem
Fermat's Little Theorem states that if is a prime number and is an integer not divisible by , then:
In other words, if we raise to the power of and divide the result by , the remainder will always be .
Proof
To prove Fermat's Little Theorem, we will consider the set of integers . We will multiply each element of this set by modulo :
No two of these numbers are congruent modulo . To see why, suppose that for some . Then, would divide . Since does not divide , it must divide . However, , so this is impossible.
Therefore, the set is a rearrangement of the set modulo . Multiplying all the elements of these sets together, we get:
Simplifying, we obtain:
Since is not divisible by , we can cancel it from both sides, leaving us with:
This completes the proof of Fermat's Little Theorem.
Problems and Solutions
Problem 1:
Find the remainder when is divided by .
Solution:
Since is a prime number, we can apply Fermat's Little Theorem:
Therefore,
So, the remainder when is divided by is .
Problem 2:
Find the last digit of .
Solution:
The last digit of a number is equivalent to its remainder when divided by . So, we need to find .
We can not use Fermat's Theorem in this problem. Because is not prime. Think another way:
Note that . Therefore,
Hence, the last digit of is .
Conclusion
Fermat's Little Theorem is a powerful tool with wide-ranging applications in number theory and computer science. Its elegant proof and practical utility make it a cornerstone of modern mathematics.
As we continue to explore the depths of number theory, Fermat's Little Theorem will undoubtedly remain a valuable asset in our mathematical toolkit.
Euler's Totient Theorem
At its core lies Euler’s Totient Function, denoted , which counts the integers up to that are coprime with . This seemingly simple concept has profound implications, ranging from pure mathematics to cryptography.
Key Concepts in Euler's Totient Theorem
1. Definition of :
The totient function is defined as:
Example:
For :
Thus, .
What happens with prime numbers?
Prime numbers has only one prime factor: itself!
Example
For :
Thus, .
This is generally true of all prime numbers:
We have our first timesaver.
2. Multiplicative Property:
If and are coprime (), then:
Example:
For :
3. Prime Factorization Formula:
For :
Example:
For :
Applications of Euler's Totient Theorem
1. Primality Testing:
For a prime :
This property aids in testing whether a number is prime.
2. RSA Encryption:
RSA relies on the totient function to generate keys:
- Let , where are primes.
Example: For :
Public and private keys are derived using , ensuring secure encryption.
3. Applications in Modular Arithmetic and Prime Numbers:
a) Efficient Exponentiation:
Using the property for coprime to , large powers modulo can be computed efficiently.
Example: Calculate :
- Since , .
- Break into : .
b) Properties of Primes in Modular Arithmetic:
Prime numbers simplify modular inverses, as every nonzero element in is coprime to .
Example: Modular Inverses:
Find the modular inverse of :
- , so .
- Use powers to find : Thus, is the modular inverse of under mod .
Euler's Totient Theorem stands as a testament to the interplay between theory and application, bridging the gap between abstract mathematics and real-world cryptography. Its elegance and utility continue to inspire mathematicians and engineers alike.
History of Cryptography
Cryptography made its greatest advancements during World War I and II, driven by the need for secure communication amidst rampant espionage. Early ciphers proved insufficient, and secure communication required frequently changing shared keys—an increasingly complex task as global communication expanded. This challenge led to the creation of the RSA cryptosystem, enabling secret key exchange without fear of eavesdropping. We will also explore common vulnerabilities in RSA implementations and conduct a few targeted attacks.

After the war, cryptographic methods became crucial for global power, particularly in nuclear defense systems and computer networks. Early radar systems transmitted alerts through these networks, allowing swift decision-making. Universities soon adopted computer networks, paving the way for innovations like online commerce, messaging systems, and digital currencies. Today, secure communication methods form the backbone of technologies we rely on daily.
Alice and Bob, the traditional placeholders in cryptography, represent any two parties exchanging information securely. Eve, the eavesdropper, listens to everything sent between them.

To ensure privacy, Alice encrypts her plaintext message into ciphertext, which only Bob can decrypt. Various cipher systems enable this, balancing efficiency with security to outpace potential attackers like Eve.
Caesar Encryption
If the key is a cyclic shift of the alphabet is known as Caesar cipher. It is named after Julius Caesar, who used it for establishing secure communication. A key for such an encryption scheme can be generated using a physical device like the one shown below:

Key: Shift the alphabet by N letters to create cipher alphabet
In example on below, every letter is replaced by a letter three places further in the alphabet. Nowadays, it is not recommended to use Caesar cipher as it is too easy to crack it.
Encryption

Decryption
The reason why this cipher is easily breakable is that the space of possible keys is small. Indeed, there are only different cyclic shifts of the alphabet: the possible values of the shift are . This makes it possible for Eve to enumerate all the keys and to decode using each of them.
ciphertext = 'kyv wzmv sfozex nzqriuj aldg hlztbcp'
for shift in range(26):
key = alphabet[shift:] + alphabet[:shift]
print(decode(ciphertext))
# try and find plain text
Substitution Ciphers
Substitution cipher is one of the oldest and simplest. To use it, Alice and Bob share a private key that represents a permutation of the letters, e.g.,
jsuyfhkpicomxrqatlbvznewgd
To encode her message, Alice starts by aligning the key with the alphabet:
abcdefghijklmnopqrstuvwxyz
jsuyfhkpicomxrqatlbvznewgd
Then, she uses the resulting substitution table as follows: she replaces every letter in her message by , every by , and so on. For decoding, one uses the same substitution table with the two rows switched: is replaced by , is replaced by , and so on. It is particularly easy to implement this cipher in python:
alphabet = 'abcdefghijklmnopqrstuvwxyz'
key = 'jsuyfhkpicomxrqatlbvznewgd'
def substitute(text, substitute_what, substitute_by):
result = ''
for symbol in text.lower():
if symbol in substitute_what:
result += substitute_by[substitute_what.index(symbol)]
else:
result += symbol
return result
def encode(plaintext):
return substitute(plaintext, alphabet, key)
def decode(ciphertext):
return substitute(ciphertext, key, alphabet)
message = 'the quick brown fox jumps over the lazy dog'
code = encode(message)
print(code)
print(decode(code))
"The quick brown fox jumps over the lazy dog" is a well-known pangram, that is, a sentence containing all letters of the alphabet. For this reason, it is widely used for testing fonts.
When the key is not just a cyclic shift of the alphabet, but rather an arbitrary permutation of the alphabet, enumerating all keys is not that easy: the size of the space of all keys is
Cracking Simple Substitution
Substitution ciphers are not considered secure as they are vulnerable to frequency analysis attacks. Such attacks exploit the fact that some letter combinations appear more frequently in English texts than others: the letters are the most common; the word usually appears many times.

Did something catch your attention? We don't need to try twice. We can solve the problem simply by assuming that the letter z in the encrypted text is equal to the letter e.
One-Time Pad
This section focuses on an encryption scheme that is both invulnerable to frequency analysis and provably secure. In fact, it guarantees that an eavesdropper, Eve, gains no information from the intercepted ciphertext.
Imagine Alice needs to send a single bit of information to Bob, such as buy or sell, attack or retreat. They can agree in advance on a pair of text messages (e.g., cat and dog) to represent these options, along with their meanings. If Eve intercepts the message but doesn’t know the agreed code, the information remains completely inaccessible to her.

To explain in terms of bit operations, assume Alice’s message is a single bit ( or ), and the private key , known only to Alice and Bob, is also a single bit. The ciphertext , transmitted over a public channel, is calculated as , where represents (exclusive OR). For example, if , the ciphertext equals the message (). If , the ciphertext is the reverse of the message (). To decode, Bob simply computes .
Extending to Multi-Bit Messages
The one-time pad can be generalized to longer messages. To encode a bit string, each bit is XORed with a corresponding bit from the private key. For example, if Bob sends the bit string and the key is , the ciphertext becomes , where each plaintext bit is XORed with the respective key bit. Decoding reverses the operation: .
Limitations of a Single-Bit Key
While the one-time pad is theoretically secure, it requires keys as long as the messages themselves. Using a single-bit key repeatedly, as demonstrated, is insecure. Eve could try both possible keys ( and ) to decode the ciphertext. For example, if Bob encrypts a PIN code, Eve could easily guess it by testing both keys.
The Perfect Security of One-Time Pads
To achieve true security, Alice and Bob must use a unique, secret key for each bit of the message. For a message , they agree on a private key and compute for each bit. This approach ensures that Eve gains no information from the ciphertext. Regardless of the intercepted data, all possible original messages appear equally likely to Eve.

This scheme, known as a symmetric private-key method, is illustrated below: encoding and decoding are symmetric, and the private key is shared between Alice and Bob. Patented in 1919, the one-time pad remains a cornerstone of cryptographic history, though its practical use is often limited by the need for key management.
Many Time Pad Attack
The many-time pad attack exploits a critical vulnerability of reusing a one-time pad (OTP) key across multiple messages. While the OTP is perfectly secure when a unique key is used for each message, reusing the same key compromises security. Let’s understand why.
Suppose Alice and Bob use the same key for two distinct plaintext messages and . The ciphertexts generated are and . An eavesdropper, Eve, intercepting both ciphertexts, can compute their XOR:
This removes the key entirely, leaving Eve with , the XOR of the original plaintexts. Although Eve does not directly see or , patterns in natural language (such as common words or phrases) or predictable message structures allow her to deduce the plaintexts. This vulnerability is why reusing keys in an OTP scheme is strictly forbidden.
Stop and think! How does this differ from the original one-time pad approach? The key’s uniqueness is central to the OTP's security. Reusing keys transforms a perfectly secure system into one vulnerable to cryptanalysis.
See attack description here
Disadvantages of One-Time Pad
While the one-time pad provides perfect secrecy, it comes with significant practical limitations:
-
Key Length Requirement: The key must be at least as long as the message to ensure security. For large messages, this makes key management challenging.
-
Key Distribution: Both parties must securely exchange the key in advance. If Alice and Bob have never met or cannot share a secret key beforehand, using the OTP becomes infeasible.
RSA Cryptosystem
What is the RSA?
The RSA algorithm is an asymmetric cryptography algorithm widely regarded as the most secure encryption method. You may wonder what I mean by saying the RSA algorithm is asymmetric. It means it works on two different keys: the Public Key and the Private Key. And, as the name implies, the Public Key is distributed to everyone while the Private Key is kept private. The RSA algorithm is based on how difficult it is to factorize a large integer.
The RSA algorithm is named after those who invented it in 1978: Ron Rivest, Adi Shamir, and Leonard Adleman. (whitepaper)
The following illustration highlights how asymmetric cryptography works:
How does the RSA algorithm work?
RSA algorithm comprises four stages, and those four stages are:
-
Key generation: To generate a private key (to keep) and a public key (to share).
-
Key distribution: Flood the network with the public key.
-
Encryption: The sender encrypts the message using the receiver’s public key.
-
Decryption: The message is decrypted by the receiver using its private key.
1. Key Generation
The key generation process is as follows:
-
Select two prime numbers: Select two large prime numbers, say and .
-
Calculate : Calculate , where
-
Calculate : Calculate Euler's totient function of , where
-
Select : Select an integer such that and is co-prime to .
-
Calculate : Calculate , the modular multiplicative inverse of modulo . Calculate such that can be found using the extended euclidean algorithm.
-
Public key: The public key is .
-
Private key: The private key is .
2. Key Distribution
The public key is distributed, and the private key is kept secret.
3. Encryption
The encryption process is as follows:
-
Convert the plaintext to a number: Convert the plaintext to a number .
-
Encrypt the message: Compute the ciphertext from the plaintext using the public key , where
4. Decryption
The decryption process is as follows:
- Decrypt the message: Compute the plaintext from the ciphertext using the private key , where
Proof
Since is the inverse of modulo , there exists an integer such that . Assuming that , we have that , by Euler's Totient Theorem. Thus, In case , we know that is divisible by either or . It cannot be divisible by both, since . Assume, without loss of generality, that divides . Then, . In this case, By the argument above, . Since, , we conclude that , by the Chinese Reminder Theorem.
Example
-
Choose and .
-
Compute
-
Compute
-
Choose e such that and and are coprime. Let
-
Compute a value for d such that One solution is
-
Public key is
-
Private key is
-
The encryption of is
-
The decryption of is
Pseudocode
p = 3
q = 11
n = x * y
# n = 33.
# compute the totient, phi
int phi = (x-1)*(y-1)
# phi = 20.
int e = findCoprime(phi)
# find an 'e' which is > 1 and is a co-prime of phi.
# e = 7 satisfies the current values.
# Using the extended euclidean algorithm, find 'd' which satisfies
# this equation:
d = (1 mod (phi))/e
# d = 3 for the example values.
public_key = (e=7, n=33)
private_key = (d=3, n=33)
# Given the plaintext P=2, the ciphertext C is :
C = (2^7) % 33 = 29
# To decrypt the cypher text C:
P = (29^3) % 33 = 2
Security of RSA
The security of RSA relies on the assumption that decoding a message without the private key is computationally difficult, though no formal proof guarantees this.
A common attack involves factoring the public modulus into its prime components and . Once is factored, the private key can be computed as the modular inverse of modulo , allowing decryption of any ciphertext .
This approach is impractical because factoring large integers (e.g., digits) is computationally infeasible. RSA's security therefore relies on the hardness of factoring.
Another potential attack is solving the modular root problem, where is recovered directly from without factorization. While inefficient methods exist, no efficient solution has been discovered, making this problem another cornerstone of RSA's security.
Attacks and Vulnerabilities
RSA is considered secure. But this only means that we don't know an efficient decoding algorithm that works for all messages. In this section, we will study (and implement!) a few attacks that work well in certain special cases.
Small Space of Messages
Alice needs to send one of the following messages to Bob:
attack, or retreat.
This essentially conveys one bit of information. To simplify, Alice and Bob agree to represent these messages as:
for attack, and for retreat.
However, this approach is insecure. Why?
If Alice sends a ciphertext to Bob, an eavesdropper (Eve) can encode both and using Bob’s public key , then compare the results to . This is because cryptographic conventions often assume that the mapping of messages to integers is public. Even if Eve doesn’t initially know what the bit represents, she can intercept future ciphertexts and recognize if they are identical to previous ones.
This attack extends to cases where small integers other than and are used, or when the total number of possible messages is small. In such cases, Eve can enumerate all potential encodings and match them to the intercepted ciphertext.
Defense Mechanism
To counteract this vulnerability, the space of possible messages must be artificially expanded, making it computationally infeasible for Eve to enumerate all possibilities. For instance, Alice can append random bits to her -bit message before encryption. After decoding, Bob discards these extra random bits. This makes it practically impossible for Eve to guess among possibilities within a reasonable timeframe.
Problem Statement
Alice, a secret agent, has sent one of these messages to the center:
attack , don’t attack, or wait.
She encrypted the message using the public key , which is accessible to you. You intercepted her ciphertext and want to determine the content of her message.
You are given access to two functions:
Encrypt(message, modulo, exponent)
Encrypts a message using RSA with the provided public key, returning the ciphertext as a large integer.
DecipherSimple(ciphertext, modulo, exponent, potential_messages)
This function needs to be implemented. It should:
- Take the ciphertext, public key, and potential messages.
- Return the decrypted message as a string.
For example:
If Alice encrypts "wait" with the given key and produces the ciphertext , your implementation of DecipherSimple should return "wait" when provided with the ciphertext, public key, and the list of potential messages ["attack", "don’t attack", "wait"].
def DecipherSimple(ciphertext, modulo, exponent, potential_messages):
# Fix this implementation
if ciphertext == Encrypt(potential_messages[0], modulo, exponent):
return potential_messages[0]
return "don't know"
modulo = 101
exponent = 12
ciphertext = Encrypt("attack", modulo, exponent)
print(ciphertext)
print(DecipherSimple(ciphertext, modulo, exponent,
["attack", "don't attack", "wait"]))
Small Prime Problem
When Bob prepares his RSA keys, he generates two random prime numbers, and . Let’s assume one of these primes, say , is small (). While this is a rare case in real-world cryptography, it highlights a potential vulnerability.
Why is this insecure?
If is small, an attacker like Eve can factorize (where ) quickly. She can:
- Enumerate all primes up to .
- Check which prime divides .
Once Eve finds , she calculates . With both and known, Eve can derive the private key and decrypt any message encrypted with .
Defense Strategy
Bob should generate both and such that they are sufficiently large — e.g., -bit primes. Enumerating all primes of this size would be computationally infeasible for any attacker.
Problem Statement
Alice is using RSA encryption with a public key (modulo,exponent), where:
One of the primes, either or , is less than . You intercepted her ciphertext and need to decrypt it.
Given Functions
Decrypt(ciphertext, p, q, e)
This function decrypts the ciphertext using the private key () and the public exponent ().
DecipherSmallPrime(ciphertext, modulo, exponent)
This function must be implemented to:
- Factorize by finding a small prime .
- Use and to decrypt the ciphertext.
Implementation Steps
To solve the problem, the implementation of DecipherSmallPrime should:
-
Iterate through all primes up to .
-
For each prime , check if .
-
Once is found, compute .
-
Use , , and the provided Decrypt function to decrypt the ciphertext.
-
Return the decrypted message as a string.
def DecipherSmallPrime(ciphertext, modulo, exponent):
if modulo % 2 == 0:
small_prime = 2
big_prime = modulo // 2
return Decrypt(ciphertext, small_prime, big_prime, exponent)
return "don't know"
modulo = 101 * 18298970732541109011012304219376080251334480295537316123696052970419466495220522723330315111017831737980079504337868198011077274303193766040393009648852841770668239779097280026631944319501437547002412556176186750790476901358334138818777298389724049250700606462316428106882097210008142941838672676714188593227684360287806974345181893018133710957167334490627178666071809992955566020058374505477745993383434501768887090900283569055646901291270870833498474402084748161755197005050874785474707550376333429671113753137201128897550014524209754619355308207537703754006699795711188492048286436285518105948050401762394690148387
exponent = 239
ciphertext = Encrypt("attack", modulo, exponent)
print(ciphertext)
print(DecipherSmallPrime(ciphertext, modulo, exponent))
Small Difference of Primes
When preparing the keys, Bob generates two random prime numbers, and . They turn out to be close to each other: say, . (As with the previous example, the probability of this event is negligible.)
Defense Strategy
Bob should verify that is large enough after generating the primes. If the difference is small, he should regenerate the primes to avoid this vulnerability.
Problem Statement
Alice is using RSA encryption with a public key (modulo,exponent) such that with , and you know about it. You want to break the cipher and decrypt her message.
Given Functions
Decrypt(ciphertext,p,q,e)
which decrypts the ciphertext given the private key and the public exponent . You also have access to the function which takes integer and returns the largest integer such that . You are also given the function
DecipherSmallDiff(ciphertext,modulo,exponent)
and you need to fix its implementation so that it can decipher the ciphertext in case when the difference between prime factors of the public modulo is smaller than .
def DecipherSmallDiff(ciphertext, modulo, exponent):
small_prime = IntSqrt(modulo)
big_prime = modulo // small_prime
return Decrypt(ciphertext, small_prime, big_prime, exponent)
p = 1000000007
q = 1000000009
n = p * q
e = 239
ciphertext = Encrypt("attack", n, e)
message = DecipherSmallDiff(ciphertext, n, e)
print(ciphertext)
print(message)def DecipherSmallDiff(ciphertext, modulo, exponent):
small_prime = IntSqrt(modulo)
big_prime = modulo // small_prime
return Decrypt(ciphertext, small_prime, big_prime, exponent)
p = 1000000007
q = 1000000009
n = p * q
e = 239
ciphertext = Encrypt("attack", n, e)
message = DecipherSmallDiff(ciphertext, n, e)
print(ciphertext)
print(message)
Insufficient Randomness
Every day, Bob generates two fresh random prime numbers and and then publishes as part of the public key. To do this, Bob uses a random number generator. Assume that for some two days, the same prime was used: and .
Problem Statement
Alice and Angelina generated their RSA keys using insufficient randomness, which caused the same prime to be used in their public keys:
If the public modulo and share a common prime factor , we can efficiently compute and use it to decrypt both ciphertexts.
DecipherCommonDivisor(first_ciphertext, first_modulo, first_exponent, second_ciphertext, second_modulo, second_exponent)
Steps:
-
Compute .
-
Calculate and .
-
Decrypt both ciphertexts using the provided Decrypt function:
- Decrypt the first ciphertext using and .
- Decrypt the second ciphertext using and .
-
Return the two decrypted messages.
def GCD(a, b):
if b == 0:
return a
return GCD(b, a % b)
def DecipherCommonDivisor(first_ciphertext, first_modulo, first_exponent,
second_ciphertext, second_modulo, second_exponent):
# Fix this implementation to correctly decipher both messages in case
# first_modulo and second_modulo share a prime factor, and return
# a pair (first_message, second_message). The implementation below won't work
# if the common_prime is bigger than 1000000.
for common_prime in range(2, 1000000):
if first_modulo % common_prime == 0 and second_modulo % common_prime == 0:
q1 = first_modulo // common_prime
q2 = second_modulo // common_prime
return (Decrypt(first_ciphertext, common_prime, q1, first_exponent),
Decrypt(second_ciphertext, common_prime, q2, second_exponent))
return ("unknown message 1", "unknown message 2")
# Example usage with common prime p and different second primes q1 and q2
p = 101
q1 = 18298970732541109011012304219376080251334480295537316123696052970419466495220522723330315111017831737980079504337868198011077274303193766040393009648852841770668239779097280026631944319501437547002412556176186750790476901358334138818777298389724049250700606462316428106882097210008142941838672676714188593227684360287806974345181893018133710957167334490627178666071809992955566020058374505477745993383434501768887090900283569055646901291270870833498474402084748161755197005050874785474707550376333429671113753137201128897550014524209754619355308207537703754006699795711188492048286436285518105948050401762394690148387
q2 = 1000000007
first_modulo = p * q1
second_modulo = p * q2
first_exponent = 239
second_exponent = 17
first_ciphertext = Encrypt("attack", first_modulo, first_exponent)
second_ciphertext = Encrypt("wait", second_modulo, second_exponent)
print(DecipherCommonDivisor(first_ciphertext, first_modulo, first_exponent,
second_ciphertext, second_modulo, second_exponent))
Randomness Generation
Randomness plays a crucial role in cryptography and number theory, especially for creating secure systems. Random numbers are used for tasks like key generation, nonces, and encryption schemes. However, generating true randomness is not always easy, so we rely on pseudorandom number generators (PRNGs).
What is Randomness?
Randomness means unpredictability. A random process produces outcomes that cannot be easily guessed. For example, flipping a coin or rolling a dice generates random results.
Random Seed
A random seed is the initial value used to start a PRNG. It acts like the "starting point" for generating a sequence of numbers. If the same seed is used, the PRNG will produce the same sequence. This is useful for reproducibility, such as debugging software.
Example:
- Seed =
- PRNG output:
If we use Seed = again, we’ll get the same output: .
Pseudorandom Number Generator (PRNG)
A PRNG is an algorithm that creates a sequence of numbers that seem random but are determined by the seed. These numbers are pseudorandom because they are predictable if the algorithm and seed are known. PRNGs are fast and widely used in cryptographic systems, but they are not suitable for highly secure applications unless enhanced.
Example:
- PRNG generates numbers for password generators, simulations, or video games.
True Randomness vs. Pseudorandomness
True randomness comes from physical processes, like radioactive decay or thermal noise. These sources are unpredictable and provide higher security. However, they are slower and harder to implement compared to PRNGs.
In summary, randomness generation, whether true or pseudorandom, is essential for creating secure cryptographic systems. Understanding the basics of seeds and PRNGs helps us design better algorithms for encryption and security.