Basic Number Theory
Primality Test:
Composite Numbers: It is a positive integer that has at least one divisor other than 1 and itself e.g. 10 is composite with factors 1,2,5 and 10.
Prime Numbers: It is a number which is only divisible by two numbers; 1 and itself e.g. 7.
A lot of questions require the knowledge of finding out whether a given number is prime or not. This blog will cover different methods of checking and finding prime numbers.
Naive Approach:
By definition, a prime number is only divisible by 1 and itself. So the brute force or naive approach to finding out the primality of a number n could be to check all numbers from 1 to n and find out if n is their multiple or not.
In the code snippet, the loop runs from 2 to n-1 and each time it checks whether n is divisible by i or not. is_prime variable assumes n to be prime at the beginning and if any such divisor of n is found, its value becomes false and the loop terminates. The time complexity of this approach is O(n). This means that we need to perform at max n operations to check whether n is prime or not.
Optimization:
It can be observed that if n has divisors p and q such that n = p*q, then either p or q should be less than or equal to the square root of n. This observation reduces the search space. Now we can run the loop from 2 to sqrt(n) and it will be sufficient to check the primality. One more optimization can be to check if n is even and not equal to 2, then it is always composite. And if it is odd, we need not check the even numbers between 2 and sqrt(n).
Now, the time complexity improves to O(n). Hence, we can now check the primality of n with fewer operations.
Sieve of Eratosthenes:
Some problems require us to verify the primality of a lot of numbers and sometimes, we might have to check the primality of the same number again and again. In those cases, the upper mentioned approaches fail to get the green tick.
Sieve of Eratosthenes helps us to find the prime numbers in the range 1 to n, both inclusive. We pre-compute the prime numbers which then can be used to check if a number (lying between 1 and n) is prime or not in O(1) time complexity.
The basic idea behind sieve is we pick first assume all the numbers between 1 and n to be prime. Then we pick a prime number and mark all its multiples as composite. We then find the next prime number and mark its multiples as composite. Continuing this process, we eventually get the list of primes and composites between 1 and n.
Pseudo-code:
- Declare a boolean array of size n+1, starting from 0 up to n. Mark all the indices as true i.e. assuming each of them to be prime.
- Mark index 0 and 1 as false since they are not prime numbers.
- Start a loop from 2 to sqrt(n). If the number is prime, mark all its multiples as false. We are traversing up to sqrt(n) only. [ I explained before that m = p*q implies that either p or q is less than sqrt(m). So, if a prime factor of m exists then we will find it by searching only first sqrt(m) numbers. ]
- Numbers that are still marked as true after the execution of loop have no divisors and are hence prime.
Time Complexity: Observe the code. For each prime number p, the inner loop runs for n/p times. For p = 2, inner loop runs n/2 times. For p=3, n/3 times. And so on.
Therefore, time complexity = N*(½ + ⅓ + ⅕ + .. ) = O(N*log(log(N))).
Sieve for Prime Factorization:
Suppose you have to find the prime factorization of a number. You can use a for loop and repeated division by a prime divisor to find the answer. In case you need to find the prime factorization of many numbers, the sieve can come in handy. One just needs to store the smallest prime factor of each number which can be used to find the prime factorization. Let’s take an example. For n = 10, the array will store:
Now, the prime factorization of any number between up to 10 can be found in log (n) time.
For n = 8, value is 2. Store it in a list. Go to index 4 (8/2). Here again, we find value 2. Store it too. Go to index 2 (4/2). We get 2 again. Store it in the list. Now go to index 1 (2/2). Here the value is 1. So the prime factorization process is over. The numbers stored in the list will give the answer i.e. 8 = 2*2*2.
This approach can be used only when pre-computing the min_prime_factor array is feasible. Otherwise, the precomputation cost will be very high. N should be of the order 106.
Primes in a range l to r:
Sometimes, we have to find primes in the range l to r where both l and r are very large numbers but the range- r-l+1 is of the order 106. Then we cannot use the above-mentioned technique of finding primes from 1 to r. We have to check the numbers only in the range- r-l+1. Implementation can be:
Time complexity is of the order sqrt(r).
This is the first part of the blogs on number theory. There are many more topics that will be discussed in further blogs. I hope you will find this helpful. Keep Coding! :)
Written by Vinay Aggarwal.