The ElGamal Cryptosystem. From Root to the Leaves

Hussain Safwan
The Startup
Published in
6 min readSep 29, 2020

--

Photo by Zhen Hu on Unsplash

Over the years, we’ve all grown up and so have Alice and Bob. Now they need more sophisticated and reliable cryptographic standards. That’s where the idea of public key encryption pops in. Public key encryption, in the generic sense, is a scheme that employs one globally shareable public key and an individual private key from each side. The message can be decrypted via a proper combination of the keys on each side. A classic instance of such a scheme is the ElGamal cryptosystem, which we dissect in this episode.

Okay, to study the ElGamal algorithm, we need to have a reasonable grasp over a few (notorious, that’s how I would describe) concepts of number theory. Now most of these elementary notions are well scattered around the internet and it’s a pain going to and fro, following every other reference, so I’ve done my best to put them under a single title. Note that it’s root level tutorial, so even if you’re not familiar with any of the following, you should still be good to go. What we’ll cover in this episode are primitive roots, discrete logarithm, cyclic fields, the robustness of ElGamal, the algorithm, and finally a small work-out. And as you’ve guessed gonna be a long journey, just bear with me.

Elementary number theory

Primitive roots: x is a primitive root of n if, for every y relatively prime to n, there is a z such that,

For example, 2 is a primitive root of 5 while 4 isn’t. Because there are numbers z to which 2 can be mod-exponentiated to produce all the number s relatively prime to 5 (like 2⁰ = 1, 1 mod 5 = 1; 2¹ = 2, 2 mod 5 = 2 and so on) but in case of 4 there isn’t enough such exponents to produce all the relatively primes of 5 (4⁰ = 1, 1mod5 = 1; 4¹ = 4, 4mod5 = 4; 4² = 16, 16mod5 = 1, see the cycle? 2, 3 can never be produced). Head on to this for more details.

Discrete Logarithm: I’ll begin with an example since a formal definition could a bit frightening. Think of a group like Z(23) (z subscript 23, not familiar with groups? Think of it as the set {1, 2, 3 …, 23}). What would be the value of 3⁴ in this arithmetic? 3⁴ mod 23 = 81 mod 23 = 12, simple enough. The problem of discrete log is the very inverse of this problem, where you’re asked to find k given,

From the above context, 4 is an obvious solution, however not the only. 26, 48 and to be precise any number of the form 4+22n (n = 0, 1, 2, …) satisfies the equation. So informally discrete log is the inverse of modular exponentiation and the term discrete is to denote the one of interest out of infinitely many solutions. For more reading, please navigate to these two links. Here and here too

Cyclic Groups: These are the groups where each member may be procured applying the same operation on the generator or its inverse. For example, G is a group closed under multiplication and x is a member of G. Now the smallest subgroup g of G, that contains the member x, is a cyclic group. How? Consider what can g be like to guarantee the existence of x within itself. It must contain x(trivial), its inverse 1/x (since closed under multiplication), identity 1(same reason) and all derived from x and 1/x such as x², x³, … and 1/x², 1/x³,…

g = { …, 1/x³, 1/x², 1/x, 1, x, x², x³, … }

Note that each member is procured applying the same operation (multiplication) to either x(generator) or 1/x(inverse of the generator). Head over here for more.

Okay, how to put all these together? The knowledge gained from above (if any, at all) can be handy in understanding the intuition behind the scheme and how it is resistant to attacks.

The algorithm

Back to our old Alice and Bob. Bob initiates the process. He chooses a prime m and generates a field F(m). Then he picks the numbers a and g where a should be,

a is the private and he computes the public key h as

Now he sends the tuple <F(m), h, m, g> to Alice.

Alice generates her own private key k,

and the public key p as

and the multiplier s as

The final task for Alice is to encrypt the message M by multiplying with s and send the tuple <p, E> to Bob; E = M . s.

Bob on this side will simply multiply p with g^a to get g^ka = s and dividing the encrypted message E(=Ms) by s produces the plaintext M. Now I know this might seem a little hazy, solving a real-world problem is just what you need. To get a more organized form of the algorithm, take a sneak peek here.

Work-out problem

To begin with, Bob needs a prime generator to build a cyclic field. Let that be m=11 and the associated field be F(11). As per algorithm, he selects a=3(gcd(3, 11) = 1) and g=7. Now he calculates the public key to be,

h = 7³ mod 11 = 2

and sends the tuple <F(11), 2, 11, 7>. Alice, on the other end, picks her private key k = 5 (gcd(5, 11 = 1) and generates the public key,

p = 7⁵ mod 11 = 10

and the multiplier,

s = 2⁵ mod 11 = 10

Now, say she happens to start the conversion saying “hello”. All she does is ASCII down the message as [104, 101, 108, 108, 111] and multiplies the factor s to each character and produces the array

E = [1040, 1010, 1080, 1080, 1110] and sends the tuple <10, E>

Note that none of the values can be represented by a character, as they’re way greater than 256, but that doesn’t matter as network cables carry only the binary information in form of electric signals.

Bob receives the token and quickly computes

s=p^a mod 11=10³ mod 11=10.

Note that he receives the exact same multiplier as Alice did (=10) and divides each number of E with that to extract the plaintext.

Intuition and Robustness

The algorithm is essentially all about finding discrete log k in a cyclic field generated by p. So every attack boils down to this phenomenon. Information of interest here to the attacker is the value of g and g^a(=h). The problem is to solve b = g^a (mod m) for a. In brute force, the attacker loops through all possible values of a (0 to m) and checks if g^a equals to b. This is linear in m, ie. O(m). m however contains l=log(m) bits so the actual complexity is O(2^l) plus the exponentiation time which is of exponential order. Hence, a tough job! So the size of the generator is taken to be large enough to resist the brute force attacks.

--

--