What Is Hash-Based Message Authentication?

Hussain Safwan
The Startup
Published in
5 min readOct 9, 2020

--

So you’re reading this, but how confident are you that this is the same text from the author stated above and hasn’t been compromised to something else on the way (like I was writing on Data Science and the adversary altered it to a text on cryptography)? Well, networks are quite precarious in nature and such an incidence shouldn’t be surprising at all. In network communications, the integrity of the message to be delivered is just as important as secrecy and other privacy requirements are.

By integrity, we mean that a message (like this article is a message from me to my audience) should arrive at the receiver’s end just as it left the sender’s, i.e. without any alterations or at the least the fact that it has been altered should be confirmable. The central concept here is to tag the message with an authenticator so that the receiver has a clue of what the original message might have been. Now, there are quite a few algorithms to produce and attach the authenticator, the one we are particularly interested in this episode is the hMAC algorithm that stands for hash-based message authentication code. To go for hMAC, we need to understand the rudimentary one, the MAC algorithm. This should serve as a nice gateway into the realm of message authentication.

Message Authentication Codes

The MAC is a string of fixed size generated by an algorithm that takes the message and a secret key as input. The string, also known as the tag or the digest, is then concatenated to the message itself, and the compound is passed through the unsafe network. On the receiver’s end, the message is extracted and input into the same algorithm provided the receiver has access to the secret key, the same digest should be reproduced which is then compared to the incoming one — if matched the message is authentic else fraudulent. Some of the characteristics of MAC are,

  1. The same message with the same key produces the same MAC
  2. MAC is a many-to-one mapping, denoting multiple messages may result in non-unique MACs
  3. MACs are evenly distributed and ideally depend on every bit of the input.

Cipher MACs

MACs are a way to secure the integrity of the message and obviously can’t provide the secrecy alone. So the use if MAC with an encryption scheme is the best way to provide privacy and authenticity of the communication at the same time. Below is a diagram to illustrate the procedure, press on the image to zoom in,

The procedure is known as cipher tied MAC as the MAC is produced from the cipher. There is another procedure called plaintext tied MAC where the encryption is done after the message-MAC concatenation.

With that said, I guess we’re ready to take the big bite and discuss about hMACs.

Hash-based MACs

This class of MAC tags is produced via algorithms that have an underlying hash function. Any well-known hash function may be used for example SHA, MD, Whirlpool, etc. The mathematical definition of hMAC is,

The algorithm takes in the secret key and the message (plain or encrypted) as input, pretty usual. The size of the key is to be adjusted to the block size of the underlying hash function, denoted by H. Like, for instance, the block size of SHA-256 is 256B. The adjustment is done by hashing the key down to the required size if exceeded. K’ here is the length-adjusted key. The opad and ipad are the outer and inner padding constants. The values are strings of hex 0x36 and 0x5c repeated to match the block size, i.e. for a block size of 4, the value of opad is 0x360x360x360x36. The algorithmic pseudocode would be like,

Since the hash function is anything like SHA or MD, it’s quite arduous to actually work an example out — sorry for that.

Attacks and Robustness

Brute force attack on the MAC algorithm can be carried out on two lines, one is to linearly search the keyspace, which for a k-length key should be 2^k. The other is to search the entire tag space to generate a valid tag for the message m, the effort for which is 2^n. So the overall effort is the minimum of the two, i.e. min(2^k, 2^n).

Similar goes for hMAC. For an algorithm employing n-bit key, it takes 2^n keyspace to be searched, which makes brute force infeasible even for a short lengthed key.

Before I end this episode, I’d like to share a brute-force scenario that I found at the crypto exchange site. Say we’re trynna break an hMAC-MD5 that employs a 128-bit key, which raises the keyspace to 2¹²⁷. Now to brute-force this algorithm we use the current fastest GPU, actually an amalgam of 25 internal GPUs, that breaks 400 billion (4 x 10¹¹) keys per second. Say due to a super boosted budget we manage 1 million of these servers which raises the computation power to a staggering 4 x 10¹⁷ keys/second. Impressive, no?

Well, not exactly, even at this rate it would take 1.35 x 10¹³ years! Fun fact: The sun turns into a red giant in about 5 x 10⁹ years!

ENJOY THE FIREWORKS!

--

--