Tag Archives: cryptography

HTTPS By Default

There was a time, not too long ago, that I used to dream of running a website that had HTTPS written in the address bar. Most big websites had it, but very few little ones and personal blogs did. I couldn’t because it required a lot of money (for a student, at least). Today, thanks to the efforts by Let’s Encrypt, Cloudflare and many other organizations which gave away basic SSL certificates for free, all of the domains I own are on HTTPS. But that’s not even the best part.

The best part is that HTTPS is now seen less as a luxury by small website owners and more of a necessity. Part of the reason for this is Google’s penalty in terms of search rankings for websites without HTTPS. The other part, and this is more important for the little non-business blog owners who do not care much about traffic and SEO, Google Chrome (>= 68) has started displaying a Not Secure for non-HTTPS websites. I would say that Google is doing a great job at providing people an incentive to switch. I hope Mozilla Firefox follows in Chrome’s footsteps in this regard.

non-https website on Google Chrome

non-https website on Mozilla Firefox

On similar lines, Github now only shows the entire web address (along with the protocol) if it is on HTTP. If the link posted is HTTPS, Github will truncate the protocol part and only show the domain name.

For example,


will be rendered as




(notice https) will be rendered as


If you notice, this is opposite of what happens in browsers today, http is truncated and https is emphasized.

I personally hope more and more website and browsers treat HTTPS as the default and HTTP as the exception, and not the other way round. The merits of using HTTP over HTTPS are either obsolete or negligible (read more about HTTPS here: ELI5 – How HTTPS Works). On the other hand, if done properly, HTTPS can be much faster than HTTP.

We already know that the more people using encryption, greater will be the overall value of using it. I’ll end here with a quote by Bruce Schneier, a cryptology badas- expert.

Encryption should be enabled for everything by default, not a feature you turn on only if you’re doing something you consider worth protecting.

This is important. If we only use encryption when we’re working with important data, then encryption signals that data’s importance. If only dissidents use encryption in a country, that country’s authorities have an easy way of identifying them. But if everyone uses it all of the time, encryption ceases to be a signal. No one can distinguish simple chatting from deeply private conversation. The government can’t tell the dissidents from the rest of the population. Every time you use encryption, you’re protecting someone who needs to use it to stay alive.


Thank you for reading.

ELI5 – How HTTPS Works

Let’s start with some basics. Just like when you want to talk to another person, you talk in a language that both of you understand, every system that wants to talk to another system needs to talk in a commonly understood language. Technically, we call it a protocol. HTTPS is a protocol, and so is English. Every protocol is designed with some goals in mind. For real-world languages, the goals are simple. They are usually communication, literature and so on. With computer protocols, the goals have to be more stringent. Usually, different computer protocols have very different purposes. For example, File Transfer Protocol (FTP) was (and still is) widely used for transferring files, Secure Shell (SSH) is used for remote administration and so on.

Note that we’re only talking about application layer protocols in the Internet Protocol Suite. Once the appropriate protocol in the application layer creates a packet for transmission, this is encapsulated in many coverings, one by one, by all the layers beneath it. Each layer attaches its own header to the message, which then becomes the message for the next layer to attach its header on. A reverse of this process happens on the recipient’s end. It is easier to imagine this process as peeling of layers of an onion.

So having that set, we’ll start out discussion about HTTPS. HTTPS, or HTTP Secure, is an application layer protocol that provides HTTP traffic encryption using TLS (Transport Layer Security) or its predecessor, SSL. The underlying application doesn’t have to worry about HTTP or HTTPS, and once the initial handshake is done, for the most part, it is just an HTTP connection, one that is over a secure tunnel. I’ve been a frontend engineer and I’ve never written any specific HTTPS code, ever. That’s the magic of TLS.

What’s TLS?

So HTTP that is encrypted using TLS is HTTPS. Got it. But what about TLS then? For starters, TLS is a hybrid cryptosystem. It uses multiple cryptographic primitives underneath its hood to achieve its goals.

Aside on cryptographic primitives: Cryptographic primitives, like symmetric encryption, block ciphers and so on are designed by experts who know what they’re doing. The role of protocol implementers is to take these primitives and combine them in useful ways to accomplish certain goals.

TLS uses symmetric key encryption, asymmetric key encryption, and (sometimes) message authentication code to establish an encrypted bidirectional data tunnel and transfer encrypted bits. We’ll try to explore how each primitive is used to attain some goal in a bit. With these primitives, particularly with public key infrastructure (PKI), TLS establishes the identity of one or both the parties involved in a communication (your browser and the web server in most cases). Then, a key is derived at both the ends using another primitive called Diffie Hellman or RSA which are asymmetric key crypto algorithms. Once the key is derived, this key can be used as the session key to be used in symmetric key algorithms like AES. If an authenticated encryption mode is not used (such as GCM), then a MAC algorithm might also be needed (such as HMAC). Also, a hashing algorithm (such as SHA256) is used to authenticate the initial handshake (and as a PRF if HMAC is used). Let’s try to follow a typical HTTPS handshake and see what we learn during it.

In the beginning…

In the beginning, there was no connection. You open your browser and type in nagekar.com. The following things will happen in that order, more or less.

  • Your browser send a DNS resolution request for nagekar.com.
  • Your router (or any DNS resolution service) will provide you with the IP address of the host
  • Now the three way TCP handshake that we studied in our networking classes happen (SYN -> SYN/ACK -> ACK).
  • After establishing a TCP connection, your browser makes a request to for host nagekar.com. The server responds with a 301 Moved Permanently as my website is only accessible over HTTPS and with the WWW subdomain.
  • Now starts the TLS handshake. First client sends a client hello. It contains the following important pieces of data:
    • A random 28 byte string (later used for establishing session key).
    • Session ID (used for resuming a previously established session and avoiding the entire handshake altogether, 0 here because no previous sessions found).
    • Cipher suites supported by the client in order of preference.
    • Server name (this enables the server to identify which site’s certificate to offer to the client in case multiple websites are hosted from a single IP address, as in the case with most small/medium websites).
  • Then server sends a server hello which has the following important pieces of data:
    • Another random 28 byte string (later used for establishing session key)
    • Cipher suite selected by server (in our case, the server selected TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256 which was our most preferred cipher suite)
  • At this point, both client and server have the necessary information to establish an encrypted tunnel, but one important detail is missing. No party has verified the identity of the other party (and if not done, it really defeats the purpose of whatever came before this since an active man-in-the-middle adversary could easily break this scheme). This is done in the certificate message. In most cases, only the client will verify the identity of the server. Here’s how it looks like:
  • And this is exactly what you see when you click on the green padlock icon in your address bar and try to see more information about the certificate offered by the website.
  • At this point, the server hello is done. It is indicated in the message that the server won’t be asking the client for a certificate.
  • The server sends one half of the Diffie Hellman key in a separate Server Key Exchange message. Following this, the client sends other half of the Diffie Hellman key exchange. After that, the client sends a Change Cipher Spec message which means any subsequent message from the client will be encrypted with the schemes just negotiated. Lastly, the client sends the first encrypted message, an encrypted handshake.
  • On similar lines, server issues the client a Session Ticket which the client can then use to resume connections and not go through the entire Diffie Hellman procedure again (although it is valid only for 18 hours in our case). The server sends a Change Cipher Spec message, indicating that no more plaintext messages will be sent by the server. Lastly, the server sends its first encrypted message, an encrypted handshake, just like the client.
  • That’s it. We have established a secure connection to a computer on the other side of the planet and verified its identity. Magic!

Crypto Primitives

Let’s discuss what goal of cryptography is achieved by what part of this entire handshake. Remember the cipher suite that the server choose? It was TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256.

ECDHE: ephemeral Elliptical Curve Diffie Hellman, as we saw, is used to establish a shared secret session key from the random values our client and the server exchanged (over an insecure channel). It is a key exchange crypto.

ECDSA: Elliptical Curve Digital Signature Algorithm is used to verify the public key supplied by the server, in our case nagekar.com, issued for Cloudflare by COMODO.

AES 128 with GCM: AES is a block cipher. Being a symmetric key encryption algorithm, it is much faster than the asymmetric key ones, and hence used for encryption of all the data after the initial handshake is done. 128 is the size of the key in bits, which is sufficiently secure. GCM stands for Galois/Counter Mode, which is an encryption mode used with AES to provide authentication and integrity along with the regular confidentiality.

SHA256: Since we’re using AES with GCM, we won’t be using this hash function for message authentication. However, since TLS 1.2, SHA256 is used as a PRF. It will also be used to verify that all content exchanged during the handshake were not tampered with.

Security Considerations

About trust: As you might have noticed, all the above steps were essentially so that two random computers can come up with a shared secret session key. The other part to this is Certificate Authorities. Why did we trust the certificate that the server sent us? Because it was signed by someone, whom we trusted. At the end of it all, you still have to implicitly trust someone to verify the identity. In this case, we’re trusting COMODO to have only signed one certificate for the domain in question.

About browser and updates: If you see the version of TLS that we used, it is 1.2 which is not the latest. The cipher suite is also not the best we could’ve got. Why did that happen? Simple, because we were using an outdated browser which didn’t support the strongest cipher suites and the latest version of TLS. Since that was a test machine, it doesn’t matter a lot. On any up to date browser, this is what you should see.

About cryptographic primitives: We saw some of the most understood crypto primitives being used in the handshake. This is a pattern you’ll see often while reading about cryptology. It is a sin to implement your own crypto, especially the primitives. Use a library that implements these primitives, or better yet, the entire cryptosystem.

About mathematics: The reason that we think the above scheme is secure, that no data is leaked even though key was generated using the information sent in clear, is because the basis of some of these cryptographic primitives are hard problems in mathematics. For example, since mathematicians believe that discrete logarithms are easy to verify but are hard to calculate the other way, we say that Diffie Hellman (which makes use of discrete logarithms) is secure. Similarly with RSA, mathematicians believe that factoring large prime numbers is a hard problem, hence RSA is considered secure as long as the numbers are large enough. Of course, not always is a mathematical proof available. For example, AES is considered secure, but there’s not proof that it is secure. We think it must be secure because the brightest minds in cryptology have spent thousands of man hours trying to break the encryption algorithm but they haven’t succeeded (yet?).

In Closing

As you can guess, a lot of important details are skipped in this article. There are two reasons for that. 1. I lack the necessary knowledge to simplify the deeper parts and 2. It would be boring to read if the post felt like a spec. If you wish to read more, refer to the list of references below this section.


Thank you for reading!

ELI5 – Key Derivation Function

We’ve heard that AES and other block ciphers require specific key sizes; 128, 256 and 512 bits. But I don’t ever remember having to calculate my password length based on the underlying key size. Never have I read on a website “passwords need to be of 16 ASCII characters, 1 byte each, to make a total of 128 bits of key material”. So what lies between me entering an arbitrarily sized password and the encryption algorithm receiving a 128/256 bit nicely sized key. Let’s find that out in this ELI5.

Key Derivation Function

A Key Derivation Function (wait for it…) derives cryptographic key(s) from a password. Generally speaking, the passwords we humans come up with are something like “MyAwesomeDog007” which, while long and easy to remember, just don’t have enough entropy for cryptographic applications. On the other hand, a key derived from a simple password “ml6xU*dwGS5rvE!dcIg6509w$$” (that’s not a real key, a real key would in most cases be binary) is complex and entropy rich. This is the first purpose a KDF serves; to increase the entropy of a password and making it suitable for use in other algorithms such as AES.

The second purpose that KDFs serve is that they make brute forcing infeasible. Due to the high computational costs of running a good KDF, brute forcing is typically not achievable for any half decent password. Of course, it won’t protect a user from a dictionary attack if she selects a password such as “password123”.


A KDF takes an arbitrarily sized input that has low entropy (user-supplied password, for example), runs some hash-based algorithms on it, and output a random looking fixed sized cryptographic key (which becomes input key to encryption and MACing algorithms later). A KDF can be thought of as a pseudo-random function (PRF) which maps an input password to an output key. As a PRF, the input and output mappings should look completely random to an attacker and in no circumstance should he be able to get the original password from a cryptographic key (that is, the function should be one way). The high iteration count makes computing KDF an expensive affair. This is acceptable for a legitimate user but will prevent brute forcing of the password.

Typically, key derivation functions employ keyed hash algorithms or HMAC. Cryptographic salt is used to prevent rainbow table attacks (precomputed hash lookups). The number of iterations (in the order of tens to hundreds of thousands) of the hash function is selected to slow down bruteforce attacks.


A simple key derivation function is Password Based Key Derivation Function 2, PBKDF2. It takes as input a pseudo-random function (such a SHA-256), user supplied key, salt (64+ bits), number of iterations, length of output key, and outputs a key of specified length.

Although PBKDF2 is still used and recommended, modern alternatives such as Scrypt and Argon2 offer much better resistance to bruteforce.

ELI5 – Format Preserving Encryption

Most block ciphers work on the bytes and bits of data. It doesn’t matter to them if the data is a video, a phone number, a credit card number or a meme of Patrick Star. And that’s good. It means that a generic block cipher can handle almost all the traffic encryption over TLS while we’re busy browsing Reddit with YouTube playing music in another tab. And since the underlying network handles binary data just as well and efficiently, no one complains.

And it is generally a bad practice to write domain-specific encryption algorithms. That is the reason you’ve never heard of AES Image Encryptors. But sometimes, for very specific use cases, it becomes necessary to have something like that. Something like an AES for images, so to speak.

The Problem

What if a legacy application needs to integrate encryption of some of its data but not change any of the existing data structures? AES works on 128-bit blocks of data, DES on 64, so it won’t work for phone numbers or credit card numbers. At least, not without changing the underlying data structures required to store the encrypted data. And suppose we cannot change the underlying architecture for various reasons, one of them simply because we do not control some of the machines passing our data. Yes, that’s where we need format preserving encryption (FPE).

Format Preserving Encryption

While researching about this topic, I came across this beautiful construct by cryptographers John Black and Phillip Rogaway. The construct is simple and the best part is that it uses a block cipher as a pseudo-random function (in case of block ciphers with larger block sizes, we truncate their outputs to the desired bit size by taking only the N least significant bits), thus inheriting all the goodies of the underlying block cipher. Let’s look at a brief working of this method.

Let the message space be M. In case of phone numbers, that’s from 0 to 9,999,999,999 (that’s for India, and while the actual message space is much smaller than that, no harm in assuming for the entire range). The number of bits required to store this information is ln(10^10) = ~24. So we can fit the ciphertext in 24 bits assuming no padding or integrity checks. Now imagine two sets, X and Y. Let X be a superset of Y. In this construct, X represents the set of all possible ciphertexts that we can get on encrypting each Mi with our block cipher. Y represents the set of allowed ciphertext, that is, ciphertexts that are equal to or less than the max value of our message space M (which is 9999999999 in our example).

Now, when you encrypt a phone number with a block cipher, there’s a good probability that the value would be less than or equal to our phone number block size (10 digits or 24 bits, assuming we’re truncating AES output to 24 bits as well). If that’s the case, that’s our answer. If not, encrypt this ciphertext and check again. Continue this until you reach a number that can fit in 10 integer digits.

Now while some of you might think (I certainly did) this would result in a long loop, it would not (with high probability). This solution not only works but works efficiently (on an average, the answer will be found in 2 iterations with 50% plus probability of finding it in each iteration). That’s pretty cool if you’d ask me!

In an ideal world, you’d want to rewrite the logic and underlying data structures such that native AES is possible. In this one, format-preserving encryption would work just fine. Thank you for reading.

ELI5 – Deterministic Encryption

Suppose you have a database full of confidential information such as emails of users. As a responsible sysadmin, you’d not let such data exist in plaintext in your systems and therefore you decide to encrypt everything. But now, the application needs a searching functionality where users can see their emails in the system.

Essentially, you need to run a where email = ‘ ‘ query on the encrypted database, get all the rows that match, decrypt them and send the decrypted data to the application layer. With traditional encryption modes like CBC or modern authenticated encryption modes like GCM, this would be impossible (or extremely inefficient). This is where deterministic encryption comes into the picture.

Deterministic Encryption

Deterministic encryption is nothing fancy. Instead of using modes like CBC and CTR where each block of the ciphertext is dependent on the previous block or the message counter, in deterministic encryption, data can be imagined to be encrypted in EBC mode or the IV kept constant. No nonce is involved. Basically, a plaintext message M will always map to the same ciphertext C in a given deterministic encryption scheme under a particular key K.

Once the data is encrypted into ciphertext, it is sorted and stored. Now, when a search term comes up, it is encrypted using the same key, then the database is queried for this ciphertext which returns all the rows that match. The application can then decrypt these rows with the given key. The search takes logarithmic time (since for the database, this is a normal text search) and the database never sees any data in plaintext.

Even with all of this, deterministic encryption faces all the issues that plague encryption modes like EBC. Namely, if two plaintexts are same, their encrypted ciphertexts would be equivalent as well, thus leaking information to an adversary. Formally, we say that deterministic encryption can never be secure under chosen ciphertext attack. Although that doesn’t diminish its value when we have to deal with searchable encrypted datasets.

From Wikipedia’s encryption modes of block cipher page

This means that deterministic encryption cannot (or rather should not) be used when the message space M is small. It can only be used on unique values such as email address or usernames which are designed to be unique in a database.

Further reading:

Thank you for reading.

ELI5 – Message Authentication Code

You need some urgent cash to buy today’s lunch. You throw a paper chit at your colleague, “Hey, I need you to transfer 100 bucks in my account number 10022, urgent”. Eve, a bad actor in your office, intercepts the chit, changes the 10022 to 10033, which is her account number, and forwards it to your friend. Your friend, intending to help you, transfers the amount and you both get duped!

The Problem

The above is not a overly rare event, far from it. Such attacks happen all the time on the internet, and the reason is the lack of (cryptographic) authenticity built into core internet protocols. We learned in Authenticated Encryption that confidentiality alone doesn’t mean anything if the attacker can perform active attacks on your communication channel (just like Eve could). We need something better. We need MACs.

Message Authentication Code

As the name gives away, a MAC is an authentication code associated with a message which verifies the integrity of the message and, assuming that the key is only known to you and the message’s sender, its authenticity. Just like with encryption, you give a MAC algorithm a message and a key, and it gives you a tag. This tag is unique to your message and the key pair, and an attacker shouldn’t be able to forge a valid tag for any random message of his choice even if he’s given an infinite number of ciphertext-tag pairs to analyze.

From Wikipedia’s MAC page

In concept, a MAC is similar to a hash function, such that given an arbitrary sized input, you get a fixed-sized output (digest) and this can be reproduced (‘verified’) on other machines as long as one can find the same hash function’s implementation. This is how your download manager ensures that the file it has downloaded from the internet is not broken, by calculating the hash digest and comparing it with the one the website claims. A MAC differs from a traditional hash function in that along with a message input, it also takes a key and as such, knowledge of the key as well as the underlying MAC algorithm is needed to verify (or create a new) a tag.

In fact, one of the most popular MAC algorithms is based on hash functions. The algorithm is called HMAC for Hash-based Message Authentication Code. It works by hashing key material with the message while taking preventive measures for popular attacks on hash functions such as length extension attacks. Any reasonable hash function can be used for the purpose of MAC’ing, including SHA-1 and SHA-256, (MD5 isn’t recommended).

Encryption of the underlying data is not a prerequisite for using MAC, and they can be used irrespective of whether the data being MAC’d needs confidentiality or not. Use MACs whenever data integrity is needed. One caveat to look out for; MAC algorithms by themselves do not prevent replay attacks.

Aside on Replay attacks: A replay attack may happen when, say, you owe Eve some money. You send a note with Eve for your bank saying, “Please give Eve Rs.100 from my account, Signed: Bob”. Now there’s nothing preventing Eve from being greedy and using that same note again some days later. This is prevented in the real world by making cheques unique and one-time use only. Similarly, ciphertexts must embed information (such as packet number, timestamp, session counter etc) that will expire once received and not let Eve re-send it at a later time.

Thank you for reading.

ELI5 – Authenticated Encryption

The core goals of cryptography and any application of cryptography are confidentiality, integrity, and authenticity. Let’s begin with a short one liner on each:

  • Confidentiality: No one should be able to read the contents of the message except the intended recipient.
  • Integrity: No one should be able to tamper with the message without going unnoticed.
  • Authenticity: The recipient should be able to confirm that the message indeed came from the sender.

There are other goals that we do not need to touch upon in this article, such as non-repudiation and plausible deniability.

The Problem

Now the problem with using just an encryption algorithm like AES with a non-authenticating mode like CBC is that anyone can change the ciphertext during transmission. And while you might think, “but the modified ciphertext, with high probability, will decrypt to something gibberish”, this isn’t the right argument because the recipient will have no way of knowing for sure, which is a problem, a huge one.

Secondly, there’s also no way of knowing if the message was sent by a person you’re expecting it from. It might have come from any middleman intercepting your network and you wouldn’t be able to tell a difference. And for this reason, encryption without authentication and integrity completely destroys the purpose of encryption. An example of this in the real world is when you see an error such as the following:


While this can mean that the encryption mode used by the website is weak, more often than not, this means that the browser was able to establish a secure connection but the identity of the website is unknown. This defeats the purpose of encryption because even if the connection is secure, the fact that you don’t know if you’re receiving a message from your intended recipient or if the message hasn’t tampered with defeats the purpose of using cryptography.

Enter Authenticated Encryption

Authenticated encryption solves this problem by introducing authentication and integrity as freebies that you get when you use an authenticated encryption mode along with an encryption cipher such as AES. Examples of authenticated encryption modes include GCM and CCM. In fact, if you check the connection info of the site you’re reading this on (click the green icon and then select more info or something similar on chrome and firefox) and check the technical details part, you’ll see something like this, depending on your browser.

Yes, I’m the most active visitor of my blog

Here, AES_128_GCM is used for symmetric encryption of the content you exchange with the server with AES providing confidentiality and GCM providing authentication and integrity. SHA256 is used to authenticate the initial handshake and as a pseudo-random function (PRF).

In a nutshell, these authenticated encryptions usually take a message, encrypt it, then MAC the ciphertext (and IV) and then append the MAC to the ciphertext. This is called Encrypt-then-MAC. Now if the ciphertext is changed, the MAC won’t match and the receiver can easily discard such messages without having to touch the contents of ciphertext. There are other variations to this method, namely MAC-then-Encrypt and MAC-and-Encrypt, with benefits of going with each although most experts recommend doing Encrypt-then-MAC.

From wikipedia page on authenticated encryption. This is Encrypt-then-MAC

As you can imagine, this can be easily done manually (and until some years ago, it was mostly done by developers). But since it is easier (and much more secure) to standardize such modes and leave the secure implementation part to the experts, these ‘readymade’ modes have picked up wide adoption and as you saw, you’re currently using GCM to ensure confidentiality, integrity, and authenticity of this very line. Thank you for reading!

Time & Hash Based One Time Passwords

Ever wondered how two factor authentication apps works? I certainly did. One could just guess how SMS based tokens work, that’s simple (although they shouldn’t be used as per guidelines from NIST). But what about TOTP or Time based One Time Password, the ones in which you scan a QR code and the OTP generator app (like freeOTP and andOTP) gives you a new six digit token every 30 seconds or so?

I was, for quite some time, under the (very misinformed) impression that web servers which implement this method of 2FA expose an API and, by means of the QR code, give you the endpoint with some token and then the OTP generator app polls the server and gets a new ephemeral password which the user enters in the application. Straight forward, but plain wrong.

The belief was challenged when I noticed that the OTP generator app works irrespective of network connection, even if both devices are offline (that is, when working on localhost server). HOW? I dug further and I learned some very interesting things, some of which I wanted to write here.

The Name, Dude

Time based One Time Password, the name itself gives enough clue to guess that it uses time, and as such is independent of inter-communication as long as the two systems are in time-sync. But an adversary can be assumed to be in time-sync as well, right? Yes, that’s where we bring in the secret (a randomly generated token) which is embedded in the QR code that you scan with your smartphone app. So we have the time which is in sync and we’ve established a way of transferring the secret from the server to the client. Turns out, that’s all the data we need to keep generating secrets independently on the client and server side, completely offline once the initial secret sharing happens.

Basic OTP Algorithm

  function generate_otp(secret, counter) {
    h = hmac(key=secret, message=counter, algorithm=sha1)
    offset = get_last_four_bits(h)
    pre_opt = get_32_bits_starting_from_offset(h)
    otp = get_desired_number_of_chars(pre_otp, N)

  function get_totp(secret) {
    counter = epoch / 30
    return generate_otp(secret, counter)

  global counter; // get from database
  function get_hotp(secret) {
    return generate_otp(secret, counter)

The basic OTP algorithm (both time and hash based) accept a secret and a counter value. Combining current time and the secret, a new 6 (or N) digit token is generated every 30 (or Ti) seconds. They differ in what the counter value supplied to the algorithm.

  • TOTP: take the number of times the interval Ti can be fitted in the total number of seconds since epoch. Which is just a weird way of saying that the interval is the quotient when you divide the seconds_since_epoch number by the interval duration (Ti).
  • HOTP: take the current counter stored persistently and use that. After use, increment the counter in the database.

After establishing the counter value, the rest of the steps remain the same in both the cases.

  • Compute HMAC value of message Ti and key secret, get the hex digest
  • The last 4 bits (last digit in hex) is stored as offset
  • Starting from offsetth bit, take 32 bits (8 hex digits) and discard the first bit (xor with 7ffffffff. This works because f = 1111 and 7 = 0111, so &’ing with 7 (0111) is equal to switching off the first bit)
  • Convert the 32 bits hex to int, then take the least significant 6 bits (or N depending on requirement), and that is your OTP
  • ?? Profit!

Security Consideration

The entire security of the OTP lies in the secrecy of the initial secret. If that’s compromised, an attacker can easily generate as many OTPs as she wishes. Also, given the keyspace of the OTP, and also that many servers are designed to accept counter+1 and counter-1 OTPs, securing the system against bruteforce is a most.

One important aspect of these 2FA mechanisms is that losing your 2FA device means losing your account. This is especially the case with services like ProtonMail where the password is used to decrypt client data.

Naive Python Implementation

Given how simple this algorithm was, I tried to implement it. The core algorithm was literally less than five lines of python code. Here’s a naive implementation of the same, and while it seems to work, I’d not use it anywhere.

Thank you for reading! PS Python noob, please don’t judge! 😛

ELI5 – DES (kinda)

In my previous post, which was a review of the book Applied Cryptography and Cryptography Engineering, I wrote that DES, in spite of retiring officially (after numerous successful attacks), is still a great algorithm from an academic perspective to learn and peek into the minds of cryptographers. In this one, I’ll try to explain DES in my own words, testing my own understanding and giving you a (hopefully) nice read from a novice perspective. So let’s get started with DES, a cipher that was once the standard for government data encryption for the US and many others around the globe, now defunct and only exists, if at all, in form of 3DES.

Before we begin, let us understand where we are (with DES) in the cryptoverse and then talk about DES itself. In cryptography, encryption of data can happen in two ways. There’s symmetric cryptography, and there’s asymmetric cryptography. In symmetric key cryptography, we have block ciphers and stream ciphers. DES is a block cipher.

A Brief History

DES, for Data Encryption Standard, is a symmetric key encryption algorithm proposed by IBM with inputs from the NSA in 1974. Not many details about the development process were shared with the world. It is studied by numerous experts and counts as one of the most studied ciphers of all time. DES was designed to keep government secrets, secrets.

The 56 bit key size didn’t impress anyone back in the day, much less today. In spite of a small key size, there weren’t any attacks faster than brute force, both theoretically and practically, until into the late 80s when Adi Shamir and Eli Biham discovered a new kind of attack on block ciphers called differential cryptanalysis. The world then learnt that NSA and IBM knew about this attack since at least 1974, and the algorithm was designed specifically to counter this attack.

In late 90s DES was practically cracked in a contest and then many times after that. The main weakness in DES was the small key size, and to patch it, 3DES was proposed which is still used today, although not recommended. But from an academic point of view, DES is a gold mine. It is easy to understand, let’s us deep dive into the way cryptographers think and learn why certain decisions are made, and most importantly, why the math just works!

DES algorithm from 10,000ft

Okay, let’s start with a super complex diagram that you probably won’t understand without 4 years of formal training in mathematics. Just kidding.

And for the sake of my love for bullet points,

  • DES is a Feistel cipher, which is a family of ciphers which are iterative in nature (repeat a simple set of instructions several times, called ’rounds’) and share many similar properties.
  • DES has a block size of 64 bits, that is, 64 bits of plaintext is converted into 64 bits of ciphertext in one go.
  • The algorithm makes use of a 64 bit key, 56 of which are used by the algorithm and 8 are used for parity check. Effective security is 56 bits.
  • DES has 16 rounds.
  • The encryption and decryption functions are almost similar, which is a great advantage as the implementation and audit has to be done for single function only, simplifying things.

So how does the algorithm work? Like any other algorithm, you can put it down as a list of easy to understand steps.


  1. Take input as plaintext block of 64 bits, and key K
  2. Apply Initial Permutation (IP) on input plaintext (which shuffles the bits in a predefined manner)
  3. Split the input into left half and right half (L0 and R0) (form two equal halves of 32 bits, no tricks)
  4. Apply magic function F (not really) on the right half R0 (32 bits input => 32 bits output)
  5. Function F takes R0 and K1 as input, where R0 is the right halve (32 bit) input for the 1st round and K1 is the 1st round key. In this step, the key material mixes with the plaintext
  6. XOR output of F (32 bits) with L0 (which is already 32 bits), this is the new R1 (L0 ⊕ F(R0) => R1). R0 is simply copied to L1
  7. Thus, we’ve essentially swapped L0 and R0 with some pre-processing on R0. This completes our round 1. Repeat 4-5-6 16 times and you’ve done 16 rounds of DES.
  8. Apply reverse Initial Permutation (a.k.a. Final Permutation or IP-1) and you have your ciphertext. Tadaa!

Yes, I know, that was a mouthful, wasn’t it? This slide [link here] shows the round key Ki in action. Now that we have a basic flow, we can take on each of the components and talk about them in detail, in a proper top down approach.

Little aside on confusion and diffusion

Confusion and diffusion are exactly what they mean in plain English. They provide confusion and diffusion properties in the ciphertext. They are crucial for the overall security of the DES algorithm.

Confusion means having a non-linear, complex relationship between the key and the ciphertext. In simple words, each bit of the ciphertext has to depend on as many bits in the key as possible, such that even with a choosen ciphertext attack scenario, not much can be known about the key given a practically infinite supply of plaintext-ciphertext pairs.

Diffusion means any change in the plaintext should cause an avalanche/snowball effect and change around half of the bits in the ciphertext and vice versa.

We will talk more about how DES achieves both of these properties when we talk about the F function in detail.

DES algorithm: Major parts

Please take a moment to appreciate the effort I’ve put into the diagram. Error: The K(i) should be K(i+1)

We have here the following major components to talk about.

  • Initial permutation, final permutation
  • Round key generator
  • The round function F

Initial & Final Permutation (IP & FP)

The IP accepts the plaintext and the FP returns the ciphertext generated by the algorithm. In decryption, the ciphertext goes into the FP and plaintext leaves through IP, similar but exact opposite of encryption, which is one of the properties of a Feistel cipher. From functionality perspective, it shuffles the 64 bit input block according to a predefined vector, given below.

58    50   42    34    26   18    10    2
60    52   44    36    28   20    12    4
62    54   46    38    30   22    14    6
64    56   48    40    32   24    16    8
57    49   41    33    25   17     9    1
59    51   43    35    27   19    11    3
61    53   45    37    29   21    13    5
63    55   47    39    31   23    15    7

The above text is a linear list, or a vector, and not a matrix. What it says is “take the 58th bit and connect it to output bit 1”, “take the 50th bit and connect it to output bit 2” and so on. It is basically a one-to-one substitution. So how does it, one might ask, help in adding security if the list is public and it is a simple substitution operation. Well, it does not. To quote wikipedia,

IP and FP have no cryptographic significance, but were included in order to facilitate loading blocks in and out of mid-1970s 8-bit based hardware.

Round Key generator


The round key generator function generates a key for each of the 16 rounds of DES. There are a couple of steps involved, as illustrated in the above visual.

  1. Permuted choice 1 (parity drop) – Get the permuted 56 bit key from the input 64 bit key by dropping the parity bits (bit 8, 16…64 are dropped). The permutation is done according to the predefined vector shown below.
  2. PC-1
    57   49    41   33    25    17    9
     1   58    50   42    34    26   18
    10    2    59   51    43    35   27
    19   11     3   60    52    44   36
    63   55    47   39    31    23   15
     7   62    54   46    38    30   22
    14    6    61   53    45    37   29
    21   13     5   28    20    12    4
  3. Split the 56 bit key into two 28 bit halves, and left shift them either by one bit (for round 1, 2, 9 and 16) or by two bits (for every other round).
  4. Concatenate the two halves thus returned after left shifting, and apply the permutation table 2 to the concatenated pair.
  5. PC-2
     14    17   11    24     1    5
      3    28   15     6    21   10
     23    19   12     4    26    8
     16     7   27    20    13    2
     41    52   31    37    47   55
     30    40   51    45    33   48
     44    49   39    56    34   53
     46    42   50    36    29   32
  6. Permuted choice 2 (compression p-box) – Takes a 56 bit key and returns a 48 bit round key Ki after dropping another 8 bits
  7. The 48 bit round key is then used by our magic function F (remember that?) to mix key into the plaintext by xoring the plaintext with this 48 bit key. (Wait, but our right input halve Ri is 32 bits, right? Yes, we’ll get to how our input is expanded to 48 bits in the next section)

Round Function

We’re finally into the meat of this beautiful algorithm. I’ve mentioned in brief about what the round function consists of. To reiterate,

  • Split the input into left half and right half (Li and Ri) (form two equal halves of 32 bits, no tricks)
  • Apply magic function F on the right half Ri-1 (F takes 32 bits input and gives 32 bits output), where Ri-1 is the right halve of the ith round and Ki is the ith round key. This is where the key material mixes with the plaintext.
  • XOR output of F (32 bits) with Li-1 (which is already 32 bits), this is the new Ri (that is, Li-1 ⊕ F(Ri-1) => Ri). Unaltered Ri-1 is simply copied to Li

What we haven’t talked about is the magic function F itself. The magic function F isn’t really magical. It just does 4 neat sub-operations, and does them really well.


  1. Expansion function E
  2. XOR with round key
  3. S box substitution
  4. Permutation

Let’s look at them one by one and try to see where exactly they fit in and what cryptographic property they give to our ciphertext.

Expansion function
32     1    2     3     4    5
 4     5    6     7     8    9
 8     9   10    11    12   13
12    13   14    15    16   17
16    17   18    19    20   21
20    21   22    23    24   25
24    25   26    27    28   29
28    29   30    31    32    1

As the name might have hinted, the expansion function expands our plaintext input. Expansion gives us diffusion. It diffuses the impact of change of one bit in the input across the block. Remember how the 32 bit Ri part of the 64 bit input is sent to the F function? E function takes those 32 bits of input and expands them to 48 bits. How it does that? Well, repetition, of course. So it basically takes input as 1 2 3 4 5 6 7 8 and outputs something like 1 2 2 3 4 4 5 6 6 7 8 8, effectively increasing the size by 50% (32 => 48).

XOR with round key

XOR is a simple mathematical operation that has a very important property from a cryptographic standpoint. If you XOR a number A with B, you get a new number C. To get A from C, you need B. To get B from C, you need A. Basically, A ⊕ B ⊕ B = A, and A ⊕ B ⊕ A = B. XORing plaintext and key locks them in a interdependent mixture such that to get back the plaintext, you have to have the key with which it was XORed (locked).

S-box substitution

In some ways, this is the heart of the algorithm. S-box substitution gives us confusion. There are eight S-boxes in total, each taking 6 input bits and giving 4 output bits. S-boxes provide DES immunity against differential cryptanalysis which I mentioned at the beginning of this article. Here’s S-box number 1.

      0  1   2  3   4  5   6  7   8  9  10 11  12 13  14 15
  0 | 14  4  13  1   2 15  11  8   3 10   6 12   5  9   0  7
  1 |  0 15   7  4  14  2  13  1  10  6  12 11   9  5   3  8
  2 |  4  1  14  8  13  6   2 11  15 12   9  7   3 10   5  0
  3 | 15 12   8  2   4  9   1  7   5 11   3 14  10  0   6 13

Here’s how it works. After the XOR operation, we are left with a cryptic looking 48 bit string.

say 110010101100101111111100110111101100111010101001

Now we take this 48 bit string and divide it into 8 equal parts of 6 bits each, and input one of the 8 parts into each S box.

SB1(110101) SB2(101100) SB3(101111) SB4(111100) SB5(110111) SB6(101100) SB7(111010) SB8(101001)

Now, our S-box 1 receives 110101.

We take the first and last bit (1 and 1 in this case, coloured yellow), concatenate it to form a two bit number (1 . 1 => Binary(11)) which is 3, and look it up in the row labels of our S-box 1.

Similarly, we take the middle 4 bits (2 to 5), which in our case are 1, 0, 1 and 0, coloured blue, concatenate them to form a 4 bit number (1 . 0 . 1 . 0 => Binary(1010)) which is 10, and look up the corresponding column label in our S-box 1.

The number corresponding to row 3 and column 10 is 3, which is 0010 in 4 bit binary representation. That is the output of S-box 1 for input 110101. Similarly do this for S-box 2-8, for each of the 16 rounds of DES. The result of the 8 S-boxes (4 bits each) is combined to get a 32 bit output.


The final step of our magic function F is a simple one-to-one permutation, taking 32 bits and returning 32 bits.

16   7   20  21
29   12  28  17
 1   15  23  26
 5   18  31  10
 2    8  24  14
32   27   3   9
19   13  30   6
22   11   4  25

Catch your breath

I’m really too proud of this picture. Edit: Not so much after finding that K(i) => K(i+1) error.

Wake up! Do you even remember that all this was done on Ri?

Now, after the F function, which wasn’t very magical after all, returns the 32 bit output, we XOR it with Li, which gives us our new Ri+1, while the untouched Ri is simply copied to Li+1‘s place. Hence begins a new round of DES, which goes on this way for 15 more rounds.

After 16 rounds

Not much is left to be done after the 16 rounds. The two halves are concatenated, the 64 bit cipher block is then passed through our final permutation using FP vector given below, and this gives us our ciphertext. Easy.

40     8   48    16    56   24    64   32
39     7   47    15    55   23    63   31
38     6   46    14    54   22    62   30
37     5   45    13    53   21    61   29
36     4   44    12    52   20    60   28
35     3   43    11    51   19    59   27
34     2   42    10    50   18    58   26
33     1   41     9    49   17    57   25 

Wrapping DES Up

So that was DES. I hope you enjoyed reading this article. I’m expecting some mistakes, technical and otherwise, so take everything with a pinch of salt. Some interesting reads are given below for those of you who wish to learn more. I realized that writing this article was a nice way of testing my own understanding of the topic, find holes in it and then study to fix those holes. As always, thank you for reading!

Further Reading

Book Review – Applied Cryptography Part I And II – Bruce Schneier

This book has been, without a doubt, crucial in aiding my understanding of cryptosystems and why things are the way they are, and how do these cryptic crypto algorithms even work. If you are interested in learning how to develop software that are ‘correct’ and secure, then this is a great book to understand what are the primitives of information security, what algorithms already exist and which ones to use in what scenario.

So the motivation to pursue a thorough understanding of cryptography and to gain the ability and knowledge required to make a secure cryptosystem came sometime after college ended, when I and Kunal were working on a terminal chat application that would support end-to-end encryption. At that time, I hardly knew what I had gotten myself into (which is similar to a lot of things in my life), as the application development part seemed very simple. We got done with the application part, terminal app and the backend, and then came the encryption part, and that is when the knowledge about existing techniques and understanding of basic crypto primitives fell short. And that is when I started reading about cryptography and stumbled upon this book.

Although they seemed daunting at first, both the books are very accommodating for a wide range of audience, right from someone like me who barely knew what a block cipher is, to the more experienced folks who might understand all of the mathematics given in the book in the first go. While not very complex (school grade algebra with addition, multiplication, modulus and xor operations), it takes a little effort (read: re-reading a topic 3 times, sometimes more) to actually get what’s happening, why an operation is being performed, for example.

While reading the first book, remember that it was written when I was literally a year old, in 1996. Hence, although the engineering principles and general recommendation is still valid, you need to keep in mind that the algorithms recommended in that book are not valid (as attacks are found for many of them and DES has officially retired), and that is corrected in the second edition of the book. In any case, studying the DES algorithm in detail should be a delight for any crypto nerd, regardless of its practical value.

The second version is more up to date, and for some reason I was more comfortable reading it than the first one. It might be because I knew a little more while reading the second edition, which can be a good tip: If you’re serious about understanding cryptography from an engineering standpoint, skim over the first book and make a note of everything that you find useful and interesting, and do a more detailed study of the second edition of the book.

What I found nice about the books is, they really are ‘applied’ books. It isn’t all mathematics and algorithms, but the actual merger of these algorithms into real world systems. In the real world, cryptography and cryptosystems don’t exist in isolation, but play a small role in the larger scheme of things. Breaking a cryptosystem is usually reserved for the more resourceful adversary, and while these (well established and peer reviewed) cryptographic primitives rarely fail, when they do, it is catastrophic. The computational infeasibility makes the theoretical aspect of cryptography very secure. Problems appear when they are implemented, and that is where the bugs start to show up. Then there is the software development methodology which usually prioritises deadlines and features above security. There is a section dedicated to explaining what ‘trust’ is, how it forms such an important aspect of information security and secure software development. Overall, the book is quite interesting to read, and the content is without a doubt top quality, which is what one expects from Schneier.

In closing, I’d recommend this book if you are into security and wouldn’t mind knowing the details of some of the fundamental algorithms that make the digital revolution possible. Thank you for reading.