Encryption 101: Substitution ciphers


So far in this blog series, we’ve mainly focused on ciphers that encrypt their messages either by shifting or reversing the alphabet in a fixed pattern, as in the Caesar and Atbash ciphers, or by ‘jumbling up’ the letters in some way that makes discerning their true meaning difficult, à la the Columnar Transposition Cipher.

The simple substitution cipher

The basic idea of a substitution cipher is a simple one: take one letter in your message, let’s say ‘A’, and replace it with a different letter, such as ‘E’.

Sounds familiar?

Both the Atbash and Caesar ciphers used this basic principle. However, they both have one weakness: predictability. Figure out how a handful of letters have been encrypted and you can pretty much break the entire message. (Learn more about how these ciphers work in my previous post: Encryption 101: Back to basics.)

The substitution cipher, however, takes this idea to the next level and uses a ‘random’ alphabet to encrypt the message. In other words, each letter gets its own independently chosen replacement, and the whole shuffled alphabet acts as the key.

The table below displays an alphabet that I chose at random, simply placing letters in different locations until it was complete.

Plaintext alphabet:  A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
Ciphertext alphabet: D H C L P F S V J Y U O B R N T Z K I X W E Q M G A

This new alphabet makes figuring out the relationship between the plaintext and the ciphertext a lot harder, as the confusion that the cipher provides has been increased. The diffusion, however, is still fairly low: changing one letter in the plaintext will still only change one letter in the ciphertext. Diffusion won’t really improve until we start looking at more modern examples.
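To make this concrete, here is a minimal Python sketch of the cipher using the alphabet from the table above. The names `encrypt` and `decrypt` are my own choices for illustration, and passing spaces and punctuation through unchanged is an assumption rather than part of the cipher itself.

```python
import string

# Plaintext alphabet and the shuffled ciphertext alphabet from the table.
PLAIN = string.ascii_uppercase
CIPHER = "DHCLPFSVJYUOBRNTZKIXWEQMGA"

# Lookup tables for each direction. Characters that are not in the
# alphabet (spaces, punctuation) are left untouched by translate().
ENC = str.maketrans(PLAIN, CIPHER)
DEC = str.maketrans(CIPHER, PLAIN)


def encrypt(message: str) -> str:
    """Replace each plaintext letter with its ciphertext counterpart."""
    return message.upper().translate(ENC)


def decrypt(ciphertext: str) -> str:
    """Reverse the substitution to recover the plaintext."""
    return ciphertext.upper().translate(DEC)


print(encrypt("HELLO WORLD"))  # VPOON QNKOL
print(encrypt("JELLO WORLD"))  # YPOON QNKOL
print(decrypt("VPOON QNKOL"))  # HELLO WORLD
```

Notice that swapping the first plaintext letter from ‘H’ to ‘J’ changes only the first ciphertext letter. That is the low diffusion described above.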

While the Atbash cipher had just one key and the Caesar cipher had 25, the substitution cipher has 26! (26 factorial) unique keys. This works out to 403,291,461,126,605,635,584,000,000 different ways to write the alphabet!
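If you want to check that figure yourself, a short Python sketch does the arithmetic. The guessing rate of one billion keys per second is a made-up number, used purely to give a feel for the scale.

```python
import math

# Every ordering of the 26 letters is a possible key.
keys = math.factorial(26)
print(f"{keys:,}")  # 403,291,461,126,605,635,584,000,000

# The same key space expressed in bits (a little over 88).
bits = math.log2(keys)
print(f"about {bits:.1f} bits")

# Hypothetical attacker speed: an assumption, not a measured figure.
GUESSES_PER_SECOND = 1_000_000_000
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60

years = keys / GUESSES_PER_SECOND / SECONDS_PER_YEAR
print(f"about {years:.2e} years to try every key")
```

Even at that hypothetical rate, working through every key would take billions of years.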

As you can see, the number of keys increases rapidly the more the ciphers advance.

More keys = More secure?

One might think that having a vast number of keys to choose from is a good security metric. After all, what attacker is going to sit there and write out every possible permutation of the alphabet, run your ciphertext through it and see whether they can break the encryption? Yet substitution ciphers still suffer from the same inherent weakness as the Caesar cipher before them: letter frequency analysis. (I discussed this topic in further detail when looking at weaknesses in the Caesar cipher.)
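The reason is easy to see in a small Python sketch. The sample message below is invented for illustration, and the cipher alphabet is the one from the table earlier in this post. Substitution relabels the letters, but it leaves the shape of the frequency distribution untouched.

```python
import string
from collections import Counter

PLAIN = string.ascii_uppercase
CIPHER = "DHCLPFSVJYUOBRNTZKIXWEQMGA"
ENC = str.maketrans(PLAIN, CIPHER)


def letter_counts(text: str) -> Counter:
    """Count letters only, ignoring spaces and punctuation."""
    return Counter(ch for ch in text.upper() if ch in PLAIN)


# Made-up sample message, for illustration only.
plaintext = "MEET ME AT THE SECRET TREE NEAR THE EASTERN GATE"
ciphertext = plaintext.translate(ENC)

plain_counts = letter_counts(plaintext)
cipher_counts = letter_counts(ciphertext)

# The most common plaintext letter is 'E'; the most common ciphertext
# letter is whatever 'E' was replaced with, and it has the same count.
print(plain_counts.most_common(1))
print(cipher_counts.most_common(1))

# The counts are identical once the letter labels are ignored.
print(sorted(plain_counts.values()) == sorted(cipher_counts.values()))
```

An attacker who knows that ‘E’ is the most common letter in English can make a strong guess at the most common ciphertext letter without ever touching the key space, and repeat the trick for the next most common letters.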

Defeating letter frequency analysis

Letter frequency analysis has so far proven to be a very powerful cryptanalysis method, so you would be forgiven for thinking that eventually all ciphers would be cracked by it.

As part of this Encryption 101 series, however, we will move on to the Vigenère cipher and Substitution-Permutation Networks, which start to increase the diffusion property of the encryption process to make the relationship between plaintext and ciphertext harder to uncover. We’ll also take a look at the One Time Pad cipher, which some argue is the only form of ‘perfect’ cryptography we’ve ever created. However, nothing is perfect in the world of cryptography, and even this ‘perfect’ cipher has its drawbacks.