This post discusses the difference between substitution and transposition in terms of encryption.
There are a number of different types of substitution cipher. If the cipher operates on single letters, it is termed a simple substitution cipher; a cipher that operates on larger groups of letters is termed polygraphic. A monoalphabetic cipher uses fixed substitution over the entire message, whereas a polyalphabetic cipher uses a number of substitutions at different positions in the message, where a unit from the plaintext is mapped to one of several possibilities in the ciphertext and vice versa.
Substitution of single letters separately—simple substitution—can be demonstrated by writing out the alphabet in some order to represent the substitution. This is termed a substitution alphabet. The cipher alphabet may be shifted or reversed (creating the Caesar and Atbash ciphers, respectively) or scrambled in a more complex fashion, in which case it is called a mixed alphabet or deranged alphabet. Traditionally, mixed alphabets may be created by first writing out a keyword, removing repeated letters in it, then writing all the remaining letters in the alphabet in the usual order.
Using this system, the keyword “zebras” gives us the following alphabets:
Plaintext alphabet: | ABCDEFGHIJKLMNOPQRSTUVWXYZ |
Ciphertext alphabet: | ZEBRASCDFGHIJKLMNOPQTUVWXY |
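The keyword construction above can be sketched in a few lines of Python. This is only an illustration: the function names `mixed_alphabet` and `substitute` and the sample message are assumptions made for this example, not part of any standard library.

```python
import string

def mixed_alphabet(keyword):
    """Keyword letters first (repeats dropped), then the unused letters in order."""
    result = []
    for ch in keyword.upper() + string.ascii_uppercase:
        if ch in string.ascii_uppercase and ch not in result:
            result.append(ch)
    return "".join(result)

def substitute(text, cipher_alphabet, decrypt=False):
    """Map each letter through the substitution alphabet; other characters pass through."""
    src, dst = string.ascii_uppercase, cipher_alphabet
    if decrypt:
        src, dst = dst, src
    return text.upper().translate(str.maketrans(src, dst))

alphabet = mixed_alphabet("zebras")
ciphertext = substitute("FLEE AT ONCE", alphabet)
print(alphabet)    # ZEBRASCDFGHIJKLMNOPQTUVWXY
print(ciphertext)  # SIAA ZQ LKBA
print(substitute(ciphertext, alphabet, decrypt=True))  # FLEE AT ONCE
```

Decryption simply swaps the two alphabets, which is why knowing the table is enough to read the message.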
Vulnerabilities:
- Frequency Analysis
- Trial and Error
- One- and two-letter words
- Pairs and repetition
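To make the first item concrete, here is a rough sketch of a letter-frequency count. The helper name `letter_frequencies` is made up for this example, and the sample ciphertext is "flee at once we are discovered" enciphered with the "zebras" alphabet shown earlier. A message this short is not statistically reliable, so treat the result as a hint rather than a proof.

```python
from collections import Counter

def letter_frequencies(text):
    """Count letters only, most frequent first."""
    letters = [c for c in text.upper() if c.isalpha()]
    return Counter(letters).most_common()

ciphertext = "SIAA ZQ LKBA VA ZOA RFPBLUAOAR"
ranking = letter_frequencies(ciphertext)
print(ranking[0])  # ('A', 7)
```

Since E is the most common letter in typical English text, a cryptanalyst would guess that ciphertext A stands for plaintext E, which happens to be correct here.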
Transposition: Each letter retains its identity but changes its position
In cryptography, a transposition cipher is a method of encryption by which the positions held by units of plaintext (which are commonly characters or groups of characters) are shifted according to a regular system, so that the ciphertext constitutes a permutation of the plaintext. That is, the order of the units is changed (the plaintext is reordered). Mathematically a bijective function is used on the characters’ positions to encrypt and an inverse function to decrypt.
Following are some implementations.
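As a first illustration, this minimal sketch applies a fixed permutation to the character positions inside each block, then undoes it with the inverse permutation. The names `transpose` and `invert`, the block permutation `[2, 0, 3, 1]`, and the sample message are all assumptions chosen for the example. The sketch also assumes the message length is a multiple of the block size.

```python
def transpose(text, perm):
    """Within each block, output position i takes the character at input position perm[i]."""
    n = len(perm)
    if sorted(perm) != list(range(n)):
        raise ValueError("perm must be a permutation of 0..n-1")
    if len(text) % n != 0:
        raise ValueError("text length must be a multiple of the block size")
    blocks = (text[i:i + n] for i in range(0, len(text), n))
    return "".join("".join(block[p] for p in perm) for block in blocks)

def invert(perm):
    """Build the inverse permutation, so that transpose(transpose(t, p), invert(p)) == t."""
    inverse = [0] * len(perm)
    for i, p in enumerate(perm):
        inverse[p] = i
    return inverse

perm = [2, 0, 3, 1]
ciphertext = transpose("ATTACKATDAWN", perm)
plaintext = transpose(ciphertext, invert(perm))
print(ciphertext)  # TAATACTKWDNA
print(plaintext)   # ATTACKATDAWN
```

The permutation plays the role of the key: it is a bijection on positions, and its inverse is the decryption function.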
Vulnerabilities:
Since transposition does not affect the frequency of individual symbols, simple transposition can be easily detected by the cryptanalyst by doing a frequency count. If the ciphertext exhibits a frequency distribution very similar to plaintext, it is most likely a transposition. This can then often be attacked by anagramming—sliding pieces of ciphertext around, then looking for sections that look like anagrams of English words, and solving the anagrams. Once such anagrams have been found, they reveal information about the transposition pattern, and can consequently be extended.
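The frequency-count test can be sketched as follows. The transposition used here (all even positions, then all odd positions) is a toy stand-in chosen for brevity, and the sample message is an assumption.

```python
from collections import Counter

plaintext = "WEAREDISCOVEREDFLEEATONCE"

# Toy transposition: characters at even positions, followed by those at odd positions.
ciphertext = plaintext[::2] + plaintext[1::2]

# The letters are only reordered, so the two histograms are identical.
same_histogram = Counter(plaintext) == Counter(ciphertext)
print(ciphertext != plaintext, same_histogram)  # True True
```

A substitution cipher would fail this comparison, because it changes which symbols appear rather than where they appear.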
Simpler transpositions also often suffer from the property that keys very close to the correct key will reveal long sections of legible plaintext interspersed by gibberish. Consequently such ciphers may be vulnerable to optimum-seeking algorithms such as genetic algorithms. [4]
Substitution and Transposition are two cryptographic techniques.
Substitution’s goal is confusion. It replaces one letter with another according to a translation table. The table is used to substitute a character or symbol for each character of the original message. The table can take different forms:
- Monoalphabetic: Caesar cipher
- Polyalphabetic: Vigenère tableau
- Vernam cipher: One-time pad
- Long Random number sequences
- Book ciphers
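To contrast the first two forms in the list, here is a small sketch of a Vigenère-style cipher, in which each key letter selects a different shifted alphabet. The function name `vigenere` and the sample key and messages are assumptions for illustration. The key is assumed to be non-empty and to contain letters only.

```python
import string
from itertools import cycle

ALPHABET = string.ascii_uppercase

def vigenere(text, key, decrypt=False):
    """Shift each letter by the amount given by the next key letter (A=0, B=1, ...)."""
    sign = -1 if decrypt else 1
    shifts = cycle([ALPHABET.index(k) for k in key.upper()])
    out = []
    for ch in text.upper():
        if ch in ALPHABET:
            out.append(ALPHABET[(ALPHABET.index(ch) + sign * next(shifts)) % 26])
        else:
            out.append(ch)  # leave spaces and punctuation untouched
    return "".join(out)

print(vigenere("ATTACKATDAWN", "LEMON"))                # LXFOPVEFRNHR
print(vigenere("LXFOPVEFRNHR", "LEMON", decrypt=True))  # ATTACKATDAWN
print(vigenere("HELLO", "D"))                           # KHOOR
```

With a one-letter key every position uses the same shift, so the cipher collapses to a Caesar cipher. With a longer key the same plaintext letter can encrypt to different ciphertext letters (the two T's in ATTACK become X and F), which is what blunts a simple frequency count.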
Transposition’s goal is diffusion. It involves the swapping of elements of a message to hide the meaning, so it changes where the letters are located. Transpositions try to break established patterns.
The aim is to make it difficult for an attacker to determine how the message and key were transformed. A technique based on transposition aims to diffuse the plaintext as widely as possible across the ciphertext.
- Columnar Transposition: Break up the text into columns
- Double Transposition: Columnar performed twice using the same or different keyword
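Both techniques can be sketched together. In this version the message is written row by row under the keyword, and the columns are read out in alphabetical order of the keyword letters. The function names, the second keyword "STRIPE", and the sample message are assumptions for the example. The last row is left incomplete instead of being padded, which is one of several conventions in use.

```python
def column_order(keyword):
    """Column indices sorted by keyword letter (ties broken left to right)."""
    return sorted(range(len(keyword)), key=lambda i: (keyword[i], i))

def columnar_encrypt(text, keyword):
    ncols = len(keyword)
    # text[c::ncols] is column c of the row-by-row grid
    return "".join(text[c::ncols] for c in column_order(keyword))

def columnar_decrypt(ciphertext, keyword):
    ncols = len(keyword)
    nrows, extra = divmod(len(ciphertext), ncols)
    columns = [""] * ncols
    pos = 0
    for c in column_order(keyword):
        length = nrows + (1 if c < extra else 0)  # leftmost columns hold the spill-over row
        columns[c] = ciphertext[pos:pos + length]
        pos += length
    return "".join(columns[c][r]
                   for r in range(nrows + 1)
                   for c in range(ncols)
                   if r < len(columns[c]))

message = "WEAREDISCOVEREDFLEEATONCE"
single = columnar_encrypt(message, "ZEBRAS")
double = columnar_encrypt(single, "STRIPE")
print(single)  # EVLNACDTESEAROFODEECWIREE
print(double)  # CAEENSOIAEDRLEFWEDREEVTOC
print(columnar_decrypt(columnar_decrypt(double, "STRIPE"), "ZEBRAS"))
```

Decrypting a double transposition means undoing the two passes in reverse order, second keyword first.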
Weaknesses of each that leave them vulnerable to cryptanalysis.
Cryptanalysis is the practice of studying encryption and encrypted messages with the goal of finding the hidden meanings of the messages.
In the case of Substitution,
- The cipher relies entirely on a translation table that maps one letter to another, so once the table is known, the code is broken.
- Short words, words with repeated patterns, and common initial and final letters all give clues for guessing the pattern of the encryption.
- An encryption algorithm must be regular for it to be algorithmic and for cryptographers to be able to remember it. Unfortunately, the regularity gives clues to the cryptanalyst to break a substitution.
In the case of Transposition,
- Just as there are characteristic letter frequencies, there are also characteristic patterns of adjacent letters, called digrams (groups of 2 letters) and trigrams (groups of 3 letters). The frequency of appearance of these letter groups can be used to match up plaintext letters that have been separated in a ciphertext.
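Counting digrams and trigrams is straightforward, as this rough sketch shows. The helper name `ngram_counts` and the sample text are made up for the example. Spaces are stripped first, so groups that span a word boundary are counted too.

```python
from collections import Counter

def ngram_counts(text, n):
    """Count every run of n adjacent letters, ignoring spaces and punctuation."""
    letters = "".join(c for c in text.upper() if c.isalpha())
    return Counter(letters[i:i + n] for i in range(len(letters) - n + 1))

sample = "THE THEORY THAT THE OTHER THING THERE"
digrams = ngram_counts(sample, 2)
trigrams = ngram_counts(sample, 3)
print(digrams.most_common(1))  # [('TH', 7)]
print(trigrams["THE"])         # 5
```

An attacker working on a transposition can use a table like this, built from ordinary English text, to score candidate rearrangements: an ordering that puts T next to H more often is more likely to be right than one that produces rare pairs.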