SUNY Geneseo Department of Mathematics

Cracking Substitution Ciphers, Part 1

Friday, October 1

INTD 105 17
Fall 2021
Prof. Doug Baldwin

Anything You Want to Talk About?

(No.)

Breaking Substitution Ciphers

Based on “Cracking the Substitution Cipher” and its subsections at the Black Chamber web site.

How Do You Do It?

Identify the most common letters first (e.g., E, T, A for English) and work up from there to the less common ones. This technique is called frequency analysis.

See “The Gold Bug” for a surprisingly good description, although it may make the process sound a little too easy.
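As a rough illustration of the counting step in frequency analysis, here is a minimal Python sketch. The function name `letter_frequencies` and the short sample string are assumptions made for this example; they are not part of the Black Chamber tool or of any particular library.

```python
from collections import Counter

def letter_frequencies(text):
    """Return (letter, fraction) pairs, most common first, ignoring non-letters."""
    letters = [ch for ch in text.upper() if ch.isalpha()]
    counts = Counter(letters)
    total = len(letters)
    # most_common() is empty for empty input, so no division by zero occurs.
    return [(letter, count / total) for letter, count in counts.most_common()]

# Hypothetical sample; paste in any ciphertext you want to study.
sample = "LJA UHWTLBHD BU QN DH EACDU"
for letter, fraction in letter_frequencies(sample)[:3]:
    print(f"{letter}: {fraction:.1%}")
```

The ciphertext letters at the top of the list are the natural candidates for E, T, and A, but short texts can easily deviate from the typical English frequencies.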

Practice

Here are three pieces of text, all related to this course, that I’ve encrypted with a substitution cipher. See if you can decrypt one or more.

I used the same cipher for each text, so you could treat them all as one giant text if you want. If you treat them as separate messages, cracking one will greatly help you crack the others.

I suggest that you copy and paste the text you want to work with into documents or tools of your own, where you can edit it, make notes, rearrange it, etc. as you work.

XCEQFBOPA BD LJA ZHTFLJ SBDLAF HZ LJA SCF: C PJHUL LHSD (FHQAFL JCFFBU)

ZHF LJHTUCDOU HZ NACFU, KBDPU, ITAADU, CDO PADAFCWU JCYA FAWBAO HD AZZBXBADL XHEETDBXCLBHD BD HFOAF LH PHYAFD LJABF XHTDLFBAU CDO XHEECDO LJABF CFEBAU. CL LJA UCEA LBEA, LJAN JCYA CWW QAAD CSCFA HZ LJA XHDUAITADXAU HZ LJABF EAUUCPAU ZCWWBDP BDLH LJA SFHDP JCDOU, ... (UBEHD UBDPJ)

LJA UHWTLBHD BU QN DH EACDU UH OBZZBXTWL CU NHT EBPJL QA WACO LH BECPBDA ZFHE LJA ZBFUL JCULN BDUVAXLBHD HZ LJA XJCFCXLAFU (AOPCF CWWCD VHA)

Discussion

No one claimed to have successfully broken one of the messages, so we tried breaking the longest one as a group.

Using the frequency analysis tool at the Black Chamber web site, we found that “A” is by far the most common ciphertext letter, and that its frequency exactly matches the frequency of “E” in plain English. So we made a fairly confident guess that ciphertext “A” stands for plaintext letter “E”.

Then someone pointed out that the first message has a single-letter word “C.” Since the only common 1-letter words in English are “A” and “I,” this tells us that ciphertext letter “C” stands for one of those two plaintext letters. But we weren’t able to convincingly figure out which.
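Spotting one-letter words can also be automated. The following is only a sketch; the helper name `single_letter_words` is made up for this example, and it assumes words are runs of letters separated by spaces or punctuation.

```python
import re

def single_letter_words(ciphertext):
    """Return the set of one-letter words found in the ciphertext."""
    words = re.findall(r"[A-Z]+", ciphertext.upper())
    return {word for word in words if len(word) == 1}

first_message = ("XCEQFBOPA BD LJA ZHTFLJ SBDLAF HZ LJA SCF: "
                 "C PJHUL LHSD (FHQAFL JCFFBU)")
print(single_letter_words(first_message))  # {'C'}
```

Each letter this reports is a candidate for plaintext “A” or “I”; the code cannot say which.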

Looking at the next most common letters in the message, we thought that ciphertext “B” might be plaintext “I,” or that ciphertext “D” might be plaintext “T.” We didn’t find strong signs either for or against these guesses.

Then someone pointed out that the 3-letter word “LJA” is common across all 3 ciphertext messages, suggesting that it might be “THE.” We’ll try this hypothesis in Monday’s class.
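One way to try a hypothesis like this is to substitute the guessed letters and see whether the rest starts to look like English. The sketch below is an assumption-laden illustration: the function name `apply_partial_key` is hypothetical, and it follows the common convention of writing guessed plaintext in lowercase while leaving unsolved ciphertext letters in uppercase.

```python
def apply_partial_key(ciphertext, key):
    """Replace ciphertext letters that appear in key; leave all other characters alone."""
    return "".join(key.get(ch, ch) for ch in ciphertext)

# Hypothesis from class: ciphertext LJA stands for plaintext THE.
hypothesis = {"L": "t", "J": "h", "A": "e"}

first_message = ("XCEQFBOPA BD LJA ZHTFLJ SBDLAF HZ LJA SCF: "
                 "C PJHUL LHSD (FHQAFL JCFFBU)")
print(apply_partial_key(first_message, hypothesis))
```

If the partly decrypted text contains fragments that could not be English, the hypothesis is probably wrong; if plausible fragments appear, they suggest guesses for further letters to add to the key.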

Next

Spend some more time trying to break the substitution cipher.

Then, time permitting, start a new story, and a new cryptosystem: start reading Enigma.

Please start reading chapters 1 and 2 for Monday, and aim to finish those chapters by Wednesday. (Note that the chapters are the long things with titles, not the numbered but untitled sections in them.)