Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Posting-Version: version B 2.10.2 9/5/84; site randvax.UUCP
Path: utzoo!watmath!clyde!burl!ulysses!mhuxr!mhuxt!houxm!mtuxo!mtunh!mtung!mtunf!ariel!vax135!cornell!uw-beaver!tektronix!hplabs!sdcrdcf!randvax!jim
From: jim@randvax.UUCP (Jim Gillogly)
Newsgroups: net.crypt
Subject: Re: Encryption using compression
Message-ID: <2586@randvax.UUCP>
Date: Sun, 7-Jul-85 20:36:57 EDT
Article-I.D.: randvax.2586
Posted: Sun Jul 7 20:36:57 1985
Date-Received: Sat, 13-Jul-85 10:14:41 EDT
References: <5992@duke.UUCP>
Distribution: net
Organization: Banzai Institute
Lines: 72

In article <5992@duke.UUCP> bet@ecsvax.UUCP (Bennett E. Todd III) writes:
> Wouldn't a simpleminded substitution or
>transposition algorithm be beefed up to the point of requiring search
>of the key space by applying a good compression program to the
>plaintext first?

The algorithm would certainly be strengthened (although probably not to
the point of requiring search of the entire key space). However, you
need to be a little careful. Some compression programs (e.g. "pack", a
Huffman-coding program) will take a first pass through the data to
assign the variable-length codes, put out the decoding tree at the
beginning of the output file, then follow with the compressed data. To
attack a simple substitution of this (or perhaps even the result of
running an XOR'ed shift register (or other "random number" stream)
across it), the structure of the decoding tree could be observed from
the program, and compared with the result.

> If I were to use a simple substitution cypher with an
>arbitrary premutation of bytes on the output of a good compression
>program how could the resulting file be attacked?

I'm not sure what you mean by an "arbitrary" permutation of the bytes.
If you mean a fixed permutation like the initial and final (bitwise)
permutations of the DES, there wouldn't be any additional security in
the permutation, since we assume it's known to the cryptanalyst. If you
mean picking a permutation based on some key-controlled "random number"
stream, that will certainly add strength.

Let's take the simple sub first. Simple substitution alone might be
crackable given some assumptions about the underlying text. For
example, if we assume English written in ASCII we can try a number of
decoding trees based on standard English, including branches for
situations where there are close choices. I believe there are a lot
fewer likely decoding trees than possible keys. Spaces, for example,
would have a short bit string and would happen frequently, so when we
start getting close on the high frequency letters and digraphs we'd
start getting reasonable-looking output. There will be more choices for
low-freq letters and digraphs, but they will also show up in the text
less often to mess us up; and when they do, they're likely to have
similar-length encoding strings.

A transposition of the kind you describe would probably be attackable
with a chosen-plaintext attack, or maybe even known-plaintext. The
chosen-plaintext approach is the nastiest possible attack for the
cryptanalyst to make, since it assumes that not only does he know the
correspondence between plaintext and ciphertext, but he can also
control what plaintext is to be enciphered. This is sometimes the case
in a database application, for example: I send an invoice to somebody
who will be putting it into a database, and I include in my address
(perhaps) some information whose encryption will tell me what I need to
know. In _The Codebreakers_ somewhere David Kahn tells about a code
system that was giving trouble ...
the cryptanalysts produced a memo that included some words that were in
doubt, leaked it to the target agents, and then read the encryption as
it got sent verbatim to home base. All this is to point out that
chosen-plaintext is not out of the question as an attack.

In any case, if one is trying to figure out how a transposition works
and has the luxury of chosen-plaintext, one can put zeroes everywhere
except in one location, and see where it goes; put it everywhere
possible and you've unwound the transposition for that block. If all
blocks use the same transposition, you're done. If not, knowing how the
transposition is produced may give some insight into the random number
stream, which may be broken by as few numbers as have been used to
produce this particular transposition.

So compression alone won't be enough to turn a weak system into an
"unbreakable" (modulo exhaustive key search) one. Note that the DES is
not known (by me, anyway) to be subject to any of these attacks
(including the most powerful chosen-plaintext attack).
-- 
	Jim Gillogly
	{decvax, vortex}!randvax!jim
	jim@rand-unix.arpa
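
A rough sketch of the "zeroes everywhere except in one location" probe
described above, in Python. Everything named here is an assumption made
for illustration: the 16-byte block size, the make_transposition_oracle
stand-in for the system under attack, and the idea that the same fixed
byte transposition is applied to every block. It is not a model of any
particular real cipher.

```python
import random

BLOCK_SIZE = 16  # assumed block length, chosen only for illustration


def make_transposition_oracle(seed):
    """Hypothetical target: one secret byte transposition per block."""
    rng = random.Random(seed)
    perm = list(range(BLOCK_SIZE))
    rng.shuffle(perm)  # perm[src] = output position of input byte src

    def encrypt_block(block):
        out = bytearray(BLOCK_SIZE)
        for src, dst in enumerate(perm):
            out[dst] = block[src]
        return bytes(out)

    return encrypt_block, perm


def recover_transposition(encrypt_block, block_size=BLOCK_SIZE):
    """Chosen-plaintext probe: one nonzero byte per query, rest zero."""
    recovered = []
    for src in range(block_size):
        probe = bytearray(block_size)  # all zeroes
        probe[src] = 1                 # the single marked location
        ciphertext = encrypt_block(bytes(probe))
        recovered.append(ciphertext.index(1))  # where the marker landed
    return recovered


def decrypt_block(block, perm):
    """Undo the transposition: plaintext[src] = ciphertext[perm[src]]."""
    return bytes(block[dst] for dst in perm)


if __name__ == "__main__":
    encrypt_block, secret = make_transposition_oracle(seed=1985)
    recovered = recover_transposition(encrypt_block)
    print("recovered matches secret:", recovered == secret)
    message = b"ATTACK AT DAWN!!"  # exactly one 16-byte block
    print(decrypt_block(encrypt_block(message), recovered))
```

Under these assumptions the probe needs one chosen block per byte
position. If the transposition changes from block to block, the same
probe only unwinds the block it was run against, as noted above.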