Path: utzoo!utgpu!water!watmath!clyde!att!osu-cis!tut.cis.ohio-state.edu!mailrus!ames!killer!pollux!dalsqnt!rpp386!jfh
From: jfh@rpp386.UUCP (John F. Haugh II)
Newsgroups: comp.misc
Subject: Re: Anybody have a checksum algorithm that detects byte-swap?
Message-ID: <3341@rpp386.UUCP>
Date: 29 Jun 88 01:20:59 GMT
References: <735@vsi.UUCP>
Reply-To: jfh@rpp386.UUCP (The Beach Bum)
Distribution: comp
Organization: Big "D" Home for Wayward Hackers
Lines: 37

In article <735@vsi.UUCP> friedl@vsi.UUCP (Stephen J. Friedl) writes:
>     I am writing some sort programs on two different machines
>and really don't want to move megabyte files around to see if the
>output from identically-run programs is the same.
>     I have a naive algorithm of multiplying the byte just read
>with the byte number:
>
>	while (c = getchar(), c != EOF)
>		sum += (c * ++count);

naive, i'll say ;-)

i doubt you'll ever overflow a 32bitter, but a 16 bit machine will
overflow after (possibly) 256 characters, assuming a 16 bit sum.
[ unless you go checking 24MB files ;-) ]

i suggest trying something more random -

	unsigned long sum = 0;	/* start from a known value */
	int c;			/* int, so EOF is distinguishable */

	while (c = getchar (), c != EOF)
		sum = (((sum << 1) & 0xfffffffe) | ((sum >> 31) & 0x00000001)) ^ c;

(in other words, a 32 bit rotate left by one followed by xor-ing in the
character).  this should be as fast as or faster than yours (no
multiply), and it shouldn't ever overflow.  it's also not as complex
as a full blown CRC16.

i've used similar code for hashing functions with nice results.

- john.
-- 
John F. Haugh II                 +--------- Cute Chocolate Quote ---------
HASA, "S" Division               | "USENET should not be confused with
UUCP:   killer!rpp386!jfh        |  something that matters, like CHOCOLATE"
DOMAIN: jfh@rpp386.uucp          |             -- with my apologizes
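
for anyone who wants to try the rotate-and-xor idea from the article
above on a buffer rather than stdin, here is a minimal sketch.  the
names rotl1 and rotxor_sum are made up for illustration and are not
from the original post, and fixed-width uint32_t is assumed in place
of long so the rotate is exactly 32 bits wide on any machine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Rotate a 32-bit value left by one bit (hypothetical helper). */
static uint32_t rotl1(uint32_t x)
{
    return (x << 1) | (x >> 31);
}

/*
 * Rotate-and-xor checksum over a buffer (hypothetical helper).
 * Each step rotates the running sum left by one bit, then xors in
 * the next byte, so a byte's contribution depends on its position.
 */
static uint32_t rotxor_sum(const unsigned char *buf, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    assert(buf != NULL || len == 0);
    for (i = 0; i < len; i++)
        sum = rotl1(sum) ^ buf[i];
    return sum;
}
```

with this sketch "ab" and "ba" give different sums, so a swap of two
adjacent, distinct bytes shows up.  one caveat: a 32 bit rotate comes
back around every 32 steps, so two bytes whose positions differ by a
multiple of 32 can be exchanged without changing the result.  treat
it as a quick sanity check, not a replacement for a real CRC.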