Path: utzoo!attcan!uunet!seismo!sundc!pitstop!sun!amdahl!amdcad!rpw3
From: rpw3@amdcad.AMD.COM (Rob Warnock)
Newsgroups: comp.protocols.ibm
Subject: Re: (none)
Message-ID: <23005@amdcad.AMD.COM>
Date: 23 Sep 88 07:42:05 GMT
References: <8809221350.AA14186@jade.berkeley.edu>
Reply-To: rpw3@amdcad.UUCP (Rob Warnock)
Organization: [Consultant] San Mateo, CA
Lines: 59

In article <8809221350.AA14186@jade.berkeley.edu> "John A. Pershing Jr." writes:
+---------------
| No, you're not missing anything. The reliability provided by the SNA DLC
| layer is carefully preserved by all higher layers, so that additional
| CRCs are probably redundant. There is probably a tacit "assumption" that
| the various nodes are reliable -- that is, that they won't introduce any
| bit errors without detecting the fault (e.g., via a machine check).
+---------------

But actual disastrous experience in the ARPAnet (the infamous "black hole",
among others) showed the ARPAnauts that there *are* often nodes out there
which can introduce errors without raising a machine check. Even today,
that's still quite likely. Though the IBM PC family has parity-protected
memory, many of the communications or network boards that plug into it
don't. The same is sadly true for much more expensive environments.

For performance reasons, many embedded controller systems do not use parity
on their data memory. Many designers apparently feel that the parity
problems were "solved" when the DRAM alpha problem was diagnosed and fixed
in the early 80's, or they are using static RAMs which "aren't supposed to"
have errors.

The TCP/IP/UDP checksum is very good at catching single-bit errors of the
kind which arise in packet switches, routers, and bridges without parity
memory, though not particularly good at communications-link long-burst
errors. (That's what the CRC-16 or CRC-32 is for!)

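The checksum behaviour described above can be sketched in a few lines.
This is only an illustration, assuming the 16-bit ones'-complement sum
documented in RFC 1071; the function names, the flip_bit helper, and the
sample payload are all made up for the example. It checks that every
single-bit flip in a buffer changes the checksum, and shows one kind of
damage the sum cannot see (two 16-bit words exchanged), which is part of
why a link-level CRC is still wanted.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit ones'-complement checksum in the style of RFC 1071."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # big-endian 16-bit words
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF


def flip_bit(data: bytes, bit: int) -> bytes:
    """Return a copy of data with one bit inverted (hypothetical helper)."""
    buf = bytearray(data)
    buf[bit // 8] ^= 0x80 >> (bit % 8)
    return bytes(buf)


# Made-up payload standing in for a datagram sitting in non-parity memory.
payload = b"SNA needs an end-to-end check!"
good = internet_checksum(payload)

# Try every possible single-bit error; each one should change the checksum.
single_bit_caught = all(
    internet_checksum(flip_bit(payload, b)) != good
    for b in range(len(payload) * 8)
)

# Exchange the first two 16-bit words. The sum is order-independent,
# so this corruption leaves the checksum unchanged.
swapped = payload[2:4] + payload[0:2] + payload[4:]
swap_missed = internet_checksum(swapped) == good

print("all single-bit flips caught:", single_bit_caught)
print("word swap goes undetected:  ", swap_missed)
```

The word swap is just one convenient example of an error pattern that
slips through; it is not meant as a model of real link bursts.
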
+---------------
| As I remember (it's been a long time), TCP doesn't make many assumptions
| about the reliability of the lower layers; therefore, it needs some sort
| of checksum to provide reliable transport. If, in fact, the lower layers
| *are* reliable then TCP probably doesn't need the checksum; however, a
| proper TCP implementation cannot make such an assumption.
+---------------

The one assumption TCP *does* make is that lower layers will not deliver a
corrupted datagram and claim it's correct. The TCP (and IP and UDP)
checksum(s) are a low-cost but extremely useful last-ditch protection of
this assumption, and are well worth it.

In fact, as 3rd-party vendors have entered the IBM SNA world, the assumption
in the IBM world that bit errors will cause either physical checksum errors
or CPU machine checks is probably becoming invalid, and SNA *should* have
an end-to-end checksum a la TCP. (Probably too late, though...)

By the way, "end-to-end" is the key. Having a network board compute your
IP checksum for you is no good if the data lies around in non-parity memory
*after* the checksum has been checked. (Though it's probably o.k. if the
checksum is computed *on the fly* as the data is being DMA'd to host memory,
but few [none?] of the so-called "smart" network boards do it this way.)


Rob Warnock
Systems Architecture Consultant

UUCP:     {amdcad,fortune,sun}!redwood!rpw3
ATTmail:  !rpw3
DDD:      (415)572-2607
USPS:     627 26th Ave, San Mateo, CA 94403