Path: utzoo!utgpu!watmath!clyde!att!pacbell!ames!oliveb!sun!kanawha!cs
From: cs@kanawha.Sun.COM (Carl Smith)
Newsgroups: comp.protocols.nfs
Subject: Re: Suggestion for improved NFS monitoring
Keywords: retries, xids, idempotentcy
Message-ID: <80717@sun.uucp>
Date: 9 Dec 88 05:12:36 GMT
References: <773@sequent.cs.qmc.ac.uk>
Sender: news@sun.uucp
Reply-To: cs@sun.UUCP (Carl Smith)
Organization: Sun Microsystems, Mountain View
Lines: 49

> All RPC requests are marked with an identifier (xid) so that
> incoming replies can be matched with requests. Currently most
> NFS clients require that the reply xid exactly match the
> request xid, and don't change the xid on retries, but there is
> no compelling reason for this to be so.

Although I can imagine situations in which a client might partition
its XID space to encode things like retransmissions, I'd be most
surprised if it weren't true that ALL clients require an RPC reply XID
to exactly match the request XID.

Also, it's not true that the XID isn't changed on retries. In ports
derived from NFSSRC, the XID is (wrongly) changed when the RPC level
times out and returns to the NFS caller, which then retries. We've
recently fixed this in SunOS. The philosophy behind it is that it's
more important to have correct behavior than to keep good
statistics. :-)

Since most servers keep a cache of recent successful non-idempotent
transactions (including the RPC XIDs associated with those
transactions), we'd like to make it easy for them to detect
retransmissions, and they may do that only by doing bit-for-bit
comparisons on RPC XIDs (after all, an RPC server doesn't know anything
of the XID space partitioning its clients may or may not be using).
Changing the XIDs on them only makes their job more difficult.
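
To make the point about bit-for-bit XID comparison concrete, here is a
minimal sketch in Python of a server-side cache of recent successful
requests. Everything in it is hypothetical: the names DupRequestCache
and make_remove, the "OK"/"NOENT" strings standing in for NFS status
codes, and the choice to key the cache on client address plus XID are
assumptions for illustration, not a description of any real server.

```python
# Hypothetical sketch: a server-side cache of recent successful
# non-idempotent requests, matched by exact (client, XID) equality.

class DupRequestCache:
    def __init__(self):
        # (client address, xid) -> reply already sent for that request
        self._replies = {}

    def handle(self, client, xid, execute):
        key = (client, xid)
        # Bit-for-bit XID match: replay the saved reply, do not re-execute.
        if key in self._replies:
            return self._replies[key]
        reply = execute()
        # Only successful transactions are remembered.
        if reply == "OK":
            self._replies[key] = reply
        return reply


def make_remove(files, name):
    """Build a non-idempotent 'remove' operation on a set of file names."""
    def remove():
        if name in files:
            files.remove(name)
            return "OK"
        return "NOENT"
    return remove


files = {"a", "b"}
cache = DupRequestCache()

# Retry that keeps its XID: the cache recognizes it and replays "OK".
first_a = cache.handle("client1", 0x1234, make_remove(files, "a"))
same_xid_retry = cache.handle("client1", 0x1234, make_remove(files, "a"))

# Retry that changes its XID: the cache misses, the remove runs again,
# and the client sees a spurious "NOENT".
first_b = cache.handle("client1", 0x2000, make_remove(files, "b"))
changed_xid_retry = cache.handle("client1", 0x2001, make_remove(files, "b"))

print(first_a, same_xid_retry, first_b, changed_xid_retry)
```

In this sketch the retry that reuses its XID gets the original reply
back, while the retry with a fresh XID looks like a brand-new request
and is executed a second time.
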
> I suggest the following way of marking retries:
> 1) All original xids have most & least significant byte = zero
> 2) All retry xids have most and least significant byte = ones
>
> The reason for using both the most and least significant bytes
> rather than just a single bit is that I want to detect these on
> the server, and there is no reason why the client should waste
> time putting its xids into network order - this means that the
> least significant bit (bit 0) might turn up as bit 24 on some
> systems.
>
> Does anyone think this is a good idea? Do you want to be able
> to identify retries on the server?

This still won't allow you to tell how many retransmissions are
occurring, and that may be interesting. Moreover, the simple detection
that an RPC request is a retransmission is useless to the server.

Let's use your unlink example. Suppose an NFSPROC_REMOVE operation is
retried, that the server sees the retransmission but not the original
request, and that the file doesn't exist. To return an error
(NFSERR_NOENT) would be appropriate if the server knew that the file
had never existed. To return no error (NFSERR_OK) would be appropriate
if the server knew that the file had existed and had been removed by
the original request. To know only that the request is a
retransmission doesn't help in the least.

			Carl