Path: utzoo!attcan!uunet!husc6!bbn!rochester!udel!princeton!phoenix!haahr
From: haahr@phoenix.Princeton.EDU (Paul Gluckauf Haahr)
Newsgroups: comp.unix.wizards
Subject: Re: Is write(2) "atomic" ?
Summary: it's worse than just non-atomic, it loses data
Message-ID: <3247@phoenix.Princeton.EDU>
Date: 12 Jul 88 18:29:20 GMT
References: <11410005@eecs.nwu.edu>
Reply-To: haahr@princeton.edu (Paul Gluckauf Haahr)
Organization: Princeton University, NJ
Lines: 172

in article <11410005@eecs.nwu.edu> naim@eecs.nwu.edu (Naim Abdullah) writes:
> Do UNIX semantics guarantee that write(2) calls will be "atomic" ?

in general, no.  it depends on the implementation.  use some
synchronization primitives or one byte writes only.  worse than
just mixing up data, if two processes are pounding away at a file,
data may be lost.  (see below)

> Suppose, process A executes write(fd, "123", 3) and process B
> executes write(fd, "456", 3) "concurrently". The file descriptor fd
> is shared between them (the file was creat(2)'ed for writing by the
> common parent of A and B). Does UNIX guarantee that the contents of
> the descriptor will be "123456" or "456123" (depending on which of
> A and B won the race) but never "124536" ? Does it make a difference
> whether the descriptor is a pipe or a terminal or a disk file or a
> tape drive or something else ?

some special file types may implement atomic writes.  notably,
berkeley sockets (at least under SunOS 3.5 and Ultrix 2.0) appear,
empirically, to be fully atomic.  this is probably to support
"reliable" protocols like tcp/ip.  (could someone who knows the
tcp/ip protocol spec confirm whether or not it requires atomic
writes?)  as a side effect, pipes when implemented by socketpair(2)
seem atomic.

i don't know about system v streams, but version 9 streams and pipes
are non-atomic, and warn about this on their manual pages (along with
a comment that a fast reader and slow writers can simulate atomicity).
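
as one way to follow the "use some synchronization primitives" advice
above, here is a minimal sketch (not one of the programs from this
article) that wraps each write(2) in a whole-file advisory lock taken
with fcntl(2) F_SETLKW.  it assumes a posix system and a regular file;
fcntl locks do not apply to pipes or terminals, and over nfs they
depend on the lock service working.  the names whole_file_lock,
locked_write, locked_demo and check_records are made up for the
illustration.

```c
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* take (F_WRLCK) or drop (F_UNLCK) an advisory lock on the whole file */
static int whole_file_lock(int fd, short type)
{
    struct flock fl;

    memset(&fl, 0, sizeof fl);
    fl.l_type = type;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;                       /* 0 means through end of file */
    return fcntl(fd, F_SETLKW, &fl);    /* waits for a conflicting lock */
}

/* write one record while holding the lock; 0 on success, -1 on error */
static int locked_write(int fd, const char *buf, size_t len)
{
    size_t done = 0;
    ssize_t w;

    if (whole_file_lock(fd, F_WRLCK) == -1)
        return -1;
    while (done < len) {                /* also cope with short writes */
        if ((w = write(fd, buf + done, len - done)) == -1) {
            whole_file_lock(fd, F_UNLCK);
            return -1;
        }
        done += (size_t) w;
    }
    return whole_file_lock(fd, F_UNLCK);
}

/* hypothetical demo: parent writes runs of A, child runs of B, both
   through one descriptor opened before the fork (shared file pointer) */
int locked_demo(const char *path, size_t bufsize, int nwrites)
{
    int fd, i, wstatus, status = 0;
    pid_t pid;
    char *buf;

    if ((fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644)) == -1)
        return -1;
    if ((buf = malloc(bufsize)) == NULL) {
        close(fd);
        return -1;
    }
    if ((pid = fork()) == -1) {
        free(buf);
        close(fd);
        return -1;
    }
    memset(buf, pid == 0 ? 'B' : 'A', bufsize);
    for (i = 0; i < nwrites && status == 0; i++)
        status = locked_write(fd, buf, bufsize);
    if (pid == 0)
        _exit(status == 0 ? 0 : 1);     /* child is done */
    free(buf);
    close(fd);
    if (waitpid(pid, &wstatus, 0) == -1 || !WIFEXITED(wstatus)
        || WEXITSTATUS(wstatus) != 0)
        status = -1;
    return status;
}

/* 0 if the file holds only whole records of A or B; counts each kind */
int check_records(const char *path, size_t bufsize, int *na, int *nb)
{
    FILE *fp = fopen(path, "r");
    char *buf = malloc(bufsize);
    size_t got, i;
    int ok = (fp != NULL && buf != NULL) ? 0 : -1;

    *na = *nb = 0;
    while (ok == 0 && (got = fread(buf, 1, bufsize, fp)) > 0) {
        if (got != bufsize || (buf[0] != 'A' && buf[0] != 'B'))
            ok = -1;
        for (i = 1; ok == 0 && i < bufsize; i++)
            if (buf[i] != buf[0])
                ok = -1;
        if (ok == 0) {
            if (buf[0] == 'A')
                ++*na;
            else
                ++*nb;
        }
    }
    if (fp != NULL)
        fclose(fp);
    free(buf);
    return ok;
}
```

fcntl locks belong to the process, so the parent and child exclude
each other even though they share one descriptor.  the data copy and
the file pointer update both happen while the lock is held, so calling
locked_demo("out", 8193, 15) from a main() should leave a file that
check_records accepts with 15 records of each letter.  the lock is
advisory: it only helps if every writer takes it.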

there is a shell archive at the end of this article containing two
programs i have used to test the atomicity of writes.

the first, write.c, creates two processes.  each writes n strings
of A or B to standard output, with the length and number of strings
set from the command line.  no synchronization is attempted.  the
original process writes strings of A, the child writes strings of B.
so, for example, the output of "write 5 4" could be

    AAAAABBBBBAAAAABBBBBAAAAAAAAAABBBBBBBBBB

(where 5 is the length of the writes, and 4 is the number of strings)

count.c reads the output of write.c and counts the number of each
character (sort of like uniq -c for characters instead of lines).
the shell script bufs translates the number of characters into number
of buffers, which will be fractional if there was a non-atomic write.
"count | bufs 5" for the previous data gives

       5 1        A
       5 1        B
       5 1        A
       5 1        B
      10 2        A
      10 2        B

where the problem comes in is larger buffers.  using large enough
writes gives fractional numbers in the second column, on an nfs or
nd filesystem.  on a sun, i have not been able to generate a partial
record, i.e. "124536" from the original article, with a local disk.

what i consider a more serious problem occurs much more frequently
than fractional writes.  data gets dropped.  this occurs with both
local and remote file systems.  using a "write 8193 15" to a local
(smd) disk with an 8192 byte filesystem blocksize on a sun (similar
results were seen on a vax) gave

   90123 11       A
    8193 1        B
    8193 1        A
    8193 1           <<< empty
   16386 2        A
  114702 14       B

examining the file with od showed nul (0) characters in that area.
it takes fewer writes to get a similar result repeatedly with an nfs
or nd filesystem.

what seems to be happening is that between the time one process
writes its data and when it updates the file pointer, the other
process gets scheduled to run.  to solve this problem, one would
need to add locks or semaphores to file table entries to guarantee
exclusive access to the file pointers.
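
the guess above about the file pointer can be made concrete with a toy
model.  the sketch below is not kernel code: it assumes that a write
is two steps, copy the data at the shared offset and then add the
length to the offset, and it plays one bad interleaving by hand.
struct toy_file, the toy_* functions and the 4 byte record length are
all invented for the illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define RECLEN 4    /* toy record length, standing in for 8193 */
#define FILESZ 64   /* toy file size, plenty for the scenario */

/* toy stand-in for a file table entry shared by two processes */
struct toy_file {
    char data[FILESZ];  /* file contents; untouched bytes stay nul */
    long offset;        /* the shared file pointer */
};

/* step 1 of a toy write: copy the record at the current offset */
static void toy_copy(struct toy_file *f, char c)
{
    assert(f->offset + RECLEN <= FILESZ);
    memset(f->data + f->offset, c, RECLEN);
}

/* step 2 (assumed): add the length to the shared offset */
static void toy_advance(struct toy_file *f)
{
    f->offset += RECLEN;
}

/* an undisturbed write is both steps back to back */
static void toy_write(struct toy_file *f, char c)
{
    toy_copy(f, c);
    toy_advance(f);
}

/* render the file up to the offset, showing nul bytes as '.' */
static void toy_render(const struct toy_file *f, char *out)
{
    long i;

    for (i = 0; i < f->offset; i++)
        out[i] = f->data[i] ? f->data[i] : '.';
    out[i] = '\0';
}

/* one bad interleaving; out needs room for FILESZ + 1 chars */
void toy_scenario(char *out)
{
    struct toy_file f;

    memset(&f, 0, sizeof f);

    toy_write(&f, 'A');     /* clean write by process A */

    toy_copy(&f, 'A');      /* A copies its data ... */
    toy_copy(&f, 'B');      /* ... B runs and copies over the same spot */
    toy_advance(&f);        /* B updates the file pointer */
    toy_advance(&f);        /* A resumes and updates it again */

    toy_write(&f, 'A');     /* clean writes from here on */
    toy_write(&f, 'B');

    toy_render(&f, out);
    printf("%s\n", out);    /* prints AAAABBBB....AAAABBBB */
}
```

in this scenario one A record is lost and one record of nuls is left
behind, the same kind of damage as in the od output described above.
holding a lock on the toy_file across both steps, i.e. only ever
calling toy_write, rules that interleaving out.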
fortunately, the people who are doing (symmetric) multiprocessor
unices have to do this anyway.

paul haahr
princeton!haahr or haahr@princeton.edu

# to unbundle, sh this file
# bundled by haahr on dennis at Tue Jul 12 14:04:59 EDT 1988
# contents of bundle:
#	write.c
#	count.c
#	bufs
echo write.c >&2
sed 's/^-//' > write.c <<'end of write.c'
-/* write.c: parent writes runs of A, child runs of B, unsynchronized */
-#include <stdio.h>
-
-#define atoi(s) (strtol((s), (char **) 0, 0))
-#define streq(s, t) (strcmp((s), (t)) == 0)
-
-extern char *malloc();
-extern long strtol();
-extern int strcmp();
-
-int main(argc, argv)
-    int argc;
-    char *argv[];
-{
-    int pid, wpid, i, c, n, bufsize;
-    char *buf;
-
-    if (argc != 3) {
-        fprintf(stderr, "usage: %s bufsize nwrites\n", argv[0]);
-        exit(1);
-    }
-    bufsize = atoi(argv[1]);
-    n = atoi(argv[2]);
-
-    if ((pid = fork()) == -1) {
-        perror("fork");
-        exit(1);
-    }
-
-    if (pid == 0)
-        c = 'B';
-    else
-        c = 'A';
-    if ((buf = malloc(bufsize)) == NULL) {
-        perror("malloc");
-        exit(1);
-    }
-    for (i = 0; i < bufsize; i++)
-        buf[i] = c;
-
-    for (i = 0; i < n; i++)
-        if (write(1, buf, bufsize) == -1) {
-            perror("write");
-            exit(1);
-        }
-
-    /* parent waits for the child before exiting */
-    if (pid != 0)
-        do
-            if ((wpid = wait((int *) 0)) == -1) {
-                perror("wait");
-                exit(1);
-            }
-        while (wpid != pid);
-
-    return 0;
-}
end of write.c
echo count.c >&2
sed 's/^-//' > count.c <<'end of count.c'
-/* count.c: print the length of each run of identical characters */
-#include <stdio.h>
-
-main()
-{
-    int c, lastc = EOF, n = 0;
-    do
-        if ((c = getchar()) == lastc)
-            n++;
-        else {
-            if (n > 0) {
-                printf("%6d %c\n", n, lastc);
-            }
-            n = 1;
-            lastc = c;
-        }
-    while (c != EOF);
-    return 0;
-}
end of count.c
echo bufs >&2
sed 's/^-//' > bufs <<'end of bufs'
-#! /bin/sh
-n=$1
-shift
-awk 'NF > 0 { printf "%8d %-8.3g %s\n", $1, $1/'$n', $2 }' $*
end of bufs
chmod +x bufs