Path: utzoo!utgpu!water!watmath!clyde!rutgers!mcnc!rti!trt
From: trt@rti.UUCP (Thomas Truscott)
Newsgroups: comp.os.misc
Subject: Re: Contiguous files; extent based file systems
Summary: Good opportunity for benchmarking
Message-ID: <1931@rti.UUCP>
Date: 18 Dec 87 06:10:10 GMT
References: <561@amethyst.ma.arizona.edu> <3228@tut.cis.ohio-state.edu> <177@cullsj.UUCP>
Distribution: na
Organization: Research Triangle Institute, RTP, NC
Lines: 24

In article <177@cullsj.UUCP>, gupta@cullsj.UUCP (Yogesh Gupta) writes:
> ... .  However, I find that if I create a 100MB file
> under Unix (BSD 4.2, System V Rel 1), the overhead in
> randomly accessing various parts of it is too high (due to indirect
> inode structures).  Any comments?

Do you have, or would someone be willing to write, a benchmark to
demonstrate this?  For example, a program that does random I/O on
a 100 000 byte file vs. a 100 000 000 byte file.
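
Here is a rough sketch of the sort of program I have in mind
(the file name, read count, and 8k buffer size are placeholders;
create the test files first, e.g. with dd, then run it under time(1)):

    /* randio.c -- time random block reads on an existing file */
    #include <stdio.h>
    #include <stdlib.h>
    #include <fcntl.h>
    #include <unistd.h>
    #include <sys/types.h>
    #include <sys/stat.h>

    #define NREADS  1000
    #define BUFSIZE 8192

    int main(int argc, char **argv)
    {
        char buf[BUFSIZE];
        struct stat st;
        off_t offset;
        int fd, i;

        if (argc != 2 || (fd = open(argv[1], O_RDONLY)) < 0) {
            fprintf(stderr, "usage: randio file\n");
            return 1;
        }
        fstat(fd, &st);
        if (st.st_size < BUFSIZE) {
            fprintf(stderr, "randio: file smaller than one block\n");
            return 1;
        }

        /* seek to a random block-aligned offset, read one block */
        for (i = 0; i < NREADS; i++) {
            offset = (off_t)(random() % (st.st_size / BUFSIZE)) * BUFSIZE;
            lseek(fd, offset, SEEK_SET);
            read(fd, buf, BUFSIZE);
        }
        close(fd);
        return 0;
    }

Something like "time randio smallfile; time randio bigfile" would
show whether the per-read cost really grows with file size.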

Sequential I/O certainly seems linear with file size.
E.g. I made a 20 000 000 byte file and a 2 000 000 byte file
on our Gould, which has a 1k/8k BSD filesystem,
and the larger file took "exactly" 10 times as long to copy
(0.925 real 0.00 user 0.40 sys per 1 000 000 bytes).
I think random access would tell the same story.
The larger file needs only ~ceil(20 000 000 / ((8192/4) * 8192)) == 2
indirect blocks (an 8k indirect block holds 2048 4-byte pointers,
so each one maps about 16 000 000 bytes of data),
and any reasonable I/O system will keep both in the buffer
cache at all times.
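
To make the arithmetic concrete, here is a throwaway calculation.
It assumes the usual BSD inode layout of 12 direct blocks and
4-byte block pointers, and it ignores the double-indirect block
itself, so it is only a rough count:

    /* indirect.c -- rough count of indirect blocks for a file */
    #include <stdio.h>

    #define BSIZE   8192L           /* filesystem block size       */
    #define NDIRECT 12L             /* direct blocks in the inode  */
    #define NPTR    (BSIZE / 4)     /* pointers per indirect block */

    int main(void)
    {
        long size   = 20000000L;
        long blocks = (size + BSIZE - 1) / BSIZE;
        long indir  = blocks > NDIRECT
                    ? (blocks - NDIRECT + NPTR - 1) / NPTR
                    : 0;

        printf("%ld bytes: %ld data blocks, ~%ld indirect blocks\n",
               size, blocks, indir);
        return 0;
    }

For the 20 000 000 byte file it reports ~2 indirect blocks,
i.e. two 8k blocks of pointers for the buffer cache to hold.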

Rather than making files contiguous, I would prefer larger blocksizes.
For example, blocksize == tracksize lets a smart disk controller
eliminate rotational delay.
	Tom Truscott