Path: utzoo!yunexus!geac!john
From: john@geac.UUCP (John Henshaw)
Newsgroups: comp.databases
Subject: Re: Unix machines for large databases
Message-ID: <2728@geac.UUCP>
Date: 10 May 88 19:26:11 GMT
Article-I.D.: geac.2728
Posted: Tue May 10 15:26:11 1988
References: <564@hscfvax.harvard.edu> <3102@edm.UUCP>
Reply-To: john@geac.UUCP (John Henshaw)
Organization: The little blue rock next to that twinkly star.
Lines: 50

In article <3102@edm.UUCP> news@edm.UUCP (news software) writes:
>From article <564@hscfvax.harvard.edu>, pavlov@hscfvax.harvard.edu (G.Pavlov):
># But using raw disk i/o per se doesn't guarantee anything, does it ?
>
>It tends to promise that address locality implies spatial locality. This is
>a nice assumption to be able to make when you want to improve your speed.
> Stephen Samuel

"Tends to promise" and "nice assumption" are a bit vague. They're
implementation details only. Raw partitions offer the host DBMS the
opportunity to control data access in a fashion that guarantees various
criteria (speed, data compaction, data clustering, recovery, forced
writes, etc.).

In the case of UNIX, this is rather important. The UNIX kernel buffers
reads/writes from/to disk. As a result, it cannot offer "guaranteed
delivery" of buffers to disk. Even the "sync" operation merely marks
(all) buffers for writing - again, no guarantee that the data is written.

UNIX does offer "raw data partitions" in which the program has control
over the disk space. However, all I/O to this area is "blocked I/O": the
write completes before control is handed back to the program (i.e. the
host DBMS). This is a "forced write" - considered expensive but certainly
necessary in some situations.

There is no way (that I know of) to tell the O/S to write a buffer and
hand me back control immediately (a non-forced write). (Buffer re-use at
this stage is not possible until I either ask the O/S whether the buffer
has been written, or the O/S tells me somehow.)
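
The difference between a buffered write and a forced write can be
sketched roughly as follows. This is only an illustration under stated
assumptions: it uses Python on a POSIX-style system, and it writes to an
ordinary temporary file rather than a raw partition (which would need a
device path and privileges that are not assumed here). The function names
buffered_write, forced_write and demo are hypothetical, made up for this
sketch. Note that os.fsync only asks the kernel to push the data to the
device; a controller with its own cache may still be holding it.

```python
import os
import tempfile


def _write_all(fd, data):
    """Call os.write until every byte has been accepted by the kernel."""
    total = 0
    while total < len(data):
        total += os.write(fd, data[total:])
    return total


def buffered_write(path, data):
    """Ordinary write: returns once the kernel buffer cache holds the data.

    Nothing here says the data has reached the disk.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        return _write_all(fd, data)
    finally:
        os.close(fd)


def forced_write(path, data):
    """Forced write: does not return until the kernel has flushed the file.

    os.fsync blocks the caller, which is where the cost comes from.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        written = _write_all(fd, data)
        os.fsync(fd)  # wait for the kernel to hand the buffers to the device
        return written
    finally:
        os.close(fd)


def demo():
    """Write one 512-byte block each way and read the final contents back."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "page.dat")  # hypothetical file name
        n_buffered = buffered_write(path, b"a" * 512)
        n_forced = forced_write(path, b"b" * 512)
        with open(path, "rb") as f:
            return n_buffered, n_forced, f.read()
```

Both calls return the same byte count; the only visible difference is
that forced_write holds the caller until the flush completes.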

Raw disk I/O guarantees forced writes at the expense of throughput.
Maybe. (sigh) If you use an HSC50 or HSC70 disk controller, which has its
own local cache, then a "forced write" is a transfer to HSC local cache
only. (Damn! It's *still* not on disk. Will it ever get there?)

It seems to me that access to data and access to object code are quite
different, and that most O/Ss are designed for the latter. A raw
partition allows optimization of the former (as S. Samuel suggests
above).

This also points out that gains are to be made in the intelligent
reconciliation of O/Ss and DBMSs.

-john-
-- 
John Henshaw, (mnetor, yunexus, utgpu !geac!john)
Geac Computers Ltd.            If we don't pay for education now, are we
Markham, Ontario, Canada, eh?  going to be able to pay for ignorance later?