Path: utzoo!utgpu!water!watmath!clyde!att!osu-cis!tut.cis.ohio-state.edu!bloom-beacon!mit-eddie!nessus
From: nessus@athena.mit.edu (Doug Alan)
Newsgroups: comp.unix.wizards
Subject: Autoconfig
Message-ID: <9600@eddie.MIT.EDU>
Date: 29 Jun 88 18:29:51 GMT
Sender: uucp@eddie.MIT.EDU
Organization: Kate Bush and Butthole Surfers Fandom Center
Lines: 68

I'm led to believe from reading "Building Berkeley UNIX* Kernels with Config" that, if your system works, you should be able to power it down, pull a controller out of the bus (replacing it with a grant card), and reboot, and the system will still boot, as long as the controller you removed wasn't critical for booting.

Unfortunately, this usually does not seem to work for us. What happens depends on which controller I pull out. For some boards it works: if I remove a DHV11, for example, the system still boots fine. For some controllers, the system hangs when it gets to the point in the autoconfig sequence where the missing controller would normally be found.

A more peculiar mode of failure happens when I remove a second disk controller. The autoconfig sequence finds the first controller twice! And both times it finds it at the same CSR address. It assigns each disk drive to two different device names. The autoconfig sequence then merrily continues on, and seems to be working fine, until the system finally gets to the point where it tries to give you a /bin/sh. At this point it hangs.

Does someone have any idea what is going on, and how I can get things to work, so that I can remove controllers without building a new kernel? We use VAXstation II's, running 4.3BSD+NFS (from U of Wisc). The disk controllers are Sigma RQD11-EC's (ESDI MSCP Qbus controllers).

I also have another, perhaps related, problem, which maybe someone has an idea about. We have a uVax-II with two of the aforementioned disk controllers and the aforementioned kernel.
It also has a Wespecorp tape controller. I want to put in a DHV11, but whenever I do, it doesn't work right. With the DHV11 in, autoconfig seems to find it fine, but if I try to run 'stty' on one of the DHV11's terminal lines (let's say "stty all > /dev/ttyS0"), it hangs. If I do this from the Bourne Shell, I can ^C out of it, but I get some sort of error (I don't remember the exact message... perhaps something like "no such device"). If I do this from the C Shell, ^C and ^Z don't do anything. Another problem that seems to occur with the DHV11 in is that some C programs, occasionally, when trying to dump core, cause the whole system to become wedged.

I'm pretty sure I have the right device numbers on /dev/ttyS0, because we have other systems with a DHV11 and the same kernel, and the DHV11 works on them. The other systems don't, however, have a tape controller and two disk controllers.

Another piece of the puzzle is that the tape controller seemed to be causing us some problems in the past. Whenever a filesystem on a disk controller that was farther out on the bus than the tape controller was dumped to tape, any process, including the process accessing that disk drive, would hang. The fix for this was to move the tape controller farther out on the bus than all the disk controllers.

I thought for a while that perhaps the problem was that we weren't using the official DEC CSR addresses and interrupt vectors for the disk controllers and DHV11. I didn't think this should make any difference with Unix, as long as everything was spaced out enough. (The official DEC CSR addresses and interrupt vectors are a real pain, because if you add another disk controller, you have to go and perform hairy calculations and then use those to guide yourself in flipping DIP switches on the DHV11.) In any case, I went through all the work of bringing all the CSR addresses and interrupt vectors up to DEC standard, and this changed nothing.

Anyone have any ideas?
|>oug /\lan (or nessus@athena.mit.edu nessus@mit-eddie.uucp)