Path: utzoo!attcan!utgpu!jarvis.csri.toronto.edu!mailrus!cornell!ken
From: ken@gvax.cs.cornell.edu (Ken Birman)
Newsgroups: comp.sys.isis
Subject: Re: ISIS suggestions
Message-ID: <32799@cornell.UUCP>
Date: 3 Oct 89 16:10:01 GMT
References:
Sender: nobody@cornell.UUCP
Reply-To: ken@gvax.cs.cornell.edu (Ken Birman)
Organization: Cornell Univ. CS Dept, Ithaca NY
Lines: 254

This will be a rather long reply, since the original message was pretty long and consisted of lots of questions. Plus, I tend to be long winded.

First, an overall comment. ISIS is a research project with limited financial resources, currently coming from DARPA and NASA. We are paid to do research, and although we are also encouraged to make the system available -- which we are trying our best to do -- unless commercial interest in ISIS creates a self-sustaining market, it will always be hard for us to respond to commercial pressures.

For my group, the essential point is that we have built a powerful tool and want people to be able to use it. We didn't build this one to throw it away... At the same time, we aren't the Free Software Foundation.

Will people pay for ISIS? If so, we can pay people to address some of the needs identified in your message. As you know, our company has been offering commercial support and consulting services connected with ISIS for a year now. We are certainly making money, but mostly for consulting -- not for support. You, and others, have pointed to things you would like to see in ISIS. Does this mean you would pay to see them in ISIS? If so, I think we should sit down and talk about it seriously... If not -- well, we do our best to respond to the needs that we perceive to be most urgent. All things considered, I think we do pretty well.

> [from a prior message] ... lots of people seem to be looking at ISIS,
> but it is unclear to me how many actually use it.

My guess is that this is correct.
We know of about 300 sites that are "looking" at ISIS, since we have names of about this many people who requested the system. But, in actual practice, I can only name about 20 or 30 active projects that seem to be doing something with the system.

For example, here at Cornell the computer graphics people use it in image rendering programs; at Siemens there is a factory automation group using ISIS; at SAIC a group is using it to write a distributed application management system. We know of one French group using ISIS for experiments on telephone switching systems, a group at NASA that was planning to use ISIS in a payload management system for the spacelab, someone at XEROX doing distributed AI problems on ISIS, and quite a few people playing with ISIS at HPLABS, HP's research laboratory.

The list of users of which I know is much longer than this -- and not everyone tells me what they do. But, the main observation to make is that almost all of these are advanced prototyping efforts or experimental research groups. As for any technology, I think we have a learning curve to overcome before we see widespread day to day use of a system like ISIS.

Obviously, people want something -- otherwise they wouldn't look hard at ISIS. My guess, however, is that they want something even larger and more ambitious -- perhaps a whole new kind of distributed operating system, with distributed tools for development, runtime application management, monitoring, debugging, network administration.

Another problem is that ISIS is "different". This means that one runs some risk in using ISIS for a project (and depending on our style of computing). I am not surprised that many groups are being slow and cautious about this, especially commercial ones. It probably hasn't helped that we have been enhancing the system -- this gives an aura of unreliability to our stuff, although in practice (as those of you who really use it know), ISIS is extremely robust.
This has been a dilemma for us: do we tell you how to fix bugs that you probably don't run into (thus admitting to having a bug) or do we pretend the system has none (and risk having you notice one)? For what it's worth, nobody has reported any bugs of any kind at all in ISIS V1.3, so far.

My guess is that over the next year we will begin to see modest "commercial" use of ISIS, and that several of the research efforts that have been toying with the use of the system will make firm decisions one way or the other.

>... rpc schema parser,faster isis overall, protos at the kernel level, etc.

Quick answers:

o We played with the idea of an RPC schema parser but had a lot of trouble coming up with a clean way to embed broadcast semantics into a language-level RPC facility. Some interesting research topics that this seems to raise are: how one represents a process group at the language level (or a capability), how you express the idea of wanting zero, one or multiple replies, how to deal with type checking issues. I think that Robert Cooper (who runs the project with me and Keith Marzullo) would be happy to run a discussion of this topic. Perhaps someone could post a proposal for an RPC syntax covering this.

A really crummy RPC stub generator was included in the demos area for a while. We recently got rid of it because it wasn't very well done. Lacking answers to the questions above, I think it is premature to talk about multicast stub generation. Of course, it would be easy to generate stubs for programs that have pure RPC semantics -- one caller, one reply. I guess we haven't seen this as a topic with any big payoff for ISIS, though, since ISIS is really not an RPC system...

o Faster ISIS overall. This is an ongoing focus of our current work. With the BYPASS mode in place, we should have a version of cbcast running as fast as any other multicast protocol from any other group.
This is because BYPASS allows us to call a multicast transport protocol more or less directly from the sender process. Thus, if you have some magic, super-fast transport protocol that gives nothing more than reliable delivery, we can run a cbcast on top of it with very little overhead (and not going through protos). In ISIS V2.0, BYPASS mode will be the default (I hope).

As for the rest of ISIS, we are working on a hierarchical group mechanism to let us scale groups up, a hierarchical site-views scheme to let us scale ISIS itself up, and tuning overall performance. However, overall performance of the system is actually pretty good. For example, an ISIS RPC (in BYPASS mode) outperforms a SUN RPC, on the same equipment. This is true for almost any size of message. So, our overhead is obviously not all that high.

o Protos in the kernel. Well, for this we are waiting to see what MACH or Chorus provides in terms of kernel support for doing this. We are not interested in doing an OS that won't be "compatible" with what seems to be an emerging standard. I would certainly like to make protos into a kernel service once the option presents itself.

We have also thought about trying to take advantage of shared memory, say messages. This could be a big win -- but raises hard protection issues. However, based on the recent posting to comp.os.mach, it isn't clear to me that we could do this easily under MACH just yet.

Anyway, with the BYPASS change in place, protos won't be the bottleneck anymore. The bottleneck will probably turn out to be the cost of UDP datagrams (which cost quite a lot under UNIX). What I would love to see is a UDP level multicast protocol. VMTP?

>* you might consider making the ISIS manual texinfo. I'm not *sure*
> it's a good idea but you might think about it.

We currently have no plans to do this. Note, however, that you can run Xdvi or dvitool to read the dvi version of the manual from Xwindows or Suntools. Were you aware of the online manual pages?
With the right environment parameter setup (MANPATH, I think), you should be able to say "man 3 spool" or whatever...

>* We need some kind of performance discussion. I'd be happy to see a
> "what is" column as well as a "soon" column, but I'd really like to
> have some idea. I realize it will be highly application dependant,
> (ie, traffic dependant), but even point to point, with and without
> nullreplies, how many udp packets in a cbcast? etc.

I agree! However, I've been waiting for the BYPASS stuff to be finished for this, simply because I want to report figures that are as good as or better than anything you might read about for, say, the V kernel. I have no plans to write a paper on performance of ISIS without BYPASS in place, as this would be slow. We will also post some performance figures to this group as we obtain them.

To answer your questions about packet counts: With BYPASS, and without ethernet multicast, 1 UDP packet per cbcast destination and 1 UDP packet for each reply or nullreply. When ISIS runs without BYPASS, the cost is higher: an RPC from the client into protos to invoke the protocol, and an RPC from protos to the destination to deliver each copy (plus another for each reply). Plus, BYPASS has no piggybacking, while protos has a complex piggybacking scheme with a garbage collection algorithm that runs periodically. We have measured piggybacking costs and on the average found that at most 30% of all bytes transmitted are due to piggybacking and that garbage collection never amounts to more than 5% of CPU cycles consumed by ISIS (this is because the protocol is cheap, runs rarely, and runs on behalf of LOTS of cbcasts).

However, there are some ways to make ISIS run poorly. For example, when building a "service", using cbcast to reply to queries seems to be a bad idea, and I am installing a new option to reply_l ("f") to force the system to reply with fbcast for just this reason.
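
As a rough illustration of the BYPASS packet counts quoted above (1 UDP packet per cbcast destination, plus 1 per reply or nullreply, with no ethernet multicast), here is a minimal sketch. The function name and its parameters are made up for this example and are not part of ISIS; it only does the arithmetic and says nothing about the non-BYPASS path.

```python
def bypass_udp_packets(destinations: int, responses: int) -> int:
    """Estimate UDP packets for one cbcast in BYPASS mode.

    Assumes no ethernet multicast. `responses` counts replies and
    nullreplies together. Hypothetical helper, not an ISIS routine.
    """
    if destinations < 0 or responses < 0:
        raise ValueError("counts must be non-negative")
    # One UDP packet carries the message to each destination,
    # and one more carries each reply or nullreply back.
    return destinations + responses


if __name__ == "__main__":
    # Hypothetical case: 4 destinations, each answering with a
    # reply or a nullreply.
    print(bypass_udp_packets(4, 4))
```

So a cbcast to n destinations that all answer costs about 2n UDP packets under these assumptions, and one that asks for no replies costs n.
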
The manual section, when we write it, will certainly cover these things. Sorry this seems complicated. There never will be any single, simple answer, because ISIS does things in different ways under different conditions, to optimize where we can. Just to repeat the bottom line, with BYPASS in place, cbcast in ISIS will have a realistic chance of being "the world's fastest multicast" (under UNIX).

>* Makefiles need better parameterization. Specifically, I want to be
> able to change CC, CFLAGS, LD, LOADLIBES, BINDIR (default
> /usr/spool/isis/bin is a poor choice), ISISHOME (your default is
> /usr/spool/isis), LIBDIR (eg, /usr/local/lib with better names for
> lib[12].a mlib.a, HEADERDIR (eg, /usr/include/isis for all headers),
> MANDIR (eg, /usr/local/man), etc. with precisely one line. I don't
> really care whether I have to change it in the makefile, a header
> file, or command line but changing it in several makefiles seems
> redundant.

We'll have a hard look at this. These seem like simple and reasonable requests.

>* lib[12].a and mlib.a are poor choices. Makes them difficult to
> install in a global place like /usr/local/lib.

Ah. Now, this is a tricky one. mlib is separate only because some of our people use it standalone. In particular, our own protos code uses mlib, but nothing from clib.

Lib1 and lib2 are separate because the UNIX ``ranlib'' utility is totally braindamaged. Specifically, we wanted to arrange that if you don't use a given tool, say the "guard" facility (which we may eliminate, since nobody seems to use it), it wouldn't get linked in. We used to do this by putting the facility in question first, then anything that calls its init routine, and then a dummy version of the init routine. If you use the facility the linker finds it first, and you get it. If not, you get the dummy init routine and a smaller object file. ranlib(1) screws this up completely.
You can't have duplicate defs in the same library with ranlib, so we needed to use two libraries. And, ranlib is mandatory these days. So, we put the optional stuff in lib1, and all else into lib2. So, unless you can see a trick, I have no way to build a single library that would also be "portable". Believe me, I would really prefer to do so. Suggestions?

We are looking at using shared libraries, by the way. The problem is that we need to stay portable...

>* would prefer that all headers were in and were
> "install"'d there by the install makefile target. (also, would need
> to change the references in the manual.)

I guess that "make install" would be a reasonable thing to add. How about if we provide this for ISIS V2.0?

>* Makefiles force recompilation too frequently. That is, "touch
> header" sort of defeats the purpose make in the first place dunnit?

There are two issues here. First, the comment about the header is only relevant to one makefile -- the one for mlib. If you look closely, you will see that this is actually a no-op and that the header can be deleted with no effect. This seems to be something Tommy Joseph put in a long time ago. It certainly isn't mine.

Now, we do rebuild libraries more than we need to, but this is just because some UNIX systems have an old version of make that doesn't support the notation of "a file in a library". Being portable is a pain. As for recompiling more than we need to -- I dispute this. It seems to me that the makefiles recompile just what they should.

>* would like to see "make clean".

Yes, I suppose we can do this for V2.0 too.

Whew.... I hope that some of the readers of this group will respond to the points raised by Rich and/or to some of the responses outlined above. I know that there are readers out there somewhere! Let's hear from you!