Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!utgpu!water!watmath!uunet!ig!daemon
From: daemon@ig.UUCP
Newsgroups: bionet.molbio.news
Subject: CSLG|COMMENTARY: From Andrew Coulson
Message-ID: <4309@ig.ig.com>
Date: Fri, 4-Dec-87 22:34:35 EST
Article-I.D.: ig.4309
Posted: Fri Dec  4 22:34:35 1987
Date-Received: Thu, 10-Dec-87 01:49:43 EST
Sender: daemon@presto.ig.com
Lines: 46

From: Sunil Maulik 

 4-Dec-87 12:33:43-PST,9076;000000000001
Return-Path: <@WISCVM.WISC.EDU:A.F.W.Coulson@EDINBURGH.AC.UK>
Received: from WISCVM.WISC.EDU by BIONET-20.ARPA with TCP; Fri 4 Dec 87 12:33:28-PST
Received: from UKACRL.BITNET by WISCVM.WISC.EDU ; Fri, 04 Dec 87 14:34:41 CDT
Received: from RL.IB by UKACRL.BITNET (Mailer X1.25) with BSMTP id 1126; Fri,
 04 Dec 87 20:27:53 GMT
Via:        UK.AC.RL.EARN; Fri, 04 Dec 87 20:27:52 GMT
Received:
Via:        000015001006.FTP.MAIL;  4 DEC 87 20:27:44 GMT
Date:       04 Dec 87  20:28:06 gmt
From:       A.F.W.Coulson@EDINBURGH.AC.UK
Subject:    CSLG Discussion or Conference
To:         MAULIK%arpa.bionet-20%RL.earn
Message-ID: <04 Dec 87  20:28:06 gmt  100798@EMAS-A>


       Searching large databases for sequence similarities.

        Since Alex Reisner has mentioned transputers, I would extend what he says
somewhat:-
        Viewed as a general purpose computing machine, a single T800 has
comparable power to a VAX (?750 ?780); the NWS algorithms for a single pairwise
comparison will run in a practicable time on a VAX.
So the database search problem can be farmed out to a heap of transputers, using
one for each pairwise comparison. If you have 1000 transputers (the planned
size of the Edinburgh Concurrent Supercomputer; the first 200 are being installed
at present), and 6000 sequences in the database, each machine only has to do
six jobs.  So the problem can be mapped in a simple way onto ECS, and no
doubt we shall do some searches this way, because why not?  But this application
(as I've said already) doesn't use the general purpose computing facilities of
the transputer processor,  so though it may be an effective solution, it will
not be a cost-effective one compared to the DAP.  Of course, this may not
matter to individuals who do not themselves have to find the true economic
cost of the computing they do......
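
The farming-out described above can be sketched roughly as follows. This is
only an illustration, not the ECS software: the function names (nws_score,
farm_out, search), the round-robin assignment, and the match/mismatch/gap
scores are all assumptions made for the example, and a plain
Needleman-Wunsch style global alignment score stands in for whichever NWS
variant would actually be run on each processor.

```python
def nws_score(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment score of a and b by dynamic programming.

    The scoring values are illustrative assumptions, not taken from the post.
    Only one previous row is kept, so memory is O(len(b)).
    """
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [i * gap]
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            cur.append(max(prev[j - 1] + sub,   # align a[i-1] with b[j-1]
                           prev[j] + gap,       # gap in b
                           cur[j - 1] + gap))   # gap in a
        prev = cur
    return prev[-1]


def farm_out(n_jobs, n_workers):
    """Round-robin assignment: job k goes to worker k % n_workers."""
    assignment = [[] for _ in range(n_workers)]
    for k in range(n_jobs):
        assignment[k % n_workers].append(k)
    return assignment


def search(query, database, n_workers):
    """Score the query against every database entry.

    Each pass of the outer loop stands in for one independent processor;
    the workers never need to talk to one another.
    """
    scores = [None] * len(database)
    for jobs in farm_out(len(database), n_workers):
        for k in jobs:
            scores[k] = nws_score(query, database[k])
    return scores


# The arithmetic from the text: 6000 sequences over 1000 processors.
jobs_per_worker = {len(jobs) for jobs in farm_out(6000, 1000)}
print(jobs_per_worker)
```

With 6000 jobs and 1000 workers every worker receives exactly six jobs. Note
that nothing in the sketch uses communication between workers, which is the
point being made about cost-effectiveness.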
         I don't know much about the Connection Machine apart from what I read in
Scientific American, but I think a similar point may apply.  As I understand
it, it will be possible to use a similar mapping to that used in the DAP search
program on the Connection Machine, but the only communication this needs is
between each node and two neighbours, so that the expensively-provided richness
and flexibility of interconnection may not be used very effectively.  As I say,
I don't know much about it, and if anybody has an idea for mapping the problem
in a way which makes better use of the particular strengths of this machine
(or of a transputer array), I shall be interested to hear it.
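
To make the point about neighbour-only communication concrete, here is a
small sequential simulation of one possible arrangement. It is an assumption
for illustration, not a description of the DAP search program or of any
Connection Machine code: a linear chain of cells, one per query character,
in which each cell reads values only from the cell on its left and its
results are read only by the cell on its right. The function name and
scoring values are hypothetical.

```python
def systolic_score(query, target, match=1, mismatch=-1, gap=-1):
    """Global alignment score computed as a wavefront along a chain of cells.

    Cell i (1..m) owns query[i-1] and row i of the dynamic programming table.
    Cell 0 is a boundary feeder supplying row 0. At step t, cell i works on
    column j = t - i, so all active cells lie on one anti-diagonal and could
    be updated simultaneously on parallel hardware.
    """
    m, n = len(query), len(target)
    prev1 = [None] * (m + 1)  # value each cell produced at step t-1
    prev2 = [None] * (m + 1)  # value each cell produced at step t-2
    for t in range(m + n + 1):
        cur = [None] * (m + 1)
        for i in range(m + 1):
            j = t - i
            if j < 0 or j > n:
                continue              # cell is idle at this step
            if i == 0:
                cur[i] = j * gap      # boundary row
            elif j == 0:
                cur[i] = i * gap      # boundary column
            else:
                # In a real array target[j-1] would be streamed from cell to
                # cell; here it is simply indexed directly.
                sub = match if query[i - 1] == target[j - 1] else mismatch
                cur[i] = max(prev2[i - 1] + sub,  # left neighbour, two steps ago
                             prev1[i - 1] + gap,  # left neighbour, last step
                             prev1[i] + gap)      # this cell's own last value
        prev2, prev1 = prev1, cur
    return prev1[m]  # cell m finishes column n on the final step


print(systolic_score("AC", "AGC"))
```

Every value a cell needs comes either from itself or from its immediate left
neighbour, so a richer interconnection network would go unused by this
particular mapping.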
-------