Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!utgpu!water!watmath!uunet!ig!daemon
From: daemon@ig.UUCP
Newsgroups: bionet.molbio.news
Subject: CSLG|COMMENTARY: From Andrew Coulson
Message-ID: <4305@ig.ig.com>
Date: Fri, 4-Dec-87 22:16:40 EST
Article-I.D.: ig.4305
Posted: Fri Dec 4 22:16:40 1987
Date-Received: Thu, 10-Dec-87 01:43:30 EST
Sender: daemon@presto.ig.com
Lines: 40

From: Sunil Maulik
4-Dec-87 12:33:43-PST,9076;000000000001
Return-Path: <@WISCVM.WISC.EDU:A.F.W.Coulson@EDINBURGH.AC.UK>
Received: from WISCVM.WISC.EDU by BIONET-20.ARPA with TCP; Fri 4 Dec 87 12:33:28-PST
Received: from UKACRL.BITNET by WISCVM.WISC.EDU ; Fri, 04 Dec 87 14:34:41 CDT
Received: from RL.IB by UKACRL.BITNET (Mailer X1.25) with BSMTP id 1126; Fri, 04 Dec 87 20:27:53 GMT
Via: UK.AC.RL.EARN; Fri, 04 Dec 87 20:27:52 GMT
Received: Via: 000015001006.FTP.MAIL; 4 DEC 87 20:27:44 GMT
Date: 04 Dec 87 20:28:06 gmt
From: A.F.W.Coulson@EDINBURGH.AC.UK
Subject: CSLG Discussion or Conference
To: MAULIK%arpa.bionet-20%RL.earn
Message-ID: <04 Dec 87 20:28:06 gmt 100798@EMAS-A>

Searching large databases for sequence similarities.

Ellis Golub asks whether good searching methods find anything of
biological interest. You bet they do! Our best case so far is still in
press with PNAS (Bownes, M., Shirras, A., Blair, M., Collins, J. and
Coulson, A. "Yolk protein degradation times release of ecdysteroid in
insect embryogenesis."), but two that have already appeared are in
Nature 1987, 326, 614-617 and 328, 766.

We have not applied this method very much so far to DNA databases; the
main reason for this is lack of resources (= personnel), but there are
also scientific reasons. For any cistron, it is always better to perform
searches and comparisons at the protein level (see Collins, J.F. and
Coulson, A.F.W. "Molecular sequence comparison and alignment" in
Bishop, M.J. and Rawlings, C.J. "Nucleic Acid and Protein Sequence
Analysis: A Practical Approach", IRL, Oxford, 1987, pp 323-358, if you
need convincing); and in any case there is really no scope for
sophisticated matching scores in nucleic acid comparisons. Nor do we
have much general information about the acceptability of gaps in
non-coding nucleic acid sequences. My feeling is that pattern detection
methods are really what is appropriate for nucleic acid database
searching, rather than the NWS algorithms.

-------
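
The remark about matching scores can be made concrete with a small
sketch in Python. This is not the search code referred to in the post,
and it assumes that "NWS" means Needleman-Wunsch-Sellers style dynamic
programming. The function names (nw_score, identity, graded), the gap
penalty of -2, and the residue groupings in GROUPS are all illustrative
assumptions, not a published scoring matrix. The idea is only that a
nucleotide comparison has little to offer beyond match or mismatch,
while a protein comparison can still give credit to a conservative
substitution.

```python
def nw_score(a, b, score, gap=-2):
    """Global alignment score by dynamic programming, linear gap penalty.

    Only the best score is returned; no traceback is kept.
    """
    # prev[j] is the best score for aligning the processed prefix of a
    # against the first j symbols of b.
    prev = [j * gap for j in range(len(b) + 1)]
    for i, x in enumerate(a, start=1):
        curr = [i * gap]
        for j, y in enumerate(b, start=1):
            curr.append(max(
                prev[j - 1] + score(x, y),  # align x with y
                prev[j] + gap,              # x against a gap
                curr[j - 1] + gap,          # y against a gap
            ))
        prev = curr
    return prev[-1]


def identity(x, y):
    """Match/mismatch score, about all a nucleotide alphabet supports."""
    return 1 if x == y else -1


# Hypothetical groups of chemically similar amino acids (assumption,
# chosen for illustration only).
GROUPS = ["ILVM", "FWY", "KRH", "DE", "ST", "NQ"]


def graded(x, y):
    """Graded score: identical > same group > unrelated."""
    if x == y:
        return 2
    if any(x in g and y in g for g in GROUPS):
        return 1
    return -1


# Two short peptides with no identical residue but only conservative
# substitutions (L/I, K/R, D/E).
print(nw_score("LKD", "IRE", identity))  # -3: looks unrelated
print(nw_score("LKD", "IRE", graded))    # 3: similarity is detected
```

With the identity score the two peptides look unrelated; with the
graded score the same alignment comes out positive. A search at the
protein level can exploit that extra information, whereas a nucleotide
search run with this kind of algorithm cannot.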