Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!mnetor!uunet!husc6!rutgers!ukma!psuvm.bitnet!uh2
From: UH2@PSUVM.BITNET (Lee Sailer)
Newsgroups: misc.wanted,comp.databases,sci.research,comp.ai,comp.misc
Subject: Re: pattern recognition software (recognizing humpback fins!) wanted
Message-ID: <26070UH2@PSUVM>
Date: Wed, 25-Nov-87 11:19:38 EST
Article-I.D.: PSUVM.26070UH2
Posted: Wed Nov 25 11:19:38 1987
Date-Received: Sun, 29-Nov-87 13:37:06 EST
References: <1163@uhccux.UUCP>
Distribution: na
Organization: The Pennsylvania State University - Computation Center
Lines: 33
Xref: mnetor misc.wanted:1703 comp.databases:606 sci.research:295 comp.ai:1165 comp.misc:1699

I can think of some pretty good ways to do this, but not with
database software, unless the matching problem is really simple.
     
The current masters of *sequence matching* are the molecular biologists,
who spend a lot of time matching LONG sequences of RNA, DNA, etc.
     
One approach
     
Can the fins be described with a simple sequence of tokens or symbols?
If so, then you've
got the DWIM (do what I mean) or spelling correction problem.  Given a
sequence of symbols, find the set of legal sequences that are close.
This turns out to be a graph search.
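
To make the "find the legal sequences that are close" step concrete,
here is a minimal Python sketch.  It uses plain edit distance, which
can be viewed as a shortest path through a grid of insert, delete and
substitute steps; it is a stand-in for the graph search, not a claim
about how any particular DWIM system does it.  The token alphabet
(G = gap, N = notch, T = tip) and the fin names are made up for
illustration.

```python
def edit_distance(a, b):
    """Levenshtein distance between two token sequences."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into an empty sequence takes i deletions
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def close_matches(query, known, max_distance=1):
    """Return sorted (distance, name) pairs for known sequences near query."""
    scored = [(edit_distance(query, seq), name) for name, seq in known.items()]
    return sorted(pair for pair in scored if pair[0] <= max_distance)


# Hypothetical catalogue: G = gap, N = notch, T = tip.
known_fins = {"fin_a": "GNGNT", "fin_b": "GNNGT", "fin_c": "GGNT"}
matches = close_matches("GNGNNT", known_fins, max_distance=1)
```

With only 300-400 fins, comparing the new sequence against every
stored one like this should be cheap enough; a cleverer search only
matters for a much larger catalogue.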
     
Another approach
     
Are accurate measurements needed to distinguish nearly identical fins?
If so, then a fin must be described something like this:
     
  gap of 15.2mm
  notch width 5mm depth 3mm
  gap of 45 mm
  notch width 3mm depth 5mm
  tip
  etc etc etc
     
If you think of a 'gap' as a notch with width 0, and the tip as a notch
of width and depth 0, then each feature is characterized by a triple
of real numbers.  Using the   and  as
landmarks, it ought to be possible to think up some way to convert
each fin to a point in N-space, and then to compute the distance
between a new fin and the 300-400 fins already in the database.
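
A rough Python sketch of the N-space idea follows.  Several things in
it are my assumptions, not settled facts: the triple is taken to be
(gap before the feature, notch width, notch depth) in mm, fins are
lined up feature by feature from the first one, short fins are padded
with zero triples so every fin has the same N, and plain Euclidean
distance is used.  A real scheme would need a better alignment on the
landmarks and probably some weighting of the coordinates.

```python
import math


def fin_to_point(features, max_features=4):
    """Flatten a list of (gap, width, depth) triples into one point."""
    if len(features) > max_features:
        raise ValueError("fin has more features than max_features")
    # Pad with zero triples so every fin maps to 3 * max_features numbers.
    padded = list(features) + [(0.0, 0.0, 0.0)] * (max_features - len(features))
    return [float(value) for triple in padded for value in triple]


def distance(p, q):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def nearest_fins(new_fin, database, k=3):
    """Return the k closest stored fins as sorted (distance, name) pairs."""
    target = fin_to_point(new_fin)
    scored = sorted((distance(target, fin_to_point(fin)), name)
                    for name, fin in database.items())
    return scored[:k]


# The fin described above: two notches, then the tip as an all-zero triple.
example_fin = [(15.2, 5.0, 3.0), (45.0, 3.0, 5.0), (0.0, 0.0, 0.0)]

# Hypothetical stored fins, for illustration only.
database = {
    "fin_a": [(15.0, 5.0, 3.0), (45.5, 3.0, 5.0), (0.0, 0.0, 0.0)],
    "fin_b": [(30.0, 2.0, 1.0), (10.0, 6.0, 4.0), (0.0, 0.0, 0.0)],
}
ranked = nearest_fins(example_fin, database, k=2)
```

Here ranked lists the stored fins from nearest to farthest.  A cutoff
on the distance would still be needed to decide when the nearest fin
is "the same whale" rather than just the least different one.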