Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!mnetor!seismo!nbires!hao!gaia!zhahai
From: zhahai@gaia.UUCP (Zhahai Stewart)
Newsgroups: comp.databases
Subject: Indexed Text Database Query
Message-ID: <217@gaia.UUCP>
Date: Wed, 17-Dec-86 15:49:03 EST
Article-I.D.: gaia.217
Posted: Wed Dec 17 15:49:03 1986
Date-Received: Fri, 19-Dec-86 02:05:45 EST
Reply-To: zhahai@gaia.UUCP (Zhahai Stewart)
Organization: Gaia Corp, Boulder, CO
Lines: 20
Keywords: text index invert

There are several commercial text indexing products on the market that
keep a master index for a set of files (or articles). For any keyword,
the index can quickly tell which files contain that keyword. Further,
one can query for a boolean combination of keywords (OR, AND, maybe NOT)
and (best trick yet) ask for two keywords to be found within N words of
each other. One must, of course, let the text indexer know when a file
is deleted, added, or revised.

My question is how this is done, algorithmically. The most obvious
approaches are slow and would build rather large indices. I am looking
for either a description or a reference to some source that treats this
subject in enough detail to support an implementation (assume a decent
foundation in data structures and basic algorithms).

Thanks for any help.
~z~
--
Zhahai Stewart
{hao | nbires}!gaia!zhahai
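
For concreteness, the three capabilities described above (keyword lookup, boolean combinations, and "within N words" matching) can all be served by one structure, usually called a positional inverted index: for each word, record the files it occurs in and the word offsets within each file. The Python below is only a minimal in-memory sketch of that idea, not a description of how any particular commercial product works. The class name PositionalIndex, its method names, the tokenizing rule, and the reading of "within N words" as "word offsets differ by at most N" are all assumptions made for illustration.

```python
import re
from collections import defaultdict


class PositionalIndex:
    """Toy positional inverted index: word -> {doc_id: sorted word offsets}."""

    def __init__(self):
        self.postings = defaultdict(dict)  # word -> {doc_id: [offsets]}
        self.doc_words = {}                # doc_id -> words it contains

    @staticmethod
    def tokenize(text):
        # Assumed rule: lowercase runs of letters and digits are words.
        return re.findall(r"[a-z0-9]+", text.lower())

    def add(self, doc_id, text):
        """Index a new file, or re-index a revised one."""
        self.remove(doc_id)  # drop stale postings if the file was known
        offsets = defaultdict(list)
        for pos, word in enumerate(self.tokenize(text)):
            offsets[word].append(pos)  # appended in increasing order
        for word, plist in offsets.items():
            self.postings[word][doc_id] = plist
        self.doc_words[doc_id] = set(offsets)

    def remove(self, doc_id):
        """Forget a deleted file; a no-op if it was never indexed."""
        for word in self.doc_words.pop(doc_id, ()):
            del self.postings[word][doc_id]
            if not self.postings[word]:
                del self.postings[word]

    def docs(self, word):
        """Set of files containing the word."""
        return set(self.postings.get(word.lower(), {}))

    def all_docs(self):
        """Set of every indexed file (needed to evaluate NOT)."""
        return set(self.doc_words)

    def near(self, a, b, n):
        """Files where a and b occur at offsets differing by at most n."""
        pa = self.postings.get(a.lower(), {})
        pb = self.postings.get(b.lower(), {})
        return {d for d in pa.keys() & pb.keys()
                if self._within(pa[d], pb[d], n)}

    @staticmethod
    def _within(xs, ys, n):
        # Two-pointer walk over two sorted offset lists.
        i = j = 0
        while i < len(xs) and j < len(ys):
            if abs(xs[i] - ys[j]) <= n:
                return True
            if xs[i] < ys[j]:
                i += 1
            else:
                j += 1
        return False


idx = PositionalIndex()
idx.add("a.txt", "the quick brown fox jumps over the lazy dog")
idx.add("b.txt", "a quick look at the index; the fox is elsewhere")
idx.add("c.txt", "nothing relevant here")

both = idx.docs("quick") & idx.docs("fox")     # quick AND fox
either = idx.docs("quick") | idx.docs("here")  # quick OR here
no_fox = idx.all_docs() - idx.docs("fox")      # NOT fox
close = idx.near("quick", "fox", 2)            # quick within 2 words of fox
```

Boolean queries reduce to set operations on the per-word file lists, and the proximity test is a single merge-style pass over two sorted offset lists, so neither requires rescanning the text. This sketch keeps everything in memory and uses hash-based sets for brevity; an index meant for large collections would more plausibly keep sorted posting lists on disk and merge them, which is also where most of the index-size savings would have to come from.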