Xref: utzoo uw.talks:28 ont.events:1308
Path: utzoo!utgpu!watmath!watcgl!ksbooth
From: ksbooth@watcgl.waterloo.edu
Newsgroups: uw.talks,uw.icr,ont.events
Subject: ICR Evening Lecture Series
Message-ID: <11594@watcgl.waterloo.edu>
Date: 24 Sep 89 03:47:20 GMT
Sender: ksbooth@watcgl.waterloo.edu
Distribution: uw
Lines: 39

The ICR Evening Lecture Series

Monday, September 25, 1989
8:00 p.m.
DC 1302

Towards an Electronic Oxford English Dictionary

Frank Wm. Tompa
Professor of Computer Science

Since 1984, the Oxford University Press has been working towards
computerization of the 20-volume Oxford English Dictionary (OED).
As a joint venture with the Press, the University of Waterloo has
been designing an on-line dictionary database suitable for editors
charged with maintaining the OED, lexicographers working on other
dictionaries, and researchers who wish to consult the OED
interactively.

Several innovative components have been developed as part of the
project. For example, the PAT text searching engine retrieves all
occurrences of words or phrases appearing in the 540 Mbyte OED in
less than 1 second. The approaches used to computerize the OED are
equally applicable to managing the text inventory of many other
organizations, whether or not the enterprise has formal publication
as a goal. This lecture will recount the experience at Waterloo
with the OED to illustrate the major ideas that have emerged.

Prior to his appointment to the Department of Computer Science in
1974, Frank Tompa attended Brown University, from which he received
BSc and MSc degrees in Applied Math, and the University of Toronto,
from which he received a PhD in Computer Science. He is currently a
Professor in the Department, a member of the Data Structuring Group,
and a co-Director of the UW Centre for the New OED. Professor
Tompa's interests span the fields of data structures and databases.
In recent years, he has been particularly interested in database
design for videotex systems and in the design of text management
systems suitable for maintaining large bodies of text.