From: utzoo!decvax!ucbvax!C70:editor-people
Newsgroups: fa.editor-p
Title: Re: Voice driven editing.
Article-I.D.: ucb.1267
Posted: Fri Jun  4 00:33:36 1982
Received: Sat Jun  5 01:11:36 1982

>From gaines@RAND-UNIX Fri Jun  4 00:30:54 1982
Henry,
  I've been waiting to see if you would get any response to your request
for information about voice-driven editing.  So far, I've seen no replies,
but if you have any that weren't circulated to the list, please forward
them to me.  I have been interested in the subject for some time, but have
done nothing beyond thinking about the problems.  I have not heard of much
work in progress, either.  My impression is that the speech
recognition people have not yet realized that this is a prime application
area for them, and are not sensitive to its advantages.

There are important elements present in voice-driven editing that are not
present in other speech recognition situations.  The user has a second
input device (the keyboard) available, and it is a feedback situation.  The
user can correct the errors of the speech recognizer.  There are two
important consequences which make this a good task for the study of speech
recognition.  One is that a continuous learning approach can be taken to
speech recognition, since feedback on errors will always be available.
Most speech recognition situations provide only an initial learning period.
The second is that the task can be divided between the keyboard and the
speech recognizer, so that speech input need not be used for everything
until it has advanced far enough.  We might, for example, use voice to
control the cursor and for commands, while continuing to type most words,
at least until word recognition gets much better than it is now.
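
The continuous-learning point can be made concrete with a toy sketch in
Python.  Everything here is an assumption for illustration: the class name,
the string "sound labels" standing in for real acoustic input, and the
command names are hypothetical and do not describe any actual recognizer.
The idea is only that every keyboard correction becomes a training example.

```python
from collections import defaultdict


class CorrectableRecognizer:
    """Toy stand-in for a recognizer that keeps learning from corrections."""

    def __init__(self):
        # counts[sound][command] = times the user confirmed that pairing
        self.counts = defaultdict(lambda: defaultdict(int))

    def recognize(self, sound):
        """Return the most often confirmed command for a sound, or None."""
        candidates = self.counts.get(sound)
        if not candidates:
            return None  # nothing learned yet: fall back to the keyboard
        return max(candidates, key=candidates.get)

    def feedback(self, sound, command):
        """Record what the user meant (a confirmation or a typed correction)."""
        self.counts[sound][command] += 1


rec = CorrectableRecognizer()
first_guess = rec.recognize("dn")      # no training yet
rec.feedback("dn", "cursor-down")      # user types the intended command
rec.feedback("dn", "cursor-down")
rec.feedback("dn", "delete-line")      # one conflicting correction
second_guess = rec.recognize("dn")     # majority of corrections wins
```

When recognize() returns None, the editor would simply wait for keyboard
input, which is the division of labor between keyboard and voice suggested
above.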

Another avenue to be explored is stylized speech.  The hardest problem area
in speech recognition, as I understand it, is to recognize continuous
speech.  If there is even the slightest pause between words, recognition is
much easier.  While in many applications restrictions on the speaker would
be unacceptable, they might be acceptable to many users entering text, since
there would still be a substantial efficiency gain over other forms of text
entry.  Also, we could devise sounds quite different from English for
commands (a la Victor Borge!).  The speaker can become trained, as well as
the speech recognizer.
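
One way to see why pauses help: with a gap between words, word boundaries
can be found from loudness alone, before any recognition is attempted.  The
Python sketch below is only an illustration under stated assumptions.  The
function name, the per-frame energy values, and the threshold and pause
length are all hypothetical, and it is not a description of how any real
recognizer segments speech.

```python
def split_on_pauses(energies, threshold=0.1, min_pause=2):
    """Return (start, end) frame index pairs for each word, end exclusive.

    A word ends once at least `min_pause` consecutive frames fall below
    `threshold`; shorter dips are treated as part of the word.
    """
    segments = []
    start = None   # frame index where the current word began
    quiet = 0      # consecutive quiet frames seen inside the current word
    for i, energy in enumerate(energies):
        if energy >= threshold:
            if start is None:
                start = i
            quiet = 0
        elif start is not None:
            quiet += 1
            if quiet >= min_pause:
                # The word ended just before the quiet run began.
                segments.append((start, i - quiet + 1))
                start = None
                quiet = 0
    if start is not None:
        # Input ended mid-word; trim any trailing quiet frames.
        segments.append((start, len(energies) - quiet))
    return segments


frames = [0, 0.5, 0.6, 0, 0, 0.7, 0.8, 0.05, 0.9, 0]
words = split_on_pauses(frames)
```

With these made-up frames, the two-frame gap separates the words, while the
single quiet frame inside the second word does not split it.  In continuous
speech no such gaps exist, so this simple approach finds no boundaries.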

I discussed this recently with Bea Oshika at SDC, who has been active in
speech recognition for many years.  She pointed out some cognitive problems
with mixed mode input.  People, she claims, don't do well at talking and
carrying out manual tasks at the same time.  But I suspect that training in
voice + keyboard input to an editor could produce a more efficient result
for most people.  At least it is an interesting question to investigate.

Stock Gaines