1997-01-21 - Re: Keyword scanning/speech recognition

Header Data

From: Rick Smith <smith@sctc.com>
To: pgut001@cs.auckland.ac.nz
Message Hash: b97b46ee7b9b8df9646581188a28e6766355cc21fa5663705814522bf1d9369b
Message ID: <199701212359.PAA14327@toad.com>
Reply To: N/A
UTC Datetime: 1997-01-21 23:59:49 UTC
Raw Date: Tue, 21 Jan 1997 15:59:49 -0800 (PST)

Raw message

From: Rick Smith <smith@sctc.com>
Date: Tue, 21 Jan 1997 15:59:49 -0800 (PST)
To: pgut001@cs.auckland.ac.nz
Subject: Re: Keyword scanning/speech recognition
Message-ID: <199701212359.PAA14327@toad.com>
MIME-Version: 1.0
Content-Type: text/plain


pgut001@cs.auckland.ac.nz wrote:

: I was talking to someone recently about the feasibilty of keyword-scanning
: phone conversations....

My first "real" job in the computer industry was for a garage shop
doing speech recognition. We did a demo system in 1977 for Rome Labs
that did exactly what you're asking about: scanning a stream of
continuous speech over a telephone line looking for key words. It was
tolerably effective: I forget the success rate but it was above 90%.
But we were never asked to go past the research prototype.

We did it the "hard way" in that we were trying to solve the "talk to
the computer" problem which is harder than the "look for something
suspicious worth looking closer at" problem. I expect they were looking
for something to cut down on their false positives and perhaps we weren't
significantly better than what they were already doing.

: "Discrete Utterance Speech Recognition without Time Alignment", John Shore
: and David Burton, IEEE Trans.Information Theory, Vol.29, No.4 (July 1983),
:  p.473.
:  
:This generates a feature vector every 10-30ms from input speech which is
:compared to pre-generated reference sequences.  It also has references to many
:other papers covering the same area.

When I worked in the field "discrete utterance" was the buzz phrase
for talking with - pauses - between - each - word. Ecch. Our
commercial systems at the time (late '70s) used discrete speech
without time alignment since we could process 8 input channels
simultaneously.  Ahhh. The joys of microcoding for a 74S181 ALU.

Rick.
smith@sctc.com            secure computing corporation






Thread