Sunday, January 29, 2006

IBM Speech Technology Roundup

PC Magazine has an article on speech recognition research at IBM. The article starts off by reporting on the release of Embedded ViaVoice 4.4 and then goes on to cover the various bits of speech research going on across IBM, and there is quite a lot of it.

After the hype of a few years ago quieted down, speech recognition technology has been progressing slowly but steadily, with implementations being fielded, tinkered with, and re-tried until useful applications emerged in the Darwinian world of day-to-day business. Just try making an airline reservation over the phone or calling a major credit/charge card company and you will see what I mean.

The article claims some pretty impressive achievements, not only in speech recognition but also in language translation. English-Chinese translation is being addressed by a project called MASTOR (Multilingual Automatic Speech-to-Speech Translator), while on-the-fly translation of Arabic television to create English subtitles is tackled by a project called Tales. Tales is said to achieve an accuracy rate of between 60 and 70 percent with a four-minute lag time (for processing) and 80 percent with a longer one. In an article like this, one usually doesn't get enough details to satisfy one's technical curiosity, such as how 'accuracy' is defined and under exactly what conditions those results were achieved. But even if these numbers are a little too optimistic, that is still more than good enough to 'gist' Arabic in almost real time.
