Speech Recognition

by Bob Seidel

* There was an interesting detective article in the papers this past week - perhaps you saw it. A couple in New York had found a fairly expensive digital camera in a cab and decided to try to return it to its rightful owner. The cabbie was no help, so their only clues were the photos and videos already stored on the camera. To make a long story short, they finally were able to track down the owner, who was in the US on vacation from Ireland. If you haven't read the article, you can find it at news.google.com - search for "found camera".

But that brings up an interesting point: How many of us properly label our expensive electronic equipment? It is true that someone can often find out your information from your cell phone, but how about your camera, or MP3 player? Don't be a victim - label your stuff now. If you don't want to put a sticky label on your camera, you could always take a photo of yourself holding a card with your address and phone number on it and leave it in the camera. Be creative, but be safe!

* Speech recognition is getting a lot of press lately. In particular, a series of Ford commercials is touting the company's latest car, which uses voice recognition to play music and make cell calls. So what is going on, and what has changed?

The big three of non-textual human interaction have always been text-to-speech, speech recognition, and handwriting interpretation. I have been involved with all three, at least tangentially, since I started working with computers.

My first experience with text-to-speech and computer voice synthesis was back in the late 70's. At work I had a chance to play with an early speech synthesizer, but most of my work was at home on my TRS-80. Yes, I did have a speech synth on my trash-80, and it worked pretty well. I could program educational games for my children that actually spoke to them. This was especially beneficial when the kids had not yet learned to read. Oh, the sound was very machine-like, but quite intelligible. Speech synthesis technology was the most straightforward of the three, and these days it has attained a very high level of quality.

Handwriting interpretation has never worked well. Let's face it - our handwriting is atrocious, and with the advent of PCs it is getting worse. Where schools used to stress penmanship, now they stress keyboarding. Apple first tried direct handwritten input in a handheld computer called the Newton; the results were literally comical. The Newton was withdrawn quite quickly. And it's not much better these days. My Pocket PC (aka AT&T Tilt) does it to some degree, but it works best if you use a special character set, and that is not at all like normal cursive writing. I don't use it much, as the Tilt has a slide-out keyboard.

But lately speech recognition has taken some giant leaps forward. Companies like IBM and Dragon were leaders a few years ago, but it seems now that Microsoft has assumed the lead and is aggressively marketing it for everyday products, such as in cars. In fact, I use the Microsoft Voice Command package on my Tilt (an optional package that costs about $25) and it is pretty accurate.

The most important thing to realize is that you do not have to pre-record your voice to use this facility - it really does recognize any voice. Thus I can dial any of my hundreds of contacts with no prior effort. It will also tell me the date or time, the status of my battery, and my upcoming appointments.

There are still some drawbacks. First, you still need a fairly quiet environment. My phone will often "false" (trigger by mistake) if there is a TV or radio playing in the background. You have to speak clearly and leave a longer-than-normal pause between words. You also have to learn the vocabulary that is in use for your particular device. For example, I can say "Dial" or "Call" to start a command, but something like "Get" would just confuse the computer.
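
For the programmers among my readers, here is a rough sketch in Python of what a fixed command vocabulary means. This is only an illustration and not how Microsoft Voice Command actually works inside; the name parse_command and the two-word verb list are my own assumptions, taken from the "Dial" and "Call" example above.

```python
# Hypothetical vocabulary: only the two verbs mentioned in the column.
# A real product's command list is larger and is not documented here.
COMMAND_VERBS = {"dial", "call"}


def parse_command(utterance):
    """Return (verb, target) if the utterance starts with a known verb, else None."""
    words = utterance.strip().split()
    if not words:
        return None  # nothing was said

    verb = words[0].lower()
    if verb not in COMMAND_VERBS:
        return None  # unknown first word, e.g. "Get"

    target = " ".join(words[1:])
    if not target:
        return None  # a verb with nobody to call is not a full command
    return (verb, target)


if __name__ == "__main__":
    print(parse_command("Call Suzi Drake at work"))
    print(parse_command("Get Suzi Drake"))
```

The point is simply that the device listens for one of a short list of opening words. Anything outside that list is rejected rather than guessed at, which is why you have to learn the vocabulary.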

I predict that you will see much more voice recognition in the future. For now, I am going to finish this column and call my editor. "Call Suzi Drake at work"!

(Bob Seidel is a local computer consultant in the Southport - Oak Island area. You can visit his Website at www.bobseidel.com or e-mail questions or column ideas to him at bsc@bobseidel.com. For specific inquiries, please call Bob Seidel Consulting, LLC at 278-1007.)