The PowerBook Mystique

Talking To Your ‘Book - Using iListen Dictation Software With Apple Notebooks

by Charles W. Moore

MacSpeech Inc. is celebrating its 10th anniversary in 2007. The celebration kicked off with an expanded presence at Macworld in January, and will continue all year.

Incorporated in 1997, MacSpeech grew from a small core staff of programmers to a company with the top-selling speech recognition solution for Macintosh (and the only one currently being actively developed). With customers in almost 100 countries, MacSpeech truly became “the little (speech) engine that could,” outlasting even IBM, which left the Macintosh speech recognition market in 2003.

Last year, MacSpeech released iListen 1.7, the seventh version of that software. Originally introduced in 2000, iListen is now available in versions for US, UK, and Australian/New Zealand English, as well as German, Italian, and Spanish. iListen is unique in that it allows a consistent method of dictation into virtually any application – a feat that has yet to be equaled by any other speech recognition program on Mac or PC. MacSpeech also sells add-ons for iListen that allow it to transcribe speech from a recording, or add commands for many of the most popular Macintosh applications.

To celebrate its 10th anniversary, MacSpeech has published a PDF with its history and vision. Called “The MacSpeech Story,” it can be downloaded from the MacSpeech web site by clicking on the 10th Anniversary Badge on the MacSpeech home page.

Since the first public release of iListen in 2000, the program has gotten better and better with each subsequent version. The current version 1.7, which is a Universal Binary, is now a mature and reliable product, fast and accurate, giving Mac users an excellent speech recognition solution.

Dictation is one of the most processing-intensive tasks many of us will ever want to use computers for, and the more power you have on tap, the better. I’ve used iListen on mostly less than cutting-edge Apple laptops from the get-go (it did run very nicely on my G4 Cube), and find that version 1.7 is satisfactorily lively on my 1.33 GHz PowerBook with 1.5 GB of RAM, at least on a reasonably fresh restart. Transcription response does get a bit draggy when the memory gets cluttered after several days of uptime, and the swapfile parade starts. I’m looking forward to a big performance improvement when I finally upgrade to an Intel Mac.

In the meantime, iListen is working very well for me on my current hardware.

Why use dictation software? The most common reason would be some sort of medical condition or handicap that makes typing difficult or impossible. I have battled nerve pain in my arms and hands (actually all over my body) for many years, and since the early ‘00s, typing documents of any length, such as this column, has been difficult and aggravates the condition. I can type for short periods on certain keyboards (PowerBook keyboards have always been pretty good, and so are freestanding boards that have a notebook-type scissors action, such as the Kensington SlimType), but for anything longer than a few paragraphs, I usually employ iListen.

Even if you don’t suffer from typing pain, dictation can be a lot more convenient for, say, transcribing large amounts of hard copy text into a word processor or text editor. It also may be useful to you to be able to control your computer using voice commands rather than the keyboard or mouse. And some people find it more amenable to express their thoughts orally than with a keyboard or pen and paper. I’m not one of the latter, and my usual modus operandi is to compose articles and columns longhand, read the primary draft into a text editor using iListen, and then do the final edits on the computer. Whatever the case, once you get used to using dictation software, it is hard to imagine getting along without it.

So what is iListen like to use?

MacSpeech has the installation process nicely streamlined these days, and there were no surprises, or any change in that department from iListen 1.6. It’s not a drag & drop install, but it is very quick and hassle-free. Note that iListen must have the Enable Access for Assistive Devices checkbox checked in the Universal Access pane of System Preferences. If it isn’t, iListen will ask you to turn it on when it is launched.

Before you are ready to do workaday dictation with iListen, there is some setup required. First you need a really good quality microphone. iListen isn’t quite as picky about microphones as, say, IBM’s now moribund ViaVoice dictation software product for the Mac was, but you still need a microphone that can deliver a clear and clean audio signal. According to MacSpeech, Mac OS X demands a considerably stronger audio signal than Windows does, so many computer microphones don’t work well with OS X because they were designed with Windows computers in mind. MacSpeech has a list of approved microphones for iListen on their web site. They also offer an excellent selection of iListen-compatible microphones in their online store.

The first thing that will happen when you start up iListen is that it will walk you through an orientation routine called “Set Up My Microphone,” which will determine whether your microphone is suitable for dictation. That taken care of, it will be necessary to create a voice profile by running another setup routine that “trains” the software to recognize your particular voice idiosyncrasies. Basically you just read one or more of the training stories that are provided in the setup window, and then the computer will take a few minutes to process the data collected. You don’t need to read all of the training stories. I’ve found that I usually get half-decent accuracy after about three of them have been entered. Multiple users can share iListen on a single machine, but each must create his or her own profile, and must train iListen separately.

Once basic training is completed (you can always return to the “Learn My Voice” module for further training), you’re ready to start using iListen in earnest. iListen has three modes: dictation, spelling, and command. You can switch among them using mouse clicks on the iListen floating feedback palette, or just use voice commands such as “Switch to Command Mode,” “Switch to Spelling Mode,” or “Switch to Dictation Mode.” Dictation Mode is the default.

When you are ready to dictate, make sure that you have positioned your mouse cursor at the desired point in your document or text field and click the microphone icon in the feedback palette. To turn off the microphone, click the mic icon again and a red slash will appear to show that the microphone is off. You can also use the voice commands “Go to Sleep” and “Wake Up” to mute and reactivate the microphone. If you just want to enter Command Mode momentarily, say “One Shot Command” and the program will temporarily enter Command Mode, and then automatically return to Dictation Mode once a command has been executed.
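For readers who think in code, the mode switching described above can be pictured as a small state machine. What follows is only a sketch in Python, assuming nothing about how iListen is actually built: the class name `ModeTracker`, its `hear` method, and the returned strings are all hypothetical, and only the spoken phrases come from the program itself.

```python
from enum import Enum


class Mode(Enum):
    DICTATION = "dictation"
    SPELLING = "spelling"
    COMMAND = "command"


class ModeTracker:
    """Hypothetical model of the behavior described above, not iListen's code."""

    SWITCH_PHRASES = {
        "switch to dictation mode": Mode.DICTATION,
        "switch to spelling mode": Mode.SPELLING,
        "switch to command mode": Mode.COMMAND,
    }

    def __init__(self):
        self.mode = Mode.DICTATION  # Dictation Mode is the default
        self.awake = True           # microphone is listening
        self.one_shot = False       # True while a "One Shot Command" is pending

    def hear(self, phrase):
        key = phrase.strip().lower()
        if key == "wake up":
            self.awake = True
            return "awake"
        if not self.awake:
            return "ignored"        # asleep: everything but "Wake Up" is dropped
        if key == "go to sleep":
            self.awake = False
            return "asleep"
        if key in self.SWITCH_PHRASES:
            self.mode = self.SWITCH_PHRASES[key]
            self.one_shot = False
            return f"mode: {self.mode.value}"
        if key == "one shot command":
            self.mode = Mode.COMMAND
            self.one_shot = True
            return "mode: command (one shot)"
        if self.mode is Mode.COMMAND:
            result = f"run: {phrase}"
            if self.one_shot:       # fall back to dictation after one command
                self.mode = Mode.DICTATION
                self.one_shot = False
            return result
        return f"text: {phrase}"    # dictation or spelling input
```

The point of the sketch is the one-shot flag: a single command is executed, and the tracker then drops back into Dictation Mode on its own, which matches what you see in practice.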

MacSpeech recommends not looking at the computer screen while you’re dictating, since many users who keep their eyes on the document will find themselves pausing to wait for iListen’s transcription to catch up with their dictation. Accuracy will be much better if you just speak to the program using a normal voice and cadence. This is not usually a problem for me, because I’m not a touch typist, and tend to look at the keyboard a lot anyway, and what I’m dictating is more often than not being read in from hard copy. However, touch typists are inclined to keep their eyes on the document.

Note that you will also have to speak punctuation when dictating.

Computer dictation is far from a perfected process yet, and even with the program fully trained, a certain number of mistakes is inevitable. A particular hazard of dictation software is that typos (or “dictatos”) are all perfectly spelled, which means spell check software won’t be much help for proofing. I’ve managed to publish more than a few dictation bloopers over the years.

Navigating using voice commands in iListen requires a bit of learning curve scaling. To find a particular word or phrase in the document, say “Do Select” and the program will highlight the first occurrence starting from the beginning of the document. You can also say “Do Search” and iListen will jump to the next occurrence. Once you have a word or phrase selected, you can dictate a different word or phrase, and the new one will replace the old one in the document. Note that iListen needs to keep track of the document insertion point at all times, and if you move the insertion point manually using the mouse even once, it will lose track. The voice search and select function allows you to correct your document without having iListen lose track of the insertion point, and thus the program’s ability to correct mis-recognized words.

iListen’s correction window isn’t for editing your document or correcting spelling mistakes, but rather correcting mis-recognized words, and to help the program fine tune its training of your voice profile, thus improving future accuracy. In order for this to work, you must use correction mode before editing your document manually. In other words, don’t manually change the insertion point before you have run the correction routine. Personally, I think that this aspect of iListen is one that could use a lot more work to make it more flexible, forgiving, convenient, and intuitive. However, if you persevere, you can improve the program’s accuracy substantially using the correction window.

You can also use the correction dialog to add new words to iListen’s dictionary. From time to time you will want to use words that iListen simply does not recognize, but that you don’t use often enough to add to the dictionary, for example certain proper names, particularly foreign ones. In this instance, you use spelling mode, which allows you to spell a word or phrase by pausing briefly between each spoken letter. To type a capital letter, just say the word “capital” followed by the letter name. You can also use the military alphabet (Alpha, Bravo, Charlie, Delta, etc.) to enter letters, which is helpful because it is difficult for speech recognition technology to accurately recognize similar-sounding letters such as ‘F’ and ‘S’, or ‘M’ and ‘N’.
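To make the spelling rules concrete, here is a rough Python sketch of how a sequence of already recognized spoken tokens could be turned into a word. It is an illustration only: the function name `spell`, the `CODE_WORDS` table, and the exact set of accepted code words are my assumptions, not anything documented by MacSpeech.

```python
# Military (NATO-style) code words; each one stands for its first letter.
# The exact spellings iListen accepts are an assumption here.
CODE_WORDS = {
    word: word[0]
    for word in (
        "alpha bravo charlie delta echo foxtrot golf hotel india juliett "
        "kilo lima mike november oscar papa quebec romeo sierra tango "
        "uniform victor whiskey xray yankee zulu"
    ).split()
}


def spell(tokens):
    """Build a word from spoken tokens: letter names, code words, 'capital'."""
    letters = []
    capitalize_next = False
    for token in tokens:
        t = token.lower()
        if t == "capital":
            capitalize_next = True      # applies to the next letter only
            continue
        if t in CODE_WORDS:
            letter = CODE_WORDS[t]
        elif len(t) == 1 and t.isalpha():
            letter = t
        else:
            raise ValueError(f"not a letter or code word: {token!r}")
        letters.append(letter.upper() if capitalize_next else letter)
        capitalize_next = False
    return "".join(letters)


print(spell(["capital", "mike", "oscar", "oscar", "r", "e"]))  # Moore
```

Saying “mike” rather than “M” is what sidesteps the ‘M’ versus ‘N’ confusion: the code words are long and distinct enough that the recognizer has far less room to guess wrong.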

iListen’s third mode, “Command Mode,” allows you to use your voice to do almost anything you would normally do using menus and mouse clicks. iListen incorporates several command sets, which are collections of AppleScripts that tell iListen to execute one or more commands. Command sets are similar to Apple’s Speakable Items, but are all self-contained within iListen.

Global commands are available all the time, as long as iListen is in Command Mode. Application-specific command sets contain commands that work in a particular application while iListen is in Command Mode. This is how iListen knows not to execute a command meant for Microsoft Word (for instance) when you are working in Apple Mail. If you are running Mac OS X version 10.4 Tiger or later, you can say “Open Spotlight” followed by a word or phrase to activate the program’s Spotlight search feature.
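The scoping of global versus application-specific command sets can be sketched as a simple lookup. This Python fragment is only a model under stated assumptions: the command phrases other than “Open Spotlight,” the action strings, the `resolve` function, and the choice to check the application’s set before the global one are all hypothetical. The real command sets are collections of AppleScripts inside iListen.

```python
# Hypothetical command tables; the real ones are AppleScript-based command sets.
GLOBAL_COMMANDS = {
    "open spotlight": "global: open Spotlight",
}
APP_COMMANDS = {
    "Microsoft Word": {"make this bold": "Word: bold the selection"},
    "Mail": {"check mail": "Mail: check for new mail"},
}


def resolve(phrase, frontmost_app, in_command_mode=True):
    """Return the action for a spoken phrase, or None if nothing applies."""
    if not in_command_mode:
        return None                     # commands only work in Command Mode
    key = phrase.strip().lower()
    app_set = APP_COMMANDS.get(frontmost_app, {})
    if key in app_set:                  # assumed: app-specific set is checked first
        return app_set[key]
    return GLOBAL_COMMANDS.get(key)     # otherwise fall back to global commands


print(resolve("Make This Bold", "Microsoft Word"))  # Word: bold the selection
print(resolve("Make This Bold", "Mail"))            # None
print(resolve("Open Spotlight", "Mail"))            # global: open Spotlight
```

A Word command spoken while Mail is frontmost simply finds no match, which is the behavior described above.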

iListen’s Voice Launcher will allow you to open over 150 Macintosh applications simply by saying “Open” followed by the application’s name. Voice Launcher commands can be edited too. For instance, if you would rather say “Open Word” instead of “Open Microsoft Word,” simply change the command name in iListen’s item editor. iListen’s list of Web Favorites will take you to over 150 useful web sites. You can jump to a web site by saying “Jump to” followed by the name of the web site. You can also add your own web sites easily using iListen’s item editor.

iListen also supports text macros in two categories: Normal and Dictation. Normal text macros are only available when iListen is in Command Mode. Dictation text macros are available when you’re in Dictation Mode. The latter are useful for including such things as closing signatures or any other text that you type frequently. iListen text macros can hold up to 32,000 characters each, or the approximate equivalent of eight pages.

Yet another iListen feature is a routine called “Learn My Writing Style,” found in the “Speech” menu. Using this tool, iListen can analyze documents to determine the user’s writing style, which improves accuracy by training iListen to recognize the way a particular user puts words together. You must of course analyze documents written by the user, and not someone else. Learn My Writing Style can read either plain text or rich text formatted documents.

iListen’s Transcription feature will generate text from WAV and AIFF audio files. Using a digital voice recorder, you can record now, transcribe later.

The main new features in iListen 1.7 compared with version 1.6.8 are:

- Universal Binary. Now works natively on both Intel-based and PowerPC Macs.
- Totally revamped Commands. Deprecated commands have been removed and new ones added.
- Faster training. Users can now be up and running in as little as 5 minutes.
- Enhanced accuracy and faster typing.
- Profiles are now stored outside the application in a new profile package with a “.voice” extension
- Parameterized Variables (numbers only - you can now create commands that have a variable number in the command name)
- Faster Text Macros
- New Voice Package
- Backups can now be renamed
- TypeKey Helper is built-in
- Text Macros can paste into a document instead of type
- Completely rewritten Voice Launcher
- Completely rewritten Web Favorites
- iListen now installs into an iListen folder instead of a folder named “MacSpeech.” This means you can have an older version of iListen installed at the same time as 1.7 to assist in making a smooth transition.
- Profiles now live in a new type of container called a Voice Package. You can have as many profiles in a Voice Package as you would like (up to available hard drive space). You can also have multiple Voice Packages on your computer. iListen 1.7 automatically creates a Voice Package with the name of the first Profile you create and saves it in the Documents folder of your Home folder. You can rename the Voice Package by selecting it in the Finder and changing its name (be careful not to change the “.voice” extension, however).
- You can also now rename an iListen Backup package (as long as you don’t change its extension).
- TypeKey Helper is now integrated into the Command Editor window.
- You can now have Text Macros pasted instead of having to wait for them to be typed out. (You must turn this on in iListen’s preferences. The default is for this feature to be turned off.)

- The “chooseApplication” AppleScript has been modified to support Mac OS X’s application file IDs. The new syntax is as follows:

chooseApplication "TextEdit.app" application file id "com.apple.TextEdit"

If you already have iListen 1.6.x or an earlier version installed, iListen 1.7 now resides in a folder named “iListen” instead of “MacSpeech.” This means 1.7 and 1.6.8 can coexist on your computer at the same time, easing a transition to the new version.

A question I get asked fairly frequently is how iListen compares with the PC-only Dragon NaturallySpeaking, which is widely regarded as the gold standard of computer dictation software.

This is difficult for me to answer, because I don’t have any direct experience comparing iListen with Dragon NaturallySpeaking. One problem is that unless one has the extended amount of time it would take to thoroughly train both programs to one’s unique voice profile, the test would not be very meaningful. Another is that, as MacSpeech’s Chuck Rogers has explained to me, response with these programs is so idiosyncratic to a particular user that one person could get better results with one program while another might do better with the other, depending on their individual voice characteristics and perhaps their microphone as well. Some dictation software works better with certain mics.

MacSpeech has posted a short commentary on the topic “How Does iListen Compare with Dragon NaturallySpeaking?”, which may be helpful.

System requirements:
iListen 1.7.0 will run on Mac OS X 10.3.x and Tiger (10.4.x). iListen 1.6.8 (using a separate installer) is bundled, and will run on Mac OS X 10.2.8. (You can use the same voice profile under both operating systems by using the back up and restore functions).

iListen is a big application, hungry for both RAM and processor power. I would suggest that a 500 MHz Mac with at least 512 MB of RAM would be the bare minimum system you would want for reasonable performance.

iListen 1.7 still sells for $99 for a standalone version or as little as $149 for a version that includes a noise-canceling microphone. Registered users of iListen 1.6 or later can upgrade for $39.95.

Registered users of iListen 1.5.1 and earlier can upgrade for $69.95.

For more information, visit:
http://www.macspeech.com/

***

CM


