Help for a dead language: Three spelling checkers

©1982 Lawrence I. Charters

Bremerton, Washington

80-U.S. Journal, February 1982, pp. 23-26

The advertisement says, “Throw It Away (your dictionary, that is),” and implies that bad spelling is a thing of the past — once you have a computer dictionary. While this is a great ad, it isn’t quite accurate. A spelling checker should definitely improve your spelling, but you’ll still need your trusty paperback version. Spelling checkers are huge, complex, expensive programs; even so, they are not masters of English. Spelling checkers are thorough. A good spelling program will find errors that even the pickiest English teacher may fail to see. Spelling dictionaries may not seem particularly exciting, but you may be surprised.

Some History

When the term “word processing” first came into vogue, most people didn’t have the slightest idea what it meant. The concept of a “processed word” implies that there must also be “unprocessed” or “raw” words and did anyone really want to know what they were? (Some things are best left unknown…) Then, along came personal computers, text editors, Electric Pencil and Scripsit. Suddenly, a “word processor” was revealed as something very useful and desirable: a magic typewriter. Pencils, pens, typewriters and correcting fluid became obsolete. Even the worst typist could churn out letter-perfect copy given a computer, decent software and a printer. Well, almost…

When IBM introduced their Displaywriter word processor not too long ago, one feature came in for special attention: the Displaywriter had a built-in 50,000 word dictionary. After a document was finished, Displaywriter’s dictionary (on diskette) would check every word in the document and all “unknown” words were highlighted in reverse video. The writer could then go back through the document and see if the highlighted words were typographical errors, misspellings, or simply words or names the dictionary didn’t recognize. The Displaywriter, at $8,000, was an instant hit and the dictionary was no small part of this success.

After seeing the Displaywriter in action, many observers wondered, “But what if you don’t have $8,000 and are a rotten speller? Is there hope?” Despair not. Anything IBM can do, a TRS-80 can probably do, and cheaper, too.

Three Different Approaches

For this evaluation, we examined three spelling checkers which are available for the TRS-80. The least expensive, Proofreader, from Aspen Software (formerly Soft-Tools), will scan a disk document file of any length, keep track of all unique words and display the number of unique words found. These words are then sorted and compared to Proofreader’s dictionary of 38,000 words and any suspect words are listed to the screen, one per line. If desired, Proofreader will write this list to a disk file, which can then be printed by a printer. Proofreader does not have any provision for correcting the document; you must use your word processor, text editor, or Proof-edit (also from Aspen) to go back and manually search out and correct any misspellings or typographical errors. Proofreader comes on three diskettes and requires 37K and one disk drive.

Slightly more expensive, and very different in approach, is Hexspell, from Hexagon Systems. Hexspell will scan a disk document file of any length, scrolling the document up the screen as it compares the text against its 29,000 word dictionary. When Hexspell comes across something it doesn’t recognize, it shows the word in context and asks if it should leave the word as is, add the word to Hexspell’s dictionary, or replace the word. If you choose to replace the word, Hexspell asks you to type in your correction. While Hexspell is active, it maintains its own separate copy of your document and when finished, automatically copies this corrected version back to your original file. Hexspell also provides a total word count for the document and notes the number of words which it did not recognize. Hexspell comes on two diskettes and requires a 48K, two-drive system.

Perhaps the best known (and most expensive) system is Microproof, from Cornucopia Software. It is available in a number of configurations including a version which will handle any size document. The version we received handled documents up to about 4,000 words, the limit imposed by the TRS-80’s memory. In its basic form, Microproof first scans the document, then checks it against its 50,000 word dictionary. Suspect words are listed on the screen or, optionally, printed by a printer. You must then use your word processor or text editor to find and correct all misspellings.

With the optional correcting ($60) and word processing integration features ($35), Microproof operation is greatly simplified. From Scripsit, Electric Pencil and other popular word processors, you can directly call Microproof, which automatically scans whatever document you are working on. Suspect words are then listed on the screen, one at a time, and you are asked if you wish to leave them as is, replace them, or add them to Microproof’s dictionary.

Microproof will, after initial corrections are made, display in context only those words specifically requested. Obvious misspellings, such as “sincereley” for “sincerely”, can be handled quickly and puzzling terms (is “IL” a misspelling of “ill” or an abbreviation for “Illinois?”) can be given greater attention. Once the document has been corrected, Microproof reloads the word processor and the corrected document. Microproof comes on one double-sided “flippy” diskette and requires 32K and one disk drive.

Though not mentioned in Cornucopia’s advertising, a two-drive system (or operating system allowing single-drive file copying) is required for setup.

The Details

All three spelling checkers were tested on a 48K, two-drive (35-track) TRS-80 Model I, with Radio Shack’s lower case modification, memory, drives, etc., using TRSDOS and Apparat’s NEWDOS/80 version 1.0. The only non-standard hardware was Percom’s Doubler II, used with double-density NEWDOS/80 2.0. Sample text files for testing were prepared with standard Scripsit and Scripsit modified with Acorn Software’s Superscript.

Proofreader

Aspen Software’s Proofreader performed rapidly and consistently. The main program, PROOFRDR/CMD, is loaded first, then the file to be checked. Once Proofreader has sorted all unique words in the document file, the two dictionary disks, DICT1/BIN and DICT2/BIN, are loaded one at a time into any drive (a system disk is not required). Both dictionaries are around 69,000 bytes long and Proofreader scans them in alphabetic order according to length. In fact, if a document consists of nothing but short words, Proofreader won’t bother to prompt for DICT2/BIN. If desired, an optional dictionary, AUXDICT/TXT, is checked. As a final step, Proofreader prompts for a system diskette to be loaded in drive 0 and will then save the suspect word list to disk if desired. The only apparent way to “crash” Proofreader is to ignore this prompt — all other steps are carefully checked by the program. If you do crash the program, no harm is done since Proofreader does not modify the original document file. All necessary instructions are carefully presented in the eight page unindexed manual.

Proofreader does not distinguish between upper and lower case, does not check anything beginning with something other than a letter and does not check single-letter words. When the list of suspect words is displayed on the screen, it is in upper case only and if printed on a printer, appears all in lower case. Proofreader also makes broad assumptions about words ending in “s”, which occasionally creates problems with plurals and words that Proofreader thinks might be plurals. When the suspect word list is printed on a printer, line feeds are inserted after every word. This consumes massive amounts of paper if you are a rotten speller. If your paper budget is getting low, these line feeds can be removed by first loading the list into Scripsit and then replacing them with blanks.
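The checking rules reported in this section (collect the unique words, sort them, compare them with a dictionary, list the suspects one per line) can be sketched in a few lines of Python. This is only an illustration of the behavior the review describes, not Aspen's code: the function name `suspect_words`, the tiny word set and the punctuation-stripping rule are all assumptions, and Proofreader's treatment of words ending in “s” is not modeled.

```python
def suspect_words(text, dictionary):
    """Return the sorted unique words in `text` that are not in `dictionary`.

    Hypothetical implementation of the rules reported in the review:
    case is ignored, tokens that do not start with a letter are skipped,
    and single-letter words are not checked.
    """
    known = {w.upper() for w in dictionary}
    unique = set()
    for token in text.split():
        # Strip surrounding punctuation such as quotes, commas and periods.
        token = token.strip(".,;:!?\"'()")
        if len(token) < 2 or not token[0].isalpha():
            continue
        unique.add(token.upper())
    return sorted(w for w in unique if w not in known)


sample = "I beleive the 48K machine will recieve a new disk drive."
words = {"the", "machine", "will", "receive", "new", "disk", "drive", "believe"}
for word in suspect_words(sample, words):
    print(word)  # one suspect per line, in upper case
```

With this sample only the two misspellings are listed; “I”, “a” and “48K” are never checked at all, which matches the exclusions described for Proofreader.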

Since Proofreader lacks a correcting feature, you must edit your document with your word processor, guided by the list of suspect words. To edit AUXDICT/TXT, you load the entire file into your word processor and make whatever changes are desired. This limits the size of the dictionary to whatever the maximum document size is for your word processor, unless you are skilled enough to write a sequential access program specifically tailored to the task. There is no way to edit the main dictionaries; fortunately, they do appear to be accurate. Aspen’s Proof-Edit package, available separately, allows editing of all Proofreader dictionaries.

Testing Proofreader under both single- and double-density operating systems revealed no obvious weaknesses.

Hexspell

Hexspell was written with the Microsoft BASIC compiler and this gives it a very different flavor. To run Hexspell, you type “BRUN SP”, which loads the Microsoft BASIC run time package (BRUN/CMD) and Hexspell’s chaining program (SP/CHN). Hexspell prompts for the dictionary disk to be mounted and then for the file to be checked. The document file is scrolled up the screen while you proofread the document. Since the entire document is displayed, in proper upper/lower case (you must use your own lower case driver), proofreading is unbelievably simple. When Hexspell finds a suspect word, all you need to do is type “L” for Hexspell to learn the word, “S” to skip the word (leave it as is), or “R” to replace the word.

Hexspell will check the spelling of replacement words as well as words in the original document. In other words, if Hexspell complains about “misstake” and you replace it with “misteak”, Hexspell will flag it again. Though seemingly a trivial item, this feature sets Hexspell apart from the other two spelling checkers — Hexspell checks everything. Hexspell ignores all characters but letters, will check single letters and treats upper and lower case identically. A nice feature for writers and anyone else interested in word counts is Hexspell’s summary count of total number of words in a document, together with total number of unrecognized words. As numbers are ignored, “1982” would not be counted as a word (though editors would count it). On the other hand, “M60A1”, the designation of a type of U.S. tank, would appear to Hexspell as “M” and “A”, and count as two words.
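The word-counting behavior described here follows naturally if every non-letter character acts as a word separator. The Python sketch below is an assumption about how such a tokenizer might work, not Hexspell's actual routine; the function name `letter_runs` is hypothetical.

```python
import re


def letter_runs(text):
    """Split text into runs of letters, discarding every other character."""
    return re.findall(r"[A-Za-z]+", text)


for sample in ("1982", "M60A1"):
    words = letter_runs(sample)
    print(sample, "->", words, "count:", len(words))
```

Under this rule “1982” yields no words at all, while “M60A1” breaks into the two single-letter words “M” and “A”, which agrees with the counts given in the review.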

Hexspell has two dictionaries, one which remains in memory (SPELL/MEM, about 17,500 bytes or 6,000 words long), and one which remains on disk (SPELL/LST, about 70,000 bytes, or 23,000 words long). The in-memory dictionary contains commonly used words, which greatly speeds Hexspell’s operation. After each proofreading session, Hexspell automatically adds any words you asked it to learn to SPELL/MEM, and bumps infrequently used words onto SPELL/LST. SPELL/LST, in turn, “forgets” unused words to make space. After a while, this ripple effect tailors Hexspell to your personal vocabulary. With use, Hexspell becomes so familiar with your vocabulary that it seldom needs to consult the disk-based dictionary.
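The “ripple effect” resembles a two-level cache. The Python sketch below models it with a least-recently-used rule, which is purely an assumption: the review does not say how Hexspell decides which words are “infrequently used”. The class name `TwoTierDictionary` and the tiny tier sizes are hypothetical, with `memory` standing in for SPELL/MEM and `disk` for SPELL/LST.

```python
from collections import OrderedDict


class TwoTierDictionary:
    """Small fast tier plus larger slow tier, with least-recently-used demotion.

    Hypothetical model only; the real eviction rules are not documented
    in the review.
    """

    def __init__(self, memory_size, disk_size):
        self.memory_size = memory_size
        self.disk_size = disk_size
        self.memory = OrderedDict()  # most recently used words at the end
        self.disk = OrderedDict()

    def _insert(self, tier, word, limit):
        """Add `word` to `tier`; return the word pushed out, if any."""
        tier[word] = True
        tier.move_to_end(word)
        if len(tier) > limit:
            evicted, _ = tier.popitem(last=False)  # least recently used
            return evicted
        return None

    def learn(self, word):
        """Put a word in the fast tier, rippling evictions downward."""
        word = word.lower()
        self.disk.pop(word, None)
        bumped = self._insert(self.memory, word, self.memory_size)
        if bumped is not None:
            # A word bumped from memory lands on disk; disk may forget one.
            self._insert(self.disk, bumped, self.disk_size)

    def knows(self, word):
        """Check both tiers; a disk hit promotes the word to the fast tier."""
        word = word.lower()
        if word in self.memory:
            self.memory.move_to_end(word)
            return True
        if word in self.disk:
            self.learn(word)
            return True
        return False


d = TwoTierDictionary(memory_size=2, disk_size=2)
for w in ("quark", "android", "wizard", "ringworld", "dragon"):
    d.learn(w)
print(list(d.memory))    # the two most recently learned words
print(list(d.disk))      # the two words bumped just before them
print(d.knows("quark"))  # the oldest word has been forgotten
```

With room for only two words per tier, the first word learned falls off the end of the disk tier once five words have been taught, which is the same forgetting behavior the review describes on a much larger scale.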

Hexspell can, at your request, ignore all words that begin with a capital letter. This option avoids checking of proper names and terms and speeds the proofreading process. (Hexagon also recommends it as a check for lower case letters in BASIC programs.) Another unusual feature is a BASIC program which clears the entire dictionary. If you wish to teach Hexspell a foreign language or a specialized vocabulary, running CLEAR/BAS will “zero” the dictionary and allow you to start from scratch. Using this feature, a 140,000 byte science fiction book catalog (containing 1800 titles) was fed to Hexspell. When finished, Hexspell recognized Isaac Asimov and Roger Zelazny to be good people and had no trouble with androids, quarks, wizards, stainless steel rats, ringworlds, white dragons and other essentials.

If Hexspell has any faults, it is in error handling — all traceable to Microsoft’s BASIC Compiler. Microsoft spared no expense on error trapping — they didn’t spend a dime. As a result, Hexspell displays great consistency when it comes to errors: it crashes. Fortunately, there aren’t that many ways to generate errors and none will damage your original document. If you try to check a non-existent file, Microsoft’s BASIC run time package (BRUN/CMD) will respond with:

Error. File not found at 5C43.

This cryptic message can be easily avoided by giving Hexspell correct file names.

Hexspell works well under TRSDOS, NEWDOS/80 and LDOS. Under a double-density system like NEWDOS/80 2.0, you can fit Scripsit, all necessary system files and all Hexspell files on one 35-track double-density disk. Hexspell comes with a well-written spiral bound manual containing 10 pages of instructions and a good table of contents. A sample text file (on diskette) is also included to illustrate Hexspell operation.

Microproof

Cornucopia Software’s Microproof is the fastest spelling checker of the bunch. Without use of the optional correcting and word processing integration features, Microproof operates much the same as Proofreader: text files are scanned, the contents compared to Microproof’s dictionary and any suspect words are then listed on the screen. If desired, these suspect words can be printed on a printer. Unlike Proofreader, no provision is made for saving the list to a disk file.

Microproof’s three dictionaries — DICT1, DICT2 and DICT3 — contain an entire Webster’s pocket dictionary of 50,000 words, yet take up just 56 grans (70,000 bytes) on diskette. This minor miracle is made possible by “coding and two sophisticated packing techniques.”
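To put the “minor miracle” in perspective, the dictionary sizes and word counts quoted in this review can be turned into a rough bytes-per-word figure. The Python sketch below uses only the approximate numbers given in the text; the 1,280-byte granule (five 256-byte sectors, as on single-density TRSDOS) is an assumption, not something the review states, and the helper name `bytes_per_word` is hypothetical.

```python
GRANULE_BYTES = 1280  # assumption: one granule = five 256-byte sectors

# (bytes on disk, words) as quoted in this review; all figures approximate.
dictionaries = {
    "Proofreader": (2 * 69_000, 38_000),       # DICT1/BIN + DICT2/BIN
    "Hexspell":    (17_500 + 70_000, 29_000),  # SPELL/MEM + SPELL/LST
    "Microproof":  (70_000, 50_000),           # DICT1 + DICT2 + DICT3
}


def bytes_per_word(size, words):
    """Average storage cost of one dictionary word."""
    return size / words


print("56 grans =", 56 * GRANULE_BYTES, "bytes")
for name, (size, words) in dictionaries.items():
    print(f"{name:12s} {bytes_per_word(size, words):.1f} bytes per word")
```

On these figures Microproof stores a word in well under two bytes, while the other two checkers need roughly three to four, which is why packing and coding matter so much on a floppy-based system.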

Words are identified as verbs, nouns, adverbs or adjectives. By coding “fast” in the dictionary as an adjective, Microproof will recognize “fast”, “faster”, “fastest”, “fasten”, “fasting”, and “fasted.” Microproof will recognize certain prefixes, too, which means that you may slip an occasional “irregardless” or “inclosed” past it without trouble.

These inconsistencies and others are discussed in the 30-page manual: “In some instances, a correctly spelled word that is actually in Microproof’s vocabulary will appear on the error list. This can happen when the correct word is located alphabetically close to another word which appears on the error list and actually is incorrect.” If this type of thing bothers you, Cornucopia offers a “literal” dictionary which avoids this problem at the expense of speed.
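The trade-off behind part-of-speech coding can be shown with a small Python sketch. Each stem is stored once with a code, and the checker accepts any allowed prefix and suffix combination. The suffix and prefix tables below are invented for illustration (Microproof's real tables are not given in the review), as are the names `SUFFIXES`, `PREFIXES` and `build_accepted`.

```python
# Hypothetical suffix sets per code; not Microproof's actual tables.
SUFFIXES = {
    "adjective": ("", "er", "est", "en", "ing", "ed"),
    "plain": ("",),  # stem accepted only as written
}
PREFIXES = ("", "ir", "in", "un")  # assumed prefix list


def build_accepted(lexicon):
    """Expand coded stems into every word form the checker would accept."""
    accepted = set()
    for stem, code in lexicon.items():
        for prefix in PREFIXES:
            for suffix in SUFFIXES[code]:
                accepted.add(prefix + stem + suffix)
    return accepted


lexicon = {"fast": "adjective", "regardless": "plain", "closed": "plain"}
accepted = build_accepted(lexicon)

for word in ("faster", "fasting", "irregardless", "inclosed", "fastly"):
    print(word, "accepted" if word in accepted else "flagged")
```

Three stored stems expand into dozens of accepted forms, which saves a great deal of disk space, but the same expansion lets “irregardless” and “inclosed” through. A literal dictionary, by contrast, would store and match each form exactly.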

Adding words to Microproof is fairly simple: just create a file containing the words you wish to add and ADDTODIC/CMD will make this part of DICT3. PRINTDIC/CMD will allow you to edit and delete words from the expansion dictionary.

Microproof treats upper and lower case letters the same, does not check single letter words and ignores all non-letter characters. All words appear on the screen in upper case only and if printed on a printer, words appear in lower case.

According to Cornucopia’s advertising and documentation, Microproof has no trouble with hyphens, but this isn’t exactly true. Microproof cannot handle compound words which use hyphens, such as “TRS-80”, “double-sided” and “fool-proof”. Cornucopia stated: “… we chose to reject word configurations like very-nice or not-too-greasy. Thus, a pair of proper words will be accepted with a hyphen between them only if they are found in Microproof’s dictionary in hyphenated form. The user may, of course, add them himself as we did with “soft-sector” in the enclosed test file. End of line hyphenation is, of course, ignored and the two halves of the word are treated together as one word.”
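The hyphen policy in Cornucopia's statement can be sketched in Python as follows. This is an illustrative reading of the quoted rules, not Microproof's code: the function name `check_hyphenated`, the regular expressions and the small word set are assumptions, and terms containing digits such as “TRS-80” are not modeled.

```python
import re


def check_hyphenated(text, dictionary):
    """Return suspect words under the hyphen rules quoted above (a sketch)."""
    # End-of-line hyphenation: join the two halves into one word.
    text = re.sub(r"-\n\s*", "", text)
    suspects = []
    # A token is a run of letters, optionally joined by internal hyphens.
    for token in re.findall(r"[A-Za-z]+(?:-[A-Za-z]+)*", text):
        word = token.lower()
        # Single-letter words are not checked; compounds must match whole.
        if len(word) > 1 and word not in dictionary:
            suspects.append(word)
    return suspects


known = {"soft-sector", "disk", "is", "not", "double", "sided", "diskette"}
sample = "A soft-sector disk is not a double-sided disk-\nette"
print(check_hyphenated(sample, known))
```

Here “soft-sector” passes because it was added in hyphenated form, “double-sided” is flagged even though both halves are known words, and the word broken across the line end is rejoined before checking.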

The fully integrated version of Microproof (using the optional word processing conversion and correcting options) is exceptionally powerful. When ordering the software, you must clearly state the model of your machine and (for the integration feature) the word processing software package you are going to use.

Non-technical types will appreciate the integrated version’s ease of use. While Hexspell and Proofreader (and standard Microproof) are called from DOS, the full-blown Microproof is available directly from inside the word processor. After writing your document, type “M” on the command line and Microproof will automatically load, scan your document, prompt for the dictionary diskettes, guide you through the correcting phase and reload your word processor and corrected document. From your point of view, it appears that you never left your word processor!

During the correction phase, typing a “+”, “?”, or “!” will, respectively, add the suspect word to the dictionary, show it in context, or abort the proofing process. The use of single keys speeds proofing and the use of shift keys adds some measure of protection against errors. Pushing the Enter key will leave the word as is and corrections are entered by simply typing in replacement words. All corrections are global. In one test, typing “Orchestra” replaced all 543 occurrences of “Adams” in the document. This is an unexpected plus because Scripsit, in comparison, could not have handled as many changes — just 255 global replacements are allowed — and would have taken much longer as well.

Microproof documentation is outstanding. The manual is thirty pages long and includes a table of contents. Though a wide variety of word processors, machines and operating systems are mentioned in the manual, the writing is quite clear. A complete step-by-step programmed learning course, using a sample text file (included on diskette), is provided. Using NEWDOS/80 2.0 in double density, you can put Scripsit and all Microproof files on one double density diskette with room left over.

[Table of spellcheck features: table from the original article as it appears in 80-U.S., showing the relative merits of the spell checkers.]

Recommendations and Considerations

There are some things that even the best automatic spelling dictionary cannot cope with. Consider the following two sentences:

Wants ponder dime, dare worst aye ladle gull culled Ladle Rat Rotten Hut.

Defeat other folks when other defense abhor detail.

Though they contain no misspelled words, both sentences make no sense unless they are said aloud, in which case you get:

Once upon a time, there was a little girl called Little Red Riding Hood.

The feet of the fox went over the fence before the tail.

This little demonstration might be summarized as: spelling isn’t everything.

In terms of recommendations, Cornucopia’s Microproof (the full version) would be ideal in a business environment. It is the fastest of the bunch and the ability to call it directly from and return to the word processor is a definite plus if it will be used by non-technical personnel. Cornucopia seems firmly committed to product improvement and their customer service is quite good.

If you have a one-drive system and a limited budget, Aspen’s Proofreader is an appropriate investment. It is simple, fast and accurate. Though it lacks any provision for editing the document being scanned, Aspen has released two programs (currently being evaluated) which will overcome this limitation. Proof-Edit ($30 Model I/III) will allow interactive corrections, much like Microproof. Grammatik ($49 Model I, $59 Model III) will check for grammar and punctuation errors and produce word frequency counts — just the thing for writers!

As far as personal preference is concerned, Hexspell is the local favorite. Though available only for Model I (as of this writing, mid-October) and requiring 48K and two drives, it is a bargain at $69.00. Hexagon had problems with delivery for a few months (thanks to Canada’s mail strike), yet went out of their way to accept telephone orders, questions, complaints and general conversation. Using Hexspell, there is a feeling of control and control is essential in proofreading.

Final Comment

All these programs will check spelling, but none can act without human judgment. You still need a book-type spelling dictionary to look up the words flagged by the programs. Highly recommended is Webster’s New World Speller/Divider, William Collins, publisher, $2.95. Unlike a regular dictionary, it contains nothing but words: no illustrations or definitions to slow you down. It lists 33,000 words, spelled and syllabified, and has a tough plastic cover. It works without a computer or even electricity…

Proofreader, Aspen Software, P.O. Box 339, Tijeras, NM 87059, (505) 281-1634, $54 (Model I), $64 (Model III), $109 (Model II).

Hexspell, Hexagon Systems, P.O. Box 397, Station A, Vancouver, BC V6C 2N2, (604) 682-7646, $69 (Model I).

Microproof, Cornucopia Software, P.O. Box 5023, Walnut Creek, CA 94596, (415) 524-8098, $89.50 (Model I/III), $149.50 (Model II), $60.00 (correction option), $35.00 (integration option).
