Variant Conversion Info (VARCON) Revision 4.1 August 10, 2004 Copyright 2000-2004 by Kevin Atkinson (kevina@gnu.org) This package contains tables to convert between American, British, and Canadian spellings and vocabulary as well as well as a table listing the equivalent forms of other variants. The latest version can be found at http://wordlist.sourceforge.net/. The abbc.tab contains mappings between American, British with "ise" spelling, British with "ize" spelling, and Canadian spellings. The fields are separated by a tab character and have the Unix EOL character. The first four columns are the spellings respectively. The last column is used to mark words where the American or British spelling is also used in the British or American spelling but only when the word has a certain meaning. American words that are also used in British / Canadian spellings are marked with a "A" while British words that are also used in American / Canadian spellings are marked with a "B". The file voc.tab is like abbc.tab except that it converts between vocabulary instead of spelling. If more than one word if often uses to describe the same thing the words are separated with commas. The last column contains additional notes on when the word is used. Unlike abbc.tab it is generally not suitable for automatic conversion. The file variant-also.tab contains additional mappings between various spellings of a word which are not necessarily region dependent. Only mappings that are not listed in abbc.tab are included in this mapping. No attempt is made to distinguish the primary form of a word. The file variant-infl.tab is like variant-also.tab except that it is created automatically from the AGID inflection database. The file variant-wroot.tab is like variant-infl.tab except that it also included the root form of the word. Both the translation array and variant table includes a lot of strange affixations of words which I have no intention of removing as some people might find them useful. The "make-variant" Perl script will combine abbc.tab, variant-also.tab, and variant-infl.tab into one huge mapping and will output the result to "variant.tab". If the "no-infl" option is given than variant-infl.tab will not be included. The "split" script will create 5 words lists from abbc.tab: american.lst, british.lst, british_z.lst, canadian.lst, and common.lst. The common.lst file contains words that were marked in the last column as explained above and the other four contain all the possible words that might be used by that particular country, included some of the words marked in the last column. The "translate" Perl script will translate a text file from one spelling to another. Its usage is: translate [] is any of -?,-h,--help this screen -m,--mark mark words where the translation is questionable -i,--include include words where the translation is questionable is the file name of the translation array, defaults to "abbc.tab". and are one of: american, british, british_z, or canadian. british-ise and british-ize can also be used. Text is read in from standard input and is outputted to standard out. Words are marked with a '?' before and after the questionable word when the option is enabled. If you discover any errors in these mappings, besides the strange affixations, or have suggestions for additions be sure and let me know at kevina@gnu.org. SOURCE: These mappings were compiled from numerous sources. The abc.tab was originally created from the American and British word lists found in the Ispell distribution and the Canadian word list created by Garst R. Reese : What I have discovered is that Canadian is a modification of British. Canadians use ize ization, izing izable like Americans, and gram instead of gramme. The one exception I found was practise. It does not go to practize. Otherwise they use British spelling. So, what I am currently checking books with is a an edited version of British, where I have changed all occurrences of ise to ize, isab to izab, isation to ization, ising to izing, and gramme to gram except I allow programme, which is sometimes proper unless you are talking about a computer program. I did bunches of greps to be sure these substitutions would work as expected. Many other words have been added to abc.tab which were not in the original Ispell word lists. Many different web sources were consuled when crating the tables. They include: The American-British British-American Dictionary http://www.peak.org/~jeremy/dictionary/dictionary.html American and British Spelling Differences http://www.peak.org/~jeremy/dictionary/spellcat.html Dave (VE7CNV)'s Truly Canadian Dictionary of Canadian Spelling http://www.luther.bc.ca/~dave7cnv/cdnspelling/cdnspelling.html Canadian Spelling Convention http://imej.wfu.edu/articles/1999/1/02/demo/tutorial/canas.html Cornerstone's Canadian English Page http://www.web.net/cornerstone/cdneng.htm Inter-Play Translation: British/Canadian/American Spelling http://www.inter-play.com/translation/spel-ukus.htm Inter-Play Translation: British/Canadian/American Vocabulary http://www.inter-play.com/translation/voc-ukus.htm As well as several online dicionaries: Marriam-Webster: http://www.m-w.com/ American Heritage: http://www.bartleby.com/61/ Cambridge (ESL): http://dictionary.cambridge.org/ CHANGELOG: From Revision 4 to Revision 4.1 (August 10, 2004) - Fixed various errors ib abbc.tab - Removed clause 4 from the Ispell copyright with permission of Geoff Kuenning. From Revision 3 to Revision 4 (August 7, 2004) - Added a column to "abc.tab" for the British "ize" spelling and renamed the file to abbc.tab. - Added verb forms of prize/prise to abbc.tab, removed from variant-also.tab From Revision 2 to Revision 3 (January 2, 2003) - Added an option for not including variant-infl.tab for the make-variant perl script - Added the file variant-wroot.tab - Added a few entries given to me by Francis Bond and Edward Betts From Revision 1 to Revision 2 (January 27, 2001) - Removed all "B" markers because I could not find any evidence for them - Corrected a few Canadian entries, especially those with the "B" markers - Added some more entries by trying fixed changes (such as ize to ise) to words in SCOWL and hand-checking over the ones with semi-common words in them. - Added variant-infl.tab COPYRIGHT: Copyright 2000-2004 by Kevin Atkinson Permission to use, copy, modify, distribute and sell this array, the associated software, and its documentation for any purpose is hereby granted without fee, provided that the above copyright notice appears in all copies and that both that copyright notice and this permission notice appear in supporting documentation. Kevin Atkinson makes no representations about the suitability of this array for any purpose. It is provided "as is" without express or implied warranty. Since the original words lists come from the Ispell distribution: Copyright 1993, Geoff Kuenning, Granada Hills, CA All rights reserved. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: 1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. 2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. 3. All modifications to the source code must be clearly marked as such. Binary redistributions based on modified source code must be clearly marked as modified versions in the documentation and/or other materials provided with the distribution. (clause 4 removed with permission from Geoff Kuenning) 5. The name of Geoff Kuenning may not be used to endorse or promote products derived from this software without specific prior written permission. THIS SOFTWARE IS PROVIDED BY GEOFF KUENNING AND CONTRIBUTORS ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL GEOFF KUENNING OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.