Unofficial Alternate 12Dicts package (Alt12Dicts)

Packaged by Kevin Atkinson

The files contained in this archive are the result of a rather
extensive conversation between me (Kevin Atkinson) and Alan Beale, the
author of the 12Dicts package. I can be contacted at kevina@gnu.org
and Alan Beale can be contacted at biljir@pobox.com. This archive
contains almost all the information originally found in release 4 of
the official 12Dicts package, but in a different format, as well as a
good deal of additional information.

This version has been updated with information from version 6.0.2 of
the official 12Dicts package. Versions 5 and 6 of 12Dicts include a
number of new files not found in version 4; this package does not

The latest version of this package and the official 12Dicts package can
be found at http://wordlist.aspell.net/

The file README-orig contains the original Readme file distributed
with the official 12Dicts package. README-infl contains the Readme
file for 2of12infl.txt, and README-agid contains the Readme for
AGID, which 2of12infl.txt is based on.

All of these files have been explicitly placed in the Public Domain by

2of12full.txt description:

The file 2of12full.txt contains all the words appearing in more than
one of Alan Beale's source dictionaries. Each line contains five
numbers, being the total number of dictionaries, the non-variant
entries, the variant entries, the non-American entries and the
"second-class" entries (appearances without a separate definition).
Counts of zero are replaced by hyphens. For instance, the entry

  7: - 2# 5& -= aeroplane

indicates that the word "aeroplane" is listed in 7 of the
dictionaries. None list it as a primary American word, 2 list it as a
variant form, 5 list it as a non-American word, and none list it
as a second-class word. Note that words may be marked with a "&" for
either of 2 reasons. They may represent a non-American spelling of an
American word, such as "aeroplane" or "gaol", or they may represent a
word not normally used in American English, such as "bloke" or
"lorry". Also note that there are two main kinds of second-class words
- ones listed in the entry for another word without definition
(usually associated with the suffixes -ly, -ness or -er/or), and ones
appearing in a list of undefined words with a common prefix. Finally,
observe that the numbers of non-variant, variant and non-American
entries will sum to the total dictionary count, while the second-class
entry count is independent of them, except that of course it is never
greater than the total count.

Words followed by a colon (":") are abbreviations which are
entirely lower-case and alphabetic.
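
The entry format described above can be read with a short script. The
following is only a sketch in Python, not part of the package: the names
FullEntry and parse_full_line are made up for illustration, and it
assumes the six fields are separated by whitespace in the order shown in
the "aeroplane" example. It also checks the two relationships between
the counts stated above.

```python
from dataclasses import dataclass


@dataclass
class FullEntry:
    """One parsed line of 2of12full.txt (hypothetical helper type)."""
    word: str
    total: int
    non_variant: int
    variant: int
    non_american: int
    second_class: int
    is_abbreviation: bool


def _count(field: str, suffix: str) -> int:
    """Turn a field such as '2#', '-=' or '-' into an integer count."""
    if not field.endswith(suffix):
        raise ValueError(f"expected {suffix!r} at the end of {field!r}")
    digits = field[: len(field) - len(suffix)]
    # A hyphen stands for a count of zero.
    return 0 if digits == "-" else int(digits)


def parse_full_line(line: str) -> FullEntry:
    """Parse a line, assuming whitespace-separated fields (an assumption)."""
    total_f, nonvar_f, var_f, nonam_f, second_f, word = line.split(maxsplit=5)
    word = word.strip()
    # A colon after the word marks a lower-case alphabetic abbreviation.
    is_abbr = word.endswith(":")
    if is_abbr:
        word = word[:-1]
    entry = FullEntry(
        word=word,
        total=_count(total_f, ":"),
        non_variant=_count(nonvar_f, ""),
        variant=_count(var_f, "#"),
        non_american=_count(nonam_f, "&"),
        second_class=_count(second_f, "="),
        is_abbreviation=is_abbr,
    )
    # The first three counts sum to the total; second-class never exceeds it.
    if entry.non_variant + entry.variant + entry.non_american != entry.total:
        raise ValueError(f"counts do not sum to total: {line!r}")
    if entry.second_class > entry.total:
        raise ValueError(f"second-class count exceeds total: {line!r}")
    return entry


if __name__ == "__main__":
    print(parse_full_line("7: - 2# 5& -= aeroplane"))
```

Checking the sums while parsing is a cheap way to notice a line that does
not follow the assumed layout.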

This file contains almost all the information found in the normal
12Dicts with two exceptions:

  1) "Signature words" which did not appear in at least two dictionaries
     are not included in 2of12full

  2) The sources used differ in one respect from those for the 2of12 and
     6of12 files. See README-infl for a full description.

signature.txt description:

The file signature.txt contains a list of signature words. Signature
words are words which are not in at least 6 dictionaries but which
Alan Beale thought should be included at the 6of12 level (see
README-orig). Examples of some of the sorts of words are

  1. Words of the same category as other included words. An example is
     the astrological sign "Cancer", which alone of all the
     astrological signs fails to appear in 6 or more of the
     dictionaries. Similarly, the omitted holiday "Christmas Eve" was
     added.
  2. Vulgarities, sexual terms and insults. Some such words were
     already included, but most of the source dictionaries were quite
     squeamish about them. These words are very widely known indeed;
     I hold that any list of "common" words which does not include the
     infamous f-word is simply discredited thereby. Some may feel that
     it would have been better to leave some or all of these terms
     unmentioned. Nevertheless, the expression of blasphemy,
     unwarranted contempt, and perverse lust, whether in words or in
     deeds, is a very human trait. Suppressing the evidence of these
     aspects of the human condition in our language makes no more sense
     than excluding "leprosy", "gangrene" and "dementia", no matter how
     unpleasant they may be to contemplate.
  3. Conventional conversational phrases so common as to be practically
     invisible to native speakers. Examples are "thank you", "good
     night", "uh-huh", "of course" and "gesundheit".
  4. Sports terminology, especially for football and baseball.

signature2.txt description:

The file signature2.txt contains inflections of irregular verbs not
explicitly mentioned in 2 source dictionaries, such as "outfought" and

variants.txt description:

The variants.txt file contains a subset of the words appearing in at
least one of the 12 source dictionaries marked as variants or
non-American. This list contains only the words which are spelling
variants; words which represent different ways of saying the same
thing (such as "henceforward" as a variant of "henceforth") and
non-American words without a similar American form (such as "telly")
have been removed. Each entry is followed by a tab and a notation
indicating which of several classes the word falls into. To describe
the classes, it is best to do a little algebra. Let NV be the total
number of non-variants, A the number of American variants, B the
number of non-American variants, and V=A+B. Then the following
annotations are to be interpreted as follows:

  #? - A >= B, 0.65*NV < V <= NV
  &? - A < B, 0.65*NV < V <= NV

Simplifying, the choice between # and & indicates which variety of
variant dominates, while ! and ? indicate a stronger or weaker than
average agreement on variance.
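
The two conditions listed above can be written out directly. The
following Python sketch is an illustration only: question_class and
parse_variant_line are made-up names, and the function covers just the
"#?" and "&?" classes, since those are the only ones whose conditions
appear in this section. The comparison 0.65*NV < V is done in whole
numbers (65*NV < 100*V) to avoid floating-point rounding at the
boundary.

```python
from typing import Optional, Tuple


def question_class(nv: int, a: int, b: int) -> Optional[str]:
    """Return '#?' or '&?' when 0.65*NV < V <= NV, otherwise None.

    nv: number of non-variant listings (NV)
    a:  number of American variant listings (A)
    b:  number of non-American variant listings (B)
    """
    v = a + b
    # 0.65*NV < V, rewritten with integers so the test is exact.
    if 65 * nv < 100 * v and v <= nv:
        return "#?" if a >= b else "&?"
    # Any other class is outside the two conditions shown above.
    return None


def parse_variant_line(line: str) -> Tuple[str, str]:
    """Split 'word<TAB>notation' into its two parts."""
    word, _, notation = line.rstrip("\n").partition("\t")
    return word, notation


if __name__ == "__main__":
    print(question_class(10, 5, 3))
    print(question_class(10, 2, 5))
    print(question_class(10, 3, 3))
```

With NV=10, A=5 and B=3 we have V=8, which lies between 6.5 and 10 with
A >= B, so the first call gives "#?". The second call (A=2, B=5, V=7)
gives "&?", and the third (V=6) falls outside the range and gives None.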

Additional notes on the list from Alan:

  I should note a couple of other characteristics of this file. First
  of all, there are cases where spellings exist which are clearly
  variants of one another, but where this is not recognized by the
  source dictionaries. An example is the pair "levelheaded" and
  "level-headed". These are clearly the same word, but none of my
  sources lists both of them. I have chosen not to go beyond the
  source dictionaries and put such words on the variants list, even in
  obvious cases like this one.

  I should also note that there are cases where the question of
  whether 2 words are spelling variants or actually different words is
  not easy to answer. For instance, consider the pairs
  "lengthways"/"lengthwise" or "toward"/"towards". I've simply made
  whatever decision seemed best to me in cases like this ("lengthways"
  is a variant, "towards" is not), but recognize that any other
  observer (who could bring himself to care) would be likely to
  occasionally disagree.

Variants.txt has not been updated for release 6, as critical
information about how the list was constructed has unfortunately been

abbr.txt description:

This file contains (almost) all the abbreviations and acronyms from
the 12Dicts sources. Abbreviations which also appear in a list of common
personal names (of about the same completeness as the ESL dictionaries)
are followed by a tilde ("~"). There are still likely to be
some abbreviations not marked with a tilde that match less common

Additional notes from Alan:

  For words containing upper-case, I [Alan Beale] had not recorded
  whether a word was an abbreviation, so I was forced to remove the
  non-abbreviations from the list by hand. Because of the need to
  remove non-abbreviations, I limited myself to consideration of
  upper-case words of 6 or fewer characters. It is possible that a
  small number of acronyms or abbreviations longer than 6 characters
  might have been missed.
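
The tilde marker described for abbr.txt is easy to separate from the
abbreviation itself. The following Python sketch is an illustration
only: parse_abbr_line is a made-up name, it assumes one abbreviation per
line with the tilde as the last character, and the sample entries in
the demo are invented rather than taken from the file.

```python
from typing import Tuple


def parse_abbr_line(line: str) -> Tuple[str, bool]:
    """Return (abbreviation, also_a_personal_name) for one line.

    Assumes one entry per line, with a trailing '~' on entries that
    also appear in the list of common personal names.
    """
    entry = line.strip()
    if entry.endswith("~"):
        return entry[:-1], True
    return entry, False


if __name__ == "__main__":
    # Invented sample entries, for illustration only.
    for sample in ["Al~", "NASA"]:
        print(parse_abbr_line(sample))
```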

variant-notes.txt description:

The file variant-notes.txt contains some additional notes on
questionable variants sent to me when I pointed out that "nought" was
not marked as a variant.

2of12id.txt description:

2of4brif.txt, 3esl.txt, 5desk.txt, and neol2016.txt description:

These files are identical to the original files in the 12Dicts package.
See README-orig for more info.

neol2016.poss description:

Possessive forms for words in neol2016.txt. (Created by hand by
Kevin Atkinson, not provided by Alan).

signature3a.txt description:

The signature phrases from 3of6all.txt.

signature3g.txt description:

The signature words from 3of6game.txt.

signature4lem.txt description:

Extra head words added to 2+2+3lem to add British/American versions of
words when only one form was present, plus a few other words added for

signature4cmn.txt description:

Some very common abbreviations, capitalized words and contractions not
present in the BYU data, added to 2+2+3cmn.txt.

5d+2a.names2016.txt description:

A short list of names of renowned individuals since 1999 (plus one
government program and one social media site), added to 5d+2a.txt.