I’m saying goodbye to 2016 in appropriate fashion: spending time with my family, eating a lot, fighting a cold, and studying word things.

Over the years that I’ve been at this word study and teaching and training thing, I’ve encountered references to a 1966 study known as The Stanford Spelling Survey, by Hanna, Hanna, Hodges, and Rudorf, four professors of education who analyzed 17,310 English words and wrote up their research in an article that’s cited over and over and over.  From this analysis of less than 2% of English words and a lot of number crunching, Hanna et al. concluded that English is 67% “regular.” That study has been used as the foundation of so much of modern phonics, including pedagogical decisions based on what patterns are considered “regular,” “common,” and “exceptions.”
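
To put that “less than 2%” in perspective, here is a quick back-of-the-envelope sketch. The sample size is the one reported for the study; the candidate lexicon totals are purely illustrative assumptions, not figures from Hanna et al., since any count of “English words” depends on what you decide counts as a word.

```python
# Sample size reported for the 1966 study.
SAMPLE_SIZE = 17_310


def implied_minimum_lexicon(sample_size: int, percent: int) -> int:
    """Lexicon size at which the sample equals exactly `percent` percent."""
    return sample_size * 100 // percent


def share_percent(sample_size: int, lexicon_size: int) -> float:
    """Percentage of an assumed lexicon that the sample covers."""
    return sample_size * 100 / lexicon_size


# For 17,310 words to be under 2% of English, the lexicon must exceed this total.
threshold = implied_minimum_lexicon(SAMPLE_SIZE, 2)
print(f"Lexicon must exceed {threshold:,} words")

# Hypothetical lexicon sizes (assumptions for illustration only).
for assumed_total in (500_000, 1_000_000):
    share = share_percent(SAMPLE_SIZE, assumed_total)
    print(f"{assumed_total:,} words -> {share:.2f}% sampled")
```

In other words, the “less than 2%” figure holds only if English is taken to have more than 865,500 words; under a smaller assumed total, the sample covers a larger share.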

This 50-year-old phonocentric study was brought to my attention again while I was working on my dissertation this past week, and also by a comment on my last post which I did not publish out of deference to the writer, who, like me, is a business owner with a public profile; unlike me, she runs a phonics center that trains people in Wilson and LETRs and other shopkeeping packages that I’ve countered with linguistic evidence many times before.  She wrote a comment to argue that the “frequency of occurrence with regard to nonsense words” matters, and cited a table from a 2010 book (which I have) that was copied from a 1976 book (which I also have), which itself was citing an article from 1966 (which I also have), that was in turn built on one author’s question from 1949 (yes, I have that too).

Paul Hanna’s 1949 question was “regarding the correspondences [of graphemes and phonemes] and their consistency in spelling,” as explained in the 1966 article. Twice I was directed to that 1966 article in my studies this week; there are no coincidences. As I said, I run into citations of that study frequently. It’s common. But this week’s two encounters were louder in my head than usual.  My email response to the LETRs Lady was clear and direct: I explained clearly that the “frequency of occurrence” of nonsense words is zero, and the “frequency of occurrence” of actual phonemes and graphemes in nonsense words is zero. The only evidence she had given me at all was a citation of a book citing another book citing an article, right? So I decided to trace it back to its source.

That table (which can be googled) was first published by Elsie D. Smelt in 1972 and has been cited widely since; her figures are taken from the 1966 Stanford Study. Smelt’s table says that “the most common way of writing each vowel sound is with one letter,” and this claim is attributed to the Stanford study as well. But what exactly do we mean by “common” or “frequent,” and how does that knowledge help readers and spellers? While single-letter vowel spellings may be the default grapheme for “long” and “short” vowel phonemes, spelling and reading strategies are not based on statistical calculations by proficient readers. Moreover, while we have only 6 single-letter vowel graphemes, we have more than 30 vowel digraphs and trigraphs, a ratio that troubles the notion of single letters being the “most common” spelling.  Let’s see what Hanna et al. actually say.

Here’s the basic framework they offer:

“These structural components of oral language include: (A) the phonetic reservoir from which a phonemic code is selected, (B) the phonemic base, (C) the morphological base, that is, the arrangement of phonemes into speech units which minimally express meaning, (D) the syntactic and grammatical base, that is, the arrangement of morphemes into syntactic patterns, and (E) the semantic base, which conveys meanings in terms of the conceptual system of a language community.” [I’ve replaced their numbers with letters to make this post easier to write.]

Two things struck me right away: first, that these educators at least acknowledge a distinction between phonetic and phonemic concerns, which is more than I can say for many present-day phonics resources; and second, that they — and everyone who has followed in their formidable footsteps — have the way a language works totally backwards. Now, they’re talking about oral language rather than written, but the point is the same: you don’t start with phonetics and end up in meaning; rather, you start with meaning and from there, can analyze words (lexemes) into their sublexical (smaller-than-word) structures, including morphemes, phonemes, and the graphemes that pinpoint and reveal them.

In the word study I’m engaged in, we ask four questions:
(1) What does it mean?
(2) How is it built?
(3) What are its relatives?
(4) What segments and features of pronunciation matter to meaning? These segments are the only ones that are  revealed in the spelling.

Question 1 has to be first — there’s no point in knowing how to write a word whose meaning you don’t know.  And Question 4 has to be last — you can’t figure out the orthographic phonology until you have evidence for the other pieces. But Questions 2 and 3 can and do toggle considerably in any investigation. So you start with meaning, and you stay rooted in meaning all the way through. What does it mean?  And even Question 4, which deals with pronunciation, only concerns itself with aspects of pronunciation that matter to the meaning. So it’s the Stanford Study’s fifth and final concern — semantics, “the conceptual system of a language community” — which is where we actually need to start.

Our second question, How is it built?, is captured more or less in the Study’s third and fourth concerns, in which “the morphological base” and “the arrangement of morphemes” are considered. They define morphology as “the arrangement of phonemes into speech units which minimally express meaning.”

Oh if only there were some way to make those “speech units” that we use to “express meaning” visible!

Working backwards still, the Study’s second concern is phonology, the “phonemic base.” The reason there’s any fifth piece is because they’re talking about oral language, so phonetics is a thing because it’s actually spoken, and because although they differentiate phonetics from phonemics, they don’t seem to have any idea in the article that phonetics has nothing to do with orthography.

Of course, the Stanford Spelling Study doesn’t even mention etymological relatives, because it has no idea about the etymological governance of graphemes. It can tell you that 10% of the 17,000 words  that have /i:/ are spelled with <ee>, and 10% are spelled with <ea>, but it can’t tell you why <beech> and <beach> make sense. This study knows nothing about etymological markers or why words have a single, final, non-syllabic <e>. We know better now, so why is 21st-century so-called reading research still so married to a half-century-old, roundly debunked understanding of graphemes?

Seriously, professionals need to stop embarrassing themselves by clinging to these relics.

I also took a look at the numbers and at the phonemic and graphemic inventories used by this seminal study. It’s a bloodbath. I am not exaggerating. The phonemic inventory is lifted directly from the Merriam-Webster Dictionary, which is important, because even if dictionaries were actually right about everything (they’re not), we’re still talking about a dictionary that has been updated and changed multiple times, including with regard to its pronunciation key, over the past 50 years. So the “research” that people want me to consider is based on a 50-year-old dictionary, interpreted by 50-year-old research, cited 40 years ago, and then re-cited in very recent years, none of which is evidence of anything at all about the language other than what cruddy research practices we have in literacy education.

The authors themselves “readily admit[] that this pronunciation key [from the Merriam-Webster Dictionary] has several critical weaknesses.” They also acknowledge that linguists don’t always agree about everything, and that their graphemic inventory (which was all about how easily a computer could process 17,310 words) was also flawed: “Unfortunately, complete consistency with this criterion could not be maintained, and so some exceptions to this general rule will be found among the list.” So we’re in exception-land, which is really not science. They do ask questions like “Is <I> a part of the graphemic option <TI> or <IO> in nation? In conscience, is <I> a part of the graphemic option <SCI> or <IE>?”, and they conclude that “Again linguists disagree upon this point.”

Well, folks, linguists may have disagreed on that point a half century ago, but orthographic linguists don’t disagree about it now. I already laid out proof in another post that there’s no <ie> in conscience, no matter that Louisa Moats says there is as though she proved it (she didn’t). Linguistics is a science, and we know more now about these kinds of questions. We have better tools now than we had 50 years ago, like the lexical word matrix, the orthographic word sum, the mini matrix maker, and the Online Etymology Dictionary, and better, faster ways of disseminating and discussing investigations and new information (in real-time online classes, on editable websites, and on social media). We don’t have to carry around some dusty old misunderstanding like it’s our last keepsake from our long lost Pappy.

For reals, why are professionals — researchers and educators, of all people — clinging to 50-year-old research that didn’t even conceive of today’s scientific tools? Can you imagine if a surgeon or a rocket scientist did that? Mayhem. Can you imagine if we elected someone who ignored and denied modern climate science as President? Oh, wait… Sigh.

Science matters. Understanding the difference between factual, physical evidence, scientific consensus, and the repeated sub-letting of citations from, uh, wherever, something sciency-sounding, is just so critical to everything.

Among the lettery circus freaks that the Stanford Study offers in its admittedly troubled graphemic inventory are a *<bt> in debt, a *<ua> in guard and a *<cc> in occur. In real life, the <b> in debt is an etymological marker (debit); the <u> in guard, guarantee, guerilla, guest, etc., is part of a <gu> digraph that can mark an etymological relationship to cognates with a <w>: guard~warden, guarantee~warrantee, guerilla~war, guide~guise~guywire~wit~witness (‘to see’), guile~wily. And as any regular reader already knows, the two <c>s in <occur> are each in separate morphemes. Positing a *<cc> there is like saying that there’s an <ea> in react or a <th> in hothouse. Big fat can of graphemic nope.
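
The morpheme-boundary point can be sketched in a few lines of code. This is only an illustration under stated assumptions: the function names are made up for this sketch, the word sums are supplied by hand (the analysis of occur as oc + cur is my assumption about the two morphemes), and the check only handles word sums whose elements join without any spelling change.

```python
def letter_spans(morphemes):
    """Return (start, end) index pairs locating each morpheme in the joined word."""
    spans, start = [], 0
    for morpheme in morphemes:
        spans.append((start, start + len(morpheme)))
        start += len(morpheme)
    return spans


def sequence_within_one_morpheme(morphemes, sequence):
    """True if some occurrence of `sequence` lies wholly inside a single morpheme."""
    word = "".join(morphemes)
    spans = letter_spans(morphemes)
    pos = word.find(sequence)
    while pos != -1:
        end = pos + len(sequence)
        if any(start <= pos and end <= stop for start, stop in spans):
            return True
        pos = word.find(sequence, pos + 1)
    return False


# Hand-supplied word sums paired with the letter sequence under suspicion.
examples = [
    (["oc", "cur"], "cc"),     # assumed analysis of <occur>
    (["re", "act"], "ea"),
    (["hot", "house"], "th"),
    (["beach"], "ea"),         # here the sequence sits inside one morpheme
]

for morphemes, sequence in examples:
    verdict = sequence_within_one_morpheme(morphemes, sequence)
    print(" + ".join(morphemes), f"<{sequence}>", "candidate grapheme" if verdict else "straddles a boundary")
```

A letter sequence that only ever straddles a boundary cannot be a grapheme in that word; a sequence that sits inside one morpheme is merely a candidate, and deciding whether it really is a grapheme still takes the kind of evidence discussed above.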

I could go on and on and on and on, but I’m gonna go hang out with my kid and watch a ball drop on this crazy calendar year. I’m not much for resolutions, but I’d welcome resolve to move into 2017 not clinging to antiquated phonics research like it’s a bible or a gun and something evil is after you.

I’m sorry that modern phonics is built on a rickety, outdated, dismantled, misguided, misquoted old study. I’m not sorry for pointing it out, and I’m not sorry for yelling a little. If you were clinging to a life raft of the same age and quality and I had a new speedboat, I’d be yelling just as loudly to save your life as I am now.

For the last couple of days, I’ve been running into a lot of online phonics apologia about the use of nonwords, nonsense words, pseudowords, word-attack words, phonemic decoding items, and/or so-called “detached syllables,” in instruction, intervention, and assessment. For starters, the fact that these things have so many different names should cue us in that they are not an actual thing, not a scientific thing, anyhow (just like so-called “sight words”). They are not an actual category, if for no other reason than that many of the examples I’ve encountered over the years are actually real words in use in English, like cam, pate, lander, din, rayed, oft, knap, sedge, bi, [P]og, ta, lat, lum, barchan, and a lot more. Some people collect stamps; I collect linguistic scat from literacy educators and I study it.

People like to argue that nonwords are an effective means of teaching or assessing a student’s knowledge of what they call “grapheme-phoneme correspondences,” or GPC. But every single one of these nonword materials and studies misapprehends both a G and a P, as evidenced by such fabricated baloney as the “quadgraphs” [sic] like *<ough> and *<eigh>, and by the failure to even consider the difference between phones and phonemes. The fact of the matter in our writing system is that no G has a C to a P outside of an M, and M stands for morpheme. Once you remove phonemes and graphemes from a meaningful context, they’re no longer phonemes or graphemes.

To a resource, they all erroneously assume phonological primacy; that is, they remove orthographic phonology from its meaningful context because they wrongly assume that it’s primary to the meaningful constraints and influences of morphology and etymology. That very practice effectively means it’s no longer phonology, because phonology — including phonemes and the graphemes that spell them — is distinctive for meaning and it’s language-specific; nonwords are neither. It is noncontroversial that English orthographic phonology is delimited and constrained by meaning, structure, and history, regardless of how that fact makes people feel.

More than one person has suggested that nonwords were the only way to “break” a student of the habit of guessing at words, often in isolation. Well, you can break an overeating habit by taking up smoking, too, and you can kick a heroin habit by taking up methadone, but that doesn’t mean that the new habits have no harmful consequences. I’d rather focus my scholarship on what I can build than on what I can break.

I’ve also heard from a number of people working with “older” children who are called “treatment resisters” or “treatment fatigued” — kids who spend YEARS in Barton or Wilson and never get past so-called closed and open syllables [sic]. They may begin to “read” better (depending on what you think “read” means), but they continue to spell and write years behind their eulexic peers, largely spelling everything based on the way they pronounce it, because that’s exactly what they’ve been taught to do. I’ve heard from teachers and parents of children who read years ahead of their peers too, kindergarteners who read 3rd grade chapter books with ease, but have no idea how to spell or how to “decode” unfamiliar words, so they’re subjected to nonword drills in order to “measure” their “knowledge” of “alphabetics” or of GPCs.


So here is my analysis of all of the nonwords featured on a publicly available assessment called The Nonword Reading Test. The test instructions say “Either a regular or an irregular pronunciation is acceptable,” but no definition of “regular” or “irregular” is offered beyond the note that, for <soser>, “soaser” would be “regular” and rhyming with “loser” would be “irregular.”

First of all, there is NO ENGLISH WORD spelled with the sequence <oaser>, or even with an <oase> to which we could add an <er>. So how on God’s Grapheme Earth is that “regular”? Moreover, they do NOT specify how the <oa> or the medial <s> in “soaser” would be pronounced. Is the <oa> pronounced as in boat or as in broad or as in oasis? Is the <s> pronounced as in wiser, eraser, or pleasure? And how is the child or the teacher supposed to know or understand that?

You know why <loser> is spelled with an <o>? Because <looser> is a different word, and <lose> is cognate to <loss> and <lost>. What’s “irregular” about that? Just because teachers and researchers and psychometricians are generally ignorant to that breathtaking fact makes it no less a breathtaking fact. Context matters to so-called GPCs. Otherwise they’re neither Gs nor Ps, and any Cs you think are there are not real.


While we’re talking about <soser>, we may as well take a closer look at, um, <closer>: in “this street is closer than that street,” the medial consonant is [s]. In “he’s the best sales closer of the month,” it’s a [z]. Those two examples have two different suffixes that happen to be spelled and pronounced the same, but don’t share a meaning! The ONLY way you know how to pronounce that word is if you know what it means. And that’s not even considering the pronunciation of the <s> in <closure>.

My analysis provides incontrovertible evidence against the motivating characteristics of all nonword resources: That dusty old crooked Assumption of Phonological Primacy.

The CrAPP.

Here’s the list from this test, along with English words I provide that share (some of) the same sequences of letters. If it feels like some kind of shameful hell for you to read through these, just imagine you’re a 12-year-old dyslexic with an IQ of 138. Or really, anyone.

One Syllable
1. plood: food, good, blood

2. aund: auberge, auto, Auschwitz — and <aunt> can rhyme with pant [ænt], haunt [ɔnt], or font [ɑnt], depending on your dialect.

3. wolt: colt, but also wolf, wolverine, woman, word, work, worm

4. jint: pint, lint — in many dialects lint and lent rhyme.

5. hign: sign, malign, benign, but signal, malignant; also hour, honor, and herb.

6. pove: shove, move, stove

7. wamp: ramp, swamp, swam

8. cread: bread, bead — for crying out loud, <read> is both [riːd] and [rɛd] — and how about create, or triad?

9. slove: glove, stove, prove — haven’t we been here before?

10. fongue: tongue, fondue, wrong, humongous, segue

11. nowl: bowl, fowl, snow, now, lowly, bowlegged

12. swad: swan, swam, swamp (is there an echo in here?)

13. chove: choir, cholera, chop, chef, and see pove and slove

14. duede: suede, due, clued, cued, swede, educate

15. sworf: sword, swollen, sworn, swore, word, work

16. jase: base, phase — vase, for crying out even louder, can be [veɪs], [veɪz], or [vɑz]

17. freath: breath, wreath, great, smooth

18. warg: war, warm, forward, wary, argue (there is no English word that ends in <arg>; if it’s a detached syllable, then what about larger?)

19. choiy: the graphemes <oi> and <y> are never, ever in sequence. Even <iy> is tightly constrained: that sequence is either across a morpheme boundary (as in multiyear) or in a non-English word, like teriyaki or aliyah. Consider joy and soy and bok choy.

Two Syllable (so much for that ‘detached syllable’ rationale)
1. louble: double, rouble, boucle, tousle, loud

2. hausage: sausage, usage, garage, stage, courageous, also hour and honor and herb again.

3. soser: loser, poser

4. pettuce: lettuce, induce; petty has a double <t>; petting has a doubled <t>; flattop has neither.

5. kolice: police, policy (some people say POlice), malice, preslice. And why does this have a <k> before an <o>?

6. skeady: steady, beady, skean

7. dever: clever, fever — hell, lever can be both [‘lɛvɚ] and [liːvɚ]!

8. biter: This is not a nonword. It’s a word: “My new puppy is a biter.” Nonetheless, if it were, say, <piter> instead, notice writer, whiter (note the different <er> suffixes), liter, arbiter

9. islank: island, mislay, Islam, mankiller, and anyhow, vowel pronunciation is often disputed before [ŋ], but the orthographic phonology is revealed by the graphemes.

10. polonel: colonel, colony, colon, polish, police, Polish — what in the hell can *polonel tell you about anything at all? Someone please make it stop.

11. narine: This is actually a word; it means “pertaining to the nostrils” or the same as “narial.” Criminy, is your google broken? But also, marine, margarine, alkaline, urine, line, incline…

12. kiscuit: biscuit, intuit, circuit, circuitous, recruit, and how about Jesuit? The Jesuits have always valued knowledge and evidence.

Why 19 monosyllables? Why 12 disyllables? Why 31 total? Only the <shade + ow> <know + s>.
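
Those totals, and a few of the letter tallies for this list, can be checked mechanically. The sketch below is only that: the variable names are mine, and it counts letters, not graphemes, which is exactly the kind of count that cannot tell you whether a given <u> or <g> is a grapheme or a marker.

```python
from collections import Counter

# The test items exactly as listed above.
ONE_SYLLABLE = [
    "plood", "aund", "wolt", "jint", "hign", "pove", "wamp", "cread",
    "slove", "fongue", "nowl", "swad", "chove", "duede", "sworf", "jase",
    "freath", "warg", "choiy",
]
TWO_SYLLABLE = [
    "louble", "hausage", "soser", "pettuce", "kolice", "skeady", "dever",
    "biter", "islank", "polonel", "narine", "kiscuit",
]
ALL_ITEMS = ONE_SYLLABLE + TWO_SYLLABLE

# Raw letter counts across every item (letters, NOT graphemes).
letter_counts = Counter("".join(ALL_ITEMS))

# Items sharing the <ove> letter sequence at the end.
ove_items = [item for item in ALL_ITEMS if item.endswith("ove")]

# Items containing the letter <v> anywhere.
v_items = [item for item in ALL_ITEMS if "v" in item]

print(len(ONE_SYLLABLE), "+", len(TWO_SYLLABLE), "=", len(ALL_ITEMS))
print("items with <v>:", v_items)
print("items ending in <ove>:", ove_items)
print("most frequent letters:", letter_counts.most_common(5))
```

Run as-is, it confirms 19 + 12 = 31 items, and that three of the four items containing a <v> end in <ove>.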

This “test” features the following rough distribution of graphemes, depending, for example, on whether the <s> in <islank> and the <g> in <hign> are supposed to be graphemes or markers, or on whether the <ui> in <kiscuit> is one grapheme (bruise) or two (intuit). Those are just a few examples of the ascientific foolishness embedded in here that makes a real scientific analysis challenging:

b (3)
c (4, including both [k] and [s])
d (5)
f (3)
g (2-3, [g] and [ʤ] and [∅])
h (0-2, initial only, which could be French markers)
j (2, initial only)
k (4, including the unconventional *kolice)
l (9-10, including *polonel. Honestly.)
m (1)
n (9)
p (4, initial only)
r (2, initial only)
s (8-9, most of which have multiple possible pronunciations)
t (5, including <tt>)
v (4, of which 3 are in an <ove> rime)
w (3-4, initial or following <s>)
ch (1, initial only)
th (1, final only)
gue (1, or maybe it’s a <g> followed by a <ue>, as in argue, or a <g> followed by a <u> and an <e>, as in segue. Who knows?)

That’s 17 of 20 single-letter consonant graphemes (x, y, and z didn’t rank), two digraphs (out of more than two dozen), and whatever the heck <gue> is supposed to be. Why are <n> and <l>, which each have a single phonemic association, as important as <s>, or more important than <c> or <ch>, which all have multiple pronunciations?

I so want to cuss right now. FFS: the middle F stands for Fonics, though.

a (4-5)
e (3)
i (8)
o (8-9, including whatever the hell is up with *polonel)
u (0-2, depending on whether the <u> in *duede or in *fongue is a grapheme or not)
y (2)
ar (2)
or (1)
er (3)
au (2)
ea (3)
oi (1)
oo (1)
ou (1)
ow (1)
ue (0-1)
ui (0-1)
Final non-syllabic <e> (10, of which 3 are in an <ove> rime)

This includes 5 or all 6 of the single-letter vowel graphemes, but <i> and <o> are featured 2-3 times as much as <a> and <e>. It also includes three of many rhotic vowel spellings (why <or> but not <oar>, <ore>, <oor>, or <our>, which can all spell [ɔɹ]?). It also includes 6-8 vowel digraphs (out of around 30) and zero vowel trigraphs (we have two). This doesn’t even include half of the orthography’s vowel graphemes, the vast majority of which are digraphs. You know why <feat> has an <ea> and <feet> has an <ee>? I can give you at least two good reasons for each word. And they make total sense.

How is this nonword GPC inventory in any way representative of any kind of coherent “knowledge” about graphemes, phonemes, or their alleged correspondences? It’s just not. Whoever slapped it together — as with every single nonword resource I’ve ever seen, used, or recently investigated — has no idea that <w> can mark the phonology of a subsequent <a> or <o>, or that an <ove> rime has multiple possible pronunciations. I can think of at least three good reasons why <move> is spelled with an <o>; nonwords can’t think of a single one.

As my good and wise friend and colleague says, if a child writes *<dun> instead of <done>, you have all the information you need that he already owns the CrAPP concept of GPCs, and that it’s already doing its damage.

Can anyone offer any explanation that makes this kind of nonsense anything other than a sadistic but nonlethal method of collecting meaningless data about meaningless “knowledge” about meaningless “patterns”? I welcome any and all nonsense word measures. I guarantee you I can find you massive problems with any one of them.
Ighm aul ierse. Doar’z oapon.

Several days ago, a friend’s Facebook comment got me to thinking about the word pink. I like pink. And pink things. Probably to a pinkfault. I still daydream about a pink-rhinestone-covered stapler a former colleague had. I have pink pillow shams, lots of pink clothes, pinkish boots, a pink flashlight, and a pink lampshade. I can’t resist snapping photos of pink sunrises and sunsets from my hilltop home. I need a new pink purse because I’ve worn out the last one. I even made the instruction cards in my first InSight Words deck pink.

So the word was stuck in my head for a few days, which means it had to be investigated if I had any hope of accomplishing anything else. It turns out there are no fewer than seven different base elements spelled <pink> in English:

  1. The color pink  is named for the flower.
  2. The flower (Dianthus) may be named for its ‘pinked’ edges (perforated or punctured); think pinking shears. Or it may be named for pink eyes: not conjunctivitis, mind you, but an early Modern English phrase on loan from the Dutch pinck oogen, ‘small eyes,’ referring to the flowers’ appearance, reminiscent of small, half-closed eyes. The pink in these pink eyes doesn’t historically refer to the color, but to size.
  3. The first hypothesis for the flower’s name, its ‘pinked’ edges, is its own etymological wild goose chase. Found today mostly in reference to sewing or design, this <pink> may be related to Germanic words like peck, pick, and/or pike, or to Latinate words like puncture, poignant, pungent, punch, and pugnacious.
  4. The second hypothesis for the flower’s name, pink [‘small’] eyes, works well as a translation of the French synonym oeillet, a ‘little eye.’ The Dutch word pink has a historical denotation of ‘small,’ and is used to refer to the pinkie (or pinky) finger, whence the English name for the littlest manual digit.
  5. The ‘small’ sense also shows up in the name of a pink, a fast, nimble little watercraft common in the Atlantic Ocean during the 17th and 18th centuries. The Spanish pinque and Italian pinco also reflect this Dutch derivation.
  6. Some folks say an engine knocks and pings; others, mostly Brits, say it pinks.
  7. There’s also a dated term pink that refers to a kind of lake (lacquer) pigment, but it’s yellowish and of uncertain origin. Go figure.

The pronunciation of pink is worth paying attention to: #6 is onomatopoeic, and #3 belongs to either one or another family of words that also kind of sound like what they mean: pike, pick, and peck, or puncture, punch, and repugnant (literally, something that ‘punches back.’) The word pink has a nice ring to it. It’s sharp and tingly and saying it makes you smile a little.

Pink has a straightforward orthographic phonology, too: it has four graphemes <p i n k> and four phonemes /p ɪ n k/. The phonetic realization of those four phonemes, however, sends a lot of folks into quite a tizzy. The /n/ is realized as a velar [ŋ] because of its coarticulation with the velar /k/ — the same thing happens in words like distinct or banquet, but few phonics programs address [ŋ] beyond monosyllables. The /ɪ/ is nasalized, and often raised by the velar coarticulation too, so it ends up feeling more like an [ĩ] — a long, nasal eeeee. That’s the part that makes you smile.
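
The relationship between the /n/ phoneme and its velar realization can be sketched as a toy rule. This is a deliberately minimal illustration under my own simplifying assumptions: phonemes arrive as a list of symbols, only /k/ and /g/ count as velars, and the nasalization and raising of the vowel are ignored. The function name is made up for this sketch.

```python
# Velar consonants that trigger the assimilation (both spellings of the g symbol).
VELARS = {"k", "g", "ɡ"}


def realize_nasals(phonemes):
    """Map a list of phonemes to phones, realizing /n/ as [ŋ] before a velar."""
    phones = []
    for index, phoneme in enumerate(phonemes):
        following = phonemes[index + 1] if index + 1 < len(phonemes) else None
        if phoneme == "n" and following in VELARS:
            phones.append("ŋ")  # same phoneme, different phonetic realization
        else:
            phones.append(phoneme)
    return phones


# <pink>: four graphemes, four phonemes; only the realization of /n/ changes.
print(realize_nasals(["p", "ɪ", "n", "k"]))
# <pin>: no velar follows, so /n/ surfaces as [n].
print(realize_nasals(["p", "ɪ", "n"]))
# <distinct>: the same rule applies beyond monosyllables.
print(realize_nasals(["d", "ɪ", "s", "t", "ɪ", "n", "k", "t"]))
```

The input and output lists always have the same length: nothing is added to the phoneme inventory, and the rule is stated once about /n/ instead of once per rime.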

Traditional phonocentric approaches teach this and other velar nasal patterns as whole rimes (ink, ank, onk, unk) and give them made-up names like “welded sounds” or “nasal blends,” rather than taking an accurate look-see at the orthographic phonology. Instead of studying the phonology of <n>, which can be realized as [ŋ] before a velar consonant, these approaches add to the cognitive load for each child by piling eight new patterns (including ing, ang, ong, ung) into the mix, often without clearly identifying them as rimes rather than as graphemes or as that phonics horror of horrors, “blends.” This is largely because phonics is so stuck in its misapprehension of the phoneme that it can’t deal with the difference between the /n/ phoneme and the [ŋ] allophone. [I’m happy to consider an argument that there is a /ŋ/ phoneme, but it has to present an accurate understanding of the difference between a phoneme and an allophone.] Another phonics problem I’ve observed time and again is the failure to differentiate between an <ing> rime and an <ing> suffix. This distinction is a non-negotiable understanding in orthographic study: the same sequence of letters doesn’t always bear the same identity or the same function. It depends on which word they’re surfacing in.

My spelling teacher (who happens to be French) always says that there are no coincidences. As I was working on this pink-inspired piece, I spoke with a colleague who told me about a 3rd grader she works with who has a very hard time with the inks anks onks and unks of her Wilson Reading System instruction. The child reads words with these rimes just fine in connected text, but not in isolation. I bet you a dollar that she’s trying to “sound them out” and is trying to string [p ɪ n k] together, for example, but can’t make sense of it without a meaningful framework. My question — my colleague’s question too, which is why she contacted me — is What in the heck is the goal of “reading” words in isolation if she can read them fine in text?

I can’t answer that in any way that I can argue has the child’s best interest, her engagement with language, or her lifelong development as a literate soul, at heart. The bloom is off the phonocentric rose.

The phonology only has structure in a meaningful framework, which word lists really never provide. The ways in which <pink> makes meaning are interwoven with each other and with our history.  According to Oxford, the use of pinkie for ‘little finger’ was reinforced by the color sense (#1), but of course, that only works well for pasty Celts and Anglo-Saxons, not across the English-speaking world. The association between the flower, color, and flesh is also reflected in the word rose (think rosy cheeks), but especially in the name of one kind of dianthus, the carnation. In late Middle and early Modern English, the Latinate words carnation and incarnation were used to mean ‘the color of flesh,’ anything from ‘blush-color’ to ‘blood-color.’

Again, this whole pink-flesh connection only really works, at least on the surface, if you’re a white person. Oxford points out that not all carnations are pink, so of course not all dianthus are pink. Likewise, not all flesh is pink. I’d say Duh, English, but the French did it first.

I’ve also learned from my spelling teacher that the study of the writing system necessarily and organically brings about the possible study of so much more. What does it mean, in a world where we argue about whose lives matter, that the historical association of pinkness with human skin is captured in our written language? How would today’s third-grader respond to the information that my childhood Crayola box had a pinkish crayon labeled “Flesh,” but hers does not? What might a study of words like white and black reveal to us? I’m not interested in this because I had some social studies agenda in mind when I started studying pink; rather, these questions are where the study of pink led me. Just in time for Martin Luther King Day and everything.

I wrote that. Then I saw this:
[Image: skin ffs]

There are no coincidences. That’s not some kind of mystical statement; it’s an observation. There are no coincidences; there are the connections that we conceive of, the stories that we tell, and the meaning we make.

Tickles me pink.

Scholars who take the Old English for Orthographers LEXinar take a critical look at what’s said about the historical origins of words. Many categorically false, totally ascientific claims have been made in print by language educators widely considered to be “experts.” It’s been going on for decades.

In the 1980s, Bob Calfee’s “Layers of the English Language” triangle listed a dozen words as having an “Anglo-Saxon” origin. Two of them are definitively not Anglo-Saxon: cry is Latinate, and jump isn’t attested until the Modern English period. A third, grave, is a homograph. One of the pair (dig a grave) does have an Old English origin; the other (a grave illness) is Latinate. Certainly there were less ambiguous options available.

More recently (2004), Louisa Moats has claimed that tube is Anglo-Saxon (it’s French), that television is Latin (actually, the <tele> is Greek), that biodiversity is Greek (not quite — the diversity piece of the compound is Latin). Moats also indulges in a fantasy of Anglo-Saxon origins for amuse, engender, enable, and endure, all of which are of French origin in real life. Other words that Moats falsely associates with Anglo-Saxon across her work include crash, age, lilac, recess, cable, bugle, title, dabble, problem, commit, and adept, most of which were adopted from French. She also attributes gravity to Greek. Poor old French! It doesn’t even get a layer in the triangle.

Moats is unfortunately in good company. In a 2009 article in American Educator written with her reading science colleagues (Joshi, Treiman, and Carreker), more than half of the examples of Anglo-Saxon words and patterns they give are flat-out wrong, not including ambiguous examples like Calfee’s grave or the homographic found (past tense of find, to establish, and to pour molten metal — two of which are French). They claim a Greek origin for ache (it’s actually Old English) and Anglo-Saxon origins for the following words, most of which were actually adopted from French: carpenter, farmer, grocer, butcher, passable, agreeable, punishable, catch, pouch, rich, age, saved, and plentiful.

Le sigh.

All in all, this article alone boasts more than 40 etymological lies in 12 pages, and that’s just one piece of writing from these prolific authors. This is not an occasional error or a minor problem. It’s epidemic. It’s malpractice, and I’m not mean or nasty for calling it out. I’m right.

Now, I don’t know everything, and I don’t expect others to know everything. I make mistakes in my work, and when others point them out, I am grateful for the opportunity to deepen my own understanding. I’m not unreasonable. I don’t, for example, fault Moats and company for being unable to explain the spelling of words like us, thus, yes, if, his, much, which, such, or, all of which she refers to as “exceptions” to the final patterns <ss>, <ff>, and <tch>. These words aren’t really exceptions — there’s no such thing; rather, they’re function words, which take the smallest possible spelling (see in/inn, of/off, or/err). I don’t expect educators to have a good command of this yet, as it’s not necessarily terribly common knowledge, and not something a plain old dictionary will flag.

However, etymology — a word’s origin — is not a matter of guesswork or opinion. Any proper dictionary can tell us where words come from. We can look them up in the Online Etymology Dictionary on our phones for free, for crying out loud. People with Ph.D.s and secure jobs should be able to ask an intern or proofreader to look up all the examples in a dictionary if they don’t care to reap the incredibly rich, captivating understanding that word study brings for themselves. Either way, it is a professional and ethical imperative that these authors begin to ensure that the teachers and scholars reading the words they write will not continue to be systemically misinformed.

In addition to the rampant etymological underhandedness in print, teacher trainers and workshop speakers perpetuate the same careless claims in classrooms and conference rooms. I’ve heard countless examples myself, and colleagues who know better report them to me.

It kind of makes me mad. Like, mad angry and mad crazy.

Mistakes don’t make me mad. But willful, continual misinformation makes me mad. Irresponsible scholarship makes me mad. False claims of expertise make me mad. And, as faithful LEX readers will recall, experts meeting corrected information with denial and deflection make me really, really mad.

Well. Today I received the following email from a colleague:

“I was attending an Indiana IDA meeting yesterday in Indianapolis. In an adjacent room, [famous teacher trainer guy] was conducting his 1-day morphology training. I stuck my head in for about 10 minutes to hear him talking about how morphology builds vocabulary—OK so far.

BUT this was his example:

crazy is Anglo-Saxon

insane is Latin

lunatic is Greek

I just had to walk out.”

Now, this man is a well-known, well-traveled, well-respected trainer whose work I have found troubling before. He has a habit of telling teachers not to teach the schwa because it’s “too complicated.” Of course, this advice is problematic because the schwa is the most common phone in an English utterance, but what really fries me is the all-too-familiar “don’t worry your pretty little heads” tone of a man telling a roomful of female educators what’s too hard for them to understand. Yuck.

Now, as far as crazy/insane/lunatic go, of course, I find the choice of subject matter to be a little ironic, ’cause I do indeed think it’s a little crazy to make unsubstantiated claims about word origins while stressing how important word origins are to word study. As you might expect, our morphology “expert” only got one of his examples right: insane does actually have a Latin root. But crazy is built on a French loanword, and lunatic is derived from luna, the Latin word for moon.

Instead, if you really want to have a look at cross-linguistic synonyms pertaining to insanity, I’d submit the following:

Old English: moony

Latin: lunatic

Greek: selenomanic

See? That makes a lot more sense.


Old English: mad

Latin: insane

Greek: psychotic

I’d love to be able to include crazy, but its origin is a hard tail to pin on that tired, old layers-of-language donkey. It’s originally Germanic, but was adopted from French. And the French in question is Norman French, not Parisian French. This is true of so, so many words educators erroneously attribute to Anglo-Saxon: they’re short, common, everyday words, but they were Norman French contributions, not Anglo-Saxon. Some of them are Latinate; others are Germanic. After all, it was a really Germanic French that English was adopting words from in the late Middle Ages.

Now, as I said, I’m not unreasonable. I get that understanding the nuances of language history and word histories requires study. After all, that’s what I do. I am sympathetic to the fact that most people don’t have the depth of etymological knowledge that I have. I get it. But that’s just the thing: you don’t have to have extensive knowledge of etymology in order to get it right, at least most of the time. You just have to look in a dictionary. Someone else has already done the study for you. It takes less than a minute or two to look up and read the entries for crazy, insane, and lunatic online.

Moreover, I’m not talking about generally held folk etymologies that get a foothold in the cultural rock wall; I’m talking about people who are widely regarded as “reading scientists,” people others rely upon for linguistic expertise and accurate information about language. Etymology as a field of study involves using established practices of comparative linguistics, based on the broader principle of the scientific method. The etymological guesswork across “reading science,” where every other example of an Anglo-Saxon word isn’t Anglo-Saxon, is pseudoscience, neither scientific nor a method.

Look, writing books and articles and speaking at conferences are activities that require research and preparation. I’m not a lunatic for pointing out that conference speakers, certified trainers, and respected, peer-reviewed authors should be held to a higher standard when it comes to the empirical claims they make about words. Factual rigor is not an insane expectation for scholarly speaking and writing.

I’m not crazy.

I am, however, pretty mad about etymology.

Phone Home

I get a ton of emails. I mean, a ton. I have several email accounts, and it’s a part-time job to keep up with them all. Of course, nowadays, I also access email on my phone. I know I am not alone in this. Needless to say, a lot of the emails I get are language questions. Here’s one I got this morning, and I decided to turn it into a LEX Q&A, so more people can benefit from the dialogue than just us two. (The email has been edited for formatting and asides.)

[W]hat is the final phoneme in the word cat when it is at the end of a sentence?  “I saw a little cat.”  It’s not the same as at the beginning of tip, but is it just an allophone of /t/?   I was reading about the “flap” and it doesn’t seem like it would be a flap, because my tongue stops on the roof of the mouth rather than tapping there. But I’m not sure how the flap works either. I feel as though when I say little I go straight from /ɪ/ to /l/. But there’s a difference between the way I say little and Lil. If I try to say Lil as a two syllable word with just the /l/ in the second syllable that’s still not the same as little so something is happening with my tongue, but I can’t figure it out. It almost feels like I’m squishing air out of the sides of my mouth in between the /i/ and /l/ and pushing my tongue more forcefully up with the final /l/ in little.

Aaaaaaand, my response: What a great question! And an important one, too. One of the biggest problems with the decades-old emphasis on “phonemic awareness” is that most teachers don’t really understand what a phoneme is. They think it’s a “minimal unit of sound” or some such; it’s not. It is minimal, and it is a unit, and it does have to do with language as it is pronounced, but it’s not actually a sound. Moreover — and this is critical — it’s distinctive. What this means is that, while it carries no meaning itself (the /b/ in /bɪt/ doesn’t mean anything), it is distinctive for meaning — it differentiates meaning — from other phonemes (the /b/ in /bɪt/ and the /p/ in /pɪt/ distinguish the meanings of those two words). That all happens in your head.

Elsewhere, however, there are different physical realizations of pronounced words and utterances. Those physical realizations have structures that can be studied, like all physical things. The phoneme /t/ is conceptual, a psychological category, container, or class — choose your metaphor — with several different possible members. Those members — all the members of the phoneme /t/ — are its allophones. Some physical realizations of /t/ are aspirated. That is, they have a little release of air when the tongue is released from the roof of the mouth. That’s like in the word top. Phonemically, we would represent this as /tɑp/, but phonetically, it’s [tʰɑp]. If we put a /s/ in front of the word, however, the aspiration isn’t there: [stɑp]. You can see and feel the difference if you pronounce those two words aloud while holding a kleenex in front of your face. But phones aren’t necessarily distinctive for meaning: if you were in my car and yelled [stʰɑp], I would totally slam on the brakes. The [tʰ] and the [t] are allophones of the same phoneme, /t/. Other allophones of /t/ in English include [t̚], [ʔ], and [ɾ], also known as the “flap.”
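
For readers who think better in code than in kleenex, here is a minimal sketch of that class-and-members idea. It's my own toy illustration, not a real phonological model: the names ALLOPHONES, PHONE_TO_PHONEME, and phonemic are made up for this example, and it assumes each phone belongs to exactly one phoneme, which is a simplification (in American English, [ɾ] can also realize /d/).

```python
# Toy model: a phoneme is a category; its allophones are the members.
# Assumption for illustration only: each phone maps to exactly one phoneme.
# (In real American English, the flap [ɾ] can also realize /d/.)
ALLOPHONES = {
    "t": ["tʰ", "t", "t̚", "ʔ", "ɾ"],
}

# Invert the table so we can look up the phoneme for any given phone.
PHONE_TO_PHONEME = {
    phone: phoneme
    for phoneme, phones in ALLOPHONES.items()
    for phone in phones
}


def phonemic(phones):
    """Turn a list of phones into a phonemic transcription string.

    Phones not listed in the toy table are passed through unchanged.
    """
    return "/" + "".join(PHONE_TO_PHONEME.get(p, p) for p in phones) + "/"


# Two different physical realizations of the same word:
stop_plain = ["s", "t", "ɑ", "p"]       # [stɑp]
stop_aspirated = ["s", "tʰ", "ɑ", "p"]  # [stʰɑp]

# Two words that differ by a phoneme, not just a phone:
bit = ["b", "ɪ", "t"]
pit = ["p", "ɪ", "t"]

print(phonemic(stop_plain) == phonemic(stop_aspirated))  # same phonemes
print(phonemic(bit) == phonemic(pit))                    # different phonemes
```

The two versions of stop collapse into the same phonemic string, which is why I'd hit the brakes either way; bit and pit do not, because /b/ and /p/ are distinctive.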

So, to answer your question directly, the phoneme at the end of cat is the same as the phoneme at the beginning of tip, but they are different phones. They are phonologically the same, but phonetically different. Yes, that makes them allophones of the same phoneme, different members of the same class.

Another allophone is the flap [ɾ] in your pronunciation of little. A Brit would be likely to say [lɪtʰəl], while an American is more likely to say [lɪɾḷ]. The difference between Lil and little is that flap — your tongue briefly taps the alveolar ridge, before releasing the [l] laterally. There’s a co-articulation from the [ɾ] to the [l]: both of them have an alveolar place of articulation. You don’t have to move your tongue to get from one to the other. They are also both voiced. The difference between them is in their manner of articulation: [ɾ] is a flap, and [l] is a lateral approximant. That lateral refers to the release of the air out the sides of your tongue, just as you articulated in your question. The “more forceful” push of your tongue to the alveolar ridge in little? That’s the flap.

Phones and phonemes are not for sissies, but a clear understanding of the difference is absolutely critical for scholars and teachers of the written word. Writing systems’ representations of pronunciation may target syllables, or they may target phonemes, or both. But spelling never, ever targets phones; there’s no such thing as a non-phonetic word, or rather, all written words are non-phonetic. When a child writes <chree> instead of <tree>, she’s not mishearing the word; she’s ascribing the physical phone she is saying or hearing to the wrong phoneme in her head. *That’s* phonemic awareness, but teachers may be at a loss to remedy it unless they have clarity about what’s going on phonetically in that word.

No pithy ending in this post, no clever turn of phrase. No LEXlover’s delight. What do you want from me? It was an email. If you’re still reading this far, good for you, and you’re welcome.

