Wholesome Yiddish and Language Ideologies

A little while ago, a linguist friend of mine sent me a video of Mayim Bialik talking about Yiddish. The video has since gone viral. In it, she explains “why Yiddish has two words for penis.” Aside from the laughably low-ball (pun not intended) number of penis words in Yiddish, there were a few factual errors. But, more than the factual errors, the assumptions that underpinned the whole endeavor were a little, shall we say, unsavory. It ultimately inspired me to create a weekly series on wholesome Yiddish words. I should explain the content first, and then my qualms with the underlying assumptions.

Before I continue: Mayim Bialik, for those who don’t know, is a super-smart actress, made famous by her role on The Big Bang Theory. She holds a PhD in neuroscience, and is the new host of Jeopardy!, having endured a tremendous amount of misogynist and antisemitic abuse on social media after filling Alex Trebek’s shoes. She’s also Jewish, having become Modern Orthodox after a Reform upbringing, and her grandparents were Yiddish speakers from the heym (Wikipedia claims Poland, Czechoslovakia, and Hungary, but those countries’ borders and relationship to Jews are not always easily projected into the past). She knows enough Yiddish to have acted in it, in a fantastic episode of the web series “YidLife Crisis.” I don’t always agree with her (especially her takes on vaccines, which she later walked back), but I respect her tremendously. That’s why I was saddened when she used her enormous platform to repeat clichés about Yiddish that are common tropes (pun not intended), but that are not fully correct, and that can be harmful. She was attempting to educate, and to be self-deprecating, but her attempt at language valorization fell flat.

The video she posted sought to explain why there’s the word “schmuck” and also the word “putz” and, additionally, to inform people using them cavalierly that they’re calling people a penis. This is genuinely funny, and a great way to get people to engage with a stigmatized language. She explained that Yiddish is a “conglomerate language” (linguists use the term “fusion language” sometimes, or “contact language,” instead), and that 30 percent is Hebrew, and the rest is “German, with German grammar, and also a bunch of Slavic languages.” She elaborates that this means that “sometimes we have more than one word for something like the penis.” She explains that calling someone either of those is equivalent to calling them a dick (true!) and then relates that she went to school with a gentile named Schmuckler, and that it was awkward.

So here’s the thing: that all sounds perfectly truthy, but it isn’t completely accurate. And there’s a deeper issue to tackle, once we get the facts straight. Let’s get the facts first. So, first: the origin of Yiddish words is approximately 30% Hebrew and Aramaic (sometimes “HA” as a shorthand). That “and” is doing some heavy lifting. But also, notice, I didn’t say it’s 30% HA, I said that 30% has that origin. The words themselves are not pronounced the same, sometimes mean different things than in Hebrew and Aramaic, and are more often than not subjected to Yiddish (that is, Germanic) declension and inflection. Even the parts that are all from Hebrew get combined differently in Yiddish: the plural of shabbat in Hebrew takes the feminine Hebrew plural suffix: shabbatot. The plural of its reflex in Yiddish, shabes, takes the reflex of the masculine Hebrew plural suffix: shabosim. Second: just as it isn’t Hebrew per se, the rest isn’t German. It’s a language that is a descendant of a dialect of Middle High German, so Yiddish is about as German as French is Italian. They’re cousins, not, you know, the same person. Yiddish does not have German grammar. A simple example to prove the point: In German, in subordinate clauses, all the verbs go at the end. In Yiddish, they do not. But there are tons of other examples. Yiddish doesn’t decline nouns in the same way (especially not contemporary Hasidish Yiddish). Yiddish doesn’t retain a distinction between accusative and dative case marking. Yiddish has words descended from MHG that are different from the words in German for the same concept (for instance, instead of heute for ‘today’ Yiddish has haynt, which does not have a living descendant in modern German, and historically meant ‘tonight.’ Because Judaism). Third, and most egregiously: neither of the words she mentioned is from Hebrew. There was no reason to mention the HA component of Yiddish.
Putz is of neither Hebrew nor Germanic origin; it’s of Romance origin, either through Judeo-Romance or through contact with Romanian. (Shmok, coincidentally, is cognate with German Schmuck, meaning ‘jewel,’ so that poor Schmuckler’s name just meant “jeweler.” Though note that being cognate does not mean that the Yiddish word originally meant “jewel”; see here for more.)

But that’s not the real problem. English has way more words for penis than Yiddish. Think about it this way: English has all of the words for penis that Yiddish has, because it borrowed them all, and it has a ton of others. English borrowed putz, shmok, shmekl, shvantz, shlong, and so on, but also has penis, johnson, dick, cock, (a thought occurs — what am I doing to my search engine optimization?), sausage, kielbasa, knob, dong, pecker, and many, many others. But we tend to think of Yiddish as the language with all the dirty words that need explaining. English, by all appearances, is far more obsessed with the male organ than Yiddish could ever hope to be.

Yiddish was historically looked down on as a jargon, as broken German, as uncultured, backwards, and boorish. And it is still thought of that way by many people, often even by the descendants of Yiddish speakers who, whether reluctantly or enthusiastically, discarded Yiddish in the hopes that their children and their children’s children would be accepted. But Yiddish is a language with a rich, thousand-year history. It’s the language of culture, literature, music, and devout religiosity. When we reduce Yiddish to words for one’s member, it does a disservice to those who lived full lives in the language, and it goes a long way toward implicitly justifying why ‘nobody’ speaks the language: it’s just a backwards relic of the old country. In reality, it’s not that people discarded Yiddish, it’s that six million native speakers were systematically murdered. We should not, even implicitly, reproduce the myth that it was just discarded because it is in some way lacking (perhaps, lacking in culture?). There are other factors, too. Certainly, the fact that English has so many cognates may have made it seem redundant to teach the next generation Yiddish (why say “shvitz” when you can say the same word in English: sweat? What makes Yiddish truly distinct from other dialects of German is still the Hebrew and Aramaic, and if you have a Jewish education, you’ll know those words anyway). And of course, plenty of Borsht Belt comedians got around FCC censorship by simply swapping out the English word for the Yiddish one, sometimes saying deeply offensive things, but playing off the goyish censors’ sentiment that Yiddish isn’t a real language worth worrying about censoring in the first place. Hell, people even spell Yiddish as though it’s High German, dutifully adding letters to represent historical sounds that never existed in Yiddish (why are there two cs in “schmuck”?).

I have absolutely no doubt that Dr. Bialik was just having fun pointing out that both words mean “penis.” But it got me thinking about how Yiddish is portrayed. I have friends who are native speakers of Yiddish. And frankly, they almost never (maybe even never never) say those words. They study in Yiddish, shop in Yiddish, discuss religion in Yiddish, and when they sleep, they dream in Yiddish. To reduce that entire life, as is often done, to just ways to insult someone (meshugge, shlemiel), words for ‘penis’ (shvantz, shmekkie), or both (putz, schmuck), is deeply saddening to me. So I decided to launch a weekly series of wholesome Yiddish words. I’ve already got videos about naches ‘pride in the accomplishments of others,’ nafka mina ‘a practical difference,’ nign ‘a melody,’ shkoyakh ‘thank you’ (with a very interesting history), zise khaloymes ‘sweet dreams,’ and more, with a focus on what makes Yiddish unique, and the history and significance of these words. Hopefully, this series leaves people wondering not “why does Yiddish have so many words for penis?” (a question predicated on a false premise!) but “why does Yiddish have so many ways to express pride in the accomplishments of others?” Or “why does Yiddish have so many words for blessing, benediction, and study that are from Romance language roots, when it’s a Germanic language?” Or maybe, “why does Yiddish have so many words for special foods you eat one day a week?” Videos will be updated weekly. I’ve left a sample below.

Dr. B., I’m still a huge fan. If you ever want to talk Yiddish and linguistics, I’m at your service.

Happy learning, and yasher koakh.

Tense, aspect, and mood, oh my!

I get asked about tenses in various languages A LOT. Often, the person asking is actually asking about aspect, but that’s a concept that nobody learns about in school (unless it’s an undergrad class in linguistics). I don’t get this. Knowing about aspect and how it differs from tense would make life easier for anyone even remotely interested in language.

So I made a quick video explaining why “Your textbooks lied to you about ‘tense’.”

In it, I discuss the distinction between tense and aspect; speaker, reference, and event time; imperfective and perfective aspects; the fact that not all languages have both; and tense/aspect syncretism. All with references to Star Trek, Black Dynamite, and the Tinder Swindler. Enjoy!

More on pronouns: You’re all wrong about everything

Almost a decade ago, I wrote about pronouns and got hate mail from Jordan Peterson fans for years. I decided to wade back into that morass to point out that while everybody is arguing about pronouns now, most people have no idea what they’re talking about. In this video, I explain how the current public conversation gets facts wrong about what pronouns even are, what gender is, how hard new pronouns are, and how easy new pronouns are. I also discuss the underlying cultural assumptions that drive the discourse: extreme individualism on both sides of the debate.

I also make some tentative recommendations, boiling down to not demanding people share their identity, respecting people’s identities when they do share them (if you respect the person), and somewhat contentiously, I recommend audience design over ideological rigidity, meaning balancing the desire to respect people’s identities with using language your hearer will understand.

The details are in the video below. Enjoy!

HAPPY NEW YEAR!

Happy new year everyone!

In the last year, I launched a YouTube channel, Patreon, and merchandise through Spring (all @languagejones), and the response was incredible. I was blindsided by two of my videos surpassing a few hundred thousand views, and I am so encouraged by the response. In 2023 I will be building both the channel and this website (that is, “building my brand”), and I have a lot in the works that I am very excited to share.

I also have an accepted publication I’m very excited about, on studying lexicalization in AAE using large social media corpora; that is, what’s becoming a trademark approach of using novel techniques to examine understudied phenomena in African American English (and other language varieties) rather than rehashing the same tired old studies and methodologies (who really needs another study on postvocalic r deletion?).

I have a few books in the works that I am VERY excited about finally sharing with the world, the first of which should be done and possibly even available for purchase in 2023.

I have also had some successes in my personal life, which further underscore that my choice to pursue a career trajectory other than postdoc -> adjunct -> assistant professor was the right choice for me. I am still reluctant to share my personal successes or hardships publicly, since there are a few folks in linguistics who are unhealthily obsessed with me and I don’t wish to give them joy (at my hardships) or fodder for the evil eye (*pyu pyu pyu), but even they seem to have their own things to focus on lately, and I will be discussing more of my career trajectory and the work I do now, on here and on my YouTube channel, in 2023.

I wish everyone the best in the new year, and a year of joy and gladness.

Happy New Year!

PS: my YouTube channel can be found here: https://www.youtube.com/@languagejones6784

and my Spring shop can be found here (with more to come, including some sweet Ugaritic designs, in the new year): https://languagejones.creator-spring.com/listing/languagejones-big-wen-hoodie?product=227&variation=2664

Nancy, Nancy

Like many of you, last night I watched the first evening of congressional hearings on the events of January 6, 2021. At one point in the video they played, there was a bone-chilling sequence of shots as a crowd of armed men wandered the halls of Congress chanting “Nancy, Nancy…” as they tried to find Speaker of the House Nancy Pelosi.

Their chant was exactly what it was intended to be: menacing.

But…why? Or rather, how?

Just as some claim that the people who breached the Capitol on January 6th were peaceful demonstrators or regular tourists, I would not be surprised if some claim that it was a harmless chant, or even one in support of the Speaker. That got me thinking about how exactly it has the effect of being menacing.

For reference, here is a video (warning, it’s disturbing, obviously).

In short, we use different intonational patterns to do different things in English. The one they were using here is associated with, obviously, looking for someone. In this case, it’s a sing-songy fall: you pick a pitch, hold it on the first syllable, and then pick a pitch a minor third below and hold it just as long on the second syllable. There are other variations — in fact, the way we discussed this in Phonology in grad school was with the “Oh, [name]” paradigm. That starts with a lower note, and then has the “Nancy” pattern. If you’re a musician, that would be C, E, G in the key of C. The person filming in the video above starts with this figure, and then modulates upward. You use the same pattern no matter the length of the name, so you can fit two notes to one name (“Oh, Jaaaaa-ack”) or fit all three to a longer name, where you hold the low note until you get to the primary stressed syllable, which gets the high note, and the drop off comes after that.

This is classic “looking for you” intonation in English. Not inherently menacing, but also, not inherently not menacing. But it’s also not something that occurs completely in a vacuum. I was tempted to call this the “Wendy” intonational pattern in writing this, because I could have sworn Jack Nicholson uses it in the “bat scene” in The Shining, but I seem to be having a Mandela Effect/Berenstain Bears moment. Nevertheless, it’s a classic mainstay of the horror genre. You might use this intonation when looking for a child playing hide-and-seek (indeed, Ladd mentions hide-and-seek explicitly in Intonational Phonology (1996)), but when it’s armed men who have breached a secure area and are looking for a public figure to murder, it’s clearly used ironically to heighten the menacing effect. They weren’t planning a tickle party for Pelosi, regardless of what asinine defenses they may come up with after the fact.

Lest anyone say they were just chanting their support (the “boo-urns” defense), that would be a sustained pitch on the first syllable and a very short second syllable (I’d notate this musically as a whole note followed by a staccato quarter note or eighth note a fourth above; C for a bar followed by a short F, if we’re in C).
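For readers who think better in numbers than in notation, the two contours just described can be written out as semitone offsets. This is only a sketch: the MIDI note numbers, the choice of G4 as the starting pitch for the searching chant, and the function names are assumptions made for the example, not measurements from the video.

```python
# Pitch classes within one octave, written with sharps only.
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def note_name(midi):
    """Name a MIDI note number, e.g. 60 -> 'C4' (middle C)."""
    return f"{NOTE_NAMES[midi % 12]}{midi // 12 - 1}"

def contour(start_midi, offsets):
    """Return note names for a chant; each offset is in semitones from the start."""
    return [note_name(start_midi + off) for off in offsets]

# "Looking for you": hold a pitch, then drop a minor third (3 semitones).
searching = contour(67, [0, -3])  # hypothetical start on G4
# Supportive chant: long first note, then a short note a fourth (5 semitones) up.
cheering = contour(60, [0, 5])    # C then F, as in the text

print(searching, cheering)
```

The sketch leaves out duration entirely, which matters just as much here: the searching chant holds both notes equally long, while the supportive one clips the second note short.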

As horrifying as this particular topic is, it underscores what I like about linguistics: it helps make sense of the world. We all know that the chants were intended as menacing, and we all know that any claim otherwise is asinine. We could, with minimal prompting, generate a variety of intonational patterns for the same name or phrase or sentence (think about it: how would you say “Nancy” to indicate surprise? Dismissing a ridiculous idea from someone named Nancy? To indicate tender, loving emotions?). Really think about it! Here are my answers:

But most of us could not put to words, or music, what we inherently know about our native languages. And having the skills to dissect, analyze, and discuss those patterns doesn’t just help one learn other languages (although, it does do that). In this instance, it helps make sense of why some arguments are clearly disingenuous, in a very high-stakes context.




©Taylor Jones 2022

Have a question or comment? Share your thoughts below!

From Daddy to Zaddy

One of the main reasons I started a YouTube channel is that, while I’m very comfortable in a text-based medium, there are just certain things about spoken language that it will always be easier to discuss with recourse to sounds. The more I want to talk about specific sounds and sound changes, the more I force my readers to engage with specialized technical tools for linguistics. Especially the International Phonetic Alphabet. With that in mind, there are quite a few ideas that I’ve been banking, because they just don’t work as well in text.

One of them is the origin of the slang term zaddy. There are a surprising number of bad explanations out there, most of which were obviously made up by the person doing the explaining.

While I can’t prove it without doing a large-scale study, I am convinced that zaddy originated from the pronunciation of “daddy” in Spanish-influenced English varieties (like Puerto Rican English in NYC), filtered through AAE, before being adopted by the mainstream as a different word. This analysis relies on:

  1. Understanding that the same sound, /d/ in this case, can be pronounced in slightly different areas of the mouth, and that one of them is with the tip of the tongue touching the back of the top front teeth,

  2. Knowing that some regional varieties pronounce /æ/ as in “cat” as if it starts with an /i/ as in “feet” in certain contexts,

  3. Knowing that in some of those regional varieties, those contexts include before /d/ (as well as before nasals like /n/ and /m/),

  4. Knowing that African American English generally doesn’t do this,

  5. Knowing that coronal stops like /t/ and /d/ affricate (that is, get buzzy or hissy) when they precede that /i/ sound,

  6. Knowing that we don’t tend to hear other people perfectly, but instead fit what sounds they produce into our mental organization of sounds in our own language and accent.

This is a relatively straightforward and simple collection of facts, but following the thread and combining all six of them to see how you get zaddy from daddy can be difficult without audio examples. So with that in mind, I made a video explaining what I think is the real origin of zaddy (and zamn!), and it includes audio examples that cover everyone from John Leguizamo in the 1990s and Salt ‘n’ Pepa, to Yung Baby Tate, Ty Dolla $ign, and Saturday Night Live. Enjoy!

-----

©Taylor Jones 2022


Big Announcement: I'm on YouTube!

There is so much about language that’s just better if you can include either sound (for spoken languages) or movement (for signed languages), and there are so many fantastic YouTube channels out there, that I was inspired to start my own!

My goal is not to teach a class on linguistics (for that, I would recommend The Ling Space, or Crash Course Linguistics). Instead, I wanted to share some of the aspects of linguistics that fascinate me. I have started making videos that are serious overviews of a topic (including more than a few on African American English), as well as discussions of language learning and polyglot issues, and shorter language “bites” where I share crazy, interesting things about language that are easy to overlook (like how politeness is signaled in Persian by how much air you let out of your nose when you’re talking). I have also started interviewing linguists I know about their work, and plan on having much more of that.

My inspiration for this channel was, in part, my friend Kierstyn, who lives down the hall from me. When she has questions about language, she’ll come over and ask. So many people have so many unanswered questions about language that have really interesting answers, and I wanted to open up Kierstyn’s Question Corner (soon to be a real segment) to everyone else, welcoming you into my home and then ranting at you for a few minutes about case marking, or indirect communication, or the difference between tense and aspect.

I’ve already started adding video components to my blog posts, and I anticipate a semi-regular pace of uploading videos. I had a soft, borderline stealth, launch in 2020, and now in 2022 I think I’ve got the general gist of how to produce content on a rolling basis. I’ll be starting a Patreon, and I have merchandise available here.

If you like this blog, I think you’ll like the channel. If you do, please like, subscribe, and comment — I especially welcome comments on any of my videos with questions about language, which I might address in future videos.

-----

©Taylor Jones 2021


Duolingo Yiddish: A Guide for the Perplexed

18 days ago, Duolingo released their newest language course: Yiddish. There has been an unsurprising amount of controversy around the course, which I will briefly explain, but there has also been a lot of confusion about the language itself, especially the relationship between spelling and pronunciation.

I will start with the caveat that while I am a trained linguist and I already spoke some Yiddish, I am an enthusiast and neither a native speaker nor a Yiddishist™️. That said, I am friends with people who are both, and I have been asking them my own questions, and keeping an eye on the questions (and conflicts!) that have filled my social media feeds in the last two weeks. This is not intended to be an academic work so much as a toolkit for people who are struggling. Note: I am going to be saying “Hasidic” and not “Hungarian/Poylishe/etc.” in the rest of this post, but I recognize that it’s a simplification and so should you. So let’s get down to tachles:

What kind of Yiddish are they teaching and why is there conflict?

Yiddish is complicated. Until very recently, it was incorrectly considered by many to be merely “bad German,” and it wasn’t until the 20th century that there was a concerted effort from a handful of linguists to study and describe Yiddish as its own language. Yiddish: A Linguistic Introduction by Jacobs has an excellent summary of the arguments that demonstrate conclusively that Yiddish is not merely a form of corrupted German, but is instead a fusion language descended in part from Middle High German, but also drawing on elements from Judeo-Romance languages, Hebrew, Aramaic, and Slavic elements. There are various other arguments about population movements and origins that are complicated, contentious, and not super relevant here.

Yiddish historically had dialects. Before the Holocaust, Yiddish was spoken across a large geographic area (Ashkenaz, or the heym). Historically, it could be divided into Western, Central, and Eastern, with further subdivisions: Northeastern, Central Eastern, South Eastern, and so on. These each had slightly different grammatical patterns, sometimes very different accents, and, of course, different words for the same thing. If you want to see maps of word and accent differences, @seydproject on Twitter are working hard on digitizing historical atlases of Yiddish.

Yiddish has new dialects too. Even before the Holocaust, there was a push toward creating a standardized literary form of Yiddish. Its pronunciation was mostly Northeastern, but its grammar was drawn from other regional varieties and emerging norms in Yiddish literature. In many ways, what people refer to as YIVO Yiddish (for the YIVO Institute) is an imagined standard that wasn’t ever spoken natively by anyone, but that was a reasonable compromise for a cross-dialect, cross-regional “standard.” Separate from that, there are new dialects of Yiddish that are different from what people spoke 100 years ago, and that are spoken by millions of native speakers around the world, mostly in very “religious” communities (not that they’re actually more religious or more traditional than everyone, but they’re usually discussed in those terms). There was a melting pot of Yiddish in the Lower East Side roughly 100 years ago that may have been its own distinct thing, as well.

The new dialects have different grammar. Hasidic Yiddish in Brooklyn has a different grammar than YIVO standard Yiddish, than the dialects spoken in Europe a hundred years ago, and than the secular books people may be interested in reading in Yiddish. Case marking has all but disappeared in Hasidic Yiddish, so the definite articles der, di, dos are present in some formal publications, but in casual speech are almost always all de. This, in part, explains some of the pronunciations of di and der that slip into Duolingo. Agreement on adjectives has likewise all but disappeared. In fact, case marking on pronouns even seems to be on its way out, with the dative fossilized in some forms, but accusative case encroaching on it.

The Duolingo team came to an ingenious compromise. The team was largely (all?) native speakers, but they wanted the course to potentially serve as a bridge between secular Yiddishists, who may be learning out of nostalgia or to read Sholem Aleykhem in the original, and Hasidim and other non-secular communities where Yiddish is the lived language of everyday life. If you learn YIVO, you can’t speak to native speakers. If you learn Hasidishe Yiddish, you won’t grok the grammar in an Isaac Bashevis Singer novel. So what did they do? They chose to teach the spelling and grammar of YIVO standard Yiddish with the pronunciation of contemporary Hasidic Yiddish. This means if you want to speak with people, you’ll learn the vocabulary and pronunciation to be able to, and the only grammatical change you’d have to make is to do less conjugating and declining. If you want to read old books, or take a college class, you will know the academic grammar, and can figure out pronunciation from the standardized spelling. It’s literally a win-win. So of course everyone is mad about it.

Why is the spelling WEird?

There are two reasons the spelling doesn’t seem to line up with the pronunciation:

  1. YIVO Yiddish has artificial mergers that contemporary native speakers keep distinct. It’s just like how New Yorkers keep the sounds in cot/caught or Don/dawn distinct but Californians cannot hear or reliably pronounce the difference. In this example, the Duo speakers are the New Yorkers.

  2. Hasidic Yiddish has natural vowel shifts that led to mergers that YIVO Yiddish keeps distinct.

I will explain both in turn.

Orthographic ‘Mergers’

YIVO has collapsed two different historical vowel classes into “oy.” This is the main point of confusion from YIVO standardization: words like אזוי azoy ‘so’ and רויט royt ‘red’ are pronounced like they’re spelled, with an “oy” sound. [This is actually just a fact about written Yiddish, and not necessarily a decision made by YIVO, per se, as Isaac Bleaman helpfully pointed out]

But words like פרוי froy ‘woman’, הויז hoyz ‘house’, ברוין broyn ‘brown’ were not historically pronounced that way. They were historically, and are currently, pronounced like fro:, ho:z, and bro:n, where the colon indicates vowel length (if you listen closely, there’s what sounds like an off-glide as well). YIVO has (I cannot emphasize this enough) artificially collapsed two pronunciations that were historically distinct for many, many Yiddish speakers. So it’s not that the speakers on Duolingo are inconsistent in how they pronounce oy; rather, they are very consistently pronouncing two different vowel classes differently.

Hasidic Vowel Shifts

There are certain patterns of change that languages follow so often that linguists are really not surprised by them. In fact, if you read Bill Labov’s magnum opus, Principles of Linguistic Change, there is an entire volume dedicated to predictable patterns of vowel change over time (with pages on pages discussing Yiddish vowel shifts!). This is not the place for it, but pretty much all of the vowels in English moved in what’s called the Great English Vowel Shift, and our spelling is weird because we have the old spelling with the new vowels. (Why is “Tim” not pronounced the same as “team” and “time” not “teamy”???). It has been helpful for me to think of the Hasidic pronunciations in Duolingo as what’s called a chain shift. That is, one vowel moves to a new place and another follows and takes its place.

If you are starting from YIVO Yiddish as an imaginary starting point, then here are all the changes you need to keep in mind to be able to navigate between YIVO and what you learn in Duolingo:

ey becomes ay. For most varieties of English, this means the vowel in English “say” is pronounced like “sigh”. You’ll see this in words like the ones for tea, one, egg, etc.

ay becomes long a. This is exactly the same as a Southern American accent pronunciation of words like “night” and “time”, and is what linguists call ay-monophthongization. So mayn ‘my’ becomes ma:n and man ‘man, husband’ stays man. “My husband” is two almost identical words but one is longer than the other: מײַן מאן ma:n man.

aw (as in coffee, hawk, caught) becomes o or u: That is, אָ is pronounced like it should be spelled with a vav. That’s how you go from YIVO vos iz dos to Hasidic vus iz dus.

YIVO אָ also represents more distinct sounds in Hasidic Yiddish (for instance, the oo of English “book”), so pay attention!

u becomes i. i stays i. This kind of merger happens all the time, and you should see the history of Greek if you think it’s bad in Yiddish. So YIVO vu ‘where’ and vi ‘how’ are both pronounced vi. Blumen ‘flowers’ (cognate with English “blooms”) becomes blimes. The word for “you” which is cognate with German du and English thou, becomes di. The word kumn ‘to come’ becomes kimn.

[Bonus: in some cases i becomes e in Hungarian/Hasidic sound shifts. Thanks to Isaac Bleaman for pointing this out]

This is not a historically accurate chain of events, but if you want to be able to navigate between, say, Duolingo and a college Yiddish course, here’s the pattern:

YIVO to Duo: ey > ay > aa and (some) a > u > i. You’ll have to just memorize oy for each word as they come.

Duo to YIVO: aa > ay > ey and (some) i > u > a. Spelling will tell you how it’s pronounced in YIVO if you just read a vav as u, a yud as i, and so on.
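If it helps to see the YIVO-to-Duo direction as a lookup, here is a minimal sketch in Python. It works on romanized spellings only, and the table and function names are made up for the example. It also applies the o > u change to every o, even though the pattern above only says “some,” and it deliberately leaves oy alone, since that one has to be memorized word by word. The point it illustrates is that a chain shift has to be applied in a single pass: if you applied the rules one after another, ey would become ay and then keep going to aa.

```python
import re

# Hypothetical romanized correspondences, taken from the pattern above.
YIVO_TO_DUO = {
    "oy": "oy",  # left untouched: the two oy classes must be memorized per word
    "ey": "ay",
    "ay": "aa",
    "o": "u",    # simplification: the text says only *some* words shift this way
    "u": "i",
}

# Longer keys first, so "oy"/"ey"/"ay" are matched before a bare "o" or "u".
_PATTERN = re.compile("|".join(sorted(YIVO_TO_DUO, key=len, reverse=True)))

def yivo_to_duo(text):
    """Map each vowel spelling once, in a single left-to-right pass."""
    return _PATTERN.sub(lambda m: YIVO_TO_DUO[m.group(0)], text)

for phrase in ["vos iz dos", "mayn man", "vu", "kumn", "royt", "eyn"]:
    print(phrase, "->", yivo_to_duo(phrase))
```

Because every match is replaced exactly once, vos comes out as vus rather than sliding all the way to vis, which is the behavior the chain described above calls for.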

Vocabulary differences

Academic Yiddish has many words that were simply made up by Uriel Weinreich when he wrote a dictionary of Yiddish. I love all of them (like der parol for ‘password’), even though some may never have been in widespread use. Many are borrowed from German or French.

Living Yiddish has many loanwords from the languages contemporary Yiddish speakers are in contact with. So YIVO fenster ‘window’ is more often than not now vinde. And the formal, academic geyn shpatzirn ‘going for a walk’ is vakn. Duo seems to be teaching the fun new stuff. If you want the old stuff, moving on to Weinreich’s College Yiddish after Duolingo would be a good next step [EDIT: an earlier version characterized the material in College Yiddish as “possibly made up.” The textbook is all legit, the dictionary may have some more fun neologisms. Thanks to Isaac Bleaman for pointing this out.]. If you really want to be good at Yiddish, congratulations, you’re learning two registers that way and you can vak with your friends but gey shpatzirn in formal writing.

[edit: there was some discussion on Twitter of when to use vakn and when to use shpatzirn and it seems as though for some speakers who use both, vakn is goal oriented and shpatzirn is like a ‘stroll’.]

What’s the deal with R?

The variety of Yiddish spoken by the voice actors for Duolingo has what is called an “apical” r — meaning it’s tip of the tongue on the roof of the mouth, or as one person on Twitter put it, “Scottish R.” YIVO Yiddish has a “uvular” r, pronounced at the back of the throat, which most people might know as the “French R.” The history of uvular r is a rabbit hole. Actually, just the study of r in general is a bewildering mess (trust me, I’m an expert), and which one a language uses has all sorts of downstream effects for coarticulation and language change over time. All a learner needs to know is that it’s Hasidishe = Scottish, YIVO = French. And the Hasidishe r is effectively the same sound that most Americans make when saying ts and ds in the middle of words, like ladder and latter. That’s why it sounds like the character’s name in Duolingo is “tseedl”.

More vocabulary differences

Yiddish English is not Yiddish. What this means is that words you may know in English, everything from schmuck to schvits to kvetch, may have a different meaning than you anticipated in Yiddish Yiddish. Babka, however, transcends language.

Good luck!

The above was definitely not an exhaustive, or even strictly correct description, but hopefully I showed enough of the correspondences that any of you out there learning Yiddish understand a little better how to navigate between YIVO and Duolingo pronunciations, and appreciate the compromise they made. I’m very happy with their course, even if they say ni and not nu, because it’s a fun, easy entree into spoken Yiddish, and I hope what I’ve written here helps others navigate the differences and get the most out of the course.

Here are some other resources:

For learning Standard Yiddish:

  • College Yiddish by Weinreich

  • Colloquial Yiddish by Kahn (there’s some fuzziness around “standard” pronunciation in this one)

  • Yiddish in 10 Lessons by Werdyger (also some fuzziness around “standard” pronunciation)

  • Basic Yiddish (grammar) by Margolies

For learning about Yiddish:

  • Yiddish: A Linguistic Introduction by Jacobs (warning: it’s very much a linguist’s book)

  • Grammar of the Yiddish Language by Katz

Some academics to follow:

I don’t know everyone working on Yiddish and I am bound to leave people out. People working today who have helped shape my (still rudimentary) understanding include Chaya Nove, Isaac Bleaman, Rachel Steindel Burdin, Zoë Belk, Lily Kahn, and others. There’s also the wonderfully named @jewyid on Twitter.

Good luck and happy learning!

-----

©Taylor Jones 2021

"I ate too many drugs": ARE YOU KIDDING ME?!

On Thursday, the defense in the Derek Chauvin trial played a body cam video clip of the arrest of George Floyd, that subsequently led to his death. As part of their broader strategy of casting aspersions on Mr. Floyd’s character to sow doubt among the members of the jury, they played an audio clip with multiple speakers and significant background noise, and claimed that Mr. Floyd said “I ate too many drugs.” They attempted to convince the witness on the stand, a use-of-force expert from the LAPD, that Mr. Floyd had said “I ate too many drugs.” Initially, the expert agreed, but he very shortly took that back, and stated he thought he might have heard “I ain’t do no drugs.” I was made aware of this shortly after it happened, and subsequently many people reached out to me about it, because I am an expert on African American English (AAE) structure and accents, and I am an expert on miscomprehension and misrepresentation of AAE in the courtroom. After reviewing and analyzing the audio, it is my expert professional opinion that Mr. Floyd did *not* say "I ate too many drugs," and instead said "I ain't do any drugs."

I took to Twitter to write a short thread about the audio in question, and promised a longer blog post. This is that post. Regardless of legal strategy on either side, I think accuracy is important, as getting to the truth will help ensure a just outcome. It is unlikely that the prosecution team in Minneapolis would benefit from my or any other expert witness testimony, as the defense was unable to convince their witness to concede the point, and he instead maintained that George Floyd did not claim to do any drugs.

There is linguistic and non-linguistic evidence for this analysis. I will discuss each in turn.

The linguistic evidence:

1. Mr. Floyd speaks African American English (AAE), and makes use of the negative marker "ain't." Earlier, in body cam footage, he said "I ain't do nothing!" ("I didn't do anything" in classroom English). This will be relevant later. AAE is not the only variety that uses “ain’t” but it does make use of “ain’t” in some ways that other varieties of English do not, especially in the past tense. See, for instance, Sabriya Fisher’s dissertation, available here, which draws the explicit link between “I ain’t do” and “I didn’t do.”

2. In many varieties of AAE and in Mr. Floyd's speech, "ain't" is pronounced [e͡ɪ̃ʔ]. If you don't read IPA, the important part is that the n is often pronounced as nasalization on the vowel, and not as a separate, following segment (think of French "on" or Portuguese -ão). I have provided an example of two different ways of saying “ain’t” — one with a fully distinct /n/ and /t/, and one pronounced with nasalization on the vowel and /t/ realized as a glottal stop: [e͡ɪ̃ʔ].

3. Mr. Floyd's pronunciation of the oo vowel in "do" follows a pattern common in most varieties of North American English, where it glides between two vowels...linguists sometimes represent this as /uw/. w is VERY close to m (try for yourself, compare "awa" and "ama"). The main difference between the two is whether you mostly or completely close your lips (and how much air then goes out your nose). Going back to "awa" and "ama" how confident are you that you could clearly distinguish them in speech while under duress? From the earlier footage, it is also clear that Mr. Floyd nearly closes his lips entirely when saying that /uw/ vowel.

4. While textbooks (such as Dr. Lisa Green’s fantastic African American English: A Linguistic Introduction) and articles about AAE will explain that AAE uses negative concord (also known as "multiple negation" or "double negatives"), it is not always obligatory, and there are some instances when speakers may use "any" instead of "no." For instance, for emphasis, as in "I ain't do ANY drugs." I add this in part because well-intentioned linguists, some of whom have studied AAE and some of whom may not have, have made the claim he said “I ain’t do no drugs,” but as the evidence will show, he said “I ain’t do any drugs.”

5. There was significant noise in the audio, and multiple voices talking at the same time. I believe the noise and other voices contributed to the incorrect perception that Mr. Floyd said "too many" and not "do any". It is highly irresponsible for the defense to have played audio and asked a witness to determine what Mr. Floyd was saying while there was so much noise and so many other voices. In fact, one of these overlapping voices contributes to the perception of an /m/ in Mr. Floyd’s speech. At exactly the time Mr. Floyd is saying “do any” a responder is saying “was he responsive?” The /w/ in “was” happens right in the middle of “do any” and both speakers producing /w/ contributes to the perception of an /m/, as both are voiced and bilabial, and noise makes it harder to distinguish nasals from non-nasalized segments (that is, /m/ versus /w/).

6. Nasalization (like in the word "ain't") is hard to hear in a noisy channel. It's not surprising that "ain't" could have been misheard. In fact, this is exactly the kind of mishearing I wrote about in 2019 in Language, with Jessica Kalbfeld, Ryan Hancock, and Robin Clark.

Because I was curious just how confusable these two statements are, without all the extra noise, I recorded my own voice saying “I ain’t do any” and “I ate too many.” I used my own voice because the audio from the body cam is so noisy, with so many overlapping voices, that it is impossible to perform a meaningful spectrographic analysis. As you can see, they are EXTREMELY similar. The main cues for “I ain’t do any” are decreased loudness of some frequencies. This means that in a noisy channel — say, a body cam with lots of movement and multiple overlapping voices — it becomes difficult to tease these apart phonetically, and we have to use both what we hear and outside world knowledge: what other sounds are contributing to our speech perception? What is the quality of the audio? What dialect is the speaker speaking? Is the sentence plausible? Are there other plausible, perhaps more plausible, alternatives?

top: “I ain’t do any”. Bottom: “I ate too many”

It’s clear from the above that the two options are very similar, so here are some of the differences. Differences which are effectively obliterated by the noise in the unprocessed audio the defense played. The key points are that

  1. /d/ and /t/ are usually only distinguished at the beginning of a word by aspiration (an aitch-y kind of sound) which would be masked by noisy audio, so “too” becomes plausible in part because of their refusal to process the audio.

  2. /w/ and /m/ are distinguished by full closure of the lips, and airflow out the nose. In recorded audio, this means some frequencies are not as loud for /m/ as for /w/, but they’re otherwise basically the same. On a spectrogram, this really just means that some of the shape of an /m/ is slightly lighter than a /w/. In a noisy channel, this distinction is again hard to hear, and we rely on our brains to fill it in based on what we expect to hear. Which is, in part, shaped by our biases about the speaker. There is plenty of work in this domain, especially as relates to housing discrimination, including research by John Baugh, and recent (forthcoming) research from Kelly Wright.

  3. Certain pronunciations of “ain’t” and “ate” are only distinguishable, again, by nasalization. Again, this distinction is made harder to hear by noisy audio, cross-talk, etc.

top: “I ain’t do any”. Bottom: “I ate too many”

7. "I ain't do any drugs" is a normal, grammatical sentence in AAE. Comparing Mr. Floyd's statements earlier in the stop ("I ain't do nothin'") to the statement in question, the first 3 syllables sound EXACTLY THE SAME. We have audio of him saying "I ain't do" to compare against, and it's clear that's what he's saying. Watching the video, he also draws his lips very close together when pronouncing “do,” which may further contribute to the /w/ ~ /m/ confusion the defense attempted to sow with the later audio clip. If we really wanted to get to the bottom of this, we'd also want to know how Mr. Floyd pronounces the past tense of "eat." Given his upbringing in the south, there is a high likelihood that it is pronounced with a different vowel than "ain't." Earlier recordings, or speaking with his family might illuminate this. I would not be surprised if his pronunciation of "ate" is closer to [ɛt] "et". But we can't ask him anymore.

The extralinguistic evidence:

1. "I ate too many drugs" is a strange sentence. I have known plenty of people who have experience with drug use, and none have ever referred to it as "eating drugs." This is just not how people talk, and is highly implausible. The only example I can think of that discusses drug use with “eating” is a reference to the Odyssey, used during the Opium Wars in the 1800s.

2. Context of the speech act is important. The defense claims that Mr. Floyd was freaking out (perhaps due to substance abuse). It is clear, watching the video, that he is begging for his life, and attempting to negotiate, but defer to the officers ("please, mister officer").

3. Why would someone who has been insisting for minutes "I ain't do nothing" suddenly switch to the bizarre sentence "I ate too many drugs" interjected in the middle of other protestations of innocence?

4. Misunderstandings or misrepresentations of AAE are often used to discredit and discount Black people's speech, especially in a judicial setting. See John Rickford and Sharese King’s fantastic 2016 article (available here), or Jones, Kalbfeld, Hancock & Clark 2019, (available here) both in Language, the flagship journal of the Linguistic Society of America.

It is possible that the defense is not acting cynically, and simply lacks basic knowledge about spoken African American English, but this is not the first egregious mistake they've made. Last week, they claimed his statement that he had been "hooping" (that is, shooting hoops, also known as playing basketball), was an admission of ingesting drugs rectally (no, really. See this article about it, for instance.). Perhaps this goes without saying, but not only is "rectally ingesting" not what "hooping" means, but also not what "eating" refers to. This is also not the first time the crowd-sourced Urban Dictionary has been used in a legal setting to "explain" African American speech, with absurd results; in Jones, Kalbfeld, Hancock, and Clark 2019 we discuss a case in which a judge went to Urban Dictionary to figure out the meaning of the word finna. As an aside, people were mad that it was recently added to Merriam-Webster, but being documented in a real dictionary has real-world ramifications in the judicial system.

There is much more to say about this, but I want to reiterate: my expert professional opinion as a linguist whose PhD and research program revolve around AAE, and as someone who lives in and grew up in AAE speech communities, is that Mr. Floyd unequivocally said "I ain't do any drugs."

Lastly, it is important to share the voices of Black people who speak AAE and who have the appropriate expertise (and who fit one or the other of those criteria), which is why I am careful to cite such scholars and point reporters, and lawyers seeking expert witnesses, in their direction. As we all should. None of us should stand idly by and watch injustice. We should all be fighting for equal access to justice under the law, and that is why I am adding my voice to the voices of others.

-----

©Taylor Jones 2021

Have a question or comment? Share your thoughts below! [comments disabled on this post for what, in retrospect, should have been obvious reasons]

A LOOK AT REGIONAL VARIATION IN AFRICAN AMERICAN ENGLISH ACCENTS

Last April, at the height of the first wave of the COVID-19 pandemic, I defended my dissertation. It will come as no surprise to anyone that I’m only now getting around to writing about it — everyone I know who has a PhD needed some distance from their dissertation before they could really condense it and get out of the weeds enough to talk to regular people about it.

My 2020 dissertation was the first ever general description of regional variation in African American English accents. Plenty of other researchers have studied individual phonological variables (like whether or how often you pronounce an /r/ after a vowel, or if you pronounce words with a syllabic /r/ like Nelly saying “it’s getting hot in hurr”), other researchers have studied differences between places (like if you pronounce fewer /r/s if you’re from New York, or more hurrs for heres if you’re from St. Louis), and other researchers had studied entire vowel systems — roughly, how you pronounce all the vowels in English, so what does it really sound like when you say GOOSE and FOOT and is the vowel sound you make there different than someone else’s? — but mainly in one place. (shoutout, though, to Charlie Farrington, who wrote an excellent dissertation, available here, that looked at a single understudied variable — replacement of /t/ or /d/ with a glottal stop — and how it varied across four cities. He used the growing Corpus of Regional African American Language, or CORAAL, and his diss has the excellent title: Language Variation and the Great Migration: Regionality and African American Language). My dissertation was the first work to look at the entire vowel system for African American English speakers across the entire country. 


To do this, I used a standardized reading passage. But it’s not as simple as it sounds, because I had to write a new one (with help and input from lots of linguists who are also native speakers of AAE, to reduce regionalisms from my own personal experience with AAE — y’all know who y’all is).  Existing reading passages were, to quote a friend of mine who I had read one, "wack” (check it out for yourself). The reading passage I used is a short story about Marcus Junior, AKA Junebug, going to the barber shop by himself for the first time, just before his 12th birthday. It was intentionally designed with lots of characters and quotes to encourage using AAE instead of formal classroom English. (I’ve actually been asked about illustrating it and making it into a children’s book — if you’re connected to that world, get at me) I did some technological workarounds so people could go to my website and record themselves reading the passage, “Junebug Goes to the Barber”, and upload it from the comfort of their own homes. I solicited participation from friends, family, extended family, Facebook, Twitter — you name it. I got big pushes from connected people like Jon Jon Johnson, Lee Colston II, @afrothighty, and NPR’s Gene Demby. Ultimately, I got more than 200 recordings, about 180 of which I used for my analysis. That’s not a lot, but it’s also 12 full hours of audio and hundreds of thousands of vowels to measure. I asked people to change things they felt were unnatural, so that means I also had to retranscribe and align each of the recordings manually. The biggest difference was that there is a near universal preference in AAE for “everybody” and the reading passage had a few “everyone”s in it — this word preference is not something that has been written about by any linguists, to my knowledge. Shout out to Gene Demby for getting that conversation started. The whole survey and reading passage are available here.

I wanted to compare to the gold standard, the Atlas of North American English, but our data collection techniques were very different. To compare against the ANAE, I decided to use modern geostatistical methods (kriging, Getis-Ord Gi* statistic, etc.), and I had to first show that these methods got results at least as good as the ANAE on the ANAE data. So I did that, corroborating the ANAE findings, but also making some new observations about the Northern Cities Vowel Shift along the way, challenging the dominant interpretation of how that shift started and spread. Then I used the same techniques to map pronunciation patterns in AAE. Lastly, instead of drawing dialect region boundaries by hand and superimposing my hand drawn maps to make dialect regions (a classic technique!), I used techniques from computational historical linguistics and from biology to allow the data themselves to tell me where the boundaries and clusters are. I used k-means clustering and hierarchical clustering analyses to determine how many regional varieties of AAE there are, and what their boundaries are.
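For the curious, here is roughly what a hotspot statistic like Getis-Ord Gi* does. This is a bare-bones sketch of the textbook formula as I understand it, not the code from my dissertation. The five locations, their values, and the "neighbors within one step" weighting are all invented for illustration. Each location gets a z-score: strongly positive means it sits in a neighborhood of high values, strongly negative means a neighborhood of low values.

```python
import math


def getis_ord_gi_star(values, weights):
    """Return one Gi* z-score per location.

    values  -- one measurement per location
    weights -- n x n spatial weights; each location counts itself
               as a neighbor (that is the "star" in Gi*)
    """
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum(v * v for v in values) / n - mean * mean)
    scores = []
    for i in range(n):
        w = weights[i]
        sum_w = sum(w)
        sum_w2 = sum(x * x for x in w)
        numerator = sum(wj * vj for wj, vj in zip(w, values)) - mean * sum_w
        denominator = s * math.sqrt((n * sum_w2 - sum_w ** 2) / (n - 1))
        scores.append(numerator / denominator)
    return scores


# Toy data (invented): five locations in a row, low values on the
# left and high values on the right.
values = [1.0, 1.0, 1.0, 5.0, 5.0]
n = len(values)
# Binary weights: a location's neighbors are itself and anything one step away.
weights = [[1.0 if abs(i - j) <= 1 else 0.0 for j in range(n)] for i in range(n)]

z_scores = getis_ord_gi_star(values, weights)
for value, z in zip(values, z_scores):
    print(f"value={value:.0f}  Gi*={z:+.2f}")
```

In real use the weights come from actual distances between places, and the z-scores get checked for statistical significance before anything is called a hotspot.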

My participants were overwhelmingly young, female, and well-educated, meaning that for all of my findings about how AAE differs from white Englishes, these findings are conservative, and understate the differences. As any sociolinguist will tell you, in general, the higher the level of education we attain, the more work we do to erase our unique, local accents — insofar as the features of the accent are something we are consciously aware of.

A note on maps: Some of the maps below use a technique that’s used in mining and in weather maps, to interpolate values for visualization. Do not overinterpret where there are no people. I don’t have any participants from Wyoming, so the values there are just a computer’s statistical best guess based on what’s nearby and what’s farther away. More research is definitely needed, and bigger projects with more people in each city (like the Corpus of Regional African American Language, or CORAAL), will shed more light on these nuanced differences. For all of the maps, the lighter color is usually more intensity of the shift under discussion, and the darker color is usually less intensity (or absence).
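To make the "best guess" point concrete, here is a tiny sketch of interpolation. To be clear, this is inverse distance weighting, a much simpler stand-in for the kriging used in the maps, and the coordinates and values are invented. The idea it illustrates is the same, though: the estimate at an empty spot is a weighted average of the measured spots, with nearer ones counting more, and the method will happily produce a number for a place where nobody was recorded.

```python
import math


def idw_estimate(target, samples, power=2.0):
    """Estimate a value at `target` from (x, y, value) samples.

    Each sample is weighted by 1 / distance**power, so nearby
    samples dominate the weighted average.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for x, y, value in samples:
        dist = math.hypot(target[0] - x, target[1] - y)
        if dist == 0.0:
            return value  # exactly on a measured point
        weight = 1.0 / dist ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


# Toy data (invented): two measured locations ten units apart.
samples = [(0.0, 0.0, 10.0), (10.0, 0.0, 20.0)]

print(idw_estimate((5.0, 0.0), samples))  # halfway between the two
print(idw_estimate((1.0, 0.0), samples))  # close to the first sample
```

The second estimate is pulled strongly toward the nearby sample. Neither estimate tells you anything about whether people actually live at that spot, which is exactly why empty regions on the maps shouldn't be read too closely.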

A note on audio: Some audio examples here are from my dissertation, others are celebrities, and some are recordings from the street. If they are only labeled with a place, they are from my dissertation research, and I am protecting the participants’ identities.

So what did I find?

This is barely scratching the surface, since this is just the first in a series of blog posts, but my main findings were:

There is no one Black Accent. 

Black folks (and linguists) been knew this. AAE exhibits strong regional variation, so people from NYC sound different from Philly and they both sound different from Atlanta and Chicago. California is different from all of them (but has similarities to DC and Baltimore, by coincidence), and Kansas City is doing its own thing. This sometimes surprises people to hear, but think about celebrities’ voices: Jay-Z (NYC) doesn’t sound like Kevin Hart (Philadelphia), and you’d never confuse either for Ryan Coogler (Richmond, CA). 

This dramatic variation existed even among highly educated people who have a strong command of “classroom” “standard” English. Even during a reading task, which is known to cause people to speak more formally and more carefully than in casual conversation.

This means that…

Things claimed to be universal in AAE are not.

The PIN-PEN merger has historically been claimed to be a universal feature of AAE. That’s great, except it is absolutely not universal in the Northeast. Yes, I hear it in Harlem. I also hear vernacular AAE speakers who distinguish between PIN and PEN, in both NYC and Philadelphia. (Sharese King has already written about this in California, see below for some NYC examples). 

AAE is supposed to not exhibit the COT-CAUGHT merger, and by and large it doesn’t, even in places where everyone else has it. So for instance, Black folks from California tend to pronounce “on” like white New Yorkers (or sometimes, white Southeasterners) and not like white Californians. Don’t believe me? Listen to how Tiffany Haddish pronounces “on” “dog” and “ball”, or how Snoop Dogg says “on.” Yes, they’re different from one another, but they’re also very different from the pronunciations in other California accents.

But here’s the thing. AAE speakers in parts of Florida, Georgia, and South Carolina often do have the COT-CAUGHT merger, opposite local white people. As an aside, I remember years ago explaining the COT-CAUGHT merger to a friend from Atlanta in a cafe in Harlem, so I expected this finding, but it seemed to really surprise quite a few linguists (when you read this, hi Bri-bri!).

That brings me to the next finding:

AAE has a lot of the same kinds of changes as white dialects, but they follow a completely different geographic distribution, and may have developed completely independently. So white people have the COT-CAUGHT merger in California but not in Georgia, and Black people have it in Georgia but not in California. White people say words like DOWN so it sounds like day-own in parts of the Deep South, black people do it in New York (compare Jay-Z saying “bounce (with me),” “down,” or “uptown” to Robert De Niro saying those same words). White folks are slowly moving where they pronounce words like GOOSE and GOAT further forward in the mouth in the Southeast, moving westward toward Texas, and Black folks do it in the Midatlantic (most especially DC and Baltimore) and in California.

These shared patterns include chain shifts (not just one-off changes) described in the Principles of Linguistic Change, but, again, for totally different regions. For instance, the Back Upgliding Shift, also known as the “second Southern vowel shift” is present in AAE, but it’s not limited to the South, and isn’t present for Black folks in my sample from all the places it is present among white English speakers.

The “back upgliding” shift, or “second south” shift, from the Atlas of North American English

The Back Upgliding Shift in my data.

For reference, here’s the same shift in the Atlas of North American (white) English:

That’s because:

Black accents pattern with the Great Migration. As black people fled racial terrorism in the South, and migrated across the country, their patterns of movement were very different than the patterns of movement of white people across the country. To oversimplify, black people moved south-to-north, white people moved east-to-west. Segregation and Jim Crow only amplified this, so Black people in Chicago tend to sound more like Black people in Mississippi than white people in Chicago. In fact, one linguist made a convincing argument that, at a minimum, you can’t rule out “fear of a black phonology” as a main driver of the Northern Cities Vowel Shift (Van Herk, 2008). If there were already black people in decent numbers, as in New York, there was a founder effect — newcomers learned to speak like the people who were already there. If there wasn’t already a large Black population, like in Chicago, this didn’t really happen. These things play out in complex ways that are dependent on which parts of an accent are really noticeable to people and which aren’t.

Even more than that, there were already differences in Black accents across the South. Regional variation in Black accents today is the product of modes of travel in the 19th and 20th centuries (rivers and railways). But the starting point was shaped by the location of shipping routes and slave ports where abducted and enslaved Africans were first taken.

What are the patterns?

I was curious what story the data would tell without me interpreting them, so I used a few different clustering algorithms on people’s vowel spaces. I gave the computer all the vowel measurements for each of the vowel classes for each person, but did not give the computer any geographical data, and I asked for it to group similar with similar. Using Agglomerative Nesting, or AGNES, to look at hierarchical structure without geographic data, the results showed strong geographic patterns. People from The Bronx sounded like other people from The Bronx, and when you measure all of their pronunciations, they’re closer to each other than to people from anywhere else. But people from Brooklyn form the next closest grouping. And people from Philly are closer in their pronunciations to people from Brooklyn and The Bronx than people from Atlanta are. And so on. 
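Here is a stripped-down sketch of what "group similar with similar" means in agglomerative clustering. It is not my dissertation code: the speakers and their measurements are invented, each speaker is reduced to a single made-up (F1, F2) pair instead of a full vowel system, and I've assumed average linkage as the rule for measuring distance between groups. Note that the function never sees where anyone is from.

```python
import math
from itertools import combinations


def average_linkage(a, b, points):
    """Mean Euclidean distance over all cross-cluster pairs of points."""
    total = sum(math.dist(points[i], points[j]) for i in a for j in b)
    return total / (len(a) * len(b))


def agglomerate(points, n_clusters):
    """Start with one cluster per point; repeatedly merge the two
    closest clusters until only n_clusters remain."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: average_linkage(
                clusters[pair[0]], clusters[pair[1]], points
            ),
        )
        clusters[a] = clusters[a] + clusters[b]  # a < b, so deleting b is safe
        del clusters[b]
    return clusters


# Toy data (invented): one (F1, F2) pair in Hz per hypothetical speaker.
speakers = ["A1", "A2", "B1", "B2", "C1"]
points = [(300, 2200), (310, 2160), (350, 1500), (340, 1550), (500, 1000)]

for cluster in agglomerate(points, 3):
    print([speakers[i] for i in cluster])
```

Stopping at different numbers of clusters is what gives you the nested, tree-like picture: ask for three groups and the A's, the B's, and C1 stay apart, ask for two and the B's join C1 before anyone joins the A's. With real data you would also normalize the measurements across speakers before computing distances.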

An example sub tree from my dissertation research. (I know this is tiny; I will share more readable versions of the trees in future posts).

The question is then, how do you group these clusters? There are a handful of different statistical techniques to determine this from the data, and they all seemed to suggest around 10 groupings. Using knowledge about the real world, it looks like it should probably be about 12: the computer wants to group California with the DMV (D.C., Maryland, and Virginia), probably because they both pronounce the GOOSE and GOAT vowels with the body of the tongue further forward in the mouth (audio examples below); and it wants to group North Carolina and Michigan, which may be one group based on patterns of migration, or may not. 

In the future, I plan to build on this research, and to make more artistic maps — these were for my dissertation, which is a target audience of about 3 people. 

Mapping with 5 clusters really captures the Great Migration, but loses some of the granularity of the East Coast, and important differences up the Mississippi. It also looks almost exactly like the maps I produced of lexical variation in Twitter data in 2015.

Mapping 10 clusters gives a better perspective on regional differences, especially in the Northeast, and shows more granularity up the Mississippi. Chicago and Jackson, Mississippi are more similar to one another than to New York, but this higher level of granularity captures that Chicago and Minneapolis are more alike than Chicago and Jackson, 50 years after the Great Migration.

Agglomerative hierarchical clustering with vowel data (and no geographic data), with 10 clusters.

Remember that in each of these we need to add a little world knowledge: California and DC are probably not a real cluster, they just share common features, likely by chance. Specifically, fronting of the vowels in GOOSE and GOAT. (I have given semi-exaggerated audio examples here).

Hotspots for fronting and raising of /uw/ as in GOOSE and /ow/ as in GOAT on the East Coast.

GOOSE fronting in California.

GOOSE fronting.

For comparison, here’s fronting of /uw/ in the Atlas of North American (white) English, where fronted /uw/ is circled in orange.

How do you tell where someone is from by their accent?

In the last few years, I’ve been able to pinpoint where new people I meet are from. It’s almost a party trick at this point — I’m no Henry Higgins, but I’ve astonished and impressed quite a few people by pinpointing what state, or part of a state, they’re from. Obviously, I can’t teach everything there is to know, but there are some geographic patterns that are very salient. In future posts, I will do some deep dives into individual local accents.

Here are some of the patterns. These are generalizations and do not mean that all people from that location have that pronunciation. Rather, it reflects where a particular sound is more common.

The “African American Vowel Shift” (AAVS) involves swapped vowel nuclei for /iy/ as in FEET and /i/ as in KIT, swapped nuclei for /ey/ as in FACE and /e/ as in DRESS, and raised /uh/ as in STRUT. AAE speakers with the AAVS are from (eastern) North Carolina, and a broad path upward from the Gulf states to the Great Lakes. Note that it’s gradient, so for instance, Snoop Dogg, from California, has the shifted nuclei of /iy/ and /i/, but in general it was less prominent in my participants from California than it was in the Gulf.

The “African American Vowel Shift” (AAVS)

Fronted GOOSE and GOAT vowels? That’s Washington D.C., Baltimore, and to a lesser extent, California (see above).

MARY-MARRY-MERRY merger, with centralized /r/ for MARRY (so MARY-MERRY are pronounced like “may-ree” and MARRY is pronounced like “Murray”)? Baltimore and DC only. Same goes for “fear” rhyming with “fur,” but this isn’t universal by any means, it’s just the main place this shift is attested at all.

Back GOOSE and GOAT, no PIN-PEN merger, and none or few of the reversals of the AAVS? That’s the Northeast, especially New York City and Philadelphia. (That’s the dark color on the AAVS map). The fact that many, many AAE speaking New Yorkers do not have the PIN-PEN merger should not come as a surprise to anyone who has heard any hip hop from New York since, well, ever (like how Whodini says “friends” in 1982, or how Biggie Smalls says everything).

Strongest PIN-PEN merger in the AAE data.

PIN-PEN merger on the east coast. Strongest in Virginia Beach, weakest in NYC.

Distributions of some PIN and PEN words among New Yorkers. Notice how you can divide them up pretty well.

As an aside, it has always perplexed me how linguists can teach that the PIN-PEN merger is a core, universal feature of AAE, and then go home and listen to hip hop from NYC where entire rhyme schemes are built on not having that merger. That Whodini track is 40 years old, and both cuts I included here for educational purposes were hit songs. The counter-evidence to our textbooks is literally all around us, every day.

COT-CAUGHT Merger? Your best bet is Florida, but you could go as far afield as Georgia and parts of South Carolina. Compare the vowel spaces for Florida and New York, below (AA refers to the COT vowel and AO refers to the CAUGHT vowel).

COT and CAUGHT vowels for Florida.

COT and CAUGHT vowels for New York.

Just look at that beautiful separation in the second on the top (from Brooklyn) or the entire second row!

Raised and fronted /uh/ as in STRUT? Best bet is Kansas City or St. Louis, and parts of Oklahoma. This is why a colleague of mine from Oklahoma says he’s country with the same vowel I have in the word book.

Vowel in DOWN/TOWN/MOUTH sounds like /æ/ (as in “cat”) or even /e/ (as in “bed”) or /ey/ (as in “say”) at the onset? Atlanta or NYC are most likely. For instance, listen to how the conductor on the 1 train in New York says “town” and “bound” in the clip below (“one-two-five where it always stays live. This is Harlem, one hundred and twenty fifth street. This is a one-three-seven bound uptown one. The next and last stop is 137, stand clear”), or how Jay-Z says “down” in the clip below that, from an interview on the Breakfast Club.

Vowel in CAUGHT/HAWK/DAWN starts with an oo sound? New York and Philadelphia (and most places if it’s before an n).

Vowel in CAUGHT/HAWK/DAWN starts with /æ/ as in “cat”? Strongest in Mississippi and Alabama, but you’ll also find it in Tennessee, Kansas, Missouri, etc.

There are tons more patterns that I haven’t even touched on (what vowel do you have for “there”? How often do you pronounce /v/ after a vowel as in love or believe? What vowel do you have in words like thing? How often do you pronounce /r/ or /l/ after vowels? If you don’t pronounce it, do you replace it with a /w/ sound?). And these all work together as a coherent system.

Some AAE vowel systems.

Note the difference between bought and bot patterns in the Northeast and Southeast, or the patterns around where bait and bet are in North Carolina and the Gulf states, or where bat is relative to other words in these charts…these are very distinct sound systems.

So what now?

My biggest hope for the future is that researchers stop treating AAE like local divergences from white dialects, and really lean into treating it as its own set of systems instead of writing papers about a single vowel or consonant in a single place or two. This is why I tend to prefer Sonja Lanehart’s (and others’) approach to AAL, where the L is for Language. There’s so much more to say about this, and about regional variation, but I’ll stop here for now. My full dissertation is available here, but may only be interesting to readers interested in highly technical and detailed explanations of the statistics. I am, however, in the process of turning this material into a more digestible form for people outside of academic linguistics. Over the next few months, I will be writing posts that detail the accents of specific places, and what their unique features are, including some that I observed, but that did not make it into my dissertation (like regional patterns in how people pronounce thing). I hope that my work helps contribute to the growing chorus of voices in and outside of academia who are challenging the myth that there is one black accent, and who are challenging the academic approach that treats African American Language as just a few extra bells and whistles on local white varieties and not as its own rich linguistic variety not defined by its relationship to other language varieties.

-----

©Taylor Jones 2021

Have a question or comment? Share your thoughts below!

Testimony on Linguistic Discrimination

A few weeks ago, I was invited to testify about linguistic discrimination before a joint session of the Pennsylvania House and Senate Democrats, organized in part by the Legislative Black Caucus.

I submitted written testimony, explicitly drawing the link between linguistic discrimination and racial, gender, and other kinds of discrimination against protected classes, arguing that linguistic discrimination often serves as a proxy for racial discrimination.

An overview of the entire hearing can be found at the PA Senate Dems website.

The full recording can be found at Senator Muth’s web page, or on the PA Senate Democratic Facebook page.

A discussion of the hearing can also be found at Senator Street’s website.

My written testimony can be found at CulturePoint’s page on reports, white papers, and testimony.

-----

©Taylor Jones 2021

The "Latinx" controversy is interesting for different reasons than you think

In recent years, if you’ve been online (or listened to NPR, or watched the news on TV), you’ve no doubt come across the term latinx. It’s intended to replace latino in Spanish (and English), and is ostensibly a remedy for the gendered (and presumably therefore non-inclusive) nature of latino. Proponents argue that it is inclusive of people other than men, and that the x is more inclusive than @, which is inclusive of men and women, but not nonbinary and other people (usually, on Twitter, folx). Detractors argue that it is unpronounceable (Latin-equis? Latinks? Latin-eks, which is explicitly English?), that it could lead to unpronounceable and uninterpretable language (“lxs personxs sxn humanxs”), and that it is an English imposition on Spanish that isn’t used or liked by 97% of Spanish speakers in the US.

I try to respect people’s language, culture, and self-determination, so I don’t really care one way or the other, and if the people I’m speaking to prefer latinx I’ll use it however they pronounce it. This is what linguists call “audience design”, and regular people call “being nice” and it helps you get along with people. I think people are, in general, so busy trying to prove that latinx is wrong (or right!) that they are missing what’s actually interesting about it:

The new gender-inclusive language in Spanish respects an animacy distinction.

More specifically, it’s between human and non-human nouns (and derived nouns). So people are busy arguing about latinxs, and amigxs, and personxs, but nobody is talking about xl librx instead of el libro ‘the book’. Actually, this isn’t entirely true, as I’ve seen it extended to pets, as in algunx de mis perrxs ‘one of my dogs’ and mis gatxs ‘my cats’. That said, animals who have different names for the sexes (e.g., a bull and a cow) aren’t included (that’s not to say you can’t find lxs vacxs, but it’s only used to dunk on gender inclusive language from PETA, as far as I can tell). So the line is not human/non-human, but it’s definitely pretty close — perhaps humans and pets, but not domesticated animals.

To a linguist, this is genuinely interesting (or rather, should be, but I haven’t seen any linguists discuss this yet. That may be because I’m just following the wrong linguists online). So to the extent that people are actually using this language in Spanish, they are creating a three gender system with masculine, feminine, and animate genders, it triggers agreement in other words (like algunx), and we get to watch it develop in real time. This is super cool! We know that the masculine/feminine distinction in Romance languages, if you go far enough back, originated as an animacy distinction (I mean really far back). We know that masculine/feminine/neuter gave way to two non-sexed genders in Dutch (masculine and feminine collapsed into one, leaving a common/neuter distinction). But we might actually have the opportunity to watch a language develop an animacy distinction in nouns and pronouns over a generation or two, in real time, and that’s truly exciting. And what’s interesting is not that it’s “ruining Spanish” but rather: how does this work for direct object and indirect object marking? What about indirect object pronouns for animate indirect objects? (As in les escribe una carta a sus amigos: should we expect the non-gender-marking, but plural-marking le and les to begin to respect a gender distinction between animate on one hand and masculine/feminine on the other, as in lxs escribe una carta a sus amigxs?) Is this primarily a written distinction? For whom and in what instances is it not?

So, my takeaways are:

  1. latinx is genuinely linguistically interesting, but not because of the anger and vitriol over gender, but rather because that social conflict may give rise to a new grammatical system, and

  2. It’s not hard to be nice, and sometimes it’s better to choose to be nice than to try to prove that you’re right. So why not be nice?

Of course, if anyone knows of linguistic work on these topics I would love to read it. In the meantime, I can sum this all up with: quit yelling at each other and pay attention to how interesting what’s actually happening is!

-----

©Taylor Jones 2020

What's in a name? Why do some linguists not call it African American Vernacular English (AAVE) anymore?

Today’s post will be a short one, but it’s something I should have written a long time ago. Thanks to social media, there is a rising awareness among the general public of the validity of the language variety most associated with Black Americans who are the descendants of enslaved people. However, just as people on Twitter are taking the term AAVE mainstream, linguists are moving away from it. This has created some awkward situations, in which well meaning lay people are suspicious of linguists doing cutting edge work, precisely because they are not using the same terminology. I tend, now, to use African American English or African American Language. Here’s a breakdown:

African American Language: This is the term now most used by linguists studying this language variety. The 2015 Oxford Handbook of African American Language has a very thorough discussion as to why. The basic idea is that, in an academic environment, calling it African American English seems to suggest a particular position on the origin of the language variety. That is, people saying AAE in an academic environment might suggest they believe the Anglicist hypothesis that the language variety is basically English with some West African flavor (see Labov 1998 for an articulation of this argument). But we don’t know that, and it’s actually quite a contentious claim. Others (like John Rickford) argue convincingly for the Creole Origins Hypothesis, which says that AAL started as a creole (not a variety of English) and later became more like English through contact. Calling it AAL sidesteps any strong stance on origins, and does not presume mutual intelligibility with other varieties of English. This last part is important, as there is a growing body of evidence that English speakers who don’t also speak AAL actually don’t understand a lot of AAL. This is an area of ongoing research and contention.

African American English: I tend to use this when talking with the general public, in part because people haven’t heard of AAL, and in part because I’ve found that most people will understand AAE better as a valid variety of English than as a language variety in its own right that may seem mutually intelligible but isn’t always. Most English speakers understand most of the AAE they hear, and can be trained to understand all of it. They also intuitively understand that there are different varieties of English, and so tying it in to Appalachian, Southern, Scottish, Received Pronunciation, etc. helps people understand how it can be different, but still valid. This term is very similar to AAVE, but…it’s missing the V.


African American Vernacular English: AAE is missing the V because the V is for “vernacular”, which, in this case, means something like “casual”. In the 1970s and 80s, most of the work on AAVE was being done by white researchers who did not speak the language variety. This is still basically the case, although it’s changing. One result of this fact was that even the people attempting to valorize AAE implicitly viewed it as a casual register. Arthur Spears, however, has argued very convincingly (for at least 20 years) that there is also a “Standard” register, African American Standard English (AASE), which is different from white, mainstream, classroom English, and which is recognized as a formal standard register. Think about Dr. Martin Luther King Jr. giving a speech. He’s still got all of the phonological markers of African American speech, and many of the morphosyntactic markers, but it’s not vernacular by any stretch of the imagination. One example of AASE is the full closure and release of /t/ in the middle of a trochee (two syllables where the first is stressed). Most white Americans would pronounce /t/ as a flap in that situation, no matter the formality of the speech (think ladder/latter). However, many AAE speakers, when speaking formally, will release the /t/ similar to how it is released at the beginning of a word. So “identity” might be “eye-dennidy” [ɑ͡ɪdɛɾ̃ɪɾi] for even the most formal of white speakers, but “ah-dint-ih-tea” [ɑːdɪ̃tɪtʰi] for some AASE speakers. (I’m told that many HBCUs used to, or still do, have “diction” classes which cultivate this style of speaking). Notice that in the second example, there are features AAE shares with Southern (white) American English, like ah for eye and dint for dent, and that these are common in AAE but not universal, so someone from Harlem might say [ɑ͡ɪdɛ̃tɪtʰi] when speaking formally. The point is, historically AAE was thought of as being exclusively the domain of casual speech, and studies often privileged the way teenagers and gang members (!) spoke, and it wasn’t until there were more Black voices in the academy that this blind spot began to be corrected. It’s still a problem for the general public, as even the most well meaning people, many of whom speak AAE, will still assume that AAE is slang and miss that it is a complete linguistic system. On fleek is not AA(V)E from a linguistic perspective, but they don’t think it be like it is but it do is AAE. Much of the discussion of AAE and borrowing/appropriation misses that what is being borrowed often is slang, and that AAE is much, much harder to accurately imitate!

Ebonics: This term was originally coined to describe a variety of contact languages with African influence, and is a blend word of “ebony” and “phonics” for “black sounds”. This means that not only did Ebonics refer to African American Language, but it also referred to Gullah, to Black Canadian English, to Dominican Spanish, Haitian Kreyol, Jamaican Patwa, and Brazilian Portuguese, among others. In the 1990s there was a massive media storm over the term, after the Oakland school board attempted a (smart) maneuver to increase funding to teach literacy and classroom English to children who spoke AAL at home. They argued that AAL was not the same thing as English, and therefore funding earmarked for second language English learners should also go to underfunded English classes in primarily Black neighborhoods. The media latched onto this, and it was misrepresented as Oakland trying to teach all kids to “speak Ebonics” and there were tons of articles with titles like “Ain’t ain’t a word” (side note, imagine if it actually wasn’t: this would read like “blorf blorf a word!”). After 1996, the term ebonics became associated in the popular consciousness with (1) African American Language and (2) the idea that it is broken, bad, and defective. Because it is now effectively a pejorative for AAL, linguists no longer use it.

There are also a lot of historical names that are no longer used. It wasn’t until the late 1960s that linguists took AAL seriously, and in the beginning you’ll find papers talking about things like “Northern Negro English”, “Black English Vernacular” (BEV) or “Black English” (some linguists from that time still use the latter, although both BE and BEV seem to be about whether one prefers “Black” or “African American” which is an entirely separate conflict to wade into).

For all of these, it should be noted that the language variety is not exclusive to Black people, although they comprise the vast majority of the fluent speakers of AAL. This is similar to the situation with Russian. The vast majority of its speakers are from a handful of ethnic groups, but this is because of history and geopolitics, and not DNA. There’s no AAL gene that’s magically associated with melanin, but there is massive residential and educational segregation in the US, and a strong social stigma against AAL, so it’s rare to find fluent, native, non-Black speakers. Because it is the language of an oppressed people, there are also strong feelings around ownership and who has the right to speak it, even seemingly paradoxically, among black folks who don’t speak it.

I think most linguists working on AAL would love to see AAL catch on in the mainstream, but I think this is highly unlikely. I do, however, hope to see AAE replace AAVE, as that V is doing a lot of work. I have published papers that use AAVE, where appropriate — like when Christopher Hall and I wrote about “the n-word” which only occurs in informal speech — but it feels weird to see people attempting to valorize AAL while simultaneously implicitly calling it informal and vernacular.

For more on these, check out the videos below. The first is an interview with Dr. Lanehart, the editor of the Oxford Handbook of African American Language (OHAAL), and a proponent of the term AAL. The second is a discussion of African American language use with four experts.

-----

©Taylor Jones 2020

Closed (minded) captions

Today I was tagged into a conversation on Twitter by New York Times best-selling author Morgan Jerkins, who had been watching an episode of The Cleveland Show and happened to have the closed captions on. She wrote:

I was expecting closed captions that were, at least, an attempt at accurately captioning what was said. Perhaps the transcriptionist misheard or misunderstood, but transcribing in good faith.

Instead, there was this:

The caption reads “dam-fa-foo-dun-may-hebeyad-shoot.”

That was followed by:

The caption reads “Naw-a-gah-may-mah-beyad, dayum.”

This raises an important question: what is the function of closed captions? Ostensibly, it’s so the viewers know what was said.

In this case, what was said was:

In IPA (this will be relevant later) that’s:

[dæ̃ fæ fuw dʌ̃ me͡ɪ hɪ beːʲɪʔ ʃːuʔ næ͡w ɑ͡ɪ gɑː me͡ɪk̚ mɑː beːʲɪd̥ deʲɪ̃ʰ]

The transcript should read “damn fat fool done made his bed? Shoot. Now I gotta make my bed. Damn.”

There are a couple of things happening here.

First, the character is a black character being voiced by a white voice actor, who does not, evidently, have the early life contact with AAE speech communities necessary to speak it natively. He’s very good, but he also is not perfect, and it’s clear that while he’s nailed some of the harder parts of some black accents, he’s also missed some important nuance, overgeneralized some parts of the accent, and applied the wrong accent to the wrong place. He’s noticed that word final consonants are often unreleased, realized as glottal stops, or deleted altogether, but he has overgeneralized, and left no word final consonants in places where they should appear. In fact, it was his second word ([fæ] instead of [fæʔ]) that made me look up who was voicing the character: I’ve never heard an AAE speaker who would say fah for fat. He noticed that word final nasals (n, m, ng) are often pronounced as nasalization on the vowel, like in French, and not as a following segment. He also noticed that the vowel in bed is often split, so it sounds like the vowels in “play hid”. However, the show takes place in Stoolbend, Virginia, and this accent feature is not as common in Virginia AAE. It’s common in parts of the Carolinas, and from the Gulf to the Great Lakes, along the Mississippi, but not most of the mid-Atlantic or Northeast. He also overdoes the consonant deletion: this level of syllable coda deletion is only really plausible in Georgia. He also overdoes it with “gotta.” That kind of reduction does happen, but not exactly in that context: the word is too slow and too carefully pronounced, so it comes across as caricature.

Caricature brings me to the second: This is a white actor voicing a black character on a comedy show, where part of the humor is evidently making fun of how he speaks. It should not be controversial for me to plainly state that it looks a lot like minstrelsy. I’m not entirely clear on how Rallo Tubbs is significantly different from Amos ‘n’ Andy, or from Thomas D. Rice. Evidently, after the killing of George Floyd, even the voice actor realized it was probably a bad look, and he publicly announced he would not be voicing black characters anymore. Why George Floyd, but not Mike Brown or Emmett Till, changed his mind remains a mystery. He made it clear he doesn’t want to take work from Black voice actors, but I’m not sure if the broader context is clear to him, given that statement and, you know, the decade or so of him doing this work. As I mentioned in my replies on Twitter, it’s uncomfortably evocative, to me anyway, of Jim Crow in Dumbo. The crows are clearly a vaudeville/minstrel act, and clearly intended to be speaking AAE (“I-uh be done seen most ev’rything/when I seen an elephant fly!”). They’re also voiced by white actors in the 1940s, and the line between imitation as flattery and caricature as mockery is razor thin there (and they’re on the wrong side of that line anyway). We can say that they clearly have contact with AAE speakers, and that there’s clearly a certain level of respect, but at the end of the day they’re taking a job that a Black man simply could not have at that time, to play at the culture, music, and language for laughs. It’s no longer the case that a Black voice actor could never get the job (just look at the cast of the Cleveland Show), but there’s still a direct line from Al Jolson, through Amos ‘n’ Andy, through the crows in Dumbo, right up to the Cleveland Show.

Third, and most importantly for this discussion, there’s the captions on top of all of that. If you’re reading the captions to know what was said, you still don’t know what was said! What you get is that the character said something unintelligible. The way I look at it, there’s two plausible possibilities, neither of which is good: first, the transcriptionist couldn’t make sense of the utterance and did the best they could, assuming it was some kind of gibberish. Or Jive (note to self: write post about Airplane). Second, the transcriptionist thought it was more important to show that the character wasn’t speaking “right” than to actually, you know, transcribe what was said. That would explain why “done” was written as “dun”. They’re pronounced the same, but any time a writer chooses to write something like “eye dun tole yew” instead of “I done told you”, they’re not telling us much about how a character sounds, but they’re telling us a great deal about how we’re supposed to perceive that character. Is it really possible that the transcriptionist who had flawlessly transcribed up to that point could no longer tell from context that the character was talking about making his bed? That he said “now” — a recognizable word of English — and not “naw”?

This is a really interesting case to me, because it is in some ways very subtle. What has to be behind this choice, any way you slice it, is a certain often unstated linguistic ideology. Most of us were taught explicitly in school that writing takes precedence over speech, and that for both writing and speech, there is one correct way to do things, which coincidentally overlaps with how well educated, wealthy White people (but not White Ethnics!) speak. This manifests itself in all aspects of our society, from arguments about pronunciation, to whether something “is (really) a word.” Built into that ideology is that there is some reason why one way of doing things is better (clarity, logic, authority), and it’s never the truth: that the prestige variety exists based on social norms, not linguistic facts. Lastly, this ideology positions ways of speaking that are not “classroom” English as inferior (and lacking in clarity, logic, and authority). This captioning only makes sense if we recognize that the transcriptionist, the service running these captions unquestioningly (in this case, Hulu), and likely most of the people involved in the show’s production either view AAE as unintelligible, as something that can function as the butt of a joke, or both. It’s a subtle form of anti-blackness that’s not necessarily predicated on overt or deep hostility. It’s casual.

That’s not to say that all nonstandard spellings are inherently racist, or offensive, or what have you. I’ve even written chapters on how people intentionally represent how they speak with novel spellings (as in “dis tew much”). But in this particular case, there’s no valid reason I can think of why turning on the captions on Hulu should result in “dam-fa-foo-dun”. And this is, weirdly, something you only really see with AAE, and some socially stigmatized varieties of English spoken by (generally poor) white people, like Appalachian English.

As a thought experiment, can you imagine what would happen if Downton Abbey were captioned this way?

“noaw mayde? noaw nah-nee? noaw valette eevun?”

“ihts nayntiyn twuntee sevun, wi-uh mahdun foake”

(If that wasn’t transparent to you, those were the first lines in the Downton Abbey movie trailer).

This was a very interesting counterpoint for me this week, as I’ve been reviewing transcripts of a deposition and I was blown away by the accuracy and professionalism of the court reporter. While mistranscription of AAE (and mock AAE!) is a systemic problem, it’s not a universal one.

I don’t know where people stand on this issue, but I know where I do. While I see mistranscriptions of AAE everywhere, from Netflix to Turner Classic Movies, this is different, in that it’s apparently intentional. We can do better than this.

-----

©Taylor Jones 2020

Local Identity, Appropriation, and Mock Yiddish: A Kvetch

There is an advertisement I consistently see on TV, especially on New York 1, that never fails to annoy me. It’s a car service ad that tries to tell the viewer what “real New Yorkers” do, and does so by repeating a “New York” catchphrase over and over: “what are you, mashugana? Real New Yorkers take Carmel!” This is their spelling, by the way.

What they’re trying to do, as far as I can tell, is get people to use their car service based on insulting them in bad Yiddish.

Yiddish, and by extension Yiddish English, are highly stigmatized, much like African American English (AAE). Both are stigmatized in large part because of (non-linguistic) prejudice against their speakers. Both also have what linguists call “covert prestige” meaning you can use them to positive effect sometimes. With AAE this often means that borrowing features of AAE can be used to construct a “tough” identity, a “dangerous” identity, or a “cool”, “in-the-know” identity. For women, sometimes it’s a “sassy” identity (think about white women adopting “girl” and “girlfriend”, for instance). With Yiddish English, it’s often a curmudgeonly, beleaguered, but comedic persona (à la Mel Brooks or Jerry Seinfeld).

All of this is related to “mock” language — best explained by Jane Hill’s discussion of “mock Spanish” in The Everyday Language of White Racism.

So what’s the problem?

There are a few issues with this ad, from a sociolinguistic and linguistic anthropological stance. First and foremost, it’s borrowing language use from an ethnolect (that is, a language variety associated with a (minority) ethnicity, often strongly related to segregation), and it’s putting those words in the mouths of people who are seemingly not part of that community, and it’s doing so for comedic effect. This is not to say there aren’t black Jews (there are), or WASPy Jewish converts (there are), or people who live in New York, have a few Jewish friends, and picked up some Yiddishisms that they use appropriately (there are a ton). It’s just that this doesn’t seem to be what’s happening in the ad, in part because of my second point:

They do it wrong.

What they spell (and pronounce) as mashugana is the Yiddish noun (not adjective!) משוגענער meshugener: ‘lunatic, madman, crazy person.’ This is a noun derived from the adjective משוגע meshuge, originally from Hebrew מְשֻׁגָּע m’shugá ‘crazy, insane.’

The first guy, I’ll give the benefit of the doubt, as it sounds like he’s saying “what are you, a meshugener?” spoken by someone with a non-rhotic (‘r-less’) variety — something consistent with working class Jewish New York.

However, the rest of the speakers in the ad say “what are you, meshugana,” both mispronouncing the word and treating a noun as an adjective. The fact that their official YouTube channel spells it that way suggests it’s how they intended to use it. What’s more, their voice-over treats the noun as an adjective. Let’s break them down. First, you have a woman who seems to be stumbling over the whole line, not just the last word:

Then, a slowed down recording of a woman saying “what are you, mashugana?”

Followed by a man again treating it as an adjective. The way he says “of course” is…shall we say, not like the “real” New Yorkers I know.

Then a man who treats it as an adjective, pronounces the /r/, but weirdly changes the final vowel (“meshuganar”):

Finally the voice over says “don’t be meshugener”.

Here’s the thing: Yiddish is not just “anything goes.” It’s not “bad German.” It’s a language. The same goes for Yiddish English, a dialect of English heavily influenced by Yiddish. And just like AAE, there’s often a perception that it’s just “bad grammar” because it has different grammatical structures than standard English (e.g., “You want I should ask him?”).

This means that it’s entirely possible to speak it wrong. And when you dismiss it as something not worth getting correct because it’s not a “real” or “valid” language, it sends a message to the people who speak it that they aren’t of value. While most people know that approximately 6 million Yiddish speakers were killed in a genocide, fewer are aware that the strong push in the US toward cultural assimilation resulted in language repression that is a textbook example of cultural genocide (in this case, the suppression of cultural activities that do not conform to the destroyer's notion of what is appropriate). And the flip side of that is that the majority of people who do speak Yiddish natively now live in Brooklyn, and would certainly recognize this ad as not being “correct” (although they may not feel as strongly about it as I do). And heritage learners who are trying to learn the language of their grandparents or great-grandparents now have little choice but to learn a “standardized” version that isn’t what their ancestors actually spoke, or the language of a particular sect from a particular place that they’re not directly related to. It’s depressing.

This business with the ad is further complicated by the fact that a generation ago there were people who did speak Yiddish well, but only used it publicly for comedic effect. Remember that mention of Mel Brooks, above? Well one of the best gags in Blazing Saddles, but also one that’s problematic from a language genocide point of view (but also that’s why it works?), has Mel Brooks playing a (generic) Native American and speaking Yiddish. In fact, he correctly declines meshugga in it. (Warning for those not already familiar with Blazing Saddles, the whole movie is puerile jokes about race and racism, and in the clip below he uses a Yiddish term that has been borrowed into English and is offensive in English but not (necessarily) in Yiddish).

None of this rises to the level that I’d have hard feelings if anyone told me they got here by taking Carmel, even though it does genuinely give me tsuris. I think the real takeaway for me, beyond the catharsis of kvetching (note to self: great title for a memoir), is something that was put best by Mel Brooks:

zayt nisht meshugga. Cop a walk.

(…instead of taking a cab)

-----

©Taylor Jones 2020

Have a question or comment? Share your thoughts below!

Truncations and offensive language

CONTENT WARNING: lots of uncensored slurs and offensive language, in a variety of languages.

In part starting with my research with Christopher Hall on uses of the “n-word” (available here, and in (free) final draft form here), and in part because of the consulting work I do, I am an expert on slurs, epithets, and offensive language. The main language-y thing that companies, government organizations, journalists, lawmakers, lawyers, and judges are interested in is offensive language. Everyone wants to understand what they can and can’t, or should and shouldn’t say, and where the line is drawn, and for many people it has stark, real-world consequences. One of the things lots of people ask about is some variation of how do you know or how can I prove that something is offensive?

Ever since working on totes constructions with Lauren Spradlin, years ago, I’ve been thinking about hypocoristics (fancy linguist speak for ‘baby talk’ or ‘pet names’) and truncations, and how they relate to offensive language. A huge number of slurs are truncations of other words, and this isn’t really a coincidence.

Working with Lauren Spradlin on totes truncations we were focused on phonological and morphological rules of truncation: how does everyone know how to make new truncations, and what intuitive rules do people follow? Does anyone ever break those rules? What do these truncation patterns tell us about language more generally? There’s a lot to say there (and we’ve only, honestly, written about some of it; there’s another paper on the way, but here’s the conference talk), but one of the things that stood out to me immediately, when we were talking about how you don’t yuzh (usually) eat blewbs (blueberries) with guac beeteedubs (BTW, by the way), was that certain words were shortenable but just sounded…offensive.

Truncating blueberry? Great. Truncating adjectives like ridiculous, or obnoxious? Totally fine. Truncating adjectives that relate to ethnic groups or places of origin? Really offensive. This is interesting: a morphological transformation that’s completely unremarkable in most contexts is deeply offensive in a small set of specific contexts.

The crazy thing is, this holds for novel truncations, meaning I can refer to someone with a truncation you’ve never heard before, and you’ll have an intuitive sense of whether it’s offensive. For instance, if someone said the chefs at the French restaurant I like are all mexies it reads, at least to me and everyone I’ve asked, as very offensive… even though I don’t know anyone who has ever heard that word before. (I wouldn’t be surprised if it exists, though). It’s clear I mean Mexican, which is in-and-of-itself fine, but it’s also clear that this particular phrasing is NOT OK.

And when you look at a list of offensive words (like, say, on Wikipedia), it really jumps out how many offensive terms follow this pattern. A by no means exhaustive sample, in no particular order:

  • Heeb and Heeby from Hebrew

  • Jap from Japan(ese)

  • Jerry from German

  • Hunky (and honkey(!)) from Hungarian

  • Paki from Pakistan(i)

Another way people intentionally offend is to take names stereotypically associated with a people, and call someone by that name, knowing it’s not their real name (for instance, Ike is an old-fashioned epithet for Jewish men, Shaniqua is a name used to insult Black American women, Ahmed is used for Arab and Muslim men, and the fact that this phenomenon exists goes a long way toward explaining the sentiment that being called “Karen” is a slur, even when the target is a person whose actual name is Karen but that’s unknown to the speaker). This truncation is also applied to names, making their use, in some cases, more offensive:

  • Ikey-mo from Ike and Moses (or Moishe)

  • Ack (or Akh) from Ahmed

  • Hymie from Hyman (itself an anglicization of Chayyim)

  • Abi from Abraham

  • Mo from Mohammed

  • Shaneeq(s) from Shaniqua

Interestingly, it’s not just totes style truncations, and shortening applied to offensive terms doesn’t make them less offensive, but rather more:

  • coon from barracoon

  • nig from…you know what it’s from

  • spic from either “hispanic” or “no spik ingles” (!)

I mentioned hypocoristics above, and I really think that’s the common factor. Baby-talk and “childish” language games can be fun and solidarity-building when they’re in-group behavior, but when baby talk is directed to someone who you don’t have the appropriate level of social closeness with, it’s insulting. It tells the listener you respect them as much as you do a child. And to use baby talk for the name of the listener’s ethnic group, religion, or geographic origin, indicates belittling their background.

This seems to hold cross-linguistically as well, so in French, verlan (a game where you move syllables around, similar to Pig Latin) is used to make offensive terms, like rebeu from beur, itself a verlan form of arabe (Arab), or feuj from juif (Jew(ish)). In Hebrew you get aravush, which has a “cutesy” diminutive marker -ush on the word arab. And the baby-talk element can be used to generate new offensive terms, so the more hateful parts of the internet use the term nig-nog to double down on offensiveness. While that particular term is attested as early as 1959, it and words like it were a starting point for the (thankfully now-defunct) subreddit /r/coontown.

Perhaps the wildest part about this, to me anyway, is that these truncations are used in fiction, sometimes even for groups of people that don’t exist. There’s an episode of Star Trek Voyager where a character claims that The Doctor was being totally Vulky and my first reaction was “the censors let that slide?!” Cardi is a slur for the fictional “race” of Cardassians in Star Trek (and it has its own entry in memory alpha). And in the gritty post-apocalyptic Canadian graphic novel “We Stand on Guard” the American aggressors routinely call the Canadians Nucks (from Canuck, a sometimes offensive, sometimes not slang term for Canadians). I remember reading it at a friend’s house and being genuinely shocked at a character referring to one of the protagonists as a "nuck bitch”. (He’s a mountie — itself a word probably originally intended disparagingly, from mounted police — and he left it as bedside reading as a sly provocation).


There’s a LOT left to be said about slurs, epithets, and offensive language — after all, we haven’t even touched on using religious headgear or ethnic foods as terms of address, let alone sweary words combined with prosody (as in “shit gibbon”) — but it seems there’s something profoundly offensive about truncation and the diminutives it commonly is accompanied by. So a good rule of thumb is to avoid any truncations unless you either (1) are really sure your interlocutors won’t take it the wrong way, or (2) you’re actively trying to offend.

-----

©Taylor Jones 2020


Why isn't declaring AAE a separate dialect/language racist?

I recently received a message asking a question I am asked a lot: Why isn’t it racist to call African American English a separate dialect? Or even to talk about African American Language as its own language distinct from English?

Usually the person asking is starting from some common, but linguistically uninformed, assumptions. Given how often I’m asked this, it seemed like a good idea to share my response here, and hopefully it’s useful to others.

I edited this particular version of the question to eliminate identifying or sensitive information, but it’s pretty representative of this line of questioning:

A colleague, […], passed along your AAE blog post and I found it very helpful to share with my audience- I work in the speech tech space […] so for reasons that may or may not be obvious, I believe it's important that our NLU datasets include wider varieties of speech than is currently being represented. I have a background in ling but most of my work colleagues don't, and so I've been asked the question "Why isn't declaring AAE/BE a separate dialect/language racist?" and I still can't seem to put my finger on a strong and persuading answer. If you have any thoughts on this, I'd love to hear from you! :)

My response, with a few added bits here, was as follows:

Thanks for reaching out. I get asked this a fair amount, and should probably get around to making a blog post about it.

While this is sometimes confusing for people, it is not racist to call AAE a distinct dialect, or even to discuss whether it is better to think of it as its own distinct language variety. The reason for this is that the discussion is based on the linguistic facts, not on race or ethnicity. What confuses people is that AAE is an ethnolect, meaning it's associated with a particular ethnic group because of history (in this case, chattel slavery and the separation of people who spoke the same African languages combined with the imposition of English, followed by Jim Crow, and enduring residential and educational segregation). This history, especially segregation, is very similar to known processes that have historically given rise to new languages and dialects, like the range of Romance languages.

It is very important to note that not all Black people speak AAE, and not all people who speak AAE are Black. Saying that all Black people speak the same is racist! Saying that there is a distinct language variety characterized by specific phonological rules, distinct morphosyntax, and associated with different cultural and pragmatic norms is just descriptive linguistics. Recognizing that the language variety in question is spoken overwhelmingly by African Americans, and thinking about why that is, is just good social science. While this isn’t necessarily the origin of the study of AAE, it’s now the state of the field.

Now, the inverse -- refusing to acknowledge the very well established linguistic and social facts -- is ignorant in the neutral sense, but can lead to negative, and systemically racist outcomes. If I were to speak French, and you just declare it to be "bad Italian" it may go no further than simply hurting my feelings. However, if I were a school child in Italy, this dynamic would have radically different implications -- I wouldn't get the resources necessary to learn Italian, you might convince yourself and others that I'm stupid, etc. And we can imagine similarly how this might play out for adults on the job market, in contact with the judicial system, and so on. But nobody in their right mind would seriously claim that French doesn't really exist and is just defective Italian, and we can arrive at that conclusion simply by looking at the linguistic facts. The same is true for AAE: it has a well-described system of Tense, Aspect, and Mood that is distinct from "standard English", different phonological rules, different morphology, and a dispassionate and descriptive scientific approach reveals that it behaves like a regular old language (or dialect) and not like a pathology, learning disorder, or any such thing. The problem is that people still declare it just “bad English”, facts be damned. That’s why there’s a long history of articles in Linguistics with titles like “The Logic of Nonstandard English” (Labov 1970) or “African American Vernacular English is Not Standard English with Mistakes” (Pullum 1999).

So the quick answer for your colleagues is:

  1. It would be racist to say that all Black people talk the same, especially if you believe that way of talking to be just "bad English", however:

  2. AAE refers to a linguistic system that is very well studied and understood, and which is not spoken by all Black people, nor is it spoken exclusively by Black people, although it is spoken primarily by Black people who are the American descendants of enslaved Africans.

  3. The existence of this system, and general contour of its structures and rules are not debatable. They're settled science. Though, like any other language, linguists are still discovering new interesting facts about it.

  4. The language variety is an ethnolect: meaning it's associated more strongly with a particular ethnic group because of historical and social factors that affect language evolution (like segregation).

  5. Denying the existence of this system can lead to racist outcomes. Assuming all Black people speak AAE is also likely to create problems, and will be perceived as racist, especially by Black people who don’t speak AAE, and especially by those who also share common, uninformed, negative views about AAE.

  6. AAE is not the only stigmatized language variety that regular folks wrongly consider to be defective, and this social pattern is also pretty well understood (see, for example, Yiddish, or even regional Romance languages like Niçart). Some of what our teachers tell us in school is wrong, especially in most people’s English classes.

I hope this helps, and don't hesitate to follow up with me if you have any other questions.

I truly believe that for many people asking some version of this question, their hearts are in the right place, and they’re genuinely trying to understand. Where most people get hung up is the link to race and ethnicity. It’s not causal! Melanin does not give you the ability or right to speak AAE and lack of it does not take those things away. Rather, as with any language, it’s about culture, contact, upbringing, respect, and all the social factors that go into language acquisition and language use. Recognizing AAE as a legitimate language variety just forces us to confront the history of this country, including its upsetting parts, and forces us to see how that history continues into the present.

-----

©Taylor Jones 2020


On "woke"

I recently read a fantastic thread on Twitter from Dr. Alayo Tripp, aka @phonotactician, discussing changing use of “woke” — a participle that originated in (some) varieties of African American English, but that has been adopted into mainstream, white varieties of English with dramatically changing meanings. Dr. Tripp wrote about the intersection of borrowing, semantic change, and anti-blackness, and did an excellent job of explaining something I had been struggling to articulate.

I thought the thread was great, especially the discussion of superlative morphology (“the wokest”), and have reproduced the thread here, with their permission:

The Thread:

NonBlack people here’s a thread for you about the word “woke.” Since “no mickey mouse can be expected to follow today’s Negro idiom without a hip assist.”

TL;DR: If you are using the word “woke” to denigrate people then you are an agent of antiBlack racism. If you’re not Black and you’re using this word at all you should think carefully about why.

You might already be aware that the word “woke” comes from African American English (AAE). It even has a perfectly literal sense which we can readily translate to standard American English (SAE). SAE: is he asleep or awake? AAE: he sleep or woke?

This usage is demonstrated in the 1962 essay published by William Melvin Kelley publishes in the New York Times entitled “If You’re Woke, You Dig It.” Despite the subject of the piece being extant Negro idiom, he is variously credited with coining the usage.

The application of the asleep/awake concept to specifically social justice gains a lot of traction in the decade to follow. See: Malcom X, 1965

To “stay” in AAE expresses an intensified continuative and habitual aspect of a verb. To “stay tweeting” is to tweet continuously and habitually. Combining the AAE grammar with the social nuance, i will now give a definition of what it means to stay woke:

remain aware of the value of Black people in an antiBlack world which seeks to devalue, exploit and destroy us. Seek solidarity with those who have woken and have compassion for those who are still sleeping.

This phrase is widely used in the protest movement of the 1970s, but because antiBlackness, Erykah Badu is often credited with coining it decades later in 2008. It later reaches new popularity and visibility with the Black Lives Matter movement and the ubiquity of social media.

NonBlack folks begin widely adopting it as a term, using it to define nonBlack identities... unmindful of how use of the *stative verb* connects to Black conceptions of engagement with social issues and invent comparative terms to facilitate competition amongst themselves.

nonBlack people instead adopted woke as an *adjective* and quickly took to discussing the properties of being woker than one another, and what might characterize the wokest among them. (This sounds as ungrammatical to my ear as “awakest.”)

The practice of defining nB identities as either more or less woke of course goes hand in hand with establishing nB ideology regarding the value of “wokeness.” Is it a virtue, or a flaw? How can white people determine which wokeness is authentic and which is “performative?*”

*it’s of course all performance wrt speech act theory but here we will use “performative” in the colloquial sense which means “(inauthentically or inappropriately) signaling a belief in ideological superiority”

White people have been gripped in a heated discussion about whether and how they should aspire to be woke, and what performances of “wokeness” are appropriate and acceptable.

But missing from this conversation is always “why?” It is for white folk somehow a foregone conclusion that discussions of wokeness are important.

Black conversations using the word historically present the state of being woke as an unquestionable good *for the sake of Black people.* Criticisms of the application of the word in this community explicitly center the worth and value of Black people.

So to review, nonBlack people have been gripped in a very heated public discussion about whether and how this label can denote something truly awful. This is because collectively, nonBlack people are antiBlack.

As the term reached widespread recognition particularly among white people, dictionaries begin to add the term, and its history is whitewashed. The year is 2017, 55 years after Kelley’s NYT essay.

(The line in the first tweet is the subtitle of that essay.) Dictionaries broadly redefine “woke” to reference alertness to injustice, with racism sometimes being highlighted as an afterthought in the definition (but not antiBlackness.)

This entire conversation necessarily dislocates the language and the issues it addresses from the people it was unquestionably created to affirm and uplift. It’s not a coincidence that this dislocation admits so much derision to the conversation.

It’s a really consistent cycle where racists on the right adopt Black language and then see white liberals adopt that same language to deride and mock their white political opposition. AntiBlackness is the commonality. Also see: simp, cancel, hater and king

If people in your sphere are using this language pejoratively and you say nothing then you are normalizing antiBlackness. You can do better.

This thread is not about the broad phenomenon of language change or borrowing; it is about antiBlack appropriation. Derails will get summarily blocked.

I accidentally a whole tweet in this thread pivoting from the literal to the figurative interpretation but OH WELL

-----

©Taylor Jones 2020

(Tweets from @phonotactician are their intellectual property and are reproduced here with their permission).
