Facebook's "Emotional Contagion" Study Design: We're Mad For All the Wrong Reasons

A new study in the Proceedings of the National Academy of Sciences has been receiving an enormous amount of negative press: its investigation of 'emotional contagion' has been called 'secret mood manipulation,' 'unethical,' and a 'trampl[ing] of human ethics.' The researchers took 689,003 participants and used the Linguistic Inquiry and Word Count (LIWC) software to manipulate the proportion and valence of 'positive' and 'negative' emotional terms that appeared in users' news feeds. They then argued that emotional contagion propagates across social networks. This study has a number of flaws, and the fact that it passed Institutional Review Board (IRB) review is the least of them.

Since there's so much wrong with it, let's start with why it's not as bad as everyone thinks: there is far more content generated by Facebook users' friends than is viewable, and so the news feed only presents users with a small sample of what their friends posted. All of their friends' posts were visible (that is, nothing was suppressed!) on their walls and timelines, as well as on news feed viewings before and after the one-week experiment. Facebook is very clear about the fact that they only present a subset of posts on any given user's news feed, and this experiment was simply tinkering with the algorithm for a week. A careful read of the study methodology reveals why it passed IRB review -- it's not massive, secret emotional manipulation, like some kind of Google-era attempt at a privately funded MK Ultra. Rather, it was slight tinkering with how Facebook filters posts that it already filters, and is clear about filtering in their terms of use. This is not, however, an attempt at Facebook apologetics. In fact, I think the article was absolutely terrible, but for different reasons. The thing people seem to be missing is that:

Facebook claims they demonstrated emotional contagion, but cannot show that they actually successfully manipulated emotions AT ALL.

That's right, the reason I'm upset is that they didn't manipulate emotions; not because I wanted them to -- as that would potentially be an enormous violation of ethics -- but because they claimed they did and published it in a peer-reviewed journal, without actually proving anything of the sort.

There are so many flaws with the methodology that I'm going to limit myself to bullet points covering the most glaring problems:

  • "Posts were determined to be positive or negative if they contained at least one positive or negative word, as defined by Linguistic Inquiry and Word Count software (LIWC2007) (9) word counting system, which correlates with self-reported and physiological measures of well-being, and has been used in prior research on emotional expression (7, 8, 10)."  -- I'm friends with a ton of jazz musicians. When they call something bad, this is not a negative term, but would be interpreted as such by the LIWC.
  • More generally, depending on the social circle, terms like bad, dope, stupid, ill, sick, wicked, killing, ridiculous, retarded, and terrible should be grouped differently. There is absolutely no indication that the researchers took slang or dialect variation in English into account.
  • This study does not -- and cannot -- demonstrate actual emotional contagion. They have a much better chance of demonstrating lexical priming than emotional contagion. Except, they can't demonstrate that either, because all of the terms are aggregated, so they only know that words with 'negative valence' are predictors of the use of other words with negative valence.
  • "people’s emotional expressions on Facebook predict friends’ emotional expressions, even days later (7) (although some shared experiences may in fact last several days)" -- That is, there's no control for friends in social networks sharing a real-world experience and posting about it on Facebook using similar emotional terms.
  • "there is no experimental evidence that emotions or moods are contagious in the absence of direct interaction between experiencer and target."

In other words, the Facebook study does not control for shared experiences being described in similar terms, does not control for different semantic and pragmatic contexts (e.g., "those guys were BAD, son. [Piano player] was STUPID NASTY on the gig last night!" is extremely positive, but would be interpreted by LIWC as extremely negative), and conflates emotional contagion with lexical priming (simply, the increased likelihood of using a given term if it is 'primed' by previous use or by previous use of a related term).
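To make the "at least one word" problem concrete, here is a minimal sketch of that kind of classifier. The word lists are toy stand-ins I made up for illustration, not the actual LIWC2007 dictionaries, and `classify` is a hypothetical helper rather than anything from the study.

```python
import re

# Toy word lists for illustration only; these are NOT the LIWC2007 dictionaries.
POSITIVE = {"good", "great", "happy", "love"}
NEGATIVE = {"bad", "stupid", "nasty", "sick", "terrible", "sad"}

def classify(post):
    """Label a post by the quoted rule: a single matching word is enough."""
    words = set(re.findall(r"[a-z']+", post.lower()))
    return {
        "positive": bool(words & POSITIVE),
        "negative": bool(words & NEGATIVE),
    }

# High praise among jazz musicians, flagged as purely negative.
praise = "Those guys were BAD, son. The piano player was STUPID NASTY on the gig last night!"
print(classify(praise))  # {'positive': False, 'negative': True}
```

Any counter that ignores who is speaking, and to whom, will make this mistake no matter how large its dictionary is.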

In order for this study to say anything even remotely interesting, the researchers would first have to demonstrate that they can get at actual emotional state through social media posts. Then, they would have to demonstrate that they could reliably determine actual emotional state from social media posts (what is the probability that a Facebook user is experiencing sadness given that they have used descriptive terms about sadness in their posts?). Next, they'd have to separate out confounds (e.g. "nasty" for "good"). Then they'd have to demonstrate that there is in fact a 'contagion' effect. Finally, they'd have to demonstrate that the apparent contagion effect was not just lexical priming (that is, me repeating "sad" because I was primed by another person's use of the word "sad," while not actually feeling sadness). If this post is any indication, they'd also have to figure out a way to control for discussion of emotion -- this post is chock full of negative terms, while being emotionally neutral, since I'm discussing emotional terms.
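The probability question in that second step is just Bayes' rule, and it can be sketched in a few lines. Every number below is invented purely for illustration; none of them comes from the study or from any dataset, and `posterior` is a hypothetical helper.

```python
def posterior(p_word_given_sad, p_word_given_not_sad, p_sad):
    """P(user is sad | post contains a sadness word), by Bayes' rule."""
    numerator = p_word_given_sad * p_sad
    denominator = numerator + p_word_given_not_sad * (1 - p_sad)
    return numerator / denominator

# Invented inputs: sad users use a sadness word 30% of the time,
# everyone else 5% of the time, and 10% of users are sad at any moment.
p = posterior(p_word_given_sad=0.30, p_word_given_not_sad=0.05, p_sad=0.10)
print(round(p, 2))  # 0.4
```

Under those made-up inputs, most posts containing a sadness word would come from people who are not sad. Without real estimates of these quantities, word counts alone say little about emotional state.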

The real travesty is not that the Facebook study passed IRB; it's that it passed peer review.

This is indicative of a larger problem in the sciences: there is a bias toward dramatic findings, even if they're not terribly well supported. As a linguist, it feels like linguistics suffers more from this than other fields, since there have been a slew of recent dramatic articles published about linguistic topics by non-linguist dabblers who employ terrible methodology (for instance, making claims about linguistic typology predicting economic behavior, but getting all the typologies wrong!). Whether linguistics as a field suffers from this more than other fields remains to be proven by a well-designed study. That said, when people decide to do research that relies heavily upon understanding linguistic behavior, it behooves the researchers to, I don't know, maybe...consult a linguist.

Ultimately, the Facebook study was (just barely) within the realm of ethical study on human subjects, although their definition of informed consent was more than a little blurry. What's truly terrible about it is the fact that they make very strong claims about emotional contagion on social networks that their research does not justify, and they passed peer review.

 

-----

©Taylor Jones 2014


Obvs is Phonological, and it's Totes Legit

Recently, NPR ran a story called Researchers are Totes Studying how Ppl Shorten Words on Twitter. It was primarily focused on what they called 'clipping,' for which the author of the article provides the example "awks," for "awkward." As far as I know, aside from the researchers interviewed by NPR, no one has done any scholarly work on this phenomenon, and as far as I can find on JSTOR and Google Scholar, no one has published anything on it.

The general consensus among regular folk is that the phenomenon is:

  1. annoying
  2. associated with young white women
  3. the result of character limits on Twitter, or choices about spelling economy in text messages.

The first two are likely in some ways true: I don't have the data to prove it (yet!), but it does seem to be deployed most often by young women (who are often the leaders of linguistic change), and -- as is often the case -- because of its association with young women, it is negatively socially evaluated by the general public. My issue is with the third point. Most people take it as so obvious as to be axiomatic that 'clippings' like "obvs" and "totes legit" are the result of spelling choices. Even the Dartmouth researchers interviewed by NPR are influenced by this assumption, and were perplexed to find that people still shorten their words on Twitter even when they have plenty of characters left to write.

Not only is the assumption that it's orthographically motivated wrong, but it's a perfect example of where linguistics can provide clearer insight than can be afforded by Big Data style data mining and statistical analysis without a grounding in the past 100 years or so of the scientific study of language. Perhaps it's confirmation bias that leads people to assume that this phenomenon originated in written communication. The fact is:

Truncations like "totes" for "totally" arise out of the spoken, not written, language.

They can be described entirely in phonological terms, without recourse to writing. Moreover, they are clearly sensitive to phonological environment: specifically, primary stress. It's not entirely clear why a written truncation should be sensitive to stress. If that weren't enough, sometimes what NPR calls 'clippings' are significantly longer than the word they're supposedly an abbreviation of. Case in point:

bee tee dubs is more than 3x longer than "BTW."

So, what's really going on?

Let's break it down. There are a few key features:

  1. Words are truncated after their primary stress. A word like totally has three syllables, but its primary stress is on the first: tótally. The style of truncation under discussion is extremely productive, and can be used on new words. All of the truncations are sensitive to primary stress. When I asked women who use these forms, the consensus was that indécent becomes indeec, expósure becomes expozh, and antidisestablishmentárianism becomes antidisestablishmentairz. Note how spelling changes serve to preserve what remains of the pronunciation of the original word.
  2. As much material as possible from the syllable following the stressed syllable is incorporated into the end of the new word (that is, the onset of the following syllable is resyllabified as part of the coda of the stressed syllable).
  3. A final fricative is added if not already present (marv for marvellous). For most people who employ this kind of language play, there is actually a more restrictive rule: a final sibilant is added. This means that truncations can end with sh, zh, ch, s or z, and if there is no sibilant present, an s or z is added.

Voilà! An explanation that accounts for most of the data, explains forms that are not predicted by spelling rules, and makes correct predictions about novel forms.
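The three rules above can be sketched as a short function. This is a rough illustration under my own assumptions, not a full phonological model: words are entered by hand as syllables of (onset, nucleus, coda) in an informal respelling, the stressed syllable is given as an index, and the choice between s and z is simplified to always writing "s". The name `truncate` and the respellings are hypothetical.

```python
SIBILANTS = {"s", "z", "sh", "zh", "ch"}

def truncate(syllables, stressed):
    """Truncate a word after its primary stress.

    syllables: list of (onset, nucleus, coda), each a list of sound symbols.
    stressed:  index of the syllable carrying primary stress.
    """
    kept = []
    # Rule 1: keep everything up to and including the stressed syllable.
    for onset, nucleus, coda in syllables[: stressed + 1]:
        kept += onset + nucleus + coda
    # Rule 2: pull the onset of the following syllable into the coda.
    if stressed + 1 < len(syllables):
        kept += syllables[stressed + 1][0]
    # Rule 3: add a sibilant if the form does not already end in one.
    if kept[-1] not in SIBILANTS:
        kept.append("s")
    return "".join(kept)

totally = [(["t"], ["oh"], []), (["t"], ["uh"], []), (["l"], ["ee"], [])]
obviously = [([], ["o"], ["b"]), (["v"], ["ee"], []), ([], ["uh"], ["s"]), (["l"], ["ee"], [])]
exposure = [([], ["e"], ["k"]), (["s", "p"], ["oh"], []), (["zh"], ["er"], [])]

print(truncate(totally, 0))    # tohts
print(truncate(obviously, 0))  # obvs
print(truncate(exposure, 1))   # ekspohzh
```

The output is a rough pronunciation, not the conventional spelling (totes, obvs, expozh). The sketch also does not check whether the borrowed onset makes a pronounceable coda, which a fuller account would need to do.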

The astute, Twitter-savvy reader might not be totally satisfied with the above, however. Such a reader might ask, "but what about forms like legit? Soz (sorry)? Tommaz (tomorrow)? Bruh (brother)?"

First, it's necessary to point out that truncation is not a new phenomenon in English. Part of what motivated me to look into this phenomenon was outrage that anyone would suggest legit arose from Twitter or texting. Three words can disprove the 'twitter hypothesis': MC Hammer. 

Of course, a quick Google Ngram search will show that legit was in common use in the 1800s. Bumf, slang for tedious paperwork, is actually a truncation of 16th century 'bumfodder,' (i.e., 'toilet paper'). What's new here is the addition of the sibilant. Interestingly, it's now possible to find reanalyzed truncations on Twitter, so alongside legit, one may also see legits.

With regard to soz, appaz, and tommaz, there is actually a very simple explanation: these forms are much more popular in the UK, and the speakers are non-rhotic. That is, they speak dialects that "drop the rs" (in point of fact, there is compensatory vowel lengthening in the contexts where r is not pronounced, so the r is not entirely absent). The above description actually perfectly describes how you get soz in a non-rhotic dialect. Underlyingly, it's still sorrs.

Finally, bruh, cuh, luh, and others. These are truncations, but in a different dialect of English: African American Vernacular English (although bruh has been borrowed into other dialects, as twerk, turnt, and shade have been recently). In these cases, the word is truncated after the primary stress, but subsequent material is not added to make a maximally large syllable coda.

This is where things get interesting. Truncation in both AAVE and other dialects of English leads to 'words' that are otherwise ill-formed. This may be part of why some people believe that such truncations are "annoying," or that their users are "ruining English." The /-bvz/ in obvs is not otherwise a permissible cluster in English, and most native speakers actually find it quite hard to say. Some 'fix' it by changing it to 'obv' or 'obvi,' the latter being the standard English diminutive or hypocoristic truncation. There are, as far as I know, only four words in English that end with /ʒ/: rouge, garage, homage, and luge -- all of which are borrowed words, and some speakers 'correct' them to /dʒ/ (as in "George"). That sound does occur, however, in the middle of words like pleasure, treasure, measure, leisure, and so on...and ends up word final in truncations like plezh, trezh, mezh, leezh, and so on.

So what's the takeaway from all of this? Well, I hope it goes without saying, but young women aren't ruining English, even if they maybe speak a little differently than, say, your high school English teacher. Moreover, truncated forms like 'obvs' have nothing to do with writing. If they were simply shorthand for texting and Twitter, it would be a lot easier to wrt smthng lk ths. Instead, truncated forms are the result of language games that follow specific rules and are based on native mastery of phonology. They're closer to Pig Latin (or French Verlan, or Arrernte Rabbit Talk) than the babbling of a "speech impaired halfwit."

So next time someone says it was totes a plezh to make your acquaints, or responds to your "how're things?" with "my sitch is pretty deec," recognize that they are playing a language game that requires total, intuitive mastery of English...and maybs play along, rather than making things totes awks for everyone.

 
