Future Tense

Writing IRL

Emoji didn’t become so essential because they stand in for words—but because they finally made writing a lot more like talking.

Excerpted from Because Internet: Understanding the New Rules of Language by Gretchen McCulloch. Out now from Riverhead Books.

Our bodies are a big part of the way we communicate. If someone stamps into a room with a furrowed brow, slams the door, and proclaims, “I’M NOT ANGRY,” you believe their body, not their words. If a good friend looks you in the eye, grins, and says, “You’re the most terrible person I’ve ever met!” you think, “Awesome, we’re such close friends that we can mock-insult each other and we both know we don’t mean it!” Likewise, a lot of our language about emotion is embodied—our hearts race, our eyebrows arch, our cheeks flush, our stomachs butterfly, our throats, um, frog.

Writing is a technology that removes the body from language. That’s its greatest advantage—it’s easier to transport and store words written on paper or in bytes than embodied in an entire living human or a hologram of one. Sometimes, you don’t want the body component: Just because I ambitiously decide to keep a copy of Plato’s Republic beside the toilet does not mean that I want Zombie Socrates taking up residence in my bathroom.

But the lack of a body is also writing’s greatest disadvantage, especially when it comes to representing emotions and other mental states. Even though punctuation goes some way in helping to represent tone of voice, we’ve still been missing something in written communication—something embodied. On a technical level, we’ve gotten quite good at projecting and manipulating virtual bodies, a central feature for video games. But for general socializing, full-bodied avatars of ourselves never quite took off. Second Life made a lot of headlines, but it remained popular only in a smallish subcommunity of internet users, and similar efforts are even more obscure. The closest things most of us have to a social avatar are our profile pictures, which do provide some sense of who you’re talking to and what they (or their dog) look like. But they’re static. This is the void that emoji stepped into. But how emoji became so essential to digital communication is more complex than it first appears.

I got involved with the linguistics of emoji in 2014. I’d written some articles about meme linguistics and internet linguistics, and as emoji started hitting the news in a big way (more than 6,000 articles were written about the emoji released in 2014 alone), I became one of the people who journalists and tech companies called up to analyze emoji use. The question on everyone’s mind was: Why did emoji become so popular, so quickly? By the time they’d called up a linguist to answer this question, they’d all pretty much decided that the answer was that emoji were a new language. But as the linguist being called up, I wasn’t so sure. There isn’t even a clear way to say “emoji” in emoji, let alone a way to render, say, this paragraph. Real languages can handle meta-level vocabulary and adapt to new words with ease. Emoji aren’t capable of either.

But emoji are clearly doing something important for communication. When they first emerged, I considered whether emoji might be the perfect missing link between writing and the body. I made lists of common gestures and emoji to find correspondences. The lists got long: shrug, thumbs up, pointing finger, rolling eyes, middle finger, winking face, clapping hands, and so on. All of these exist in bodily gesture and emoji form, but there were also many that didn’t: the eggplant emoji and the fire emoji didn’t have equivalent gestures, and nodding “yes” or shaking one’s head “no” didn’t have emoji. But looking for a grand unifying theory of emoji wasn’t working because emoji don’t just have one communicative function, they have several.

I sent a draft of my emoji analysis to Lauren Gawne, an Australian linguist and close friend who helped me begin to crack the case. She drew my attention to a subcategory of communication we all use without noticing that we’re doing it. I’d been making a list of gestures, like thumbs up, waving, winking, shrugging, jazz hands, rolling one’s eyes, giving someone the finger, tugging out an imaginary collar to indicate awkwardness, playing a metaphorical tiny violin in false sympathy, brushing imaginary dirt off one’s shoulders, dropping a metaphorical mic, and so on. But what I’d also been doing, without realizing it, was making a list of gestures that have common names in English. I don’t have to describe to you that a “wink” involves the deliberate closing of one eye—as a speaker of English, these are things you already know. I was doing this for purely practical reasons (describing gestures in detail takes effort), but it turns out that nameable gestures all have something in common. Many theorists call them “emblems,” the way that a Jolly Roger is an emblem of a pirate.

Emblems are symbolic gestures that have precise forms and stable meanings. Crucially, while their names can fit easily into a sentence, they’re also perfectly meaningful without speech at all. Because of this, they often cover different territories than languages do: The middle finger, or digitus impudicus, was also considered rude in Ancient Greece and Rome, while the palm-inward V sign means “up yours” in some English-speaking nations but not others. Ultimately, emblems are arbitrary and culturally specific: Obscene emblems from around the world include thumbs up (“sit on this” in many Arabic-speaking countries), the OK sign (“asshole” in many Latin American countries), and a gesture known as the bras d’honneur or Iberian slap (common to many Romance-speaking countries), which consists of extending one arm palm up in a fist while the other hand is placed on the upper arm at the crook of the elbow. Perform one of these inside its region, and you’ll get anything from a rude gesture in return to legal prosecution for obscenity; perform it outside its region, and no one will care. Perform one of them in a subtly wrong way (try flipping someone off with your palm facing toward them, instead of toward yourself), and you’ll get laughed out of town.

Many emoji carry similarly arbitrary and culturally specific meanings. The eggplant emoji is a prime example: Widely used as a phallic symbol, it’s a natural heir to the obscene gesture list above. The smiling pile of poo emoji is another: In deciding whether to include it in Gmail, the Japanese engineers had to explain its importance to the head office. They described it as: “It says ‘I don’t like that,’ but softly”—a kind of “anti-like.” These kinds of emoji only work as emblems because they have precise forms: Some designers initially implemented the poo emoji without the smiling face, but this leaves out an essential aspect of its meaning. And when emoji were first catching on internationally, emoji fragmentation was a surprisingly big problem. Different app or device manufacturers were displaying the same underlying emoji with different designs. Designers had not anticipated how much people would hate it when sending a lady in a red dress could result in their friends seeing a disco man. People felt as foolish sending the wrong dancer emoji as you would giving the middle finger backward.

But what about the gestures we make in speech that don’t have specific names or precise meanings? Every culture that’s been studied uses these kinds of gestures as well. We gesture along with our speech even when it’s communicatively useless, like when we’re talking on the phone. Linguists call the kinds of gestures we can’t help doing “co-speech” or “illustrative gesture,” and these reflect more about the thinking of the speaker than the understanding of the listener. The next time you’re in a restaurant, have a look around you at the groups of people at other tables. You probably won’t see a lot of emblems, but you’re guaranteed to see some co-speech gestures. Look at some people at a table far enough away that you can’t hear them: You can tell who’s speaking when by who’s gesturing, but the content of their conversation remains private because the meaning of co-speech gestures is more dependent on their surrounding speech. For example, the thumbs-up sign could also be used as a co-speech gesture to indicate “up there.” But “up there” could just as easily be illustrated by the index finger or whole hand pointing up, using one or both hands, the eyes or eyebrows pointing up, or any of these things combined—none of which work as a substitute for the thumbs-up emblem.

Emoji have come to stand in for co-speech gestures too. You’ve probably seen the flexibility of co-speech at play when it comes to illustrating birthday texts. People might wish others a happy birthday using the cake with candles (🎂), the slice of cake (🍰), the balloon (🎈), the wrapped gift (🎁), the bouquet of flowers (💐), or general positive emoji such as hearts, sparkles, happy faces, confetti, and positive hand shapes like the thumbs-up or fist bump. To get a better sense of how co-speech emoji show up in everyday texting, I worked with the smartphone keyboard app SwiftKey, looking at the overall picture of how people use emoji based on SwiftKey’s billions of anonymized data points. In the SwiftKey dataset, these kinds of general emoji showed up in a wide variety of combinations and orders. When you’re illustrating your speech, you’re more willing to accept a range of options as suitable for “birthday” or “beach” or “fun” or “danger.” For the emblem emoji, we know exactly what we’re looking for. But for illustrative emoji, we often go browsing through our keyboards instead. Sure, it takes cultural knowledge of birthday traditions to interpret a cake and a balloon, but it doesn’t take any particular internet literacy. Whereas many an inadvertent grocery-store innuendo has been texted by someone who was treating the eggplant as an innocent illustrator rather than a suggestive emblem.

The difference between co-speech and emblem emoji and how we use both distinctly in digital communication is even more apparent when we look at how emoji show up in combination. When the world was wondering if emoji were a new kind of language, sequences that retold familiar stories in emoji got a lot of attention. It’s easy to see how this fit in with the idea of emoji as gesture: They’re like playing digital charades or pantomiming to a friend across a loud bar. But this is rarely the way that emoji combos interface with our casual written communication.

I got SwiftKey’s engineers to extract the most common sequences of two, three, and four emoji. This is a common way of analyzing a large body of writing and a good way to spot any storytelling that might be going on, since narratives have subpatterns among their common words. If you look at the most common short sequences for the half-billion words of the Corpus of Contemporary American English, you get sequences like “I am,” “in the,” “I don’t,” “a lot of,” “I don’t think,” “as well as,” and “some of the.” But instead of finding emoji combos that might be the glue holding together a story, what we found was a ton of repetition.
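The counting step described above can be sketched in a few lines. This is only an illustration of the general technique of tallying the most common short sequences, not SwiftKey's actual pipeline: the function name `top_ngrams`, its parameters, and the toy messages are all assumptions made up for the example.

```python
from collections import Counter


def top_ngrams(messages, sizes=(2, 3, 4), k=3):
    """Return the k most common contiguous sequences for each length in sizes.

    `messages` is a list of token lists (words here; emoji would work the
    same way once each message is split into individual emoji).
    """
    results = {}
    for n in sizes:
        counts = Counter()
        for tokens in messages:
            # Slide a window of width n across the message.
            for i in range(len(tokens) - n + 1):
                counts[tuple(tokens[i:i + n])] += 1
        results[n] = counts.most_common(k)
    return results


# Toy data invented for illustration; not drawn from any real corpus.
messages = [
    "i don't think so".split(),
    "i don't know".split(),
    "a lot of people".split(),
    "i don't think a lot of that".split(),
]

top = top_ngrams(messages)
print(top[2][0])  # most common two-word sequence and its count
print(top[3][0])  # most common three-word sequence and its count
```

On real text, the top of such a list fills up with glue like "I don't" and "a lot of," which is the pattern the emoji sequences were being compared against.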

Looking at the top 200 sequences of short emoji combos, about half were pure repetition, such as two tears of joy emoji (😂😂), three loudly sobbing emoji (😭😭😭), or four red heart emoji (❤️❤️❤️❤️). The ones that weren’t simple repetition were often complex repetition, such as snow around a snowman (❄️☃️❄️), the see-no-evil, hear-no-evil, speak-no-evil monkeys (🙈🙉🙊), and kiss faces with kiss marks (😘💋😘💋). Even the most heterogeneous strings of emoji were always thematically similar, such as heart eyes and kiss face (😘😍) or single tear and loudly crying (😢😢😭😭), strings of related objects like birthday paraphernalia (🎂🎈🎉) or fast food (🍕🍖🍗), and strings of hearts in different colors or sizes (💓💕💞💖). To put that data in perspective, there are no repetitions at all in the top 200 sequences of two, three, and four words in the Corpus of Contemporary American English. We don’t even get strings of all nouns or all adjectives the way we got strings of thematic emoji. Sure, words are sometimes used repetitively (“very very very,” “higgledy-piggledy”) and emoji are sometimes used nonrepetitively (🚫❄️ to mean “no snow” or ❤️🍕 to mean “I like pizza”). But these sequences are significantly less common.
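One way to make the "pure repetition" category concrete is to check whether a sequence contains only one distinct emoji. The sketch below is a hedged illustration, not the method used on the SwiftKey data: the helper names `is_pure_repetition` and `repetition_share` are hypothetical, the four sample sequences are borrowed from the examples in the text rather than from any real ranking, and each sequence is assumed to be already split into individual emoji.

```python
def is_pure_repetition(sequence):
    """True when a sequence of two or more items repeats a single emoji."""
    return len(sequence) > 1 and len(set(sequence)) == 1


def repetition_share(sequences):
    """Fraction of the given sequences that are pure repetition."""
    if not sequences:
        return 0.0
    return sum(is_pure_repetition(s) for s in sequences) / len(sequences)


# Sample sequences taken from the examples above, pre-split into emoji.
sequences = [
    ["😂", "😂"],          # pure repetition
    ["😭", "😭", "😭"],    # pure repetition
    ["❄️", "☃️", "❄️"],    # patterned, but not pure repetition
    ["🎂", "🎈", "🎉"],    # thematic string
]

share = repetition_share(sequences)
print(share)
```

The patterned and thematic categories would need a judgment about meaning (which emoji belong to the same theme), so a simple equality check like this one only separates out the first group.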

In fact, the part of communication where we repeat stuff all the time isn’t in our words, it’s in our gestures. Look over to your imaginary co-patrons at the imaginary restaurant again. Here’s someone emphatically looping the air with their finger several times as they exclaim something. Another person nods vigorously. Someone else claps their hand down on the table over and over again to drive home a point. We type “😘😘😘” because we might also blow multiple kisses; we type “👍👍👍👍” because when we give the thumbs-up gesture, we sometimes do it rhythmically or hold it up for several seconds to emphasize it. Just as we can extend the letters of a word for emphasis, even when the result is unpronounceable (“sameeeee”), we occasionally repeat even those emoji that don’t have direct gestural correlates, like the skull (💀💀💀) or the smiling pile of poo (💩💩💩💩), because we’ve generalized this behavior to the category as a whole.

The emoji combinations data we collected also illuminated one final puzzle that clarifies the distinction between co-speech and emblems in emoji use: the case of the missing eggplant emoji. When we looked at the most common nonrepetitive sequences, the eggplant was nowhere to be found. We did find other, less famous, sexual combinations, such as the tongue emoji with water droplets (👅💦), or the pointing finger and the OK sign (👉👌). But the eggplant emoji only showed up as pure repetition (🍆🍆🍆) in our Top 200 lists. Same for the smiling pile of poo: People were happy to repeat it, but reluctant to combine it. What gives?

Co-speech illustrative gestures are fluid, going smoothly from one into the other, with lots of possible shapes and variations for essentially the same meaning. If you describe the path of where you’ve gone today, you’ll use many gestures in a row and you could easily gesture it slightly differently when you tell me about it now and when you told someone else about it a few minutes ago, just as you can illustrate birthday wishes with a wide range of combined emoji. But emblems, on the other hand, are discrete, individual gestures: They can repeat, but they don’t readily combine. That’s why we tend not to see them in interesting emoji sequences. Sending someone all of the possible birthday party emoji is extra festive: great! But sending someone all of the possible phallic emoji (say, the eggplant and the cucumber and the corncob and the banana: 🍆🥒🌽🍌) is NOT extra sexxaayy; that’s just a weird salad.

Future Tense is a partnership of Slate, New America, and Arizona State University that examines emerging technologies, public policy, and society.