Anthropology professor Kendra Calhoun studies the creative language people use on social media platforms to fool algorithms that may incorrectly categorize content as “inappropriate” or “offensive.” Calhoun spoke with News Bureau life sciences editor Diana Yates about this phenomenon, which she calls “linguistic self-censorship.”
In a recent report, you and your coauthor, Alexia Fawcett of the University of California, Santa Barbara, wrote that TikTok’s content-moderation practices sometimes delete or suppress posts that reference contentious subject matter such as suicide, race, gender and sex or sexuality. Does this have a disproportionate effect on some users’ speech?
Yes. These practices disproportionately affect TikTok users who are already structurally marginalized in society based on their identities and backgrounds. In our research we found that users from all types of backgrounds engage in linguistic self-censorship, but those who expressed the most fear of having their content suppressed were primarily people from marginalized groups.
What communities are most likely to have their posts removed or suppressed?
There isn’t publicly available statistical data on TikTok’s moderation practices, but TikTok content creators who are from communities marginalized based on race, gender and/or sexuality have been some of the most vocal about their content being suppressed or removed. A large portion of the examples in our study came from creators who are Black, transgender and/or queer. People whose social and political beliefs go against those of people in power are another group whose content may be at risk of suppression. We’ve seen this recently with pro-Palestinian content in the wake of the Hamas attack on Israel on October 7, 2023, and Israel’s retaliatory war.
Are social media companies doing anything to better protect the free speech of marginalized groups or others while also protecting users from harmful content that violates platform policies?
Social media platforms all have public-facing community guidelines that are supposed to govern their content-moderation practices, but these guidelines still have to be interpreted by moderators so there’s always room for unequal application. Years of research on content moderation and censorship on platforms like YouTube, Twitter, Instagram and now TikTok shows that the suppression of marginalized people’s language is an ongoing problem. But content-moderation decisions are company-internal decisions, so social media users don’t know if or how these companies are attempting to resolve this conflict.
How are users adjusting their own speech to avoid being censored on TikTok?
The main goal of linguistic self-censorship online is to avoid writing or saying specific words or phrases that could be recognized by an algorithmic filter. Users do this in a lot of creative ways by manipulating spelling, sound and meaning, as well as using digital linguistic resources like emojis and speech-to-text technology. Someone might avoid writing the word “gay” by spelling it “ghey,” or replace the word “porn” with the emoji for the rhyming word “corn,” as in “🌽 star.” Users also replace words with homophones that can be correctly interpreted in context, such as “sir-come-sized” for “circumcised.”
Please share some of the most playful examples.
One of the most creatively censored words in our study was “white,” which was censored because of the widespread belief that acknowledging or talking about race, including whiteness, is harmful. Some forms change spelling, using terms like “whyte,” but many others use references to white-colored objects. Emojis for white objects, for example 🦷, 🧻, 🚽, are used as a word-for-word substitution for “white,” such as “🦷 people.” Expressions like “blank google doc” or “8.5x11” – both referring to a white piece of paper – use written descriptions of white objects in place of “white person/people.” Because so many white objects exist in the world, there’s a seemingly endless number of possibilities.
How do the new expressions help build community?
Many self-censored forms rely on in-group knowledge, like existing slang, earlier trends on the platform or cultural practices. When people use those forms, it signals to others in the group that they have some sort of social connection. The expression “le dollar bean” came from the censored form “le$bean” — shorthand for “lesbian” — being mispronounced by TikTok’s text-to-speech feature. The video that popularized the form was posted by a lesbian couple, and “le dollar bean” became a popular phrase for self-description among lesbian TikTok users.
Some forms also signal a specific social or political viewpoint, so using the form can tell others that you are like-minded. “Isnotreal” and “Israhell” — for “Israel” — are two recent examples of self-censored expressions that signal a user is critical of Israel’s current war on Palestinians in Gaza.
Do these linguistic innovations live on in other contexts?
Some of these innovations are very niche or have a brief moment of popularity before fading into the background, as is the case for many TikTok trends. Other self-censored forms have had a long life and migrated to other social media platforms or into offline contexts. The word “unalive” for “kill” continues to be used on TikTok and can also be easily found with a keyword search on Twitter. “Accountant” as a censored form of “sex worker” emerged before TikTok even existed, but its use on the platform helped it spread to new communities. It’s not uncommon for words that originate online to become part of people’s everyday lexicon — think “rizz” or “trolling” — so there’s a high chance that as more innovative forms come into being, they’ll find their way into our offline interactions.