Mapping the United Swears of America

Swearing varies a lot from place to place, even within the same country, in the same language. But how do we know who swears what, where, in the big picture? We turn to data – damn big data. With great computing power comes great cartography.

Jack Grieve, lecturer in forensic linguistics at Aston University in Birmingham, UK, has created a detailed set of maps of the US showing strong regional patterns of swearing preferences. The maps are based on an 8.9-billion-word corpus of geo-coded tweets collected by Diansheng Guo in 2013–14 and funded by Digging into Data. Here’s fuck:

Jack Grieve swear map of USA GI z-score FUCK

The red–blue scale shows relative frequency: the number of times a word occurs in the tweets from a given county is divided by the total number of words from that county (a total that correlates strongly with population density). The result is then smoothed using spatial autocorrelation analysis, with Getis-Ord z-scores mapped to identify clusters. Alaska and Hawaii are not included.
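For the curious, here is a rough Python sketch of the two steps just described: a per-county relative frequency, followed by a Getis-Ord Gi* z-score for each county. It is not Grieve's code. The post doesn't say how neighbouring counties are weighted, so the binary distance-band weights, the `radius` parameter, the function names, and the toy counts and coordinates are all assumptions made for illustration.

```python
import math


def relative_frequency(word_count, total_words):
    """Occurrences of a word divided by all words tweeted from one county."""
    return word_count / total_words if total_words else 0.0


def getis_ord_gi_star(values, coords, radius):
    """Return one Gi* z-score per location.

    Assumption: binary distance-band weights, i.e. w_ij = 1 when location j
    lies within `radius` of location i (location i itself included), else 0.
    """
    n = len(values)
    mean = sum(values) / n
    # Population standard deviation; max() guards against tiny negative rounding.
    s = math.sqrt(max(0.0, sum(v * v for v in values) / n - mean * mean))
    scores = []
    for xi, yi in coords:
        weights = [
            1.0 if math.hypot(xi - xj, yi - yj) <= radius else 0.0
            for xj, yj in coords
        ]
        w_sum = sum(weights)
        w_sq_sum = sum(w * w for w in weights)
        numerator = sum(w * v for w, v in zip(weights, values)) - mean * w_sum
        denominator = s * math.sqrt(max(0.0, n * w_sq_sum - w_sum ** 2) / (n - 1))
        # A zero denominator means no variation, or a band covering every location.
        scores.append(numerator / denominator if denominator else 0.0)
    return scores


# Hypothetical data: six counties in a row, with the word common in the first three.
word_counts = [30, 28, 25, 5, 4, 3]
total_words = [1000] * 6
coords = [(float(i), 0.0) for i in range(6)]

rates = [relative_frequency(c, t) for c, t in zip(word_counts, total_words)]
z_scores = getis_ord_gi_star(rates, coords, radius=1.0)
for rate, z in zip(rates, z_scores):
    print(f"rate={rate:.3f}  Gi*={z:+.2f}")
```

Positive scores mark counties sitting in a cluster of high relative frequency (red on the maps), negative scores a cluster of low frequency (blue). Because each score pools a county with its neighbours, a single outlying county gets pulled towards its surroundings, which is where the smoothing comes from.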

Polysemy – a word’s multiple meanings – has not been controlled for in the maps, so the hell map includes straight religious uses as well as sweary ones, the pussy map includes cat references, and so on. But the maps are nonetheless highly suggestive of differential swearword (and minced oath) clustering in different parts of the country.

Hell, damn and bitch are especially popular in the south and southeast. Douche is relatively common in northern states. Bastard is beloved in Maine and New Hampshire, and those states – together with a band across southern Arizona, New Mexico, and Texas – are the areas of particular motherfucker favour. Crap is more popular inland, fuck along the coasts. Fuckboy – a rising star* – is also mainly a coastal thing, so far.

Here’s the full glorious set in alphabetical order:

Jack Grieve swear map of USA GI z-score ASSHOLE

Jack Grieve swear map of USA GI z-score BASTARD

Jack Grieve swear map of USA GI z-score BITCH

Jack Grieve swear map of USA GI z-score CRAP

Jack Grieve swear map of USA GI z-score CUNT

Jack Grieve swear map of USA GI z-score DAMN

Jack Grieve swear map of USA GI z-score DARN

Jack Grieve swear map of USA GI z-score DOUCHE

Jack Grieve swear map of USA GI z-score FAGGOT

Jack Grieve swear map of USA GI z-score FUCK

Jack Grieve swear map of USA GI z-score FUCKBOY

Jack Grieve swear map of USA GI z-score GOSH

Jack Grieve swear map of USA GI z-score HELL

Jack Grieve swear map of USA GI z-score MOTHERFUCKER

Jack Grieve swear map of USA GI z-score PUSSY

Jack Grieve swear map of USA GI z-score SHIT

Jack Grieve swear map of USA GI z-score SLUT

Jack Grieve swear map of USA GI z-score WHORE

As Grieve put it, ‘pretty much everyone’s swearing. We just don’t all prefer the same words’. You can see more word-maps on his research blog and various publications elsewhere on his website. He and colleagues have been measuring the 100,000 most common words in American English (as manifested in the tweet corpus), so additional maps will be appearing, and he tells me Diansheng is also collecting UK data.

For more on the method of spatial analysis used to create the maps, see for example Grieve’s ‘A regional analysis of contraction rate in written Standard American English’ (PDF), or ‘A statistical method for the identification and aggregation of regional linguistic variation’ (PDF) (co-written with Dirk Speelman and Dirk Geeraerts), both from 2011.

Updates:

See my follow-up post, Sweary maps 2: Swear harder, for ~60 more sweary heat maps and a link to Jack Grieve’s Word Mapper app, where you can run your own searches.

Some composite maps, including swears not covered above, are now available on Grieve’s blog. Here’s one with bollocks, bloody, piss, and crap:

jack grieve swear map - piss crap bollocks bloody

Picked up by Washington Post, Kottke, Fusion, MetaFilter, Discovery, AJC, Mental Floss, WaPo again.

*

* Grieve’s presentation ‘Mapping lexical spread in American English’ (PDF) has data on the fastest-growing words on Twitter in 2014, among other delights. Four of the top 10 are based on fuck. We’re becoming sweary asf.

jack grieve - top 10 rising words on Twitter 2014

108 thoughts on “Mapping the United Swears of America”

  1. Dan March 28, 2016 / 12:59 pm

    This is some fucking bullshit.

  2. Mad Chef June 23, 2016 / 7:59 pm

    I fucking love this site! I thought I knew how to swear until I joined the Marine Corps! I learned a whole new level and heard fantastic swearing from all over the country!

  3. MHChicago July 6, 2016 / 12:20 pm

    Anybody besides me bothered by the inclusion of hate language (faggot, slut, etc) with “swears”? I don’t just mean made uncomfortable – but also questioning the methodology.

    • Stan Carey July 6, 2016 / 12:26 pm

      We use ‘swear’ on this blog as a convenient catch-all term for taboo vocabulary in its many forms: this includes slurs and epithets. It’s not clear how this relates to Grieve and colleagues’ research methodology.

      • MHChicago July 6, 2016 / 2:02 pm

        Gotcha. Still wondering why these terms and not others – maybe some of the more “loaded” terms get blocked by Twitter as hate speech and so don’t show up in their sample. Maybe I’ll take a look at the original study.

    • Liz Lemmey August 10, 2016 / 8:47 am

      I’m not unbothered.

  4. Darren July 6, 2016 / 12:49 pm

    I notice a trend of low z-score for Colorado and specifically the Denver metropolitan area. Is there a way to substantiate that across the board with a single graphic that there are indeed some places that just don’t “swear” much online?

  5. Aidan Beers March 13, 2017 / 9:54 pm

    A small note: I think you mean spatial autocorrelation in your third paragraph instead of spatial autocorrection. Spatial autocorrelation means that the z value (e.g. elevation, Getis-Ord) for any one point (or pixel/cell) is similar to those cells near to it because they are close to each other, and the process driving that z value is not independent at that scale.

    • Stan Carey March 14, 2017 / 8:34 am

      Thanks for pointing this out. It’s fixed now.
