Collocations of ‘cock’: What corpus linguistics tells us about porn writing

This is a guest post by Orin Hargraves, an independent lexicographer, language researcher, and past president of the Dictionary Society of North America. Orin is the author of several language reference books, including It’s Been Said Before: A Guide to the Use and Abuse of Clichés (Oxford) and Slang Rules!: A Practical Guide for English Learners (Merriam-Webster).

*

A few years ago I wrote about how collocations in fiction skew the statistics of collocations in a corpus because of their extremely frequent use; Ben Zimmer expanded on the idea in a later New York Times piece. In summary, the point is that a number of collocations would not be statistically significant were it not for their appearance in fiction. This is because writers of fiction—particularly writers of the amateur, unedited fiction that appears online—tend to reuse the same tropes and phrases so much that these effectively become clichés, formulaic ways of expressing the same (rather tired) ideas and events.

All of that came to light when I was working with the Oxford English Corpus, a well-balanced and carefully curated corpus that, at the time, had about two billion words of English. These days I’m working with the enTenTen13 corpus, a web-crawled corpus of nearly 20 billion words, owned and made available by Sketch Engine. Sketch Engine’s web-crawler roves the Internet indiscriminately, pulling text from wherever it can be found. Like some grandmother aghast in Greenville, the web-crawler regularly comes upon sites with pornographic content. The difference between the grandmother and the web-crawler is that while she may avert her gaze in shock and dismay, the web-crawler grabs the text, parses and tags it, and adds it to the corpus. The result is that enTenTen13 houses a steaming, pulsating trove of pornographic writing.

Literary expletive avoidance

‘Show, don’t tell’, goes the writer’s refrain. It can apply to cursing, too, but doesn’t tend to in contemporary prose. Swearwords pepper modern novels, not least in genres like detective fiction where they lend colour and authenticity to hard-boiled dialogue. But there are times when a writer can say more by not saying them.

Take Deirdre Madden’s novel Molly Fox’s Birthday. (Or better yet, read it.) Madden has a gift for imaginative description but knows when to apply the subtler force of discretion. Here the narrator, a playwright, is chatting by phone to her friend Molly Fox, a stage actor with what we have learned is a remarkable voice, ‘clear and sweet’ and at times ‘infused with a slight ache, a breaking quality that makes it uniquely beautiful’.

Molly has just received birthday wishes from a mutual friend:
