Dispatches from the Meme War

Most common nicknames for Hillary Clinton in comments on conservative-leaning news sites during the 2016 election cycle

"Hillary" occurred 2,565,207 times in 1.25 billion tokens

Most common nicknames for Donald Trump in comments on liberal-leaning news sites

"Trump" occurred 2,389,829 times in 490 million tokens

Methodology

All data comes from comments made on news sites using Disqus between January 20, 2016, and January 20, 2017. "Conservative" and "liberal" sites were identified using AllSides.

Specifically, the conservative sites used were Breitbart, Infowars, The Washington Times, Newsbusters, Newsmax, The Gateway Pundit, Frontpage Mag, American Thinker, CNS News, TownHall.com, The Daily Caller, and The Federalist.

The liberal sites were Media Matters, Mother Jones, The Atlantic, Rolling Stone, The Salt Lake Tribune, CBS News, RawStory, Mediaite, and TruthOut.

Data was gathered over the course of many weeks using Disqus's free API. In total, there were about 75 million comments with an average of 28 tokens each.

I had much more data from conservative sites than from liberal ones: 490 million word tokens from the left and 1.25 billion from the right. Breitbart was, by far, the biggest data source, with 695 million tokens on its own.
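Because the two corpora differ so much in size, the raw counts reported above for "Hillary" and "Trump" are not directly comparable. A rough way to put them on the same scale is occurrences per million tokens. The sketch below uses only the figures quoted in this post; the function name `per_million` is my own, not something from the original code.

```python
def per_million(count, total_tokens):
    """Occurrences of a word per one million tokens of corpus."""
    return count / total_tokens * 1_000_000


# Figures quoted in this post.
hillary_rate = per_million(2_565_207, 1_250_000_000)  # conservative corpus
trump_rate = per_million(2_389_829, 490_000_000)      # liberal corpus

print(f"hillary: {hillary_rate:.0f} per million tokens")  # 2052
print(f"trump:   {trump_rate:.0f} per million tokens")    # 4877
```

On these numbers, "Trump" shows up in the liberal comments a bit more than twice as often, per token, as "Hillary" does in the conservative ones.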

To identify nicknames, I trained a Word2Vec model on the corpus of comments from each side. In the conservative corpus I collected every word with a cosine similarity of at least 0.5 to "hillary", and in the liberal corpus every word with a cosine similarity of at least 0.5 to "trump", keeping only words that occurred more than 100 times. I then manually filtered out adjectives, verbs, words that clearly referred to someone else (like Obama or Bernie), and boring words like "clinton" and "president." I did not filter out typos.
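The filtering step can be sketched in a few lines. This is an illustration of the thresholds described above, not the code actually used: the toy two-dimensional vectors, the counts, and the names `candidate_nicknames`, `nickname_a`, `nickname_b`, and `rare_typo` are all made up, standing in for the vectors and vocabulary counts a trained Word2Vec model would provide.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def candidate_nicknames(target, vectors, counts,
                        min_similarity=0.5, min_count=100,
                        excluded=frozenset()):
    """Words similar enough to `target` and frequent enough to keep.

    `excluded` stands in for the manual pass that drops adjectives,
    verbs, other people, and boring words.
    """
    target_vec = vectors[target]
    scored = []
    for word, vec in vectors.items():
        if word == target or word in excluded:
            continue
        if counts.get(word, 0) <= min_count:  # "over 100 occurrences"
            continue
        similarity = cosine_similarity(target_vec, vec)
        if similarity >= min_similarity:
            scored.append((word, similarity))
    # Most similar first.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy data, not taken from the real corpus.
toy_vectors = {
    "hillary": [1.0, 0.0],
    "nickname_a": [0.9, 0.1],
    "nickname_b": [0.6, 0.8],
    "clinton": [1.0, 0.05],
    "unrelated": [0.0, 1.0],
    "rare_typo": [1.0, 0.0],
}
toy_counts = {
    "hillary": 5000,
    "nickname_a": 400,
    "nickname_b": 150,
    "clinton": 3000,
    "unrelated": 900,
    "rare_typo": 40,
}

result = candidate_nicknames("hillary", toy_vectors, toy_counts,
                             excluded={"clinton"})
print([word for word, _ in result])
```

Here `clinton` is dropped by the manual exclusion list, `rare_typo` by the frequency cutoff, and `unrelated` by the similarity cutoff, leaving the two placeholder nicknames.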

Observations

This was inspired by, and the code almost entirely stolen from, the LeXicon of LeBron.