For non-UK readers, the EU referendum is to take place on Thursday 23rd June 2016, and the UK will vote either to remain in or leave the European Union.
There is much buzz around social media and the referendum, and I thought I’d delve into some analysis. However, as an academic looking at this critically, and having published several papers using Twitter data, I have to state that:
- Twitter is a highly non-uniform sample of the population. Not everyone in the UK uses the Internet, and of those who do, only a subset use Twitter.
- Twitter also allows members of the public to hold more than one Twitter account, so in theory one user could set up several accounts to post about vote leave or vote remain.
- There is also the issue of bots: Twitter accounts that tweet in large volumes automatically or look to mimic real users.
With such caveats in place, let’s look at some recent analysis produced using different analytic programs.
Using hashtagify I located the most frequently used hashtags associated with #EUref. I then used the quick trends explorer offered by Visibrain Focus to compare the frequency of the VoteRemain and VoteLeave hashtags.
Below is a time series graph of the hashtags VoteRemain and VoteLeave:
Within the last 30 days, over 900 thousand tweets contained VoteLeave, more than contained VoteRemain. This suggests that the VoteLeave campaign is more active on Twitter.
Here is a more complicated graph with a number of hashtags compared against each other such as StrongerIn, Brexit, VoteRemain, VoteLeave:
As the graph above demonstrates, Brexit has been used in over three million instances within the previous month. However, many news articles and general media coverage use this term (see G2 in the NodeXL graph below). Therefore, it is difficult to attach the Brexit hashtag solely to those who wish to vote to leave the EU.
Now let’s take a look at the data using @NodeXL, which can produce network graphs alongside comprehensive reports that are uploaded to the NodeXL graph gallery.
Using data which was already published by NodeXL, I then examined the EURef hashtag, which is impartial as opposed to VoteLeave or VoteRemain:
The network graph below displays 6,259 Twitter users whose recent tweets contained “EURef”, or who were replied to or mentioned in those tweets over the 3-hour, 46-minute period from Friday, 10 June 2016 at 12:58 UTC to Friday, 10 June 2016 at 16:44 UTC.
The network graph is made up of several groups of Twitter users. Notable highlights of the report are that:
Group 1 (top left) contains the following most frequently used hashtags:
[8009] euref
[1888] brexit
[1122] bftownhall
[859] voteleave
[360] itveuref
[350] voteremain
[342] strongerin
[330] remain
[302] leave
[288] bbcqt
Group 1 contains hashtags and URLs which point to both the vote remain and vote leave campaigns. This suggests it may be a polarised group (see Smith, Rainie, Shneiderman, & Himelboim, 2014).
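As a rough check on how evenly the two sides appear in Group 1, the hashtag counts listed above can be tallied by side. This is only a minimal sketch and not part of the NodeXL report: the split into LEAVE_TAGS and REMAIN_TAGS is my own assumption, and brexit is left out because, as noted earlier, it is used by the media as well as by leave supporters.

```python
# Hashtag counts for Group 1, copied from the list above.
group1_hashtags = {
    "euref": 8009,
    "brexit": 1888,
    "bftownhall": 1122,
    "voteleave": 859,
    "itveuref": 360,
    "voteremain": 350,
    "strongerin": 342,
    "remain": 330,
    "leave": 302,
    "bbcqt": 288,
}

# Assumption: a rough, hand-made labelling of which hashtags lean which way.
LEAVE_TAGS = {"voteleave", "leave"}
REMAIN_TAGS = {"voteremain", "strongerin", "remain"}


def side_totals(counts):
    """Sum the hashtag counts for the leave-leaning and remain-leaning tags."""
    leave = sum(n for tag, n in counts.items() if tag in LEAVE_TAGS)
    remain = sum(n for tag, n in counts.items() if tag in REMAIN_TAGS)
    return leave, remain


leave_total, remain_total = side_totals(group1_hashtags)
leave_share = leave_total / (leave_total + remain_total)
print(f"leave: {leave_total}, remain: {remain_total}, leave share: {leave_share:.1%}")
```

Under this labelling the two sides come out fairly close, which fits the reading of Group 1 as a group in which both campaigns are present. Bear in mind that these are hashtag mentions rather than users, so the caveats about multiple accounts and bots still apply.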
In Group 2, it is interesting to observe an isolates group: a number of users who are not connected to each other are tweeting using the hashtag. For instance, they may be tweeting media stories. This is one possible reason why the Brexit term is used so frequently.
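To make the idea of an isolates group concrete, here is a minimal sketch of how isolates could be picked out of a mention network. It is not how NodeXL does it internally; the function name find_isolates and the example users are hypothetical, and the only assumption is that an edge is a (source, target) pair for a reply or mention.

```python
def find_isolates(users, edges):
    """Return the users who have no reply or mention link to any other user."""
    connected = set()
    for source, target in edges:
        # A self-loop (a tweet that mentions nobody else) does not connect
        # the user to anyone, so it is ignored here.
        if source != target:
            connected.add(source)
            connected.add(target)
    return sorted(set(users) - connected)


# Hypothetical example data, not taken from the EURef dataset.
users = ["alice", "bob", "carol", "dave"]
edges = [("alice", "bob"), ("carol", "carol")]

isolates = find_isolates(users, edges)
print(isolates)
```

In this toy example carol and dave are isolates: carol only tweeted without mentioning anyone, and dave appears in no edge at all. Users like these use the hashtag but do not interact with others who use it.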
A range of different domains are being used within this campaign, including Facebook, the Guardian, and YouTube. The full list is below:
[798] twitter.com
[685] co.uk
[449] twimg.com
[183] facebook.com
[129] trib.al
[98] org.uk
[97] theguardian.com
[88] youtube.com
[88] ac.uk
[54] buzzfeed.com
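Entries such as co.uk, org.uk and ac.uk in the list above are suffixes shared by many different sites rather than single websites. One possible explanation, and this is only a guess on my part rather than a description of how the report was produced, is that each link is reduced to the last two parts of its hostname. The sketch below shows that effect; short_domain is a hypothetical helper and the URLs are illustrative, not taken from the dataset.

```python
from collections import Counter
from urllib.parse import urlparse


def short_domain(url):
    """Keep only the last two dot-separated labels of the URL's hostname."""
    host = urlparse(url).hostname or ""
    return ".".join(host.split(".")[-2:])


# Illustrative URLs only (assumption), not taken from the EURef dataset.
example_urls = [
    "http://www.bbc.co.uk/news",
    "https://www.telegraph.co.uk/politics",
    "https://www.theguardian.com/politics",
    "https://www.youtube.com/watch",
]

domain_counts = Counter(short_domain(url) for url in example_urls)
print(domain_counts.most_common())
```

If something like this is going on, the co.uk figure would combine links to many separate UK sites, so it should not be read as the count for any one outlet.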
The full NodeXL report, including top influencers, top URLs, top domains, hashtags, keywords, word pairs, and replied to, can be found within an interactive version of the graph. This was produced by Marc Smith, who resides in Belmont, CA, USA.
I’d also like to mention the ongoing work by colleagues:
- John Swain examining the possible echo chamber effect of #voteleave
- Georgina Parsons examining the live ITV debate which produced over 200,000 tweets in one evening.
Any questions? Feel free to drop me a message (@was3210).
Disclaimer: At no time was any personally identifiable data and/or information physically stored and/or analysed by myself and/or using any of my own equipment. The post draws on the various analyses conducted by others.