Using @NodeXL to analyse @foodgov and associated hashtags

In this blog post I want to analyse the Food Standards Agency (FSA), specifically their Twitter handle @foodgov, by producing a network graph and associated analytics using the very powerful Microsoft Excel plugin NodeXL. I then want to analyse the top 5 hashtags by creating 5 further network graphs.

I selected the FSA as I had the opportunity to attend an event at Twitter HQ, London, where the Head of Information Management, Dr Sian Thomas, provided some insight into the innovative and intuitive methods the FSA have applied, both in using social media data and in reaching the public via allergy awareness campaigns and the use of influencers (that is to say, users who may have a bigger reach or a different type of following compared to the FSA). My report on the event, which provides more context to the work by the FSA, can be found here.

In the network graphs, G1, G2, G3, etc. refer to different groups of users, and the words at the top of each group are those that occur most frequently. More analytics, such as the top URLs overall and in each separate group, can be found in the NodeXL graph gallery (in this blog post I have hyperlinked each of the graphs, so clicking on Network graph 1, for example, will take you to the graph gallery version of that network graph).

I find network graphs useful for summarising and providing a snapshot of what users are talking about on Twitter in relation to a keyword, hashtag, or user handle at any given time. One topic, e.g., bird flu, may generate a range of conversations, and this would be represented in the network graph by a number of different groups with associated keywords and URLs. For each graph I have added a section where I briefly mention what I found interesting about it.

Network graph 1 – Tweets containing @foodgov
@foodgov

The graph above represents a network of 441 Twitter users whose recent tweets contained “@foodgov”, or who were replied to or mentioned in those tweets, taken from a data set limited to a maximum of 10,000 tweets.
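The way such a network is put together can be sketched in a few lines of Python. This is only a rough illustration of the idea (users become nodes, and a reply or mention becomes a connection between two users), not NodeXL's own import logic. The sample tweets, the field names (`author`, `text`, `reply_to`), and the `build_edges` helper are all made up for the example.

```python
from collections import defaultdict

# Hypothetical sample tweets; the field names are assumptions for
# illustration and do not reflect NodeXL's import format.
tweets = [
    {"author": "alice", "text": "Thanks @foodgov for the allergy alert", "reply_to": None},
    {"author": "bob", "text": "@alice agreed, useful update from @foodgov", "reply_to": "alice"},
    {"author": "carol", "text": "Reading the latest @foodgov advice", "reply_to": None},
]


def build_edges(tweets):
    """Return a dict mapping (source, target) to the number of linking tweets."""
    edges = defaultdict(int)
    for tweet in tweets:
        source = tweet["author"].lower()
        targets = set()
        # Mentions: every @handle that appears in the tweet text.
        for word in tweet["text"].split():
            if word.startswith("@") and len(word) > 1:
                targets.add(word[1:].strip(".,:;!?").lower())
        # Replies: the user the tweet responds to, if any.
        if tweet["reply_to"]:
            targets.add(tweet["reply_to"].lower())
        for target in targets:
            if target and target != source:
                edges[(source, target)] += 1
    return dict(edges)


edges = build_edges(tweets)
# Every user who tweeted, or was replied to or mentioned, is a node.
users = {user for edge in edges for user in edge}
```

Here `users` holds everyone who either posted a matching tweet or was replied to or mentioned in one, which is the same definition of "network of Twitter users" used in the graph descriptions in this post.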

An interesting observation in this network graph: the top URLs included:

FSA advice about avian (bird) flu
FSA Board agrees restrictions on raw milk should remain
Suspected bird flu found on Lancashire poultry farm
Campylobacter Action Plan – Our Progress
J & K Smokery Ltd recalls vacuum packed smoked fish because of concerns over Clostridium botulinum controls

What I am interested in for this post is the top 5 hashtags in the entire graph, which were:

fsaboard
birdflu
rawmilk
recall
foodallergy

So, one by one, I entered these hashtags into NodeXL to create 5 further network graphs.
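NodeXL reports the top hashtags for you, but the ranking itself is easy to picture. The sketch below is a minimal illustration using made-up tweet texts. It assumes hashtags are compared case-insensitively and counted once per tweet, which is my own choice for the example and may differ from how NodeXL counts them. The `top_hashtags` helper is hypothetical.

```python
from collections import Counter
import re

# Hypothetical tweet texts, for illustration only.
tweet_texts = [
    "FSA Board meeting today #fsaboard #rawmilk",
    "Latest advice on #birdflu from the #FSABoard",
    "Product #recall issued, see details #fsaboard",
    "More on #birdflu in Lancashire",
]


def top_hashtags(texts, n=5):
    """Count hashtags case-insensitively and return the n most common."""
    counts = Counter()
    for text in texts:
        # Use a set so a hashtag repeated within one tweet counts once.
        counts.update({tag.lower() for tag in re.findall(r"#(\w+)", text)})
    return counts.most_common(n)


ranking = top_hashtags(tweet_texts)
```

With the sample texts above, `ranking` starts with `fsaboard` followed by `birdflu`; each of the leading hashtags could then be used as the search term for a further graph.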

Network graph 2 – fsaboard

fsaboard

The graph represents a network of 100 Twitter users whose recent tweets contained “#fsaboard”, or who were replied to or mentioned in those tweets, taken from a data set limited to a maximum of 10,000 tweets.

An interesting observation in this network graph: The @foodgov account was the most influential in this network graph (ranked by betweenness centrality). The top URL in G1 and overall in the graph was: FSA Board agrees restrictions on raw milk should remain, and one of the top keywords in this group was ‘raw milk’, indicating that discussion revolved around this news article.
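For readers unfamiliar with betweenness centrality: it measures how often an account sits on the shortest path between two other accounts, so accounts that bridge otherwise separate users score highly. The sketch below is a plain Python version of Brandes' algorithm for an unweighted, undirected graph. It is meant to illustrate the measure rather than reproduce NodeXL's implementation, and the toy network and its node names are hypothetical.

```python
from collections import deque


def betweenness_centrality(adjacency):
    """Brandes' algorithm for an unweighted, undirected graph.

    adjacency maps each node to the set of its neighbours.
    """
    centrality = {node: 0.0 for node in adjacency}
    for source in adjacency:
        # Breadth-first search from source, counting shortest paths.
        stack = []
        predecessors = {node: [] for node in adjacency}
        num_paths = {node: 0 for node in adjacency}
        distance = {node: -1 for node in adjacency}
        num_paths[source] = 1
        distance[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in adjacency[v]:
                if distance[w] < 0:  # first time w is reached
                    distance[w] = distance[v] + 1
                    queue.append(w)
                if distance[w] == distance[v] + 1:  # v lies on a shortest path to w
                    num_paths[w] += num_paths[v]
                    predecessors[w].append(v)
        # Walk back from the farthest nodes, accumulating dependencies.
        dependency = {node: 0.0 for node in adjacency}
        while stack:
            w = stack.pop()
            for v in predecessors[w]:
                dependency[v] += num_paths[v] / num_paths[w] * (1 + dependency[w])
            if w != source:
                centrality[w] += dependency[w]
    # Each undirected pair was counted from both ends, so halve the totals.
    return {node: score / 2 for node, score in centrality.items()}


# Hypothetical toy network: a hub account linked to three users, one of
# whom ("user_c") is the only link to a fifth user ("user_d").
adjacency = {
    "hub": {"user_a", "user_b", "user_c"},
    "user_a": {"hub"},
    "user_b": {"hub"},
    "user_c": {"hub", "user_d"},
    "user_d": {"user_c"},
}
scores = betweenness_centrality(adjacency)
most_influential = max(scores, key=scores.get)
```

In this toy network the hub scores highest because most shortest paths between the other users pass through it, which is the sense in which an account is described as "most influential" in the graphs in this post.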

Network graph 3 – birdflu

birdflu

The graph represents a network of 876 Twitter users whose recent tweets contained “#birdflu”, or who were replied to or mentioned in those tweets, taken from a data set limited to a maximum of 10,000 tweets.

An interesting observation in this network graph: In G1 a number of Twitter users (who are not connected to each other) are relaying the message, i.e., posting a tweet that contains the keyword or hashtag ‘birdflu’. The top URL in the entire graph and in G1 was Avian flu confirmed in Lancashire.

Network graph 4 – rawmilk

rawmilk

The graph represents a network of 175 Twitter users whose recent tweets contained “#rawmilk”, or who were replied to or mentioned in those tweets, taken from a data set limited to a maximum of 10,000 tweets.

An interesting observation in this network graph: In G2 a number of unconnected users are relaying, i.e., tweeting about, a news article related to scientific risk assessments that were recently published in the Journal of Food Protection. Drawing on these results, the author of the article suggests that raw milk is ‘remarkably’ safe. The top 3 URLs overall and the top URL in G2 all link to the aforementioned news story: New Science Confirms that Drinking Raw Milk is Remarkably Safe.

Network graph 5 – recall

recall

The graph represents a network of 1,777 Twitter users whose recent tweets contained “#recall”, or who were replied to or mentioned in those tweets, taken from a data set limited to a maximum of 10,000 tweets.

An interesting observation in this network graph: The most influential Twitter account in the entire graph is @usdafoodsafety (ranked by betweenness centrality). In G1 a number of unconnected users are relaying messages, i.e., tweeting about products (mostly food) being recalled. The top URL overall in the graph is a WordPress website which provides news and email alerts on product recalls. These are not always food products, which explains top keywords such as ‘gm’ and ‘India’: General Motors recently had to recall a large number of vehicles due to a wiring problem.

Network graph 6 – foodallergy

foodallergy

The graph represents a network of 662 Twitter users whose recent tweets contained “#foodallergy”, or who were replied to or mentioned in those tweets, taken from a data set limited to a maximum of 10,000 tweets.

An interesting observation in this network graph: The top hashtags in this graph were foodallergy, faact, and peanutallergy. The top co-words (word pairs) were: food, allergy; peanut, patch; foodallergy, friendly; phase, iii; iii, trials; and trials, foodallergy. The most influential Twitter account was @foodallergy. The top URL in the entire graph was ‘Peanut Patch’ Heads to Phase III Trials.
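Co-words can be thought of as pairs of words that appear next to each other in a tweet. The sketch below is a minimal illustration under that assumption, using made-up tweet texts. How NodeXL tokenises tweets and which words it skips is not covered here, and the `top_word_pairs` helper is hypothetical.

```python
from collections import Counter
import re

# Hypothetical tweet texts, for illustration only.
tweet_texts = [
    "Peanut patch heads to phase III trials #foodallergy",
    "Good news on the peanut patch",
    "Food allergy friendly recipes",
]


def top_word_pairs(texts, n=3):
    """Count pairs of adjacent words and return the n most common."""
    counts = Counter()
    for text in texts:
        # Lowercase and keep only word characters, so "#foodallergy"
        # is treated as the word "foodallergy".
        words = re.findall(r"\w+", text.lower())
        # zip pairs each word with the word that follows it.
        counts.update(zip(words, words[1:]))
    return counts.most_common(n)


pairs = top_word_pairs(tweet_texts)
```

With these sample texts, the pair `peanut, patch` comes out on top because it is the only pair that appears in more than one tweet.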

This blog post has presented some analytics on the @foodgov Twitter account and associated hashtags using NodeXL. There is much more going on within each graph, and I have only highlighted what I found interesting, particularly from a health informatics perspective. Written consent was obtained (via a tweet) to analyse the FSA’s Twitter account and/or associated keywords and hashtags related to it.

For anyone wanting to learn more about NodeXL and network graphs check out this video – Network Mapping the Ecosystem by Marc Smith (@marc_smith) and this excellent article Mapping Twitter Topic Networks: From Polarized Crowds to Community Clusters.
