We have been working hard here at CASA lately, building tools to collect, analyse and visualise different data sets from all over the web. One piece of software that has proven quite popular over the past few months is our internal Twitter Collector.
The collector mines Twitter for tweets inside a geographical radius, either with a specified hashtag, for example #CASA, or as a plain latitude/longitude search. This data can then be analysed in a variety of ways, from aggregating it into heat maps to sentiment analysis of individual tweets. The collector also gathers the data that powers both our digital and analogue Tweet-o-Meters.
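
To make the idea of a radius search concrete, here is a minimal sketch in Python. It is not the collector's actual code: the tweet dictionaries with 'lat', 'lon' and 'text' keys, and the function names, are assumptions made purely for illustration. It keeps tweets whose great-circle (haversine) distance from a centre point is within a given radius, optionally requiring a hashtag.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def within_radius(tweets, centre_lat, centre_lon, radius_km, hashtag=None):
    """Return tweets inside the radius, optionally filtered by hashtag.

    `tweets` is an iterable of dicts with hypothetical keys
    'lat', 'lon' and 'text' (an assumed shape, not Twitter's format).
    """
    wanted = hashtag.lower() if hashtag else None
    matches = []
    for tweet in tweets:
        distance = haversine_km(centre_lat, centre_lon,
                                tweet["lat"], tweet["lon"])
        if distance > radius_km:
            continue
        if wanted is not None:
            # Compare whole tokens, ignoring case and trailing punctuation.
            tokens = {w.strip(".,!?:;") for w in tweet["text"].lower().split()}
            if wanted not in tokens:
                continue
        matches.append(tweet)
    return matches
```

Calling within_radius(tweets, 51.5074, -0.1278, 30, "#CASA") would keep only tweets tagged #CASA within 30km of a point in central London; leaving out the hashtag gives the plain latitude/longitude search.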

Oliver O’Brien and I have used data collected over a three-week period in February and March to create an overview of social media activity in the Greater London area for Contact, the Royal Mail magazine for marketers. The heat map marks the intensity of 200,000 tweets within a 30km radius of the geographical centre of London. Some interesting findings from this Twitter map were the appearance of roadwork traffic in the Rickmansworth area and the totally empty parks of Hampstead Heath, Hyde Park and Richmond Park.
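
As a rough illustration of how tweet locations can be turned into heat map intensities, the sketch below counts points per grid cell. This is one simple approach rather than the method used for the published map; the bin_points function and the cell_deg parameter are hypothetical names, and cells of equal size in degrees are only approximately equal in area.

```python
import math
from collections import Counter


def bin_points(points, cell_deg=0.01):
    """Count (lat, lon) points per grid cell of size cell_deg degrees.

    Returns a Counter keyed by (row, column) cell indices; the counts
    can then be mapped to colours to draw a heat map.
    """
    counts = Counter()
    for lat, lon in points:
        cell = (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
        counts[cell] += 1
    return counts
```

Cells with no tweets simply never appear in the result, which is how an empty area such as a large park would show up as a gap in the map.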

We are quite excited, not only about the publication of this map, but also about the implications of this research, as it fuses together our social media mining and our interests in big data. The fact that we can pinpoint events and geographical areas just by mining Twitter data is very powerful indeed, and we have only just scratched the surface of this fascinating research topic.

Harry Wood says,
Maybe people don’t get a location lock when they’re in big parks (no wifi networks), or maybe people with smartphones are too busy to be chilling out in parks. Most likely it’s just that there’s less stuff to bitch about when you’re in a park.
on 19 May 2011 / 11:28 AM
Kentman says,
Further to Harry’s post, it would be interesting to know whether the absence of tweets from parks is linked to a lack of available signal (technological) or of activity (social). As tweets can be traced to dates and times, it might be useful to link up with the local authority managing a particular park to see what their visitor numbers are: if visitor numbers are high and tweets low, but signal is not an issue, we could infer that the social aspect of going to the park overrides the need to tweet, etc.
It would also be nice to open up the mining service!
on 19 May 2011 / 2:26 PM
Tweets in London says,
[...] detail on Steven’s Big Data Toolkit blog. [...]
on 25 May 2011 / 12:33 PM
Suprageography says,
[...] detail on Steven’s Big Data Toolkit [...]
on 25 May 2011 / 8:40 PM
Nile says,
I’m prepared to bet that the small red clusters in the suburbs are secondary schools.
How can this hypothesis be tested?
on 11 June 2011 / 9:08 PM
Les data en forme » OWNI, News, Augmented says,
[...] at University College London. Designed by Fabian Neuhaus from data collected by Steven Gray, this virtual island with its troubling phallic potential is said to represent the city of London [...]
on 30 January 2012 / 2:45 PM
The Week In Data » OWNI.eu, News, Augmented says,
[...] for Advanced Spatial Analysis & Visualization, part of University College London. Designed by Fabian Neuhaus and based on data collected by Steven Gray, this troubling phallic virtual island represents the [...]
on 06 February 2012 / 10:26 AM