Geotagged Tweet Heat

So I’ve worked through the Python course and have been looking for where to go next.

I figured learning to use the Twitter API and the Tweepy library would be a way to gather some really interesting real-world spatial data.

My first idea was to scrape all the tweets directed at @qanda that have a coordinate location attached. Q and A is a very popular Australian panel discussion show that attracts 20,000 to 40,000 tweets an episode (Source). My thinking was that there should be enough tweets with coordinate data to get an idea of the distribution of @qanda tweets within the major Australian cities. I was mistaken.

During the episode for which I ran the code, only 48 tweets were returned, and something like 40 of these looked like they came from the same household.

So, still interested in Twitter’s potential for gathering coordinate data, I decided to cast the net wider.

My next aim was to find places in the world where there are a lot of tweets being made that DO have coordinate data attached. This would allow me to develop an idea of places I could carry out fine scale spatial analyses of tweets.

The main challenges I faced in devising the code were:

- Ensuring that the code would only save tweets that had a specific point coordinate, excluding tweets with other location data such as city and country name, or a bounding box of coordinates.

- Maintaining a relatively stable number of tweets saved per second. This was designed to limit the overall database size.

- Ensuring that the code did not stop running due to issues with connection speed.
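The first two challenges come down to two small checks. The sketch below is a minimal illustration rather than the exact code that was run; the helper names `has_point_coordinate` and `under_rate_cap` are made up for this example, and it assumes each tweet has already been parsed from JSON into a dict.

```python
def has_point_coordinate(tweet):
    """Return True only if the tweet carries an exact point location.

    Tweets tagged with just a place name or a bounding box have a null
    "geo" entry, so they are rejected here.
    """
    geo = tweet.get("geo")
    return geo is not None and len(geo.get("coordinates", [])) == 2


def under_rate_cap(saved_so_far, seconds_elapsed, max_per_second=5):
    """Return True while the average save rate is below the cap."""
    if seconds_elapsed <= 0:
        return True
    return saved_so_far / float(seconds_elapsed) < max_per_second
```

A tweet would only be written to disk when both checks pass, which keeps the file to point locations and holds the average growth of the file roughly steady.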

Here is the code I ended up running:

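from __future__ import absolute_import, print_function

from tweepy.streaming import StreamListener
from tweepy import OAuthHandler
from tweepy import Stream
from time import gmtime, strftime
import time
import json

# Twitter API credentials: replace these placeholders with your own values.
consumer_key = "YOUR_CONSUMER_KEY"
consumer_secret = "YOUR_CONSUMER_SECRET"
access_token = "YOUR_ACCESS_TOKEN"
access_token_secret = "YOUR_ACCESS_TOKEN_SECRET"

MAX_TWEETS = 500000        # stop saving once this many tweets are stored
MAX_TWEETS_PER_SECOND = 5  # cap on the average save rate

out_file = open('GLOBALBIG.csv', 'a')
count = 0
time_start = time.time()
print("THIS IS START TIME", time_start)


class StdOutListener(StreamListener):

    def on_data(self, data):
        global count

        if count <= MAX_TWEETS:
            try:
                json_data = json.loads(data)
                coords = json_data["geo"]
                # Only tweets with an exact point location have a non-null "geo".
                if coords is not None:
                    since_start = time.time() - time_start
                    # Skip tweets while the average save rate is above the cap.
                    if count / since_start < MAX_TWEETS_PER_SECOND:
                        print(data)
                        ptime = strftime("%a, %d %b %Y %H:%M:%S +0000", gmtime())
                        # Note: the legacy "geo" field lists latitude first,
                        # then longitude.
                        lat = coords["coordinates"][0]
                        lon = coords["coordinates"][1]

                        out_file.write(str(ptime) + "&")
                        out_file.write(str(lat) + "&")
                        out_file.write(str(lon) + "\n")
                        count += 1
            except KeyError:
                # Stream messages that are not tweets (e.g. delete notices)
                # have no "geo" key.
                print("Fail")
        return True

    # def on_error(self, status):
    #     print(status)


if __name__ == '__main__':
    l = StdOutListener()
    auth = OAuthHandler(consumer_key, consumer_secret)
    auth.set_access_token(access_token, access_token_secret)
    # Reconnect whenever the stream drops, e.g. on a network error.
    while True:
        try:
            stream = Stream(auth, l)
            # Assumption: a whole-world bounding box is used here to request
            # geotagged tweets; swap in whichever filter call you need.
            stream.filter(locations=[-180, -90, 180, 90])
        except Exception:
            continue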

ianbrod’s post here helped form the basis of my code and Eugene Yan’s post here helped fix network connection issues. I also had some great help from friends who are much more experienced and clever than I am, who I won’t embarrass by naming but will say thanks to here.

The map below represents the 341,029 tweets gathered between 0600 GMT 08 April and 0600 GMT 09 April, 2016.

[Map of the gathered tweets]

The map tells you a fair bit at a glance. For me, the biggest surprises were Indonesia (particularly Java) and Turkey.

I have given some summary data below for the ten countries with the highest number of tweets gathered.

Country          No. of tweets gathered   Tweets/km² x 100   Tweets/capita x 10,000
United States    65709                    0.67               2.06
Indonesia        34460                    1.80               1.38
Japan            25415                    6.72               2.00
Brazil           22547                    0.26               1.13
Malaysia         17030                    5.16               5.73
Turkey           17021                    2.17               2.27
Philippines      10744                    3.58               1.09
Thailand         10645                    2.07               1.59
Argentina        9176                     0.33               2.21
United Kingdom   9153                     3.77               1.43
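
The two density columns are simple ratios scaled up to readable numbers. As a rough check, the sketch below recomputes the United States row. The area and population figures are approximate assumptions, not values from the gathered data, and `density_stats` is a name invented for this example.

```python
def density_stats(tweets, area_km2, population):
    """Return (tweets per km² x 100, tweets per capita x 10,000)."""
    per_km2_x100 = tweets / float(area_km2) * 100
    per_capita_x10000 = tweets / float(population) * 10000
    return per_km2_x100, per_capita_x10000


# Approximate, assumed figures for the United States (not from the tweet data).
US_AREA_KM2 = 9.8e6
US_POPULATION = 320e6

per_km2, per_capita = density_stats(65709, US_AREA_KM2, US_POPULATION)
print(round(per_km2, 2), round(per_capita, 2))
```

With these rounded inputs the results land close to the table values; small differences come down to which area and population figures are used.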

This data provides a valuable foundation and will form the basis of my site selection for analysing small-scale spatial patterns in tweets in the future.

I also realised that the data could be used to display temporal patterns. I took this as a chance to try out a different visual style. The QGIS heatmap rendering style provides a simpler and clearer representation of the areas of concentration. The resulting animation displays the concentrated areas of tweets with coordinates and a drop-off in Twitter traffic overnight.
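
One way to get at the temporal pattern is to bin the saved rows by hour before animating them. The sketch below assumes rows in the `time&value&value` layout written by the collection script; `tweets_per_hour` and the sample rows are hypothetical and only illustrate the idea.

```python
from collections import Counter
from datetime import datetime

TIME_FORMAT = "%a, %d %b %Y %H:%M:%S +0000"


def tweets_per_hour(lines):
    """Count rows per GMT hour from '&'-delimited lines."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        stamp = line.split("&")[0]
        hour = datetime.strptime(stamp, TIME_FORMAT).hour
        counts[hour] += 1
    return counts


# Made-up sample rows, not real gathered data.
sample = [
    "Fri, 08 Apr 2016 06:00:01 +0000&-6.2&106.8",
    "Fri, 08 Apr 2016 06:30:12 +0000&35.7&139.7",
    "Fri, 08 Apr 2016 07:05:40 +0000&41.0&29.0",
]
print(sorted(tweets_per_hour(sample).items()))
```

Each hourly bin could then be exported as its own layer or frame for the animation.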


Again, thanks to for the world map shapefile.




Create Day Night Shapefile in QGIS

Creating a shapefile that displays areas of night and day on the globe at a given time and date was something I had trouble finding help for.

I suspect the Worldwide Night plugin might do exactly this, but it requires the Matplotlib Basemap Toolkit Python library and unfortunately there is no Mac version (that I could find).

But I came up with a workaround.

In my last post I referred to this excellent post by Hamish Campbell:

After you follow the initial steps in Section 2, Fixing artifacts, you will generate a shapefile. This shapefile covers the half of the globe centred on the latitude and longitude you feed into the clipper command.

So if you know the position of the Sun relative to the Earth, you can plug the opposite latitude and longitude into your clipper commands to generate a shapefile that covers the half of the Earth opposite to the Sun’s position, i.e. the half of the Earth in darkness. (Make sure you simplify your rendering to view it.)
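
Finding the opposite point is a small calculation: flip the sign of the latitude and shift the longitude by 180 degrees. A minimal sketch, with `antipode` as a made-up helper name and the Sun position in the example chosen arbitrarily:

```python
def antipode(lat, lon):
    """Return the point on the opposite side of the globe.

    lat and lon are in decimal degrees; the returned longitude is kept
    within -180..180.
    """
    opposite_lon = lon + 180.0 if lon <= 0 else lon - 180.0
    return -lat, opposite_lon


# Arbitrary example: Sun directly overhead at 7.5N, 120E.
print(antipode(7.5, 120.0))
```

The returned pair is what would be fed into the clipper command as the centre of the night-side shapefile.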

Gray circle is the output shapefile of the Clipper script displayed on an Azimuth Orthographic Projection (the strange bands are caused by the projection and aren’t a worry for what I’m doing in this post as far as I can tell)

Then if you change your project projection to one such as WGS 84 you will see your shapefile covering (roughly) the areas of the Earth where it is night at your chosen position of the Sun.

The same shapefile displayed on WGS 84 Projection

The shapefile still displayed some odd behaviours when changing map scale and position etc.

Selecting the polygon, saving it as a new shapefile, and then thickening the band along the south by adding vertices and moving them south seemed to fix this for me. How appropriate these last steps are will obviously depend on what you are using this for and how you plan on displaying it.

Obviously this doesn’t account for areas of dawn and dusk and is only a rough representation, but as a quick and relatively easy fix it gives you a decent enough display of areas of day and night on the Earth.

Again, credit is also due to for the world map shapefile.

JFK and Heathrow Great Circle Animations

I’m using QGIS for the first time ever as a cost effective and OS X friendly way to get back into GIS.

I happened upon this video tutorial by Steven Bernard, which then led me to Alasdair Rae’s and Hamish Campbell’s posts on the same topic.

One thing I noticed was that the flight paths represented using the methods above did not follow great circles, i.e. the shortest paths between two points on the surface of the Earth.

So I decided to have a stab at animating a representation of the great circle flight paths from JFK in New York and Heathrow in London.
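
For reference, points along a great circle can also be computed directly with spherical trigonometry. This is not the method used in this post, which relies on a custom projection in QGIS. The sketch assumes a spherical Earth and endpoints that are not exactly opposite each other on the globe; `great_circle_points` and the approximate airport coordinates are illustrative only.

```python
import math


def great_circle_points(lat1, lon1, lat2, lon2, n_segments=20):
    """Return (lat, lon) points along the great circle between two places."""
    p1, l1, p2, l2 = map(math.radians, (lat1, lon1, lat2, lon2))
    # Angular distance between the endpoints (haversine formula).
    a = (math.sin((p2 - p1) / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin((l2 - l1) / 2) ** 2)
    d = 2 * math.asin(min(1.0, math.sqrt(a)))
    if d == 0:
        return [(lat1, lon1)]
    points = []
    for i in range(n_segments + 1):
        f = i / float(n_segments)
        # Weights that interpolate along the arc rather than the chord.
        w1 = math.sin((1 - f) * d) / math.sin(d)
        w2 = math.sin(f * d) / math.sin(d)
        x = w1 * math.cos(p1) * math.cos(l1) + w2 * math.cos(p2) * math.cos(l2)
        y = w1 * math.cos(p1) * math.sin(l1) + w2 * math.cos(p2) * math.sin(l2)
        z = w1 * math.sin(p1) + w2 * math.sin(p2)
        points.append((math.degrees(math.atan2(z, math.hypot(x, y))),
                       math.degrees(math.atan2(y, x))))
    return points


# Approximate airport coordinates, used here only for illustration.
JFK = (40.64, -73.78)
LHR = (51.47, -0.45)
path = great_circle_points(JFK[0], JFK[1], LHR[0], LHR[1], n_segments=10)
```

The middle of the path sits further north than either airport, which is the bulge you see when a great circle route is drawn on a flat map.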

The first step was to create a shapefile with the lines that represent the great circle flight paths. This involved creating a custom aeqd projection centred on the latitude and longitude of the origin airport. Once the locations of the origin and destination airports were imported (using the OpenFlights database), it was a simple matter of joining the dots. This post by AndreJ was very helpful in working out this stage.
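
In an azimuthal equidistant (aeqd) projection, straight lines through the centre point are great circles, which is why the projection is centred on the origin airport. A custom CRS in QGIS can be defined with a PROJ string. The exact definition used for these maps is not reproduced here, so the parameters below are an assumption, and `aeqd_proj_string` is a hypothetical helper.

```python
def aeqd_proj_string(lat, lon):
    """Build a PROJ definition for an azimuthal equidistant projection
    centred on the given latitude and longitude (decimal degrees)."""
    return ("+proj=aeqd +lat_0={lat} +lon_0={lon} +x_0=0 +y_0=0 "
            "+datum=WGS84 +units=m +no_defs").format(lat=lat, lon=lon)


# Approximate Heathrow coordinates, for illustration only.
print(aeqd_proj_string(51.47, -0.45))
```

One such string per origin airport would be needed, since each set of lines has to be drawn in a projection centred on its own origin.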

Creating the flight path lines from Heathrow

Creating the flight paths from JFK

Then, following the steps outlined in Hamish Campbell’s blog, I created the azimuth orthographic projection that the animation would be plotted on. I centred this projection on a spot roughly halfway between JFK and Heathrow.

Azimuth orthographic projection of the earth centred on 45N 37W

Then the final step was to import the flight path lines onto this new projection and animate them using the methods outlined by Steven Bernard and Alasdair Rae.

Final compositing was done in After Effects, where I added some ease-in to the animation to give the motion some pop, and a drop shadow to the flight paths to give the illusion of depth, as if the flight paths sit above the surface of the Earth. The end result is below.


Representation of flights from JFK, New York and Heathrow, London

I’m pretty happy with how it turned out, though there are things I’d like to improve. Automating the creation of the lines from origin to destination is something I’m sure could be done to speed up the work. Also, because the flight paths were generated differently from the methods in the blogs mentioned above, the flights can only be viewed as lines, not as animated points. Addressing these issues, as well as doing a version with flights from Sydney, Australia, where I am based, is something I’d like to do in the future. (Quick tests I did of Sydney flights caused headaches related to flights crossing the International Date Line.)
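
One possible way around the date line headaches, untested in QGIS and not something done for this post, is to break each path wherever consecutive longitudes jump by more than 180 degrees, so that no segment gets drawn the long way around the map. In the sketch below, `split_at_antimeridian` and the sample points are hypothetical, and the exact crossing point is not interpolated.

```python
def split_at_antimeridian(points):
    """Split a list of (lat, lon) points wherever the line jumps across
    the +/-180 degree meridian, returning a list of separate segments."""
    segments = [[]]
    previous_lon = None
    for lat, lon in points:
        if previous_lon is not None and abs(lon - previous_lon) > 180:
            # The jump means the path crossed the date line: start a new segment.
            segments.append([])
        segments[-1].append((lat, lon))
        previous_lon = lon
    return segments


# Made-up path heading east from near Sydney across the Pacific.
sample_path = [(-33.9, 151.2), (-20.0, 170.0), (-10.0, 179.0),
               (0.0, -175.0), (10.0, -160.0)]
print(split_at_antimeridian(sample_path))
```

Each returned segment could then be written out as its own line feature, leaving a small gap at the date line.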


Credit is also due to for the world map shapefile I used.