Monday, April 25, 2011

Visualizing Global Sales Data With Google Earth

This post was written by Charles Ma, one of our software engineers based in Sydney. The post is also cross-posted to his personal blog which you can check out here.
As an early employee of the design-your-own-shoe company Shoes of Prey, I'm intrigued by how quickly the company is growing. It seems like every time I walk into the office there is a new employee in Sydney or in one of its partner offices around the world. As the business spreads into new markets across the globe, we start to ask questions like "how quickly are we growing in Japan?", "how successful was our Valentine's Day campaign?" and "how can we replicate our Australian success in Russia?". We can take information like where and when we're making a sale and apply basic statistics and number crunching-fu to try and answer some of those questions, but that's no fun. What if we could see our growth with our own eyes?

Google Earth is an excellent tool for visualizing geo-location data, and that's exactly what we chose to use. In a few hours, I was able to create a visualisation of shoe sales around the world over time. Here is a video of Australian sales:


Seeing the growth of Shoes of Prey over time is exciting, and it only takes the right tools and a bit of programming background to put something like this together. This post gets technical from here, so skip to the end if that's not your cup of tea.

Displaying time sequence animations in Google Earth is easy; it's mostly a matter of getting your data in the right format. Google Earth uses a format called KML, for which you can find documentation here. I'll summarize what it took to put the animation together.
You can place a "placemark" on the map with the following tag:


<?xml version="1.0" encoding="UTF-8"?>
<kml xmlns="http://www.opengis.net/kml/2.2">
  <Placemark>
    <Point>
      <coordinates>-122.0822035425683,37.42228990140251,0</coordinates>
    </Point>
  </Placemark>
</kml>
Furthermore, you can add a custom icon as well as a "TimeSpan" to specify when to display that "placemark". This is what we use to display those red shoe icons on the map.

<?xml version="1.0" encoding="UTF-8"?>
<kml xmlns="http://www.opengis.net/kml/2.2">
  <Document>
    <Style id="iconStyle">
      <IconStyle>
        <scale>1.0</scale>
        <Icon><href>shoe.gif</href></Icon>
      </IconStyle>
    </Style>
    <Placemark>
      <TimeSpan>
        <begin>2009-10-09</begin>
        <end>2009-10-16</end>
      </TimeSpan>
      <styleUrl>#iconStyle</styleUrl>
      <Point>
        <coordinates>-122.0822035425683,37.42228990140251,0</coordinates>
      </Point>
    </Placemark>
  </Document>
</kml>
The KML above will display a single shoe icon at longitude -122.0822035425683, latitude 37.42228990140251. Note that KML lists coordinates as longitude, then latitude, then altitude. You can add more points by adding more <Placemark> sections.
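
Writing one <Placemark> per sale by hand gets tedious quickly, so here is a minimal sketch of how the file above could be generated with Python's standard library. The function name `build_sales_kml` and the `(begin, end, lat, lng)` layout of each sale are assumptions made up for this illustration, not the code we actually ran.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"


def _child(parent, name, text=None):
    """Append a child element and optionally set its text."""
    element = ET.SubElement(parent, name)
    if text is not None:
        element.text = text
    return element


def build_sales_kml(sales, icon_href="shoe.gif"):
    """Return a KML string with one time-limited placemark per sale.

    `sales` is an iterable of (begin, end, lat, lng) tuples, where the
    dates are strings such as "2009-10-09" (a hypothetical layout).
    """
    kml = ET.Element("kml", xmlns=KML_NS)
    doc = _child(kml, "Document")

    # One shared style, so every placemark uses the same icon.
    style = ET.SubElement(doc, "Style", id="iconStyle")
    icon_style = _child(style, "IconStyle")
    _child(icon_style, "scale", "1.0")
    _child(_child(icon_style, "Icon"), "href", icon_href)

    for begin, end, lat, lng in sales:
        placemark = _child(doc, "Placemark")
        span = _child(placemark, "TimeSpan")
        _child(span, "begin", begin)
        _child(span, "end", end)
        _child(placemark, "styleUrl", "#iconStyle")
        point = _child(placemark, "Point")
        # KML wants longitude first, then latitude, then altitude.
        _child(point, "coordinates", f"{lng},{lat},0")

    body = ET.tostring(kml, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body
```

Write the returned string to a file such as sales.kml using UTF-8 encoding, keep shoe.gif next to it, and open the file in Google Earth.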

Creating and animating heat maps is more difficult and requires some programming. Python is one of the main programming languages we use at Shoes of Prey, and luckily I found a Python library that not only generates heat maps but also generates the KML needed to overlay the heat map on Google Earth. The heat map library by itself had a few limitations for what we wanted to do: it was too slow, and it could only generate a single static heat map and KML file. However, the great thing about open source software with modification-friendly licenses is that you can change it if it doesn't do exactly what you want. I modified the library to make it faster and to generate a time sequence of heat maps like the one you see in the video. You can find the source code of the modified version here. All you have to do is give it the data in the right format, and the library does the hard work for you.

import heatmap
# Code to format your data goes here #
hm = heatmap.Heatmap()
hm.animated_heatmapKML(pointsets, outfile_name)

The first argument to the animated_heatmapKML function is an array of 3-tuples in the following format:

pointsets = [(start_date, end_date, [lat, lng]), (start_date, end_date, [lat, lng]), ...]


The second argument is the name of the output file, e.g. "heatmap.kml". There are other optional parameters; see the documentation for more details. The call should generate a "heatmap.kml" file as well as the associated heatmap images heatmap.kml1.png, heatmap.kml2.png, ... to go with it. Open the file in Google Earth and slide the timeline to see the heatmap change over time.
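
As a rough sketch of the "format your data" step, the pointsets list could be assembled from raw sale records along these lines. The `(order_date, lat, lng)` record layout, the `build_pointsets` name and the seven-day display window are all assumptions for illustration. This post doesn't say whether the library expects date strings or date objects, so check its documentation before relying on the ISO strings used here.

```python
from datetime import date, timedelta


def build_pointsets(sales, window_days=7):
    """Turn (order_date, lat, lng) records into heatmap pointsets.

    Each sale stays visible from its order date until `window_days`
    later. Dates are emitted as ISO strings, which is an assumption
    about what the heatmap library accepts.
    """
    pointsets = []
    # Sort by order date so the list follows the timeline.
    for order_date, lat, lng in sorted(sales, key=lambda sale: sale[0]):
        end_date = order_date + timedelta(days=window_days)
        pointsets.append(
            (order_date.isoformat(), end_date.isoformat(), [lat, lng])
        )
    return pointsets


# Example with two made-up sales (Sydney and Melbourne).
example_sales = [
    (date(2009, 10, 12), -37.81, 144.96),
    (date(2009, 10, 9), -33.87, 151.21),
]
pointsets = build_pointsets(example_sales)
```

The resulting list can then be passed as the first argument to animated_heatmapKML.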

What do you do with your business' data and how do you use it to improve your business?

5 comments:

  1. Thanks Michael/Charles, after reading this I endeavoured to build in an option to create a kml file export into our reporting engine. Surprisingly it only took a few hours to implement but I hope they can add some more features that would make it 'enterprise ready' i.e. being able to add a legend via the xml

  2. Sorry, but how is this actually useful, or more useful than the raw data? What does a pin mean? A place or a number of sales? How long is the timeframe? Does the 'heat' dissipate or just keep getting added to? What do the colours represent? I just don't see the use in this beyond looking nice.

  3. Hi Anonymous, a couple of things were interesting about this for us, but unfortunately much of that we couldn't show in the video without potentially breaching the privacy of our customers, because each pin is a sale mapped to the delivery address. We've run the same heatmap across the world and the data gets particularly interesting when viewing it across a lot of cities rather than just in Australia.

    The main learning for us was something we'd always suspected but had never drilled down to confirm: we could see clusters of orders happening within months of each other in particular cities and countries, and the heatmap is a good way to visually see the orders building up in a particular city, better than just showing lots of pins. We would start with no orders in a city, then get a small number of 5-10 (often we could track those first orders back to press), then get a big flurry of orders a month later, which we put down to those initial customers receiving their shoes and telling their friends about our service.

    As you suggest, it's possible to measure that from the raw data, but it's not that easy. It only took Charles a couple of hours to put this together and it would potentially have taken longer (not to mention being a lot less fun) to review the data in a spreadsheet compared with doing it on Google Earth.

  4. Clearly you need to target sales out in the Gibson Desert. You also seem to be falling behind in the Pacific Ocean demographic!

    Your extra info does make sense Michael - have you taken away any "actions" or just "positive re-enforcement of existing understandings" from the data/maps?

  5. Haha, we have big plans for the Gibson Desert market, stay tuned...! ;)

    The positive re-enforcement of what we already suspected has led us to review our packaging and what we include with our shoes to see if we can encourage that word of mouth spread even further. We should be actioning that over the coming month!
