eurica

numbers for people.

Distance Matrix API (failed data project)

I’m going to try to document more of my side projects, even when they fail. Originally, when I found the Distance Matrix API, I planned to cover San Francisco in a grid, and compare neighbourhoods in terms of the ratio of average bike times to driving times to figure out which were the most bike-friendly.

The API limits you to 100 datapoints per query (and 100 datapoints every 10 seconds) and 2500 a day, so an 8×8 grid seemed about all I could reasonably load. That’s 64 starts × 63 destinations × 2 methods = 8064 datapoints (trips aren’t necessarily symmetric). I wrote a simple script to download a distance matrix from each of the 64 points (for each method), one query every 11 seconds, and store the results in localStorage. Whenever it went over the API limit, it waited an hour or two before trying again.
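To make that arithmetic concrete, here is a rough Python sketch of the grid and the quota math. It is not the script I ran (that one lived in the browser); the bounding box and the names `make_grid` and `plan` are assumptions for illustration, and only the limits come from the numbers above.

```python
import math

# Limits quoted in the post.
MAX_ELEMENTS_PER_QUERY = 100
MAX_ELEMENTS_PER_DAY = 2500
SECONDS_BETWEEN_QUERIES = 11

# Assumption: a rough bounding box around San Francisco, not the exact one used.
SOUTH, NORTH = 37.71, 37.81
WEST, EAST = -122.51, -122.39


def make_grid(n):
    """Return n*n (lat, lng) points spread evenly over the bounding box."""
    lat_step = (NORTH - SOUTH) / (n - 1)
    lng_step = (EAST - WEST) / (n - 1)
    return [
        (round(SOUTH + i * lat_step, 3), round(WEST + j * lng_step, 3))
        for i in range(n)
        for j in range(n)
    ]


def plan(n, modes=2):
    """Work out how many queries and days an n x n grid needs."""
    points = n * n
    elements_per_query = points - 1          # one origin to every other point
    queries = points * modes                 # one query per origin per mode
    queries_per_day = MAX_ELEMENTS_PER_DAY // elements_per_query
    return {
        "elements_per_query": elements_per_query,
        "fits_in_one_query": elements_per_query <= MAX_ELEMENTS_PER_QUERY,
        "queries": queries,
        "total_elements": queries * elements_per_query,
        "days_needed": math.ceil(queries / queries_per_day),
        "min_runtime_seconds": queries * SECONDS_BETWEEN_QUERIES,
    }


grid = make_grid(8)
summary = plan(8)
```

For the 8×8 case this gives 63 datapoints per query, 128 queries and 8064 datapoints in total. At 11 seconds per query the downloads themselves would take under half an hour, so the daily quota is the real bottleneck: at 2500 datapoints a day the whole grid needs about four days.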

Here are the results as JSON: sf_distances.txt (one datapoint is missing)

And here’s a summary spreadsheet: googlemapstimes.xlsx

It didn’t turn out to be an interesting dataset, but here are some conclusions:

  • On average, it takes 2.34 times as long to bike somewhere as to drive, but these seem to be driving times without traffic, which never happens in practice. The ratio varies from 1.9 to 3.25.
  • On average, the bike route is about 4% shorter than the car route. That varies from 14% shorter to 8% longer.
  • On a bike, the fastest starting point is (37.739,-122.451) on Twin Peaks, and by car it’s (37.715,-122.451) on the highway.
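For anyone who wants to redo the comparison, here is a minimal sketch of the bike-to-drive ratio per starting point. The data layout is hypothetical (it is not the format of sf_distances.txt), the sample numbers are made up purely to show the shape of the calculation, and averaging per-trip ratios is only one of two reasonable readings; dividing average bike time by average drive time would give slightly different numbers.

```python
from statistics import mean

# Hypothetical layout: seconds[mode][(origin, destination)] = travel time.
# These numbers are invented for illustration, not taken from the dataset.
sample = {
    "bicycling": {
        ("A", "B"): 1200, ("A", "C"): 1800,
        ("B", "A"): 1500, ("B", "C"): 900,
    },
    "driving": {
        ("A", "B"): 600, ("A", "C"): 600,
        ("B", "A"): 500, ("B", "C"): 600,
    },
}


def bike_to_drive_ratios(seconds):
    """Average bike/drive time ratio per origin, skipping missing pairs."""
    per_origin = {}
    for (origin, dest), bike in seconds["bicycling"].items():
        drive = seconds["driving"].get((origin, dest))
        if drive:  # skip pairs with no driving time (e.g. the missing datapoint)
            per_origin.setdefault(origin, []).append(bike / drive)
    return {origin: mean(ratios) for origin, ratios in per_origin.items()}


ratios = bike_to_drive_ratios(sample)
fastest_for_bikes = min(ratios, key=ratios.get)  # lowest ratio = most bike-friendly
```

A lower ratio means biking is closer to driving from that starting point, which is the measure of bike-friendliness I had in mind.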

That’s where I gave up, because an evenly distributed grid doesn’t make much sense: some points are in parks, others are on highways, and traffic times aren’t taken into account.

What I’d like to do:

  • Pick some representative points, one in every neighbourhood
  • Use the maps API to calculate the time to a few common destinations
  • Compare walking, biking, driving without traffic, driving with traffic and public transit times
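As a starting point for that plan, here is a sketch that builds one request URL per travel variant. It assumes the web-service version of the Distance Matrix API (the JSON endpoint with `origins`, `destinations`, `mode`, `departure_time` and `key` parameters) rather than the in-browser one, and it assumes that passing `departure_time` with driving is how you get traffic-aware times. The destination coordinates, the labels and `YOUR_API_KEY` are placeholders. Nothing here actually calls the API.

```python
from urllib.parse import urlencode

BASE_URL = "https://maps.googleapis.com/maps/api/distancematrix/json"

# (label, mode, extra parameters) for each variant to compare.
# Assumption: departure_time is what switches on traffic-aware driving times.
VARIANTS = [
    ("walking", "walking", {}),
    ("biking", "bicycling", {}),
    ("driving, no traffic", "driving", {}),
    ("driving, with traffic", "driving", {"departure_time": "now"}),
    ("public transit", "transit", {}),
]


def format_points(points):
    """Join (lat, lng) pairs into the pipe-separated form the API expects."""
    return "|".join(f"{lat},{lng}" for lat, lng in points)


def build_urls(origins, destinations, api_key):
    """Return {label: url}, one request URL per travel variant."""
    urls = {}
    for label, mode, extra in VARIANTS:
        params = {
            "origins": format_points(origins),
            "destinations": format_points(destinations),
            "mode": mode,
            "key": api_key,
            **extra,
        }
        urls[label] = f"{BASE_URL}?{urlencode(params)}"
    return urls


# One starting point from the post and one placeholder destination.
urls = build_urls(
    origins=[(37.739, -122.451)],
    destinations=[(37.7793, -122.4193)],  # placeholder, roughly downtown
    api_key="YOUR_API_KEY",
)
```

With one representative point per neighbourhood and a handful of destinations, each variant stays well under the 100-datapoint limit per query.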
