My new debugging technique is unstoppable.

Exploring Mars 1

Grabbing the images

Idea 1

I'm going to use the pictures of Mars from the United States Geological Survey, which has an astrogeology research program. The data is black and white, but it comes in a variety of resolutions. I think I'm going to use the JPGs because the viewer for the raw data only works in *nix and my Linux box is in pieces.

Wget

I love Wget. It's a bulk command-line downloader with a plethora of options. I'm going to point it at an HTML page on my localhost and ask it to download everything I've linked in there.

What to get

I downloaded the page with the links to all of the files, and ran a few regular expressions on it to produce a list of pictures. Now:

  wget -r -H http://localhost/marspics.html

will get everything recursively, no matter what server it's on.
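
The regular-expression step can be sketched in JavaScript, the language used for the script later on. This is only an illustration: the function names, the sample HTML and the example.com addresses are made up, and the pattern assumes plain double-quoted href attributes that end in .jpg.

```javascript
// Collect every double-quoted href that ends in .jpg or .jpeg (any case).
function extractJpgLinks(html) {
  var pattern = /href="([^"]+\.jpe?g)"/gi;
  var links = [];
  var match;
  while ((match = pattern.exec(html)) !== null) {
    links.push(match[1]);
  }
  return links;
}

// Build a bare HTML page of links for wget to crawl.
function buildLinkPage(urls) {
  var lines = urls.map(function (url) {
    return '<a href="' + url + '">' + url + '</a>';
  });
  return '<html><body>\n' + lines.join('\n') + '\n</body></html>';
}

// Made-up sample input, standing in for the downloaded index page.
var sample = '<a href="http://example.com/a.jpg">a</a> ' +
             '<a href="notes.txt">n</a> ' +
             '<a href="http://example.com/b.JPG">b</a>';
var found = extractJpgLinks(sample);
console.log(buildLinkPage(found));
```

Saving the output as marspics.html on the local web server gives wget a page to start from.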

While that's chugging away, I'll find some low resolution colour pictures to merge with my high resolution grayscale images.

I found one at: [http://www.vendian.org/mncharity/dir3/planet_globes/], and the author was kind enough to mention how he got it. So now I have a better idea.

Idea 2

This time, I'm going to use PDS Map-A-Planet to do all of the hard work (I'm beginning to love the USGS).

The maximum resolution available is 64 pixels/degree, so the full map is 23040 pixels wide and 11520 pixels tall. That's 1/16th of the resolution that you can get of the Earth, but it's probably good enough. Best of all, I can ask for it in whatever size increments I want, and the server will merge the color with the grayscale for me.
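
The size arithmetic is easy to check in a couple of lines. This sketch assumes the map is a simple rectangular projection covering 360 degrees of longitude by 180 degrees of latitude, which is what the numbers above imply; mapSize is a name I made up.

```javascript
// Full-planet map size in pixels for a given resolution,
// assuming 360 degrees of longitude by 180 degrees of latitude.
function mapSize(pixelsPerDegree) {
  return {
    width: 360 * pixelsPerDegree,
    height: 180 * pixelsPerDegree
  };
}

var full = mapSize(64);
console.log(full.width + " x " + full.height); // 23040 x 11520
```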

So first, I'll look at a merged picture:

  http://pdsmaps.wr.usgs.gov/PDS/public/explorer/map/1134486402.jpg

I was hoping the URL would be easier to understand. I'm going to look at a few more:

    http://pdsmaps.wr.usgs.gov/PDS/public/explorer/map/1134487018.jpg
    http://pdsmaps.wr.usgs.gov/PDS/public/explorer/map/1134487081.jpg
    http://pdsmaps.wr.usgs.gov/PDS/public/explorer/map/1134487147.jpg

Wait... it's a timestamp! Ok, they win this round. So I'm going to need to write a script to
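
A quick way to test the timestamp guess is to decode one of the file names. This sketch assumes the number is Unix time in whole seconds (UTC); the function names are mine.

```javascript
// Pull the numeric file name out of a map image URL.
function stampFromUrl(url) {
  var match = /\/(\d+)\.jpg$/.exec(url);
  return match ? parseInt(match[1], 10) : null;
}

// Interpret the number as Unix seconds and format it as ISO 8601 (UTC).
function stampToIso(seconds) {
  return new Date(seconds * 1000).toISOString();
}

var stamp = stampFromUrl("http://pdsmaps.wr.usgs.gov/PDS/public/explorer/map/1134486402.jpg");
console.log(stampToIso(stamp)); // 2005-12-13T15:06:42.000Z
```

The other three URLs decode to times a few minutes later on the same day, which fits requests made one after another.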

  • Generate the URLs of the pages that I want to load
  • Parse that page to get the IMG url
  • Save the image
  • Give the image a more useful filename

Back of the envelope calculation: I want 400x400 pixel pictures, I think. That doesn't divide evenly, but I guess I can live with 320x320 (which does). That's 72*36=2592 pictures, of about 75k each, so 194 megs... ouch. I'd feel guilty if it wasn't part of the USGS's mandate to make this information available.
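
The same envelope calculation as a sketch, so other tile sizes can be tried. The 23040x11520 map size and the 75k-per-picture guess come from the text; tilePlan is a made-up name, and a "meg" is taken as 1000k.

```javascript
// Work out how many square tiles cover the map and roughly how much
// data that is. kbPerTile is only a guess at the average file size.
function tilePlan(mapWidth, mapHeight, tileSize, kbPerTile) {
  var cols = Math.ceil(mapWidth / tileSize);
  var rows = Math.ceil(mapHeight / tileSize);
  var count = cols * rows;
  return {
    divides: mapWidth % tileSize === 0 && mapHeight % tileSize === 0,
    cols: cols,
    rows: rows,
    count: count,
    megs: count * kbPerTile / 1000
  };
}

var plan400 = tilePlan(23040, 11520, 400, 75);
var plan320 = tilePlan(23040, 11520, 320, 75);
console.log(plan400.divides); // false
console.log(plan320.cols + " x " + plan320.rows + " = " + plan320.count); // 72 x 36 = 2592
console.log(Math.round(plan320.megs) + " megs"); // 194 megs
```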

The Script

So, I'm going to write a JavaScript to be parsed by Windows Scripting Host:

download.js
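
As a rough idea of the pure pieces such a script needs, here is a minimal sketch. It is not the contents of download.js: the function names are hypothetical, the tile numbering (column 0 at longitude 0, row 0 at the north pole, both counting up) is an assumption, and the IMG pattern only assumes the result page embeds a picture whose address looks like the map URLs above. Building the actual request URL and saving the file are left out because they depend on the Map-A-Planet form and on Windows Scripting Host objects.

```javascript
// Latitude/longitude box for one tile (assumed numbering, see above).
function tileBounds(col, row, degreesPerTile) {
  var west = col * degreesPerTile;
  var north = 90 - row * degreesPerTile;
  return {
    west: west,
    east: west + degreesPerTile,
    north: north,
    south: north - degreesPerTile
  };
}

// A more useful file name than a timestamp, e.g. mars5-2.jpg.
function tileFilename(col, row) {
  return "mars" + col + "-" + row + ".jpg";
}

// Find the timestamp-named map image in a result page, or null.
function findMapImageUrl(html) {
  var match = /<img[^>]*\ssrc="([^"]*\/explorer\/map\/\d+\.jpg)"/i.exec(html);
  return match ? match[1] : null;
}

// Made-up page fragment for illustration.
var page = '<p><img alt="map" src="http://pdsmaps.wr.usgs.gov/PDS/public/explorer/map/1134486402.jpg"></p>';
console.log(tileFilename(5, 2) + " <- " + findMapImageUrl(page));
```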

Looks Good

First I ran a sanity check with the resolution set to 45 degrees per picture:

Looks pretty good, except for "mars5-2.jpg" -- I'm guessing that someone else used the map page at the exact same millisecond as me. That's going to be a problem (and I'm probably going to cause some problems for other people too). Running "get(5,2)" again fixes the problem.

Running it

Next I ran the script for the higher resolution images. It took 6 hours (though you'll probably do better if you're in the right hemisphere). The files come to 180 megs, but I downloaded much more than that, since I needed to get 2500 pages to find the file names.

Errors

    6 failed: mars4-3.jpg, 36-31, 37-13, 38-25, 45-5, and 57-19

And 5 are the wrong picture (from looking at the sizes): 36-24, 51-20, 9-14, 42-12, 60-32

If this were a real program, we'd need better error handling, but since I'm only running it once or twice, that's not cost-effective. I'll just run it again with this list instead of the loops.
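
Swapping the loops for a list might look something like this sketch. parseTileList and retryTiles are hypothetical names, and the get passed in here is a stand-in that only records its arguments; in the real script it would be the get(column, row) function that downloads one picture.

```javascript
// Turn names like "mars4-3.jpg" or "36-31" into column/row pairs.
function parseTileList(names) {
  return names.map(function (name) {
    var match = /(\d+)-(\d+)/.exec(name);
    if (!match) {
      throw new Error("Not a tile name: " + name);
    }
    return { col: parseInt(match[1], 10), row: parseInt(match[2], 10) };
  });
}

// Call get(col, row) once per listed tile; returns how many were tried.
function retryTiles(names, get) {
  var tiles = parseTileList(names);
  tiles.forEach(function (tile) {
    get(tile.col, tile.row);
  });
  return tiles.length;
}

// Stand-in for the real downloader: just remember what was asked for.
var requested = [];
var tried = retryTiles(
  ["mars4-3.jpg", "36-31", "37-13", "38-25", "45-5", "57-19"],
  function (col, row) { requested.push(col + "," + row); }
);
console.log(tried + " tiles retried: " + requested.join(" "));
```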

3 minutes later, I have all the pictures. And I can stop harassing the good people at the USGS.

Post-processing

The resulting pictures are 20 megs, which isn't bad. But I'm going to run a quick batch process with IrfanView to compress them a little further.


