For some reason, two posts at The Blackboard, It’s “Fancy,” Sort of … (Shollenberger) and “To get what he wanted”: Upturned end points. (Lucia), seem to be having difficulty understanding the mechanics of another post at “Open Mind”, In the Classroom (Tamino). But there is nothing unusual or difficult about the methods Tamino used to create the charts which have generated so much smoke and apparent frustration at The Blackboard – resulting in an outbreak of mcintyretude: scorn, derision, insults, and the questioning of motives. Since this is mostly a quick walk-through of some code to clear the smoke, I will leave the generated charts to post at the end.
Yesterday I posted the simple unsupervised cluster analysis of USHCN station 051294, Canon City, CO. It neatly divided into two classes which seemed to make a good match with the visible appearance of man-made features -v- vegetated landscape with a few bare ground spots thrown in to the man-made side of the classification.
But it doesn’t always work so easily. Before I tried USHCN 051294, I did a test run on a station picked at random, USHCN 199316, which is West Medway, Massachusetts.
Dividing this station into two classifications, as in yesterday’s exercise, we see that many areas within the natural vegetation are marked the same as obviously man-made areas. Indeed, this image is divided almost equally into two classes: 51.37% and 48.63%.
So we take the clustering code we looked at yesterday, which divided the image into two classes, and divide it into 3 classes instead.
i.cluster group=51294_18 subgroup=51294_18 classes=3 sigfile=51294_18_sig.txt reportfile=51294_18_rpt.txt
i.maxlik group=51294_18 subgroup=51294_18 sigfile=51294_18_sig.txt class=51294_18c_class reject=51294_18c_reject
A quick look shows a result much closer to what we expect for a natural -v- man-made division. Most of the man-made stuff is in class 3. But it’s not perfect. The house in the lower left corner is mostly classed “natural” while the surrounding lawn is classed “man-made.” Note: the unsupervised classification is NOT dividing things by any predefined ‘natural’ or ‘man-made’ clusters – it’s just that the cluster analysis of the location of certain colors tends to lump MOST man-made landscapes into a single class – given the right number of classes.
Running the stats on the 3×3 smoothed version, we get these three classification coverages:
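For readers who want to see what “3×3 smoothed” and “classification coverages” mean in practice, here is a minimal pure-Python sketch. It is not the GRASS workflow used for this post. It assumes the classified image is available as a small grid of integer class labels, that smoothing means replacing each cell with the most common label in its 3×3 neighborhood, and that edge cells simply use whatever neighbors exist. The function names are made up for illustration.

```python
from collections import Counter


def mode_filter_3x3(grid):
    """Replace each cell with the most common class in its 3x3 window.

    Assumption: edge cells use a truncated window (only the neighbors
    that exist). Ties go to the label encountered first in the window.
    """
    rows, cols = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    for r in range(rows):
        for c in range(cols):
            window = [
                grid[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
            ]
            out[r][c] = Counter(window).most_common(1)[0][0]
    return out


def coverage_percent(grid):
    """Return {class label: percent of cells carrying that label}."""
    counts = Counter(value for row in grid for value in row)
    total = sum(counts.values())
    return {cls: 100.0 * n / total for cls, n in sorted(counts.items())}


if __name__ == "__main__":
    # Hypothetical 3-class label grid, standing in for a classified image.
    classified = [
        [1, 1, 2, 2],
        [1, 3, 2, 2],
        [1, 1, 2, 3],
        [1, 1, 3, 3],
    ]
    smoothed = mode_filter_3x3(classified)
    for cls, pct in coverage_percent(smoothed).items():
        print(f"class {cls}: {pct:.2f}%")
```

The smoothing step removes isolated single-pixel labels (a lone “man-made” pixel in the middle of a tree canopy, say) before the percentages are tallied, which is why the smoothed coverages can differ from the raw ones.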
Which seems about right.
A confounding factor in the above is the presence of dark shingles and shaded trees, which work against a clean separation of classes.
GRASS is the common name for Geographic Resources Analysis Support System. It has a huge GIS (Geographic Information Systems) toolkit and is under constant development. I’ve been meaning to take it out for a spin for some cluster analysis and surface classification. Better late than never.
A couple of months ago, I used RGoogleMaps to automate the download of some USHCN sites from Google Maps. We can use GRASS tools to classify the various pixel colors into an arbitrary set of classes using cluster analysis. I’ll use the same image, Canon City CO, that I used in the earlier post.
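To give a feel for what “classify the pixel colors into an arbitrary set of classes” means, here is a small pure-Python sketch using plain k-means on RGB values. This is a stand-in technique chosen for brevity, not the algorithm GRASS runs internally, and the function names, the deterministic seeding, and the sample pixel values are all assumptions made for illustration.

```python
def dist2(a, b):
    """Squared Euclidean distance between two RGB tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_colors(pixels, k, iterations=20):
    """Group RGB pixels into k classes with plain k-means.

    Assumptions: k <= len(pixels), and the starting centers are taken
    at evenly spaced positions in the pixel list so the result is
    repeatable. Returns (labels, centers).
    """
    if not 0 < k <= len(pixels):
        raise ValueError("k must be between 1 and the number of pixels")
    step = len(pixels) // k
    centers = [tuple(float(ch) for ch in pixels[i * step]) for i in range(k)]
    labels = [0] * len(pixels)
    for _ in range(iterations):
        # Assignment step: each pixel joins its nearest center.
        labels = [
            min(range(k), key=lambda c: dist2(p, centers[c])) for p in pixels
        ]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(pixels, labels) if lab == c]
            if members:
                centers[c] = tuple(
                    sum(channel) / len(members) for channel in zip(*members)
                )
    return labels, centers


if __name__ == "__main__":
    # Hypothetical pixels: three greenish (vegetation), three grey (pavement).
    sample = [
        (30, 120, 40), (35, 110, 45), (28, 125, 38),
        (200, 200, 200), (190, 195, 198), (210, 205, 202),
    ]
    labels, centers = kmeans_colors(sample, k=2)
    print(labels)
```

Note that nothing in the procedure knows about “vegetation” or “pavement.” The classes fall out of the color values alone, and the number of classes is a choice made up front.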