Yesterday I posted the simple unsupervised cluster analysis of USHCN station 051294, Canon City, CO. It divided neatly into two classes that seemed a good match for the visible split of man-made features -v- vegetated landscape, with a few bare-ground spots thrown in on the man-made side of the classification.
But it doesn’t always work so easily. Before I tried USHCN 051294, I did a test run on a station picked at random, USHCN 199316, which is West Medway, Massachusetts.
Dividing this station into two classes, as in yesterday's exercise, we see that many areas within the natural vegetation are marked the same as obviously man-made areas. Indeed, the image is divided almost equally between the two classes: 51.37% and 48.63%.
So we take the clustering code we looked at yesterday, which divided the image into two classes, and divide it into 3 classes instead.
i.cluster group=51294_18 subgroup=51294_18 classes=3 sigfile=51294_18_sig.txt reportfile=51294_18_rpt.txt
i.maxlik group=51294_18 subgroup=51294_18 sigfile=51294_18_sig.txt class=51294_18c_class reject=51294_18c_reject
Taking a quick look, we see that it looks much more like what we expect for a natural -v- man-made division. Most of the man-made stuff is in class 3. But it’s not perfect. The house in the lower left corner is mostly classed “natural” while the surrounding lawn is classed “man-made.” Note: the unsupervised classification is NOT dividing things by any predefined ‘natural’ or ‘man-made’ clusters. It’s just that clustering on color tends to lump MOST man-made landscapes into a single class, given the right number of classes.
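To make the point about the number of classes concrete, here is a minimal sketch in Python. It is not what i.cluster does internally: plain k-means is swapped in as a simpler stand-in, and the pixel values and the function name `kmeans_pixels` are invented for illustration only.

```python
from math import dist

def kmeans_pixels(pixels, k, iterations=20):
    """Cluster RGB tuples into k classes. Returns (labels, centers)."""
    if len(pixels) < k:
        raise ValueError("need at least k pixels")
    # Deterministic start: take k pixels evenly spaced through the list.
    step = len(pixels) // k
    centers = [pixels[i * step] for i in range(k)]
    labels = [0] * len(pixels)
    for _ in range(iterations):
        # Assign each pixel to the nearest center in color space.
        labels = [min(range(k), key=lambda c: dist(p, centers[c]))
                  for p in pixels]
        # Move each center to the mean color of its members.
        for c in range(k):
            members = [p for p, lab in zip(pixels, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(ch) / len(members)
                                   for ch in zip(*members))
    return labels, centers

# Hypothetical toy pixels, not taken from the station imagery.
pixels = [
    (40, 120, 40), (50, 130, 45), (45, 110, 50),        # green vegetation
    (180, 180, 175), (170, 175, 170), (190, 185, 180),  # light pavement
    (30, 35, 30), (25, 30, 28), (35, 40, 33),           # dark shingles, shade
]

labels2, _ = kmeans_pixels(pixels, 2)
labels3, _ = kmeans_pixels(pixels, 3)
print(labels2)
print(labels3)
```

With two classes the dark pixels in this toy set land in the same class as the vegetation, and only a third class pulls them apart. The clustering only ever sees color; the "natural" and "man-made" labels are ours.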
Running the stats on the 3×3 smoothed version, we get these three classification coverages:
Which seems about right.
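The smoothing and coverage step can be sketched in a few lines of Python. This assumes the 3×3 smoothing is a majority (mode) filter, the kind of thing GRASS's r.neighbors does with method=mode, though that is my assumption about the step rather than something shown here. The grid and the function names `mode_filter_3x3` and `coverage_percent` are hypothetical.

```python
from collections import Counter

def mode_filter_3x3(grid):
    """Replace each cell with the most common class in its 3x3 window."""
    rows, cols = len(grid), len(grid[0])
    out = []
    for r in range(rows):
        row = []
        for c in range(cols):
            # Windows are clipped at the edges of the grid.
            window = [grid[rr][cc]
                      for rr in range(max(0, r - 1), min(rows, r + 2))
                      for cc in range(max(0, c - 1), min(cols, c + 2))]
            counts = Counter(window)
            best = max(counts.values())
            winners = sorted(cls for cls, n in counts.items() if n == best)
            # On a tie, keep the original class if it is among the winners,
            # otherwise fall back to the lowest class number.
            row.append(grid[r][c] if grid[r][c] in winners else winners[0])
        out.append(row)
    return out

def coverage_percent(grid):
    """Percentage of cells in each class, keyed by class number."""
    counts = Counter(cell for row in grid for cell in row)
    total = sum(counts.values())
    return {cls: 100.0 * n / total for cls, n in sorted(counts.items())}

# Hypothetical 4x4 class map: class 3 block plus one stray class 3 cell.
grid = [
    [1, 1, 1, 1],
    [1, 3, 1, 1],
    [1, 1, 3, 3],
    [1, 1, 3, 3],
]

smoothed = mode_filter_3x3(grid)
print(coverage_percent(grid))
print(coverage_percent(smoothed))
```

In this toy grid the filter removes the isolated class 3 cell, so class 3 coverage drops from 31.25% to 25%. Smoothing shifts the percentages as well as tidying the picture.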
A confounding factor in the above is the presence of dark shingles and shaded trees, which work against a clean separation of classes.