Clustering algorithms assign observational units to relevant groups. A major problem is choosing the number of groups when the true number is not known in advance.
This is the first in a series of blocks where we will create a visualization designed to help a domain expert identify the ideal number of groups for a set of time series observations.
The images above were created in R as part of the data analysis process. We'll use D3 to create an interactive version.
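One quantity often used when comparing candidate values of k is the total within-cluster sum of squares: the squared distance from each series to the pointwise mean of its group, summed over all groups. The sketch below is only an illustration of that idea and is not necessarily how the figures here were produced. The function name `withinClusterSS` and its arguments are hypothetical, and it assumes every series has the same length.

```javascript
// Hypothetical helper: total within-cluster sum of squares.
// `series` is an array of equal-length numeric arrays (one per observational unit).
// `labels[i]` is the group assigned to series[i] by some clustering run.
function withinClusterSS(series, labels) {
  // Collect the member series of each group.
  const groups = new Map();
  series.forEach((s, i) => {
    if (!groups.has(labels[i])) groups.set(labels[i], []);
    groups.get(labels[i]).push(s);
  });

  let total = 0;
  for (const members of groups.values()) {
    const len = members[0].length;
    // Centroid: the pointwise mean of the group's series.
    const centroid = Array.from({ length: len }, (_, t) =>
      members.reduce((sum, s) => sum + s[t], 0) / members.length
    );
    // Add each member's squared distance to the centroid.
    for (const s of members) {
      for (let t = 0; t < len; t++) {
        total += (s[t] - centroid[t]) ** 2;
      }
    }
  }
  return total;
}

// Toy data: four very short "series" grouped two ways.
const toy = [[1, 2], [3, 4], [10, 10], [12, 14]];
const oneGroup = withinClusterSS(toy, [0, 0, 0, 0]);
const twoGroups = withinClusterSS(toy, [0, 0, 1, 1]);
```

Computing this value for each candidate k and plotting it against k gives a rough picture of where adding more groups stops paying off. It only complements the visual inspection by a domain expert that this series is aiming for.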
<meta charset="utf-8">
<style>
/* Stack the images and the link vertically. */
img,
a {
  display: block;
}
</style>
<h2>Income from ages 25-65</h2>
<img src="hdinc.png" alt="Income Series"/>
<h2>Group Assignment with k = 10</h2>
<img src="kmeans_c_10.png" alt="Clusters for k = 10"/>
<a href="kmeans_matrix.pdf" target="_blank" rel="noopener">Matrix of Clusters</a>