2003 AI Qual Programming Question

Sixty-Minute Software (60S) wants a quick-and-dirty clustering algorithm for N-dimensional data. The idea is real simple: given M data points, where each data point is a vector of N floating point numbers, start with each point in its own cluster, then repeatedly merge the closest pair of clusters, where the distance between two clusters is the average of the distances between the points in the clusters. 60S lost the page that describes when to stop merging, so your main function should take a parameter that can be used to decide when to stop, given the current set of clusters.

60S doesn't care what language you do this in, but they want clean code with short well-named functions (and classes, if any). Flexibility is a plus. Note: it's better to get something working using standard collections for points, clusters, and sets of clusters, than to spend time defining new data structures.