Hi guys
Im working on a text mining/clustering project and am trying to create a table which contains number of clusters as rows and 6 columns representing the following 6 metrics: max.diameter, min.separation, average.within,average.between,avg.silwidth,dunn.
I need to create the tables for 3 methods - kmeans, pam and hclust.
I was able to create something for kmeans
OUTPUT
OUTPUT END
I need similar output for hclust and pam methods but for the life of me can't get the same function to work for either of the two methods
OK, so I was able to make the function for HCLUST
But I get an error when i run this
Error in cluster.stats$min.separation :
object of type 'closure' is not subsettable
What I need is for it to print "min.separation" output and other 5 measures like in the kmeans code.
I would really appreciate all the help and perhaps some guidance in understanding why my approach is failing in hclust.
Also, is there a good source that can explain the functioning and application of these methods, step by step, in detail?
Thank You
Im working on a text mining/clustering project and am trying to create a table which contains number of clusters as rows and 6 columns representing the following 6 metrics: max.diameter, min.separation, average.within,average.between,avg.silwidth,dunn.
I need to create the tables for 3 methods - kmeans, pam and hclust.
I was able to create something for kmeans
Code:
dtm0.90Dist = dist(dtm0.90)
foreachcluster = function(k) {
kmeans.result = kmeans(dtm0.90, k);
kmeans.stats = cluster.stats(dtm0.90Dist,kmeans.result$cluster);
c(kmeans.stats$min.separation, kmeans.stats$max.diameter,
kmeans.stats$average.within, kmeans.stats$avearge.between,
kmeans.stats$avg.silwidth, kmeans.stats$dunn)
}
rbind(foreachcluster(2), foreachcluster(3), foreachcluster(4), foreachcluster(5),
foreachcluster(6), foreachcluster(7),foreachcluster(8))
Code:
[,1] [,2] [,3] [,4] [,5]
[1,] 3.162278 30.19934 5.831550 0.5403872 0.10471348
[2,] 2.236068 28.37252 5.006058 0.3923446 0.07881104
[3,] 1.000000 28.37252 4.995478 0.2496066 0.03524537
[4,] 1.000000 26.40076 4.387212 0.2633338 0.03787770
[5,] 1.000000 26.40076 4.353248 0.2681947 0.03787770
[6,] 1.000000 26.40076 4.163757 0.1633954 0.03787770
[7,] 1.000000 26.40076 4.128927 0.2676423 0.03787770
I need similar output for hclust and pam methods but for the life of me can't get the same function to work for either of the two methods
OK, so I was able to make the function for HCLUST
Code:
forhclust=function(k){dfDist = dist(dtm0.90);
hclust.result = hclust(dfDist);
hclust.cluster = (cutree(hclust.result, k));
cluster.stats(dfDist,hclust.cluster);c(cluster.stats$min.separation)}
Error in cluster.stats$min.separation :
object of type 'closure' is not subsettable
What I need is for it to print "min.separation" output and other 5 measures like in the kmeans code.
I would really appreciate all the help and perhaps some guidance in understanding why my approach is failing in hclust.
Also, is there a good source that can explain the functioning and application of these methods, step by step, in detail?
Thank You
Last edited: