Building clusters from a series of financial data

#1
Hi everyone,

First of all, I need to tell you that I don't know much about statistics - I work in finance. Currently I need to "reverse-engineer" one of our projects. To put it clearly, I have a series of data (26 values, one per product we sell, in percent) that represents each product's share of total sales. Based on this series, a "statistics guy" generated five clusters and assigned each percentage to one of them. The clusters are: <1%; between 1% and 2.3%; between 2.4% and 5%; between 5% and 7.5%; > 7.5%. The data is (separated by semicolons): 14.33%;10.19%;8.67%;8.21%;7.24%;6.98%;5.87%;5.72%;5.32%;5.12%;3.91%;3.59%;2.54%;2.19%;1.89%;1.61%;1.49%;1.30%;1.11%;0.72%;0.58%;0.48%;0.45%;0.30%;0.13%;0.08%.

Now, my question is: how was the guy able to generate the clusters the way he did? What could he have used, why five clusters, and which methods? Should I be looking at some finance-specific statistics (we'll use these clusters to allocate expenses to each product)?

Many thanks,
Costin
 
#2
Now, my question is: how was the guy able to generate the clusters the way he did? What could he have used, why five clusters, and which methods?
You need to read about clustering. Go through the idea of clustering in general, and then these two methods:

- Hierarchical clustering (here the dendrogram is a visual aid that will give you a rough idea of the number of clusters in your data)
- K-means clustering (needs the number of clusters as input)
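
To make the k-means idea concrete, here is a minimal sketch in plain Python that runs the standard assign-then-update loop (Lloyd's algorithm) on the 26 percentages from the first post, with k = 5. The function name `kmeans_1d` and the deterministic choice of starting centers are my own assumptions for illustration; nothing here is confirmed to be what your colleague actually ran.

```python
# Share of total sales per product, in percent (values from the first post).
shares = [14.33, 10.19, 8.67, 8.21, 7.24, 6.98, 5.87, 5.72, 5.32, 5.12,
          3.91, 3.59, 2.54, 2.19, 1.89, 1.61, 1.49, 1.30, 1.11, 0.72,
          0.58, 0.48, 0.45, 0.30, 0.13, 0.08]


def kmeans_1d(values, k, max_iter=100):
    """Lloyd's algorithm on one-dimensional data.

    Returns (centers, clusters); both are ordered from low to high.
    """
    data = sorted(values)
    n = len(data)
    # Assumption: start from evenly spaced order statistics so the run is
    # deterministic. Library implementations usually use random starts.
    centers = [data[(2 * i + 1) * n // (2 * k)] for i in range(k)]
    clusters = []
    for _ in range(max_iter):
        # Assignment step: each value goes to its nearest center.
        clusters = [[] for _ in range(k)]
        for x in data:
            nearest = min(range(k), key=lambda j: abs(x - centers[j]))
            clusters[nearest].append(x)
        # Update step: each center moves to the mean of its cluster
        # (an empty cluster keeps its previous center).
        new_centers = [sum(c) / len(c) if c else centers[j]
                       for j, c in enumerate(clusters)]
        if new_centers == centers:  # assignments stopped changing
            break
        centers = new_centers
    return centers, clusters


centers, clusters = kmeans_1d(shares, k=5)
for center, members in zip(centers, clusters):
    if members:
        print(f"center {center:5.2f}%  "
              f"range {min(members):.2f}% to {max(members):.2f}%  "
              f"n={len(members)}")
```

With this particular starting point the five groups come out as 0.08 to 0.72, 1.11 to 2.19, 2.54 to 3.91, 5.12 to 7.24 and 8.21 to 14.33, each of which falls inside one of the five ranges quoted in the first post. K-means depends on its starting centers, so a different initialization may give somewhat different groups, and the number of clusters (five) still has to be supplied by you.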