Submitted by: mukeshbhaarathi
Date Submitted: 03/06/2015 05:37 PM
Clustering
• Cluster: a collection of data objects
– Similar to one another within the same cluster
– Dissimilar to objects in other clusters
• Cluster analysis
– Grouping a set of data objects into clusters
• Unsupervised classification: no predefined classes
• Typical applications
– Making sense of the structure of complex data
– Breaking up large data sets into meaningful subsets
– Customer segments
– Prototypical cases, outliers
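To make the definition of a cluster concrete, here is a minimal sketch that compares the average distance within a group to the average distance between two groups. The points, the function names `mean_within` and `mean_between`, and the choice of Euclidean distance are assumptions for illustration; the slides do not specify a distance measure here.

```python
import math
from itertools import combinations, product

def mean_within(cluster):
    """Average Euclidean distance over all pairs inside one cluster."""
    pairs = list(combinations(cluster, 2))
    return sum(math.dist(p, q) for p, q in pairs) / len(pairs)

def mean_between(cluster_a, cluster_b):
    """Average Euclidean distance over all cross-cluster pairs."""
    pairs = list(product(cluster_a, cluster_b))
    return sum(math.dist(p, q) for p, q in pairs) / len(pairs)

# Hypothetical toy data: two well-separated groups of 2-D points.
cluster_a = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0)]
cluster_b = [(8.0, 8.0), (8.0, 9.0), (9.0, 8.0)]

within_a = mean_within(cluster_a)
between = mean_between(cluster_a, cluster_b)

# A good grouping has small within-cluster and large between-cluster distance.
print(f"within A: {within_a:.3f}, between A and B: {between:.3f}")
```

If the within-cluster average is much smaller than the between-cluster average, the grouping matches the "similar inside, dissimilar outside" description.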
Example: Claritas segmented US neighborhoods based on demographics & income: “Furs & station wagons,” “Money & Brains”, …
Berry & Linoff, pg. 462
Hertzsprung-Russell diagram: star clusters by temperature and brightness
Clusters represent stars at different phases of the stellar life cycle
How many clusters?
[Figure panels: Two Clusters, Four Clusters, Six Clusters]
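The question has no single answer: the number of groups depends on how close two points must be to count as "together". The sketch below is one way to see this and is not the method behind the original figure. It links any two points that lie within a threshold distance and counts the resulting groups. The data, the function name `count_clusters`, and the thresholds are all hypothetical.

```python
import math

def count_clusters(points, threshold):
    """Count groups formed by linking points at most `threshold` apart."""
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the group representative (with path halving).
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) <= threshold:
                parent[find(i)] = find(j)  # merge the two groups

    return len({find(i) for i in range(len(points))})

# Hypothetical toy data: six tight pairs with small, medium, and large gaps.
points = [
    (0.0, 0.0), (0.2, 0.0),
    (1.5, 0.0), (1.7, 0.0),
    (4.5, 0.0), (4.7, 0.0),
    (10.0, 0.0), (10.2, 0.0),
    (13.0, 0.0), (13.2, 0.0),
    (14.5, 0.0), (14.7, 0.0),
]

for threshold in (0.5, 2.0, 4.0):
    print(threshold, count_clusters(points, threshold))
```

On this toy data the three thresholds give six, four, and two groups, so the same points support several reasonable answers.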
Example: Public Utilities
Goal: find clusters of similar utilities
Data: 22 firms, 8 variables
– Fixed-charge covering ratio
– Rate of return on capital
– Cost per kilowatt capacity
– Annual load factor
– Growth in peak demand
– Sales
– % nuclear
– Fuel costs per kWh
Company       Fixed_charge   RoR  Cost  Load  D Demand  Sales  Nuclear  Fuel_Cost
Arizona               1.06   9.2   151  54.4       1.6   9077        0      0.628
Boston                0.89  10.3   202  57.9       2.2   5088     25.3      1.555
Central               1.43  15.4   113    53       3.4   9212        0      1.058
Commonwealth          1.02  11.2   168    56       0.3   6423     34.3        0.7
Con Ed NY             1.49   8.8   192  51.2         1   3300     15.6      2.044
Florida               1.32  13.5   111    60      -2.2  11127     22.5      1.241
Hawaiian              1.22  12.2   175  67.6       2.2   7642        0      1.652
Idaho                  1.1   9.2   245    57       3.3  13082        0      0.309
Kentucky              1.34    13   168  60.4       7.2   8406        0      0.862
Madison               1.12  12.4   197    53       2.7   6455     39.2      0.623
Nevada                0.75   7.5   173  51.5       6.5  17441        0      0.768
New England           1.13  10.9   178    62       3.7   6154        0      1.897
Northern              1.15  12.7   199  53.7       6.4   7179     50.2      0.527
Oklahoma              1.09    12    96  49.8       1.4   9673        0      0.588
Pacific               0.96   7.6   164  62.2      -0.1   6468      0.9        1.4
Puget                 1.16   9.9   252    56       9.2  15991        0       0.62
San Diego             0.76   6.4   136  61.9         9   5714      8.3       1.92
Southern              1.05  12.6   150  56.7       2.7  10140        0      1.108
Texas                 1.16  11.7...
Wisconsin     ...
United        ...
Virginia      ...
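The variables in this table sit on very different scales (Sales is in the thousands, Fixed_charge is near 1), so a raw Euclidean distance between two firms is driven almost entirely by Sales. The sketch below shows this for Arizona and Boston, then repeats the comparison after z-scoring each column. Standardizing is a common preprocessing choice and is an assumption here, not something stated in the source. The sketch uses only the first five rows and assumes the company names pair with the rows in listing order, so the standardized numbers are illustrative and would differ on the full 22-firm table. The helper names `standardize` and `sales_share` are hypothetical.

```python
import math
from statistics import mean, stdev

# Column order as in the table header.
COLUMNS = ["Fixed_charge", "RoR", "Cost", "Load",
           "D Demand", "Sales", "Nuclear", "Fuel_Cost"]
SALES = COLUMNS.index("Sales")

# First five rows of the table (name-to-row pairing assumed from listing order).
firms = {
    "Arizona":      [1.06, 9.2, 151, 54.4, 1.6, 9077, 0, 0.628],
    "Boston":       [0.89, 10.3, 202, 57.9, 2.2, 5088, 25.3, 1.555],
    "Central":      [1.43, 15.4, 113, 53, 3.4, 9212, 0, 1.058],
    "Commonwealth": [1.02, 11.2, 168, 56, 0.3, 6423, 34.3, 0.7],
    "Con Ed NY":    [1.49, 8.8, 192, 51.2, 1, 3300, 15.6, 2.044],
}

def standardize(rows):
    """Z-score every column: subtract its mean, divide by its sample std."""
    columns = list(zip(*rows.values()))
    means = [mean(c) for c in columns]
    sds = [stdev(c) for c in columns]
    return {
        name: [(v - m) / s for v, m, s in zip(values, means, sds)]
        for name, values in rows.items()
    }

def sales_share(a, b):
    """Fraction of the squared Euclidean distance contributed by Sales."""
    squared = [(x - y) ** 2 for x, y in zip(a, b)]
    return squared[SALES] / sum(squared)

raw_distance = math.dist(firms["Arizona"], firms["Boston"])
raw_share = sales_share(firms["Arizona"], firms["Boston"])

z = standardize(firms)
z_distance = math.dist(z["Arizona"], z["Boston"])
z_share = sales_share(z["Arizona"], z["Boston"])

print(f"raw: distance={raw_distance:.1f}, Sales share={raw_share:.4f}")
print(f"standardized: distance={z_distance:.2f}, Sales share={z_share:.2f}")
```

On the raw values, Sales accounts for more than 99% of the squared distance between the two firms. After standardizing, no single variable dominates in the same way.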