Chapter 9: Cluster analysis

Cluster analysis: (in a nutshell)
Cluster analysis combines similar objects into larger groups or clusters. Examples: What are the consumer segments in the smartphone market? Which countries can be combined into homogeneous groups?
Goals of this chapter:
1. Understand the basic concept of cluster analysis.
2. Be aware of when to use different procedures.
3. Know how to assess the results of a cluster analysis.
4. Know approaches of how to implement cluster analysis in statistical software.
Cluster analysis:
Cluster analysis reduces the number of observations by grouping them.
Groups objects (consumers, brands, situations, companies, etc.) into new clusters.
Finds clusters of objects such that: 1. objects within a cluster are relatively similar, 2. objects from different clusters are relatively dissimilar.
Often used for segmentation purposes.
Six criteria for effective segmentation:
1. Responsiveness: homogeneous, unique response within segment, heterogeneous between segments.
2. Actionability: segments and the firm's goals/competencies should match.
3. Substantiality: segments should be large enough.
4. Identifiability: segments should be easily measurable.
5. Accessibility: effective promotional/distributional tools needed.
6. Stability: Composition of segments should not change rapidly.
Agglomeration schedule:
Gives information on the objects or cases being combined at each stage of a hierarchical clustering process.
Cluster centroid:
Mean values of the variables for all the cases or objects in a particular cluster.
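As a small illustration of this definition, the sketch below computes a centroid as the per-variable mean of the cases in one cluster. The function name `centroid` and the example data are made up for illustration.

```python
def centroid(cluster):
    """Return the per-variable mean of a cluster given as equal-length tuples."""
    n = len(cluster)
    # zip(*cluster) regroups the data variable by variable.
    return tuple(sum(values) / n for values in zip(*cluster))


# Hypothetical cluster: three cases measured on two variables.
cluster = [(1.0, 2.0), (3.0, 4.0), (5.0, 9.0)]
center = centroid(cluster)
print(center)  # (3.0, 5.0)
```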
Cluster centers:
Initial starting points in non-hierarchical clustering. Clusters are built around these centers or seeds.
Cluster membership:
Indicates the cluster to which each object or case belongs.
Dendrogram (tree graph):
Shows clustering results. Vertical lines represent clusters that are joined together. The position of the line on the scale indicates the distance at which the clusters were joined.
Distances between cluster centers:
Indicate how separated the individual pairs of clusters are.
Similarity/distance coefficient matrix:
A lower triangular matrix containing pairwise distances between objects or cases.
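A minimal sketch of how such a matrix could be built, assuming Euclidean distance (the card itself does not fix a distance measure). Row i holds the distances from object i to all earlier objects, so only the lower triangle is stored. The function name and data are hypothetical.

```python
import math


def lower_triangular_distances(objects):
    """Return rows of pairwise Euclidean distances below the diagonal."""
    rows = []
    for i in range(1, len(objects)):
        # Distances from object i to objects 0 .. i-1.
        rows.append([math.dist(objects[i], objects[j]) for j in range(i)])
    return rows


# Hypothetical objects measured on two variables.
objects = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
matrix = lower_triangular_distances(objects)
for row in matrix:
    print(row)
```

The diagonal (all zeros) and the upper triangle (a mirror image) are omitted because they carry no extra information.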
Conducting cluster analysis:
Step 1: Determine the variables that are used to cluster on ("active variables")
Step 2: Determine the number of segments or clusters (Ward's method).
Step 3: Determine cluster membership (K-means clustering).
Step 4: Check whether the active and passive variables differ between clusters.
Selecting the cluster variables is a crucial task:
The set of variables selected should describe the similarity between objects in terms that are relevant to the marketing research problem.
The variables should be selected based on past research, theory or a consideration of the hypothesis being tested.
Hierarchical clustering:
Hierarchical clustering is characterized by the development of a hierarchy or treelike structure. Hierarchical methods can be agglomerative or divisive.
Agglomerative clustering: Clusters are formed by grouping objects into bigger and bigger clusters until all objects are members of a single cluster.
Divisive clustering: Clusters are divided or split until each object is in a separate cluster.
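The agglomerative direction can be sketched as a loop that repeatedly merges the two closest clusters and records each merge, which also yields an agglomeration schedule. This sketch assumes Euclidean distance and the nearest-neighbor rule for cluster distance; the function names and data are illustrative only.

```python
import math


def single_link(a, b, points):
    """Distance between two clusters (lists of indices): closest pair of members."""
    return min(math.dist(points[i], points[j]) for i in a for j in b)


def agglomerate(points):
    """Merge clusters until one remains; return the agglomeration schedule."""
    clusters = [[i] for i in range(len(points))]  # every object starts alone
    schedule = []
    while len(clusters) > 1:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                d = single_link(clusters[x], clusters[y], points)
                if best is None or d < best[0]:
                    best = (d, x, y)
        d, x, y = best
        # Record which clusters were combined at this stage and at what distance.
        schedule.append((sorted(clusters[x]), sorted(clusters[y]), d))
        merged = clusters[x] + clusters[y]
        clusters = [c for k, c in enumerate(clusters) if k not in (x, y)]
        clusters.append(merged)
    return schedule


# Hypothetical data: two tight pairs that lie far apart.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 0.0), (5.0, 1.5)]
schedule = agglomerate(points)
for stage, (left, right, d) in enumerate(schedule, start=1):
    print(stage, left, right, round(d, 3))
```

A divisive procedure would run the other way, starting from one cluster and splitting it; it is not shown here.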
Single linkage method
The single linkage method is based on minimum distance or the nearest neighbor rule. At every stage, the distance between two clusters is the distance between their two closest points.
Complete linkage method
The complete linkage method calculates the distance between two clusters as the distance between their two farthest points.
Average linkage method
The average linkage method uses the average of the distances between all pairs of objects, where one member of the pair is from each of the clusters.
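The three linkage rules differ only in how they summarize the pairwise distances between two clusters. The sketch below assumes Euclidean distance; the function names and the two example clusters are made up for illustration.

```python
import math
from itertools import product


def pair_distances(a, b):
    """All distances between one member of cluster a and one member of cluster b."""
    return [math.dist(p, q) for p, q in product(a, b)]


def single_linkage(a, b):
    return min(pair_distances(a, b))  # two closest points


def complete_linkage(a, b):
    return max(pair_distances(a, b))  # two farthest points


def average_linkage(a, b):
    d = pair_distances(a, b)
    return sum(d) / len(d)  # mean over all cross-cluster pairs


# Hypothetical clusters on a line; cross-cluster distances are 3, 6, 2 and 5.
cluster_a = [(0.0, 0.0), (1.0, 0.0)]
cluster_b = [(3.0, 0.0), (6.0, 0.0)]
print(single_linkage(cluster_a, cluster_b))    # 2.0
print(complete_linkage(cluster_a, cluster_b))  # 6.0
print(average_linkage(cluster_a, cluster_b))   # 4.0
```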
Variance methods minimize the within-cluster variance (Ward's procedure):
Ward's procedure: Compute the means of all variables within each cluster. For each object, compute the squared Euclidean distance to its cluster means. Sum these distances over all objects. At each stage, the two clusters whose merger produces the smallest increase in the overall within-cluster sum of squared distances are combined.
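One way to sketch a single stage of this procedure: compute the within-cluster sum of squares before and after each candidate merge and pick the pair with the smallest increase. The function names and the one-variable example data are assumptions for illustration.

```python
def centroid(cluster):
    n = len(cluster)
    return tuple(sum(values) / n for values in zip(*cluster))


def within_ss(cluster):
    """Sum of squared Euclidean distances of all objects to the cluster means."""
    c = centroid(cluster)
    return sum(sum((x - m) ** 2 for x, m in zip(obj, c)) for obj in cluster)


def ward_increase(a, b):
    """Increase in the overall within-cluster sum of squares if a and b merge."""
    return within_ss(a + b) - within_ss(a) - within_ss(b)


def best_ward_merge(clusters):
    """Return (increase, i, j) for the pair of clusters that is cheapest to merge."""
    candidates = [
        (ward_increase(clusters[i], clusters[j]), i, j)
        for i in range(len(clusters))
        for j in range(i + 1, len(clusters))
    ]
    return min(candidates)


# Hypothetical clusters measured on a single variable.
clusters = [[(0.0,), (2.0,)], [(3.0,)], [(10.0,), (12.0,)]]
increase, i, j = best_ward_merge(clusters)
print(i, j, round(increase, 3))  # clusters 0 and 1 are merged
```

Only the merged pair changes the overall sum, so comparing increases is equivalent to comparing the overall sums after each candidate merge.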
Centroid method:
In the centroid method, the distance between two clusters is the distance between their centroids (the means of all the variables). Every time objects are grouped, a new centroid is computed.
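A brief sketch of the centroid method's two steps, assuming Euclidean distance: measure the distance between the two centroids, then recompute the centroid once the clusters are grouped. Names and data are hypothetical.

```python
import math


def centroid(cluster):
    n = len(cluster)
    return tuple(sum(values) / n for values in zip(*cluster))


def centroid_distance(a, b):
    """Distance between two clusters as the distance between their centroids."""
    return math.dist(centroid(a), centroid(b))


# Hypothetical clusters: centroids are (1, 0) and (4, 4).
cluster_a = [(0.0, 0.0), (2.0, 0.0)]
cluster_b = [(4.0, 4.0)]
distance = centroid_distance(cluster_a, cluster_b)

# After grouping, a new centroid is computed from all members.
merged_centroid = centroid(cluster_a + cluster_b)
print(distance, merged_centroid)
```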
Using Ward's method to determine the number of segments:
Clusters are joined such that within-group variance stays small and between-group variance stays large. The method tends to produce relatively compact clusters of similar size. Simulation studies find that Ward's method works best among hierarchical methods.
Dendrogram for Ward's method:
Merging becomes costly when the distance between the clusters being joined is large, which shows up as long branches between successive merges. Stopping before such a costly merge suggests the number of clusters.
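Reading the dendrogram this way can be approximated numerically. The sketch below applies a "largest jump" heuristic, which is an assumption and not a rule stated in these notes: it looks for the biggest increase between successive merge distances and stops just before that costly merge. The function name and the merge distances are invented.

```python
def suggest_cluster_count(merge_distances):
    """Suggest a cluster count from merge distances listed in merge order.

    For n objects there are n - 1 merges. The suggestion is the number of
    clusters left just before the merge that follows the largest jump.
    """
    n_objects = len(merge_distances) + 1
    jumps = [
        later - earlier
        for earlier, later in zip(merge_distances, merge_distances[1:])
    ]
    biggest = max(range(len(jumps)), key=jumps.__getitem__)
    merges_done = biggest + 1  # merges carried out before the costly one
    return n_objects - merges_done


# Hypothetical schedule for five objects: the last merge is far more costly.
merge_distances = [1.0, 1.5, 2.0, 9.0]
print(suggest_cluster_count(merge_distances))  # 2
```

The result is only a starting point; interpretability of the resulting segments should also guide the choice.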
Non-hierarchical K-means clustering:
Requires that user specifies number of clusters.
Objects are iteratively re-assigned to clusters whose centroids are closest.
Done when no more re-assignments can be made.
Advantage of K-means: objects can switch cluster membership during the iterations, whereas a merge in hierarchical clustering is never undone.
In the output, the iteration history should show the change in cluster centers falling to zero, meaning that the maximum distance by which any center moved in the final iteration is 0.00.
The F-test in the output should be read only descriptively, because the clusters were chosen to maximize the differences among cases in different clusters. The F-test is therefore invalid for testing the hypothesis that the cluster means are equal.
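The reassignment loop on this card can be sketched as follows, assuming Euclidean distance and user-supplied starting seeds. The `history` list mimics an iteration history by recording the largest center movement per iteration. All names and data are hypothetical, and real software adds refinements not shown here.

```python
import math


def centroid(cluster):
    n = len(cluster)
    return tuple(sum(values) / n for values in zip(*cluster))


def k_means(points, centers, max_iter=100):
    """Reassign points to the nearest center until no center moves."""
    history = []
    for _ in range(max_iter):
        # Assign each point to the cluster whose center is closest.
        labels = [
            min(range(len(centers)), key=lambda k: math.dist(p, centers[k]))
            for p in points
        ]
        # Recompute each center; an empty cluster keeps its old center.
        new_centers = []
        for k in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            new_centers.append(centroid(members) if members else centers[k])
        change = max(math.dist(o, n) for o, n in zip(centers, new_centers))
        history.append(change)
        centers = new_centers
        if change == 0.0:  # no center moved: done
            break
    return labels, centers, history


# Hypothetical data with two obvious groups and two rough seeds (k = 2).
points = [(1.0, 1.0), (1.0, 2.0), (8.0, 8.0), (9.0, 8.0)]
seeds = [(0.0, 0.0), (5.0, 5.0)]
labels, centers, history = k_means(points, seeds)
print(labels)
print(centers)
print(history)
```

The number of clusters is fixed by the number of seeds, which is why it has to be chosen beforehand, for example from a hierarchical run.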
Interpretation of cluster solution:
1. Active variables: used for clustering. Test differences with ANOVA.
2. Passive variables: not used for clustering, but may help interpretation, e.g. sociodemographics, media usage, life stage.
Test differences in metric variables with ANOVA.
Test differences in non-metric variables with chi-square analysis.
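To make the two tests concrete, the sketch below computes the one-way ANOVA F statistic for a metric variable across clusters and the chi-square statistic for a cross-tabulation of cluster by category. It stops at the test statistics; p-values would require the F and chi-square distributions from a statistics package. The function names and all numbers are invented for illustration.

```python
def anova_f(groups):
    """One-way ANOVA F statistic; each group holds one cluster's values."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


def chi_square(table):
    """Chi-square statistic for a contingency table (rows = clusters)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat


# Hypothetical passive metric variable (age) in two clusters.
age_by_cluster = [[20.0, 22.0, 24.0], [40.0, 42.0, 44.0]]
f_value = anova_f(age_by_cluster)

# Hypothetical passive non-metric variable: counts per cluster and category.
counts = [[30, 10], [10, 30]]
chi2_value = chi_square(counts)
print(f_value, chi2_value)
```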