Tag Archives: data mining

Cohort Analysis — understanding your customers

Dimensional BI analysis is great stuff for understanding the business, but what really makes your customer’s tick? BI techniques such as survey analysis, market basket analysis, and clickstream analysis help illuminate this universal question. But BI queries typically tells us about the “average customer” who never exists.  How many households do you know with exactly 2.6 people?   
Cohort Analysis comes at the customer question from a different direction. The idea is to choose specific customers first, and then dimensionally analyze their actions, and the reactions of related parties. This approach eliminates error resulting from changes in the underlying customer mix over time. In effect, the data warehouse becomes a massive pool of longitudinal study participants whom you don’t have to pay, and can be studied in seconds rather than years.

Stepwise Analysis 

Cohort Analsis is a three step process:

  1. The “design step” frames the question or hypothesis to investigate. For example,—do free breakfast upgrades measurably improve business traveler retention for a hotel chain? If there is a good correlation, then a check-in rule might be instituted to upgrade certain types of guests when room availability hits certain thresholds.
  2. The “cohort step” selects the customers to study,typically a control group and one or more target groups.  For example, “select from the EDW a list of business travelers based in the north-east market with at least one stay per quarter over the preceding four quarters”.  From this pool frequent travelers, subset a control group that did not receive the upgrades in the first quarter, and a target group that did.
  3. The “study step” compares the control and target cohorts over some time range and dimensional grouping criteria.  Any metrics or target groupings can be used in the comparisons.  Example metrics are number of stays, length of stays, revenue, non-room charges, etc.  Example dimensional groupings are weekend versus non-weekend guests, guests that had meal charges versus those that did not, etc.

Power to the People 

Ad-hoc cohort analysis requires raw horsepower for dynamic joins at the customer key level. Without the power afforded by an EDW, development staff may be needed to set up temporary database structures and batch processing–not good.  We want the business analyst to be in full control–commanding rapid study cycles in the iterative question?answer?next question approach.  For performance reasons, cohort analysis processing should take place mostly on the EDW server, not the reporting tool platform.  In addition, certain aggregate tables and views may be required in the back-end EDW. 

Cohort Analysis Workbench

A Cohort Analysis Workbench enables the analyst to save cohort stage result sets for later use as study filters.  The workbench also manages a folder-based library of cohorts and studies for comparative analysis. High-end BI tools like MicroStrategy are well suited for creating such a workbench.  MicroStrategy can publish intermediate results to SQL tables, join reports using multi-pass SQL, and manage a library of studies with appropriate security. 

A Collaborative Approach 

This is not entirely an out-of-the box solution.  The workbench should be explicitly designed and tested with a portfolio of templates and key metrics.  Fortunately, high-end BI tools and MPP EDW servers make building a cohort analysis workbench relatively quick task.  But business analyst and the BI analyst must collaborate closely to get the job done. The active ingredient is the brain of the lead business analyst.  His or her vision, sponsorship, enthusiasm and direct participation will assure success. 

* * * * *

Do you have opportunities for cohort analysis in your organization?  Comment back to this post with your objectives and challenges.  Let’s get some dialog going around this very powerful method for understanding your customers.