Other important modeling components
Clustering
Clustering is necessary for dimension analysis.
- For discrete dimensions, any value that accounts for less than X% (X=2) of the primary numerator is aggregated into “other”.
- For continuous dimensions, cuts are made using a weighted decision tree methodology in order to create coherent buckets.
Read docs related to continuous dimension
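The discrete rule above can be sketched in a few lines of Python. This is a minimal illustration, not Datama's implementation: the function name `cluster_discrete`, its parameters, and the sample figures are hypothetical, and the share is assumed to be measured against the total primary numerator across all values.

```python
from collections import defaultdict


def cluster_discrete(numerator_by_value, threshold_pct=2.0, other_label="other"):
    """Aggregate values whose share of the total numerator is below threshold_pct.

    Hypothetical helper: the share is assumed to be computed against the
    total primary numerator over all values of the dimension.
    """
    total = sum(numerator_by_value.values())
    clustered = defaultdict(float)
    for value, numerator in numerator_by_value.items():
        share_pct = 100.0 * numerator / total if total else 0.0
        # Values strictly below the threshold are merged into "other".
        key = value if share_pct >= threshold_pct else other_label
        clustered[key] += numerator
    return dict(clustered)


# Illustrative data: primary numerator (e.g. sales) per country.
sales_by_country = {"FR": 600, "DE": 300, "ES": 85, "PT": 10, "BE": 5}
clustered = cluster_discrete(sales_by_country)
print(clustered)
```

With these sample figures, PT (1%) and BE (0.5%) fall below the 2% threshold and are merged, while ES (8.5%) keeps its own bucket.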
Interdependence
In ‘Safe Mode’, the most correlated dimensions are flagged. Interdependencies between dimensions are tested using a Chi-square test and a simple business calculation.
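To make the Chi-square part concrete, here is a minimal sketch of a Pearson Chi-square statistic computed between two categorical dimensions. The page does not say how Datama turns the test into a flag, so the Cramér's V normalization below is an assumption added for illustration, as are all function names and the sample data.

```python
from collections import Counter
from math import sqrt


def chi_square_statistic(pairs):
    """Pearson chi-square statistic for a list of (dim_a, dim_b) observations."""
    n = len(pairs)
    joint = Counter(pairs)
    count_a = Counter(a for a, _ in pairs)
    count_b = Counter(b for _, b in pairs)
    chi2 = 0.0
    for a, n_a in count_a.items():
        for b, n_b in count_b.items():
            expected = n_a * n_b / n  # count expected under independence
            observed = joint.get((a, b), 0)
            chi2 += (observed - expected) ** 2 / expected
    return chi2


def cramers_v(pairs):
    """Normalize chi-square to [0, 1]; assumed here as a comparable strength score."""
    n = len(pairs)
    k = min(len({a for a, _ in pairs}), len({b for _, b in pairs}))
    if k < 2:
        return 0.0
    return sqrt(chi_square_statistic(pairs) / (n * (k - 1)))


# Fully dependent dimensions: device always determines channel.
dependent = [("mobile", "app")] * 50 + [("desktop", "web")] * 50
# Independent dimensions: every combination is equally frequent.
independent = [(d, c) for d in ("mobile", "desktop") for c in ("app", "web")] * 25

print(cramers_v(dependent), cramers_v(independent))
```

A score close to 1 would suggest that two dimensions carry largely the same information, which is the situation the flag is meant to surface.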
Combined dimension
The combined dimension is created by concatenating all clustered dimensions into one “Combined_Dimension”. It is then treated like any other dimension, and its contribution to the variation in performance is assessed in the same way.
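As an illustration of the concatenation step, the sketch below builds a “Combined_Dimension” value for each row from its already clustered dimensions. The helper name, the row layout, and the `" | "` separator are assumptions; the page does not specify how the values are joined.

```python
def add_combined_dimension(rows, dimensions, separator=" | "):
    """Return copies of rows with a 'Combined_Dimension' key added.

    Hypothetical helper: each row is a dict whose dimension values are
    assumed to be clustered already; the separator is illustrative.
    """
    combined_rows = []
    for row in rows:
        new_row = dict(row)  # leave the input rows untouched
        new_row["Combined_Dimension"] = separator.join(
            str(row[dim]) for dim in dimensions
        )
        combined_rows.append(new_row)
    return combined_rows


rows = [
    {"Country": "FR", "Device": "mobile", "Sales": 120},
    {"Country": "other", "Device": "desktop", "Sales": 15},
]
combined = add_combined_dimension(rows, ["Country", "Device"])
print([r["Combined_Dimension"] for r in combined])
```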
Significance
In ‘Safe Mode’, a simple check is made against a minimal volume (entered manually) for the given metric in Start and End. You can also use Datama Impact to properly assess the significance of variations.
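A minimal sketch of such a volume check follows. It assumes that the threshold must be met in both Start and End, which the text does not state explicitly; the function name and the sample volumes are hypothetical.

```python
def passes_volume_check(start_volume, end_volume, min_volume):
    """True when both periods reach the manually chosen minimal volume.

    Assumption: the check applies to Start and End separately, and a
    segment fails as soon as either period is below the threshold.
    """
    return start_volume >= min_volume and end_volume >= min_volume


# Illustrative (start, end) volumes per segment.
volumes = {"FR": (1200, 1500), "PT": (40, 900), "BE": (800, 30)}
min_volume = 100

low_volume_segments = [
    segment
    for segment, (start, end) in volumes.items()
    if not passes_volume_check(start, end, min_volume)
]
print(low_volume_segments)
```

A check like this only guards against very small samples; it is not a statistical test of the variation itself.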
Scope
The ‘Out’ segment defined in the ‘Scope’ column is excluded from the analysis and is simply stacked on the Start and End columns of the waterfall chart.
Covariance
A covariance ratio appears at the top left of the waterfall.
For waterfall analysis, covariance is distributed across the steps. Users should check that it remains reasonable (typically below 30%).
For dimension analysis, covariance is distributed to neither the mix sizing nor the performance sizing. Users should therefore be careful when reading dimension impacts.
Read more about Covariance
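To give an intuition of where a covariance term comes from, the sketch below splits the change in an overall rate into a mix effect, a performance effect, and a cross term that belongs to neither. This is a generic decomposition written under stated assumptions, not Datama's formula: the function name, the input layout, and the sample figures are hypothetical, and this page does not define how the displayed covariance ratio is computed.

```python
def decompose_rate_change(start, end):
    """Split the change in an overall rate into (mix, performance, cross).

    start / end map each segment to (volume, rate). Both periods are
    assumed to contain the same segments. With shares s and rates r:
      mix         = sum((s1 - s0) * r0)
      performance = sum(s0 * (r1 - r0))
      cross       = sum((s1 - s0) * (r1 - r0))
    """
    total_start = sum(volume for volume, _ in start.values())
    total_end = sum(volume for volume, _ in end.values())
    mix = performance = cross = 0.0
    for segment, (v0, r0) in start.items():
        v1, r1 = end[segment]
        s0, s1 = v0 / total_start, v1 / total_end
        mix += (s1 - s0) * r0
        performance += s0 * (r1 - r0)
        cross += (s1 - s0) * (r1 - r0)
    return mix, performance, cross


def overall_rate(period):
    """Volume-weighted average rate of a period."""
    total = sum(volume for volume, _ in period.values())
    return sum(volume * rate for volume, rate in period.values()) / total


# Illustrative data: conversion rate per device.
start = {"mobile": (500, 0.02), "desktop": (500, 0.04)}
end = {"mobile": (700, 0.03), "desktop": (300, 0.04)}

mix, performance, cross = decompose_rate_change(start, end)
delta = overall_rate(end) - overall_rate(start)
print(mix, performance, cross, delta)
```

The three terms add up to the total change. The cross term is non-zero only when shares and rates move together in some segment, which is why a large value makes separate mix and performance readings harder to interpret.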