Before You Test: Assumptions About Groups & Variance
This article is part of the DataTile series on Significance Testing.
Before applying statistical tests like the Z-test or T-test, it’s important to check a few foundational assumptions. Two questions matter most:
Are the groups independent?
Can we assume equal variance between them?
These distinctions determine which formulas to use and whether the results can be trusted.
Dependent vs Independent Samples
The first assumption concerns group independence:
Independent Samples: The compared groups contain different individuals, with no overlap or relationship between them. This is typical for comparisons such as Brand A vs Brand B, Millennials vs Gen Z, or new vs returning customers.
Dependent (Paired) Samples: The same individuals are measured in both groups, for example pre- vs post-campaign awareness among the same respondents. Because each respondent's two answers are linked, the test must work on the per-respondent differences rather than on two separate group summaries (e.g., a paired t-test).
In DataTile, we always apply tests designed for independent samples. However, when comparing overlapping groups (e.g., Male vs Total), we apply a correction in the Z-test to adjust for shared respondents. See our article on the Advanced Z-test with Audience Overlap Detection for details.
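To make the independent vs paired distinction concrete, here is a minimal sketch in plain Python that computes both kinds of t statistic on the same numbers. The function names and the pre/post scores are hypothetical and chosen only for illustration; this is not DataTile's implementation.

```python
import math
import statistics


def independent_t(a, b):
    """t statistic treating a and b as two independent groups (unpooled variances)."""
    se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    return (statistics.mean(a) - statistics.mean(b)) / se


def paired_t(a, b):
    """t statistic treating a[i] and b[i] as two measurements of the same respondent."""
    if len(a) != len(b):
        raise ValueError("paired samples must have the same length")
    # A paired test is a one-sample test on the per-respondent differences.
    diffs = [x - y for x, y in zip(a, b)]
    se = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.mean(diffs) / se


# Hypothetical awareness scores for the same six respondents, before and after a campaign.
pre = [5, 6, 4, 7, 5, 6]
post = [6, 7, 5, 8, 6, 8]

print("independent:", round(independent_t(post, pre), 3))
print("paired:     ", round(paired_t(post, pre), 3))
```

On this made-up data the paired statistic comes out considerably larger than the independent one, because almost every respondent moved by the same amount and the paired test removes the person-to-person variation. The reverse mistake, running a paired formula on unrelated groups, has no meaningful interpretation at all.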
Pooled vs Unpooled Variances
The second assumption concerns the variability in the compared groups:
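Pooled Variance Method: Assumes equal variances across groups and combines them into a single variance estimate. Pooling is appropriate when the groups have similar spread. To check this assumption, analysts often use Levene’s test or an F-test to compare the spread of values in each group; the F-test is sensitive to non-normal data, so Levene’s test is usually the safer check. If the assumption is violated, pooling can misstate the standard error, especially when the group sizes also differ, and the resulting p-values can be too small or too large.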
Unpooled Variance Method: Makes no assumption about equal variances. It analyzes each group’s variability separately. This approach is more robust, especially when the group sizes or spreads differ.
DataTile always uses the unpooled method, since it stays reliable when the equal-variance assumption is untested or unlikely to hold.
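The difference between the two methods comes down to how the standard error of the difference is built. The sketch below, in plain Python, shows the textbook pooled and unpooled formulas for a difference in means, plus the Welch-Satterthwaite degrees of freedom that usually accompany the unpooled version. The function names and sample values are hypothetical, and this is an illustration of the general technique rather than DataTile's actual code.

```python
import math
import statistics


def pooled_se(a, b):
    """Standard error of mean(a) - mean(b), assuming both groups share one variance."""
    n1, n2 = len(a), len(b)
    # Weighted average of the two sample variances.
    sp2 = ((n1 - 1) * statistics.variance(a) + (n2 - 1) * statistics.variance(b)) / (n1 + n2 - 2)
    return math.sqrt(sp2 * (1 / n1 + 1 / n2))


def unpooled_se(a, b):
    """Standard error of mean(a) - mean(b), with each group's variance kept separate."""
    return math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))


def welch_df(a, b):
    """Welch-Satterthwaite approximate degrees of freedom for the unpooled test."""
    v1 = statistics.variance(a) / len(a)
    v2 = statistics.variance(b) / len(b)
    return (v1 + v2) ** 2 / (v1 ** 2 / (len(a) - 1) + v2 ** 2 / (len(b) - 1))


# Hypothetical ratings: a small group with a wide spread, a large group with a narrow one.
small_group = [1, 9, 2, 10]
large_group = [5, 6] * 6

print("pooled SE:  ", round(pooled_se(small_group, large_group), 3))
print("unpooled SE:", round(unpooled_se(small_group, large_group), 3))
print("Welch df:   ", round(welch_df(small_group, large_group), 2))
```

In this example the pooled standard error is noticeably smaller than the unpooled one, because the large, low-variance group dominates the pooled estimate; a pooled test would therefore overstate significance here. When the two groups have the same size, the two standard errors coincide, and the methods differ only in their degrees of freedom.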