Standard Z-test in DataTile
This article is part of the DataTile series on Significance Testing.
· For proportion-based comparisons · Independent groups · Unpooled variance
As outlined in the “Before You Test” article, all statistical tests in DataTile are built for independent samples — we do not apply methods for paired or dependent observations.
To compare proportions (such as conversion rates or the share of respondents choosing an answer), DataTile uses a Z-test with unpooled variance, which does not assume equal variances between groups. This method is well suited to large samples and avoids relying on an equal-variance assumption that may not hold.
For transparency and learning purposes, we also show formulas for the pooled variance version of the test. However, this variant is not used in DataTile and is included solely for educational comparison.
Step-by-Step: Z-Test Algorithm
Pooled approach (not used in DataTile): combines the successes and total observations from both samples.
Unpooled approach (used in DataTile): treats each sample separately without assuming equal variances.
1. Calculate the Sample Proportions
Pooled proportion (not used in DataTile): the weighted average of the success rates in both samples. It can be calculated either from the raw counts or from the sample proportions.
Individual proportions (used in DataTile): the success rate of each group, calculated separately.
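In standard textbook notation, the proportions in this step can be written as follows. The symbols are assumptions chosen for illustration: x_1 and x_2 are the success counts, n_1 and n_2 the sample sizes, and \hat{p}_1 and \hat{p}_2 the sample proportions of the two groups.

```latex
% Pooled proportion (not used in DataTile): from raw counts, or from sample proportions
\hat{p} = \frac{x_1 + x_2}{n_1 + n_2} = \frac{n_1 \hat{p}_1 + n_2 \hat{p}_2}{n_1 + n_2}

% Individual proportions (used in DataTile)
\hat{p}_1 = \frac{x_1}{n_1}, \qquad \hat{p}_2 = \frac{x_2}{n_2}
```

The two forms of the pooled proportion are equivalent because n_1 \hat{p}_1 = x_1 and n_2 \hat{p}_2 = x_2.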
2. Calculate the Variance
Pooled variance - not used in DataTile
Unpooled variance - used in DataTile
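The textbook forms of these two variances for a difference in proportions are sketched below. The notation is an assumption for illustration: \hat{p} is the pooled proportion, \hat{p}_1 and \hat{p}_2 are the individual sample proportions, and n_1 and n_2 are the sample sizes.

```latex
% Pooled variance of the difference (not used in DataTile)
\mathrm{Var}_{\text{pooled}} = \hat{p}\,(1 - \hat{p}) \left( \frac{1}{n_1} + \frac{1}{n_2} \right)

% Unpooled variance of the difference (used in DataTile)
\mathrm{Var}_{\text{unpooled}} = \frac{\hat{p}_1 (1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2 (1 - \hat{p}_2)}{n_2}
```

The unpooled form lets each group contribute its own variability, while the pooled form replaces both with a single shared estimate.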
3. Calculate the Standard Error
The standard error (SE) is the square root of the variance.
It represents the typical deviation of a sample statistic (like a proportion or mean) from the true population value. Taking the square root brings the units back to the original scale (e.g. percentage points instead of squared percentages), making the result interpretable and usable in Z- or t-tests.
Pooled variance - not used in DataTile
Unpooled variance - used in DataTile
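As a minimal sketch of this step, the snippet below computes both standard errors from raw counts using the textbook variance formulas for a difference in proportions. The function names and the sample counts are hypothetical, chosen only for illustration, and this is not DataTile's actual implementation.

```python
import math


def unpooled_se(x1, n1, x2, n2):
    """Standard error of p1 - p2 using each group's own variance."""
    p1, p2 = x1 / n1, x2 / n2
    variance = p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2
    return math.sqrt(variance)


def pooled_se(x1, n1, x2, n2):
    """Standard error of p1 - p2 using one pooled proportion (for comparison only)."""
    p = (x1 + x2) / (n1 + n2)
    variance = p * (1 - p) * (1 / n1 + 1 / n2)
    return math.sqrt(variance)


# Hypothetical data: 120 successes out of 400 vs. 90 successes out of 450.
print(unpooled_se(120, 400, 90, 450))
print(pooled_se(120, 400, 90, 450))
```

With these illustrative counts the two results are close but not identical, which is typical when the group sizes and proportions are similar.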
4. Compute the Z-Score
The Z-score is the difference between the two sample proportions divided by the standard error of that difference.
Pooled variance - not used in DataTile
Unpooled variance - used in DataTile
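Written out in standard notation, the two Z-scores take the forms below. The symbols are assumed for illustration: \hat{p}_1 and \hat{p}_2 are the sample proportions, \hat{p} is the pooled proportion, and n_1 and n_2 are the sample sizes.

```latex
% Pooled variance (not used in DataTile)
Z_{\text{pooled}} = \frac{\hat{p}_1 - \hat{p}_2}
  {\sqrt{\hat{p}\,(1 - \hat{p}) \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}}

% Unpooled variance (used in DataTile)
Z_{\text{unpooled}} = \frac{\hat{p}_1 - \hat{p}_2}
  {\sqrt{\frac{\hat{p}_1 (1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2 (1 - \hat{p}_2)}{n_2}}}
```

For example, with hypothetical proportions of 0.30 (n = 400) and 0.20 (n = 450), the unpooled standard error is about 0.0297, giving a Z-score of about 3.37.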
5. Determine Significance
Compare the Z-score to the standard normal distribution to obtain the p-value. If the p-value is below the chosen significance level (typically 0.05), the difference in proportions is considered statistically significant, meaning a difference this large would be unlikely to arise from random sampling variation if the two groups had the same underlying proportion.
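The conversion from a Z-score to a p-value can be sketched with the Python standard library alone. This example assumes a two-sided test, which the text above does not specify, and the function names are hypothetical.

```python
import math


def two_sided_p_value(z):
    """P(|Z| >= |z|) under the standard normal distribution.

    Uses the identity 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    """
    return math.erfc(abs(z) / math.sqrt(2))


def is_significant(z, alpha=0.05):
    """True when the two-sided p-value falls below the significance level."""
    return two_sided_p_value(z) < alpha


# Hypothetical Z-scores for illustration.
for z in (1.0, 1.96, 3.37):
    print(z, two_sided_p_value(z), is_significant(z))
```

A Z-score of 1.96 sits almost exactly at the 0.05 threshold, which is why that value is commonly quoted as the two-sided cutoff.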