Advanced Z-test with Audience Overlap Detection

This article is part of the DataTile series on Significance Testing.

· Unpooled variance · Overlap-adjusted standard error · Supports weighting

Why Adjust for Audience Overlap

When comparing proportions across groups, traditional Z-tests assume the groups are completely independent — that no respondent appears in both groups. But in real-world cases, this assumption is often violated. Imagine comparing Male vs Total. Since Total includes males, the groups partially overlap, which violates the independence assumption and can distort results:

  • Group correlation: Shared respondents create dependencies between the groups

  • Underestimated standard error: The overlap reduces variability, making the difference appear more precise than it is

  • Inflated statistical significance: The p-value becomes artificially low, increasing the chance of false positives

To ensure valid results, it’s essential to adjust for audience overlap when comparing proportions. DataTile handles this automatically using a modified z-test tailored for overlapping groups.

How the Adjustment Works

In DataTile, the Advanced Test for Audience Overlap is applied only when comparing proportions (e.g., % awareness, % usage). It does not apply to comparisons of means or numeric values.

To ensure accurate and reliable comparisons, DataTile uses a modified z-test specifically designed to handle overlapping audiences. This approach replaces the assumption of full independence with a more realistic structure — one that reflects partial audience overlap and weighting. The advanced formula does the following:

  • Divides the sample into three non-overlapping subgroups:

    • respondents unique to Group 1,

    • respondents unique to Group 2,

    • overlapping respondents (present in both groups).

  • Weights each subgroup individually using the sum of squared respondent weights, ensuring accurate estimation even in complex weighted samples.

  • Applies a correction factor to the overlapping portion to adjust for correlation and imbalance between groups.

  • Combines all three components to compute an adjusted standard error, which reflects the true audience structure and avoids underestimating variability.

This method ensures that the test remains statistically robust, avoiding inflated significance and providing fair comparisons, even in complex segment structures.
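The three-way split described above can be illustrated with plain set operations. This is a minimal sketch, not DataTile's implementation: the function name `split_by_overlap` and the respondent IDs are hypothetical and chosen only for illustration.

```python
def split_by_overlap(group1_ids, group2_ids):
    """Split respondents into three mutually exclusive subsets."""
    g1, g2 = set(group1_ids), set(group2_ids)
    return {
        "only_group1": g1 - g2,  # respondents unique to Group 1
        "only_group2": g2 - g1,  # respondents unique to Group 2
        "overlap": g1 & g2,      # respondents present in both groups
    }


# Hypothetical example: "Male" vs "Total", where Total contains every male.
male = [1, 2, 3]
total = [1, 2, 3, 4, 5]
parts = split_by_overlap(male, total)
```

In this example every male respondent falls into the overlap, so the "only Group 1" subset is empty and only the non-male respondents remain unique to Total.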

Advanced Z-test for overlapping samples: formula

(variance not pooled, standard error adjusted for overlap, formula adapted for weighting)

0. Terms and definitions

Let the two groups being compared be:

  • the first group

  • the second group

Define the three mutually exclusive subsets:

  • X₀: Overlap between groups (respondents who belong to both groups)

  • Only in the first group

  • Only in the second group

Variables:

  • proportion of success in a given group or subset

  • base size of a given group or subset

  • the sum of squared weights in a given group or subset.

If weighting is disabled, or all weights are 1, then the sum of squared weights equals the regular unweighted base size — simply the number of respondents.
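The relationship between the sum of squared weights and the unweighted base can be checked directly. The sketch below uses a hypothetical helper name and made-up weights; it only illustrates the arithmetic, not how DataTile stores or applies weights.

```python
def sum_of_squared_weights(weights):
    """Return the sum of squared respondent weights for one subset."""
    return sum(w * w for w in weights)


# With all weights equal to 1, the result is simply the respondent count.
unweighted = [1.0] * 4

# Hypothetical non-uniform weights for the same four respondents.
weighted = [0.5, 1.5, 2.0, 1.0]

unweighted_total = sum_of_squared_weights(unweighted)  # equals len(unweighted)
weighted_total = sum_of_squared_weights(weighted)      # 0.25 + 2.25 + 4.0 + 1.0
```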

1. Calculate the Adjusted Standard Error

[Formula: adjusted standard error]

Decomposition of Adjusted Standard Error

The adjusted standard error formula consists of three parts, each responsible for a different portion of the sample structure:

  1. Standard variance component for overlapping group (X₀).

Adjusts for dependency between groups. If the groups partially overlap, some respondents appear in both, introducing correlation. This term subtracts the overlapping influence to ensure a fair comparison.

[Formula: component for the overlapping group (X₀)]

Importantly, we don’t simply discard the overlapping data, because that would waste valuable information. Instead, this term retains the shared audience but applies a correction factor based on the difference in group sizes and the weighted base structure.

  2. Standard variance component for respondents only in group 1, adapted for weighted data.

Reflects the contribution of respondents who are only in group 1 (i.e., not shared with group 2).

[Formula: variance component for respondents only in group 1]

  3. Standard variance component for respondents only in group 2, adapted for weighted data.

Reflects the contribution of respondents who are only in group 2 (i.e., not shared with group 1).

[Formula: variance component for respondents only in group 2]

2. Calculate the adjusted Z-score

[Formula: adjusted Z-score]

3. Determine Significance

Compare the Z-score to the standard normal distribution to obtain the p-value. If the p-value is below the chosen significance level (typically 0.05), the difference in proportions is considered statistically significant, meaning it is unlikely to have occurred by random chance alone.
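Converting a Z-score into a p-value can be sketched with the Python standard library. The example assumes a two-sided test, which the article does not state explicitly, and the function names are hypothetical; the Z-score itself is taken as given.

```python
import math


def two_sided_p_value(z):
    """Two-sided p-value for a Z-score under the standard normal distribution.

    Assumption: a two-sided test. Uses the identity
    P(|Z| > z) = erfc(|z| / sqrt(2)).
    """
    return math.erfc(abs(z) / math.sqrt(2.0))


def is_significant(z, alpha=0.05):
    """True when the p-value falls below the chosen significance level."""
    return two_sided_p_value(z) < alpha
```

A Z-score near 1.96 in absolute value sits at the familiar 0.05 threshold for a two-sided test; larger absolute values give smaller p-values.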

Standard T-test in DataTile

Standard Z-test in DataTile
