DataTile Knowledge Base

Using MATH functions in crosstabs

When working with a crosstab, you can apply mathematical functions to data in both rows and columns. Formulas use row or column labels as operands, however, there are some exceptions. In the tables below you can find detailed information on how functions are applied.

How to enable MATH for crosstabs

Go to Project settings → Settings → Math ->Tick the box provided to enable the display of the MATH window in the Crosstab. After enabling the feature, the ‘MATH’ button is displayed above the crosstab in the DVI interface.

math enable.jpg

Note that when applying mathematical operations, you can use data with different base sizes. In such cases, the base calculation will follow one of four available methods, which can be selected in the project settings:

  • AVG - as an average of bases;

  • SUM - as a sum of bases;

  • MIN - as a minimum base;

  • MAX - as a maximum base.

How to use MATH in crosstabs

Use the MATH button above the crosstab to open the formula editor.

math popup.jpg
  • First, choose whether the calculation result should appear in a row or a column. Depending on this, you can use variables from rows or columns for your calculations.

  • Press Ctrl+Space / Cmd+Space to open the prompts.

  • Enter your expression in the MATH window using formulas and arithmetic operations.

  • Click Add to insert the result into the selected row or column.

Syntax & Formulas Usage

A range of functions represent metrics that previously had to be deliberately added to a crosstab. Now you can get the value of a metric by a function, then apply a mathematical operation to it.

These metric functions are:

Syntax

Result

MEAN('header label')

returns the mean value based on the digit value of the selected variable

BASE('header label')

UBASE('header label')

BASE and UBASE (unweighted base) return the value of the selected variable depending on the dimension: applied to rows show the horizontal base; applied to columns show the vertical base.

HBS('header label')

UHB('header label')


VBS('header label')

UVB ('header label')

returns the Horizontal Base value of the selected variable;

returns the Unweighted Horizontal Base value of the selected variable;

returns the Vertical Base value of the selected variable;

returns the Unweighted Vertical Base value of the selected variable;

Stat.Base

Unweight.Base

returns the statistical base value;

returns the common unweighted base value

CNT('header label')

returns the count of the selected variable

HPT('header label')

returns the horizontal percent of the selected variable

INDEX_U('header label')

INDEX_F('header label')

returns the value of the unfiltered Affinity index

returns the value of the filtered Affinity index

VPT('header label')

returns the vertical percent of the selected variable

VOL('header label')

returns the volume value of the selected variable

Other functions give you the following values:

Syntax

Result

ABS('header label')

ABS('formulaic expression')

returns the absolute value of the chosen variable or the expression

AVG('header label')

AVG('formulaic expression')

returns the average value of the chosen variable or the expression

AVG.INC('header label')

AVG.INC('formulaic expression')

returns the average value of the chosen variable or the expression including SYSMIS

AVG.EXC('header label')

AVG.EXC('formulaic expression')

returns the average value of the chosen variable or the expression excluding SYSMIS

AVG.FULL

return average value for the row/column, accounting for the missing values

AVG.P('header label')

AVG.P('formulaic expression')

means AVG Post and returns the average value of the chosen variable or the expression after masking was applied

Weight.Avg('header label')

Weight.Avg('formulaic expression')

returns the average value from a weighted base

CELL('header label 1','header label 2')

returns the value of a particular cell by specifying ‘column, row’ OR ‘row, column’ respectively, using commas

CORREL('header label')

returns the correlation coefficient of two cell ranges. See the example below

COUNT

returns the number of rows or columns with variable values in the table; series with MATH or logical expressions are excluded from counting

COUNT.FULL

returns the number of rows or columns with variable values, accounting for the missing values

CUMULATE('header label')

returns cumulative totals for rows/columns

LINEST('header label')

takes a row/column as an input and produces a row/column with the linear approximation. It returns a line that fits the provided points.

MAX('header label a', 'header label b', ... 'header label x')

returns the largest value in a set of values

MAX.Label

returns the label of the largest value in a row/column

MEDIAN

returns the median value of all series

MIN('header label a', 'header label b', ... 'header label x')

returns the smallest value in a set of values

NORMALIZE ('header label')

NORMALIZE ('FIRST')

NORMALIZE ('LAST')

returns the share of a value within a row or column relative to the total of that row or column.

It can be applied to a specific row or column using its label. You can also use the first and last operators, which apply the function to the first or last row/column, regardless of the data they contain.

PERCENTRANK('header label')

returns the percentage rank of a value in a row/column

POW('formulaic expression, extent) or

POWER('formulaic expression, extent)

exponentiation


For example, in the formula POWER(brand-1, 3) the values for the brand-1 will be raised to the third power (cubed)

RANK('header label')

returns the rank of a value in a row/column. See the example below

RANK P.('header label')

returns the perpendicular rank of the value: the rank within the row will be displayed in the column and vice versa. See the example below

REF ('FIRST')

REF ('LAST')

the REF function is designed to reference a row or column without specifying a label. Currently, you can use First or Last as operators for this function in DataTile.

ROUND('header label','decimal')

returns a value in a row/column rounded to an indicated decimal point

SUM('header label a', 'header label b', ... 'header label x')

returns the value of the sum of the given set of variables

CORREL calculations

Unlike the CORREL metric, the mathematical formula for calculating the correlation coefficient is applied not to a single cell, but to two ranges of cells. Here is an image with an example of how CORREL is calculated for the range of Brand-1 (Range 1) and other brands (Range and Range 3).

correl example.png

RANK and RANK P. calculations

Here, the ranking calculation range for brand 3 is highlighted in green, and the perpendicular ranking calculation range for brand 3 is highlighted in yellow. Thus, in the case of regular rank, the rank in the column is calculated. In the case of perpendicular, the rank within the row will be displayed in the column.

rank p & rank comparison.png