Calculating data with empty cells/missing values

How to set up the calculation of sysmis values


Many surveys contain missing values, also known as 'empty cells': cells in the data set that lack a value. This often arises when a respondent wasn't asked a specific question.

How do empty cells occur? Some questionnaires use “re-routing” so that each respondent is asked only the questions relevant to them. For instance, in a survey about overseas holidays this summer, detailed follow-up questions about travel mode or destination are usually put only to those who said they travelled abroad. Respondents who didn't go on an international holiday will have missing values for these questions, because they were never asked them.

There are two ways to make calculations for variables with missing values:

Ignore empty cells in calculation

You can calculate metrics without taking into account empty cells. In this case, any % will be calculated only from the respondents who were asked the question, thus ignoring those with missing values.

Below is a scenario using the holiday abroad example:

  • Total sample: 1000

  • People who said yes to holiday abroad: 500

  • People who travelled by air: 300

To calculate the percentage of people who travelled by air (excluding empty cells), divide by the number of people who were asked the question: 300/500*100 = 60%.
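The calculation above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not how DataTile works internally; the `travel_mode` list, the use of `None` for an empty cell, and the function name are all assumptions made for the example.

```python
# Hypothetical data for the holiday example. None marks an empty cell,
# i.e. a respondent who was never asked the travel-mode question.
travel_mode = ["air"] * 300 + ["other"] * 200 + [None] * 500


def percent_ignoring_missing(values, target):
    """Percentage of `target` among respondents who have a value."""
    answered = [v for v in values if v is not None]
    if not answered:
        return 0.0  # nobody was asked, so there is no base to divide by
    return sum(v == target for v in answered) * 100 / len(answered)


print(percent_ignoring_missing(travel_mode, "air"))  # 60.0
```

The base here is 500 (the respondents who went abroad), not the full sample of 1000.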

Include empty cells in calculation

Alternatively, you could calculate metrics based on all cells (including empty cells). In this case any % will be calculated from all respondents on the survey, regardless of whether they were asked the question.

Here is the same holiday survey example displaying this calculation:

  • Total sample: 1000

  • People who said yes to holiday abroad: 500

  • People who travelled by air: 300

To calculate the percentage of people who travelled by air (including empty cells), divide by the total sample: 300/1000*100 = 30%.
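As a rough Python sketch of this second approach, the only change is the base: every respondent counts, whether or not they have a value. The data layout (`None` for an empty cell) and the function name are assumptions for illustration, not part of DataTile.

```python
# Hypothetical data for the holiday example. None marks an empty cell,
# i.e. a respondent who was never asked the travel-mode question.
travel_mode = ["air"] * 300 + ["other"] * 200 + [None] * 500


def percent_including_missing(values, target):
    """Percentage of `target` among all respondents, empty cells included."""
    if not values:
        return 0.0  # empty sample, nothing to divide by
    return sum(v == target for v in values) * 100 / len(values)


print(percent_including_missing(travel_mode, "air"))  # 30.0
```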

As the two examples show, the choice of base produces very different figures (60% versus 30%), which can lead to different decisions. In most cases it makes sense to exclude empty cells and calculate the % only from those who were asked the question.

How to set up empty cells calculation in DataTile

The way you handle missing values in a survey can significantly affect your analysis, including percentages, significance levels, and indexes. When setting up a survey in DataTile, you can choose how to deal with these empty or missing responses in the project settings, with the default option detailed below.

By default, DataTile's project settings are configured to ignore sysmis responses, as outlined in the 'ignore empty cells' calculation method. This means that for each calculation, only responses from participants who answered the question are considered. Respondents with missing values (due to not answering the question) are not included in these calculations. This is the standard setup for all variables in the project settings.

‘Ignore sysmis’ is the default option in DataTile

If you want to include missing values in calculations, untick the ‘Ignore sysmis’ option.


You can also customize this for a particular variable using the Meta-Editor. Here you can select either an ‘ignore missing values’ option or a ‘count missing values’ option for each variable, as shown below. The software will exclude or include missing values in the calculations for that variable accordingly. Note that this will change the calculation of percentages, significance, indexes, etc.

How to set up sysmis calculation for a particular variable

Submenu for choosing the strategy
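To make the interaction between the project-wide default and a per-variable setting concrete, here is a minimal Python sketch. All names in it (`PROJECT_DEFAULT`, `variable_strategy`, the `"ignore"` and `"count"` labels, and the `destination` variable) are hypothetical and chosen for the example; they are not DataTile settings or API calls.

```python
# Hypothetical names throughout; this is not DataTile's API.
PROJECT_DEFAULT = "ignore"  # mirrors the default of ignoring empty cells

# Per-variable overrides, similar in spirit to a per-variable setting.
variable_strategy = {"travel_mode": "count"}

data = {
    "travel_mode": ["air"] * 300 + ["other"] * 200 + [None] * 500,
    "destination": ["europe"] * 250 + ["elsewhere"] * 250 + [None] * 500,
}


def base_size(name, values):
    """Base for a variable: the override wins, else the project default."""
    strategy = variable_strategy.get(name, PROJECT_DEFAULT)
    if strategy == "count":
        return len(values)  # empty cells are part of the base
    return sum(v is not None for v in values)  # empty cells are dropped


def percent(name, target):
    values = data[name]
    base = base_size(name, values)
    return sum(v == target for v in values) * 100 / base if base else 0.0


print(percent("travel_mode", "air"))     # 30.0, override counts empty cells
print(percent("destination", "europe"))  # 50.0, default ignores empty cells
```

Only `travel_mode` has an override, so its base is the full sample of 1000, while `destination` falls back to the project default and uses the 500 respondents who have a value.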

There is also a feature called rebasing, which can be applied on the fly when building a report. It lets you filter the calculations for any column or row by a particular variable, e.g. to use a different audience as the base.
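One way to picture rebasing is as restricting the base to respondents who match a condition on another variable before the percentage is calculated. The Python sketch below works under that assumption; the `age_group` variable, the respondent counts, and the function names are invented for the example and do not describe DataTile's implementation.

```python
def make(n, age_group, mode):
    """Build n hypothetical respondents with the same answers."""
    return [{"age_group": age_group, "travel_mode": mode} for _ in range(n)]


# Hypothetical sample of 1000; None means the question was not asked.
respondents = (
    make(150, "18-34", "air") + make(50, "18-34", "other")
    + make(150, "35+", "air") + make(150, "35+", "other")
    + make(200, "18-34", None) + make(300, "35+", None)
)


def rebased_percent(rows, variable, target, base_filter):
    """Percentage of `target`, with the base limited by `base_filter`.

    Empty cells are ignored, matching the default described above.
    """
    base = [r for r in rows if base_filter(r) and r[variable] is not None]
    if not base:
        return 0.0
    return sum(r[variable] == target for r in base) * 100 / len(base)


young = rebased_percent(
    respondents, "travel_mode", "air", lambda r: r["age_group"] == "18-34"
)
older = rebased_percent(
    respondents, "travel_mode", "air", lambda r: r["age_group"] == "35+"
)
print(young, older)  # 75.0 50.0
```

The same 300 air travellers give 60% on the unfiltered base of 500, but 75% and 50% once the base is switched to each age group.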
