All Collections
Data Processing
Calculating data with empty cells (sysmis)
Calculating data with empty cells (sysmis)
Updated over a week ago

Many surveys contain some missing values, these are also referred to as ‘empty cells’ i.e. a cell without any value. This is normally seen in survey data, when a respondent has not been asked a question.

Why is there sometimes empty cells?Questionnaires often have “re-routing” of questions in order that only relevant questions are asked of a respondent. For example, in a survey with questions asking about respondents' holidays abroad this summer, the detailed questions related to these such as about method of travel, or destination are generally only asked to the relevant respondents i.e. those who said yes to going abroad this summer. Those who said that they didn’t go on holiday abroad will therefore have missing values because they were not asked the following questions.

There are two ways that you can calculate % for variables with missing values:

Ignore empty cells in calculation

You can calculate metrics without taking into account empty cells. In this case, any % will be calculated only from the respondents who were asked the question, thus ignoring those with missing values.

Below is a scenario using the holiday's abroad example:

  • Total sample: 1000

  • People who said yes to holiday abroad: 500

  • People who travelled by air: 300

In order to calculate people who travelled by air (without including the empty cells) the calculation is 300/500*100 = 60%.

Do not ignore empty cells in calculation

Alternatively, you could calculate metrics based on all cells (including empty cells). In this case any % will be calculated from all respondents on the survey, regardless of whether they were asked the question.

Here is the same holiday survey example displaying this calculation:

  • Total sample: 1000

  • People who said yes to holiday abroad: 500

  • People who travelled by air: 300

In order to calculate people who travelled by air (including empty cells) the calculation is 300/1000*100 = 30%

You will see from above, there is a large difference between the two % shown above when making decisions. It makes sense and improves accuracy to calculate the % only from those who were asked the question by excluding empty cells.

Defining the empty cells calculation in DataTile

Depending on which of these you use, the total number of respondents in the selected variables will change. Therefore percentages, significance, indexes etc. will all change.

When a survey is loaded, the creator can specify how calculations should be carried out for empty or missing values. This is done in the project settings. The default setting is described below.

In DataTile project settings you have options for specifying how to calculate the data with missing values:-

  1. The default setting ignores blank values as described in the ignore empty cells calculation description above. DataTile sets an ‘ignore sysmis’ option for all variables and therefore each calculation is made only from those who answered the question. As a result the software does not count people with blank values (i.e. blank because the people didn’t answer the question) when running calculations. This is how it appears in the project settings (by default).

  1. The default shown above “ignore sysmis/empty/null values for base calculation” applies to all variables (questions). Ignore sysmis can also be applied to individual questions if necessary. This is done in the meta-editor on the specific question as shown below.

  2. If ignore sysmis is unticked as below, the program will calculate % from all of the respondents in the survey regardless of whether they answered a particular question, therefore counting the data from the entire database. The calculation is described in the do not ignore empty cells description above.

4. There is also a functionality called rebasing, that can be applied on the fly when building a report. This allows you to filter any calculations for any column or row through a particular variable e.g. a different audience. This is called rebasing.

Note! Not all variables have empty values. It depends on the question methodology of the specific study.

Did this answer your question?