Preparing SPSS files: Do’s and Don’ts

Tips for loading SPSS files properly

Polite notice from DataTile:
Please supply the questionnaire together with the data file. It saves our experts time and helps avoid misinterpreting variables in the data.

1. UTF-8 is the preferred encoding for labels and open-ended questions. Note that third-party libraries for generating SPSS files may not declare the text encoding in the file itself.

If no encoding is declared, we assume UTF-8. If you use another encoding, please make sure it is explicitly declared in the SPSS file before sending it to us.
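As a rough pre-flight check for the encoding rule above, the sketch below tests whether raw label bytes decode cleanly as UTF-8. The function name `non_utf8_labels` and the sample dictionary are hypothetical; how you extract raw label bytes depends on the tool that produced your file, and this is not DataTile's loader.

```python
from __future__ import annotations


def non_utf8_labels(labels: dict[str, bytes]) -> list[str]:
    """Return the names of variables whose raw label bytes are not valid UTF-8."""
    bad = []
    for name, raw in labels.items():
        try:
            raw.decode("utf-8")
        except UnicodeDecodeError:
            bad.append(name)
    return bad


# Hypothetical raw labels, as they might be extracted from a data file.
raw_labels = {
    "gender": "Gender".encode("utf-8"),
    "city": "Город".encode("utf-8"),
    "region": "Регион".encode("cp1251"),  # legacy Cyrillic encoding, not UTF-8
}

print(non_utf8_labels(raw_labels))  # ['region']
```

A label flagged here is stored in another encoding, so that encoding would need to be declared in the file or the text converted to UTF-8 first.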

2. Variable names may contain only:

  • Latin letters

  • Numbers [0-9]

  • Symbols [ _ . @ - $ ]

They must not contain:

  • Any kind of spaces

  • Letters from non-Latin scripts, such as Cyrillic, Chinese characters, Arabic, Hebrew, etc.

Preferably, start the name with a letter rather than a number or symbol.
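The naming rules above can be checked before upload with a short script. This is a minimal sketch: the function name `check_name` and the wording of its messages are hypothetical, and it treats "start with a letter" as a recommendation, as the text does.

```python
from __future__ import annotations

import re

# Allowed characters per the rules above: Latin letters, digits, and _ . @ - $
ALLOWED_CHAR = re.compile(r"[A-Za-z0-9_.@$-]")


def check_name(name: str) -> list[str]:
    """Return a list of problems found in a variable name (empty if none)."""
    if not name:
        return ["name is empty"]
    problems = []
    if any(ch.isspace() for ch in name):
        problems.append("contains whitespace")
    # Collect every non-space character that is outside the allowed set.
    bad = sorted({ch for ch in name
                  if not ch.isspace() and not ALLOWED_CHAR.fullmatch(ch)})
    if bad:
        problems.append("disallowed characters: " + "".join(bad))
    if not re.match(r"[A-Za-z]", name):
        problems.append("should start with a Latin letter")
    return problems


for candidate in ["Q8_1", "age group", "1st_wave", "пол"]:
    print(candidate, "->", check_name(candidate) or "OK")
```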

3. Automatic tracking requires consistent variable names. Each variable should keep the same name from one wave to the next. Names like 'gender' (in the 1st wave) and 'sex' (in the 2nd wave) will be treated as two different variables.

4. Names are not case-sensitive. For example, ‘Gender’ and ‘gender’ are considered the same variable.
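To illustrate points 3 and 4, the sketch below compares the variable names of two waves case-insensitively and also reports names within one file that differ only by case. The helper names `compare_waves` and `case_collisions` and the sample lists are hypothetical.

```python
from __future__ import annotations

from collections import defaultdict


def compare_waves(wave1: list[str], wave2: list[str]) -> tuple[list[str], list[str]]:
    """Return (names only in wave 1, names only in wave 2), ignoring case."""
    a = {name.lower() for name in wave1}
    b = {name.lower() for name in wave2}
    return sorted(a - b), sorted(b - a)


def case_collisions(names: list[str]) -> list[list[str]]:
    """Return groups of names that differ only by letter case."""
    groups = defaultdict(list)
    for name in names:
        groups[name.lower()].append(name)
    return [group for group in groups.values() if len(group) > 1]


wave1 = ["Gender", "age", "Q8_1"]
wave2 = ["sex", "AGE", "Q8_1"]

# 'age' and 'AGE' match; 'gender' and 'sex' would be tracked as different variables.
print(compare_waves(wave1, wave2))  # (['gender'], ['sex'])
print(case_collisions(["Gender", "gender", "age"]))  # [['Gender', 'gender']]
```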

5. Categorical variables must be labeled. Values without labels are ignored when loading to DataTile and are treated as SYSMIS. Variables with no value labels at all are read as numerical.
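A quick way to find values that would be affected by this rule is to compare the codes present in the data with the labeled codes. The sketch below assumes the values and value labels have already been read into plain Python objects; the name `unlabeled_values` and the sample data are hypothetical.

```python
from __future__ import annotations


def unlabeled_values(values: list, value_labels: dict) -> list:
    """Return the distinct non-missing values that have no value label."""
    return sorted({v for v in values if v is not None and v not in value_labels})


# Hypothetical categorical variable: code 3 appears in the data but has no label.
gender_values = [1, 2, 2, 3, None, 1]
gender_labels = {1: "Male", 2: "Female"}

print(unlabeled_values(gender_values, gender_labels))  # [3]
```

Any code reported here either needs a label or, per the rule above, will be treated as SYSMIS after loading.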

6. We advise giving variables names that clearly refer to the corresponding questions in the questionnaire. This makes it easier to match variables to questions.

Naming conventions for automated processing

7. We recommend using simple patterns in the names of multi-response questions, for example, Q8_1, Q8_2, Q8_3, etc. If the question has an ‘Other’ option with an open-ended answer, it is preferable to name it Q8_99 or Q8_999 to simplify data processing.

8. Categorical variables corresponding to answers from the ‘Other’ category should be named like Q8_99_OE, where OE is short for ‘Open-ended’. This makes it easier to pick out the open-ended part of the ‘Other’ category and keep it separate from closed questions.
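The sketch below shows how names that follow the patterns in points 7 and 8 can be grouped automatically. The regular expression and the function `group_multi_response` are illustrative assumptions about how such names might be parsed, not DataTile's actual processing logic.

```python
from __future__ import annotations

import re
from collections import defaultdict

# Matches names like Q8_1, Q8_99 and Q8_99_OE (question stem, option code, optional _OE).
PATTERN = re.compile(r"(?P<question>[A-Za-z]+\d+)_(?P<code>\d+)(?P<oe>_OE)?")


def group_multi_response(names: list[str]) -> dict[str, dict[str, list[str]]]:
    """Group variable names by question stem, separating open-ended variables."""
    groups = defaultdict(lambda: {"options": [], "open_ended": []})
    for name in names:
        match = PATTERN.fullmatch(name)
        if match is None:
            continue  # not part of a multi-response pattern
        key = "open_ended" if match.group("oe") else "options"
        groups[match.group("question")][key].append(name)
    return dict(groups)


names = ["Q8_1", "Q8_2", "Q8_3", "Q8_99", "Q8_99_OE", "gender"]
print(group_multi_response(names))
```

With consistent names, the closed options (Q8_1 to Q8_99) and the open-ended text (Q8_99_OE) fall into separate lists without any manual mapping.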
