
Preparing SPSS files: do’s and don’ts

When providing data to the DataTile service team, please always supply the questionnaire with the data file. This helps our experts interpret variables correctly and saves time.

Naming conventions

File Encoding

  • Preferred encoding: UTF-8 for all labels and open-ended text.

  • Some third-party tools may not declare encoding explicitly. In such cases, UTF-8 will be assumed.

  • If you use another encoding, ensure it is declared in the SPSS file so DataTile can read it correctly.

  • Refer to the list of supported encodings if needed.

Variable Naming Rules

  • Must start with a Latin letter.

  • May include:

    • Latin letters (A–Z, a–z)

    • Numbers (0–9)

    • Symbols: _ . @ - $

  • Must not contain:

    • Spaces

    • Non-Latin characters (e.g., Cyrillic, Chinese, Arabic, Hebrew)

  • Names are not case-sensitive: DataTile treats "Gender" and "gender" as the same variable.

  • Prefer meaningful names linked to the questionnaire for easier cross-reference.
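The naming rules above can be checked before a file is delivered. The following is a minimal sketch, not part of DataTile: the function name `check_variable_names` and the regular expression are assumptions derived only from the rules listed in this section.

```python
import re

# Assumed pattern built from the rules above: a Latin letter first,
# then Latin letters, digits, or any of the symbols _ . @ - $
VALID_NAME = re.compile(r"[A-Za-z][A-Za-z0-9_.@$-]*")


def check_variable_names(names):
    """Return a list of (name, problem) pairs; the list is empty if all names pass."""
    problems = []
    seen = {}  # lower-cased name -> first spelling encountered
    for name in names:
        if not VALID_NAME.fullmatch(name):
            problems.append((name, "invalid first character or disallowed character"))
        key = name.lower()
        if key in seen:
            # Names are not case-sensitive, so these two would collide
            problems.append((name, f"same as {seen[key]!r} when case is ignored"))
        else:
            seen[key] = name
    return problems


if __name__ == "__main__":
    for name, problem in check_variable_names(["Gender", "gender", "Q8_1", "1abc", "my var"]):
        print(f"{name}: {problem}")
```

The check for case-insensitive duplicates is included because two names that differ only in case would be read as one variable.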

Labels

  • Variable labels - if a label is not provided, DataTile will use the variable name as a label.

  • Value labels - explicitly declare value labels for categorical variables to ensure DataTile recognises them correctly. DataTile ignores undeclared categories, treating them as missing values (SYSMIS).
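Because values without a declared label are treated as missing, it can help to list them before delivery. The sketch below assumes the observed values and the declared value labels have already been read into plain Python objects; the helper name `undeclared_values` is hypothetical.

```python
def undeclared_values(observed, value_labels):
    """Return the sorted values that occur in the data but have no value label.

    observed     -- iterable of raw values for one categorical variable
    value_labels -- dict mapping each declared value to its label
    """
    # None stands for an already-missing cell and is skipped
    return sorted(v for v in set(observed) if v is not None and v not in value_labels)


if __name__ == "__main__":
    gender_values = [1, 2, 2, 3, None, 1]
    gender_labels = {1: "Male", 2: "Female"}
    print(undeclared_values(gender_values, gender_labels))
```

Any value this reports needs a value label in the SPSS file if it is meant to appear as a category.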

Multi-Response and Matrix Questions

  • Use simple, consistent patterns such as Q8_1, Q8_2, Q8_3, etc.

  • Use _99 or _999 for “Other” options (e.g., Q8_99).

  • Add the suffix _OE or _T to the names of open-ended “Other” variables to show that they belong to the set of categorical variables. For example, the text variable Q8_99_OE is clearly related to the categorical variables Q8_*.
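To illustrate why a consistent pattern matters, the sketch below groups variable names such as Q8_1, Q8_99 and Q8_99_OE by question. It is only an illustration of the convention described above, not DataTile's matching logic; the regular expression and the function name `group_multi_response` are assumptions.

```python
import re
from collections import defaultdict

# Assumed pattern: <question>_<numeric code>, optionally ending in _OE or _T
ITEM = re.compile(
    r"(?P<question>[A-Za-z][A-Za-z0-9]*)_(?P<code>\d+)(?P<text>_(?:OE|T))?",
    re.IGNORECASE,
)


def group_multi_response(names):
    """Group variable names by question into items, 'Other' options and open-ended text."""
    groups = defaultdict(lambda: {"items": [], "other": [], "open_ended": []})
    for name in names:
        match = ITEM.fullmatch(name)
        if match is None:
            continue  # not part of a multi-response or matrix set
        group = groups[match["question"]]
        if match["text"]:
            group["open_ended"].append(name)
        elif match["code"] in ("99", "999"):
            group["other"].append(name)
        else:
            group["items"].append(name)
    return dict(groups)


if __name__ == "__main__":
    print(group_multi_response(["Q8_1", "Q8_2", "Q8_99", "Q8_99_OE", "gender"]))
```

A name that does not follow the pattern, such as gender here, is simply left out of every group.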

Upholding Automated Matching for Tracking Studies

DataTile automatically matches and tracks data across waves or benchmarks, assuming variable names and category values are consistent.

  • Keep variable names consistent across the waves and datasets you intend to combine in analysis or reporting. For example, use gender in every wave, not sex in one and gender in another.

DataTile can also integrate inconsistent datasets by means of an explicit map for variables and labels. For more information, contact our service team.

  • Do not reuse variable names for new questions, even if the old question has been removed from the survey. Reusing a name may produce erroneous trends.

  • Do not reuse category values - if a value (e.g., 1=Tomahawk) was used in a previous wave, don’t assign 1 to a new category (e.g., 1=Banana).

Using consistent, recognizable naming patterns for related variables enables DataTile’s automated processing to align your data correctly and efficiently.
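The tracking rules above can be checked between two waves before delivery. The following is a minimal sketch under stated assumptions: each wave is represented as a plain dictionary of the form {variable name: {value: label}}, and the function name `compare_waves` is hypothetical. It does not reflect how DataTile performs its own matching.

```python
def compare_waves(previous, current):
    """Compare two {variable: {value: label}} mappings.

    Returns a tuple (missing, relabelled):
      missing    -- variables present in the previous wave but not in the current one
      relabelled -- (variable, value, old_label, new_label) where a value changed label
    """
    # Variable names are not case-sensitive, so compare them lower-cased
    prev = {name.lower(): labels for name, labels in previous.items()}
    curr = {name.lower(): labels for name, labels in current.items()}

    missing = sorted(set(prev) - set(curr))
    relabelled = []
    for name in sorted(set(prev) & set(curr)):
        for value, old_label in prev[name].items():
            new_label = curr[name].get(value)
            if new_label is not None and new_label != old_label:
                relabelled.append((name, value, old_label, new_label))
    return missing, relabelled


if __name__ == "__main__":
    wave1 = {"gender": {1: "Male", 2: "Female"}, "Q5": {1: "Tomahawk", 2: "Apple"}}
    wave2 = {"sex": {1: "Male", 2: "Female"}, "Q5": {1: "Banana", 2: "Apple"}}
    missing, relabelled = compare_waves(wave1, wave2)
    print("Missing in current wave:", missing)
    print("Values with a changed label:", relabelled)
```

A changed label is not always an error, since wording may be corrected between waves, so each reported item needs a manual review against the questionnaire.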
