DataTile Knowledge Base

Database loading properties in the codebook

When uploading a codebook, you can also define database parameters, including settings that control how empty or missing values are handled. These parameters are declared on the ‘Properties’ sheet of the codebook file; they typically look as follows:

[Image: codebook properties.png]

We recommend always using codebook templates to avoid manually configuring these properties.

ignore_string_vars

Defines whether text (string) variables are ignored during loading. Ignoring them helps prevent the database from being overloaded with auxiliary text variables. If you do not need text variables for analysis, we recommend enabling this option.

  • TRUE - ignore

  • FALSE - do not ignore

charset

Defines the file encoding. The default value is UTF-8.

csv_delimiter

Defines the delimiter used for CSV files. The default delimiter is a comma.
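
To illustrate what these two parameters describe, here is a minimal Python sketch of decoding a CSV file with a given encoding and delimiter. It is not DataTile's loader; the function name read_rows and the sample data are hypothetical, and only the defaults (UTF-8 and a comma) come from the text above.

```python
import csv
import io

def read_rows(raw_bytes, charset="utf-8", csv_delimiter=","):
    """Decode raw CSV bytes with `charset` and split rows on `csv_delimiter`."""
    text = raw_bytes.decode(charset)
    reader = csv.reader(io.StringIO(text), delimiter=csv_delimiter)
    return list(reader)

# Hypothetical sample: a semicolon-delimited file saved as Latin-1.
raw = "id;city\n1;Zürich\n".encode("latin-1")
rows = read_rows(raw, charset="latin-1", csv_delimiter=";")
```

If the declared charset or delimiter does not match the file, the same call would either fail to decode or return each line as a single unsplit column, which is why these values should match how the file was actually saved.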

ignore_empty_cats

Defines whether categories with no responses are ignored, that is, hidden from their variables after upload.

Example: a 10-point scale contains no responses with a value of 10. If this option is enabled, only categories 1–9 will be visible after upload, while 10 will be hidden.

  • TRUE - ignore

  • FALSE - do not ignore
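
The 10-point scale example above can be sketched in Python as follows. This is only an illustration of the described behavior under simplifying assumptions; the function visible_categories and the sample responses are hypothetical and do not reflect DataTile internals.

```python
from collections import Counter

def visible_categories(declared, responses, ignore_empty_cats):
    """Return the categories that remain visible after upload."""
    counts = Counter(responses)
    if ignore_empty_cats:
        # Keep only categories that received at least one response.
        return [c for c in declared if counts[c] > 0]
    return list(declared)

scale = list(range(1, 11))                   # declared categories 1..10
responses = [1, 3, 3, 7, 9, 2, 4, 5, 6, 8]   # nobody answered 10

hidden_empty = visible_categories(scale, responses, ignore_empty_cats=True)
keep_all = visible_categories(scale, responses, ignore_empty_cats=False)
```

With the option enabled the result contains categories 1 to 9; with it disabled all ten categories are kept.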

ignore_missing_values

Defines the default strategy for handling missing values.

  • TRUE - ignore

  • FALSE - do not ignore

ignore_undeclared_cats

Defines how categories that are not declared in the codebook should be handled.

  • TRUE - ignore

  • FALSE - do not ignore

ignore_undeclared_vars

Defines how variables that are not declared in the codebook should be handled. As with undeclared categories, consider how this setting behaves in subsequent data uploads, both with and without a codebook.

  • TRUE - ignore

  • FALSE - do not ignore
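
A rough Python sketch of how the two "undeclared" options might act on a single record is shown here. It assumes that "ignore" means the undeclared variable or value is simply dropped, which the text above does not state explicitly; the function filter_record and the codebook structure (a dict mapping each variable to its set of declared category codes) are hypothetical.

```python
def filter_record(record, codebook, ignore_undeclared_vars, ignore_undeclared_cats):
    """Apply the two 'undeclared' options to one record (a dict of values)."""
    result = {}
    for var, value in record.items():
        if var not in codebook:
            # Variable is not declared in the codebook.
            if not ignore_undeclared_vars:
                result[var] = value
            continue
        if value not in codebook[var] and ignore_undeclared_cats:
            # Assumption: an undeclared category value is dropped.
            continue
        result[var] = value
    return result

codebook = {"gender": {1, 2}, "region": {1, 2, 3}}   # hypothetical codebook
record = {"gender": 1, "region": 9, "notes": "abc"}  # region=9 and notes are undeclared

strict = filter_record(record, codebook, True, True)
loose = filter_record(record, codebook, False, False)
```

With both options enabled only the declared gender value survives; with both disabled the record is kept unchanged.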

Since database uploads can be performed with or without a codebook and may occur in multiple stages, it is important to understand how the ignore behavior works in the following scenarios.

  • Initial upload without a codebook — all variables/categories are created.

  • Initial upload with a codebook and ignore enabled — only non-empty variables/categories are created.

  • Secondary upload without a codebook — during the diff process (comparison of variable trees), all categories are checked, including empty and non-empty ones. This approach is slower but provides full validation.

  • Secondary upload with a codebook and ignore enabled — empty categories in the uploaded volume are skipped during comparison. This significantly speeds up the process, but undeclared data inconsistencies may remain unnoticed.

date_format

The date format defines only the order of components — day (d), month (m), and year (y) — but not the syntax or separators. This is sufficient for the DataTile system to import dates, regardless of the formatting.
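
The idea of an order-only date format can be sketched in Python as follows. This is an illustration, not DataTile's parser: it assumes the format is written as a three-letter string such as "dmy", that every component is numeric, and that years have four digits. The function parse_date is hypothetical.

```python
import re
from datetime import date

def parse_date(text, date_format):
    """Parse `text` using only the component order given in `date_format`."""
    if sorted(date_format) != ["d", "m", "y"]:
        raise ValueError("date_format must contain d, m and y exactly once")
    # Separators are irrelevant: just pull out the numeric components.
    parts = [int(p) for p in re.findall(r"\d+", text)]
    if len(parts) != 3:
        raise ValueError(f"expected three numeric components in {text!r}")
    fields = dict(zip(date_format, parts))
    return date(fields["y"], fields["m"], fields["d"])

a = parse_date("31.12.2024", "dmy")
b = parse_date("2024/12/31", "ymd")
c = parse_date("12-31-2024", "mdy")
```

All three calls yield the same date even though the separators differ, because only the order of day, month, and year is taken from the format.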

external_weight_missing

This parameter is used when a separate weight file with the *.wts.xlsx extension is uploaded together with the database.

If the file is missing, the system applies the FAIL strategy by default.

When the file is present, this parameter controls how missing weight coefficients should be handled and provides two alternative strategies:

  • NULL (or SYSMIS) — assigns missing values to unmatched weight coefficients.

  • DECIMAL — assigns a numeric constant to unmatched weight coefficients.
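
The strategies above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: FAIL is taken to mean that the upload is aborted with an error, and the constant used by DECIMAL is passed as a separate, hypothetical argument because the text does not say how it is specified. The function resolve_weights and the sample IDs are also hypothetical.

```python
def resolve_weights(record_ids, weights, strategy="FAIL", constant=1.0):
    """Assign a weight to every record, handling unmatched records by strategy."""
    resolved = {}
    for rid in record_ids:
        if rid in weights:
            resolved[rid] = weights[rid]
        elif strategy in ("NULL", "SYSMIS"):
            resolved[rid] = None        # treated as a missing value
        elif strategy == "DECIMAL":
            resolved[rid] = constant    # hypothetical numeric constant
        else:
            # Assumption: FAIL aborts when a weight is missing.
            raise ValueError(f"no weight found for record {rid!r}")
    return resolved

weights = {"r1": 0.8, "r2": 1.3}        # hypothetical contents of a weight file
ids = ["r1", "r2", "r3"]                # r3 has no matching weight

as_null = resolve_weights(ids, weights, strategy="NULL")
as_decimal = resolve_weights(ids, weights, strategy="DECIMAL", constant=1.0)
```

Under these assumptions, NULL leaves the unmatched record without a weight, DECIMAL gives it the constant, and FAIL stops at the first unmatched record.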