read.spss {foreign}    R Documentation

Read an SPSS Data File

Description

read.spss reads a file stored by the SPSS save or export commands.

Usage

read.spss(file, use.value.labels = TRUE, to.data.frame = FALSE,
          max.value.labels = Inf, trim.factor.names = FALSE,
          trim_values = TRUE)

Arguments

file Character string: the name of the file to read.
use.value.labels Convert variables with value labels into R factors with those levels?
to.data.frame Logical: return a data frame rather than a list?
max.value.labels Only variables with value labels and at most this many unique values will be converted to factors if use.value.labels = TRUE.
trim.factor.names Logical: trim trailing spaces from factor levels?
trim_values Logical: should values and value labels have trailing spaces ignored when matching for use.value.labels = TRUE?

Details

This uses modified code from the PSPP project (http://www.gnu.org/software/pspp/) for reading the SPSS formats.

Occasionally in SPSS, value labels will be added to some values of a continuous variable (e.g. to distinguish different types of missing data), and you will not want these variables converted to factors. By setting max.value.labels you can specify that variables with a large number of distinct values are not converted to factors even if they have value labels. In addition, variables will not be converted to factors if there are non-missing values that have no value label. The value labels are then returned in the "value.labels" attribute of the variable.
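The conversion rule described above can be sketched in plain R. The helper would_convert below is hypothetical and is not part of foreign; it only mirrors the conditions stated in this section (value labels exist, the number of distinct non-missing values does not exceed max.value.labels, and every non-missing value has a label). The actual implementation may differ in detail.

```r
# Hypothetical helper (NOT part of 'foreign'): mirrors the rule stated above.
would_convert <- function(x, value_labels, max.value.labels = Inf) {
  observed <- unique(x[!is.na(x)])                   # distinct non-missing values
  has_labels <- length(value_labels) > 0             # variable has value labels
  few_enough <- length(observed) <= max.value.labels
  all_labelled <- all(observed %in% value_labels)    # no unlabelled values
  has_labels && few_enough && all_labelled
}

# A 1-3 scale where every observed value carries a label.
scale_labels <- c(Low = 1, Medium = 2, High = 3)
print(would_convert(c(1, 2, 3, 2, NA), scale_labels))

# A continuous variable where only the missing-data codes are labelled.
income_labels <- c(Refused = -9, "Don't know" = -8)
print(would_convert(c(25000, 31000, -9, 48000, -8), income_labels))

# The labelled scale again, but with a cap on the number of distinct values.
print(would_convert(c(1, 2, 3, 2, NA), scale_labels, max.value.labels = 2))
```

Under these assumptions only the first variable would become a factor; the other two would stay numeric.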

If SPSS variable labels are present, they are returned as the "variable.labels" attribute of the answer.
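As a rough illustration, the labels can be retrieved with attr. The object below is a hand-built stand-in for a read.spss result so that the snippet runs without an SPSS file; the variable names, the label text, and the assumption that the attribute is a character vector named by variable are all illustrative.

```r
# Stand-in for the result of read.spss(..., to.data.frame = TRUE);
# built by hand, so names and labels here are invented for illustration.
dat <- data.frame(AGE = c(34, 51), SEX = c(1, 2))
attr(dat, "variable.labels") <- c(AGE = "Age in years",
                                  SEX = "Sex of respondent")

# Retrieve all variable labels, then look one up by variable name.
var_labels <- attr(dat, "variable.labels")
print(var_labels)
print(var_labels[["AGE"]])
```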

Fixed-length strings (including value labels) are padded on the right with spaces by SPSS, and so are read that way by R. The default argument trim_values = TRUE causes trailing spaces to be ignored when matching to value labels, as examples have been seen where the strings and the value labels had different amounts of padding. See the examples for sub for ways to remove trailing spaces in character data.
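A minimal sketch of removing right-hand padding with sub. The vector padded is invented here to imitate fixed-length SPSS strings.

```r
# Invented example values imitating right-padded fixed-length strings.
padded <- c("yes   ", "no    ", "maybe ")

# Remove one or more spaces anchored at the end of each string.
trimmed <- sub(" +$", "", padded)

print(nchar(padded))   # lengths including the padding
print(nchar(trimmed))  # lengths after trimming
```

The pattern is anchored with $, so spaces inside a string are left alone.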

Value

A list (or data frame) with one component for each variable in the saved data set.

Note

If SPSS value labels are converted to factors, the underlying numerical codes will not in general be the same as the SPSS numerical values, since the numerical codes in R are always 1, 2, 3, ...
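The difference between SPSS values and R factor codes can be seen with an ordinary factor. The values and labels below are made up for illustration and do not come from any SPSS file.

```r
# Made-up SPSS-style codes 0, 5, 9 with value labels.
spss_values <- c(0, 5, 9, 5, 0)
f <- factor(spss_values,
            levels = c(0, 5, 9),
            labels = c("None", "Some", "All"))

# The internal integer codes index the levels; they are not 0, 5, 9.
print(as.integer(f))     # 1 2 3 2 1
print(as.character(f))   # the labels themselves
```

If the original numeric values are needed, one option is to read the file with use.value.labels = FALSE.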

You may see warnings about the file encoding: it is possible such files contain non-ASCII character data which need re-encoding. The most common occurrence is Windows codepage 1252, a superset of Latin-1.
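A sketch of re-encoding such character data with iconv, assuming the source encoding really is codepage 1252. The input string is constructed by hand, and the encoding name "CP1252" is assumed to be recognised by the platform's iconv.

```r
# Hand-built string: byte 0xE9 is "e with acute accent" in CP1252.
raw_text <- "caf\xe9"

# Convert from the assumed source encoding to UTF-8.
utf8_text <- iconv(raw_text, from = "CP1252", to = "UTF-8")

print(utf8_text)
print(Encoding(utf8_text))
```

For a data frame, the same call could be applied to each character column.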

You may also see warnings like ‘Unrecognized record type 7, subtype 16’. These are thought to be harmless: see http://www.nabble.com/problem-loading-SPSS-15.0-save-files-t2726500.html

Author(s)

Saikat DebRoy and the R Core team

Examples

## Not run: 
read.spss("datafile")
## don't convert value labels to factor levels
read.spss("datafile", use.value.labels = FALSE)
## convert value labels to factors for variables with at most
## ten distinct values.
read.spss("datafile", max.value.labels = 10)
## End(Not run)

[Package foreign version 0.8-23 Index]