Type: Package
Title: Import Data from Spanish Sociological Research Center (CIS)
Version: 0.1.0
Description: Search and import data directly into R from the Spanish Sociological Research Center (CIS) <https://www.cis.es/inicio>. The CIS is a public institution that conducts electoral and sociological research on Spanish society. It maintains a large database of surveys that can be accessed through its website. The package includes functions to search for surveys, survey questions and time series, and to import the data directly into R.
License: GPL (≥ 3)
Encoding: UTF-8
Language: en-US
RoxygenNote: 7.3.3
Depends: R (≥ 3.5.0)
Imports: httr (≥ 1.4.7), tibble (≥ 3.0.0), purrr (≥ 1.0.0), haven (≥ 2.5.3), magrittr (≥ 2.0.0), rvest (≥ 1.0.0), stringr (≥ 1.0.0), memoise (≥ 2.0.0)
URL: https://opencis.spainelectoralproject.com, https://github.com/hmeleiro/opencis
BugReports: https://github.com/hmeleiro/opencis/issues
Suggests: knitr, testthat (≥ 3.0.0), rmarkdown
VignetteBuilder: knitr
Config/Needs/website: hmeleiro/spainelectoraltheme
Config/testthat/edition: 3
NeedsCompilation: no
Packaged: 2026-04-29 07:47:02 UTC; hmele
Author: Héctor Meleiro [aut, cre]
Maintainer: Héctor Meleiro <hmeleiros@gmail.com>
Repository: CRAN
Date/Publication: 2026-04-29 19:10:07 UTC

opencis: Import Data from Spanish Sociological Research Center (CIS)

Description

Search and import data directly into R from the Spanish Sociological Research Center (CIS) https://www.cis.es/inicio. The CIS is a public institution that conducts electoral and sociological research on Spanish society. It maintains a large database of surveys that can be accessed through its website. The package includes functions to search for surveys, survey questions and time series, and to import the data directly into R.

Author(s)

Maintainer: Héctor Meleiro hmeleiros@gmail.com

See Also

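Useful links:

https://opencis.spainelectoralproject.com

https://github.com/hmeleiro/opencis

Report bugs at https://github.com/hmeleiro/opencis/issues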


Open the questionnaire PDF of a CIS study

Description

Opens a PDF document from a CIS study in the default browser.

Usage

browse_pdf(study_code, wanted_file = "cues")

Arguments

study_code

A string with the study code.

wanted_file

A keyword used to match the PDF filename inside the ZIP. Use "cues" (default) for the questionnaire or "ft" for the technical sheet.

Details

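CIS study ZIP files typically contain two PDF documents: the questionnaire (cuestionario), matched by the keyword "cues", and the technical sheet (ficha técnica), matched by the keyword "ft".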

Value

Called for its side effect of opening the PDF in the browser. Returns NULL invisibly.

Examples

if (interactive()) {
# Open the questionnaire (cuestionario) for study 3328
browse_pdf("3328")

# Open the technical sheet (ficha técnica) for study 3328
browse_pdf("3328", wanted_file = "ft")
}

Build CIS catalog URL with date range

Description

Constructs a URL for querying the CIS catalog with optional date range filters.

Usage

cis_catalog_url_date(
  start = 1,
  q = "",
  from = NULL,
  to = NULL,
  sort = "relevance",
  catalogo = "estudio",
  ...
)

Arguments

start

Integer. The starting page for the search results. Default is 1; increment it to retrieve further pages.

q

String. The search query. Default is an empty string.

from

Date or NULL. The start date for filtering results. Default is NULL.

to

Date or NULL. The end date for filtering results. Default is NULL.

sort

String. The sorting order for the results ("publishDate-", "publishDate+", "relevance"). Default is "relevance".

catalogo

String. The catalog type ("estudio", "pregunta", "serie"). Default is "estudio".

...

Additional parameters (not used).

Value

A string representing the constructed URL.


Clear the opencis session cache

Description

Clears the in-memory cache used by search_cis and read_cis. Call this when you want to force fresh data to be retrieved from the CIS server within the same R session.

Usage

clear_cache()

Value

NULL invisibly.
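
The idea of a session cache that can be cleared on demand can be sketched with the memoise package, which is listed under Imports. How opencis wires its cache internally is not documented here, so the code below is only an illustration under that assumption; slow_lookup() and cached_lookup() are hypothetical names and no real download takes place.

```r
library(memoise)

# Counter kept in an environment so the stand-in function can update it.
calls <- new.env()
calls$n <- 0L

# Hypothetical stand-in for a slow request to the CIS server.
slow_lookup <- function(study_code) {
  calls$n <- calls$n + 1L
  paste("data for study", study_code)
}

# Wrap the function so repeated calls with the same argument are cached.
cached_lookup <- memoise(slow_lookup)

cached_lookup("3328")  # runs slow_lookup()
cached_lookup("3328")  # answered from the in-memory cache
calls_before <- calls$n

# Drop everything cached for this function, then call it again.
forget(cached_lookup)
cached_lookup("3328")  # runs slow_lookup() again
calls_after <- calls$n
```

Comparing calls_before and calls_after shows that the underlying function only ran again after the cache was forgotten, which is the behaviour clear_cache() is described as providing for search_cis and read_cis.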


Download and read a CIS study from a given URL

Description

Download and read a CIS study from a given URL

Usage

download_file(url, destfile = tempfile(fileext = ".zip"))

Arguments

url

A string with the URL of the CIS study page.

destfile

A string with the path where the ZIP file will be saved. Defaults to a temporary file.


Download a CIS study ZIP file to disk

Description

Downloads the data ZIP file for a CIS study to a specified directory, instead of a temporary folder. Useful for projects that need to keep the raw data files.

Usage

download_study(study_code, destdir = ".")

Arguments

study_code

A string with the study code.

destdir

A string with the directory where the ZIP file will be saved. Defaults to the current working directory.

Value

The path to the saved ZIP file, invisibly.

Examples


# Save the ZIP file to a temporary directory
path <- download_study("3328", destdir = tempdir())
cat("Saved to:", path, "\n")


Find CIS study data URL in HTML content

Description

Searches for the URL of the CIS study data ZIP file within the provided HTML content.

Usage

find_url(html, ids = NULL, allow_uuid = FALSE)

Arguments

html

A character string containing the HTML content to search.

ids

An optional vector of two strings representing the numeric subroutes of the URL (e.g., c("3411", "3411") for "https://www.cis.es/documents/3411/3411/MD3411.zip"). If NULL, the function searches for any valid CIS study data URL.

allow_uuid

A boolean indicating whether to allow UUIDs in the URL. Defaults to FALSE.

Value

A character vector of unique URLs found in the HTML content.
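
As a rough illustration of this kind of search, the sketch below pulls ZIP links out of an HTML string with a regular expression in base R. The helper find_zip_urls() and its pattern are assumptions made for this example, modelled only on the URL shape quoted in the ids argument; the package's own matching rules (including the allow_uuid option) may differ.

```r
# Hypothetical helper, not the package's implementation.
find_zip_urls <- function(html, ids = NULL) {
  # Use the given numeric subroutes, or accept any pair of numeric subroutes.
  route <- if (is.null(ids)) "[0-9]+/[0-9]+" else paste(ids, collapse = "/")
  pattern <- paste0(
    "https://www\\.cis\\.es/documents/", route, "/[A-Za-z0-9_-]+\\.zip"
  )
  # Extract every match and drop duplicates.
  unique(regmatches(html, gregexpr(pattern, html))[[1]])
}

# Small HTML fragment in which the same link appears twice.
html <- paste0(
  '<a href="https://www.cis.es/documents/3411/3411/MD3411.zip">Datos</a>',
  '<a href="https://www.cis.es/documents/3411/3411/MD3411.zip">Datos</a>',
  '<a href="https://www.cis.es/inicio">Inicio</a>'
)

any_study  <- find_zip_urls(html)
this_study <- find_zip_urls(html, ids = c("3411", "3411"))
no_match   <- find_zip_urls(html, ids = c("9999", "9999"))
```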


Extract a data dictionary from a CIS study data frame

Description

Returns a tibble listing each variable in the data along with its variable label and value labels, as loaded by haven.

Usage

get_data_dictionary(data)

Arguments

data

A data.frame loaded from a CIS .sav file, typically the output of read_cis.

Value

A tibble with columns:

variable

Variable name.

label

Variable label, or NA if none.

value_labels

A named numeric vector of value labels, or NULL for unlabelled variables (list-column).

Examples

# Create a small labelled data frame
df <- data.frame(
  SEXO = haven::labelled(c(1, 2, 1), labels = c(Hombre = 1, Mujer = 2)),
  EDAD = c(34, 51, 29)
)
attr(df$SEXO, "label") <- "Sexo"
attr(df$EDAD, "label") <- "Edad"

# Inspect its variable dictionary
dict <- get_data_dictionary(df)
print(dict)

# Find variables with a specific keyword in their label
dict[grepl("sexo", dict$label, ignore.case = TRUE), ]

# Inspect value labels for a specific variable
sex_var <- match("SEXO", dict$variable)
if (!is.na(sex_var)) {
  dict$value_labels[[sex_var]]
}
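
To make the shape of the returned dictionary more concrete, here is a minimal sketch that builds a similar table from the "label" and "labels" attributes that haven attaches to imported columns. build_dictionary() is a hypothetical helper written for this example with base R only; it returns a plain data.frame rather than a tibble and is not the package's implementation.

```r
# Hypothetical helper: one row per variable, with label and value labels.
build_dictionary <- function(data) {
  labels <- vapply(data, function(col) {
    lab <- attr(col, "label", exact = TRUE)
    if (is.null(lab)) NA_character_ else as.character(lab)
  }, character(1))

  dict <- data.frame(
    variable = names(data),
    label = unname(labels),
    stringsAsFactors = FALSE
  )
  # List-column: value labels where present, NULL otherwise.
  dict$value_labels <- unname(
    lapply(data, function(col) attr(col, "labels", exact = TRUE))
  )
  dict
}

# Small data frame carrying the attributes by hand; EDAD is left unlabelled.
df <- data.frame(SEXO = c(1, 2, 1), EDAD = c(34, 51, 29))
attr(df$SEXO, "label") <- "Sexo"
attr(df$SEXO, "labels") <- c(Hombre = 1, Mujer = 2)

dict <- build_dictionary(df)
```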

Get metadata of a CIS study

Description

Retrieves the technical metadata of a CIS study from its detail page, including study dates, type, country, author, and thematic indices.

Usage

get_metadata(study_code)

Arguments

study_code

A string with the study code.

Value

A tibble with two columns: field and value.

Examples


# Get metadata for study 3328
meta <- get_metadata("3328")
print(meta)

# Access a specific field
meta$value[meta$field == "Tipo de estudio"]


Get the URL of a CIS study

Description

Retrieves the URL of a specific CIS study using its study code.

Usage

get_study_url(study_code)

Arguments

study_code

A string with the study code.

Value

A string with the URL of the study, or NULL if not found.


List file paths inside a ZIP archive

Description

Returns a data.frame with the files contained in a ZIP archive, optionally filtered by file extension.

Usage

list_file_paths(zip_file, type = NULL)

Arguments

zip_file

A string with the path to the ZIP file.

type

A string with the file extension to filter by (e.g. "sav", "pdf"). If NULL (default), all files are returned.

Value

A data.frame with the files in the ZIP archive.
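
The optional extension filter can be sketched on top of the listing that base R's utils::unzip(zip_file, list = TRUE) returns (a data.frame with Name, Length and Date columns). The helper filter_by_extension() and the file names below are made up for illustration; the columns and file names produced by the package may differ.

```r
# Hypothetical helper: keep only the files with the requested extension.
filter_by_extension <- function(files, type = NULL) {
  if (is.null(type)) {
    return(files)
  }
  keep <- tolower(tools::file_ext(files$Name)) == tolower(type)
  files[keep, , drop = FALSE]
}

# Made-up listing shaped like part of the output of utils::unzip(..., list = TRUE).
listing <- data.frame(
  Name = c("3328.sav", "cues3328.pdf", "FT3328.pdf"),
  Length = c(1000, 200, 150),
  stringsAsFactors = FALSE
)

all_files <- filter_by_extension(listing)
pdf_files <- filter_by_extension(listing, type = "pdf")
sav_files <- filter_by_extension(listing, type = "sav")
```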


Imports

Description

Package import declarations

Details

Package imports

These declarations ensure NAMESPACE keeps required imports.


Parse CIS question search results

Description

Parse CIS question search results

Usage

parse_question(resp)

Arguments

resp

The HTTP response object from the CIS search.

Value

A tibble with the parsed question information.


Parse CIS data series search results

Description

Parse CIS data series search results

Usage

parse_serie(resp)

Arguments

resp

The HTTP response object from the CIS search.

Value

A tibble with the parsed data series information.


Parse CIS study search results

Description

Parse CIS study search results

Usage

parse_study(resp)

Arguments

resp

The HTTP response object from the CIS search.

Value

A tibble with the parsed studies information.


Import a CIS study

Description

Download and import the data of a CIS study.

Usage

read_cis(study_code)

Arguments

study_code

A string with the study code.

Value

A data.frame with the study data.

Examples


# If you know the study code you can just read it into R
df <- read_cis("3328")
print(df)

# If you don't know the study code, you can search for a study using the search_cis() function:
studies <- search_cis(q = "gastronomia")
print(studies)

df <- read_cis(studies$study[1])
print(df)


Read a SAV file from a ZIP archive

Description

Extracts and reads the SPSS (.sav) data file contained in a ZIP archive.

Usage

read_sav_from_zip(zip_path)

Arguments

zip_path

A string with the path to the ZIP file.

Value

A data.frame with the data read from the .sav file.


Search all CIS results with automatic pagination

Description

Calls search_cis repeatedly, incrementing the page index until no more results are returned, and returns all results in a single tibble.

Usage

search_all_cis(
  q = "",
  from = NULL,
  to = NULL,
  sort = "relevance",
  catalogo = "estudio",
  ...
)

Arguments

q

String. The search query. Default is an empty string.

from

Date or NULL. The start date for filtering results. Default is NULL. The date format must be "YYYY-MM-DD".

to

Date or NULL. The end date for filtering results. Default is NULL. The date format must be "YYYY-MM-DD".

sort

String. The sorting order for the results ("publishDate-", "publishDate+", "relevance"). Default is "relevance".

catalogo

String. The catalog type ("estudio", "pregunta", "serie"). Default is "estudio".

...

Additional parameters passed to search_cis.

Value

A tibble with all search results across all pages.

Examples


# Retrieve all postelectoral studies (all pages)
all_studies <- search_all_cis(q = "postelectoral")
print(nrow(all_studies))

# Filter by date range
studies_2010_2020 <- search_all_cis(
  q    = "ideologia",
  from = "2010-01-01",
  to   = "2020-12-31"
)
print(studies_2010_2020)
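
The pagination described above (request page after page until an empty page comes back, then bind everything together) can be sketched without touching the network. In the code below, fetch_page() is a hypothetical stand-in for a single search_cis(start = ...) call and returns made-up rows; collect_all_pages() and its max_pages safety limit are likewise assumptions for this example.

```r
# Hypothetical stand-in for one page of search results (placeholder rows).
fetch_page <- function(start) {
  pages <- list(
    data.frame(study = c("A", "B"), stringsAsFactors = FALSE),
    data.frame(study = "C", stringsAsFactors = FALSE)
  )
  if (start > length(pages)) {
    return(pages[[1]][0, , drop = FALSE])  # empty page: no more results
  }
  pages[[start]]
}

# Keep requesting pages until one comes back empty, then bind the rows.
collect_all_pages <- function(fetch, max_pages = 100L) {
  results <- list()
  start <- 1L
  while (start <= max_pages) {
    page <- fetch(start)
    if (nrow(page) == 0L) break
    results[[length(results) + 1L]] <- page
    start <- start + 1L
  }
  do.call(rbind, results)
}

all_rows <- collect_all_pages(fetch_page)
```

The max_pages cap is a defensive choice for the sketch so that a server that never returns an empty page cannot cause an endless loop.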


Search for CIS studies

Description

Searches for CIS studies using the CIS search engine.

Usage

search_cis(
  start = 1,
  q = "",
  from = NULL,
  to = NULL,
  sort = "relevance",
  catalogo = "estudio",
  ...
)

Arguments

start

Integer. The starting page for the search results. Default is 1; increment it to retrieve further pages.

q

String. The search query. Default is an empty string.

from

Date or NULL. The start date for filtering results. Default is NULL. The date format must be "YYYY-MM-DD".

to

Date or NULL. The end date for filtering results. Default is NULL. The date format must be "YYYY-MM-DD".

sort

String. The sorting order for the results ("publishDate-", "publishDate+", "relevance"). Default is "relevance".

catalogo

String. The catalog type ("estudio", "pregunta", "serie"). Default is "estudio".

...

Additional parameters (not used).

Value

A data.frame with the search results.

Examples


# Search by search terms
studies <- search_cis(q = "postelectoral")
print(studies)

# Narrow the search by dates
studies <- search_cis(
  q    = "postelectoral",
  from = "2011-01-01",
  to   = "2020-01-01"
)
print(studies)

# Use the catalogo parameter to search for questions ("pregunta")
# or data series ("serie") instead of studies
series <- search_cis(
  q        = "ideologia",
  from     = "2011-01-01",
  to       = "2020-01-01",
  catalogo = "serie"
)
print(series)
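
Because from and to must follow the "YYYY-MM-DD" format, it can help to check date strings before sending a query. The helper is_iso_date() below is a hypothetical convenience written with base R for this example; it is not part of the package, and the package may perform its own validation.

```r
# Hypothetical helper: TRUE only for a single valid "YYYY-MM-DD" string.
is_iso_date <- function(x) {
  is.character(x) && length(x) == 1L &&
    grepl("^[0-9]{4}-[0-9]{2}-[0-9]{2}$", x) &&
    !is.na(as.Date(x, format = "%Y-%m-%d"))
}

ok_date      <- is_iso_date("2011-01-01")
wrong_layout <- is_iso_date("01/01/2011")
impossible   <- is_iso_date("2011-02-30")

# A Date object can be turned into the expected string with format().
to_string <- format(as.Date("2020-12-31"), "%Y-%m-%d")
```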