Data from a survey of consumer expectations
Since April 24, 2020, Fabian Lange and Lars Vilhuber have been conducting the survey "Uncertainty in COVID-19 times". It is a single-question survey focusing on people's expectations about social distancing rules and firm closures during the 2020 COVID-19 health crisis.
We believe that this information is not otherwise available in a reliable and timely fashion. The information should be usable by policy-makers and researchers, to be included in models of future developments of society and the economy.
Please cite the data as
Lange, Fabian and Lars Vilhuber. 2020. "Uncertainty in times of COVID-19: Raw survey data [dataset]." Available at https://labordynamicsinstitute.github.io//covid19-expectations-data (accessed 2020-07-30).
Please cite this document as
Lange, Fabian and Lars Vilhuber. 2020. "Codebook for: Uncertainty in times of COVID-19: Raw survey data." Available at https://labordynamicsinstitute.github.io//covid19-expectations-data (accessed 2020-07-30).
This document is also available in PDF format at https://labordynamicsinstitute.github.io//covid19-expectations-data/expectations-codebook.pdf.
We will be posting the data on Zenodo shortly. Data should then be cited via DOI.
Final files are uploaded after each wave is completed. Filenames in `final` are tagged with geography, language, question type, and download date:
survey-[geography]-[language]-[question]-[date].xlsx
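The naming pattern above can be split into its components programmatically. The following is a minimal sketch, not part of the published tooling: the function name `parse_final_name` and the example filename are hypothetical, and it assumes that the geography, language, and question components contain no hyphens. The date format is not stated here, so everything after the third component is treated as the date.

```python
import re

# Mirrors survey-[geography]-[language]-[question]-[date].xlsx.
# Assumption: the first three components contain no hyphen; the date is
# whatever remains before the .xlsx extension.
FINAL_NAME = re.compile(
    r"^survey-(?P<geography>[^-]+)-(?P<language>[^-]+)"
    r"-(?P<question>[^-]+)-(?P<date>.+)\.xlsx$"
)


def parse_final_name(filename):
    """Return the filename components as a dict, or None if there is no match."""
    match = FINAL_NAME.match(filename)
    return match.groupdict() if match else None


# Hypothetical filename, for illustration only.
print(parse_final_name("survey-us-en-distancing-2020-05-01.xlsx"))
```

Names that carry only a survey ID do not match the pattern and return `None`, which gives a simple way to tell the two naming schemes apart.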
We provide normalized Stata and R (`Rds`) files with all surveys, recoded consistently.
Files |
---|
expectations.csv |
expectations.dta |
expectations.Rds |
We provide additional files that were either used or generated in the process of data cleaning, in particular for the reweighting. (explanations to come)
Temporary files may be made available if a survey has not yet completed, but data are already available.
Temporary files follow the pattern:
survey-[surveyid].xlsx
Topic | Answer |
---|---|
Geographic Coverage | United States of America, Canada |
Time Periods | 2020-04-24 - 2020-06-19 |
Date of Collection | 2020-04-24 - 2020-06-19 |
Unit of Observation | Individual |
Description of Variables | User_ID, Time_UTC, Survey_Completion, Publisher_Category, Gender, Age, Geography, Weight, Question_1_Answer, rt_Q1_ms |
The survey asks about point-in-time expectations. A new wave is launched every Friday. The list provides the dates of collection for each wave. Currently, data are available covering the period between 2020-04-24 and 2020-06-19.
This field captures the answer to the sole question of each survey. Answers differ across geographic scope (`geotag`) and language. A consolidated (standardized) distribution is shown below, using the standardizer mapping.
The following tabulations are of unweighted data.
Question_1_Answer | count | percent |
---|---|---|
1-2 months | 2561 | 21.23 |
2-3 months | 2015 | 16.71 |
3-6 months | 1986 | 16.46 |
less than 1 month | 1559 | 12.92 |
more than 6 months | 2674 | 22.17 |
My province has not implemented such rules. | 1267 | 10.50 |
Question_1_Answer | count | percent |
---|---|---|
1-2 mois | 1553 | 30.66 |
2-3 mois | 1071 | 21.14 |
3-6 mois | 769 | 15.18 |
Les entreprises dans ma province ne sont pas fermées | 236 | 4.66 |
moins d'un mois | 1130 | 22.31 |
plus que 6 mois | 307 | 6.06 |
Question_1_Answer | count | percent |
---|---|---|
1-2 mois | 842 | 17.02 |
2-3 mois | 965 | 19.51 |
3-6 mois | 1334 | 26.97 |
Ma province n'a pas de telles mesures | 27 | 0.55 |
moins d'un mois | 269 | 5.44 |
plus que 6 mois | 1509 | 30.51 |
Question_1_Answer | count | percent |
---|---|---|
1-2 months | 5046 | 25.11 |
2-3 months | 1918 | 9.55 |
3-6 months | 1293 | 6.44 |
less than 1 month | 8365 | 41.63 |
more than 6 months | 1306 | 6.50 |
My state has not implemented such rules. | 2165 | 10.77 |
Question_1_Answer | count | percent |
---|---|---|
1-2 months | 4131 | 23.57 |
2-3 months | 2411 | 13.76 |
3-6 months | 2379 | 13.57 |
less than 1 month | 3894 | 22.22 |
more than 6 months | 3228 | 18.42 |
My state has not implemented such rules. | 1485 | 8.47 |
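The `percent` column in these tabulations is each answer's share of the unweighted count within its table. As a minimal sketch (the helper name `percent_shares` is hypothetical), the first tabulation can be recomputed from its counts:

```python
# Counts copied from the first unweighted tabulation above.
counts = {
    "1-2 months": 2561,
    "2-3 months": 2015,
    "3-6 months": 1986,
    "less than 1 month": 1559,
    "more than 6 months": 2674,
    "My province has not implemented such rules.": 1267,
}


def percent_shares(counts):
    """Return each answer's share of the total count, in percent (2 decimals)."""
    total = sum(counts.values())
    return {answer: round(100 * n / total, 2) for answer, n in counts.items()}


shares = percent_shares(counts)
for answer, share in shares.items():
    print(f"{answer}: {share:.2f}")
```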
The actual question asked is encoded in the `tag` variable on normalized files, and differs by geographic target (`geotag`). On the original files, the geographic target is identifiable only through the file name, and the question text is on the "Overview" tab. On the normalized files, the variables `tag` and `geotag` make it possible to map back to the actual question.
Encoded in `geotag` on normalized files, and specifies the two-letter geocode (country or postal abbreviation) as targeted on the Google Surveys platform. Note: `geotag = qc` also identifies the surveys that used the app.
geotag | count | percent |
---|---|---|
canada | 26768 | 35.56 |
ny | 2003 | 2.66 |
qc | 8890 | 11.81 |
us | 37621 | 49.97 |
Age | count | percent |
---|---|---|
18-24 | 9670 | 12.85 |
25-34 | 12348 | 16.40 |
35-44 | 10660 | 14.16 |
45-54 | 9303 | 12.36 |
55-64 | 9305 | 12.36 |
65+ | 8902 | 11.82 |
Unknown | 15094 | 20.05 |
Gender | count | percent |
---|---|---|
Female | 28419 | 37.75 |
Male | 32794 | 43.56 |
Unknown | 14069 | 18.69 |
Geography is as coded by Google Surveys. Precision varies: a record may include country, region, province, and sometimes city. Note that this may differ from the targeted geography.
The variable `Geography` corresponds to the geography as captured and recorded by Google. All other geography variables are derived from this variable, and are only available on the normalized files.
Distribution across countries
Regions may be single states or provinces, or larger collections. They may correspond to US Census regions or Statistics Canada regions.
States and provinces are coded as two-letter postal abbreviations on the original data files. On derived files, `geonum` contains the numeric FIPS or province code (coded as character to preserve leading zeros), and `geoname` contains the full name. Note that the Google-provided `Region` often, but not always, corresponds to a state or province, whereas `State_Province`, `geonum`, and `geoname` always correspond to a state or province.
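Storing the numeric code as character matters because a numeric type silently drops leading zeros. The sketch below illustrates the point with a hypothetical helper, `code_as_text`, and assumes two-digit codes, as is the case for US state FIPS codes:

```python
def code_as_text(code):
    """Format a numeric state or province code as a two-character string."""
    return f"{int(code):02d}"


# Converting the text "06" to a number loses the leading zero ...
as_number = int("06")
# ... and zero-padding restores the two-character form.
print(as_number, code_as_text(as_number))
```

Keeping the code as text from the start avoids the round trip, which is why any merge against other sources on this variable should use a string key.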
In some cases, detail is available at the city level.
See elsewhere in this document how weights are computed.
User_ID
Time_UTC
Survey_Completion
Data files are available for each completed cycle of the survey, in general once a week, and are stored under `final`. Data from the preliminary study (assessing the questionnaire design) are stored under `preliminary`. We may make data available under `temporary` before the survey is completed for each cycle; once the final version from that cycle is available, these files are deleted (this directory will be empty on Zenodo).
Native format is Office Open XML (XLSX; ECMA International 2016). Normalized files are available in Stata and R formats.
Files are provided as downloaded from Google Surveys. Each file has 4 tabs.
Lists the questions asked by the client, in this case Lange and Vilhuber, as well as a survey ID.
This tab contains a weighted summary of the responses to the questions (similar to the above summary).
This tab contains the actual microdata for all complete responses. Note that for a single-question survey, this is identical to the "All responses" tab. A complete response might have a weight of zero.
All responses, whether complete or not, are recorded on this tab. In the case of a single-question survey, this is identical to the "Complete responses" tab.
Each individual is asked one of two questions: how long they expect "social distancing rules" or "business closures" to remain in effect:
Five response choices are offered: less than 1 month, 1-2 months, 2-3 months, 3-6 months, and more than 6 months.
An additional answer allows respondents to affirm that "such measures are not implemented in their province/state". See questionnaires for visual representation of the questions.
Data is collected via Google Surveys. For English-language surveys, data is collected via a web form. For French-language surveys, the Android Google Survey app is used, as web-collection in French is not possible via Google Surveys. See Sostek and Slatkin (2018) and Google (2020) for more details.
The survey questionnaire was approved by McGill University Research Ethics Board under REB File # 20-04-070. Exemption was issued by Cornell University Institutional Review Board under Protocol ID# 2004009539.
Google Surveys is an online non-probability survey. It uses stratified sampling for collection, based (in the US) on the target internet population from the 2017 Current Population Survey (CPS) Computer and Internet Use Supplement (Sostek and Slatkin 2018; Google 2020).
Data are collected directly from survey respondents.
For each country, we plan to collect 2500 responses per question, per week. For Canada, a French-language variant is fielded. To determine the split, we use Statistics Canada statistics on "Language spoken most often at home by other language(s) spoken regularly at home and age" (Statistics Canada 2017), combining responses for "French" and "French and non-official language" (i.e., no English mentioned).
For 2016, 20.4% spoke French and no English as the language spoken most often at home. We thus target 510 responses via the French-language questionnaire, and 1990 in English.
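The split follows directly from the weekly target and the French-language share. A short sketch of the arithmetic (the variable names are illustrative only):

```python
WEEKLY_TARGET = 2500   # responses per question, per week (from the text)
FRENCH_SHARE = 0.204   # French and no English spoken most often at home, 2016

french_target = round(WEEKLY_TARGET * FRENCH_SHARE)
english_target = WEEKLY_TARGET - french_target

print(french_target, english_target)  # 510 1990
```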
All demographics are imputed by Google Surveys, if collected via web. Demographics for respondents via the app are collected through the app.
Weights are provided by Google Surveys, based on the imputed demographics. For the US, the US Census Bureau's Current Population Survey (CPS) Computer and Internet Use Supplement is used (currently the 2017 version). For Canada, Google (2020) points to a "combination of government data and internal Google data sources." Google uses post-stratification weighting to align the weighted demographics with the target population.
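To illustrate what post-stratification weighting does in general, the sketch below assigns each respondent a weight equal to the population share of their demographic cell divided by that cell's share of the sample. This is a generic textbook version, not Google's actual procedure, which is not detailed here. The cells, the population shares, and the function name `poststratification_weights` are all hypothetical.

```python
from collections import Counter


def poststratification_weights(sample_cells, population_shares):
    """Return one weight per respondent: population share / sample share of the cell."""
    n = len(sample_cells)
    sample_counts = Counter(sample_cells)
    return [
        population_shares[cell] / (sample_counts[cell] / n)
        for cell in sample_cells
    ]


# Hypothetical sample and target shares, made up for illustration.
sample_cells = ["18-34", "18-34", "18-34", "35-54", "55+"]
population_shares = {"18-34": 0.3, "35-54": 0.4, "55+": 0.3}

weights = poststratification_weights(sample_cells, population_shares)

# After weighting, each cell's weighted share matches its population share.
weighted_young = sum(w for w, c in zip(weights, sample_cells) if c == "18-34")
print(weights, weighted_young / sum(weights))
```

In this toy sample the over-represented "18-34" cell receives weights below 1 and the under-represented cells receive weights above 1, so the weighted distribution lines up with the target.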
A preliminary survey was conducted to choose between a two-question variant and a one-question variant that included both social distancing and business closures ("How much longer do you expect social distancing rules (restrictions on gatherings, closure of non-essential businesses, stay-at-home rules) to stay in place in your province?"). See "Uncertainty in times of COVID-19: Choosing whether to ask 1 or 2 questions" for more information.
Privacy and disclosure control are described in Google (2020). For most respondents, no direct or indirect identifiers are collected; demographics are instead imputed from other information available to Google, but not to the sponsors of the survey.
The specific response rates are not known. Google (2020) reports response rates in general for this type of data collection.
We acknowledge generous funding by Lange’s Canada Research Chair in Labour and Personnel Economics, and by the Cornell Atkinson Center for Sustainability under its “Rapid Response Fund” program.
These data are licensed under a Creative Commons Attribution-NonCommercial 4.0 International license. See citation for attribution.
ECMA International. 2016. “Standard ECMA-376: Office Open XML File Formats.” https://www.ecma-international.org/publications/standards/Ecma-376.htm.
Google. 2020. “Methodology Google Surveys.” https://support.google.com/surveys/answer/6189786.
Sostek, Katrina, and Brett Slatkin. 2018. “How Google Surveys Works.” Whitepaper. Google. https://services.google.com/fh/files/misc/white_paper_how_google_surveys_works.pdf.
Statistics Canada. 2017. “Language Highlight Tables, 2016 Census.” Catalogue 98-402-X2016005. Statistics Canada. https://www12.statcan.gc.ca/census-recensement/2016/dp-pd/hlt-fst/lang/Table.cfm?Lang=E&T=31&Geo=00.