Bowdoin College Library

Quantitative Analysis in Sociology, Soc 2020: General Social Survey

Find Data

Where to get GSS data

Use the GSS Quick Start Guides for more help on how to find, extract, and analyze GSS data.

Sites with Useful Data from Social Statistics for a Diverse Society

Statistical Data, a Bowdoin College Library guide

I'd be happy to work with you to find other sources of data! Please contact me. - Barbara

General Social Survey


In this exercise, you will learn the basics of how the General Social Survey (GSS) is administered, how the data are displayed through the NORC interface, how to read the documentation, how the data are coded, and how to find the data that you are seeking.

How the GSS is administered

  1. What is NORC? What is NORC's relationship to the General Social Survey (GSS)?
  2. To get an overview of how the GSS is conducted, skim at least the following questions and answers on the GSS FAQ.
    1. In what years is the GSS conducted?
    2. When will the next GSS data be available?
    3. How is the GSS administered?
    4. Which items are part of the 'GSS core'?
    5. Why do some questions only appear in some years?
    6. Are all questions asked of all respondents?
    7. How many people are interviewed for each GSS?
    8. Which population did the GSS target?
  3. As you just learned, the interviewers conduct the GSS using a questionnaire. Access the GSS questionnaire for 2014, version 1 in English, with the following name in the first column: 2014 GSS V1 English. (You may need to display more results to find the desired file.)
    1. To see some of the questions that survey respondents were asked, skim the questionnaire relating to the variables named:
      1. WKHARSEX on p. 244.
        1. What question was asked?
        2. What type of variable is this? Consider its level of measurement (nominal/categorical, ordinal, or interval-ratio), whether it is dichotomous, and whether it is discrete or continuous.
      2. HISP1, HISP2, RACECEN1, RACECEN2, and RACECEN3 on pp. 98-102.
        1. What questions were asked?
        2. What is the difference between the two Hispanic variables? What is the difference among the three Race variables?
        3. The set of Hispanic questions was asked before the set of Race questions. Would it be wiser to (a) reverse the order of the set of Hispanic questions and the set of Race questions in the questionnaire, or (b) leave them as is? Why?

How the data are displayed through the NORC interface

  1. From NORC's GSS website, choose "Search Data" under "Access and Analyze GSS Data". Search for the variable WKHARSEX by entering the variable name in the "Keyword" search box, clicking "Search", and clicking on the variable name in the results list.
    1. In which years was this question asked?
    2. Does the table "Summary by Year" present raw frequency distributions, proportions, or percentage distributions?
  2. View data for each of the variables RACE, RACECEN1, RACECEN2, and RACECEN3 by entering RACE in the "Keyword" search box, clicking "Search", and looking for each of the variable names in the results list.
    1. In which years were these questions asked?
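
The three kinds of table named in the question about "Summary by Year" differ only in how the counts are scaled. The following is a minimal Python sketch of that distinction. The responses are made up for illustration and are not GSS data.

```python
from collections import Counter

# Hypothetical yes/no responses; these are NOT GSS data.
responses = ["yes", "no", "no", "yes", "no", "no", "no", "yes", "no", "no"]

# Raw frequency distribution: how many respondents gave each answer.
frequencies = Counter(responses)
total = sum(frequencies.values())

# Proportions: each count divided by the total (values sum to 1).
proportions = {answer: count / total for answer, count in frequencies.items()}

# Percentage distribution: each count as a share of 100 (values sum to 100).
percentages = {answer: 100 * count / total for answer, count in frequencies.items()}

for answer in frequencies:
    print(f"{answer}: n={frequencies[answer]}, "
          f"proportion={proportions[answer]:.2f}, "
          f"percent={percentages[answer]:.1f}%")
```

When reading a table on the NORC site, a quick check is whether the cells are whole-number counts, values between 0 and 1, or values that sum to 100.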

How the data are coded

  1. The cumulative codebook, covering all years of the GSS (and over 3700 pages long!), provides the exact wording of all questions, the possible responses, the variable names, and the years the questions were asked. Access it under the link "Entire GSS Cross-Section Codebook (A single file)".
    1. Skim the entries on variable WKHARSEX (p. 1538), on variable RACE (p. 203), and on variables RACECEN1, RACECEN2, and RACECEN3 (pp. 2948-2950).
    2. If you were to download the dataset for RACECEN1, how would you expect the (first) response "Asian Indian" to be coded in the dataset?
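
Survey datasets generally store each categorical response as a number and keep the text label in the codebook. The sketch below illustrates that idea in Python with an invented variable (FAVSEASON) and invented codes. It is an assumption-laden example and does not reproduce the actual GSS codes for RACECEN1, which you should look up in the codebook.

```python
# Hypothetical codebook entry for an invented variable, FAVSEASON.
# The codes and labels are illustrative only, not taken from the GSS.
value_labels = {
    1: "Spring",
    2: "Summer",
    3: "Fall",
    4: "Winter",
    0: "Inapplicable",
    9: "No answer",
}

# What a downloaded data column might look like: numeric codes only.
raw_column = [1, 2, 2, 0, 3, 9, 1]

# Codes treated as missing for analysis (an assumption for this example).
missing_codes = {0, 9}

# Translate codes to labels, skipping responses treated as missing.
decoded = [value_labels[code] for code in raw_column if code not in missing_codes]

print(decoded)
```

The point of the sketch is that the dataset itself holds only the numbers, so the codebook is needed to interpret them and to decide which codes count as missing.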

How to find the data that you are seeking

  1. The cumulative codebook also includes a Subject Index to Questions in Appendix V (pp. 3591-3673).
    1. Look for other questions/variable names relating to harassment in the workplace in the section on "Work" (pp. 3671-3673).
    2. A recent article in the Washington Post, "Dog owners are much happier than cat owners, survey finds", was based on data from the 2018 GSS. Without reading the article, use the Subject Index to find variables that might have been used in the study.
  2. Where is the following information available?
    1. Frequency distributions are in:
      • ___ Questionnaire
      • ___ NORC website
      • ___ Codebook
    2. Related variables can easily be found in:
      • ___ Questionnaire
      • ___ NORC website
      • ___ Codebook
    3. Type of variable is in:
      • ___ Questionnaire
      • ___ NORC website
      • ___ Codebook
  3. Let's say that you were asked to collect GSS data pertaining to harassment in the workplace and pertaining to religion for as many years as possible during the time period 1996-2014. Which of the following three methods are possible ways to do that?
    1. (1) Use a GSS questionnaire for one of the relevant years to find a relevant question that was asked, then (2) search on a word from the question or responses on the NORC GSS website.
    2. (1) Use the Subject Index to Questions in the cumulative GSS codebook to find a relevant question that was asked during the preferred number of years, then (2) search on the variable name on the NORC GSS website.
    3. Ignore the GSS questionnaire and codebook. Instead, search the NORC GSS website for words that you think might be found in either the question or the responses, and limit the search to 1996-2014.
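
Whichever method you choose, the final step is to compare the years each candidate variable was asked against the years you need. A rough Python sketch of that comparison follows. The variable names and their years are hypothetical placeholders, not entries from the codebook. The only GSS-specific assumption is that the survey was fielded in even-numbered years throughout 1996-2014.

```python
# GSS survey years in the period of interest (assumes the survey was
# fielded in even-numbered years throughout 1996-2014).
target_years = set(range(1996, 2015, 2))

# Hypothetical variable names and the years each was asked.
# These are invented for illustration, not copied from the codebook.
years_asked = {
    "HYPOVAR_A": {2002, 2006, 2010, 2014},
    "HYPOVAR_B": {1996, 1998, 2000, 2002, 2004, 2006, 2008, 2010, 2012, 2014},
    "HYPOVAR_C": {1988, 1991, 1998},
}

# Count, for each variable, how many of the target years it covers.
coverage = {name: len(years & target_years) for name, years in years_asked.items()}

# Rank variables from most to least coverage.
ranked = sorted(coverage.items(), key=lambda item: item[1], reverse=True)

for name, n_years in ranked:
    print(f"{name}: asked in {n_years} of {len(target_years)} target years")
```

In practice you would fill in `years_asked` from the codebook entries or the "Summary by Year" tables for the variables you found.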


  1. Give some examples of information provided in the GSS FAQ (step 2 under "How the GSS is administered") that was important to know as you worked with the questionnaire, the NORC website, and the codebook. Why was it important?
  2. Do you think that the GSS is a reliable dataset to use for scholarly purposes? Why or why not? How can you tell?


Bowdoin College Library
3000 College Station
Brunswick, ME 04011