Open Data Sets and Visualization Resources: How to Evaluate Data and Data Sets

Locate and use open source numeric, statistical, geospatial, and qualitative data sets by topic. Explore data visualization methods, techniques, and resources to help discover patterns in the data that may not be clear from statistics alone.

How to Evaluate Data and Data Sets

Factors to consider when evaluating data and statistics:

Source

  • Who collected it? An individual, organization, or agency? 

  • The data source and the reporter or citer are not always the same. For example, advocacy organizations often publish data that were produced by some other organization. When feasible, it is best to go to the original source (or at least know and evaluate the source).

  • If the data are repackaged, is there proper documentation to lead you to the primary source? Would it be useful to get more information from the primary source? Could there be anything missing from the secondary version?

Authority

  • How widely known or cited is the producer? Who else uses these data?

  • Is the measure or producer contested?

  • What are the credentials of the data producer?

  • If an individual, are they an expert on the subject?

  • If an individual, what organizations are they associated with? 

Objectivity & Purpose

  • Who sponsored the production of these data?

  • What was the purpose of the collection/study?

  • Who was the intended audience for or users of the data?

  • Was it collected as part of the mission of an organization? Or for advocacy? Or for business purposes?

Currency

  • When were the data collected? There is often a time lag between collection and reporting of data.

  • Are these the newest figures? Sometimes the newest available figures are a few years old. That is okay, as long as you can verify that there isn't something more current.

Collection Methods & Completeness

  • How were the data collected? By count, measurement, or estimation?

  • Even a reputable source and collection method can introduce bias. 

  • If a survey, how many people were surveyed in total, and how does that compare to the size of the population the survey is supposed to represent?

  • If a survey, what methods were used to select the participants? How was the total population sampled?

  • If a survey, what was the response rate?

  • What populations were included? Excluded?
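
The survey questions above can be turned into a few quick ratios. The following is a minimal sketch, not a standard method: the function name `survey_coverage`, its parameters, and the example figures are all hypothetical, and the response rate shown is the simplest possible definition (completed responses divided by people invited). Survey organizations often use more detailed definitions.

```python
def survey_coverage(population_size, invited, completed):
    """Return simple coverage ratios for a survey (illustrative only).

    population_size: size of the population the survey claims to represent
    invited:         number of people selected and asked to respond
    completed:       number of people who actually responded
    """
    if population_size <= 0 or invited <= 0:
        raise ValueError("population_size and invited must be positive")
    if completed < 0 or completed > invited or invited > population_size:
        raise ValueError("expected 0 <= completed <= invited <= population_size")

    return {
        # Share of the population that was asked to take part
        "sampling_fraction": invited / population_size,
        # Share of those asked who actually responded
        "response_rate": completed / invited,
        # Share of the population represented by actual respondents
        "respondent_share": completed / population_size,
    }


# Hypothetical figures, for illustration only
figures = survey_coverage(population_size=50_000, invited=2_000, completed=800)
for name, value in figures.items():
    print(f"{name}: {value:.1%}")
```

A low response rate or a very small sampling fraction does not by itself make the data unusable, but it is a prompt to check how the producer handled non-response and who may be missing from the results.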

Consistency / Verification

  • Do other sources provide similar numbers?

  • Can the numbers be verified?
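
One rough way to check whether two sources "provide similar numbers" is to compute the relative difference between their figures. This is a sketch under stated assumptions: the helper names, the 5% tolerance, and the example values are hypothetical, and a sensible tolerance depends on the measure and on how each source defines it.

```python
def relative_difference(a, b):
    """Symmetric relative difference between two reported figures."""
    if a == 0 and b == 0:
        return 0.0
    # Divide the gap by the average magnitude so neither source is treated as "the" reference
    return abs(a - b) / ((abs(a) + abs(b)) / 2)


def roughly_agree(a, b, tolerance=0.05):
    """True if the two figures differ by no more than `tolerance` (assumed default: 5%)."""
    return relative_difference(a, b) <= tolerance


# Hypothetical figures reported by two different sources for the same measure
print(roughly_agree(10_200, 10_000))  # small gap
print(roughly_agree(12_000, 10_000))  # large gap
```

When two figures disagree, the difference is often explained by differing definitions, time periods, or populations covered, so a mismatch is a reason to read the documentation rather than proof that either source is wrong.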


This information is from the Gould Library Research Guides page "Evaluating Statistics and Data" and is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.