Download QuantGov Data in Bulk

Welcome to the bulk download page for all QuantGov-related data. Below, you will find downloadable zip files containing the most recent data for every QuantGov project. For previous versions of QuantGov projects, please contact us directly at info@quantgov.org. Documentation for the data can be found on the Documentation Hub page.

 
+ RegData United States 4.1

These datasets measure the restrictiveness of the Code of Federal Regulations (CFR) from 1970 to 2021 and the degree to which it affects various industries. Specifically, the files include: (1) a restriction count; (2) the probability that a part is relevant to industries included in the 2007 NAICS at the 2-, 3-, 4-, 5-, and 6-digit levels; (3) the authoring agency and department for each part; (4) a metric called Shannon entropy, which measures the likelihood of encountering new words or concepts in a part; (5) a conditional count, which tallies occurrences of “if,” “but,” and “provided”; (6) sentence length; and (7) word count.
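To make the text metrics above more concrete, here is a minimal Python sketch of how word count, conditional count, average sentence length, and word-level Shannon entropy could be computed for a block of text. It is an illustration only: the tokenization rules, the function name text_metrics, and the choice of base-2 logarithm are assumptions, and the actual RegData implementation may differ.

```python
import math
import re
from collections import Counter

# The three conditional terms named in the dataset description.
CONDITIONALS = ("if", "but", "provided")


def text_metrics(text):
    """Compute simple text metrics (hypothetical helper, not the RegData code)."""
    # Assumption: words are runs of letters/apostrophes, case-insensitive.
    words = re.findall(r"[a-z']+", text.lower())
    # Assumption: sentences end at '.', '!' or '?'.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    counts = Counter(words)
    total = len(words)
    # Shannon entropy of the word distribution, in bits.
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values()) if total else 0.0
    return {
        "word_count": total,
        "conditionals": sum(counts[w] for w in CONDITIONALS),
        "sentence_length": total / len(sentences) if sentences else 0.0,
        "shannon_entropy": entropy,
    }
```

For example, text_metrics("If it rains, stay home. But go out if dry.") reports 10 words, 3 conditionals, and an average sentence length of 5 words. A text that repeats a few words has low entropy, while a text with many distinct words has higher entropy.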

+ RegData United States 4.0

These datasets measure the restrictiveness of the Code of Federal Regulations (CFR) from 1970 to 2020 and the degree to which it affects various industries. Specifically, the files include: (1) a restriction count; (2) the probability that a part is relevant to industries included in the 2007 NAICS at the 2-, 3-, and 4-digit levels; (3) the authoring agency and department for each part; (4) a metric called Shannon entropy, which measures the likelihood of encountering new words or concepts in a part; (5) the Flesch reading ease score, which measures how easy a text is to read (the higher the score, the easier); (6) the average number of conditionals, which counts occurrences of “if,” “but,” and “provided”; (7) sentence length; and (8) word count.
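As a rough illustration of the Flesch reading ease score mentioned above, the sketch below applies the standard formula, 206.835 - 1.015 x (words per sentence) - 84.6 x (syllables per word). The syllable counter is a crude vowel-group heuristic and the function names are hypothetical, so results will not necessarily match the values published in the dataset.

```python
import re


def count_syllables(word):
    # Rough heuristic (an assumption): each run of consecutive vowels is one syllable.
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def flesch_reading_ease(text):
    """Approximate Flesch reading ease; higher scores mean easier text."""
    words = re.findall(r"[A-Za-z']+", text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not words or not sentences:
        return 0.0
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word
```

Short sentences built from short words score high, while long sentences full of polysyllabic terms, as is common in regulatory text, score much lower.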

+ RegData United States 3.2

These datasets measure the restrictiveness of the Code of Federal Regulations (CFR) from 1970 to 2019 and the degree to which it affects various industries. Specifically, the files include: (1) a restriction count; (2) the probability that a part is relevant to industries included in the 2007 NAICS at the 2-, 3-, 4-, 5-, and 6-digit levels; (3) the authoring agency and department for each part; (4) a metric called Shannon entropy, which measures the likelihood of encountering new words or concepts in a part; (5) the number of conditionals, which counts occurrences of “if,” “but,” and “provided”; (6) sentence length; and (7) word count.

+ RegData Canada 2.2

These datasets include Canadian federal data spanning 2006-2021 and province/territory data spanning 2018-2021. The files include: (1) restriction counts; (2) the occurrence of certain restrictive words; (3) the probability that a part is relevant to a 3-digit industry (as classified by the NAICS); (4) a metric called Shannon entropy, which measures the likelihood of encountering new words or concepts; (5) word counts; and (6) sentence length. Learn more about the RegData Canada project here: https://www.quantgov.org/regdata-canada.

+ RegData Australia 2.2

RegData Australia focuses on the regulatory codes of the Australian federal government and the six federated states within the country (not including internal and external territories). The federal data includes restrictive phrase counts and complexity metrics for the following time series: (1) 2005-2021 for Legislative Instruments; (2) 1977-2021 for Acts; and (3) 2018-2021 for Notifiable Instruments. The state-level data includes restrictive phrase counts and complexity metrics for regulations and statutes from 2019 to 2021. Learn more about the RegData Australia project here: https://www.quantgov.org/regdata-australia.

+ RegData UK 1.0

All datasets from the RegData UK 1.0 project, including document metadata and complexity metrics. The data extends through 2021.

+ RegData India 1.0

All datasets from the RegData India 1.0 project, including document metadata and complexity metrics.

+ State RegData 3.0 Regulations

Document-level statistics from various state administrative codes, as of July 2021. Specifically, this zip file provides three CSV files, which include total word and restriction counts, complexity metrics, and NAICS industry classification.
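For readers who want to work with a downloaded zip file programmatically, the following sketch reads every CSV file in an archive using only the Python standard library. The function name read_csvs_from_zip, the UTF-8 encoding, and any file names used with it are assumptions for illustration; check the Documentation Hub for the actual file names and column definitions.

```python
import csv
import io
import zipfile


def read_csvs_from_zip(zip_source):
    """Return {member name: list of row dicts} for every .csv file in the archive.

    zip_source may be a file path or a binary file-like object.
    """
    tables = {}
    with zipfile.ZipFile(zip_source) as archive:
        for name in archive.namelist():
            if not name.lower().endswith(".csv"):
                continue  # skip codebooks, readme files, and folders
            with archive.open(name) as raw:
                # Assumption: the CSV files are UTF-8 encoded with a header row.
                text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
                tables[name] = list(csv.DictReader(text))
    return tables
```

A call such as read_csvs_from_zip("state_regdata.zip"), where the file name is hypothetical, returns one list of rows per CSV file, keyed by the file's name inside the archive. All values are read as strings, so numeric columns need to be converted before analysis.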

+ State RegData 3.0 Statutes

Document-level statistics from various state statutory codes, as of August 2021. Specifically, this zip file provides three CSV files, which include total word and restriction counts, complexity metrics, and NAICS industry classification.

+ Count the Code: Quantifying Federalization of Criminal Statutes

BY: Giancarlo Canaparo, Patrick McLaughlin, Jonathan Nelson, Liya Palagashvili
DATE: January 7, 2022

Abstract: We develop an algorithm to quantify the number of statutes within the United States Code that create one or more federal crimes. This is the first effort to “count the Code” since 2008 and is unique among previous efforts in that it employs an algorithm to sift through Code using carefully selected keywords to count the number of statutes that create crimes. We find 1,510 statutes in the Code as of 2019 that create at least one crime. This represents an increase of nearly 36 percent relative to the 1,111 statutes that created at least one crime found in the 1994 United States Code. Although the algorithm cannot precisely count discrete crimes within sections, we estimate the number of crimes contained within the Code as of 2019 at 5,199. These findings support the conclusions of other studies that the number of federal crimes has increased over time, while also bolstering the concerns raised by numerous scholars that federal crimes are too diffuse, too numerous, and oftentimes too vague for the average citizen to know what the law requires of him or her. Lastly, we present preliminary ideas for further investigation using our new dataset.

+ State RegData 2.1

Document-level statistics from various state administrative codes, as of July 2020. Specifically, this zip file provides three CSV files, which include total word and restriction counts, complexity metrics, and NAICS industry classification. This is a slightly updated release of version 2.0, with improved accuracy for a handful of states.

+ State RegData 2.0

Document-level statistics from various state administrative codes, as of July 2020. Specifically, this zip file provides three CSV files, which include total word and restriction counts, complexity metrics, and NAICS industry classification.

+ State Health RegData 1.0

Classification probabilities for relation to healthcare for all U.S. states. This project uses the files in the State RegData 2.0 data project.

+ Federal Healthcare RegData 2.0

Classification probabilities for relation to healthcare for the U.S. Code of Federal Regulations. This project uses the files from the RegData U.S. 3.2 data project.

+ Federal Register 1.0

Document-level statistics from the Federal Register, 1996-2017, with codebook.

+ Deregulation 1.0

Metadata relating to the deregulation dataset. The dataset will include metadata for each Federal Register document dating back to 2012, along with deregulatory word counts.

+ FRASE Index 2021

The complete dataset and user's guide for the 2021 version of the FRASE project. This dataset allows researchers to compare the relative impact of federal regulations on each state economy. This data is not available through the QuantGov API.

+ Occupation Data 1.0

QuantGov Occupation data aims to pair the text of a piece of regulatory code with an occupation from the Standard Occupational Classification (SOC) system of the Bureau of Labor Statistics. Analyzed documents include U.S. federal and state regulatory codes.

+ Occupational Licensing RegData 1.1

Occupational Licensing RegData identifies and quantifies regulatory text that is related to occupational licensing. The output of this data series is the probability that each State RegData 2.0 document contains text relating to occupational licensing.

+ Public Law 1.0

This dataset establishes associations between public laws and CFR parts from 1980 to 2016. This data is not available through the QuantGov API.

+ Section 232 Tariffs (New Portal)

This dataset contains Section 232 tariff exemption requests collected from the new Commerce Portal. This data is updated approximately every quarter and is not available through the QuantGov API.

+ Section 232 Tariffs (Old Portal)

This dataset contains Section 232 tariff exemption requests collected from the old Commerce Portal. This data is no longer updated and is not available through the QuantGov API.

+ Section 301 Tariffs

This dataset contains Section 301 tariff exemption requests. This data is updated approximately every quarter and is not available through the QuantGov API.

+ Temporary Flight Restrictions

The text of temporary flight restrictions (TFRs) issued from September 2017 to June 2019, along with their corresponding shapefiles (5,646 TFRs). This data is not available through the QuantGov API.