RegAuthorities: The Regulations Authorities Database, A Dataset

September 7, 2022
By: Kofi Ampaabeng, Patrick McLaughlin, Dustin Chambers, Jonathan Nelson

In its most recently concluded term, the US Supreme Court, according to critics, came close to gutting the so-called Chevron deference, or Chevron doctrine. Under Chevron, the apex court defers to an agency's reasonable interpretation of ambiguous statutory language when adjudicating cases. Since 1984, Chevron deference has guided agency regulatory behavior. One of the more intriguing features of the doctrine is that agency interpretation is not static: as conditions change, agencies can develop new regulations based on old statutes.

Many critics of the doctrine have argued that its broad scope means that agencies can effectively pursue their own policies unless explicitly forbidden by Congress. As an applied economist (and not an administrative law expert), my primary interest in Chevron is its effects on agency behavior, particularly on regulatory accumulation. And as a data scientist, I am interested in features of statutory and regulatory language. Is it possible to determine, empirically, which statutes are most likely to cause regulatory bloat, and can such bloat be attributed to ambiguity in the authorizing statutes?

These questions led the Policy Analytics Project at the Mercatus Center to build a new dataset that connects regulations to their statutes. The resulting product, which we call RegAuthorities, is a large dataset that shows, among other things, how agencies use their interpretive powers to formulate regulations. We constructed RegAuthorities from the US Code (statutes) and the US Code of Federal Regulations (CFR). Any agency that issues a regulation is required to cite the statute or statutes which authorize it to do so. We use this reference to the authorizing statutes to construct a dataset that links all regulations to their authorizing statutes. Similar to the RegData suite of products, RegAuthorities also includes features of both the regulations and the statutes, including the number of words, the number of restrictions, the quality, readability, and complexity. These data are available for multiple years beginning in 2001.

In addition, we introduce two new concepts that apply to statutes -- centrality and amplification. Centrality exploits the links we have produced between statutes and regulations to identify the statutes that result in the most regulations.  Our preliminary analyses of the RegAuthorities data have been revealing. For example, we are able to establish that as of 2019, nearly half of all statutes did not result in any regulation. 

The CFR is organized hierarchically, starting with titles, which are further subdivided into chapters, subchapters, parts, sections and paragraphs. RegData and RegAuthorities consider a regulation to be a CFR part. Similarly, for statutes, a unit of law is defined at the section level. Using these units of analysis, we find that three statutes are perhaps the most central to regulations: 5 USC 301 (858 parts), 5 USC 552 (295 parts), and 21 USC 371 (224 parts). The CFR parts that cite these three statutes contain more than 15 million words.
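As a rough illustration of the centrality count, the sketch below tallies how many distinct CFR parts cite each statute section. The citation links, the part and statute labels, and the function name `centrality` are all hypothetical stand-ins; the actual RegAuthorities schema and method may differ.

```python
from collections import defaultdict

# Hypothetical (CFR part, authorizing statute section) citation links.
# These are made-up labels, not rows from RegAuthorities.
links = [
    ("part A", "statute X"),
    ("part B", "statute X"),
    ("part B", "statute Y"),
    ("part C", "statute X"),
    ("part C", "statute X"),  # a repeated citation should count only once
]


def centrality(links):
    """Count the distinct CFR parts that cite each statute section."""
    parts_by_statute = defaultdict(set)
    for part, statute in links:
        parts_by_statute[statute].add(part)
    return {statute: len(parts) for statute, parts in parts_by_statute.items()}


scores = centrality(links)

# Rank statutes from most to least cited, breaking ties alphabetically.
ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
for statute, n_parts in ranked:
    print(f"{statute}: cited by {n_parts} part(s)")
```

Collecting parts in a set keeps a part that cites the same statute more than once from inflating the count, which matches the article's practice of reporting centrality in parts.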

The second concept we introduce is amplification, which is perhaps the more interesting of the two. Amplification is the ratio of the number of words in the resulting regulations to the number of words in the authorizing statute. To illustrate, consider one of the most amplified statutes, 16 USC 1543, which concerns commerce and trade. The statute itself contains only 36 words, and yet it is cited in 20 regulations (CFR parts), which together contain 6.5 million words, an amplification factor of roughly 180,000. This brings us back to the original motivation: what is it about those 36 words that causes them to be cited by 6.5 million words of regulatory text? And can we reasonably infer the role of Chevron, or of other doctrines that enjoin the judiciary to defer to agency interpretation of statutes and thereby cause agencies to issue more regulations than they would otherwise?
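The arithmetic behind the amplification factor can be sketched as follows, taking amplification as regulation words divided by statute words, which is the direction the 36-word example implies. The function name and signature are assumptions made for illustration, and because the article reports only the combined word count of the 20 citing parts, that total is passed as a single value.

```python
from typing import Iterable


def amplification(statute_words: int, regulation_word_counts: Iterable[int]) -> float:
    """Total words in the citing regulations divided by words in the statute."""
    if statute_words <= 0:
        raise ValueError("statute_words must be positive")
    return sum(regulation_word_counts) / statute_words


# Figures quoted in the article for 16 USC 1543: a 36-word statute cited by
# CFR parts that together contain about 6.5 million words.
factor = amplification(36, [6_500_000])
print(round(factor))  # 180556, which the article rounds to 180,000
```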

We hope these data and the subsequent revisions and additions can help answer these questions and more. As an initial application, we use the data to predict the regulatory output of an agency based on the characteristics of the agency, including the leadership structure, whether it is independent or an executive department, and the number of words in the authorizing statutes.