Commonplace of datasets, methods, stats, R, Python, etc.

Commonplace books, or commonplaces, were a way to compile knowledge, usually by writing information into books. Such books were essentially scrapbooks filled with items of every kind. This commonplace includes datasets, research resources, stats, econometric methods, libraries in R and Python, and the like. Because I work in tech policy, there is a strong leaning towards tech, innovation, and telecom in the legal and legislative research section.


Public policy cheatsheet

It is still in beta.


Economic lectures + notes


Reading lists


Math + stats

If you have technical questions, check out Luke Stein’s Stanford graduate economics core: 65 pages of stats review. For a gentler introduction to the ideas of probability, check out Jane Street’s Introduction to Probability and Markets. See also “Statistical tests, P values, confidence intervals, and power: a guide to misinterpretations.”


Econometrics + model making

  • “This document contains the set of lecture notes from the late Gary Chamberlain’s 2010 Econometrics class (EC2120) that I (Paul Goldsmith-Pinkham) took during my economics Ph.D. at Harvard University. Gary was a remarkable teacher and this class was an amazing experience for me as a young economist.” [Github]
  • Cara Jackson is collecting Wooldridge’s Twitter lessons in a Google Doc.

Programming for economists, R, Python, libraries, packages, etc.

R packages of note:


Federal Reserve data

Federal Reserve Economic Data (FRED) is a great resource for economic time series drawn from many government sources. Some key FRED series:

The Atlanta Fed’s Wage Growth Tracker
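Pulling a FRED series into a script can be sketched as follows. This is a minimal sketch, not an official client: it assumes FRED's graph CSV download endpoint (`fredgraph.csv?id=...`), uses `UNRATE` purely as an example series ID, and the helper names are hypothetical.

```python
import csv
import io
from typing import List, Optional, Tuple
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed CSV download endpoint; check FRED's own documentation before relying on it.
FRED_CSV_ENDPOINT = "https://fred.stlouisfed.org/graph/fredgraph.csv"

Observation = Tuple[str, Optional[float]]


def fred_csv_url(series_id: str) -> str:
    """Build the CSV download URL for one FRED series ID (e.g. 'UNRATE')."""
    return f"{FRED_CSV_ENDPOINT}?{urlencode({'id': series_id})}"


def parse_fred_csv(text: str) -> List[Observation]:
    """Parse a two-column date,value CSV; '.' or blank is treated as missing."""
    rows = csv.reader(io.StringIO(text))
    next(rows, None)  # skip the header row, whatever its column names are
    parsed: List[Observation] = []
    for row in rows:
        if len(row) < 2:
            continue  # ignore blank or malformed lines
        date, raw = row[0], row[1].strip()
        parsed.append((date, None if raw in ("", ".") else float(raw)))
    return parsed


def fetch_fred_series(series_id: str) -> List[Observation]:
    """Download and parse one series (requires network access)."""
    with urlopen(fred_csv_url(series_id), timeout=30) as resp:
        return parse_fred_csv(resp.read().decode("utf-8"))
```

Calling `fetch_fred_series("UNRATE")` would return a list of `(date, value)` pairs. Keeping the parser separate from the download makes it easy to check against a saved CSV without hitting the network.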


Census data

The FactFinder is dead. Explore Census data instead. Other Census data:

  • Small Area Income and Poverty Estimates (SAIPE)
  • Business Dynamics Statistics – BDS provides annual measures of business dynamics (such as job creation and destruction, establishment births and deaths, and firm startups and shutdowns) for the economy and aggregated by establishment and firm characteristics.
  • Nonemployer statistics – NES is an annual series that provides subnational economic data for businesses that have no paid employees and are subject to federal income tax. This series includes the number of businesses and total receipts by industry.
  • Economic Census – The economic census serves as the foundation for the measurement of U.S. businesses, including the Island Areas, and their economic impact.
  • A complete list of the Census' economic surveys – The U.S. Census Bureau business surveys and censuses measure the pulse of the U.S. economy, businesses, and governments. They provide data for businesses in the economic sectors such as manufacturing, construction, retail trade, health care, and services industries, as well as for state and local governments, and on imports and exports.
  • Computer and Internet Use – In recent decades, computer usage and Internet access have become increasingly important for gathering information, looking for jobs, and participating in a changing world economy. See also: The NTIA’s Digital Nation Data Explorer, which includes raw data for the Internet and Computer Use Survey.
  • Code Lists, Definitions, and Accuracy – View the detailed codes and definitions for variables, statistical testing, and an explanation of sample design, methodology, and accuracy for the American Community Survey.
  • Surveys & Programs – The U.S. Census Bureau conducts more than 130 surveys and programs each year. This is a list of all of them.
  • Are Millennials making less than their parents? When you check recent Census data, you get a completely different interpretation. Table P-10 located here lays out median/mean income for those 25 to 34 going back to 1974, adjusted to constant 2018 dollars. It starts at line 104. Quite clearly, those aged 25-34 in 2018 are doing better than any group in the 1980s, since their median income of \$37,133 is higher than the 1980s peak of \$33,356.
  • The National Survey of Children’s Health (NSCH) examines the physical and emotional health of children ages 0-17. Special emphasis is placed on factors related to the well-being of children. These factors include access to, and quality of, health care, family interactions, parental health, neighborhood characteristics, as well as school and after-school experiences.
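For the Millennial income comparison in the list above, the size of the gap is easy to check. The sketch below uses only the two Table P-10 figures quoted there (both already in constant 2018 dollars); the helper name is hypothetical.

```python
def percent_difference(new: float, old: float) -> float:
    """Percentage by which `new` exceeds `old`."""
    return (new - old) / old * 100.0


# Figures quoted above from Census Table P-10, in constant 2018 dollars.
median_2018 = 37_133   # median income, ages 25-34, in 2018
peak_1980s = 33_356    # highest 1980s median for the same age group

gap_dollars = median_2018 - peak_1980s
gap_percent = percent_difference(median_2018, peak_1980s)

print(f"Gap: {gap_dollars:,} dollars ({gap_percent:.1f}% above the 1980s peak)")
# prints "Gap: 3,777 dollars (11.3% above the 1980s peak)"
```

Because both figures are already inflation-adjusted, no further deflating is needed before comparing them.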

The Bureau of Labor Statistics (BLS) data

The Bureau of Labor Statistics


Bureau of Economic Analysis (BEA) data

Bureau of Economic Analysis


Other government data

  • Data.gov – The home of the U.S. Government’s open data.
  • Rural Urban Continuum Codes – The Rural-Urban Continuum Codes form a classification scheme that distinguishes metropolitan counties by the population size of their metro area, and nonmetropolitan counties by degree of urbanization and adjacency to a metro area.
  • The Rural Atlas – If you need to do analysis of counties, the Rural Atlas is your go-to. It is the most up-to-date document pulling together population and economic data from the 5-year ACS.
  • O*NET OnLine – O*NET contains hundreds of standardized and occupation-specific descriptors on almost 1,000 occupations covering the entire U.S. economy. The database, which is available to the public at no cost, is continually updated from input by a broad range of workers in each occupation.
  • USA Spending – USAspending.gov is the official source for spending data for the U.S. Government.
  • About Underlying Cause of Death, 1999-2019
  • A list of construction indices
  • Interagency Data Inventory – The Interagency Data Inventory is a product of the Data Committee of the Financial Stability Oversight Council (FSOC). The inventory catalogs the data collected by FSOC member organizations. The inventory contains information — metadata — about each data collection. It does not contain the underlying datasets. For each data collection, the inventory has basic information, such as a brief description of the collection, collecting organization, and the name and number of the form used to collect the data.
  • The Private School Survey produces data similar to that of the NCES Common Core of Data (CCD) for the public schools. The data are useful for a variety of policy- and research-relevant issues, such as the growth of religiously-affiliated schools, the length of the school year, the number of private high school graduates, and the number of private school students and teachers.
  • The primary purpose of the Common Core of Data (CCD) is to provide basic information on public elementary and secondary schools, local education agencies (LEAs), and state education agencies (SEAs) for each state, the District of Columbia, and the outlying territories with a U.S. relationship.
  • Total factor productivity – From Eli Dourado: “Total factor productivity captures how much output can be produced with a diverse but fixed basket of inputs. As technology and institutions improve, TFP goes up. As they deteriorate, it goes down. In the last decade, TFP has deeply stagnated. While there are numerous estimates of total factor productivity in the US, only the series maintained by the Federal Reserve Bank of San Francisco is quarterly and attempts to adjust for the business cycle. Since this series is not available in FRED, I am making it available here in graphical form.”
  • The Atlanta Fed’s Wage Growth Tracker
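To make the total factor productivity entry above concrete, here is a minimal sketch of the textbook Solow-residual calculation. It assumes a Cobb-Douglas production function with an illustrative capital share of 0.3, and the function names are hypothetical. It is not the method behind the San Francisco Fed series, which, per the quote, also attempts to adjust for the business cycle.

```python
import math


def tfp_level(output: float, capital: float, labor: float,
              capital_share: float = 0.3) -> float:
    """TFP level A in Y = A * K**a * L**(1 - a), solved for A."""
    return output / (capital ** capital_share * labor ** (1.0 - capital_share))


def tfp_growth(output_growth: float, capital_growth: float, labor_growth: float,
               capital_share: float = 0.3) -> float:
    """Growth-accounting residual: output growth not explained by input growth."""
    return (output_growth
            - capital_share * capital_growth
            - (1.0 - capital_share) * labor_growth)


# Example: output grows 3%, capital 2%, labor 1% (made-up numbers).
residual = tfp_growth(0.03, 0.02, 0.01)
print(f"TFP growth: {residual:.3f}")  # prints "TFP growth: 0.017"
```

The residual is whatever output growth remains after subtracting share-weighted input growth, which is why it moves with technology and institutions rather than with the quantity of inputs.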

State and local government data

  • Correlates of State Policy | IPPSR – The Correlates of State Policy Project aims to compile, disseminate, and encourage the use of data relevant to U.S. state policy research, tracking policy differences across the 50 states and changes over time. We have gathered more than 900 variables from various sources and assembled them into one large, useful dataset. We hope this project will become a “one-stop shop” for academics, policy analysts, students, and researchers looking for variables germane to the study of state policies and politics. See also the R package and Shiny App.
  • The Fiscally Standardized Cities (FiSC) database allows users to create a custom table with fiscal information for selected cities. To create a table, select one or more cities, one or more years, and one or more fiscal variables. The default display options can also be adjusted, and users can choose whether to display data for FiSCs and/or one of the component governments (cities, counties, school districts, and special districts).
  • State expenditure report – This annual report examines spending in the functional areas of state budgets: elementary and secondary education, higher education, public assistance, Medicaid, corrections, transportation, and all other. It also includes data on capital spending by program area, as well as information on general fund and transportation fund revenue collections.
  • The Annual Survey of State and Local Government Finances is the only source of nationwide, comprehensive local government finance information. It provides statistics on revenue, expenditure, debt, and assets for the 50 states and D.C.
  • IPUMS – IPUMS provides census and survey data from around the world integrated across time and space. IPUMS integration and documentation makes it easy to study change, conduct comparative research, merge information across data types, and analyze individuals within family and community contexts.
  • Eurostat – Eurostat is the statistical office of the European Union.
  • NORC at U of Chicago – NORC experts conduct research in a wide range of subjects, bringing insight to topics including education, economics, global development, health, and public affairs. NORC is a solid alternative to Census and BLS surveys.
  • Data USA – Maintained by Deloitte and Datawheel, DataUSA has a lot of databases. I use this site to grab data about cities and states.
  • Damodaran’s financial datasets | industry cap ex, risk premiums, etc – Aswath Damodaran is the GOAT: “Since I teach valuation and corporate finance, I am constantly collecting and analyzing data, and I have found that the data, once analyzed, can be used multiple times. Since I already have the processed data, I could not see any harm from sharing that data with others, thus saving us all some collective time, which we can spend far more productively not just on valuation but also with family and friends.”

Compendiums of data:


Telecom & tech datasets


Other datasets

  • Vertical Farming - [link]
  • Twitterstream from Archive. A simple collection of JSON grabbed from the general Twitter stream, for the purposes of research, history, testing and memory. This is the “Spritzer” version, the most light and shallow of Twitter grabs. Unfortunately, we do not currently have access to the Sprinkler or Garden Hose versions of the stream. [Archive]
  • The California Forest Observatory is a data-driven forest monitoring system that maps wildfire hazard drivers across California, including forest structure, weather, topography, and infrastructure.
  • OpenStreetMap provides a broad range of map data maintained by a worldwide community of geographers and cartographers.
  • The Registry of Open Data on AWS has empowered laboratories, research institutions, and various other organizations to deliver open datasets to developers, startups, and enterprises worldwide since its launch in 2018.
  • NASA Earth Observations offers climate and environmental data for the globe. You can browse and download the satellite data from NASA’s constellation of Earth Observing System satellites. Over 50 different global datasets are represented with daily, weekly, and monthly images available in various formats.
  • A Google BigQuery public dataset is any dataset made available to the general public through the Google Cloud Public Dataset Program.
  • Koordinates is an emerging geospatial data management platform where you can host, manage, share, publish, and access geodata.
  • Natural Earth is a collection of public domain map datasets available in vector or raster formats and various scales.
  • SafeGraph offers some open census data and neighborhood demographics.
  • The Canadian government has its own Open Data Portal.
  • An open dataset of electric vehicles and their specs.
  • Open Zone Map - the largest dataset and only interactive map of Special Economic Zones.


General research


Data science tools and resources


Rhetoric


Miscellaneous lectures, notes, + resources