Download U.S. census data and reformat it for humans

datadesk, updated 2023-02-11 02:21:06

census-data-downloader

Download American Community Survey data from the U.S. Census Bureau and reformat it for humans.

What's available

All of the data files processed by this repository are published in the data/processed/ folder. They can be called in to applications via their raw URLs, like https://raw.githubusercontent.com/datadesk/census-data-downloader/master/data/processed/acs5_2017_population_counties.csv
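As a rough illustration of calling a processed file into an application, the sketch below builds a raw URL and notes how it could be read. `processed_url` is a hypothetical helper, not part of this library, and the `<source>_<year>_<table>_<geotype>.csv` naming pattern is an assumption inferred from the single example URL above, so check the data/processed/ folder before relying on it.

```python
BASE_URL = (
    "https://raw.githubusercontent.com/datadesk/census-data-downloader/"
    "master/data/processed/"
)


def processed_url(table, geotype, year, source="acs5"):
    """Build the raw URL for a processed file.

    Hypothetical helper: the file naming pattern is assumed from the one
    example URL in this README, not documented by the library.
    """
    return f"{BASE_URL}{source}_{year}_{table}_{geotype}.csv"


url = processed_url("population", "counties", 2017)
print(url)

# With pandas installed and network access, the file can be read directly:
#   import pandas as pd
#   df = pd.read_csv(url)
```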

The command-line interface

The library can be installed as a command-line interface that lets you download files on demand.

Installation

```bash
$ pipenv install census-data-downloader
```

Command-line usage

There's now a tool named censusdatadownloader ready for you.

```bash
Usage: censusdatadownloader [OPTIONS] TABLE COMMAND [ARGS]...

  Download Census data and reformat it for humans

Options:
  --data-dir TEXT     The folder where you want to download the data
  --year [2009-2020]  The years of data to download. By default it gets only
                      the latest year. Not all data are available for every
                      year. Submit 'all' to get every year.
  --force             Force the downloading of the data
  --help              Show this message and exit.

Commands:
  aiannhhomelands            Download American Indian, Alaska Native and...
  cnectas                    Download combined New England city and town...
  congressionaldistricts     Download Congressional districts
  counties                   Download counties in all states
  countysubdivision          Download county subdivisions
  csas                       Download combined statistical areas
  divisions                  Download divisions
  elementaryschooldistricts  Download elementary school districts
  everything                 Download everything from everywhere
  msas                       Download metropolitian statistical areas
  nationwide                 Download nationwide data
  nectas                     Download New England city and town areas
  places                     Download Census-designated places
  pumas                      Download public use microdata areas
  regions                    Download regions
  secondaryschooldistricts   Download secondary school districts
  statelegislativedistricts  Download statehouse districts
  states                     Download states
  tracts                     Download Census tracts
  unifiedschooldistricts     Download unified school districts
  urbanareas                 Download urban areas
  zctas                      Download ZIP Code tabulation areas
```

Before you can use it, you will need to add your CENSUS_API_KEY to your environment. If you don't have an API key, you can request one from the U.S. Census Bureau. One quick way to add your key:

```bash
$ export CENSUS_API_KEY='<your API key>'
```

Using it is as simple as providing one of our processed table names to one of the download subcommands.

Here's an example of downloading all state-level data from the medianage dataset.

```bash
$ censusdatadownloader medianage states
```

You can specify the download directory with --data-dir.

```bash
$ censusdatadownloader --data-dir ./my-special-folder/ medianage states
```

And you can change the year you download with --year.

```bash
$ censusdatadownloader --year 2010 medianage states
```

That's it. Mix and match tables and subcommands to get whatever you need.
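If you want several table and geography combinations at once, one option is to generate the commands from a short script. This is only a sketch: `build_commands` is a hypothetical helper, it assumes the `medianhouseholdincomeblack` table name works on the command line the same way `medianage` does, and it places options before the table name to match the usage line above.

```python
import itertools


def build_commands(tables, geotypes, data_dir=None, year=None):
    """Return one argument list per table/geotype pair.

    Options come first, following the documented usage:
    censusdatadownloader [OPTIONS] TABLE COMMAND
    """
    options = []
    if data_dir is not None:
        options += ["--data-dir", str(data_dir)]
    if year is not None:
        options += ["--year", str(year)]
    return [
        ["censusdatadownloader", *options, table, geotype]
        for table, geotype in itertools.product(tables, geotypes)
    ]


commands = build_commands(
    ["medianage", "medianhouseholdincomeblack"],
    ["states", "counties"],
    data_dir="./my-special-folder/",
)
for command in commands:
    print(" ".join(command))
    # To actually download (needs the installed CLI and CENSUS_API_KEY):
    #   import subprocess
    #   subprocess.run(command, check=True)
```

Building the argument lists separately from running them makes it easy to review what will be downloaded before any request is sent.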

Python usage

You can also download tables from Python scripts. Import the class of the processed table you wish to retrieve and pass in your API key. Then call one of the download methods.

This example brings in all state-level data from the medianhouseholdincomeblack dataset.

```python
from census_data_downloader.tables import MedianHouseholdIncomeBlackDownloader

# Pass in your Census API key when creating the downloader
downloader = MedianHouseholdIncomeBlackDownloader('<your API key>')
downloader.download_states()
```

You can specify the data directory and the years by passing in the data_dir and years keyword arguments.

```python
downloader = MedianHouseholdIncomeBlackDownloader('<your API key>', data_dir='./', years=2016)
downloader.download_states()
```
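To avoid hard-coding the key in a script, you could read it from the same CENSUS_API_KEY environment variable the command-line tool uses. The wrapper below is a minimal sketch: `download_states` is a hypothetical function, not part of the library, and it only relies on the constructor arguments and the `download_states()` method shown above.

```python
import os


def download_states(downloader_class, data_dir="./", years=None, env=os.environ):
    """Create a table downloader with the key from CENSUS_API_KEY and run it.

    Hypothetical wrapper: downloader_class is any table class from
    census_data_downloader.tables, used exactly as in the examples above.
    """
    api_key = env.get("CENSUS_API_KEY")
    if not api_key:
        raise RuntimeError("Set CENSUS_API_KEY before downloading.")
    kwargs = {"data_dir": data_dir}
    if years is not None:
        kwargs["years"] = years
    downloader = downloader_class(api_key, **kwargs)
    downloader.download_states()
    return downloader


# Example call (requires the library, a valid key, and network access):
#   from census_data_downloader.tables import MedianHouseholdIncomeBlackDownloader
#   download_states(MedianHouseholdIncomeBlackDownloader, years=2016)
```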

Usage examples

A gallery of graphics powered by our data is available on Observable.

Black and Latino U.S. population shares

The Los Angeles Times used this library for an analysis of Census undercounts on Native American reservations. The code that powers it is available as an open-source computational notebook.

The 2020 census is coming. Will Native Americans be counted?

Contributing to the library

Adding support for a new table

Subclass our downloader and provide it with its required inputs.

```python
import collections
from census_data_downloader.core.tables import BaseTableConfig
from census_data_downloader.core.decorators import register


@register
class MedianHouseholdIncomeDownloader(BaseTableConfig):
    PROCESSED_TABLE_NAME = "medianhouseholdincome"  # Your humanized table name
    UNIVERSE = "households"  # The universe value for this table
    RAW_TABLE_NAME = 'B19013'  # The id of the source table
    RAW_FIELD_CROSSWALK = collections.OrderedDict({
        # A crosswalk between the raw field name and our humanized field name.
        "001": "median"
    })
```

Add it to the imports in the __init__.py file and it's good to go.
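To make the crosswalk idea concrete, here is a standalone sketch of how a raw table id and a crosswalk could translate into column renames. It is not the library's code: `build_rename_map` is hypothetical, the "E" (estimate) and "M" (margin of error) suffixes follow the Census API's naming for ACS variables, and the `_moe` suffix on the humanized side is an assumption made for illustration.

```python
import collections


def build_rename_map(raw_table_name, crosswalk):
    """Map Census API variable names to humanized column names.

    Illustration only. The "_moe" suffix is a hypothetical choice and may
    not match what the library writes to its processed files.
    """
    rename = collections.OrderedDict()
    for raw_id, human_name in crosswalk.items():
        rename[f"{raw_table_name}_{raw_id}E"] = human_name
        rename[f"{raw_table_name}_{raw_id}M"] = f"{human_name}_moe"
    return rename


rename_map = build_rename_map("B19013", {"001": "median"})
print(dict(rename_map))
```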

Developing the CLI

The command-line interface is implemented using Click and setuptools. To install it locally for development inside your virtual environment, run the following installation command, as prescribed by the Click documentation.

```bash
$ pip install --editable .
```

That's it. If you make some good ones, please consider submitting them as pull requests so everyone can benefit.

Issues

Bump ipython from 8.7.0 to 8.10.0

opened on 2023-02-11 02:21:06 by dependabot[bot]

Bumps ipython from 8.7.0 to 8.10.0.

Commits
  • 15ea1ed release 8.10.0
  • 560ad10 DOC: Update what's new for 8.10 (#13939)
  • 7557ade DOC: Update what's new for 8.10
  • 385d693 Merge pull request from GHSA-29gw-9793-fvw7
  • e548ee2 Swallow potential exceptions from showtraceback() (#13934)
  • 0694b08 MAINT: mock slowest test. (#13885)
  • 8655912 MAINT: mock slowest test.
  • a011765 Isolate the attack tests with setUp and tearDown methods
  • c7a9470 Add some regression tests for this change
  • fd34cf5 Swallow potential exceptions from showtraceback()
  • Additional commits viewable in compare view


Bump cryptography from 38.0.4 to 39.0.1

opened on 2023-02-08 04:43:33 by dependabot[bot]

Bumps cryptography from 38.0.4 to 39.0.1.

Changelog

Sourced from cryptography's changelog.

39.0.1 - 2023-02-07


  • SECURITY ISSUE: Fixed a bug where Cipher.update_into accepted Python buffer protocol objects, but allowed immutable buffers. CVE-2023-23931
  • Updated Windows, macOS, and Linux wheels to be compiled with OpenSSL 3.0.8.

39.0.0 - 2023-01-01

  • BACKWARDS INCOMPATIBLE: Support for OpenSSL 1.1.0 has been removed. Users on older version of OpenSSL will need to upgrade.
  • BACKWARDS INCOMPATIBLE: Dropped support for LibreSSL < 3.5. The new minimum LibreSSL version is 3.5.0. Going forward our policy is to support versions of LibreSSL that are available in versions of OpenBSD that are still receiving security support.
  • BACKWARDS INCOMPATIBLE: Removed the encode_point and from_encoded_point methods on cryptography.hazmat.primitives.asymmetric.ec.EllipticCurvePublicNumbers, which had been deprecated for several years. cryptography.hazmat.primitives.asymmetric.ec.EllipticCurvePublicKey.public_bytes and cryptography.hazmat.primitives.asymmetric.ec.EllipticCurvePublicKey.from_encoded_point should be used instead.
  • BACKWARDS INCOMPATIBLE: Support for using MD5 or SHA1 in cryptography.x509.CertificateBuilder, other X.509 builders, and PKCS7 has been removed.
  • BACKWARDS INCOMPATIBLE: Dropped support for macOS 10.10 and 10.11, macOS users must upgrade to 10.12 or newer.
  • ANNOUNCEMENT: The next version of cryptography (40.0) will change the way we link OpenSSL. This will only impact users who build cryptography from source (i.e., not from a wheel), and specify their own version of OpenSSL. For those users, the CFLAGS, LDFLAGS, INCLUDE, LIB, and CRYPTOGRAPHY_SUPPRESS_LINK_FLAGS environment variables will no longer be respected. Instead, users will need to configure their builds as documented in the cryptography documentation.
  • Added support for disabling the legacy provider in OpenSSL 3.0.x.
  • Added support for disabling RSA key validation checks when loading RSA keys via cryptography.hazmat.primitives.serialization.load_pem_private_key, cryptography.hazmat.primitives.serialization.load_der_private_key, and cryptography.hazmat.primitives.asymmetric.rsa.RSAPrivateNumbers.private_key. This speeds up key loading but is unsafe if you are loading potentially attacker supplied keys.
  • Significantly improved performance for cryptography.hazmat.primitives.ciphers.aead.ChaCha20Poly1305

... (truncated)

Bump certifi from 2022.9.24 to 2022.12.7

opened on 2022-12-09 09:37:15 by dependabot[bot]

Bumps certifi from 2022.9.24 to 2022.12.7.

Add additional methods to base classes to let users support additional sources

opened on 2021-05-18 14:38:40 by ghing

This is somewhat related to #2.

I find this project to be extremely useful and a great framework for a task that I have to do often. In my projects, I've found myself using the base classes and concepts from this project when I want to download and process data from other Census Bureau API sources.

However, for non-ACS sources, I find myself entirely reimplementing many of the methods on my geotype downloader classes because the changes in functionality aren't possible by just calling super() and then adding additional logic.

I think adding these methods to BaseGeoTypeDownloader could make adding additional data sources easier, both in this project, and for other users in their own projects:

  • BaseGeoTypeDownloader.get_api_client(): This would be called from the constructor to set self.api and allow subclasses to specify a customized subclass of census.Census that supports additional API endpoints.
  • BaseGeoTypeDownloader.get_field_type_map(): This would be similar to BaseGeoTypeDownloader.get_raw_field_map() except it would map from raw field names to types that would be passed to pd.Series.astype(). Like BaseGeoTypeDownloader.get_raw_field_map(), this would be called from BaseGeoTypeDownloader.process() when setting the column types after reading in the raw table. The implementation could check for the existence of a FIELD_TYPES attribute on the table configuration class, and if that doesn't exist, default to the existing logic for ACS tables that checks the field name suffix. Adding the ability to explicitly set type conversions allows supporting non-ACS tables that might have field names that don't have the same suffix convention as ACS tables.
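The second proposal could look roughly like the sketch below. It is written as a standalone function rather than a method on BaseGeoTypeDownloader so it can run without the library, and the fallback is only a stand-in: it assumes ACS fields end in a numeric id plus "E" or "M" and should be floats, which may differ from the library's existing suffix check. `AcsLikeConfig` and `OtherSourceConfig` are hypothetical table configurations.

```python
import re

# Assumed ACS naming: table id, underscore, numeric field id, then E or M.
ACS_FIELD = re.compile(r"_\d+[EM]$")


def get_field_type_map(table_config, raw_fields):
    """Return {raw field name: type} suitable for pd.Series.astype()."""
    explicit = getattr(table_config, "FIELD_TYPES", None)
    if explicit is not None:
        # An explicit FIELD_TYPES attribute on the table config wins.
        return {name: explicit[name] for name in raw_fields if name in explicit}
    # Fallback stand-in for the existing ACS suffix-based logic.
    return {name: float for name in raw_fields if ACS_FIELD.search(name)}


class AcsLikeConfig:
    """No FIELD_TYPES attribute, so the suffix fallback applies."""


class OtherSourceConfig:
    """Hypothetical non-ACS table with explicit type conversions."""

    FIELD_TYPES = {"POP": int, "NAME": str}


acs_types = get_field_type_map(AcsLikeConfig, ["B19013_001E", "B19013_001M", "NAME"])
other_types = get_field_type_map(OtherSourceConfig, ["POP", "NAME"])
print(acs_types)
print(other_types)
```

Matching on the full `_<digits>E` or `_<digits>M` ending, rather than the last letter alone, keeps fields such as NAME from being treated as numeric.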

Fix README image references so they work on PyPI

opened on 2020-12-23 19:06:00 by palewire

Migrate to GitHub actions for testing

opened on 2020-12-23 19:01:00 by palewire

Releases

2020 data live 2022-06-10 16:15:04

2022-01-09 18:15:58

Los Angeles Times Data and Graphics Department

Reporting, editing, computer programming

GitHub Repository

census journalism data-journalism news python pandas api-wrapper demographics mapping-la-pipeline