Package dp_policy
Impacts of Uncertainty & Differential Privacy on Title I
Author: Ryan Steed, with help from Terrance Liu
Paper co-authors: Terrance Liu, Steven Wu, Alessandro Acquisti
API documentation: rbsteed.com/dp-policy
For more, check out the paper and SI.
Installation
make dp_policy
Running the CLI
Use the CLI endpoints in dp_policy/api.py.
dp_policy --help
# to run a specific experiment
dp_policy run [experiment]
# to only produce the feather file for regression analysis (using cached results)
dp_policy run --just-join [name]
# to run all experiments
dp_policy run_all
Experiment options (described in detail in the SI and passed to the Experiment.get_experiment factory method) include:
- "baseline" - just the baseline settings (no experimental modifications).
- Policy changes
  - "hold_harmless" - treatments which add one or both of the post-formula provisions (hold harmless and the state minimum).
  - "thresholds" - treatments which modify the thresholds for district funding eligibility.
  - "moving_average" - treatments which use multiyear averages of varying size.
  - "budget" - treatments which vary the overall Title I appropriation.
- Robustness checks
  - "post_processing" - treatments which modify the post-processing applied after noise injection.
  - "epsilon" - treatments which modify the privacy parameter epsilon.
  - "sampling" - treatments which vary the variance of simulated data error.
  - "vary_total_children" - treatment where the number of total children is also noised.
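The "epsilon" treatments vary the privacy parameter, which controls how much noise is injected. As a rough illustration only, the sketch below applies a generic Laplace mechanism to a single count and shows that the expected absolute error grows as epsilon shrinks. It is not taken from dp_policy/titlei/mechanisms.py: the function names (laplace_noise, noisy_count, mean_abs_error), the sensitivity of 1, and the example count of 1000 are all assumptions made for this example.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from Laplace(0, scale).

    The difference of two i.i.d. exponential draws with mean `scale`
    follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_count(true_count, epsilon, sensitivity=1.0, rng=None):
    """Return true_count plus Laplace noise with scale sensitivity / epsilon.

    Hypothetical helper: a textbook Laplace mechanism, assumed here for
    illustration rather than copied from the package.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    return true_count + laplace_noise(sensitivity / epsilon, rng)


def mean_abs_error(true_count, epsilon, trials=20000, seed=0):
    """Estimate the average absolute error of noisy_count by simulation."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        total += abs(noisy_count(true_count, epsilon, rng=rng) - true_count)
    return total / trials


if __name__ == "__main__":
    # Smaller epsilon means stronger privacy and larger error.
    for eps in (0.1, 1.0, 10.0):
        print(eps, round(mean_abs_error(1000, eps), 3))
```

For Laplace noise the expected absolute error equals the scale, sensitivity / epsilon, so a tenfold drop in epsilon should produce roughly a tenfold increase in error.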
Replicating Results
- For general statistics, run cells in notebooks/results.ipynb.
- Generate all the experimental results by running dp_policy run_all or running chosen experiments individually with dp_policy run [experiment].
- Visualize experiment results with notebooks/policy_experiments.ipynb. (For example, Fig. 1 was produced with statistics from the Epsilon Sensitivity section of notebooks/policy_experiments.ipynb.)
- Produce disparity plots and GAM smooth plots with R/plot_all.R. (For example, Fig. 2 is the race disparity plot for the "hold_harmless" experiment.)
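When only a subset of experiments needs to be re-run, the individual commands can be scripted. The following is a minimal sketch, assuming the dp_policy CLI is installed and on the PATH; it uses only the command forms shown in this README (dp_policy run [experiment] and dp_policy run --just-join [name]). The helper names build_command and run_experiments are hypothetical and not part of the package, and the sketch defaults to a dry run that only prints the commands.

```python
import shlex
import subprocess

# Experiment names as listed in this README.
EXPERIMENTS = [
    "baseline",
    "hold_harmless",
    "thresholds",
    "moving_average",
    "budget",
    "post_processing",
    "epsilon",
    "sampling",
    "vary_total_children",
]


def build_command(name, just_join=False):
    """Build the argument list for one `dp_policy run` invocation."""
    if name not in EXPERIMENTS:
        raise ValueError(f"unknown experiment: {name}")
    cmd = ["dp_policy", "run"]
    if just_join:
        # Only produce the feather file for regression analysis,
        # using cached results.
        cmd.append("--just-join")
    cmd.append(name)
    return cmd


def run_experiments(names, just_join=False, dry_run=True):
    """Print (dry run) or execute the command for each chosen experiment."""
    commands = [build_command(name, just_join) for name in names]
    for cmd in commands:
        if dry_run:
            print(shlex.join(cmd))
        else:
            # check=True stops at the first failing experiment.
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    run_experiments(["baseline", "epsilon"])
```

Passing dry_run=False executes the commands in order; keeping the dry run as the default makes it easy to inspect what would be launched before starting a long server run.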
Contents
- data/
  - discrimination/ - ACS 5-year data for discrimination analysis
  - shapefiles/ - TIGER shapefiles for school districts
  - titlei-allocations - official Department of Education figures, from Todd Stephenson
  - saipe*, county_saipe* - district- and county-level SAIPE data
  - fips_codes.csv - map of FIPS codes to postal codes and state names
  - nslp19.csv - National School Lunch Program data (exploration only)
  - sppe* - state per-pupil expenditure data
- dp_policy/ - codebase
  - titlei/ - submodule for replicating the Title I allocation procedure, with noise
    - allocators.py - allocation procedures
    - bootstrap.py - exploratory functions for sampling experiments
    - evaluation.py - utility functions for evaluating results
    - mechanisms.py - randomization (noise injection) mechanisms
    - thresholders.py - thresholding mechanisms for the formula
    - utils.py - utility functions
  - api.py - endpoints for CLI
  - config.py - settings
  - experiments.py - set of experiment configurations for replicating results
- logs/ - logs for recording runs
- notebooks/ - Jupyter notebooks for exploration and visualization
  - results.ipynb - main notebook for replicating and visualizing auxiliary experiment results
  - policy-experiments.ipynb - notebook for visualizing results of policy experiments
  - nslp.ipynb - exploring NSLP data as an alternative ground truth
  - plot_sampling.ipynb - developing sampling mechanisms
- plots/ - output plots
- R/ - R scripts for regression and visualization
  - exploration.Rmd - exploring results
  - plot_all.R - plots/regressions for all experiments
  - plot_experiment.R - plots/regressions for one experiment
  - plots.R - endpoints for plotting results and running regressions
  - regression_tables.R - endpoint for recording regression tables
  - regressions.Rmd - exploring regression specifications
  - utils.R - utility functions for plotting and regressions
- results/ - cached results files
  - policy_experiments/ - for experiment runs
  - regressions/ - for regressions
- scripts/ - miscellaneous bash scripts to make server runs easier
Documentation
Documentation for the dp-policy API is published at rbsteed.com/dp-policy.
To generate the documentation, use pdoc3:
pdoc --html --output-dir docs --force dp_policy --template-dir docs/templates
git subtree push --prefix docs/dp_policy origin gh-pages
Sub-modules
dp_policy.api
dp_policy.config
dp_policy.experiments
dp_policy.titlei