Summarizing OpenPoliceData Data
This notebook shows examples of the following tasks:
- Summarizing available data in OpenPoliceData (OPD)
- Exporting data summaries
- Generating your own data summaries
[2]:
import openpolicedata as opd
[3]:
# Get the number of unique datasets (unique combinations of state, source, agency, and table type)
print(f"The OpenPoliceData package has {opd.datasets.num_unique()} unique datasets")
The OpenPoliceData package has 425 unique datasets
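The count above treats a dataset as a unique combination of state, source, agency, and table type. A minimal sketch of that idea in pandas follows, assuming a source table with one row per dataset-year. The rows and the column names (`State`, `SourceName`, `Agency`, `TableType`, `Year`) are hypothetical stand-ins chosen for illustration; this is not the package's own implementation.

```python
import pandas as pd

# Hypothetical stand-in rows (not real OPD entries): one row per dataset-year.
rows = [
    ("State A", "State A", "MULTIPLE", "STOPS", 2021),
    ("State A", "State A", "MULTIPLE", "STOPS", 2022),  # same dataset, another year
    ("State A", "City 1", "City 1", "ARRESTS", 2022),
    ("State A", "City 1", "City 1", "CITATIONS", 2022),
    ("State B", "City 2", "City 2", "ARRESTS", 2022),
]
toy = pd.DataFrame(rows, columns=["State", "SourceName", "Agency", "TableType", "Year"])

# A dataset is identified by these four columns; the year is ignored,
# so the two STOPS rows for State A collapse into one dataset.
key = ["State", "SourceName", "Agency", "TableType"]
num_unique = len(toy.drop_duplicates(subset=key))
print(f"The toy table has {num_unique} unique datasets")
```

Dropping duplicates on the key columns, rather than counting rows, is what keeps multi-year datasets from being counted once per year.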
[11]:
# Find how many states have data covering all reporting agencies and how many individual agencies have data
print(f"OPD has at least 1 dataset for all reporting agencies in {opd.datasets.num_sources(full_states_only=True)} states")
print(f"OPD has at least 1 dataset for {opd.datasets.num_sources() - opd.datasets.num_sources(full_states_only=True)} individual agencies")
OPD has at least 1 dataset for all reporting agencies in 10 states
OPD has at least 1 dataset for 158 individual agencies
[5]:
# Find number of datasets from each state
opd.datasets.summary_by_state().head(10)
[5]:
| State | Total |
|---|---|
| California | |
| All State Agencies | 2 |
| Individual Agency | 58 |
| North Carolina | |
| All State Agencies | 1 |
| Individual Agency | 31 |
| New York | |
| All State Agencies | 1 |
| Individual Agency | 29 |
| Arizona | 23 |
[6]:
# Find number of datasets from each state broken down by year
opd.datasets.summary_by_state(by="year").head(7)
[6]:
| State | Total | N/A | MULTI-YEAR | 2024 | 2023 | 2022 | 2021 | 2020 | 2019 | 2018 | ... | 2012 | 2011 | 2010 | 2009 | 2008 | 2007 | 2006 | 2005 | 2004 | 2003 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| California | | | | | | | | | | | ... | | | | | | | | | | |
| All State Agencies | 2 | 0 | 1 | 0 | 0 | 1 | 1 | 1 | 1 | 1 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Individual Agency | 58 | 3 | 42 | 4 | 7 | 9 | 11 | 11 | 14 | 10 | ... | 3 | 2 | 2 | 1 | 1 | 1 | 1 | 0 | 0 | 0 |
| North Carolina | | | | | | | | | | | ... | | | | | | | | | | |
| All State Agencies | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Individual Agency | 31 | 3 | 26 | 0 | 1 | 2 | 2 | 2 | 2 | 2 | ... | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 1 | 0 | 0 |
| New York | | | | | | | | | | | ... | | | | | | | | | | |
7 rows × 25 columns
[7]:
# Find number of datasets from each state broken down by table type
opd.datasets.summary_by_state(by="table").head(7)
[7]:
| State | Total | ARRESTS | CALLS FOR SERVICE | CITATIONS | COMPLAINTS | COMPLAINTS - ALLEGATIONS | COMPLAINTS - BACKGROUND | COMPLAINTS - BODY WORN CAMERA | COMPLAINTS - OFFICERS | COMPLAINTS - PENALTIES | ... | TRAFFIC STOPS | TRAFFIC STOPS - INCIDENTS | TRAFFIC STOPS - SUBJECTS | TRAFFIC WARNINGS | USE OF FORCE | USE OF FORCE - INCIDENTS | USE OF FORCE - OFFICERS | USE OF FORCE - SUBJECTS | USE OF FORCE - SUBJECTS/OFFICERS | VEHICLE PURSUITS |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| California | | | | | | | | | | | ... | | | | | | | | | | |
| All State Agencies | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 |
| Individual Agency | 58 | 0 | 10 | 3 | 2 | 1 | 1 | 1 | 0 | 0 | ... | 7 | 0 | 0 | 0 | 3 | 0 | 0 | 0 | 1 | 2 |
| North Carolina | | | | | | | | | | | ... | | | | | | | | | | |
| All State Agencies | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Individual Agency | 31 | 3 | 3 | 2 | 1 | 0 | 0 | 0 | 0 | 0 | ... | 7 | 1 | 1 | 0 | 1 | 0 | 1 | 1 | 0 | 0 |
| New York | | | | | | | | | | | ... | | | | | | | | | | |
7 rows × 47 columns
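A state-by-table-type breakdown like the one above is a cross-tabulation: one row per state, one column per table type, and a count in each cell. The sketch below builds a comparable table with `pd.crosstab`. The rows and the column names `State` and `TableType` are hypothetical stand-ins for illustration, and the approach is an assumption about how such a table could be built, not OPD's own code.

```python
import pandas as pd

# Hypothetical stand-in rows (not real OPD entries): one row per dataset.
toy = pd.DataFrame({
    "State": ["State A", "State A", "State A", "State B", "State B"],
    "TableType": ["STOPS", "ARRESTS", "STOPS", "USE OF FORCE", "STOPS"],
})

# Count datasets for every (state, table type) pair; missing pairs become 0.
by_table = pd.crosstab(toy["State"], toy["TableType"])

# Add a leading Total column and list the states with the most datasets first.
by_table.insert(0, "Total", by_table.sum(axis=1))
by_table = by_table.sort_values("Total", ascending=False)
print(by_table)
```

Swapping `TableType` for a year column would give the corresponding by-year breakdown.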
[9]:
# Find number of datasets for each type of table
opd.datasets.summary_by_table_type()
[9]:
| TableType | Total | Definition |
|---|---|---|
| STOPS-RELATED | | |
| Single Table | | |
| STOPS | 37 | Contains data on both pedestrian and traffic s... |
| Multi-Table | | |
| TRAFFIC STOPS | 71 | Traffic stops are stops by police of motor veh... |
| ... | ... | ... |
| POINTING WEAPON | 2 | Instances of officers pointing a weapon (firea... |
| LAWSUITS | 2 | Lawsuits against a police department |
| INCIDENTS - SUBJECTS | 1 | Incidents data may be split into several table... |
| INCIDENTS - INCIDENTS | 1 | Incidents data may be split into several table... |
| DISCIPLINARY RECORDS | 1 | Disciplinary records of officers |
64 rows × 2 columns
[10]:
# Find number of datasets for each type of table broken down by year
opd.datasets.summary_by_table_type(by_year=True).head()
[10]:
| TableType | Total | N/A | MULTI-YEAR | 2024 | 2023 | 2022 | 2021 | 2020 | 2019 | 2018 | ... | 2011 | 2010 | 2009 | 2008 | 2007 | 2006 | 2005 | 2004 | 2003 | Definition |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| STOPS-RELATED | | | | | | | | | | | ... | | | | | | | | | | |
| Single Table | | | | | | | | | | | ... | | | | | | | | | | |
| STOPS | 37 | 0 | 34 | 0 | 3 | 3 | 3 | 2 | 2 | 2 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | Contains data on both pedestrian and traffic s... |
| Multi-Table | | | | | | | | | | | ... | | | | | | | | | | |
| TRAFFIC STOPS | 71 | 0 | 67 | 0 | 3 | 5 | 5 | 6 | 6 | 6 | ... | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | Traffic stops are stops by police of motor veh... |
5 rows × 26 columns
[ ]:
# All returned summary tables are pandas DataFrames, so they can be exported to CSV files with DataFrame.to_csv.
# Export the number of datasets for each type of table, broken down by year
opd.datasets.summary_by_table_type(by_year=True).to_csv("table_summary_by_year.csv")
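Because the summaries are ordinary DataFrames, the same pandas tools can be used to generate your own summaries and to export them. The sketch below, under stated assumptions, builds a small custom summary (distinct agencies and distinct table types per state), writes it to CSV, and reads it back. The rows, the column names, and the file name `custom_summary.csv` are hypothetical and used only for illustration.

```python
import os
import tempfile

import pandas as pd

# Hypothetical stand-in rows (not real OPD entries): one row per dataset.
toy = pd.DataFrame({
    "State": ["State A", "State A", "State A", "State B"],
    "Agency": ["Agency 1", "Agency 1", "Agency 2", "Agency 3"],
    "TableType": ["STOPS", "ARRESTS", "STOPS", "STOPS"],
})

# Custom summary: number of distinct agencies and table types in each state.
custom = toy.groupby("State").agg(
    Agencies=("Agency", "nunique"),
    TableTypes=("TableType", "nunique"),
)

# Export to CSV, then reload using the State column as the index again.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "custom_summary.csv")
    custom.to_csv(path)
    reloaded = pd.read_csv(path, index_col="State")

print(reloaded)
```

Passing `index_col` when reading the file restores the row labels that `to_csv` wrote as the first column.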