Loading Datasets

This notebook shows an example of how to load a dataset. It assumes you found the dataset using the techniques shown in finding_datasets.ipynb. The basic steps it demonstrates are:

1. Find available datasets with opd.datasets.query.
2. Create a data source using opd.Source and information from the previous step.
3. Find the available data types, and the years available for a given type, using get_tables_types and get_years.
4. Load the data type for a given year using load.
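
Steps 2 through 4 can be collected into one small helper. The sketch below is only an illustration: load_table is a hypothetical name, not part of openpolicedata, and it takes the imported module as an argument so that it only relies on the calls used in this notebook (Source, get_tables_types, get_years and load).

```python
def load_table(opd_module, source_name, state, table_type, year):
    """Hypothetical helper: create a source, validate the request, load the table.

    opd_module is expected to be the imported openpolicedata module.
    """
    # Step 2: create the data source
    src = opd_module.Source(source_name=source_name, state=state)

    # Step 3: confirm that the requested table type and year are available
    types = src.get_tables_types()
    if table_type not in types:
        raise ValueError(f"{table_type!r} not available; choose from {types}")
    years = src.get_years(table_type=table_type)
    if year not in years:
        raise ValueError(f"{year} not available for {table_type!r}; choose from {years}")

    # Step 4: load the data
    return src.load(year=year, table_type=table_type)
```

With the library imported as opd, the call for this notebook would be load_table(opd, "Montgomery County", "Maryland", "TRAFFIC STOPS", 2021).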

[1]:
import openpolicedata as opd
[2]:
# We will load Montgomery County, Maryland traffic stop data. First, show our dataset options.
df = opd.datasets.query(table_type='TRAFFIC STOPS', state="Maryland")
df.head()
[2]:
State SourceName Agency AgencyFull TableType coverage_start coverage_end last_coverage_check Description source_url readme URL Year DataType date_field dataset_id agency_field min_version query
479 Maryland Maryland MULTIPLE NaN TRAFFIC STOPS 2007-01-01 2014-03-31 01/10/2024 Standardized stop data from the Stanford Open ... https://openpolicing.stanford.edu/data/ https://github.com/stanford-policylab/opp/blob... https://stacks.stanford.edu/file/druid:yg821jf... MULTIPLE CSV date <NA> department_name <NA> NaN
485 Maryland Montgomery County Montgomery County Montgomery County Police Department TRAFFIC STOPS 2012-06-07 2024-05-09 05/10/2024 This dataset contains traffic violation inform... https://data.montgomerycountymd.gov/Public-Saf... <NA> data.montgomerycountymd.gov MULTIPLE Socrata date_of_stop 4mse-ku6q <NA> <NA> NaN
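
The coverage_start and coverage_end columns indicate which of the two rows can supply a given year. As a rough sketch (year_in_coverage is a hypothetical helper, and the date strings are copied from the output above), the check for 2021 might look like this:

```python
from datetime import date

def year_in_coverage(year, coverage_start, coverage_end):
    """Return True if `year` falls within the coverage dates (ISO strings)."""
    start = date.fromisoformat(coverage_start)
    end = date.fromisoformat(coverage_end)
    return start.year <= year <= end.year

# Montgomery County row: covers 2012-06-07 through 2024-05-09
print(year_in_coverage(2021, "2012-06-07", "2024-05-09"))  # True

# Statewide Stanford row: covers 2007-01-01 through 2014-03-31
print(year_in_coverage(2021, "2007-01-01", "2014-03-31"))  # False
```

This only compares calendar years, so the first and last years of a range may be partial.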
[3]:
# To access the data, create a source using a source name (usually a police department name). There is an optional state input to resolve ambiguities.
# Based on the Maryland results in the cell above, we pass the agency "Montgomery County" as the source_name.

src = opd.Source(source_name="Montgomery County", state="Maryland")
src.datasets.head()
[3]:
State SourceName Agency AgencyFull TableType coverage_start coverage_end last_coverage_check Description source_url readme URL Year DataType date_field dataset_id agency_field min_version query
480 Maryland Montgomery County Montgomery County Montgomery County Police Department COMPLAINTS 2013-10-24 2024-05-06 05/10/2024 This dataset contains allegations brought to t... https://data.montgomerycountymd.gov/Public-Saf... <NA> data.montgomerycountymd.gov MULTIPLE Socrata created_dt usip-62e2 <NA> <NA> NaN
481 Maryland Montgomery County Montgomery County Montgomery County Police Department CRASHES - INCIDENTS 2015-12-20 2024-01-03 05/10/2024 general information about each collision and d... https://data.montgomerycountymd.gov/Public-Saf... <NA> data.montgomerycountymd.gov MULTIPLE Socrata crash_date_time bhju-22kf <NA> 0.4 NaN
482 Maryland Montgomery County Montgomery County Montgomery County Police Department CRASHES - NONMOTORIST 2015-03-23 2023-12-31 05/10/2024 information on non-motorists (pedestrians and ... https://data.montgomerycountymd.gov/Public-Saf... <NA> data.montgomerycountymd.gov MULTIPLE Socrata crash_date_time n7fk-dce5 <NA> 0.5 NaN
483 Maryland Montgomery County Montgomery County Montgomery County Police Department CRASHES - SUBJECTS 2015-06-30 2024-01-03 05/10/2024 information on motor vehicle operators (driver... https://data.montgomerycountymd.gov/Public-Saf... <NA> data.montgomerycountymd.gov MULTIPLE Socrata crash_date_time mmzv-x632 <NA> 0.4 NaN
484 Maryland Montgomery County Montgomery County Montgomery County Police Department INCIDENTS 2017-04-02 2024-05-10 05/10/2024 list of Police Dispatched Incidents records https://data.montgomerycountymd.gov/Public-Saf... <NA> data.montgomerycountymd.gov MULTIPLE Socrata start_time 98cc-bc7d <NA> <NA> NaN
[4]:
# Find out what types of data are available from this source
types = src.get_tables_types()

print(types)
['COMPLAINTS', 'CRASHES - INCIDENTS', 'CRASHES - NONMOTORIST', 'CRASHES - SUBJECTS', 'INCIDENTS', 'TRAFFIC STOPS']
[5]:
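# Find out what years are available for a table type.
# Note: types[0] is 'COMPLAINTS', so the years printed below are for the complaints table. Pass table_type='TRAFFIC STOPS' to check the stops table.
# If you do not have a key set up, you may see the message: "WARNING:root:Requests made without an app_token will be subject to strict throttling limits." This is normal.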
years = src.get_years(table_type=types[0])
print(years)
[2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020, 2021, 2022, 2023, 2024]
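
Since get_tables_types returns a plain list of strings, selecting a table type by position depends on the order of that list. A minimal sketch, using the list printed in the earlier cell, of looking the type up by name instead:

```python
# Table types copied from the get_tables_types output above
types = ['COMPLAINTS', 'CRASHES - INCIDENTS', 'CRASHES - NONMOTORIST',
         'CRASHES - SUBJECTS', 'INCIDENTS', 'TRAFFIC STOPS']

wanted = 'TRAFFIC STOPS'

# list.index raises ValueError if the name is not present
position = types.index(wanted)

print(types[0])         # COMPLAINTS
print(position)         # 5
print(types[position])  # TRAFFIC STOPS
```

Passing the name directly, as in get_years(table_type='TRAFFIC STOPS'), avoids relying on list order at all.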
[6]:
# Load traffic stop data for 2021
t = src.load(year=2021, table_type='TRAFFIC STOPS')
[7]:
# The loaded table is stored in the table attribute as a pandas DataFrame (https://pandas.pydata.org/docs/user_guide/10min.html#min)
# Show the first 5 rows of the table
t.table.head(n=5)
# Now you are ready to analyze the data in t.table.
[7]:
geometry seq_id date_of_stop time_of_stop agency subagency description location latitude longitude ... driver_state dl_state arrest_type search_conducted search_outcome search_reason_for_stop search_disposition search_reason search_type search_arrest_reason
0 POINT (-77.13047 39.01268) f08d0293-6ade-4802-84c1-4b7b1a707245 2021-01-01 03:12:00 MCP 2nd District, Bethesda RECKLESS DRIVING VEHICLE IN WANTON AND WILLFUL... IFO 9609 SINGLETON DR 39.0126813333333 -77.130466 ... MD MD A - Marked Patrol NaN NaN NaN NaN NaN NaN NaN
1 POINT (-77.13047 39.01268) f08d0293-6ade-4802-84c1-4b7b1a707245 2021-01-01 03:12:00 MCP 2nd District, Bethesda FAILURE OF VEH. DRIVER IN ACCIDENT TO LOCATE A... IFO 9609 SINGLETON DR 39.0126813333333 -77.130466 ... MD MD A - Marked Patrol NaN NaN NaN NaN NaN NaN NaN
2 POINT (-77.13047 39.01268) f08d0293-6ade-4802-84c1-4b7b1a707245 2021-01-01 03:12:00 MCP 2nd District, Bethesda NEGLIGENT DRIVING VEHICLE IN CARELESS AND IMPR... IFO 9609 SINGLETON DR 39.0126813333333 -77.130466 ... MD MD A - Marked Patrol NaN NaN NaN NaN NaN NaN NaN
3 POINT (-77.13047 39.01268) f08d0293-6ade-4802-84c1-4b7b1a707245 2021-01-01 03:12:00 MCP 2nd District, Bethesda FAILURE OF VEH. DRIVER TO STOP AFTER UNATTENDE... IFO 9609 SINGLETON DR 39.0126813333333 -77.130466 ... MD MD A - Marked Patrol NaN NaN NaN NaN NaN NaN NaN
4 POINT (-77.13047 39.01268) f08d0293-6ade-4802-84c1-4b7b1a707245 2021-01-01 03:12:00 MCP 2nd District, Bethesda FAILURE OF VEH. DRIVER INVOLVED IN ACCIDENT TO... IFO 9609 SINGLETON DR 39.0126813333333 -77.130466 ... MD MD A - Marked Patrol NaN NaN NaN NaN NaN NaN NaN

5 rows × 43 columns
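
The five rows shown share one seq_id but have different descriptions, which suggests that each row is a violation rather than a stop. Assuming that holds for the whole table, a first analysis step could compare the row count with the number of distinct seq_id values. The helper below is a hypothetical sketch that expects a pandas DataFrame such as t.table:

```python
def rows_and_stops(table, id_column="seq_id"):
    """Return (number of rows, number of distinct ids) for a pandas DataFrame.

    Assumes that `id_column` identifies a single stop, as the output above suggests.
    """
    row_count = len(table)
    stop_count = table[id_column].nunique()
    return row_count, stop_count
```

Calling rows_and_stops(t.table) would then show how many rows correspond to each stop on average.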