{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Loading Datasets" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This notebook shows an example of how to load a dataset. \n", "It assumes you found the dataset using the techniques shown in `finding_datasets.ipynb`.\n", "The basic steps to load data are:\n", "1. Find available datasets with `opd.datasets.query`.\n", "2. Create a data source using `opd.Source` and information from the previous step.\n", "3. Find the available data types and years using `get_tables_types` and `get_years`.\n", "4. Load a data type for a given year using `load`." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import openpolicedata as opd" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\"><th></th><th>State</th><th>SourceName</th><th>Agency</th><th>AgencyFull</th><th>TableType</th><th>coverage_start</th><th>coverage_end</th><th>last_coverage_check</th><th>Description</th><th>source_url</th><th>readme</th><th>URL</th><th>Year</th><th>DataType</th><th>date_field</th><th>dataset_id</th><th>agency_field</th><th>min_version</th><th>query</th></tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr><th>479</th><td>Maryland</td><td>Maryland</td><td>MULTIPLE</td><td>NaN</td><td>TRAFFIC STOPS</td><td>2007-01-01</td><td>2014-03-31</td><td>01/10/2024</td><td>Standardized stop data from the Stanford Open ...</td><td>https://openpolicing.stanford.edu/data/</td><td>https://github.com/stanford-policylab/opp/blob...</td><td>https://stacks.stanford.edu/file/druid:yg821jf...</td><td>MULTIPLE</td><td>CSV</td><td>date</td><td>&lt;NA&gt;</td><td>department_name</td><td>&lt;NA&gt;</td><td>NaN</td></tr>\n",
" <tr><th>485</th><td>Maryland</td><td>Montgomery County</td><td>Montgomery County</td><td>Montgomery County Police Department</td><td>TRAFFIC STOPS</td><td>2012-06-07</td><td>2024-05-09</td><td>05/10/2024</td><td>This dataset contains traffic violation inform...</td><td>https://data.montgomerycountymd.gov/Public-Saf...</td><td>&lt;NA&gt;</td><td>data.montgomerycountymd.gov</td><td>MULTIPLE</td><td>Socrata</td><td>date_of_stop</td><td>4mse-ku6q</td><td>&lt;NA&gt;</td><td>&lt;NA&gt;</td><td>NaN</td></tr>\n",
" </tbody>\n",
"</table>\n",
"</div>" ],
"text/plain": [
" State SourceName Agency \\\n",
"479 Maryland Maryland MULTIPLE \n",
"485 Maryland Montgomery County Montgomery County \n",
"\n",
" AgencyFull TableType coverage_start \\\n",
"479 NaN TRAFFIC STOPS 2007-01-01 \n",
"485 Montgomery County Police Department TRAFFIC STOPS 2012-06-07 \n",
"\n",
" coverage_end last_coverage_check \\\n",
"479 2014-03-31 01/10/2024 \n",
"485 2024-05-09 05/10/2024 \n",
"\n",
" Description \\\n",
"479 Standardized stop data from the Stanford Open ... \n",
"485 This dataset contains traffic violation inform... \n",
"\n",
" source_url \\\n",
"479 https://openpolicing.stanford.edu/data/ \n",
"485 https://data.montgomerycountymd.gov/Public-Saf... \n",
"\n",
" readme \\\n",
"479 https://github.com/stanford-policylab/opp/blob... \n",
"485 <NA> \n",
"\n",
" URL Year DataType \\\n",
"479 https://stacks.stanford.edu/file/druid:yg821jf... MULTIPLE CSV \n",
"485 data.montgomerycountymd.gov MULTIPLE Socrata \n",
"\n",
" date_field dataset_id agency_field min_version query \n",
"479 date <NA> department_name <NA> NaN \n",
"485 date_of_stop 4mse-ku6q <NA> <NA> NaN " ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [
"# We will load Montgomery County, Maryland traffic stop data. First, list the available datasets.\n",
"df = opd.datasets.query(table_type='TRAFFIC STOPS', state=\"Maryland\")\n",
"df.head()" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\"><th></th><th>State</th><th>SourceName</th><th>Agency</th><th>AgencyFull</th><th>TableType</th><th>coverage_start</th><th>coverage_end</th><th>last_coverage_check</th><th>Description</th><th>source_url</th><th>readme</th><th>URL</th><th>Year</th><th>DataType</th><th>date_field</th><th>dataset_id</th><th>agency_field</th><th>min_version</th><th>query</th></tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr><th>480</th><td>Maryland</td><td>Montgomery County</td><td>Montgomery County</td><td>Montgomery County Police Department</td><td>COMPLAINTS</td><td>2013-10-24</td><td>2024-05-06</td><td>05/10/2024</td><td>This dataset contains allegations brought to t...</td><td>https://data.montgomerycountymd.gov/Public-Saf...</td><td>&lt;NA&gt;</td><td>data.montgomerycountymd.gov</td><td>MULTIPLE</td><td>Socrata</td><td>created_dt</td><td>usip-62e2</td><td>&lt;NA&gt;</td><td>&lt;NA&gt;</td><td>NaN</td></tr>\n",
" <tr><th>481</th><td>Maryland</td><td>Montgomery County</td><td>Montgomery County</td><td>Montgomery County Police Department</td><td>CRASHES - INCIDENTS</td><td>2015-12-20</td><td>2024-01-03</td><td>05/10/2024</td><td>general information about each collision and d...</td><td>https://data.montgomerycountymd.gov/Public-Saf...</td><td>&lt;NA&gt;</td><td>data.montgomerycountymd.gov</td><td>MULTIPLE</td><td>Socrata</td><td>crash_date_time</td><td>bhju-22kf</td><td>&lt;NA&gt;</td><td>0.4</td><td>NaN</td></tr>\n",
" <tr><th>482</th><td>Maryland</td><td>Montgomery County</td><td>Montgomery County</td><td>Montgomery County Police Department</td><td>CRASHES - NONMOTORIST</td><td>2015-03-23</td><td>2023-12-31</td><td>05/10/2024</td><td>information on non-motorists (pedestrians and ...</td><td>https://data.montgomerycountymd.gov/Public-Saf...</td><td>&lt;NA&gt;</td><td>data.montgomerycountymd.gov</td><td>MULTIPLE</td><td>Socrata</td><td>crash_date_time</td><td>n7fk-dce5</td><td>&lt;NA&gt;</td><td>0.5</td><td>NaN</td></tr>\n",
" <tr><th>483</th><td>Maryland</td><td>Montgomery County</td><td>Montgomery County</td><td>Montgomery County Police Department</td><td>CRASHES - SUBJECTS</td><td>2015-06-30</td><td>2024-01-03</td><td>05/10/2024</td><td>information on motor vehicle operators (driver...</td><td>https://data.montgomerycountymd.gov/Public-Saf...</td><td>&lt;NA&gt;</td><td>data.montgomerycountymd.gov</td><td>MULTIPLE</td><td>Socrata</td><td>crash_date_time</td><td>mmzv-x632</td><td>&lt;NA&gt;</td><td>0.4</td><td>NaN</td></tr>\n",
" <tr><th>484</th><td>Maryland</td><td>Montgomery County</td><td>Montgomery County</td><td>Montgomery County Police Department</td><td>INCIDENTS</td><td>2017-04-02</td><td>2024-05-10</td><td>05/10/2024</td><td>list of Police Dispatched Incidents records</td><td>https://data.montgomerycountymd.gov/Public-Saf...</td><td>&lt;NA&gt;</td><td>data.montgomerycountymd.gov</td><td>MULTIPLE</td><td>Socrata</td><td>start_time</td><td>98cc-bc7d</td><td>&lt;NA&gt;</td><td>&lt;NA&gt;</td><td>NaN</td></tr>\n",
" </tbody>\n",
"</table>\n",
"</div>" ],
"text/plain": [
" State SourceName Agency \\\n",
"480 Maryland Montgomery County Montgomery County \n",
"481 Maryland Montgomery County Montgomery County \n",
"482 Maryland Montgomery County Montgomery County \n",
"483 Maryland Montgomery County Montgomery County \n",
"484 Maryland Montgomery County Montgomery County \n",
"\n",
" AgencyFull TableType \\\n",
"480 Montgomery County Police Department COMPLAINTS \n",
"481 Montgomery County Police Department CRASHES - INCIDENTS \n",
"482 Montgomery County Police Department CRASHES - NONMOTORIST \n",
"483 Montgomery County Police Department CRASHES - SUBJECTS \n",
"484 Montgomery County Police Department INCIDENTS \n",
"\n",
" coverage_start coverage_end last_coverage_check \\\n",
"480 2013-10-24 2024-05-06 05/10/2024 \n",
"481 2015-12-20 2024-01-03 05/10/2024 \n",
"482 2015-03-23 2023-12-31 05/10/2024 \n",
"483 2015-06-30 2024-01-03 05/10/2024 \n",
"484 2017-04-02 2024-05-10 05/10/2024 \n",
"\n",
" Description \\\n",
"480 This dataset contains allegations brought to t... \n",
"481 general information about each collision and d... \n",
"482 information on non-motorists (pedestrians and ... \n",
"483 information on motor vehicle operators (driver... \n",
"484 list of Police Dispatched Incidents records \n",
"\n",
" source_url readme \\\n",
"480 https://data.montgomerycountymd.gov/Public-Saf... <NA> \n",
"481 https://data.montgomerycountymd.gov/Public-Saf... <NA> \n",
"482 https://data.montgomerycountymd.gov/Public-Saf... <NA> \n",
"483 https://data.montgomerycountymd.gov/Public-Saf... <NA> \n",
"484 https://data.montgomerycountymd.gov/Public-Saf... <NA> \n",
"\n",
" URL Year DataType date_field \\\n",
"480 data.montgomerycountymd.gov MULTIPLE Socrata created_dt \n",
"481 data.montgomerycountymd.gov MULTIPLE Socrata crash_date_time \n",
"482 data.montgomerycountymd.gov MULTIPLE Socrata crash_date_time \n",
"483 data.montgomerycountymd.gov MULTIPLE Socrata crash_date_time \n",
"484 data.montgomerycountymd.gov MULTIPLE Socrata start_time \n",
"\n",
" dataset_id agency_field min_version query \n",
"480 usip-62e2 <NA> <NA> NaN \n",
"481 bhju-22kf <NA> 0.4 NaN \n",
"482 n7fk-dce5 <NA> 0.5 NaN \n",
"483 mmzv-x632 <NA> 0.4 NaN \n",
"484 98cc-bc7d <NA> <NA> NaN " ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [
"# To access the data, create a source from a source name (usually a police department name). The optional state input resolves ambiguity between sources that share a name.\n",
"# Using the Maryland query results from the cell above, we pass the SourceName \"Montgomery County\" as source_name.\n",
"\n",
"src = opd.Source(source_name=\"Montgomery County\", state=\"Maryland\")\n",
"src.datasets.head()" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "['COMPLAINTS', 'CRASHES - INCIDENTS', 'CRASHES - NONMOTORIST', 'CRASHES - SUBJECTS', 'INCIDENTS', 'TRAFFIC STOPS']\n" ] } ], "source": [
"# Find out what types of data are available from this source\n",
"types = src.get_tables_types()\n",
"\n",
"print(types)" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020, 2021, 2022, 2023, 2024]\n" ] } ], "source": [
"# Find out what years are available for a table type. Note that types[0] is 'COMPLAINTS'; pass table_type='TRAFFIC STOPS' to list the years for the stops table.\n",
"# If you do not have a key set up, you may see the message: \"WARNING:root:Requests made without an app_token will be subject to strict throttling limits.\" This is normal.\n",
"years = src.get_years(table_type=types[0])\n",
"print(years)" ] 
}, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "# Load traffic stop data for 2021\n", "t = src.load(year=2021, table_type='TRAFFIC STOPS')" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\"><th></th><th>geometry</th><th>seq_id</th><th>date_of_stop</th><th>time_of_stop</th><th>agency</th><th>subagency</th><th>description</th><th>location</th><th>latitude</th><th>longitude</th><th>...</th><th>driver_state</th><th>dl_state</th><th>arrest_type</th><th>search_conducted</th><th>search_outcome</th><th>search_reason_for_stop</th><th>search_disposition</th><th>search_reason</th><th>search_type</th><th>search_arrest_reason</th></tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr><th>0</th><td>POINT (-77.13047 39.01268)</td><td>f08d0293-6ade-4802-84c1-4b7b1a707245</td><td>2021-01-01</td><td>03:12:00</td><td>MCP</td><td>2nd District, Bethesda</td><td>RECKLESS DRIVING VEHICLE IN WANTON AND WILLFUL...</td><td>IFO 9609 SINGLETON DR</td><td>39.0126813333333</td><td>-77.130466</td><td>...</td><td>MD</td><td>MD</td><td>A - Marked Patrol</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
" <tr><th>1</th><td>POINT (-77.13047 39.01268)</td><td>f08d0293-6ade-4802-84c1-4b7b1a707245</td><td>2021-01-01</td><td>03:12:00</td><td>MCP</td><td>2nd District, Bethesda</td><td>FAILURE OF VEH. DRIVER IN ACCIDENT TO LOCATE A...</td><td>IFO 9609 SINGLETON DR</td><td>39.0126813333333</td><td>-77.130466</td><td>...</td><td>MD</td><td>MD</td><td>A - Marked Patrol</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
" <tr><th>2</th><td>POINT (-77.13047 39.01268)</td><td>f08d0293-6ade-4802-84c1-4b7b1a707245</td><td>2021-01-01</td><td>03:12:00</td><td>MCP</td><td>2nd District, Bethesda</td><td>NEGLIGENT DRIVING VEHICLE IN CARELESS AND IMPR...</td><td>IFO 9609 SINGLETON DR</td><td>39.0126813333333</td><td>-77.130466</td><td>...</td><td>MD</td><td>MD</td><td>A - Marked Patrol</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
" <tr><th>3</th><td>POINT (-77.13047 39.01268)</td><td>f08d0293-6ade-4802-84c1-4b7b1a707245</td><td>2021-01-01</td><td>03:12:00</td><td>MCP</td><td>2nd District, Bethesda</td><td>FAILURE OF VEH. DRIVER TO STOP AFTER UNATTENDE...</td><td>IFO 9609 SINGLETON DR</td><td>39.0126813333333</td><td>-77.130466</td><td>...</td><td>MD</td><td>MD</td><td>A - Marked Patrol</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
" <tr><th>4</th><td>POINT (-77.13047 39.01268)</td><td>f08d0293-6ade-4802-84c1-4b7b1a707245</td><td>2021-01-01</td><td>03:12:00</td><td>MCP</td><td>2nd District, Bethesda</td><td>FAILURE OF VEH. DRIVER INVOLVED IN ACCIDENT TO...</td><td>IFO 9609 SINGLETON DR</td><td>39.0126813333333</td><td>-77.130466</td><td>...</td><td>MD</td><td>MD</td><td>A - Marked Patrol</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td><td>NaN</td></tr>\n",
" </tbody>\n",
"</table>\n",
"<p>5 rows × 43 columns</p>\n",
"</div>" ],
"text/plain": [
" geometry seq_id \\\n",
"0 POINT (-77.13047 39.01268) f08d0293-6ade-4802-84c1-4b7b1a707245 \n",
"1 POINT (-77.13047 39.01268) f08d0293-6ade-4802-84c1-4b7b1a707245 \n",
"2 POINT (-77.13047 39.01268) f08d0293-6ade-4802-84c1-4b7b1a707245 \n",
"3 POINT (-77.13047 39.01268) f08d0293-6ade-4802-84c1-4b7b1a707245 \n",
"4 POINT (-77.13047 39.01268) f08d0293-6ade-4802-84c1-4b7b1a707245 \n",
"\n",
" date_of_stop time_of_stop agency subagency \\\n",
"0 2021-01-01 03:12:00 MCP 2nd District, Bethesda \n",
"1 2021-01-01 03:12:00 MCP 2nd District, Bethesda \n",
"2 2021-01-01 03:12:00 MCP 2nd District, Bethesda \n",
"3 2021-01-01 03:12:00 MCP 2nd District, Bethesda \n",
"4 2021-01-01 03:12:00 MCP 2nd District, Bethesda \n",
"\n",
" description location \\\n",
"0 RECKLESS DRIVING VEHICLE IN WANTON AND WILLFUL... IFO 9609 SINGLETON DR \n",
"1 FAILURE OF VEH. DRIVER IN ACCIDENT TO LOCATE A... IFO 9609 SINGLETON DR \n",
"2 NEGLIGENT DRIVING VEHICLE IN CARELESS AND IMPR... IFO 9609 SINGLETON DR \n",
"3 FAILURE OF VEH. DRIVER TO STOP AFTER UNATTENDE... IFO 9609 SINGLETON DR \n",
"4 FAILURE OF VEH. DRIVER INVOLVED IN ACCIDENT TO... IFO 9609 SINGLETON DR \n",
"\n",
" latitude longitude ... driver_state dl_state arrest_type \\\n",
"0 39.0126813333333 -77.130466 ... MD MD A - Marked Patrol \n",
"1 39.0126813333333 -77.130466 ... MD MD A - Marked Patrol \n",
"2 39.0126813333333 -77.130466 ... MD MD A - Marked Patrol \n",
"3 39.0126813333333 -77.130466 ... MD MD A - Marked Patrol \n",
"4 39.0126813333333 -77.130466 ... MD MD A - Marked Patrol \n",
"\n",
" search_conducted search_outcome search_reason_for_stop search_disposition \\\n",
"0 NaN NaN NaN NaN \n",
"1 NaN NaN NaN NaN \n",
"2 NaN NaN NaN NaN \n",
"3 NaN NaN NaN NaN \n",
"4 NaN NaN NaN NaN \n",
"\n",
" search_reason search_type search_arrest_reason \n",
"0 NaN NaN NaN \n",
"1 NaN NaN NaN \n",
"2 NaN NaN NaN \n",
"3 NaN NaN NaN \n",
"4 NaN NaN NaN \n",
"\n",
"[5 rows x 43 columns]" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [
"# The loaded table is stored in the table attribute as a pandas DataFrame (https://pandas.pydata.org/docs/user_guide/10min.html#min)\n",
"# Show the first 5 rows of the table\n",
"t.table.head(n=5)\n",
"# You are now ready to analyze the data in t.table." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3.9.12 ('opd')", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.7" }, "orig_nbformat": 4, "vscode": { "interpreter": { "hash": "a73158d29711b2da05ac73de25b71e5d8cae591f14917bba77a9573b5c85a0ce" } } }, "nbformat": 4, "nbformat_minor": 2 }