{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Related Tables\n", "In some cases, an agency will split data on a topic across multiple tables to enable better representation of many-to-one relationships. For example, a single use of force incident can involve multiple subjects and/or officers. In this case, there might be 3 use of force tables for each of the following:\n", "\n", "- Incidents\n", "- Officers\n", "- Subjects\n", "\n", "Usually, in this case, there will a unique ID for each incident that will appear in each table. This ID can be used to look up which subjects and officers were involved in the incident.\n", "\n", "Table types for related tables in OpenPoliceData are of the form `{general table type} - {sub-type}` where `general table type` describes the type of interaction (i.e. use of force, complaints, etc.) and `sub-type` indicates the subset of data contained in the table (such as Incidents, Officers, and Subjects in the example above)." ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "nbsphinx": "hidden" }, "outputs": [], "source": [ "# This cell should have \"nbsphinx\": \"hidden\" in its metadata and not be included in the documentation!\n", "import sys\n", "sys.path.append(\"../../..\")\n", "\n", "import warnings\n", "warnings.filterwarnings(\"ignore\")" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
StateSourceNameAgencyAgencyFullTableTypecoverage_startcoverage_endlast_coverage_checkDescriptionsource_urlreadmeURLYearDataTypedate_fielddataset_idagency_fieldmin_versionquery
41ArizonaTucsonTucsonTucson Police DepartmentOFFICER-INVOLVED SHOOTINGS - INCIDENTS2010-10-182018-05-1807/06/2023This approximately 9-year dataset of shooting ...NaN<NA>https://publicgis.tucsonaz.gov/open/rest/servi...MULTIPLEArcGISINCI_DATE<NA><NA><NA>NaN
42ArizonaTucsonTucsonTucson Police DepartmentOFFICER-INVOLVED SHOOTINGS - OFFICERSNaTNaT07/06/2023This approximately 9-year dataset of suspects ...NaN<NA>https://publicgis.tucsonaz.gov/open/rest/servi...NONEArcGIS<NA><NA><NA><NA>NaN
43ArizonaTucsonTucsonTucson Police DepartmentOFFICER-INVOLVED SHOOTINGS - SUBJECTSNaTNaT07/06/2023This approximately 9-year dataset of suspects ...NaN<NA>https://publicgis.tucsonaz.gov/open/rest/servi...NONEArcGIS<NA><NA><NA><NA>NaN
\n", "
" ], "text/plain": [ " State SourceName Agency AgencyFull \\\n", "41 Arizona Tucson Tucson Tucson Police Department \n", "42 Arizona Tucson Tucson Tucson Police Department \n", "43 Arizona Tucson Tucson Tucson Police Department \n", "\n", " TableType coverage_start coverage_end \\\n", "41 OFFICER-INVOLVED SHOOTINGS - INCIDENTS 2010-10-18 2018-05-18 \n", "42 OFFICER-INVOLVED SHOOTINGS - OFFICERS NaT NaT \n", "43 OFFICER-INVOLVED SHOOTINGS - SUBJECTS NaT NaT \n", "\n", " last_coverage_check Description \\\n", "41 07/06/2023 This approximately 9-year dataset of shooting ... \n", "42 07/06/2023 This approximately 9-year dataset of suspects ... \n", "43 07/06/2023 This approximately 9-year dataset of suspects ... \n", "\n", " source_url readme URL \\\n", "41 NaN https://publicgis.tucsonaz.gov/open/rest/servi... \n", "42 NaN https://publicgis.tucsonaz.gov/open/rest/servi... \n", "43 NaN https://publicgis.tucsonaz.gov/open/rest/servi... \n", "\n", " Year DataType date_field dataset_id agency_field min_version query \n", "41 MULTIPLE ArcGIS INCI_DATE NaN \n", "42 NONE ArcGIS NaN \n", "43 NONE ArcGIS NaN " ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import openpolicedata as opd\n", "src = opd.Source(\"Tucson\")\n", "src.datasets[src.datasets['TableType'].str.contains(\"SHOOTING\")]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`Source.find_related_tables` can be used to find tables that are related:" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "('OFFICER-INVOLVED SHOOTINGS - OFFICERS',\n", " 'OFFICER-INVOLVED SHOOTINGS - SUBJECTS')" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Find Tucson datasets related to the OFFICER-INVOLVED SHOOTINGS - INCIDENTS data\n", "related_tables, corresponding_years = src.find_related_tables(opd.TableType.SHOOTINGS_INCIDENTS)\n", "related_tables" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To help with analyzing related datasets, `Table.merge` can be used to merge related tables. Note that the left table in the merge, currently must be [standardized](index.ipynb#Data-Standardization)." ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
INCIDENT_IDDATEZIP_CODEOBJECTID_1XYBOI_NUMADDRESSCITYSTATE...OBJECTIDBOI_NUM_FromMergedSUS_WEAPONSUS_INJDECOFC_INJDECSUS_AGESUS_RACESUS_GENDDATASOURCE_FromMergedRAW_INCI_NUM_FromMerged
010101805422010-10-18 00:00:00+00:008570119.931871e+05445491.49838010-0658E Broadway Bl / S 6th AvTUCSONAZ...110-0658HandgunDeceasedNo34IndianMaleNone1010180542
111031080012011-03-09 00:00:00+00:008571121.021717e+06437594.37968611-01455402 E 30th StTUCSONAZ...211-0145VehicleDeceasedInjured22WhiteMaleNone1103108001
211031080012011-03-09 00:00:00+00:008571121.021717e+06437594.37968611-01455402 E 30th StTUCSONAZ...311-0145VehicleInjuredNo42BlackFemaleNone1103108001
311031080012011-03-09 00:00:00+00:008571121.021717e+06437594.37968611-01455402 E 30th StTUCSONAZ...411-0145VehicleNoNo25BlackMaleNone1103108001
411052801402011-05-28 00:00:00+00:008571439.942417e+05424406.07105411-0317200 E Irvington RdTUCSONAZ...511-0317HandgunDeceasedNo26HispanicMaleNone1105280140
\n", "

5 rows × 25 columns

\n", "
" ], "text/plain": [ " INCIDENT_ID DATE ZIP_CODE OBJECTID_1 X \\\n", "0 1010180542 2010-10-18 00:00:00+00:00 85701 1 9.931871e+05 \n", "1 1103108001 2011-03-09 00:00:00+00:00 85711 2 1.021717e+06 \n", "2 1103108001 2011-03-09 00:00:00+00:00 85711 2 1.021717e+06 \n", "3 1103108001 2011-03-09 00:00:00+00:00 85711 2 1.021717e+06 \n", "4 1105280140 2011-05-28 00:00:00+00:00 85714 3 9.942417e+05 \n", "\n", " Y BOI_NUM ADDRESS CITY STATE ... \\\n", "0 445491.498380 10-0658 E Broadway Bl / S 6th Av TUCSON AZ ... \n", "1 437594.379686 11-0145 5402 E 30th St TUCSON AZ ... \n", "2 437594.379686 11-0145 5402 E 30th St TUCSON AZ ... \n", "3 437594.379686 11-0145 5402 E 30th St TUCSON AZ ... \n", "4 424406.071054 11-0317 200 E Irvington Rd TUCSON AZ ... \n", "\n", " OBJECTID BOI_NUM_FromMerged SUS_WEAPON SUS_INJDEC OFC_INJDEC SUS_AGE \\\n", "0 1 10-0658 Handgun Deceased No 34 \n", "1 2 11-0145 Vehicle Deceased Injured 22 \n", "2 3 11-0145 Vehicle Injured No 42 \n", "3 4 11-0145 Vehicle No No 25 \n", "4 5 11-0317 Handgun Deceased No 26 \n", "\n", " SUS_RACE SUS_GEND DATASOURCE_FromMerged RAW_INCI_NUM_FromMerged \n", "0 Indian Male None 1010180542 \n", "1 White Male None 1103108001 \n", "2 Black Female None 1103108001 \n", "3 Black Male None 1103108001 \n", "4 Hispanic Male None 1105280140 \n", "\n", "[5 rows x 25 columns]" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "t1 = src.load(opd.TableType.SHOOTINGS_INCIDENTS, year=opd.defs.MULTI) # Year values are from src.datasets table\n", "t1.standardize() # Left table in the merge must currently be standardized\n", "t2 = src.load(opd.TableType.SHOOTINGS_SUBJECTS, year=\"NONE\")\n", "t3 = t1.merge(t2, std_id=True)\n", "t3.table.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that the merged table has incident details on the left (date, location) and subject demographics on the right (SUS_RACE, SUS_GENDER). `std_id=True` requests that the unique ID column be identified and standardized to have a standardized name (INCIDENT_ID). Additionally, `Table.merge` accepts the same inputs as [pandas.merge](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.merge.html). As in the above example, if no column is selected to perform on the merge on, `Table.merge` defaults to attempting to find a unique ID column to use (even if `std_id` is False)." ] } ], "metadata": { "kernelspec": { "display_name": "opd-docs", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.3" } }, "nbformat": 4, "nbformat_minor": 2 }