{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Related Tables\n", "In some cases, an agency will split data on a topic across multiple tables to enable better representation of many-to-one relationships. For example, a single use of force incident can involve multiple subjects and/or officers. In this case, there might be 3 use of force tables for each of the following:\n", "\n", "- Incidents\n", "- Officers\n", "- Subjects\n", "\n", "Usually, in this case, there will a unique ID for each incident that will appear in each table. This ID can be used to look up which subjects and officers were involved in the incident.\n", "\n", "Table types for related tables in OpenPoliceData are of the form `{general table type} - {sub-type}` where `general table type` describes the type of interaction (i.e. use of force, complaints, etc.) and `sub-type` indicates the subset of data contained in the table (such as Incidents, Officers, and Subjects in the example above)." ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "nbsphinx": "hidden" }, "outputs": [], "source": [ "# This cell should have \"nbsphinx\": \"hidden\" in its metadata and not be included in the documentation!\n", "import sys\n", "sys.path.append(\"../../..\")\n", "\n", "import warnings\n", "warnings.filterwarnings(\"ignore\")" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
StateSourceNameAgencyAgencyFullTableTypecoverage_startcoverage_endlast_coverage_checkDescriptionsource_urlreadmeURLYearDataTypedate_fielddataset_idagency_fieldmin_versionquery
43ArizonaTucsonTucsonTucson Police DepartmentOFFICER-INVOLVED SHOOTINGS - INCIDENTS2010-10-182018-05-1807/06/2023This approximately 9-year dataset of shooting ...https://gisdata.tucsonaz.gov/datasets/cotgis::...<NA>https://publicgis.tucsonaz.gov/open/rest/servi...MULTIPLEArcGISINCI_DATE<NA><NA>0.7NaN
44ArizonaTucsonTucsonTucson Police DepartmentOFFICER-INVOLVED SHOOTINGS - OFFICERSNaTNaT07/06/2023This approximately 9-year dataset of suspects ...https://gisdata.tucsonaz.gov/datasets/cotgis::...<NA>https://publicgis.tucsonaz.gov/open/rest/servi...NONEArcGIS<NA><NA><NA><NA>NaN
45ArizonaTucsonTucsonTucson Police DepartmentOFFICER-INVOLVED SHOOTINGS - SUBJECTSNaTNaT07/06/2023This approximately 9-year dataset of suspects ...https://gisdata.tucsonaz.gov/datasets/cotgis::...<NA>https://publicgis.tucsonaz.gov/open/rest/servi...NONEArcGIS<NA><NA><NA><NA>NaN
\n", "
" ], "text/plain": [ " State SourceName Agency AgencyFull \\\n", "43 Arizona Tucson Tucson Tucson Police Department \n", "44 Arizona Tucson Tucson Tucson Police Department \n", "45 Arizona Tucson Tucson Tucson Police Department \n", "\n", " TableType coverage_start coverage_end \\\n", "43 OFFICER-INVOLVED SHOOTINGS - INCIDENTS 2010-10-18 2018-05-18 \n", "44 OFFICER-INVOLVED SHOOTINGS - OFFICERS NaT NaT \n", "45 OFFICER-INVOLVED SHOOTINGS - SUBJECTS NaT NaT \n", "\n", " last_coverage_check Description \\\n", "43 07/06/2023 This approximately 9-year dataset of shooting ... \n", "44 07/06/2023 This approximately 9-year dataset of suspects ... \n", "45 07/06/2023 This approximately 9-year dataset of suspects ... \n", "\n", " source_url readme \\\n", "43 https://gisdata.tucsonaz.gov/datasets/cotgis::... \n", "44 https://gisdata.tucsonaz.gov/datasets/cotgis::... \n", "45 https://gisdata.tucsonaz.gov/datasets/cotgis::... \n", "\n", " URL Year DataType \\\n", "43 https://publicgis.tucsonaz.gov/open/rest/servi... MULTIPLE ArcGIS \n", "44 https://publicgis.tucsonaz.gov/open/rest/servi... NONE ArcGIS \n", "45 https://publicgis.tucsonaz.gov/open/rest/servi... NONE ArcGIS \n", "\n", " date_field dataset_id agency_field min_version query \n", "43 INCI_DATE 0.7 NaN \n", "44 NaN \n", "45 NaN " ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import openpolicedata as opd\n", "src = opd.Source(\"Tucson\")\n", "src.datasets[src.datasets['TableType'].str.contains(\"SHOOTING\")]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "`Source.find_related_tables` can be used to find tables that are related:" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "('OFFICER-INVOLVED SHOOTINGS - OFFICERS',\n", " 'OFFICER-INVOLVED SHOOTINGS - SUBJECTS')" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Find Tucson datasets related to the OFFICER-INVOLVED SHOOTINGS - INCIDENTS data\n", "related_tables, corresponding_years = src.find_related_tables(opd.TableType.SHOOTINGS_INCIDENTS)\n", "related_tables" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To help with analyzing related datasets, `Table.merge` can be used to merge related tables. Note that the left table in the merge, currently must be [standardized](index.ipynb#Data-Standardization)." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
INCIDENT_IDDATEZIP_CODEOBJECTID_1XYBOI_NUMADDRESSCITYSTATE...OBJECTIDBOI_NUM_FromMergedSUS_WEAPONSUS_INJDECOFC_INJDECSUS_AGESUS_RACESUS_GENDDATASOURCE_FromMergedRAW_INCI_NUM_FromMerged
010101805422010-10-188570119.931871e+05445491.49838010-0658E Broadway Bl / S 6th AvTUCSONAZ...110-0658HandgunDeceasedNo34IndianMaleNone1010180542
111031080012011-03-098571121.021717e+06437594.37968611-01455402 E 30th StTUCSONAZ...211-0145VehicleDeceasedInjured22WhiteMaleNone1103108001
211031080012011-03-098571121.021717e+06437594.37968611-01455402 E 30th StTUCSONAZ...311-0145VehicleInjuredNo42BlackFemaleNone1103108001
311031080012011-03-098571121.021717e+06437594.37968611-01455402 E 30th StTUCSONAZ...411-0145VehicleNoNo25BlackMaleNone1103108001
411052801402011-05-288571439.942417e+05424406.07105411-0317200 E Irvington RdTUCSONAZ...511-0317HandgunDeceasedNo26HispanicMaleNone1105280140
\n", "

5 rows × 25 columns

\n", "
" ], "text/plain": [ " INCIDENT_ID DATE ZIP_CODE OBJECTID_1 X Y \\\n", "0 1010180542 2010-10-18 85701 1 9.931871e+05 445491.498380 \n", "1 1103108001 2011-03-09 85711 2 1.021717e+06 437594.379686 \n", "2 1103108001 2011-03-09 85711 2 1.021717e+06 437594.379686 \n", "3 1103108001 2011-03-09 85711 2 1.021717e+06 437594.379686 \n", "4 1105280140 2011-05-28 85714 3 9.942417e+05 424406.071054 \n", "\n", " BOI_NUM ADDRESS CITY STATE ... OBJECTID \\\n", "0 10-0658 E Broadway Bl / S 6th Av TUCSON AZ ... 1 \n", "1 11-0145 5402 E 30th St TUCSON AZ ... 2 \n", "2 11-0145 5402 E 30th St TUCSON AZ ... 3 \n", "3 11-0145 5402 E 30th St TUCSON AZ ... 4 \n", "4 11-0317 200 E Irvington Rd TUCSON AZ ... 5 \n", "\n", " BOI_NUM_FromMerged SUS_WEAPON SUS_INJDEC OFC_INJDEC SUS_AGE SUS_RACE \\\n", "0 10-0658 Handgun Deceased No 34 Indian \n", "1 11-0145 Vehicle Deceased Injured 22 White \n", "2 11-0145 Vehicle Injured No 42 Black \n", "3 11-0145 Vehicle No No 25 Black \n", "4 11-0317 Handgun Deceased No 26 Hispanic \n", "\n", " SUS_GEND DATASOURCE_FromMerged RAW_INCI_NUM_FromMerged \n", "0 Male None 1010180542 \n", "1 Male None 1103108001 \n", "2 Female None 1103108001 \n", "3 Male None 1103108001 \n", "4 Male None 1105280140 \n", "\n", "[5 rows x 25 columns]" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "t1 = src.load(opd.TableType.SHOOTINGS_INCIDENTS, year=opd.defs.MULTI) # Year values are from src.datasets table\n", "t1.standardize() # Left table in the merge must currently be standardized\n", "t2 = src.load(opd.TableType.SHOOTINGS_SUBJECTS, year=\"NONE\")\n", "t3 = t1.merge(t2, std_id=True)\n", "t3.table.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note that the merged table has incident details on the left (date, location) and subject demographics on the right (SUS_RACE, SUS_GENDER). `std_id=True` requests that the unique ID column be identified and standardized to have a standardized name (INCIDENT_ID). Additionally, `Table.merge` accepts the same inputs as [pandas.merge](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.merge.html). As in the above example, if no column is selected to perform on the merge on, `Table.merge` defaults to attempting to find a unique ID column to use (even if `std_id` is False)." ] } ], "metadata": { "kernelspec": { "display_name": "opd-docs", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.3" } }, "nbformat": 4, "nbformat_minor": 2 }