Related Tables#
In some cases, an agency will split data on a topic across multiple tables to enable better representation of many-to-one relationships. For example, a single use of force incident can involve multiple subjects and/or officers. In this case, there might be 3 use of force tables for each of the following:
Incidents
Officers
Subjects
Usually, in this case, there will a unique ID for each incident that will appear in each table. This ID can be used to look up which subjects and officers were involved in the incident.
Table types for related tables in OpenPoliceData are of the form {general table type} - {sub-type} where general table type describes the type of interaction (i.e. use of force, complaints, etc.) and sub-type indicates the subset of data contained in the table (such as Incidents, Officers, and Subjects in the example above).
[2]:
import openpolicedata as opd
src = opd.Source("Tucson")
src.datasets[src.datasets['TableType'].str.contains("SHOOTING")]
[2]:
| State | SourceName | Agency | AgencyFull | TableType | coverage_start | coverage_end | last_coverage_check | Description | source_url | readme | URL | Year | DataType | date_field | dataset_id | agency_field | min_version | query | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 43 | Arizona | Tucson | Tucson | Tucson Police Department | OFFICER-INVOLVED SHOOTINGS - INCIDENTS | 2010-10-18 | 2018-05-18 | 07/06/2023 | This approximately 9-year dataset of shooting ... | https://gisdata.tucsonaz.gov/datasets/cotgis::... | <NA> | https://publicgis.tucsonaz.gov/open/rest/servi... | MULTIPLE | ArcGIS | INCI_DATE | <NA> | <NA> | 0.7 | NaN |
| 44 | Arizona | Tucson | Tucson | Tucson Police Department | OFFICER-INVOLVED SHOOTINGS - OFFICERS | NaT | NaT | 07/06/2023 | This approximately 9-year dataset of suspects ... | https://gisdata.tucsonaz.gov/datasets/cotgis::... | <NA> | https://publicgis.tucsonaz.gov/open/rest/servi... | NONE | ArcGIS | <NA> | <NA> | <NA> | <NA> | NaN |
| 45 | Arizona | Tucson | Tucson | Tucson Police Department | OFFICER-INVOLVED SHOOTINGS - SUBJECTS | NaT | NaT | 07/06/2023 | This approximately 9-year dataset of suspects ... | https://gisdata.tucsonaz.gov/datasets/cotgis::... | <NA> | https://publicgis.tucsonaz.gov/open/rest/servi... | NONE | ArcGIS | <NA> | <NA> | <NA> | <NA> | NaN |
Source.find_related_tables can be used to find tables that are related:
[3]:
# Find Tucson datasets related to the OFFICER-INVOLVED SHOOTINGS - INCIDENTS data
related_tables, corresponding_years = src.find_related_tables(opd.TableType.SHOOTINGS_INCIDENTS)
related_tables
[3]:
('OFFICER-INVOLVED SHOOTINGS - OFFICERS',
'OFFICER-INVOLVED SHOOTINGS - SUBJECTS')
To help with analyzing related datasets, Table.merge can be used to merge related tables. Note that the left table in the merge, currently must be standardized.
[4]:
t1 = src.load(opd.TableType.SHOOTINGS_INCIDENTS, year=opd.defs.MULTI) # Year values are from src.datasets table
t1.standardize() # Left table in the merge must currently be standardized
t2 = src.load(opd.TableType.SHOOTINGS_SUBJECTS, year="NONE")
t3 = t1.merge(t2, std_id=True)
t3.table.head()
[4]:
| INCIDENT_ID | DATE | ZIP_CODE | OBJECTID_1 | X | Y | BOI_NUM | ADDRESS | CITY | STATE | ... | OBJECTID | BOI_NUM_FromMerged | SUS_WEAPON | SUS_INJDEC | OFC_INJDEC | SUS_AGE | SUS_RACE | SUS_GEND | DATASOURCE_FromMerged | RAW_INCI_NUM_FromMerged | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1010180542 | 2010-10-18 | 85701 | 1 | 9.931871e+05 | 445491.498380 | 10-0658 | E Broadway Bl / S 6th Av | TUCSON | AZ | ... | 1 | 10-0658 | Handgun | Deceased | No | 34 | Indian | Male | None | 1010180542 |
| 1 | 1103108001 | 2011-03-09 | 85711 | 2 | 1.021717e+06 | 437594.379686 | 11-0145 | 5402 E 30th St | TUCSON | AZ | ... | 2 | 11-0145 | Vehicle | Deceased | Injured | 22 | White | Male | None | 1103108001 |
| 2 | 1103108001 | 2011-03-09 | 85711 | 2 | 1.021717e+06 | 437594.379686 | 11-0145 | 5402 E 30th St | TUCSON | AZ | ... | 3 | 11-0145 | Vehicle | Injured | No | 42 | Black | Female | None | 1103108001 |
| 3 | 1103108001 | 2011-03-09 | 85711 | 2 | 1.021717e+06 | 437594.379686 | 11-0145 | 5402 E 30th St | TUCSON | AZ | ... | 4 | 11-0145 | Vehicle | No | No | 25 | Black | Male | None | 1103108001 |
| 4 | 1105280140 | 2011-05-28 | 85714 | 3 | 9.942417e+05 | 424406.071054 | 11-0317 | 200 E Irvington Rd | TUCSON | AZ | ... | 5 | 11-0317 | Handgun | Deceased | No | 26 | Hispanic | Male | None | 1105280140 |
5 rows × 25 columns
Note that the merged table has incident details on the left (date, location) and subject demographics on the right (SUS_RACE, SUS_GENDER). std_id=True requests that the unique ID column be identified and standardized to have a standardized name (INCIDENT_ID). Additionally, Table.merge accepts the same inputs as pandas.merge. As in the above example, if no column is selected to perform on the
merge on, Table.merge defaults to attempting to find a unique ID column to use (even if std_id is False).