Data Source Table Dictionary/Key
The source table provides the information needed to create sources and load data as well as background information about each dataset. Below are the definitions for each column in the source table:
State: Name of the state where the agency or agencies described in the data are located. If the agencies are in multiple states, the value is MULTIPLE. This column is optionally used when creating a Source to distinguish ambiguous sources (e.g., the same city name in different states).
SourceName: Original source of the data (typically a shortened name of a police department). Used when creating a Source.
Agency: Shortened agency / police department name. Typically the same as SourceName. Value is MULTIPLE if a dataset contains data for multiple agencies.
AgencyFull: Full name of agency.
TableType: Type of data (TRAFFIC STOPS, USE OF FORCE, etc.). Used when loading data.
coverage_start: Start date of data contained in dataset. Combined with coverage_end, this determines the years available for this dataset when loading data. NOTE: Agencies often store their data in separate datasets for different years, so one table type may be spread across multiple datasets, each corresponding to a year of data.
coverage_end: End date of data contained in dataset at the time of the most recent update. Combined with coverage_start, this determines the years available for this dataset when loading data. If the data has been updated by the dataset owner since the date in last_coverage_check, more recent data may be available. NOTE: Agencies often store their data in separate datasets for different years, so one table type may be spread across multiple datasets, each corresponding to a year of data.
last_coverage_check: Date that coverage_start and coverage_end were last updated.
Year: Year of the dataset. Either a single year for data that is released annually, MULTIPLE for data containing multiple years, or NONE if the data is not for a particular year or set of years.
agency_originated: Whether the data was originally generated by the agency it describes. If the value is ‘Yes’ or empty, the data originated with the agency described.
supplying_entity: The organization that supplied the data if it was not the agency described in the data.
Description: Description of the dataset
source_url: Homepage for dataset
readme: URL for data dictionary containing definitions of columns, etc. If empty, the source_url may also contain a data dictionary.
URL: Location of data or API endpoint. If dataset_id is not empty, URL is combined with dataset_id to locate data.
DataType: Type of data (CSV, Excel, ArcGIS, Socrata, etc.)
date_field: Column in the data where date information is stored. Absence of this value does not indicate that there is no date field. This value may be empty if OPD does not internally require a value to be set.
dataset_id: If required, one or more dataset IDs are stored here that are used in combination with URL to locate data.
agency_field: For multi-agency data, this is the column in the data that indicates which agency a row corresponds to.
min_version: Minimum OPD version required to load a dataset
py_min_version: Minimum Python version required to load a dataset
query: Query to perform on the data after loading from the source but before providing it to the user. This is only used in rare cases where the data must be filtered to match the dataset described in the source table. For example, a dataset could have data for both a municipality’s police and fire departments, and the fire department's records may be filtered out since OPD only provides information on law enforcement agencies. Another example is if the TableType is Officer-Involved Shootings and the full dataset contains all shootings (not just officer-involved), the data would be filtered so that only officer-involved shootings are returned.
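As a rough illustration of how the State and coverage columns above can be read, here is a minimal sketch in plain Python. It does not use OPD's API: the rows, the "Springfield" and "State A"/"State B" values, and the helper functions select_source and years_available are all hypothetical, and the year calculation simply assumes that every calendar year between coverage_start and coverage_end is available.

```python
from datetime import date

# Hypothetical rows: names, states, and dates are invented for illustration.
ROWS = [
    {"State": "State A", "SourceName": "Springfield", "TableType": "TRAFFIC STOPS",
     "coverage_start": date(2019, 1, 1), "coverage_end": date(2021, 6, 30)},
    {"State": "State B", "SourceName": "Springfield", "TableType": "USE OF FORCE",
     "coverage_start": date(2020, 3, 15), "coverage_end": date(2020, 12, 31)},
]


def select_source(rows, source_name, state=None):
    """Return rows for one source, using State only to resolve ambiguity."""
    matches = [r for r in rows if r["SourceName"] == source_name]
    if state is not None:
        matches = [r for r in matches if r["State"] == state]
    states = {r["State"] for r in matches}
    if len(states) > 1:
        # Same SourceName in several states: the caller must pick one.
        raise ValueError(f"{source_name!r} is ambiguous; specify one of {sorted(states)}")
    return matches


def years_available(row):
    """List every calendar year touched by coverage_start..coverage_end."""
    return list(range(row["coverage_start"].year, row["coverage_end"].year + 1))


springfield_a = select_source(ROWS, "Springfield", state="State A")
print(years_available(springfield_a[0]))
```

Calling select_source(ROWS, "Springfield") without a state raises an error in this sketch, which mirrors why the State column exists: a source name alone may not identify a single source.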
With its optional inputs, query can be used to filter for desired data. Here is a very specific query using all optional inputs: