Export Dataset to CSV
This notebook shows how to export a dataset to a CSV file. It assumes you have already found the dataset using the techniques shown in finding_datasets.ipynb and loaded it as shown in loading_datasets.ipynb.
[1]:
try:
    # Keep this import last in the try block: the except block only retries this import
    import openpolicedata as opd
except ImportError:
    # Fall back to a local copy of the package in a sibling directory
    import sys
    sys.path.append('../openpolicedata')
    import openpolicedata as opd
[3]:
# To access the data, create a source using a source name (usually a police department name).
# The optional state input resolves ambiguity when agencies in different states share a name.
# Here we use "Montgomery County" in Maryland as the source_name.
src = opd.Source(source_name="Montgomery County", state="Maryland")
src.datasets.head()
[3]:
| | State | SourceName | Agency | TableType | Year | Description | DataType | URL | date_field | dataset_id | agency_field | min_version | readme |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 5 | Maryland | Montgomery County | Montgomery County | TRAFFIC STOPS | MULTI | This dataset contains traffic violation inform... | Socrata | data.montgomerycountymd.gov | date_of_stop | 4mse-ku6q | <NA> | <NA> | https://data.montgomerycountymd.gov/Public-Saf... |
| 6 | Maryland | Montgomery County | Montgomery County | COMPLAINTS | MULTI | This dataset contains allegations brought to t... | Socrata | data.montgomerycountymd.gov | created_dt | usip-62e2 | <NA> | <NA> | https://data.montgomerycountymd.gov/Public-Saf... |
[4]:
# Load traffic stop data for 2021
t = src.load_from_url(year=2021, table_type='TRAFFIC STOPS')
[5]:
# Show the first 5 rows of the table
t.table.head(n=5)
# The data in t.table is now ready for analysis.
[5]:
| | geometry | seq_id | date_of_stop | time_of_stop | agency | subagency | description | location | latitude | longitude | ... | driver_state | dl_state | arrest_type | search_conducted | search_outcome | search_reason_for_stop | search_disposition | search_reason | search_type | search_arrest_reason |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | POINT (-77.27504 39.14653) | 123add05-d3d2-428d-9932-66bc30831388 | 2021-01-01 | 23:03:00 | MCP | 5th District, Germantown | DISPLAYING EXPIRED REGISTRATION PLATE ISSUED B... | GREAT SENECA @ WSSC ENTRANCE | 39.1465333333333 | -77.2750433333333 | ... | MD | MD | Q - Marked Laser | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 1 | POINT (-77.27504 39.14653) | 123add05-d3d2-428d-9932-66bc30831388 | 2021-01-01 | 23:03:00 | MCP | 5th District, Germantown | EXCEEDING POSTED MAXIMUM SPEED LIMIT: 64 MPH I... | GREAT SENECA @ WSSC ENTRANCE | 39.1465333333333 | -77.2750433333333 | ... | MD | MD | Q - Marked Laser | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 2 | POINT (-77.27504 39.14653) | 123add05-d3d2-428d-9932-66bc30831388 | 2021-01-01 | 23:03:00 | MCP | 5th District, Germantown | KNOWINGLY DRIVING UNINSURED VEHICLE | GREAT SENECA @ WSSC ENTRANCE | 39.1465333333333 | -77.2750433333333 | ... | MD | MD | Q - Marked Laser | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 3 | POINT (-77.27285 39.14366) | 1b7c9229-d80f-4ed2-9692-d24a6fbda5c7 | 2021-01-01 | 22:43:00 | MCP | 5th District, Germantown | DRIVING VEHICLE IN EXCESS OF REASONABLE AND PR... | GREAT SENECA @ HORN POINT | 39.1436583333333 | -77.2728533333333 | ... | MD | MD | A - Marked Patrol | No | Warning | 21-801(a) | NaN | NaN | NaN | NaN |
| 4 | POINT (-77.27405 39.17419) | 0c6f50ae-d462-4356-8319-e1f035dc00fc | 2021-01-01 | 22:20:00 | MCP | 5th District, Germantown | DRIVER CHANGING LANES WHEN UNSAFE | 118 @ WALTERJOHNSON | 39.174195 | -77.274045 | ... | MD | MD | A - Marked Patrol | No | Warning | 21-309(b) | NaN | NaN | NaN | NaN |
5 rows × 43 columns
[8]:
import os
cwd = os.getcwd()
csv_filepath = cwd
print(f"The CSV file will be written to {csv_filepath}. Make sure this path is okay before running the next cell. If the path is not okay then modify csv_filepath.")
The CSV file will be written to c:\Users\matth\repos\opd-examples. Make sure this path is okay before running the next cell. If the path is not okay then modify csv_filepath.
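If the current working directory is not where the export should go, a small helper can prepare a different folder first. The following is a minimal sketch using only the standard library; the helper name `prepare_output_dir` and the folder name `opd_exports` are hypothetical and not part of OpenPoliceData.

```python
from pathlib import Path
import tempfile


def prepare_output_dir(path):
    """Return `path` as an absolute Path, creating the directory if needed."""
    out_dir = Path(path).expanduser().resolve()
    out_dir.mkdir(parents=True, exist_ok=True)
    return out_dir


# Demonstrate with a throwaway temporary directory so nothing is left behind.
with tempfile.TemporaryDirectory() as tmp:
    demo_dir = prepare_output_dir(Path(tmp) / "opd_exports")
    demo_dir_exists = demo_dir.is_dir()
    print(f"Output directory ready: {demo_dir}")
```

In the notebook, the result could be assigned to csv_filepath in place of cwd, for example `csv_filepath = str(prepare_output_dir("exports"))`. Converting to a string is a cautious choice here, since this sketch does not assume that the library accepts Path objects.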
[10]:
# Save to CSV. To use a custom filename, pass the filename input.
csv_written_filename = t.to_csv(output_dir=csv_filepath)
print(f"The CSV file was written to {csv_written_filename}.")
The CSV file was written to c:\Users\matth\repos\opd-examples\Maryland_Montgomery_County_TRAFFIC_STOPS_2021.csv.
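After the export, it can be worth confirming that the file exists and contains the expected columns and number of rows. The sketch below does this with the standard library csv module. The helper name `summarize_csv` is hypothetical, and the small stand-in file (with placeholder values) exists only so the example runs on its own.

```python
import csv
import tempfile
from pathlib import Path


def summarize_csv(csv_path):
    """Return (column_names, data_row_count) for the CSV file at csv_path."""
    csv_path = Path(csv_path)
    if not csv_path.is_file():
        raise FileNotFoundError(f"No CSV file found at {csv_path}")
    with csv_path.open(newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        header = next(reader, [])  # first line holds the column names
        row_count = sum(1 for _ in reader)  # remaining lines are data rows
    return header, row_count


# Stand-in file with placeholder values, used instead of a real export.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "demo.csv"
    demo_path.write_text(
        "seq_id,date_of_stop\nabc,2021-01-01\ndef,2021-01-01\n",
        encoding="utf-8",
    )
    columns, n_rows = summarize_csv(demo_path)
    print(columns, n_rows)
```

In the notebook, calling `summarize_csv(csv_written_filename)` would apply the same check to the exported file, assuming the file is UTF-8 encoded.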
[11]:
# To load the data back in from CSV, create a new source and use load_from_csv.
# load_from_csv usage is similar to load_from_url except for the output_dir input.
src = opd.Source(source_name="Montgomery County", state="Maryland")
t = src.load_from_csv(year=2021, table_type='TRAFFIC STOPS', output_dir=csv_filepath)
t.table.head()
[11]:
| | geometry | seq_id | date_of_stop | time_of_stop | agency | subagency | description | location | latitude | longitude | ... | driver_state | dl_state | arrest_type | search_conducted | search_outcome | search_reason_for_stop | search_disposition | search_reason | search_type | search_arrest_reason |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | POINT (-77.2750433333333 39.1465333333333) | 123add05-d3d2-428d-9932-66bc30831388 | 2021-01-01 | 23:03:00 | MCP | 5th District, Germantown | DISPLAYING EXPIRED REGISTRATION PLATE ISSUED B... | GREAT SENECA @ WSSC ENTRANCE | 39.146533 | -77.275043 | ... | MD | MD | Q - Marked Laser | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 1 | POINT (-77.2750433333333 39.1465333333333) | 123add05-d3d2-428d-9932-66bc30831388 | 2021-01-01 | 23:03:00 | MCP | 5th District, Germantown | EXCEEDING POSTED MAXIMUM SPEED LIMIT: 64 MPH I... | GREAT SENECA @ WSSC ENTRANCE | 39.146533 | -77.275043 | ... | MD | MD | Q - Marked Laser | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 2 | POINT (-77.2750433333333 39.1465333333333) | 123add05-d3d2-428d-9932-66bc30831388 | 2021-01-01 | 23:03:00 | MCP | 5th District, Germantown | KNOWINGLY DRIVING UNINSURED VEHICLE | GREAT SENECA @ WSSC ENTRANCE | 39.146533 | -77.275043 | ... | MD | MD | Q - Marked Laser | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 3 | POINT (-77.2728533333333 39.1436583333333) | 1b7c9229-d80f-4ed2-9692-d24a6fbda5c7 | 2021-01-01 | 22:43:00 | MCP | 5th District, Germantown | DRIVING VEHICLE IN EXCESS OF REASONABLE AND PR... | GREAT SENECA @ HORN POINT | 39.143658 | -77.272853 | ... | MD | MD | A - Marked Patrol | No | Warning | 21-801(a) | NaN | NaN | NaN | NaN |
| 4 | POINT (-77.274045 39.174195) | 0c6f50ae-d462-4356-8319-e1f035dc00fc | 2021-01-01 | 22:20:00 | MCP | 5th District, Germantown | DRIVER CHANGING LANES WHEN UNSAFE | 118 @ WALTERJOHNSON | 39.174195 | -77.274045 | ... | MD | MD | A - Marked Patrol | No | Warning | 21-309(b) | NaN | NaN | NaN | NaN |
5 rows × 43 columns
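The reloaded table is not displayed identically to the original: latitude and longitude now show six decimal places, which suggests they were stored as text before the export and parsed as numbers on reload. The sketch below reproduces that effect with plain pandas, independent of OpenPoliceData. It assumes pandas is installed, uses two values copied from the output above, and does not claim to match how the library reads the file internally.

```python
import io

import pandas as pd

# Two rows copied from the output above; latitude starts out as text.
original = pd.DataFrame(
    {
        "location": ["GREAT SENECA @ WSSC ENTRANCE", "GREAT SENECA @ HORN POINT"],
        "latitude": ["39.1465333333333", "39.1436583333333"],
    }
)

# Round-trip through CSV using an in-memory buffer instead of a file on disk.
buffer = io.StringIO()
original.to_csv(buffer, index=False)
buffer.seek(0)
reloaded = pd.read_csv(buffer)

# CSV stores no type information, so read_csv infers types from the text.
print("before:", original["latitude"].dtype)
print("after: ", reloaded["latitude"].dtype)
```

If column types matter for later analysis, checking `t.table.dtypes` after reloading is a simple way to see what changed.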