Aggregating weekly crime and traffic accidents separately
The Denver crime dataset stores all crimes and traffic accidents together in one table and distinguishes them with two binary columns, IS_CRIME and IS_TRAFFIC. The .resample method allows you to group by a period of time and aggregate specific columns separately.
In this recipe, we will use the .resample method to group by each quarter of the year and then sum up the number of crimes and traffic accidents separately.
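As a rough illustration of the idea before working with the real data, the sketch below applies the same pattern to a tiny made-up table. The dates and values are invented for this example and are not taken from the Denver dataset, and the names toy and quarterly are hypothetical. The 'QS' alias labels each quarter by its start date; it is used here only because it is accepted across pandas versions.

```python
import pandas as pd

# Hypothetical toy data standing in for the crime table (not real records).
toy = (
    pd.DataFrame(
        {
            "REPORTED_DATE": pd.to_datetime(
                [
                    "2016-01-15",
                    "2016-02-20",
                    "2016-03-05",
                    "2016-04-10",
                    "2016-06-30",
                    "2016-07-04",
                ]
            ),
            "IS_CRIME": [1, 0, 1, 1, 0, 1],
            "IS_TRAFFIC": [0, 1, 0, 0, 1, 0],
        }
    )
    .set_index("REPORTED_DATE")  # resample needs a DatetimeIndex
    .sort_index()
)

# Group rows into calendar quarters, then sum each binary column on its own.
quarterly = toy.resample("QS")[["IS_CRIME", "IS_TRAFFIC"]].sum()
print(quarterly)
```

Because both columns hold only 0s and 1s, each sum is simply the count of crimes or traffic accidents reported in that quarter.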
How to do it…
- Read in the crime hdf5 dataset, set the index as REPORTED_DATE, and then sort it to increase performance for the rest of the recipe:

>>> crime = (pd.read_hdf('data/crime.h5', 'crime')
...     .set_index('REPORTED_DATE')
...     .sort_index()
... )

- Use the .resample method to group by each quarter of the year and then sum the IS_CRIME and IS_TRAFFIC columns for each group:

>>> (crime
...     .resample...