Use Power BI to Visualize Baltimore Crime Reports
With Power BI, you can take raw data and quickly transform it into easy-to-use visualizations. For this example, I use the Baltimore police crime reports public dataset, which has data points going back to 1963, although most of the data starts in 2014. I wanted to see what I could build from the dataset, so I spent a few hours pulling the data together and building reports that show various trends in Baltimore crime reports.
This report summarizes crime reports by time of day, aggregated by year. The line graph shows that crime reports are lowest before 6 AM and highest between 6 PM and midnight.
Baltimore Crime Reports by Hour
For this page of my report, I wanted to see when crimes were occurring by hour of the day. I converted the timestamps in the data into a column that gives me the hour of the crime, then went to work building a report to show exactly what happens and when.
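The hour-of-day column can also be sketched outside Power BI. This minimal Python example assumes the timestamp arrives as a string in `YYYY-MM-DD HH:MM:SS` form. That format, the function name `hour_of_crime`, and the sample values are assumptions for illustration, not the dataset's actual schema.

```python
from datetime import datetime

# Assumed timestamp layout; the real dataset may use a different format.
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"

def hour_of_crime(timestamp: str) -> int:
    """Return the hour of day (0-23) for a timestamp string."""
    return datetime.strptime(timestamp, TIMESTAMP_FORMAT).hour

# Made-up sample values, not taken from the dataset.
samples = ["2019-03-14 18:45:00", "2019-03-15 05:10:30"]
hours = [hour_of_crime(s) for s in samples]
print(hours)  # [18, 5]
```

Keeping the hour as an integer from 0 to 23 makes it easy to sort and group on later.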
I built a map that gives me the location of the reported crime and then some data grids that allow me to click to filter my results right on the report. For example, I can see crime reports by year or description of the crime just by clicking on the name in the data grids.
Then I added some visualizations to show trends over time. A line graph shows how many crimes are reported by hour of day. According to the data, the 5 AM to 6 AM block has the fewest reported crimes, while 6 PM to 7 PM has the most.
I also wanted to see how incidents trended by hour across descriptions and districts, so I built two 100% stacked bar charts to show trends in those data points. According to the data, Burglary is highest between 7 AM and 8 AM and tapers off throughout the day.
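The aggregation behind the line graph is a count of reports per hour. Here is a rough sketch in Python, assuming the hour of each report has already been extracted as an integer. The helper names and the sample hours are hypothetical and do not reproduce the dataset's figures.

```python
from collections import Counter

def reports_by_hour(hours):
    """Count how many reports fall in each observed hour of day."""
    return Counter(hours)

def quietest_and_busiest(by_hour):
    """Return (hour with the fewest reports, hour with the most reports)."""
    quietest = min(by_hour, key=by_hour.get)
    busiest = max(by_hour, key=by_hour.get)
    return quietest, busiest

# Made-up hours for illustration only.
sample_hours = [18, 18, 18, 18, 18, 12, 12, 12, 5]
by_hour = reports_by_hour(sample_hours)
print(quietest_and_busiest(by_hour))  # (5, 18)
```

Hours with no reports at all do not appear in the counter, so this sketch only compares hours that were actually observed.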
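A 100% stacked bar chart normalizes each bar so that its segments sum to 100%. The sketch below shows that normalization in Python, assuming each bar is one hour and each segment is one crime description. That orientation, the function name, and the sample records are assumptions for illustration.

```python
from collections import Counter, defaultdict

def shares_by_hour(records):
    """For each hour, return each description's share of that hour's reports.

    records is an iterable of (hour, description) pairs.
    """
    counts = defaultdict(Counter)
    for hour, description in records:
        counts[hour][description] += 1

    shares = {}
    for hour, per_description in counts.items():
        total = sum(per_description.values())
        shares[hour] = {d: n / total for d, n in per_description.items()}
    return shares

# Made-up records for illustration only.
records = [
    (7, "Burglary"), (7, "Burglary"), (7, "Larceny"),
    (18, "Burglary"), (18, "Larceny"), (18, "Larceny"), (18, "Larceny"),
]
shares = shares_by_hour(records)
print(shares[18]["Burglary"])  # 0.25
```

Because every hour is scaled to the same total, the chart compares the mix of crime types between hours rather than the raw volume.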
Having data means having dirty data. This page in my report shows crime reports that do not have a District assigned. There are 97 occurrences of crime reports that do not have a proper district.
Data Quality in Crime Reports
To quickly spot data issues, I created a page that highlights crime reports that do not have a District assigned. The most common day to have no District assigned is Sunday, as shown on the bar graph in the middle, while Larceny from Auto is the most common description with this data quality issue. From this view, a Data Steward could identify the affected crime reports, correct them in the source system, and clean up the data incrementally.
Power BI can run the data through machine learning to look for key influencers. In this example, I wanted to know which crime descriptions lead to a decrease in total incidents. The top three descriptions are Arson, Homicide and Robbery – Carjacking, meaning these three types of crime have fewer total reported cases than the others. This feature updates automatically based on the values you choose to investigate, allowing for highly interactive data analytics and machine learning on this or any dataset.
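The same check can be sketched in Python: filter for reports with a blank district, then count them by weekday. The field names `date`, `district`, and `description`, and the sample rows, are hypothetical and not the dataset's real schema.

```python
from collections import Counter
from datetime import date

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

def missing_district(reports):
    """Return the reports whose district is None, empty, or only whitespace."""
    return [r for r in reports if not (r.get("district") or "").strip()]

def count_by_weekday(reports):
    """Count reports by the weekday name of their date."""
    return Counter(WEEKDAYS[r["date"].weekday()] for r in reports)

# Made-up rows for illustration only.
reports = [
    {"date": date(2019, 3, 17), "district": "", "description": "Larceny from Auto"},
    {"date": date(2019, 3, 18), "district": "Central", "description": "Burglary"},
    {"date": date(2019, 3, 24), "district": None, "description": "Larceny from Auto"},
    {"date": date(2019, 3, 18), "district": "  ", "description": "Burglary"},
]
flagged = missing_district(reports)
print(len(flagged))                             # 3
print(count_by_weekday(flagged).most_common(1))  # [('Sunday', 2)]
```

Treating whitespace-only values as missing is a design choice in this sketch. Whether the source system stores blanks, nulls, or placeholder text would need to be confirmed against the actual data.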