Visual Interpretation Comparison
By Arianna Hsu, Hannah Jin, Neena Nair, Brian Ouyang, Jessie Yuan
Introduction
One of the major aspects of data analysis is visual interpretation. In this project, we used the FBI data on arrests in the United States in 2012 by sex (Table 42). Using this data, we attempted to create four types of graphs: bar charts, pie charts, box and whiskers plots, and line graphs. Through this process, we identified which graphs best suit the arrest data and explored the flaws of the others.
To do so, we split the arrest data by sex. For each type of graph, we used Tableau to create one visual for each sex. We then judged the effectiveness of each graph solely by how well we could analyze the data from these visuals.
Bar Chart
The bar charts show the number of arrests for each type of criminal offense, by sex. The length of each bar represents the number of people arrested, which allows for easy visual comparison of the most common offenses charged for males and females. For instance, in 2012, the second most charged offense for men was drug abuse, with about 956,000 charges. In contrast, the second most charged offense for women was property crime, with about 480,000 charges in 2012. There are no visible outliers in the data.
Because the data is grouped by offense category, which has no natural order, the slight skew visible in the bar charts holds no meaning for the interpretation of the data. However, males and females show the same general pattern of offenses charged. For example, both males and females had more arrests for driving under the influence, drug abuse, larceny-theft, other assaults, and property crime than for other charges. Overall, men had a higher number of arrests than women, as shown by the scale for the number of people arrested in each bar chart.
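To make the idea of bar length concrete, here is a minimal sketch in Python (our Tableau charts were not produced this way). It draws text bars for the five largest male categories, using the counts quoted in the Pie Chart section below. The function name text_bars and the 40-character width are arbitrary choices made for illustration.

```python
# Five largest male arrest categories, with the counts quoted in the
# Pie Chart section of this report (FBI Table 42, 2012).
male_top5 = {
    "Other assaults": 672_170,
    "Property crime": 803_567,
    "All other offenses (except traffic)": 2_000_269,
    "Driving under the influence": 743_029,
    "Drug abuse violations": 956_962,
}

def text_bars(counts, width=40):
    """Return one text bar per category, largest first, scaled to `width`."""
    largest = max(counts.values())
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    lines = []
    for name, value in ranked:
        # Bar length is proportional to the count, as in a bar chart.
        bar = "#" * round(value / largest * width)
        lines.append(f"{name:<36} {bar} {value:,}")
    return lines

for line in text_bars(male_top5):
    print(line)
```

Sorting the bars from largest to smallest puts drug abuse violations on the second line, which matches the ranking described above.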
Pie Chart
The size of each slice shows how frequent each crime is compared to all crimes. The biggest slices represent around 20-25% of all crimes. The medium slices are around 10% of all crimes. Some slices are as small as 0.1%.
For females, the most common crimes are all other offenses except traffic (665,827), property crime (479,339), and larceny-theft (431,637). The least common female crimes are forcible rape (131), suspicion (277), gambling (718), and murder and nonnegligent manslaughter (965).
For males, the most common crimes are all other offenses except traffic (2,000,269), drug abuse violations (956,962), property crime (803,567), driving under the influence (743,029), and other assaults (672,170). The least common crimes are suspicion (883), gambling (5,258), embezzlement (6,414), arson (7,283), and murder and nonnegligent manslaughter (7,549).
The most common crime regardless of sex is all other offenses except traffic. Drug abuse violations, property crime, and other assaults are also among the most common offenses for both sexes. One of the least common for both sexes is suspicion. Upon analysis, there appears to be a gender bias in sex-related crimes. Measured as a share of each sex's arrests, males are about forty times as likely as females to be charged with forcible rape (0.17% vs 0.004%). Males are also about four times as likely as females to be charged with other sex offenses (0.6% vs 0.14%). However, females are more than five times as likely as males to be charged with prostitution (0.17% for males vs 0.96% for females). Even though more males are arrested for property crime, property crime makes up a greater share of female arrests (10% for males vs 16% for females).
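The "times as likely" comparisons above are ratios of the pie chart percentages. As a rough check, the short Python sketch below recomputes them from the rounded percentages quoted in that paragraph. The function name share_ratio is our own choice for illustration.

```python
# Share of each sex's arrests, in percent, as quoted in the paragraph above.
# Each tuple is (male share, female share).
shares = {
    "Forcible rape": (0.17, 0.004),
    "Other sex offenses": (0.6, 0.14),
    "Prostitution": (0.17, 0.96),
    "Property crime": (10.0, 16.0),
}

def share_ratio(male_pct, female_pct):
    """Return which sex has the larger share and larger share / smaller share."""
    if male_pct >= female_pct:
        return "male", male_pct / female_pct
    return "female", female_pct / male_pct

for offense, (male_pct, female_pct) in shares.items():
    sex, ratio = share_ratio(male_pct, female_pct)
    print(f"{offense}: the {sex} share is about {ratio:.1f} times the other")
```

Because the quoted percentages are already rounded, the resulting ratios are approximate.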
While the pie chart is an effective way of communicating the proportions of crime, the sheer number of categories in the 2012 United States arrest data means it may not be the best type of graph for analyzing this data.
Box and Whiskers Plot
The range of arrests per category for women is around 480 thousand, which is much smaller than the range for men, around 950 thousand. This shows that, out of all arrests in 2012, males had a much higher ceiling for the number of arrests in a single category.
We found very significant outliers in the data. In the women's arrests there are two outliers, judged by eye, at around 440 thousand and 480 thousand arrests. In the men's arrests there is only one outlier, at around 950 thousand. This is important because the outliers throw off our analysis: they already greatly inflate the maximums, and therefore the spread, described above.
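We identified the outliers above by eye. A common convention for flagging outliers in a box plot is the 1.5 x IQR rule, which marks any value more than 1.5 times the interquartile range below the first quartile or above the third quartile. Below is a minimal Python sketch of that rule, assuming this convention rather than describing how our Tableau plots were built. The function name iqr_outliers is hypothetical, and the sample values are toy numbers, not counts from Table 42.

```python
import statistics

def iqr_outliers(values):
    """Return the values lying more than 1.5 * IQR outside the quartiles."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    return [v for v in values if v < low_fence or v > high_fence]

# Toy numbers for illustration only. These are NOT counts from Table 42.
toy_sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
print(iqr_outliers(toy_sample))
```

Different tools compute quartiles in slightly different ways, so the exact fences, and which points count as outliers, can vary between programs.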
The median number of arrests in a single category is around 20 thousand for women and around 70 thousand for men. This gives us a decent idea of where a typical category falls, as medians are more resistant than means to the outliers covered previously. With this information we can speculate that men have more arrests than women in a typical category.
Both data sets, men and women, are heavily skewed right: the number of arrests in most categories is concentrated at lower numbers, while only a few categories have a relatively large number of arrests. This is important because it suggests that a majority of categories have a small number of arrests while only a few have a very large number.
While it is possible to create a box and whiskers plot, the large range of the data means there are more viable options.
Line Graph
While it is possible to create a line graph of time vs. the number of crimes committed, we were unable to find data covering multiple dates within 2012. As a result, we were unable to create a line graph of trends over time in crime rates in the United States during 2012. However, if we were to graph data over time, we would create two graphs, one for each sex, with an individual line for each offense charged.
Conclusion
Of the graphs we created, the most effective graph was the bar chart. While the pie chart and box and whiskers plot were both possible, they were not best suited for the given data. In addition, the line graph was not possible using the given data.
Through our usage of these graphs, we were able to determine that, while it is possible to create and analyze multiple types of graphs, not all of the graphs suit the given data. As such, one should not decide what type of graph they will use prior to obtaining the data. In order to create an effective visual interpretation of the data, one must determine the most well-suited type of graph based on the data provided.
Works Cited:
“Table 42.” FBI, ucr.fbi.gov/crime-in-the-u.s/2012/crime-in-the-u.s.-2012/tables/42tabledatadecoverviewpdf/table_42_arrests_by_sex_2012.xls.
Authors
Arianna Hsu
Writer | Editor | Graph Designer
Hannah Jin
Writer | Editor
Neena Nair
Writer | Editor | Graph Designer
Brian Ouyang
Writer | Editor
Jessie Yuan
Writer | Editor