Monday, October 28, 2013

Hoboken School District's Violence and Vandalism Summary Report: How Some Districts Use Raw Numbers Rather Than Rates, an Example of a Common Issue in Analyzing and Reporting Critical Data

Note: error bars based on standard deviation of rates
An example used in my Knowing and Learning course: Recently, the Hoboken Board of Education released a document comparing the last 5 years of Violence and Vandalism Reports for the Hoboken School District (see below). Unfortunately, this document reports only the raw number of incidents without taking into account the total number of students or supplying a rate (i.e. "incidents per 100 students"). Supplying a rate would make the data easier to compare across years, because enrollment changed from year to year. Attached are the actual data for the district from the 2008-09 school year to the 2012-13 school year, the same years reported in the Board of Education document. One may discuss, argue, or debate whether the incidents per 100 students are getting better or worse under Kids First, but the chart provides a more accurate basis for making a data-driven decision. One thing is clear: the "drop" is not as dramatic when you look at rates instead of the raw number of incidents. To be clear, I am not assuming anyone currently doing statistics in the Hoboken School District is cognizant enough to be making conscious decisions to deceive. Rather, this is much more likely an example of the district's lack of expertise in effectively gathering and analyzing data and presenting results to the public.

Even though the rate increased in 2012-13 (4.2656 incidents per 100 students) from 2011-12 (4.055 incidents per 100 students), the official district document reads: "The report shows a significant drop in the number of incidents over five years." A look at the "per 100" rates indicates no such "significant drop" at all. In fact, the rates of violence and vandalism peaked during the 2009-10 and 2010-11 school years (Carter/Rusak/Toback/Kids First). Moreover, 2012-13 saw a slight RISE in the violence and vandalism rate compared to the previous year.
"Rates take into account the size of the population, so comparison can be made across different population groups. By using rates instead of raw numbers, the occurrence of violence and vandalism in one group or cohort can be fairly compared with another."  -Material in any Introduction to Statistics Course 
Furthermore, when you add error bars to the charts (see chart above), you realize the claims of "significantly less" are even less credible and accurate. 
Note the difference when raw numbers are reported rather than rates and standard deviations. The above raw number chart gives the impression that things are getting a lot better, which is not necessarily the case when compared to the rates chart at the top of this post. Raw numbers do not take into account population size and make it impossible to compare across yearly cohorts in a statistically accurate manner.
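To put a number on that difference, here is a minimal Python sketch comparing the five-year change in raw incidents with the change in incidents per 100 students. The enrollment and incident figures are the ones listed under "Data Used in Analysis" in this post; the function and variable names are my own, chosen only for illustration.

```python
# Figures copied from the "Data Used in Analysis" table in this post.
enrollment = {"2008-09": 1873, "2012-13": 1641}
incidents = {"2008-09": 91, "2012-13": 70}


def rate_per_100(n_incidents, n_students):
    """Incidents per 100 enrolled students."""
    return 100.0 * n_incidents / n_students


def percent_change(old, new):
    """Relative change from old to new, as a percentage."""
    return 100.0 * (new - old) / old


# Change in the raw count of incidents over the five years.
raw_change = percent_change(incidents["2008-09"], incidents["2012-13"])

# Change in the rate, which adjusts for the shrinking enrollment.
rate_change = percent_change(
    rate_per_100(incidents["2008-09"], enrollment["2008-09"]),
    rate_per_100(incidents["2012-13"], enrollment["2012-13"]),
)

print(f"Raw incidents: {raw_change:.1f}%")  # -23.1%
print(f"Rate per 100:  {rate_change:.1f}%")  # -12.2%
```

Measured this way, the raw count falls by about 23 percent, while the rate falls by only about 12 percent, because enrollment also dropped over the same period.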
In other posts I have indicated that comparable rates for the 2011-12 school year elsewhere in New Jersey are Atlantic City (2.13 incidents per 100 students), Camden (1.6 incidents per 100 students), Newark (0.9 incidents per 100 students), and Paterson (1.0 incidents per 100 students).

Note: While a new category was added to the EVVRS report in 2011-12 (HIB: Harassment, Intimidation, and Bullying), these incidents were previously reported under other categories on the form. The fact that the state created a new category to identify these incidents does not mean the incidents were not previously tallied.

Data Used in Analysis 

              2008-09   2009-10   2010-11   2011-12   2012-13
Enrollment       1873      1954      1816      1726      1641
Incidents          91       101        97        70        70
Per 100         4.858     5.168    5.3414     4.055    4.2656
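For readers who want to check the "per 100" row, the following Python sketch recomputes it from the enrollment and incident counts above and also computes a standard deviation of the five yearly rates. Note the assumptions: I use the sample standard deviation (`statistics.stdev`), and the post does not specify whether the error bars in the chart use the sample or population formula. The variable names are illustrative only.

```python
from statistics import mean, stdev

# Enrollment and incident counts from the table above.
years = ["2008-09", "2009-10", "2010-11", "2011-12", "2012-13"]
enrollment = [1873, 1954, 1816, 1726, 1641]
incidents = [91, 101, 97, 70, 70]

# Incidents per 100 students for each school year.
rates = [100.0 * i / e for i, e in zip(incidents, enrollment)]

for year, rate in zip(years, rates):
    print(f"{year}: {rate:.4f} incidents per 100 students")

# Spread of the yearly rates. Assumption: sample standard deviation;
# the original chart may have used the population formula instead.
rate_mean = mean(rates)
rate_sd = stdev(rates)
print(f"Mean rate: {rate_mean:.3f}, standard deviation: {rate_sd:.3f}")

# Year-over-year change from 2011-12 to 2012-13.
last_change = rates[4] - rates[3]
print(f"Change 2011-12 to 2012-13: {last_change:+.3f}")
```

Under these assumptions, the change from 2011-12 to 2012-13 (about +0.21) is smaller than the standard deviation of the rates (about 0.56), which is one way to see why a single year's movement should be read cautiously. Recomputed values may differ from the table in the last digit because the table figures appear to be truncated rather than rounded.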