Friday, March 30, 2007

The Wisconsin state school data and how I used it

The data I analyzed consisted of the state standardized tests given to students in grades 4, 8 and 10. This covers elementary, middle and high schools so we can get a decent idea of the relative quality of the various school levels.
The tests have several sections (e.g. reading, math, science, etc.) and they do not provide raw scores. Instead they provide the percentage of students scoring in one of 4 groups:
  • Minimal performance
  • Basic
  • Proficient
  • Advanced
So for one school and one test, you’ll get something like:


Math

  Minimal performance    7%
  Basic                 23%
  Proficient            40%
  Advanced              30%



Assumptions

In order to make any sense of this mass of data, I had to make a bunch of assumptions and selections of the data that I considered important to me and people like me (including my friends who had the original question).


Assumption 1: The percent of students scoring in the Advanced score range represents the quality of a school.

First, just like in Lake Wobegon, everyone assumes their kids are above average, and I’m no exception. I assume that my kids have the ability to be in the Advanced score range. The more kids a school has in the Advanced score range, the more likely my kids will be able to be in that range.

Second, this seems like a reasonable proxy for educational quality. If one school has 5% of its students scoring in the Advanced range and another has 35% in the Advanced range, which one do you think is better?


Assumption 2: The average of percentages of students scoring in the Advanced range over all tests for a particular grade represents the overall quality of a school.

Since there are 4-7 tests taken, I had to combine them somehow into one number. I chose to average across all the tests at a particular school the percent of students scoring in the Advanced score range.

One obvious problem is that a school could have a fantastic Math score and a dismal English score, resulting in a reasonably good average. A second is that a statistician might object to this averaging strategy. Comments welcome if you’re a stats whiz.

I think, however, this reasonably combines the scores to obtain a single number for one school.
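As a rough sketch of the averaging described above, here is how it might look in Python. Only the 30% Math figure comes from the example table; the other percentages, the school names, and the function names school_score and spread are made up for illustration. The second school shows the first problem mentioned above: very uneven scores can produce the same average.

```python
from statistics import mean

# Percent of students scoring Advanced on each test, per school.
# Hypothetical numbers, except Math = 30 for School A (from the example table).
schools = {
    "School A": {"Reading": 40.0, "Math": 30.0, "Science": 20.0, "Social Studies": 30.0},
    "School B": {"Reading": 5.0, "Math": 55.0, "Science": 30.0, "Social Studies": 30.0},
}


def school_score(advanced_by_test):
    """Unweighted mean of the Advanced percentages across all tests."""
    if not advanced_by_test:
        raise ValueError("need at least one test")
    return mean(advanced_by_test.values())


def spread(advanced_by_test):
    """Gap between the best and worst test, to flag uneven schools."""
    values = list(advanced_by_test.values())
    return max(values) - min(values)


for name, tests in schools.items():
    print(name, school_score(tests), spread(tests))
```

Both made-up schools come out with the same average, but the spread is much larger for School B, so looking at the gap alongside the average is one simple way to spot a school whose single number hides a weak subject.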

Assumption 3: We’re all white. NOTE: For the 2007 and later data, I used non-economically disadvantaged students rather than white students.

I only collected and compared scores of white/Caucasian students. I did this for two reasons.

First, I was specifically concerned with my kids’ opportunities, not yours. If you’re non-white, I’m sorry; you can try to figure this out yourself for your own demographic, and you probably should. If they had provided more specific demographics that relate to educational performance (e.g. parents’ educational level), I would have used those as well, but they only provide race and gender. My reasoning is that if a school is good for my kids’ particular demographic then it’s good for their education.

My guess is that in this data set, race is actually a proxy for economic class which, I think, is probably a better predictor for academic achievement. So my guess is that these results apply to all races having a similar economic status to the average white Madisonian.

Second, I had a feeling that the well-known and unfortunate “achievement gap” between whites and minorities makes the statistics not comparable between schools with varying minority populations.

Clearly the achievement gap is a problem in schools with significant minority populations and needs to be addressed. I’m just not analyzing or addressing it here.

3 comments:

jbhstats said...

Some folks might suggest that you only consider reading scores, since you can't be successful in school if you can't comprehend the material you are studying, but I don't think that there is anything wrong with your averaging scores across the different content areas, as you are aiming at an overall picture of school success.

You might want to consider restricting your data even further, though, to exclude free and reduced lunch students. This would eliminate poverty as a confound in your comparisons. (Note: the effects of poverty on performance were highlighted in Jason Shepard's piece in the Isthmus two weeks ago.) As we all know, poverty is not restricted to one ethnic group or another, and there is tremendous variability in poverty rates across the district and across the county. The result will make your comparisons across schools cleaner and easier to interpret.

celeste roberts said...

Since I saw your site for the first time today, I've been bothered by something. When you talk about being concerned about your own kids, you don't actually mean that you aren't concerned at all about all the other kids, do you? I assume that you worded it rather loosely. I care about my own kids in one way and desire that they attend schools where they are likely to thrive and grow, both academically and in character and socially. On another level I am concerned about the welfare of all children, both those I know personally and those I have never met. You probably meant that your concern for others' children is different in degree or nature and not what was driving you to perform this analysis. If you would clarify this, it would be a good thing.

Disaggregating the data this way is a useful tool. I notice that you do encourage others to perform this analysis for their own groups. Minority parents who did this would likely find that their children's chances of success vary by school. David Rusk has done something of the sort in his paper 'Classmates Count', which uses Madison elementary data from 1998-2001. His general conclusion was that minority students score better the higher the percentage of non-minority students in the school. It also may be that schools vary in how they implement programs designed to help minority children succeed.

Madtown Chris said...

I've looked further at the data and don't see school lunch, but I do see that there is an "economically disadvantaged" versus "not economically disadvantaged" breakdown. I'm not sure what the definition is, but doing that analysis would be interesting -- if I have time, I'll give it a try.

In answer to the comment about where I write "I care only about my children," what I mean is that I did this analysis because I was interested and concerned about the performance of area schools for them.

That is, if I could narrow the data down to "how do schools perform for white girls whose parents have graduate degrees and income X?" I would have done that analysis.

I am concerned about school performance for everyone but this analysis is intended to determine school performance for children like mine.