
When Baseball and Education Meet: Moneyball, the UFT and a Missed Opportunity

“… mighty Casey has struck out.”

Indeed, Ernest Lawrence Thayer. Little did you know in 1888 how well that closing line of “Casey at the Bat” would fit the recent debate involving Billy Beane’s Moneyball philosophy and how we might best evaluate our teaching corps.

I like baseball a lot. I’ve followed the game - and especially its stats - for as long as I can remember, and growing up in Cooperstown made that easy. For those of you who consider that too anecdotal, I studied baseball history with a pretty sharp scholar for a semester and, some 5 years ago, submitted a proposal to the Tufts Experimental College to teach a class on baseball history [proposal denied].

I also like education a lot.

Kevin Carey’s [The Quick and the Ed, Education Sector] recent piece about using “value-added” methodologies to evaluate teachers ["Value-Added Comes of Age"] raises interesting points about how we might assess our educators. The shorter, quicker version appeared in the New York Daily News, in which Carey says:

“In the late 1990s, Oakland A’s general manager Billy Beane revolutionized professional baseball by ignoring what his players looked like and focusing, objectively, on how they performed. Now New York City Schools Chancellor Joel Klein is trying to do the same thing for public education - if the teachers union doesn’t stop him first.

… And so, Klein plans to start using something called “value-added” data to figure out differences among teachers. The data compare annual test score gains in a teacher’s classroom to statistically predicted gains, given students’ backgrounds, previous academic history and a range of other factors.

At first, school officials are going to measure the performance of some 2,500 teachers - a small fraction of the 80,000-plus in the system. There are no plans to attach huge rewards or penalties to the results, at least not yet.”
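
For the stat-minded, the core idea in Carey's description can be sketched in a few lines of Python. This is a minimal illustration, not the model New York City actually uses: the function name `value_added` and every number below are made up, and a real model would have to produce the predicted gains statistically from students' backgrounds and prior academic history.

```python
from statistics import mean


def value_added(actual_gains, predicted_gains):
    """Average amount by which students' actual test-score gains
    exceed (or fall short of) their statistically predicted gains."""
    if len(actual_gains) != len(predicted_gains):
        raise ValueError("need one predicted gain per student")
    return mean(a - p for a, p in zip(actual_gains, predicted_gains))


# Hypothetical five-student classroom; all numbers are invented.
actual = [12, 8, 15, 5, 10]     # observed annual score gains
predicted = [10, 9, 11, 6, 9]   # gains a model predicted for the same students

score = value_added(actual, predicted)
print(score)  # positive means the class gained more than predicted
```

A score near zero says the classroom did about as well as predicted. The interesting cases are the teachers who land well above or well below that line.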

Then, as Drew Curtis of fark.com would say, hilarity ensued:

“Unfortunately, when the initiative was reported last month, United Federation of Teachers President Randi Weingarten rejected the notion outright, saying, ‘Any real educator can know within five minutes of walking into a classroom if a teacher is effective.’”

She might be right. I’d claim with certainty that I could walk into a room and know whether a speaker is shilling for a shameless partisan willfully blind to accountability measures and, like Weingarten’s teacher-judger, do it mostly on gut feeling. I’d probably be right, too.

But I’d rather focus on Leo Casey’s response on Edwize, the UFT’s official blog. In “BillyBall Strikes Out as Educational Model,” Casey says:

“It now appears that “Billyball,” as its advocates called Beane’s statistical approach, doesn’t have quite the track record of success Carey reported. The most famous account of Beane’s method was Michael Lewis’ 2004 book Moneyball, which looked closely at Beane’s 2002 draft picks, since the Athletics had accumulated an unusually large number of such picks that season. As New York Times sports columnist Murray Chass recently recounted, Beane’s 2002 choices chronicled in the Lewis book have proven less than felicitous. Statistically speaking, the other teams which picked based on scouting reports did better than the Athletics.

You can count us among the skeptics that evaluating teachers is a process akin to judging baseball talent. But it is interesting to know that the baseball model being proffered as a basis for judging teaching performance was not even successful on its own terms.”

You’re a skeptic because you want to be, Mr. Casey, and because “value-added” metrics might not serve the interests of the UFT - nothing more, nothing less.

Analyzing Edwize’s witless reaction means looking at three things: briefly, what Moneyball is all about; Murray Chass and his article; and how Edwize used the two.

For the Moneyball summary, I’ll defer to a friend of mine who works for an organization in Major League Baseball. He wrote to me:

“The UFT Blog has added itself to the roll of blogs misinterpreting and misrepresenting the meaning of Michael Lewis’ Moneyball …

… The philosophy espoused in Moneyball was not the use of statistics to analyze baseball that is so often stated. That has been known since Henry Chadwick [baseball writer and statistician, 1824 - 1908].

Moneyball was about the identification and exploitation of undervalued commodities. This is not applying statistics to baseball, it is applying business acumen to baseball, and understanding players as commodities. Billy Beane himself would definitely tell you that baseball players are more than commodities, that it’s important to understand their makeup and mien. But the point of Moneyball was that it was time to move further toward the analytical commodity side of the spectrum and away from the personal interpretation of scouts.”

After citing Terman and Binet as two fine examples of introducing new, effective metrics to improve our understanding of education, he concludes:

“Is identifying undervalued commodities, the true point of Moneyball, applicable to recruiting quality educators? Perhaps. And applying it to the educators themselves? If the observers of that community and the educators feel they are not doing so well enough with the current metrics (GPA, coursework, quality of institution, administrator evaluation), then perhaps new metrics and methods are warranted.”

Well put - it’s a worthwhile discussion that we all should have. Unfortunately, Casey is dutifully toeing the Wein-line. He isn’t interested.

And why should he take Carey’s ideas seriously? After all, New York Times national baseball writer Murray Chass declared Tuesday that Beane’s Moneyball philosophy was a flop. Chass took a look at the 2002 draft class - a small sample size in a business that has incredibly large fluctuation in outcomes - and ruled that the A’s had failed. You can read the article and judge for yourself, but there are several problems with Chass’ analysis:

  • Very small sample size;
  • Uses only the simplistic metric of Major League outcomes/statistics. Teams draft for their entire organizations - even non-MLBers can be successful draft picks;
  • Fails to account for money, economics or any other resource efficiencies.

Chass basically looks at the draft class, concludes that the A’s should’ve drafted Prince Fielder and declares that the organization’s scouting system failed. Draw your own conclusions about its validity.

That’s fine, because Murray Chass is a traditionalist to a fault and I’m used to him being obnoxiously wrong about some things. He’s grating and charming at the same time and sportswriting would be less interesting without him. Every sector needs a crotchety, stubborn, pigheaded SOB here and there, and Chass is one of baseball’s.

What isn’t fine is Edwize’s over-eager jump to the conclusion that Carey is wrong because Murray Chass’s painfully-inadequate piece contradicted him. A newsflash for Leo Casey: if you’re going to criticize the effectiveness of Moneyball, Murray Chass isn’t the guy you want to cite. Looking to Chass for an honest, informed, unbiased analysis of Moneyball is on par with asking Ahmadinejad to give a fair and balanced opinion on Israel.

But don’t take it from me, take it from Murray Chass.

Consider sections from this Feb. 27, 2007 piece in which he talks about the stats/metrics Moneyballers use. [VORP is "Value Over Replacement Player," a metric that expresses a player's contributions relative to a fictitious "replacement player" with average defensive statistics and below-average hitting; equivalent measures exist for pitchers]:

“I receive a daily e-mail message from Baseball Prospectus, an electronic publication filled with articles and information about statistics, mostly statistics that only stats mongers can love.

To me, VORP epitomized the new-age nonsense. For the longest time, I had no idea what VORP meant and didn’t care enough to go to any great lengths to find out. I asked some colleagues whose work I respect, and they didn’t know what it meant either.

Finally, not long ago, I came across VORP spelled out. It stands for value over replacement player. How thrilling. How absurd. Value over replacement player. Don’t ask what it means. I don’t know.”

Edwize, take note - you’re depending on the analysis of a man who:

  • Is open and honest about how valueless he considers any of these metrics to be;
  • As recently as a year ago, admitted explicitly that he hated the metric;
  • In the next sentence, admitted that he didn’t know what the letters even stood for and, in the same sentence, that he didn’t care to find out;
  • Asked like-minded friends to validate his willful ignorance;
  • Found out what the words mean without having any idea what the measure is;
  • Admits that, at the end of it all, he still has no clue what VORP is about. But he hates it.

Good, trustworthy source, Leo. Very UFT of you.

Casey and the UFT should be ashamed of themselves for playing politics instead of taking an honest look at a potentially valuable set of metrics. They think it’s absurd. They don’t ask what it means. They don’t know. And, as a result, kids suffer so the UFT can take a cheap shot. Absolutely shameful.

Chass ended that 2007 piece with this line:

“People play baseball. Numbers don’t.”

He’s right. Numbers don’t play the game - just like the proposed numbers that will help evaluate New York City’s teachers don’t teach Regents English. Instead, these numbers reflect performance so we can assess that performance in a meaningful way. Declaring or pretending that they’re anything else is engaging in disingenuous propaganda that results in a missed opportunity to make our schools better.

Mudville’s hero fell in 1888. 120 years later, UFT’s slugger stepped to the plate.

Same result.

ADDITIONAL READING:

  • Michael Lewis’ Moneyball
  • FireJoeMorgan.com’s Murray Chass archive

Fisking the National Association of Secondary School Principals on Two Million Minutes

The National Association of Secondary School Principals has released an official statement on the film Two Million Minutes: A Global Examination. On their Principal’s Policy Blog, NASSP introduces the film before applying their criticism:

“Recently, the Ed in ’08 campaign released, in cooperation with Broken Pencil Productions, a film that focuses on how students in the United States, China, and India use their “two million minutes” in high school. The film makes a compelling argument that the United States is losing its competitive edge in the global economy. Unfortunately, its lack of objectivity taints the central message and prevents a constructive dialogue around its theme.”

Two Million Minutes creator and Executive Producer Bob Compton provided a detailed - and respectful - response both on the 2MM blog and in the comments of NASSP’s original post. If you’ve got the time, read it in full. If you don’t have the time, this summary will suffice:

Compton/2MM: 1, NASSP: 0.

I wanted to take the opportunity to fisk the weirdly defensive statement by the NASSP.

Their first gripe:

“The film stacks the deck against U.S. high school students. The U.S. students the documentary profiles are in the top 5% of a school that is itself ranked in the top 5% of U.S. high schools. Although impressive, this does not compare to the Chinese students profiled. One had won a math competition that placed him among China’s top 100 mathematics students, which probably puts him in the top 0.000005% (or so) of Chinese students overall. A more balanced film might have taken top U.S. students from magnet schools such as Stuyvesant High School in New York City or Thomas Jefferson High School for Science and Technology in northern Virginia for their comparison.”

As Compton explains in the comments of their statement and on the 2MM site, the kids’ situations - parents/family, income, all-around achievement - were roughly equal. It’s awfully tough to take 6 high school kids from 3 different countries and ensure several constants, but the film does a fine job. Also, complaining about how the 2 US students can’t compare to the Chinese students - and then citing one example from one student and leaving the other Chinese student out of the discussion entirely - is sloppy and unconvincing.

Unless the NASSP has compelling data showing that the top 5% at the featured Carmel, IN school is significantly different than the top 5% at Stuy, I’m not interested. At this point, it’s useless conjecture.

SPOILER ALERT: It’s also important to note that despite the male Chinese student’s victory in that math competition - you know, the one that places him in the “top 0.000005% (or so) of Chinese students overall,” he failed to be admitted to his school of choice. The Chinese girl? Not admitted to Yale. The Indian boy and girl were both denied admission to the universities of their choice.

If Broken Pencil Productions “stack[ed] the deck,” as the NASSP claims, they did a rotten job.

Their second gripe:

“The film implies that engineering alone will set you free, and devotes almost no attention to success in other academic subjects. The Indian and Chinese students all excel in mathematics and science, and most plan to be engineers. Although we’d all like to see more students pursue engineering, Dan Pink and even Tom Friedman convincingly argue that right-brained activities should not be sacrificed at the altar of technical proficiency.”

Stating that the film “implies that engineering alone will set you free” is disingenuous and borders on obnoxious. I’m not sure why the NASSP is trying to pick a fight here, but they could have chosen an argument that didn’t rely so heavily on logical fallacy.

If we want to pull any useful conclusions - even tiny ones - out of comparing 2 students each in 3 different countries, we need a common thread. 2MM chose math/science - all 6 students are working toward fields that use both. Math/science provides a common international language, as so many jointly-awarded Nobel prizes have shown us. Just imagine how disjointed the film would be if we were comparing a physics whiz in Bangalore to an aspiring romantic poet in Texas. Though I’ve only viewed the film 3 times, nothing comes to mind that suggests for a second that writing/communication, creative endeavors, etc. are less valuable - let alone “sacrificed at the altar” of technology, science or math. Grow up, NASSP.

And, if you’re keeping score, note that the NASSP’s Gripe #1 was about how 2MM compares apples to oranges, while Gripe #2 laments 2MM comparing apples to apples.

On to Gripe #3:

“The film engages in some statistical sleight of hand. While presenting disheartening statistics about U.S. dropout rates, the documentary presents no comparable statistics from China or India - and little information about school access and how students are tracked as they progress to secondary level education. A quick Google search on Asian dropout rates, for example, reveals that the primary school dropout rate in India is a staggering 53%. Nowhere in the documentary is there a conversation about closing the achievement gap in China or India.”

China is, by all facts and figures, a big place. So are India and the United States. I see this omission not as “sleight of hand,” but as pragmatic.

To suggest that seeing dropout rates for China and India might add to the discussion is valid to some degree. But the challenges that the United States faces in educating all of its youth are quite different than those in China and India - and those challenges make up the context within which statistics operate. Without devoting a great deal of time to providing that context, including a raw stat gives little value. From what I understand, the film is for United States educators, businesspeople, students, parents and taxpayers - we can give them stats for the US and they largely understand the context without the film having to provide it.

I have to guess at this - after all, I didn’t make the film - but I think the reason that “[n]owhere in the documentary is there a conversation about closing the achievement gap in China or India” is that the film isn’t about that.

Two Million Minutes also fails at:

  • Teaching you how to hit a curve ball [Rob Ellis can tell you, though];
  • Debating the merits of Brig. General Joshua Chamberlain’s salute of Confederate soldiers upon their Appomattox surrender;
  • Explaining how Rick Astley’s “Never Gonna Give You Up” fell far enough in popularity to even set itself up for a rebound.

Pursuing a massive, complex tangential topic - no matter how important that topic is - would do a disservice to 2MM’s point and, for example, to “closing the achievement gap in China or India.” The NASSP concludes:

“Two Million Minutes opens a conversation about what we value in U.S. culture and the reality of a global economy. But it fails to prove its case against U.S. public schools.”

Agreed - Two Million Minutes does a remarkable job of opening a conversation that we desperately need. It does not, however, open a “case against U.S. public schools” - any inference along those lines is owned entirely by NASSP. As Compton wrote in his reply, “Does it matter to America’s economic future that Indian and Chinese students spend more time building their intellectual foundation than American students?”

2MM is about how we use our time and the opportunity costs of those decisions.

I lament that the NASSP ignored their own conclusion - that the film “opens a conversation about what we value” - so they might release an official statement about a fabricated one-hour attack on public education. I’d prefer to see school leaders discussing how their schools and districts can use this state of affairs to forge better curricula and better instruction. The NASSP dropped the ball.

Then again, I’d probably be awfully defensive and insecure if my organization’s members averaged a GRE score of 950 [Verbal: 427, Quantitative: 523, national mean for those pursuing graduate degrees in Education Administration, page 19 of PDF].

Those one-dimensional engineering technonerds the NASSP so derided in Gripe #2? Well, their Verbal mean is 467 [40pts higher than education administration tracks], with a predictably-higher Quantitative score of 720 [197pts higher, page 18 of PDF] for a total mean of 1187.
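
The arithmetic behind those comparisons is easy to check. Here's a quick sketch in Python that uses only the means cited above; the variable names are mine and nothing else is assumed.

```python
# Mean GRE section scores as cited in the text above.
education_admin = {"verbal": 427, "quantitative": 523}
engineering = {"verbal": 467, "quantitative": 720}

# Combined means for each group.
admin_total = sum(education_admin.values())
engineering_total = sum(engineering.values())

# Section-by-section gaps (engineering minus education administration).
verbal_gap = engineering["verbal"] - education_admin["verbal"]
quant_gap = engineering["quantitative"] - education_admin["quantitative"]

print(admin_total, engineering_total)  # 950 1187
print(verbal_gap, quant_gap)           # 40 197
```

Put together, that's a 237-point difference in the combined means.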

I’m not much of a religious man, but I sure can tell when an organization needs a dose of Luke 4:23.

Happy Valentine’s Day, Education Blogosphere: A Poem

Happy Valentine’s Day, everyone! Hopefully your Day will encompass all the warmth and sweetness of this ferret scene.

Eduwonkette’s got a Valentine’s Day-themed poetry contest running strong. Some are light-hearted and giggly, others are kinda morbid. I like most of them, and they’re worth a read.

I think my favorite comes from NYCeducator:

To Joel Klein:

Roses are red,
Violets are blue,
My school’s over 250% capacity,
Can my 5th period class meet in your office?

That one got a real life EL-OH-EL.

So, here’s my humble submission, doggerel though it may be. I’ve embraced web 2.0 and put in lots of links, from opinion essays to blog posts to tired, fluff-ridden videos.

There’s even a point at the end - and a call for a gentle armistice. Enjoy!

Roses are red,

Overalls are dapper,

Let’s blame racism, poverty, repressive Rethuglican leadership, NCLB/all standardized tests [<---warning: unbearably hokey YouTube link], low teacher pay, failure to recognize and cater to multiple intelligences, evil charters and lack of SMART boards for our public education failures,

While teacher and administrator GRE scores are in the crapper*. [PDF, opens in new window, see pages 18-20]

But for Valentine’s Day, let’s all just hug! <33333

*Seriously, a mean in the 900s is terrible. Won’t more education professionals admit this?

School’s Closed: Here’s Why

There’s plenty of heavy, wet snow on Christian Hill [formerly Wood Hill].

Not pictured: Polar bears rejoicing at the invigoration of their habitat, Al Gore stumbling to find an explanation, former UK government officials re-spinning snowfall to blame Israel for global warming, very cold Chinese marveling at solidarity with Americans who don’t have four-wheel drive.

Poll: Do You Know William Arrowsmith?

[If you're reading this post via RSS, click here to take the poll.]

It’s a simple question, really - you know him or you don’t.

Since I don’t know what you know or don’t know, the best way to find out is to ask. Do you know William Arrowsmith? Please choose an answer below. It’ll only take a second.

And when you answer, answer honestly. Not only does no one like a cheater, but there’s a relatively-unknown circle of Hell reserved for those who cheat using Google.

I’ve wanted to write something up about a few of Arrowsmith’s points for years now. I started re-reading a piece to think about how I’d approach it when I realized that I have absolutely no clue whether anyone knows the man or his work.

So, these results, albeit from a small sample size, will give me an idea of how to approach this project. [I also get to test out this nifty AJAX WordPress poll plugin].

