Baseball in The Simpsons - A Data Analysis

Homer at the Bat (Credit: FOX)


Since one goal of this blog is to discuss the role of baseball in popular culture, I took it upon myself to analyze data from The Simpsons, given its history of featuring baseball rather prominently; everyone knows the episode Homer at the Bat, which earned itself a place in Cooperstown in 2017. Homer Simpson even has his own page at SABR. With that said, I’ll apologize in advance for making the following post sound more like an academic assignment than a blog post.

This post will be just one part of our look into baseball on The Simpsons because, as the reader will find out, the data had obvious limitations in explaining the prominence of baseball within a given Simpsons episode – at least using our methodology.

For the TLDR crowd, I encourage you to check out the infographic on baseball in The Simpsons. In fact, everyone should check it out because it sums up the information in this post really well.


One might assume the frequency of the word “baseball” within a given episode’s transcript would be a solid indicator of baseball’s prominence in that episode; the more it’s mentioned, the more thematically important baseball becomes…right?

No, not necessarily.

Would you have guessed that “baseball” comes up a grand total of once in Homer at the Bat? You could even argue that it doesn’t come up at all; Mr. Burns delivers the line to Smithers:
I've decided to bring in a few ringers-- professional baseballers. (Swartzwelder, 1992)

And that’s as close as the most celebrated baseball episode of The Simpsons ever gets to uttering the word “baseball”. It’s worth noting that “softball” comes up 12 times in the episode, which shows the value of widening our proxy for “baseball prominence” to include more of the baseball lexicon. In other words, while the frequency of the word “baseball” by itself appears to be a poor measure of baseball’s significance to a Simpsons episode, adding the frequency of related words or phrases can vastly improve it. The best selections for this exercise are words that are spoken almost exclusively in reference to baseball, or that originate from the baseball lexicon – these are the ones that can’t be mistaken for something else, at least in most cases. Great choices include “home run” and “foul ball” because they’re almost always a reference to baseball; even when “home run” is used metaphorically, baseball is still its origin. The same can’t be said for bad examples like “pitcher”, “ace”, or “bat”; a pitcher can refer to a pitcher of beer (particularly in The Simpsons), a sense that, unlike the metaphorical “home run”, doesn’t come from the baseball lexicon.

Running through the entire lexicon of baseball-related phrases, words, and names is something I won’t be doing: the quick way is admittedly beyond my ability, and the long way is beyond my patience. For simplicity, I selected 12 terms whose combined frequency serves as a proxy for “baseball prominence”:
1. Baseball
2. Softball
3. Home run
4. First base
5. Second base
6. Third base
7. Shortstop
8. Left field
9. Center field
10. Right field
11. Isotopes
12. Isotots

The first two are obvious choices, and the third has already been explained. The infield positions (minus the battery) and the outfield positions also meet the conditions that make a given term ideal for inclusion. That leaves the final two selections, which stand apart from the rest because they clearly don’t originate from the lexicon of baseball. You can learn about isotopes by studying nuclear physics, but we’re more interested in the lexical context within The Simpsons universe, and that’s why “isotopes” and “isotots” are excellent inclusions. The Springfield Isotopes and the Springfield Isotots are baseball teams – the former is a minor league club and the latter is a youth team that Bart plays for (and Lisa temporarily manages). As a result, whenever the word “isotopes” or “isotots” is spoken in a Simpsons episode, it’s practically guaranteed to be in reference to baseball.
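For readers who want to try the matching themselves, here is a minimal Python sketch of how the 12 terms could be counted in a piece of dialogue. The function name count_proxy_terms and the whole_words switch are my own illustrative choices, not part of the original analysis. Substring matching is assumed as the default because that is the only way “baseballers” registers as a hit for “baseball”.

```python
import re

# The 12 proxy terms listed above, lowercased for matching.
PROXY_TERMS = [
    "baseball", "softball", "home run",
    "first base", "second base", "third base", "shortstop",
    "left field", "center field", "right field",
    "isotopes", "isotots",
]


def count_proxy_terms(text, whole_words=False):
    """Return {term: count} for every proxy term found in text.

    With whole_words=False (an assumption about how the original tally
    was made), "baseballers" counts as a hit for "baseball".
    """
    # Lowercase and collapse whitespace so "Home  Run" still matches.
    normalized = re.sub(r"\s+", " ", text.lower())
    counts = {}
    for term in PROXY_TERMS:
        pattern = re.escape(term)
        if whole_words:
            pattern = r"\b" + pattern + r"\b"
        hits = len(re.findall(pattern, normalized))
        if hits:
            counts[term] = hits
    return counts


line = "I've decided to bring in a few ringers-- professional baseballers."
print(count_proxy_terms(line))                    # {'baseball': 1}
print(count_proxy_terms(line, whole_words=True))  # {}
```

The two results mirror the point made about Homer at the Bat: whether “baseball” comes up once or not at all depends entirely on how strictly a match is defined.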

NOTE: The inclusion of “isotots” introduces a small amount of bias in the results since the Springfield Isotots weren’t introduced until Season 18. Therefore, episodes from seasons 18 and beyond are going to reflect greater baseball prominence, all else being equal.

The data comes from a Kaggle dataset that includes almost every line from almost every episode from seasons 1 through 26 (the 26th season is missing the final 10 episodes, presumably because they hadn’t aired at the time the data was compiled). While the show is well past 700 episodes at the time of this writing, we still have plenty of data to work with, and a sample representing well over 75% of the parent population.


Seasons 2, 3, 12, and 22 stand out as the only seasons with at least 300 baseball words per million spoken. Putting it into this context may be somewhat deceptive, however, as the spikes are largely driven by individual episodes. In fact, no season had more than 2 episodes in which a word from our baseball prominence proxy was spoken at least twice. This suggests that baseball is explicitly a part of the show’s dialogue a maximum of twice per season – and (check out seasons 9 and 16) a minimum of never!
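To make “baseball words per million” concrete, here is a rough Python sketch. It assumes the measure is simply proxy hits divided by total spoken words, scaled to one million; that definition is my reading of the phrase rather than a documented formula. The episode rows are made-up numbers chosen only to show how a single episode can drive a season’s total.

```python
def bwpm(baseball_hits, total_words):
    """Baseball words per million spoken words (assumed definition)."""
    if total_words <= 0:
        raise ValueError("total_words must be positive")
    return baseball_hits * 1_000_000 / total_words


def season_bwpm(rows):
    """Pool hits and words by season, then normalize per million."""
    totals = {}
    for season, _episode, hits, words in rows:
        season_hits, season_words = totals.get(season, (0, 0))
        totals[season] = (season_hits + hits, season_words + words)
    return {s: bwpm(h, w) for s, (h, w) in totals.items()}


# Hypothetical (season, episode, proxy hits, spoken words) rows.
episodes = [
    (1, 1, 0, 2500),
    (1, 2, 30, 2500),  # one baseball-heavy episode
    (2, 1, 1, 2400),
    (2, 2, 0, 2600),
]

print(season_bwpm(episodes))  # {1: 6000.0, 2: 200.0}
```

In this toy data, season 1 lands at 6,000 per million purely because of one episode, which is the same effect behind the spikes in the chart.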

Baseball in The Simpsons by Season

Although the chart visualizes seasons using baseball words per million on the y-axis (the quantified form of the proxy), my definition of “frequency” within episodes is more basic: quite simply, it’s the number of times any one of our proxy’s words is spoken in an episode. For example, in Season 26 Episode 9, titled I Won’t Be Home for Christmas, “baseball” is spoken twice and “left field” is spoken once (none of the other words are spoken at all), giving us a grand total of 3 (Banerjee, 2019). Disappointingly, the proxy indicated a frequency greater than 3 in just 7 of the 564 episodes for which we have data – that basically means baseball was a greater part of the dialogue than it was in Season 26 Episode 9 in just 7 episodes.
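The episode-level tally can be sketched in a few lines of Python. The helper name episode_frequencies, the (season, episode) keys, and the sample dialogue are all invented for illustration; the lines are not actual quotes from the show. They are arranged to reproduce the 2 + 1 = 3 arithmetic from the Season 26 Episode 9 example.

```python
from collections import defaultdict

PROXY_TERMS = (
    "baseball", "softball", "home run", "first base", "second base",
    "third base", "shortstop", "left field", "center field",
    "right field", "isotopes", "isotots",
)


def episode_frequencies(lines):
    """Sum proxy-term hits per episode.

    lines is an iterable of ((season, episode), spoken_text) pairs.
    """
    totals = defaultdict(int)
    for key, text in lines:
        lowered = text.lower()
        totals[key] += sum(lowered.count(term) for term in PROXY_TERMS)
    return dict(totals)


# Made-up dialogue, not real transcript lines.
lines = [
    ((26, 9), "Who wants to talk baseball?"),
    ((26, 9), "That came out of left field."),
    ((26, 9), "Baseball can wait until spring."),
    ((9, 1), "Pass the pitcher of beer."),
]

freq = episode_frequencies(lines)
print(freq)  # {(26, 9): 3, (9, 1): 0}

# Episodes where the proxy exceeds the Season 26 Episode 9 total of 3.
prominent = [key for key, n in freq.items() if n > 3]
print(prominent)  # []
```

Applied to the full dataset, a filter like the last one is what would single out the handful of episodes with a frequency above 3.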
This doesn’t mean baseball has no role in the episodes where the proxy indicates little-to-no frequency. We see Milhouse attempting to purchase a 1973 Carl Yastrzemski card “with the big sideburns” from Comic Book Guy in Three Men and a Comic Book before he’s convinced to go in on Radioactive Man #1 with Bart and Martin (WikiSimpsons). Despite the mention of Yaz, the proxy indicates a frequency of 0 in that episode because no words originating from the baseball lexicon are actually used. Similarly, in the Season 18 episode Homerazzi, we see Jose Canseco exiting the dry cleaners with what is almost certainly his #33 A’s jersey. No words are spoken in this sequence, so the proxy obviously picks up nothing, but it’s nonetheless an obvious baseball reference.

This also doesn’t mean the proxy failed. While a non-sequitur baseball reference in an episode may yield a frequency of 0 or 1, the proxy succeeds at indicating baseball’s prominence within an episode – it doesn’t miss any episode where baseball is pivotal to the plot. Both Sal Bando and Gene Tenace lend their voices to the show in Season 17’s Regarding Margie, which features them driving down Evergreen Terrace in their 1974 A’s uniforms. Right before this, after painting “’74 Oakland A’s – Best Team Ever” on the sidewalk in front of the Simpsons’ house, Homer tells Lisa he wants the 1974 A’s to know what he thinks of them. Despite the voice acting from former players and Homer’s brief dialogue about the A’s, this episode yields a frequency of 1 from our baseball prominence proxy (Banerjee, 2019). Why? Because baseball isn’t pivotal to the plot – if it were, it would have been in the dialogue more and the proxy would’ve picked up on it.

The punch line here is that, although references to baseball in The Simpsons are ubiquitous, they’re often subtle. Baseball is relevant to the plot in a handful of episodes, but there are only 7 (according to the proxy) where it’s noticeably prominent. If we’re to take the proxy as the objective indicator of baseball prominence, those 7 episodes, listed with their baseball words per million (BWPM), would go in this order:

Rank  Season.Episode  Title                                BWPM
1     22.3            MoneyBart                            10,363
2     3.17            Homer at the Bat                     6,906
3     12.15           Hungry Hungry Homer                  6,778
4     2.5             Dancin' Homer                        4,071
5     10.11           Wild Barts Can't Be Broken           4,679
6     18.18           The Boys of Bummer                   4,004
7     17.22           Marge and Homer Turn a Couple Play   3,527

It should come as no surprise that the seasons these episodes come from are the ones most prominent in the chart of BWPM by season. As I mentioned earlier, the results for the show’s seasons are largely swayed by a single episode, particularly with respect to the 7 episodes above. Nonetheless, these episodes are the ones to watch if you need The Simpsons to fill a baseball void for the better part of their air time.

If you didn't check out the infographic earlier, here it is in all its glory:

Baseball in The Simpsons Infographic

More to come on this matter. Stay tuned.


