Tod Newman – todnewman.com

September 25, 2024

Book Review – Portrait of a Lady by Henry James

Bernardino Luini, Milanese, c. 1480 – 1532, “A Portrait of a Lady”, National Gallery

The Portrait of a Lady is a novel by Henry James, first published as a serial in The Atlantic Monthly and Macmillan’s Magazine in 1880–81 and then as a book in 1881. It is one of James’s most popular novels and is regarded by critics as one of his finest. I picked it up recently after once again rejecting James Joyce’s Ulysses for being insipid and unreadable (more on that perhaps in a different post). After confronting Joyce’s vain, meaning-free language, Henry James’ writing style was very pleasant and quite thought-provoking. Like Joyce, James is wordy, but his sentence construction conveys his thoughts clearly, as opposed to Joyce, whose experimental prose seems to primarily convey feeling and sentiment and often resists anything that has the appearance of truth. Henry James, however, is very interested in his characters and their depth and uses them to convey hard truths about life. This is my opinion, of course and it is your right to disagree!

In this book, James brings to the surface the conflict in his time between the spirit of independence and social norms. Throughout the book, James uses America and Europe as symbols of these qualities. America, of course, was at the time a symbol of innocence and individualism. Europe, on the other hand, is revealed as a picture of sophisticated tradition and social convention. Things have changed quite a lot since the late 1800s, and a current author might no longer choose America and Europe to illustrate this point, but there are plenty of targets available. Independence is viewed differently in the modern West than it was in Henry James’ time. While it may have been surprising and refreshing to read about a bright, independent young woman in the late 1800s, in the modern West it would not capture the same level of interest, because society encourages the notion of independent thinking. Even in countries touched by communal or Marxist traditions, the idea of being an independent thinker is generally accepted. In both types of societies, though, I’d submit that the actual practice of independent thinking is often discouraged through various suppression techniques and fashions. That said, we meet our main character, Isabel Archer, early in the novel, and her notions of independence and the unfettered pursuit of experience stand out from the European formality and fatalism that surround her. The reader is excited to see her challenge and conquer these societies with her intelligence and energy. All goes well for her, and she comes into a small fortune through the death of an uncle and the intervention of a well-intentioned cousin. Isabel demonstrates her freedom and desire to follow her own unique path by rejecting marriage proposals from a British lord and an American businessman, both of whom meet her and love her before her fortune is made.

Without giving away spoilers, what we see next in the novel confirms many prejudices about how a fortune can negatively impact not just a life, but also one’s true independence. Isabel finds herself a target of scheming by unscrupulous expatriates, and life crashes down hard on her. We see very complex characters. One is truly Machiavellian and evil in his self-regard and vanity. Another chooses to demonstrate proper behavior in society to set up complex social networks and schemes that solve her money problems (and also other, more challenging issues). One of my favorite characters is Isabel’s American friend, Henrietta, a direct and somewhat nosy news reporter. Henrietta grows significantly during the novel and becomes a true friend to Isabel by the end.

As the novel’s plot unfolds, we find Isabel becoming a loving stepmother to young Pansy Osmond, who looks to the reader just as we might expect a convent-raised wallflower to look. By all appearances, she has no imagination and has been trained to be passive and obedient to her manipulative father. To our modern sensibilities, her character for much of the book is disappointing, but when we are able to see a bit deeper beneath the surface, we find that our modern western prejudices have betrayed us a bit. Perhaps Pansy is more complex than we thought? This is the thought that I might close with, because The Portrait of a Lady has much in it to trigger our modern fashions. Many will be unable to get beyond this, as our culture maintains just as strong a hold on our imaginations – and is as insidious – as the late-1800s European behaviors that we have learned hostility towards. We moderns unfairly judge many of the characters of this book for actions and behaviors which are merely manifestations of their culture, but we are challenged by other characters whose narcissism and manipulation look like the evil that we often choose to paint over.

July 26, 2024 (updated July 31, 2024)

Thoughts on Motivation Management

AI generated Image “visualizing the management of worker motivation”

I mentioned in my last post on knowledge-work productivity that understanding how to manage motivation to do specific tasks will result in overall higher productivity. As someone who has thought about this (and experimented with approaches) for years, here are some of my thoughts:

Set up Spaces Focused on Improving Motivation: This is a general idea to increase your productivity overall. It also includes the notion from my last entry on investing in quality tools. I’ve learned that it is not smart to skimp on tools (computers, notebooks, monitors, other work infrastructure) that are higher quality and likely to give you joy to work with. Obviously, one can’t always have the best of everything, but for the key, strategic tools that you work with, my experience is that it is smart to spend the extra $50 or $100, especially on a tool you may spend years working with. Key examples of this are 1) my mechanical keyboard: typing on this keyboard is satisfying in some weird way and I find that I’m unhappy working without it; and 2) my noise-cancelling Sony wireless headphones. What a lucky thing that I spent the money (they’re not cheap) to buy these right before COVID… I had originally intended them for use on flights, but of course COVID ended that idea and it morphed into use during Zoom calls. They were an absolute lifesaver for remote work and I was constantly grateful for them.
1. In addition to quality tools, though, envisioning how your whole workspace could be “architected” to draw you in to work more effectively is time well spent. Things like KVM switches or USB hubs are important for me, because they allow quick, painless switching between a work computer and one or two personal computers. If you have a powerful personal computer or two (I use a MacBook Air and a powerful Linux workstation with big GPUs) that are better suited to certain kinds of tasks, then the ability to move common resources like the big monitors, mechanical keyboard, wireless mouse, or upgraded camera between them makes switching a breeze.
2. Obviously things like a good chair and desk are important. Some people revel in their standing desks because it allows them to work in completely different contexts. I also like my office decorations (lots of tall ships and John Wayne paintings) because they make me happy to be in the office, which can improve my motivation to work.
Track your Work: I think we all feel motivated when we make visible and measurable progress on a task. This, to me, is the chief value of a tracking schedule. I like checking off the boxes on a task! An example beyond the tracking schedule is the word count tracker that I use when I’m working on a book. I always have a daily goal to write some manageable number of words (usually like 100 or so) in the hope that I actually sit down, get in the zone, and crank out thousands of words. Knowing the value of visibility of work, I add to this a tracker with two columns, date and cumulative word count. I then do a simple line plot that shows the slope of my writing accomplishments. Some days I write more and then get to see a steep slope of word count output for the day or week. This motivates me to keep it going! I refer to the word count tracker as a commitment device, and I think of my iOS “Streaks” app (well worth the $3 or so on the App Store) as a commitment device that also helps build habits. And building habits gets us in that seat to do that writing (or gets us to the gym to lift, or reminds us to floss every day).
Manage an Overall Work List: I have found that having a quite varied list of things that need to be accomplished helps with the times when I just don’t feel like doing anything. I’ve learned that when I feel this way, I can almost always find something on the overall work list to do. If I’m feeling more like refinishing a piece of furniture than building the AI classifier for the dataset I’ve been looking at, I am able to work on the furniture without any sense of guilt. My experience is that there will come a moment when I’m very motivated to explore that dataset, and if I ride the wave of that motivation, I’ll do the work much more efficiently.
Know When You Work Well on Types of Tasks: I know that in the early morning, I’m much, much better at tasks that require creativity and challenge my thinking. Conversely, right around 2 PM after I’ve eaten lunch is NOT a good time for these tasks. I have found that 2 PM is better for walking around and doing gardening tasks or working on other more physical kinds of chores. Don’t fight these kinds of patterns!
Celebrate Completed Work: I have a strong bias to finishing tasks that I’m working on instead of just pushing them forward a few steps. I think this allows one to be much more productive. So to support this, I celebrate completions. Perhaps finishing a major task means that I drive to Dairy Queen for a sundae. Or maybe it just earns me a short nap (I’m a big fan of the 13 minute nap for rejuvenation). The act of celebrating a completion is important and helps you build enthusiasm for the next task.
Work to Identify how to fit your Motivation into an Employer’s Goals: OK, sometimes your employer doesn’t have the intimate knowledge of your motivation cycles and just wants you to work 9 hours straight on the most important tasks. If anyone is able to do this day after day, I’d love to know about you! Your employer isn’t really able to measure your productivity and assumes that measuring your hours-worked (and maybe how late you stay at the office) is the best proxy for productivity. Of course this is completely false. Any employer who cares at all about quality and productivity SHOULD be focused on tapping their employees’ best hours for the jobs. So many mistakes have been made by employees who were burned out, mentally exhausted, and working on activities they are far over-qualified for. Since the employer can’t manage this, somehow the employee needs to focus hard on maximizing their productivity through working tasks when they are most fit (and motivated) to work them. There is probably a whole lot more to be said about this, and possibly some of it would be controversial. I heard the quote once, “good employees do exactly what their employer tells them to do. Great employees conspire to make their employer astonishingly successful.” This is interesting to consider, especially in light of managing your motivation cycles.
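For anyone who wants to try the word count tracker described under “Track your Work”, here is a minimal sketch in Python. The dates and counts are made-up placeholders, and the function names are my own for illustration; a spreadsheet with a line chart does the same job. It keeps the two columns (date, cumulative word count), works out the slope between entries, and prints a crude text version of the line plot.

```python
from datetime import date

# Hypothetical log: (date, cumulative word count), one row per writing day.
# Dates are assumed to be distinct and in increasing order.
log = [
    (date(2024, 7, 1), 1200),
    (date(2024, 7, 2), 1350),
    (date(2024, 7, 3), 2900),
    (date(2024, 7, 5), 3050),
]

def daily_rates(rows):
    """Slope between consecutive entries, in words per day."""
    rates = []
    for (d0, w0), (d1, w1) in zip(rows, rows[1:]):
        days = (d1 - d0).days
        rates.append((d1, (w1 - w0) / days))
    return rates

def text_plot(rows, width=40):
    """One bar per entry, scaled to the largest cumulative count."""
    top = max(w for _, w in rows)
    lines = []
    for d, w in rows:
        bar = "#" * round(width * w / top)
        lines.append(f"{d.isoformat()} {bar} {w}")
    return "\n".join(lines)

rates = daily_rates(log)
print(text_plot(log))
for d, r in rates:
    print(d, f"{r:.0f} words/day")
```

The steep days stand out immediately in the words-per-day column, which is the whole motivational point of the tracker.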

July 19, 2024

Productivity and Cycle Time in Knowledge Work

Here’s a bit of a diversion from my normal data-oriented posts, but in a previous job, as a data-driven systems thinker it was natural for me to explore and try to understand how to measure productivity and cycle time. The work outcomes that the organization needed to understand better tended to be heavy on system design tasks but also extended into the work needed to set up product lifecycle cash flow.

Background on Productivity in Design Work

It was always a struggle to measure productivity (and cycle time) in this kind of environment, because it was extremely challenging to identify and measure the most important, value-producing events in the workflows. One exception: in system design, it was extremely easy to measure the productivity of one of the major bottlenecks in the process… (drum roll)… drafting! Why is drafting a bottleneck in the design of a system? Well, not only does someone in a factory have to assemble the product you just designed, but you also have to ensure that the supply base can be enabled and protected. Often the bill of materials on a drawing is the entry point for most of the complex activities performed by the supply chain organization. Additionally, quality needs to be protected and the “recipe” for building your system cannot be lost. All of these objectives made drafting a long, tedious process with which designers and manufacturing engineers both expressed impatience. I went a bit long on this, but maybe you can see why drafting productivity is easy to define and measure. The drafting team is working on one product, is generally not multitasking, and the start and completion dates and times of the work are easy to capture, allowing the cost of the drafting product to be easily normalized by the hours spent (resulting in dollars per hour). Perhaps even the whole process can be measured from logs built automatically by the drafting software.

However, most of the rest of the design process is not so easy to measure. It spreads across many teams, all of which have some sort of dependency on other teams, and each of which has its own “special sauce” and tasks that build upon tasks. Applying queueing theory in manufacturing facilities, where assembly tasks depend on multiple preceding steps, is hard but doable because in manufacturing the product is generally always visible to the eye. In design, however, the product can be ideas, processes, models, and pieces of documentation and is rarely visible in the same way.

So with that as background, I recommend the following YouTube video if you have interest in improving your true productivity in a “knowledge work” environment. I agree wholeheartedly, but watch the video and I’ll add my fourth principle to Cal Newport’s three that he offers.

DO FEWER THINGS – Cal Newport

OK. Did you watch the video? What does Cal offer up as his three principles?

Do Fewer Things – Or better, “Do fewer things at once”. We all can chant “multitasking is a productivity killer”, but most of us still think we’re pretty good at it. Regardless, however, the point is not that you’re bleeding productivity, but that you’re probably doing things that have no impact on your life, your enjoyment of your work, and even on your final work product.
Work at a Natural Pace – This doesn’t mean “work slowly” as one might imagine, but really involves something I think about as working as an extension of your life. How do YOU work best? Should you spend more time putting the thing you’re working on down and thinking about it more? Do you work better if you spend time to kit up the parts you’re assembling (or build UML models of the code you’re building) first?
Obsess over Quality – I really like Cal’s point in the video that investing in quality tools (e.g., my MacBook, which really makes me happy to code or write on) is a way to signal to yourself that your work is important and that you ought to do the important parts as well as they have ever been done. He uses his $50 grad school lab notebook as a great example of this. How can one take lazy, incoherent notes in a really nice, expensive lab notebook (presumably with a very nice pen you’re proud of)??

My Fourth Category as Promised

Here’s Tod’s add. Maybe this is particularly me or maybe this is a pretty general thing, but here it is if it helps you.

4. Manage your Motivation: Your work productivity benefits (by perhaps an order of magnitude) when you are motivated to do the work. For years I have pondered the difficulties and tricks for optimizing motivation cycles and have found that I do fabulously better work when I am “all in” on getting that work done. Sometimes, of course, one unfortunately doesn’t have the option of working on the “blue widget” when the motivation hits because the boss is impatient. But I’m going to guess that during those times, “blue widget” productivity and quality suffer significantly because the work is being done out of obligation and not desire. Perhaps these times correlate strongly with surfing Reddit or YouTube?

Waiting for motivation is also impractical if your ability to manage it is weak and your motivation cycles are too sporadic. A cursory scan of my blog would probably reveal that I love to write (see my series on self-publishing). Unsurprisingly, I find that I write my most interesting and creative passages when the motivation to write hits, but I have also learned that I really need to create “commitment devices” to help ensure that I can channel that motivation into daily writing sessions. I imagine (or hope) that this is universal, as I have heard similar things from other writers or musicians.

Recap

Productivity in knowledge work is really hard to get one’s head around. It’s hard to define, difficult to measure (and to automate the measurement of), and really challenging to normalize cost by the hours spent on the task. It feels like Cal Newport’s suggestions won’t necessarily resolve this difficulty, but, measurement aside, they might improve productivity by focusing effort on the most critical parts of the task.

July 13, 2024

Major League Baseball – Did Banning “The Shift” work?

Positioning of infielders under new rules (image from https://www.mlb.com/glossary/rules/defensive-shift-limits)

Abstract

In 2023, Major League Baseball introduced a rule restricting where certain defensive players may stand on the field (see here for the actual definition of the rule). This was done to combat “The Shift”, a defensive technique popularized by the Tampa Bay Rays in 2006, in which one side of the field is overloaded with players.

I had the notion at the time that banning the Shift was just a band-aid measure and would have no impact. Since the ban was in 2023, we have had one full season to evaluate any impact of the ban.

History of The Shift

The idea of shifting players to counter power hitters’ tendencies to pull the ball to one side goes back to the early days of baseball. It disappeared for a long time, however, until the Rays had the idea of using it to shut down David “Big Papi” Ortiz of the Boston Red Sox, a left-handed hitter who had great power pulling the ball down the right side of the field. Joe Maddon, the manager of the Rays, used Sabermetrics to identify that Ortiz hit nearly every time to the right side, and mostly to the outfield. The ploy was effective, and Ortiz, who had hit over .300 from 2004 to 2006, dropped to .265 midway through the 2006 season after multiple teams started copying the Rays’ technique against him.

The Shift attracted a lot of fan attention because it was often deployed against the most well-known power hitters and was seen as stifling to the offensive aspect of the MLB. Eventually, it was banned (limited, actually, see the definition above for detail) and the 2023 season was the first to be held without the old, dramatic version of the Shift.

See below for an image of the Shift being applied by the Angels (there’s an extra person in the shortstop position).

Hypothesis

Based on the way the Shift was deployed, I figured that to demonstrate whether the rule banning the Shift had any effect, I would have to evaluate the performance of elite power hitters both before and after the ban. This is not a perfect approach, though: what if some other variable was introduced (a new “juicier” ball? rules restricting pitchers?) that impacted hitters’ performance? This means that I would also have to evaluate performance differences of groups of “non-sluggers” to detect any non-Shift-related performance changes.

I’m defining sluggers (the ones most impacted by the Shift) as hitters who have a Slugging Percentage (a common measure of the total bases a hitter earns from hits per at-bat) greater than the league’s average. My inclination is that the true sluggers are the ones who are at least one standard deviation above the mean (i.e., the top 16% of hitters).

Data Gathering

I used the Python library, pybaseball, to scrape some basic data. Pybaseball is useful in that it scrapes multiple baseball stats sites (including advanced pitch-based metrics). I only needed it to pull data on at-bats, hits, doubles, triples, home runs, and walks from 2006 to today’s date in 2024.

The data was pulled in two groups. One represented the “Shift” era from 2006 to 2022 and the other represented the “post-ban” era from 2023 to the current date. Data was evaluated by player and then normalized by the number of at-bats. Multiple minimum at-bat thresholds were used (400, 600, 800) to determine the impact of the Shift on players regardless of their usage on the team. My instinct was not to go much lower than 400 at-bats, on the theory that players with few at-bats were unlikely to have the Shift deployed against them (as its use appeared to be reputational). Both groups were separated into two types of players: 1) “Normal” players, whose Slugging Percentage numbers were close to the league mean, and 2) “Sluggers”, whose Slugging Percentage was a) in the top half of the league, b) one standard deviation above the mean (top 16%), c) two standard deviations above the mean (top 5%), or d) three standard deviations above the mean (top ~1%). The goal is to identify whether any of these slugger groups’ statistics (Hits, Doubles, Triples, Home Runs, Walks) were statistically different between the pre-ban group and the post-ban group.
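To make the grouping concrete, here is a minimal sketch of the slugger/non-slugger split in plain Python. It is an illustration under stated assumptions, not the code behind this post: the player rows are invented, and `split_sluggers`, `min_at_bats`, and `n_std` are hypothetical names. The cut-off is the mean Slugging Percentage of qualified hitters plus a chosen number of standard deviations. The share of hitters that lands above each cut-off depends on the real distribution, so the sketch simply reports who clears it.

```python
from statistics import mean, pstdev

# Hypothetical (player, at_bats, slugging_pct) rows; not real MLB data.
players = [
    ("A", 610, 0.380), ("B", 455, 0.410), ("C", 520, 0.395),
    ("D", 640, 0.430), ("E", 402, 0.405), ("F", 580, 0.560),
    ("G", 350, 0.620),  # excluded below: fewer than min_at_bats
]

def split_sluggers(rows, min_at_bats=400, n_std=1.0):
    """Split qualified hitters at mean(SLG) + n_std * std(SLG)."""
    qualified = [r for r in rows if r[1] >= min_at_bats]
    slg = [r[2] for r in qualified]
    cutoff = mean(slg) + n_std * pstdev(slg)
    sluggers = [name for name, _, s in qualified if s > cutoff]
    others = [name for name, _, s in qualified if s <= cutoff]
    return cutoff, sluggers, others

cutoff, sluggers, others = split_sluggers(players, min_at_bats=400, n_std=1.0)
print(f"cutoff={cutoff:.3f} sluggers={sluggers} others={others}")
```

Running the same function with different `min_at_bats` and `n_std` values reproduces the grid of comparisons described above, one split per combination.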

Results

The first thing I looked at to compare performance from the “Shift Era” to the “Post-Shift Era” was the mean value of a number of common metrics. I selected the ones that I felt were most likely to be impacted by the Shift: Hits, Doubles, Triples, Home Runs (the Shift doesn’t really impact Home Runs, but I was curious), and Walks. I normalized these metrics by each player’s number of at-bats to keep things consistent.
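The normalization step is simple enough to show directly. The sketch below uses invented season totals and hypothetical column keys ("AB", "H", "2B", "3B", "HR", "BB"), so treat it as an illustration of the arithmetic only: each metric is divided by the player’s at-bats, and the per-player rates are then averaged within an era.

```python
from statistics import mean

METRICS = ("H", "2B", "3B", "HR", "BB")

def per_at_bat(row):
    """Divide each counting stat by the player's at-bats."""
    ab = row["AB"]
    if ab <= 0:
        raise ValueError("at-bats must be positive")
    return {m: row[m] / ab for m in METRICS}

def era_means(rows):
    """Average the per-at-bat rates across all players in one era."""
    rates = [per_at_bat(r) for r in rows]
    return {m: mean(r[m] for r in rates) for m in METRICS}

# Hypothetical player totals for each era; not real MLB data.
shift_era = [
    {"AB": 500, "H": 150, "2B": 30, "3B": 2, "HR": 25, "BB": 60},
    {"AB": 400, "H": 100, "2B": 20, "3B": 4, "HR": 10, "BB": 40},
]
post_ban = [
    {"AB": 600, "H": 180, "2B": 36, "3B": 3, "HR": 36, "BB": 78},
    {"AB": 450, "H": 117, "2B": 27, "3B": 0, "HR": 18, "BB": 54},
]

before, after = era_means(shift_era), era_means(post_ban)
for m in METRICS:
    print(f"{m}: {before[m]:.4f} -> {after[m]:.4f}")
```

Averaging per-player rates (rather than pooling all at-bats) weights every qualified hitter equally, which is one reasonable reading of “normalized by at-bats for every player”.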

Top 16% of Sluggers Compared Before and After the Shift was Banned. Also non-Sluggers for Comparison

I did this for a range of Minimum At Bats and Numbers of Standard Deviations away from the mean to define who was a “slugger”. They all looked a bit like this. The first thing I notice is that the period after the Shift was banned sees better offensive performance (and more walks) across the board. Great! We have an answer! No? Of course it’s never that simple. First off, we need to remember these are just the mean values for these eras and the mean of a distribution is not always the best way to describe the whole distribution. Also, we need to understand if these differences are significant or could just be explained away by common variation.

The next step was to apply the two-sample Kolmogorov-Smirnov test. This test compares the underlying continuous distributions F(x) and G(x) of two independent samples (pre-ban and post-ban) to determine whether they come from the same distribution (our null hypothesis) or were drawn from different distributions. To wit: do the performance metrics before the Shift was banned have a fundamentally different distribution than the metrics after the ban? We will require a confidence level of 95% (the typically accepted number), meaning a p-value below 0.05, before we conclude that the distributions are different.
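For readers who want to see what the test actually computes, here is a minimal pure-Python sketch. It is not the implementation used for the charts in this post (SciPy’s `scipy.stats.ks_2samp` is the usual off-the-shelf choice). The statistic D is the largest gap between the two empirical cumulative distributions, and the p-value here comes from the standard asymptotic Kolmogorov series with a common small-sample correction, so it is only approximate for small samples. The sample values are invented.

```python
import math
from bisect import bisect_right

def ks_statistic(a, b):
    """Largest vertical gap between the two empirical CDFs."""
    a, b = sorted(a), sorted(b)
    d = 0.0
    for x in a + b:
        fa = bisect_right(a, x) / len(a)  # share of a that is <= x
        fb = bisect_right(b, x) / len(b)  # share of b that is <= x
        d = max(d, abs(fa - fb))
    return d

def ks_pvalue(d, n, m, terms=100):
    """Approximate two-sided p-value from the asymptotic Kolmogorov series."""
    en = math.sqrt(n * m / (n + m))
    lam = (en + 0.12 + 0.11 / en) * d
    if lam < 0.2:
        return 1.0  # series converges poorly here; the p-value is ~1
    total = sum((-1) ** (j - 1) * math.exp(-2.0 * j * j * lam * lam)
                for j in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * total))

# Hypothetical per-at-bat rates for two eras; not real MLB data.
pre_ban = [0.21, 0.22, 0.23, 0.24, 0.25]
post_ban = [0.26, 0.27, 0.28, 0.29, 0.30]

d = ks_statistic(pre_ban, post_ban)
p = ks_pvalue(d, len(pre_ban), len(post_ban))
print(f"D={d:.2f} p={p:.4f} different at 95%: {p < 0.05}")
```

In this toy case the two samples do not overlap at all, so D reaches its maximum of 1 and the p-value falls well below 0.05.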

p-values comparing Sluggers before and after the ban. The red line is our significance threshold (p = 0.05). Any bar below the red line indicates that the metric is statistically different before and after the ban.

Above you can see the p-values for just Sluggers (top 50% of sluggers on the left, top 16% in the middle, and top 5% on the right) before and after the ban. We already know that the offensive metrics tend to be higher after the ban; this just tells us whether that difference is significant and whether it extends to the whole distribution.

p-values comparing non-sluggers before and after the ban. The red line is our significance threshold (p = 0.05). Any bar below the red line indicates that the metric is statistically different before and after the ban.

Analysis of the Results

There are obviously more charts, but these tell the story well enough. In the top chart (comparing Sluggers’ performance), we see that for the top 16% of sluggers, their performance on every metric other than doubles meets our requirements to claim that the differences are statistically significant. However, it’s hit or miss (pun!) for the other two clusters. Hmm.

Then looking at the non-slugger comparisons (we are comparing the hitters in the lower 50%, 84%, and 95%), we see that there are fundamental differences almost in all categories (most of the bars are below our red line, indicating that the performance changes in these metrics are significant), clearly more than with the sluggers! This indicates to me that something OTHER than the Shift has been responsible for affecting offensive performance across baseball. The Shift was rarely or never applied to any players other than pull-hitting sluggers, so it couldn’t be responsible for the performance changes we see in this bottom graph.

Conclusions

It seems pretty straightforward. Offensive performance has changed across the board between 2006-2022 and 2023 onward. That is a large number of years, and lots of rule changes could have happened.
However, the changes in performance have been consistent across all hitters in MLB, not just the sluggers.
In fact, the Sluggers seem to have had a less significant increase in performance than the non-Sluggers.
All of this leads me to conclude that the performance changes came from factors other than the banning of the Shift, and that my initial hypothesis, that banning the Shift would have no impact, is supported.

To see others of my recent sports analytic posts:

June 20, 2024

Baseball Scoreboard, Part 2

In the previous discussion (Part 1) on the measures seen at a baseball park, I covered the pitching metrics seen here fairly heavily. Hitting metrics may be reasonably well known in many places, but there is at least one on the scoreboard that may require some explanation.

The Triple Slash Line

A review of the “familiar” hitting statistics would start with what is sometimes known as the “triple slash line”. This is simply three statistics that are frequently shown in order, separated by slashes, like this: AVG/OBP/SLG. These are, in order, Batting Average, On-Base Percentage, and Slugging Percentage. Batting Average is the fraction of at-bats ending in a hit. An at-bat is defined as a plate appearance that ends in an out (excluding sacrifice flies), a hit, a fielder’s choice, or an error. For years, batting average was the preferred statistic for comparing player performance, but in recent years the other metrics in the triple slash line have increased in prominence due to their impact on scores (and thus, wins). On-Base Percentage is more simply defined… it is the fraction of plate appearances in which a batter reaches base safely (by a hit, a walk, or being hit by a pitch), excluding reaching by error, fielder’s choice, or a dropped third strike. This metric goes back to the Hall of Fame executive of the Brooklyn Dodgers, Branch Rickey, who is still beloved for his innovations in baseball (including signing the great Jackie Robinson and breaking baseball’s color barrier). One of the breakthroughs of the Oakland General Manager, Billy Beane, that became famous in the movie “Moneyball” was a stronger reliance on OBP when signing free agents. Finally, Slugging Percentage is a metric designed to give weight to a batter’s power. The formula is (#Singles + 2*#Doubles + 3*#Triples + 4*#Home Runs)/At-Bats. This makes slugging percentage useful, but not necessarily perfectly correlated with runs and therefore wins.
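As a worked example of the three definitions, here is a short Python sketch that builds a triple slash line from counting stats. The stat line is invented and the function names are hypothetical. It uses the conventional formulas: AVG is hits over at-bats, OBP is (hits + walks + hit-by-pitch) over (at-bats + walks + hit-by-pitch + sacrifice flies), and SLG is total bases over at-bats.

```python
def triple_slash(ab, h, doubles, triples, hr, bb, hbp, sf):
    """Return (AVG, OBP, SLG) from season counting stats."""
    singles = h - doubles - triples - hr
    total_bases = singles + 2 * doubles + 3 * triples + 4 * hr
    avg = h / ab
    obp = (h + bb + hbp) / (ab + bb + hbp + sf)
    slg = total_bases / ab
    return avg, obp, slg

def slash_line(*stats):
    """Format the three rates the way scoreboards show them, e.g. .300/.377/.500."""
    return "/".join(f"{x:.3f}".lstrip("0") for x in triple_slash(*stats))

# Hypothetical season: 500 AB, 150 H (30 2B, 5 3B, 20 HR), 60 BB, 5 HBP, 5 SF.
print(slash_line(500, 150, 30, 5, 20, 60, 5, 5))  # .300/.377/.500
```

Note how the walks lift OBP well above AVG while the extra-base hits lift SLG, which is exactly the extra information the slash line carries beyond batting average.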

As an example of how the triple slash line can aid in evaluating player value, consider these two players (2024 stats as of 6/20/2024).

Aaron Judge (Center Field, NY Yankees): .303/.429/.697

William Contreras (Catcher, Milwaukee): .304/.364/.461

Though these two players (both having very nice seasons) have almost identical batting averages, that doesn’t tell the full story. Aaron Judge has batted in 66 runs this season whereas Contreras has only batted in 48 (on very similar numbers of games played). Judge has 27 home runs to Contreras’ 9. Amazingly, Judge has been walked 30 more times (57 to 27) than Contreras. Obviously this means that in walks alone, Judge has had 30 more scoring opportunities than Contreras. This has translated to Judge scoring 5 more runs this season. But this is at the cost of 20 more strikeouts for Judge. Lots to think about! First, let’s discuss the impact of RBI and HR on wins.

The RBI, short for Runs Batted In, has always been seen as a fairly critical metric in baseball, as it recognizes a hitter’s role in a run being scored for their team. It can result from a hit, a sacrifice fly, or even a walk, but not an error. In a sense it is a really valuable metric because it shows impact on the most important measure, the runs a team scores in a game. In another sense, one may overreach when comparing players by their RBI accomplishments, because a player who is preceded in the batting order by a player with a stratospheric on-base percentage has a much better chance of driving in a run with a hit. So RBI isn’t comparing apples to apples. There is a big controversy over the RBI metric amongst baseball nerds due to this. If you want to go deep down this rabbit hole, here is a good article from Bleacher Report back in 2012.

It appears that more home runs lead to a greater number of wins for one’s team. The home run (especially one that ends the game!) is exciting and draws fans more than anything else. The modern era of baseball is often referred to as the “Long Ball Era” due to the prevalence of home runs in the game. Regression, a method for estimating how strongly two quantities move together, shows that home runs tend to be highly correlated with win percentages. Conceding that home runs are correlated with wins, the next question would be whether home runs CAUSE wins. These are two very different things. Ice cream sales are highly correlated with higher temperatures, but the correlation alone does not prove that the temperatures cause the sales. The answer to the question about home runs causing wins is a hard one, and there are plenty of scientific papers analyzing this (and doctoral dissertations!). What seems obvious is that teams value the home run highly, even in the face of the higher numbers of strikeouts that power hitters tend to rack up. One thing that we know, though, is that teams express value through the salary they give a player. In this respect, Aaron Judge stands out with his $40M annual salary compared to the $760K that the Brewers are paying Mr. Contreras! (I think he’ll be getting a raise after this season!) Here’s where I found these salaries…

The Mystery Metric, OPS

All of this builds up to the final, less familiar metric on the scoreboard, OPS. This stands for “On-base Plus Slugging” and is a combination of two metrics from the triple slash, OBP and SLG. They’re simply added together. I suppose this metric saves fans the time (or the mathematical embarrassment) of adding the last two numbers in the triple slash together. The intent of OPS is to provide a view into the overall effectiveness of a hitter and their potential value for scoring runs. The career record for OPS was rung up by Babe Ruth (1.16), followed closely by Ted Williams and Lou Gehrig. So clearly it is a measure of the historical greatness of a player. By the way, keep an eye on Aaron Judge’s OPS in 2024 (currently at 1.126), as his season mark is closing in on the Babe’s career figure!
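Since OPS is just the last two numbers of the triple slash added together, the two slash lines quoted earlier in this post can be checked in a couple of lines of Python. The helper name `ops_from_slash` is my own for illustration.

```python
def ops_from_slash(line):
    """Add OBP and SLG from an 'AVG/OBP/SLG' string."""
    avg, obp, slg = (float(part) for part in line.split("/"))
    return obp + slg

# Slash lines as quoted above (2024 stats as of 6/20/2024).
for name, line in [("Judge", ".303/.429/.697"), ("Contreras", ".304/.364/.461")]:
    print(f"{name}: OPS {ops_from_slash(line):.3f}")
```

Two hitters with nearly identical batting averages end up about 300 points of OPS apart, which is the gap the scoreboard metric is meant to surface.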

June 18, 2024 / June 20, 2024

A Quick Tour of a Baseball Park Scoreboard

In the modern era of more and more esoteric baseball metrics, how can one understand what the ballpark is telling us?

This weekend I went to the Diamondbacks game at Chase Field, a treat I have enjoyed for a number of years. As a person who likes numbers, it struck me that the stadium was even more awash in statistics than ever. It brought a lot of questions to mind, some of which I’ll explore in this blog entry.

The Scoreboard, Explained

Much of this scoreboard layout looks fairly familiar to someone who may have looked at box scores or attended other big-time baseball games. The score by inning is something that has been featured for years. It tells us something interesting: the rate at which the two teams have been adding to their score. Knowing how pitching assignments work in major league baseball, one can quickly surmise that the White Sox starter got shelled early and seems to have stabilized a bit by the fourth inning. The Diamondbacks’ starter, however, seems to have pitched a fairly solid first four innings, because we can see that he has given up only three hits (less than one per inning). The White Sox scored one run off of him in the third inning, but we can also see that Arizona has one error. Did this error result in the one run? If so, that would be an unearned run and therefore wouldn’t be counted against the Diamondbacks pitcher’s Earned Run Average. We can see more about the White Sox pitcher, Drew Thorpe, because the scoreboard gives more info about active pitchers in the upper right (the D-Backs were batting when this image was taken). Drew hasn’t had such a good game to this point… in 3 innings he has given up 4 earned runs… that translates to an ERA of 12.0 at the moment (ERA is earned runs per nine innings, so 4 runs in 3 innings scales to 12). He has also given up five walks (BB) and six hits, which results in a WHIP (Walks plus Hits per Inning Pitched) of 3.67. Additionally, his ratio of strikes to total pitches (strikes plus balls) is 0.52, which is 0.1 lower than the MLB average. Top pitchers typically have numbers like .65, so clearly Drew is way behind the pace of the best pitchers here. All of these measures (WHIP, ERA, %strikes) are very bad for Mr. Thorpe’s season averages and we can get all of this from the scoreboard.
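The two pitching calculations above are easy to check. The sketch below uses the numbers read off the scoreboard (3 innings, 4 earned runs, 5 walks, 6 hits). The function names are my own, and innings are assumed to be passed as a plain decimal number rather than in baseball's "3.1 / 3.2" notation for thirds of an inning.

```python
def era(earned_runs: int, innings: float) -> float:
    """Earned Run Average: earned runs scaled to a nine-inning game."""
    return 9 * earned_runs / innings


def whip(walks: int, hits: int, innings: float) -> float:
    """Walks plus Hits per Inning Pitched."""
    return (walks + hits) / innings


# Drew Thorpe's line as shown on the scoreboard in this game.
innings, earned_runs, walks, hits = 3, 4, 5, 6
print(f"ERA:  {era(earned_runs, innings):.2f}")   # 9 * 4 / 3
print(f"WHIP: {whip(walks, hits, innings):.2f}")  # (5 + 6) / 3
```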

The metric FPS% refers to First Pitch Strike Percentage. The Major League Baseball average is 57% and we can see that Thorpe is sitting at 47%. This is a pretty interesting metric. Weinstein Baseball (here) tells us that “if a big league pitching staff improved their first pitch strike percentage from 57% to 80%, it would translate into 100 fewer runs allowed over the course of a season. That translates into 10 more big league wins.” So what the scoreboard is showing us here is that Drew Thorpe has a control issue today… He’s giving up a lot of walks (per Weinstein, “70% of walks start with first pitch balls”) and possibly in trying to get the ball over the plate “whatever it takes” he may also be giving up some easier pitches to hit.

One other metric regarding pitching that we can take away from the scoreboard here is “MVR”. This is placed just to the right of the Error (E) column. I actually had to Google this one during the game. It’s kind of new and stands for “Mound Visits Remaining”. So Mr. Thorpe has already had more than one mound visit during his first 3 innings and now only has two left. This is probably part of baseball’s desire to speed up the games and make them less tedious. The pitch clock is another similar effort, where only thirty seconds are allowed between batters. ESPN tells us (here) that the pitch clock has reduced baseball games to an average of 2 hours and 40 minutes (24 minutes shorter). This has also corresponded with a spike in batting average and stolen bases. It seems obvious that penalizing a pitcher by restricting their time between pitches is likely to reward hitters and base runners.

Pitch and Hit Exit Metrics

Another thing that I found very interesting is a display I had never seen before at the ballpark. See below.

I found that this was very distracting, because my brain wanted to identify the patterns of how they were classifying the Pitch Type. There were a number of different labels for pitch type, among them “four seam fastball”, “cut fastball”, “slider”, “changeup”, “sinker”, “sweeper”, and “curve”. The “Vertical Break” and “Horizontal Break” numbers were very interesting. These data are captured by tracking systems such as TrackMan (radar-based) and Hawk-Eye (camera-based), which are used across many different sports. There’s a great article in Baseball America (here) on how these pitch classifiers are able to label the pitch type. What I found is that the pitch types seem to be calibrated to each pitcher’s speed… a pitcher who threw a 100 mph four seam fastball also seemed to have their 95 mph pitches that didn’t “rise” as much classified as sinkers, whereas other, slower pitchers may have had sinkers in the 80 mph range. Pretty interesting.

I also found myself looking at the launch angle after ball contact. A launch angle over 40 degrees often indicated that a hit that looked to the eye like a home run might actually just go to the warning track. Baseball Savant has a nice tool (here) where you can pick an exit velocity and a launch angle and see the actual outcome. For instance, below, a 103 MPH exit velocity coupled with a 30 degree launch angle was a Home Run 74% of the time!

Conclusion

Baseball parks have become inundated with information visualizations over the last few years. In some cases, advanced sensing and tracking systems like Hawkeye have enabled these new metrics to be collected. In others, new rules like the pitch clock and maximum numbers of mound visits have created demand for new metrics. But overall, baseball has always been a sport focused on its numbers, which is just one reason why many of us number people love it so much!

Link to Part 2 – Hitting Metrics

June 11, 2024 / June 13, 2024

Update: xG and Luck Update for Premier League

In the previous entry, I compared the expected goals / Luck metrics between the last two completed MLS seasons. Now that the Premier League season has come to a close, we can do the same thing to see if any new patterns jump out at us.

A quick overview of what we see above would go something like this… Manchester City finished at the top of the league in points, followed by Arsenal. The teams are sorted by final point tally from left to right on the chart. The three teams to get relegated are on the far right (Luton Town, Burnley, and Sheffield United). Things I see:

Man City and Man U always have the highest salaries. Lately Man U has been inconsistent in play and has been finishing out of the top four. Their expected goals for/against ratio is a bit lower than their direct neighbors in points (Chelsea and Newcastle). We don’t know exactly why, but it reflects their overall efficiency at taking and preventing good shots. For some reason, Chelsea and Newcastle were a bit more efficient at ensuring that they got more good shots than their competitors in matches. Interestingly, we see West Ham sitting at about 1/2 Man U’s salary but with nearly the same xG ratio. But West Ham finished 8 points lower than Man U. Perhaps there are ways that expensive players help other than in the xG ratio. I might imagine that expensive, ostensibly better players may be slightly more likely to score when taking a good shot or to prevent an opponent’s good shot from going into the net.
Salary in the Premier League seems to always be more important than in MLS. The top salaries are always in the top 1/2 of the league in points in the Premier League, but this is not as strongly observed in MLS.
Luck. Sometimes there’s an interesting disparity between luck at home and luck on the road (the yellow and green lines respectively). Near the top of the rankings, we see that Liverpool’s home luck is below zero (meaning that they score fewer goals on average than their expected goals would predict) but their away luck is above zero. I did a quick Google search on Liverpool and “home luck” and found this. So others have noticed this, but I don’t see that they have observed that most of Liverpool’s luck has been in away venues. On the bad side of the rankings, though, we see large differences open up between home and away luck. All three relegated teams really struggled on the road to score up to their expected goals (i.e., good shots weren’t going in). Clearly this is an important measurement for identifying that your team is in big trouble. Conversely, though, if your team finishes positive on both home and away luck, it seems that this can offset a big salary differential (see Arsenal, Aston Villa, Tottenham, and Newcastle, all in the top half of the league in points).

2022-2023 Results for Comparison

Stuff to discuss:

Chelsea was a strange outlier during this season regarding salary and final point tally. Their xG ratio is below 1 and both their home and away luck are negative. Efficiency seems to have been an issue. Compare to Fulham who finished 8 points ahead of them with somewhere around 1/4 of the salary. Fulham had lower xG than Chelsea this season but had great luck both home and away. Note that Chelsea finished higher in 2023-24 and Fulham finished much lower as their luck regressed back to the mean.
There’s also a big delta between Nottingham Forest’s home and away luck during this season. They finished just ahead of relegation, but maybe they did so just by the skin of their teeth due to their abysmal away luck (lowest in the league). Note that in 2023-24, Nottingham again skated just ahead of relegation, but both their home and away luck were just below zero. Speaks perhaps to inconsistency in scoring off of good chances, probably a predictor of a future relegation.
We again see a number of teams in the top half where positive away and home luck offsets a salary gap. Note Arsenal, Aston Villa, Brentford, and Fulham all in the top half.

LINKS to Other Soccer Analytics Entries

June 7, 2024 / June 11, 2024

Update: MLS Latest xG ratio and Luck stats

I haven’t updated the charts in previous entries for the end of the 2023 MLS Season and the 2023-24 Premier League season. I re-ran my stats and here goes, MLS first.

2023 Final MLS season Stats – xG and Luck

Now to make a better comparison, here are the results for the end of the 2022 season.

2022 Final MLS season Stats – xG and Luck

I like these metrics (xG ratio and Home/Away Luck) because they paint a pretty good picture of an awful lot that happens in a soccer match. As a reminder, xG stands for “expected goals” and the ratio is xG for the team being measured divided by xG of their opponent during the match. Expected Goals are calculated statistically based off of where a shot on goal is taken. Closer and more centered shots have a much higher likelihood of scoring, and therefore count as 0.5 or higher expected goals. The ratio, therefore, gives a pretty good idea of whether a team was getting in position to take good shots and whether they were limiting their opponents to less good shots. Luck is the comparison of the number of goals scored to the xG. If Austin FC scores 3 goals, therefore, but their xG is only 1.8, then they have a luck of positive 1.2. As you can see, it is quite possible to have a luck of less than zero too (meaning you were just unlucky. You were in position and took good shots, they just didn’t go in). The trend does seem stronger in 2022, though, than it does 2023.
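Both metrics are simple enough to write down in a few lines. This is only a sketch of the definitions given above; the function names are mine, and the opponent's xG in the example (1.2) is a made-up value added so the ratio can be shown alongside the Austin FC luck example.

```python
def xg_ratio(xg_for: float, xg_against: float) -> float:
    """Team's expected goals divided by the opponent's expected goals."""
    return xg_for / xg_against


def luck(goals_scored: int, xg: float) -> float:
    """Actual goals minus expected goals; negative means unlucky."""
    return goals_scored - xg


# Austin FC example from the text: 3 goals on an xG of 1.8.
print(f"Luck: {luck(3, 1.8):+.1f}")

# Hypothetical opponent xG of 1.2, for illustration only.
print(f"xG ratio: {xg_ratio(1.8, 1.2):.2f}")
```

A ratio above 1 means the team was creating better chances than it allowed, and a luck value below zero means good shots simply weren't going in.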

Interesting Trends

I don’t see any Luck trends from 2022 to 2023. This is unsurprising due to the statistical nature of luck, but one always hopes to find a pattern where some team is “making” their own luck.
Since the teams are sorted by number of points (highest on the left), it is not surprising that the xG ratio trends pretty decently with the season points. We would expect teams that win more and therefore get more points for the season to also have better shots overall than they allow to their opponents. In 2023 we do have a few notable outliers (NYRB and Seattle) who had a really strong xG ratio but finished lower in points.
MLS also shows an interesting trend where teams with high salaries (the blue bar) don’t always finish in the upper 1/4 of the league. In 2023, there are quite a lot of high salary teams in the lower 1/4, actually. This is very unlike what we’ll see in the next entry where we review the English Premier league results. Hard to put one’s finger on this completely, unless it has something to do with older European stars coming to MLS at the ends of their careers?

LINKS to Other Soccer Analytics Entries

October 9, 2023 / June 11, 2024

Update: Soccer Analytics in Practice at the Youth Club Level

I took a bit of a break from this series to go off and capture data. My goal was to see if an xG and Luck-based approach to measurement would be useful at the youth club level. Here’s a quick report on my approach (it is reusable) and the results so far.

Approach

I built a data input sheet that can be taken to soccer games and used without great knowledge of statistics and soccer.
The data input sheet has instructions on the bottom right corner. It’s as simple as putting an ‘x’ on the page where you estimate a shot was taken by your team and an ‘o’ on the page where you estimate a shot was taken by the opposition. If the shot is “on goal” (I make this simple by saying it is on goal if a) it’s a score, b) the goalie touches it, or c) it hits the goal frame) I put a check-mark next to the x or o. If the shot results in a goal, I put a circle around the ‘x’ or the ‘o’. It’s about that easy. Sometimes I’ll put notes near the marks. I like to identify if the shot was the result of a penalty and is a free kick (‘fk’). I also put the scorer’s name near goal marks.
After the game, I add up shots on goal for each team and then multiply each shot on goal by the probability of goal in the region it was taken. You can see the legend shows a range of probabilities. I actually use these probabilities starting with the lowest (green)… [0.05, 0.1, 0.15, 0.3, 0.5, 0.7, 0.8]. My approach is NOT to include penalty kicks in this process because in my opinion, PK’s don’t really speak to what I’m trying to measure, which is expected goals and luck. You might say it demonstrates luck to even get a PK, and I’d agree, but that’s a different kind of luck in my opinion.
The total sum of the shots on goal times their probability of goal number equals the team’s expected goals (xG). Luck is calculated by the actual score minus the xG number. See below for an example scored game.
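The scoring steps above can be sketched as follows. The probability list is the one given in the text, ordered from the lowest (green) region upward. The idea of recording each shot on goal as a region index from 0 to 6, the function names, and the sample game are my own assumptions for illustration and are not taken from a real score sheet.

```python
# Probability of a goal for each region of the shot chart, lowest (green) first.
ZONE_PROBABILITIES = [0.05, 0.1, 0.15, 0.3, 0.5, 0.7, 0.8]


def expected_goals(shot_zones: list[int]) -> float:
    """Sum the goal probability of every shot on goal.

    shot_zones holds one region index (0-6) per shot on goal.
    Penalty kicks are simply left out of the list, as described above.
    """
    return sum(ZONE_PROBABILITIES[zone] for zone in shot_zones)


def luck(actual_goals: int, xg: float) -> float:
    """Actual score minus expected goals."""
    return actual_goals - xg


# Hypothetical game: five shots on goal from regions 0, 2, 3, 4 and 5, two goals scored.
shots_on_goal = [0, 2, 3, 4, 5]
xg = expected_goals(shots_on_goal)
print(f"xG:   {xg:.2f}")
print(f"Luck: {luck(2, xg):+.2f}")
```

The same two functions can be run once for the ‘x’ marks and once for the ‘o’ marks to score both teams.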

Example xG score tracker. My team won this game 2-0.

Season Results so Far

I’ve been able to easily collect these metrics so far this season. I believe it is easy enough to delegate to a student team manager (I call them my ‘statistician’) in the Fall season for the school team that I coach. Below are the results for our 2023 club season so far.

Area chart of Results. Wins are to the Left, Losses to the Right.

Analysis

Here are a few things that are probably obvious.

xG appears to be a strong predictor of a win. Note how the higher (light purple) xG for FC Tucson tends to be stronger in wins (left side of the plot) and lower on the right side (ties and losses).
They say you make your own luck, but perhaps sometimes it’s just outside of your control (note my previous analysis of luck due to venues and officiating). Maybe just knowing that the luck might be tilted against your team is positive.
Sometimes you make your own bad luck too… In the game on the plot where the FC Tucson team showed the most bad luck (Slammers FC), our team totally dominated the game in all aspects: shots on goal, possession, and xG. But some of our bad luck was due to the fact that the Slammers’ best players were defenders and our shots were taken from further away.
This is a big takeaway… THE SHOT CHARTS ARE REALLY VALUABLE! Though I don’t actually coach this club team I have already been able to sit down with players and parents at their request and describe the flow of the game along with areas where our shot choices were driven by our inattention or even to defensive schemes of the opposition.

LINKS to Other Soccer Analytics Entries

July 24, 2023 / June 11, 2024

Even More Analysis of Soccer Outcomes using the Luck Metric!

Referee Images - Free Download on Freepik — referee image from https://www.freepik.com/free-photos-vectors/referee

In the previous entry in this series (see link here) we studied if it was possible that the different styles of playing surface might actually be correlated with increased or decreased luck. In this series, we define luck as the number of actual goals in a game that either exceed or fall short of the expected goals metric (which relies on statistical measures of the likelihood of goals given certain measurable activities in a game). We look at luck for both home teams and away teams, because we all intrinsically understand that home teams tend to have better luck when playing at home. See the previous entry to read about what we found.

Today, we’ll do one more evaluation (probably not the last, but definitely one that piques my interest) to determine if an individual head referee’s presence in a game is correlated with greater or lesser luck. The reason we look at head referees only is that I have a data source which lists who the head referee is for a very large set of games. In theory, the head referee controls the flow of the game and contributes the most to uncertainty of the outcome. We’ll look at officiating for both the MLS and the Premier League to see 1) if certain refs affect the luck metric more often and 2) if the impact to luck is significant or not.

MLS 2023

The methodology here in general is to evaluate how both “home luck” and “away luck” can be grouped across the individual head referees. Then we take the mean value of luck and also the standard deviation (how much the luck values tended to vary from game to game that the individual officiated). These are plotted in a similar way to how we plotted the field surface plots. Keep in mind that we’re attempting to describe the entire distribution of games that an individual official was part of. Since we make the assumption that this distribution follows a Gaussian distribution (bell curve), we believe we can describe the impact across all the games with just the mean luck value and the standard deviation. Below we can see the results, where the square describes the mean and the lines describe the variation.
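As a rough sketch of that grouping step (not the actual code behind the charts), the per-referee mean and standard deviation can be computed with Python's standard library. The input format, the function name, and the sample values are assumptions for illustration. I use the sample standard deviation here, which is also a guess about what the plots use.

```python
from collections import defaultdict
from statistics import mean, stdev


def luck_by_referee(games):
    """Group per-game luck values by head referee.

    games: iterable of (referee_name, luck) pairs, one per game.
    Returns {referee_name: (mean_luck, std_luck)}; std_luck is None when the
    referee has only one game, since there is no spread to measure.
    """
    grouped = defaultdict(list)
    for referee, luck in games:
        grouped[referee].append(luck)

    summary = {}
    for referee, values in grouped.items():
        spread = stdev(values) if len(values) > 1 else None
        summary[referee] = (mean(values), spread)
    return summary


# Hypothetical home-luck values, for illustration only.
games = [("Ref A", 0.5), ("Ref A", 1.5), ("Ref B", -0.2)]
summary = luck_by_referee(games)
for referee, (avg, spread) in summary.items():
    print(referee, avg, spread)
```

A referee with a single game ends up with no spread value, which corresponds to a plotted point with no error bar.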

Home Team “Luck” distribution by Head Referee 2023

Away Team “Luck” distribution by Head Referee 2023

Analysis: I’ll analyze the 2023 MLS results and then leave the analysis of the 2022 MLS games and the Premier League games to the reader. What do I see?

Remember we’re evaluating how each official impacts home and away luck (remember, luck describes actual goals in excess of the number of goals we statistically predict using the expected goals metric). We see very different mean values of luck for the individual referees, but it is hard to say that the extra “lucky” goals are caused by the participation of the referee. It takes much more work than this kind of statistical analysis to determine causality. It could be that the referee’s impact is actually correlated with some other event that is the real cause of the luckiness or unluckiness experienced. That’s just statistician speech to make sure we don’t all grab pitchforks and torches!
We do see that certain referees are more likely to be associated with higher or lower luck. Some of the referees’ results are close to the average (mean) of the entire distribution of referees. These are mostly the ones in the middle. This means that statistically, the luck experienced during the games is about the same when these “middle” refs officiate.
However, there are officials off on the edge who do seem to have a statistically higher impact on the luck experienced in the games. The two in red (on the home luck chart) are actually two standard deviations away from the mean value of luck across all the refs. Under the bell-curve assumption, this means that their luck outcomes fall outside the range that covers about 95% of the referees. This tends to suggest that the fact that these refs are way out on the edges is unlikely to be due to chance alone, and may be due to something the refs are doing differently.
Note that there are 4 or 5 refs in the Home Luck chart who have a mean impact in their games of close to one goal! You can see that the error bars for these refs vary, but at least one seems to almost always have a one goal impact on a game. One could say that these officials are more likely to give penalty kicks in the box (a very high probability of a score). That might be a good guess, because the expected goals metric that I use actually excludes penalty shots (because they’re random — and therefore “lucky” — events that cannot be predicted). But maybe this metric shows that with certain referees, penalty shots are more probable and are therefore less random.
Another interesting thing to see is that the luck impact across all the officials is much lower for the away teams. This means that away teams are less likely to be impacted by the presence of individual officials. This is probably reasonable to assert, given the notion that officials in every sport probably have an unconscious bias for the home team (whose fans are screaming about any calls that go against their teams).
We do see one official whose presence is well-correlated with “good luck” for the home team and “bad luck” for the away team. When I dug in to try to understand why this one official stands out, I discovered that they are very rarely the head official (often getting assigned to do video replays). I also noticed that officials whose results fall outside roughly two-thirds of the other officials (about one standard deviation from the mean) likewise rarely get to be head refs. Perhaps the MLS is paying attention to this (see this webpage for details on this).
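The "two standard deviations from the mean" check in the list above could be sketched like this. It is a hedged illustration only: the function name, the threshold parameter, and the ten made-up referee means are assumptions, and I use the population standard deviation across referees, which may not match how the red markers on the charts were chosen.

```python
from statistics import mean, pstdev


def flag_outlier_referees(mean_luck_by_ref, threshold=2.0):
    """Return {referee: z_score} for referees whose mean luck is at least
    `threshold` standard deviations from the average across all referees."""
    values = list(mean_luck_by_ref.values())
    center = mean(values)
    spread = pstdev(values)
    if spread == 0:
        return {}  # every referee has the same mean luck, so nobody stands out
    flagged = {}
    for referee, value in mean_luck_by_ref.items():
        z_score = (value - center) / spread
        if abs(z_score) >= threshold:
            flagged[referee] = z_score
    return flagged


# Hypothetical mean home-luck values: nine referees at 0.0 and one at +1.0.
mean_luck = {f"Ref {letter}": 0.0 for letter in "ABCDEFGHI"}
mean_luck["Ref J"] = 1.0
flagged = flag_outlier_referees(mean_luck)
print(flagged)
```

Passing `threshold=1.0` instead gives the looser "outside about two-thirds of the officials" cut mentioned in the last bullet. As noted earlier, being flagged says nothing by itself about what is causing the difference.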

Other Charts

MLS 2022

Note the one official that has no error bar? This is most likely because he was the head official only one time in 2022. It’s small data, but observe that it follows the exact opposite trend that we see this official following in 2023! Weird. We also see more dramatic shifts in mean values for the outlier refs in 2022 than we see in 2023.

Premier League 2023

Premier League 2022

Wrapup

So what do you see in the MLS 2022 and the Premier League charts? There are definitely some interesting trends and differences. Feel free to leave comments on what you see and we can dialogue about them!

LINKS to Other Soccer Analytics Entries