Baseball, Bonds, Math and the Power of Language

First off, don’t worry. I didn’t title this post in an attempt to capture traffic from some obscure long-tail Google search term. What I want to do is share my views as to how math relates to the financial markets and what separates good math from bad. The ideas I want to share first came to my attention via, and were I think best expressed by, a baseball writer. So we’re headed on a little detour into the math of baseball.

Baseball is certainly the most mathematically analyzed of all sports, although modern technology and fan interest have driven other sports to catch up somewhat. The gap is likely explained both by baseball’s popularity and by the fact that it lends itself to analysis, since the game consists of a series of discrete events (pitches, at-bats, innings) with well-defined outcomes. It is easier in baseball than in any other sport to precisely record the outcome of each of these events, and this data collection process has been going on in more detail for longer than in any other sport. The box score was invented by Henry Chadwick in the 1870s and both fans and official scorers have been collecting detailed data ever since. Faced with this excess of data about discrete events, it’s only rational to try to combine it and draw conclusions about player and team performance on the average. The most well known of these averages is the batting average, which is defined as hits divided by at-bats. Walks, hit-by-pitches, at-bats interrupted by rain, etc. are counted as neither hits nor at-bats.

What’s remarkable about the batting average, given its long tenure and popularity, is just how useless it actually is. If you take batting averages, either individually or as a team, and try to correlate them to runs scored and through runs to wins, you will find the correlations are surprisingly weak. Numerous examples exist of teams with high batting averages and poor scoring as well as high-scoring teams with comparatively poor batting averages. This observation was first made in a systematic way by Bill James in his excellent Baseball Abstracts. Better yet, Bill James was able to clearly explain why batting average was so uninformative: it misunderstood the nature of the walk and by extension the way an inning progresses. The inviolable law of baseball is 3 outs per inning – outs are a scarce resource, 27 per game. In practice this means that any at-bat which generates an out, regardless of what else happened, is a net negative for the hitting team. On the flip side, an at-bat which does not generate an out (thus implicitly also putting the batter on base) is a success not just because a man is on base but because the end of the inning has been postponed by one at-bat, and the scarce resource preserved. With that in mind it’s clear why the batting average mishandles walks – a walk is a successful at-bat for the hitting team in that a man was put on base and no outs were recorded. But the batting average ignores walks, counting them as neither hits nor at-bats. Thus one of the primary ways a batter can succeed is stricken from the statistics, and the resulting number borders on useless. If instead you replace batting average with on base percentage (the chance an at-bat did not cause an out) suddenly you get much better correlation to runs scored and wins.
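To make the difference concrete, here is a minimal Python sketch comparing the two statistics. The two stat lines are invented for illustration, and the on-base formula is the simplified one described above (times on base divided by plate appearances). The official formula also counts sacrifice flies in the denominator, which this sketch leaves out.

```python
def batting_average(hits, at_bats):
    """Hits divided by at-bats; walks and hit-by-pitches are ignored."""
    return hits / at_bats


def on_base_percentage(hits, walks, hit_by_pitch, at_bats):
    """Simplified OBP: share of plate appearances that did not end in an out.

    Assumption: sacrifice flies are omitted to keep the sketch short.
    """
    times_on_base = hits + walks + hit_by_pitch
    plate_appearances = at_bats + walks + hit_by_pitch
    return times_on_base / plate_appearances


# Hypothetical season lines, not real players.
slap_hitter = {"hits": 150, "walks": 20, "hit_by_pitch": 0, "at_bats": 500}
patient_hitter = {"hits": 135, "walks": 100, "hit_by_pitch": 5, "at_bats": 500}

for name, line in (("slap hitter", slap_hitter), ("patient hitter", patient_hitter)):
    avg = batting_average(line["hits"], line["at_bats"])
    obp = on_base_percentage(**line)
    print(f"{name}: AVG {avg:.3f}, OBP {obp:.3f}")
```

The slap hitter wins on batting average, but the patient hitter makes far fewer outs per trip to the plate, which is the thing the batting average cannot see.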

This topic recently saw the public spotlight due to the movie Moneyball, which is an adaptation of Michael Lewis’ book of the same name. The movie is about as good as a film advocating better living through statistics and starring Brad Pitt can possibly be – which is actually pretty darn good. But the book is transcendent – likely the best book ever written about baseball, and a dark horse candidate for the best book ever written about operating in the face of uncertainty, which is a topic that should be near and dear to the heart of any trader. This connection is not a coincidence – Michael Lewis was a bond salesman for Salomon Brothers before he traded in his Quotron screen for a pen and wrote a string of #1 best sellers about finance and sports. Moneyball, in turn, is really a homage to the ideas of Bill James and an exploration of their implementation (or lack thereof) in modern baseball. And that brings us back to the batting average.

The incredible thing about Bill James’ discovery is that it took so long to see – over 100 years elapsed from the time sufficient data existed to prove the batting average futile to the publication of the Baseball Abstracts. The catalyst wasn’t computing power – Bill James didn’t have access to a computer initially and did his stats with a calculator. And it wasn’t for lack of effort – arguably no subject had endured more statistical analysis in that time than baseball. I think what allowed Bill James to see that which innumerable others missed was a matter of philosophy, of how he thought of statistics:

“…baseball statistics, unlike the statistics in any other area, have acquired the powers of language.”

Let that one sink in for a bit. To extend his thinking, the batting average is junk because it lacks the power of language. It doesn’t tell us anything – at least nothing useful. It gets us no closer to understanding why runs were scored or why one team won and the other lost. It’s mute. In contrast, on base percentage (and its big brother, on base plus slugging or OPS) have something meaningful to say. A number is good and useful because it informs rather than deceives. The test of a statistic is not raw correctness (which is easy to achieve) but insight. This is the fundamental difference between math in the classroom and math in the real world – in the classroom your goal is to be correct. In the real world, your goal is to inform – for that purpose correctness is usually necessary but never sufficient.

Where am I going with all this? Originally I set out to write a short post explaining why bonds were a big deal in the financial markets compared to stocks. I’ve made such statements before, but sort of half-assed the explanation by noting that there are more dollars of bonds than dollars of stocks. It’s true but not an adequate explanation – the biggest is not always the most important. My next, better, idea for an explanation was Lew Ranieri’s famous observation that “bonds are about math”. That gets closer to what I want to convey, but still misses the point because math education has warped people’s view to emphasize the bullshit parts of the discipline that lack Bill James’ power of language. So I’m forced to revise Ranieri, and say that bonds are important because bond math has the power of language. And now you see the reason for my long digression. This, as you might guess, is not really my own idea – as I said, Michael Lewis, who brought Bill James to a general audience, was a successful bond man in a previous life and in fact a co-worker of Ranieri at one time.

So what do I mean by bond math having the power of language? How do numbers derived from bonds inform while others deceive? To fully express the idea I would have to write many pages on bond math. While I will likely do exactly that eventually, today is not the day. So instead I will provide a greatly simplified version of the math and attempt to jump directly to the point where the math becomes language and tells us something of value.

The first thing to know about bonds is that they are a fungible representation of debt. They represent a loan from the holder of the bond to the issuer of the bond. In return for that loan, the issuer promises to pay back the loaned money with interest. The rate on that interest is called the “yield” of the bond. The amount of time before the money is paid back is the “duration” or “maturity” of the bond. The yield is always expressed annually regardless of the duration of the bond. The last relevant piece of information is the price of the bond, which is determined by trading activity in the market. The prices are generally represented in rather odd units – hundredths of the value of the principal (loan amount) on the bond. So a bond priced at 100 is worth the same amount as it was when new. A bond priced at 60 has been greatly devalued for some reason. One priced at 120 is more valuable than at issuance. Price and yield move opposite to each other – if you underpay for a bond (less than 100), then the effective yield is higher than the original buyer would receive. If you overpay, the yield is less. This means that bond prices move opposite interest rates – if rates fall, older (and higher yield) bonds become more valuable than they were at issuance and their price rises above 100. For the purposes of this post, I’m going to ignore bond price and just talk about duration and yield. Just understand that yield is set in the market by means of price.
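The inverse relationship between price and yield can be sketched in a few lines of Python. This is a simplified illustration, not a full pricing model: it assumes a bond that pays one fixed coupon per year and values it with the standard present-value formula, which this post does not spell out. The function name and the 5% coupon are made up for the example.

```python
def bond_price(coupon_rate, market_yield, years, face=100.0):
    """Price per 100 of principal for a bond paying one coupon a year.

    Assumption: annual coupons, all discounted at a single market yield.
    """
    coupon = face * coupon_rate
    # Discount each annual coupon back to today at the market yield.
    price = sum(coupon / (1 + market_yield) ** t for t in range(1, years + 1))
    # Add the discounted repayment of principal at maturity.
    price += face / (1 + market_yield) ** years
    return price


# A hypothetical 10-year bond issued with a 5% coupon.
for market_yield in (0.03, 0.05, 0.08):
    price = bond_price(0.05, market_yield, 10)
    print(f"market yield {market_yield:.0%}: price {price:.2f}")
```

When the market yield equals the coupon the bond prices at 100. When rates fall below the coupon the price rises above 100, and when rates rise above it the price drops below 100.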

If you go to Google Finance, they have bond yields in the lower right hand corner. While they don’t explicitly tell you, these are treasury bond yields. Today’s values are:

3 Month 0.01%

6 Month 0.03%

2 Year 0.22%

5 Year 0.89%

10 Year 2.00%

30 Year 3.03%

Now, it might seem odd of me to say this, but even these few simple numbers have the power of language. What do they say? They say that the next 5 years in the US are going to pretty much suck from an economic perspective. You can of course be excused if you didn’t just look at the numbers and extract that – some skill at interpretation is still required, but the line of thinking is not overly complex. The basic idea is that people will not accept, over the long term, interest rates that are lower than what they think they can get in the short term. It doesn’t make sense to tie my money up for 5 years at 2% if I think I can get 3% annually by tying it up for a quarter at a time 20 times over for those same five years. Thus long duration bonds are a measure of expected future short term rates. And short term rates are a measure of the state of the economy because they are used by the Federal Reserve to control economic growth. When economic times are bad, the Fed drives down short term rates by offering easy Fed loans to stimulate the economy. When, in contrast, economic growth is overheated the Fed will raise short term rates to reduce borrowing and slow the system down. So now we can see why those yields have the power of language – the yields are near zero out to 5 years, meaning that bond traders expect short term rates to be near zero that entire period. And that implies the economy will be lousy out 5 years.
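The lock-in versus roll-over comparison, and the idea that long yields encode expected short rates, can be sketched in Python. This is a rough illustration under loud assumptions: the quoted yields are treated as annually compounded rates on bonds with no interim coupons, and the helper names are invented for the example.

```python
def grow(annual_rate, periods_per_year, years):
    """Value of 1 dollar reinvested every period at a fixed annual rate."""
    per_period = annual_rate / periods_per_year
    return (1 + per_period) ** (periods_per_year * years)


# Tie money up for 5 years at 2%, or roll it quarterly at 3% for 20 quarters.
lock_in = grow(0.02, 1, 5)
roll = grow(0.03, 4, 5)
print(f"lock in 5 years at 2%: {lock_in:.4f}")
print(f"roll 20 quarters at 3%: {roll:.4f}")


def implied_forward(short_yield, short_years, long_yield, long_years):
    """Annual rate for the later stretch that makes both routes pay the same.

    Assumption: yields are annually compounded with no interim coupons.
    """
    growth_long = (1 + long_yield) ** long_years
    growth_short = (1 + short_yield) ** short_years
    span = long_years - short_years
    return (growth_long / growth_short) ** (1 / span) - 1


# Using the 5 year (0.89%) and 10 year (2.00%) yields listed above.
forward_5_to_10 = implied_forward(0.0089, 5, 0.0200, 10)
print(f"implied rate for years 5 to 10: {forward_5_to_10:.2%}")
```

Rolling at 3% beats locking in at 2%, so nobody who expected 3% short rates would buy the 5 year bond at 2%. Read in reverse, the listed yields imply a rate of roughly 3% for years 5 through 10, noticeably higher than the sub-1% rate implied for the first 5 years.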

Of course this is only a prediction, produced via the market, and not by any means the sort of prediction made at Delphi that always came true. But it is a prediction made by the most financially savvy people in the world and backed by trillions of dollars bet that the opinion is correct. The bond market’s ability to predict the future economy, while far from perfect, is much better than random or than say the press or politicians can muster. And even if you disagree with the prediction, it has value: it gives you something concrete to disagree with. Without the yield curve information, you could say the economy was kind of bad, but putting a number on how long that would persist would likely be an exercise in guessing. Hopefully you’re beginning to see the meaning in the numbers.

Predicting the future economy is a pretty neat trick, but there’s far more hidden in the bond numbers. Consider the idea that Treasury bonds are considered more or less risk free – the US Treasury will not default. Correspondingly they have the lowest yield of all dollar denominated bonds. But what does the higher yield on other, non-treasury bonds, mean? It’s a representation of the probability of default on an annual basis. A default is what happens when a bond issuer doesn’t make their bond payments – it’s synonymous with bankruptcy. So when California State 5-year bonds yield 2.25% and the 5 year treasury bond yields 0.89%, that means the market thinks there’s a 1.36% chance annually that California will go bankrupt and default on their debt. Again, that’s informative – everyone who follows the media knows that California has some financial problems, but if you believe the media you’d likely think the chance of default is much higher than it actually is. Again, this is only prediction, not fiat, but it is informative.
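The arithmetic behind that 1.36% can be sketched in Python. It rests on strong assumptions stated here plainly: default is all-or-nothing (bondholders recover nothing), investors demand no extra premium for bearing risk, and tax treatment is identical. The function names are invented for the example, and the second function is a slightly more careful one-year version of the same idea rather than anything taken from this post.

```python
def default_probability_approx(risky_yield, risk_free_yield):
    """Rule of thumb: the yield spread is the annual chance of default."""
    return risky_yield - risk_free_yield


def default_probability_one_year(risky_yield, risk_free_yield):
    """One-year break-even with zero recovery.

    Solves (1 - p) * (1 + risky_yield) = 1 + risk_free_yield for p.
    """
    return (risky_yield - risk_free_yield) / (1 + risky_yield)


california_5y = 0.0225  # yield quoted in the post
treasury_5y = 0.0089    # yield quoted in the post

approx = default_probability_approx(california_5y, treasury_5y)
exact = default_probability_one_year(california_5y, treasury_5y)
print(f"spread rule of thumb: {approx:.2%}")
print(f"one-year break-even:  {exact:.2%}")
```

At yields this low the two versions barely differ, which is why the simple subtraction is a reasonable shorthand.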

I could go on providing examples, but the pattern should be clear here – every bond yield (or price) is a small commentary on either the economic future or on the soundness of some financial institution. When I say bond math has the power of language, this is what I mean – it tells you things that are interesting and useful to know. The TED spread tells you about the soundness of overseas banks. The yields of various corporate bonds tell you about the health of major corporations. The yields of sovereign bonds tell you about the health of various national financial systems and which countries are attracting investment capital. That’s the power of language – the numbers speak.

So why do bond yields have the power of language while stock prices do not? The answer lies in the fundamentally boring nature of bonds. With a conventional bond, there is no overly good outcome possible for the investor. If all goes according to plan you simply get the fixed payout offered by the bond. There are no dreams of riches. With equities, there is always that possibility of an outstandingly good outcome where the company’s equity stake increases many times over. In fact, thanks to market dynamics stocks are held preferentially by those who most believe that highly positive outcome will occur – stocks are inherently at some level the province of dreamers. This in turn radically depresses the dividend yields on stocks, and makes them mostly useless for the sorts of mathematical comparisons bonds invite. Interpreting bond prices is the process of interpreting rational bets by boring, rational people. Interpreting stock prices is trying to quantify dreamers.

There are two things I want you to take away here. One is why bond yields are informative, indeed have the power of language. The second is the idea that better mathematical analysis stemming from the right philosophical framework can allow you to see things that many other people looking very hard will miss. That is why the Baseball Abstracts (and Moneyball as their popular interpreter) are urtexts of this blog. They offer proof that with the right sort of analysis you can create clarity where others are confused, which is a good thing to be able to do if you’re trying to find profitable trades.

This speech gets at the same question from a different direction, and I will certainly return to it.

4 thoughts on “Baseball, Bonds, Math and the Power of Language”

mike on March 29, 2012 at 4:08 am said:

Excellent, thanks. Some on 2+2 had recommended trading interest rate products to start learning on, and this helped me understand the very basics.

W on March 29, 2012 at 7:48 pm said:

Cool – I kind of wondered if this site might eventually get a following at 2+2. Is Dave still sugar daddying young looking strippers and trying to turn the 40-80 stud game into hi-lo no qual late at night?

KN on November 5, 2012 at 5:08 pm said:

Could you explain the math behind this paragraph “So when California State 5-year bonds yield 2.25% and the 5 year treasury bond yields 0.89%, that means the market thinks there’s a 1.36% chance annually that California will go bankrupt and default on their debt”

Thanks

- W on November 5, 2012 at 8:47 pm said:
  
  Sure. The efficient price for a bond is the zero risk rate plus the annual rate of default (technically that’s assuming all-or-nothing default). So if you’ve got a company that has a 1% chance of defaulting, its bonds should yield 1% above T-bond rate for T-bonds of the same duration. There may be other factors – for example different tax treatment. But with municipal bonds, everything is tax free so that’s not an issue and the relationship should hold pretty directly.
  
  Hopefully that answers your question.
  

Off-Road Finance

Education For Thinking Traders
