# Tennis Statistics

## Probability of Winning a Tennis Game

The probability of winning a tennis game is somewhat complicated by the fact that the number of points that will be played is not predetermined, and the eventual winner always wins the last point. Let $p$ be the fundamental probability of you winning a tennis point, assumed to be a given quantity, and define $q = 1 - p$. Also, let $P(m,n)$ be the likelihood that if $n$ points were played, you would win $m$ of them. $P(m,n)$ is thus just the standard binomial expression given by $P(m,n) = \frac{n!}{m!\,(n-m)!}\, p^m q^{n-m}$.
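As a small illustration of the binomial expression, the sketch below evaluates it for the first four points of a game. The function name `binom_prob` and the value p = 0.6 are assumptions chosen for the example, not part of the original text.

```python
from math import comb

def binom_prob(m: int, n: int, p: float) -> float:
    """P(m, n): probability of winning exactly m of n points,
    when each point is won independently with probability p."""
    q = 1.0 - p
    return comb(n, m) * p**m * q**(n - m)

# Probabilities of winning 0, 1, 2, 3 or 4 of the first four points
# for an illustrative point probability p = 0.6.
first_four = [binom_prob(m, 4, 0.6) for m in range(5)]
print(first_four)
# The five cases are exhaustive, so they sum to 1 up to rounding.
print(sum(first_four))
```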

While the number of points in a game is not fixed, each game comprises at least four points. The following formula, explained below, then yields the probability, $w$, that you will win the game.

$$
w = P(4,4) + P(3,4)\left[p + qp + q^2 p_d\right] + P(2,4)\,p_d + P(1,4)\,p^2 p_d
$$

In this expression, $p_d$ is the probability that you will win the game once you have reached a deuce situation, a term to be considered shortly.

This equation consists of four terms on the right-hand side, one for each number of points you can win among the first four and still take the game. The first is the likelihood of winning all of the first four points, and thus the game, and can be calculated directly from the binomial formula. The second term corresponds to winning three of the first four points, and then either winning the next (and final) point, or losing a point before winning the final point, or losing two points, which places you at deuce, from which you win with probability $p_d$. The third term covers winning two of the first four points. The score is then 30-30, which is equivalent to deuce because you again need to finish two points ahead, so this term also involves $p_d$. Finally, the fourth term corresponds to winning only one of the first four points, in which case you must win the next two points just to reach deuce. Winning none of the first four points loses the game, so that case contributes nothing.

$p_d$ remains to be defined before $w$ can be evaluated. It turns out that $p_d = p^2 + pq\,p_d + qp\,p_d$, since the probability of winning at deuce is the probability of winning the next two points, plus the probability of winning and then losing a point, which gets you back to deuce again, plus the probability of losing and then winning a point, which also gets you back to deuce. Solving for $p_d$ gives

$$
p_d = \frac{p^2}{1 - 2pq}
$$
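For readers who want the intermediate algebra, here is the step spelled out. Both return-to-deuce terms equal $pq\,p_d$, so they can be collected before solving:

$$
p_d = p^2 + 2pq\,p_d
\;\Longrightarrow\;
p_d\,(1 - 2pq) = p^2
\;\Longrightarrow\;
p_d = \frac{p^2}{1 - 2pq}
$$

Because $p + q = 1$, the denominator satisfies $1 - 2pq = p^2 + q^2$, so the same result can be written $p_d = p^2/(p^2 + q^2)$. As a quick check, $p = 0.5$ gives $p_d = 0.5$, as symmetry requires.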

$p_d$ can now be substituted into the relation for $w$, as can the $P(i,4)$ binomial expression terms, and $w$ follows as

$$
w = p^4\left[1 + 4q + 10q^2\right] + \frac{20\,p^5 q^3}{1 - 2pq}
$$
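One way to check the algebra is to evaluate the game probability twice, once by summing the four cases for the first four points and once with the closed form, and confirm that the two agree. The sketch below does this in Python. The function names and the sample values of p are assumptions made for the illustration.

```python
from math import comb

def deuce_win(p: float) -> float:
    """p_d = p^2 / (1 - 2pq): probability of winning the game from deuce."""
    q = 1.0 - p
    return p**2 / (1.0 - 2.0 * p * q)

def game_win_by_cases(p: float) -> float:
    """Sum the four cases for the first four points of the game."""
    q = 1.0 - p
    pd = deuce_win(p)

    def P(m: int) -> float:
        # Binomial probability of winning m of the first four points.
        return comb(4, m) * p**m * q**(4 - m)

    return (P(4)
            + P(3) * (p + q * p + q**2 * pd)
            + P(2) * pd
            + P(1) * p**2 * pd)

def game_win(p: float) -> float:
    """Closed form: w = p^4 (1 + 4q + 10q^2) + 20 p^5 q^3 / (1 - 2pq)."""
    q = 1.0 - p
    return p**4 * (1 + 4 * q + 10 * q**2) + 20 * p**5 * q**3 / (1 - 2 * p * q)

# The two columns should agree to within floating-point rounding.
for p in (0.4, 0.5, 0.55, 0.6):
    print(p, game_win_by_cases(p), game_win(p))
```

The denominator $1 - 2pq$ is never smaller than 0.5 for $p$ between 0 and 1, so neither function can divide by zero.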

## Probability of Winning the Set

The probability of your winning the set follows much the same procedure as that described above, but using the calculated value of $w$, instead of $p$, as the driving parameter. Here we define $z = 1 - w$, and note that each set involves at least six games. We will also adopt the notation that P(r to s) denotes the probability of a set score of r to s, and $R(m,n)$ is the binomial formula for winning $m$ of $n$ games, namely $R(m,n) = \frac{n!}{m!\,(n-m)!}\, w^m z^{n-m}$. Using reasoning similar to that employed before:

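$$
\begin{aligned}
P(6\text{ to }0) &= R(6,6) = w^6 \\
P(6\text{ to }1) &= R(5,6)\,w = 6w^6 z \\
P(6\text{ to }2) &= R(5,6)\,zw + R(4,6)\,w^2 = 21w^6 z^2 \\
P(6\text{ to }3) &= R(5,6)\,z^2 w + R(4,6)\,[wz + zw]\,w + R(3,6)\,w^3 = 56w^6 z^3 \\
P(6\text{ to }4) &= R(5,6)\,z^3 w + R(4,6)\,[3w^2 z^2] + R(3,6)\,[3w^3 z] + R(2,6)\,w^4 = 126w^6 z^4 \\
P(5\text{ to }5) &= R(5,6)\,z^4 + R(4,6)\,[4wz^3] + R(3,6)\,[6w^2 z^2] + R(2,6)\,[4w^3 z] + R(1,6)\,w^4 = 252w^5 z^5 \\
P(7\text{ to }5) &= P(5\text{ to }5)\,w^2 = 252w^7 z^5 \\
P(6\text{ to }6) &= P(5\text{ to }5)\,[2wz] = 504w^6 z^6 \\
P(7\text{ to }6) &= P(6\text{ to }6)\,p_t
\end{aligned}
$$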

where $p_t$ is the probability of winning a tie breaker game in which the winner is the first to reach at least 7 points, and to win by at least 2 points.

The only remaining detail is to relate $p_t$ to the point probabilities $p$ and $q$. This is done in precisely the same way as in computing $w$, except that the tie breaker comprises at least seven points rather than four. The algebra gets a bit messier, with the result that

$$
p_t = p^7\left[1 + 7q + 28q^2 + 84q^3 + 210q^4 + 462q^5\right] + 924\,p^6 q^6\,p_d
$$

where $p_d$ is derived in the previous section.
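The coefficients in the tie breaker formula are binomial counts: to win 7 to k you must win the last point, and the k lost points can fall anywhere among the first 6 + k points. The sketch below builds the formula from that observation rather than hard-coding the numbers. The function name `tiebreak_win` is an assumption for the example, and, as in the text, every point is taken to be won with the same probability p.

```python
from math import comb

def tiebreak_win(p: float) -> float:
    """Probability of winning a first-to-7, win-by-2 tie breaker
    when every point is won independently with probability p."""
    q = 1.0 - p
    pd = p**2 / (1.0 - 2.0 * p * q)  # win-by-two probability from a level score

    # Wins by 7-0 through 7-5: coefficients 1, 7, 28, 84, 210, 462.
    direct = sum(comb(6 + k, k) * p**7 * q**k for k in range(6))

    # Reach 6-6 in C(12, 6) = 924 ways, then win by two as from deuce.
    return direct + comb(12, 6) * p**6 * q**6 * pd

# Coefficients of the bracketed polynomial, for comparison with the formula.
print([comb(6 + k, k) for k in range(6)], comb(12, 6))
for p in (0.45, 0.5, 0.55):
    print(p, tiebreak_win(p))
```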

We have now derived everything necessary to calculate all of the possible set scores starting only with the point probability, $p$. The previous section yields $w$ and $p_d$, and this section takes those values and evaluates the set scores. The likelihood of winning the set, $s$, is simply the sum of the probabilities of all winning set scores, namely

s = P(6 to 0) + P(6 to 1) + P(6 to 2) + P(6 to 3) + P(6 to 4) + P(7 to 5) + P(7 to 6)
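The set score probabilities and their sum can be tabulated with a few lines of Python. In this sketch the game probability w and the tie breaker probability are passed in as plain numbers. The function names and the sample inputs (0.6 and 0.55) are assumptions for illustration, not values taken from the text.

```python
from math import comb

def set_scores(w: float, pt: float) -> dict:
    """Probabilities of each winning set score, given the game win
    probability w and the tie breaker win probability pt."""
    z = 1.0 - w
    # 6-0 through 6-4: coefficients 1, 6, 21, 56, 126.
    scores = {f"6-{k}": comb(5 + k, k) * w**6 * z**k for k in range(5)}
    five_all = comb(10, 5) * w**5 * z**5   # P(5 to 5) = 252 w^5 z^5
    six_all = five_all * 2 * w * z         # P(6 to 6) = 504 w^6 z^6
    scores["7-5"] = five_all * w**2
    scores["7-6"] = six_all * pt
    return scores

def set_win(w: float, pt: float) -> float:
    """s: sum of the probabilities of all winning set scores."""
    return sum(set_scores(w, pt).values())

for score, prob in set_scores(0.6, 0.55).items():
    print(score, round(prob, 4))
print("s =", set_win(0.6, 0.55))
```

With evenly matched players (w = 0.5 and a tie breaker probability of 0.5) the sum comes out to exactly one half, which is a useful sanity check on the coefficients.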

## Probability of Winning the Match

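This is a much easier derivation, given the set win probability, $s$. Assuming that a match win goes to the first player to win two sets, you either win the first two sets, or split the first two sets in either order and then win the third. The likelihood of a match win, $m$, is therefore given by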

$$
m = s^2 + 2s^2(1 - s)
$$
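Putting the pieces together, the whole chain from the point probability to the match probability fits in a short script. This is a minimal sketch under the same assumptions as the derivation: a single point probability p for every point, and a best-of-three match. All function names are chosen for the example.

```python
from math import comb

def deuce_win(p: float) -> float:
    """p_d: probability of winning by two points from a level score."""
    q = 1.0 - p
    return p**2 / (1.0 - 2.0 * p * q)

def game_win(p: float) -> float:
    """w: probability of winning a game."""
    q = 1.0 - p
    return p**4 * (1 + 4 * q + 10 * q**2) + 20 * p**3 * q**3 * deuce_win(p)

def tiebreak_win(p: float) -> float:
    """p_t: probability of winning a first-to-7, win-by-2 tie breaker."""
    q = 1.0 - p
    direct = sum(comb(6 + k, k) * p**7 * q**k for k in range(6))
    return direct + comb(12, 6) * p**6 * q**6 * deuce_win(p)

def set_win(p: float) -> float:
    """s: probability of winning a set, with a tie breaker at 6-6."""
    w = game_win(p)
    z = 1.0 - w
    six_x = sum(comb(5 + k, k) * w**6 * z**k for k in range(5))  # 6-0 .. 6-4
    five_all = comb(10, 5) * w**5 * z**5
    return six_x + five_all * w**2 + five_all * 2 * w * z * tiebreak_win(p)

def match_win(p: float) -> float:
    """m: probability of winning a best-of-three match."""
    s = set_win(p)
    return s**2 + 2 * s**2 * (1 - s)

# Tabulate w, s and m for a few point probabilities.
for p in (0.45, 0.5, 0.52, 0.55, 0.6):
    print(p, round(game_win(p), 4), round(set_win(p), 4), round(match_win(p), 4))
```

For p above one half the printed values increase from game to set to match, reflecting how each level of scoring amplifies a small per-point advantage.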

w, s, and m are presented as a function of p in the first graph, and specific set score probabilities are displayed in the second figure.