Luck and the Law: Quantifying Chance in Fantasy Sports and Other Contests

Fantasy sports have experienced a surge in popularity in the past decade. One of the consequences of this recent rapid growth is increased scrutiny surrounding the legal aspects of the games, which typically hinge on the relative roles of skill and chance in the outcome of a competition. While there are many ethical and legal arguments that enter into the debate, the answer to the skill versus chance question is grounded in mathematics. Motivated by this ongoing dialogue we analyze data from daily fantasy competitions played on FanDuel during the 2013 and 2014 seasons and propose a new metric to quantify the relative roles of skill and chance in games and other activities. This metric is applied to FanDuel data and to simulated seasons that are generated using Monte Carlo methods; results from real and simulated data are compared to an analytic approximation which estimates the impact of skill in contests in which players participate in a large number of games. We then apply this metric to professional sports, fantasy sports, cyclocross racing, coin flipping, and mutual fund data to determine the relative placement of all of these activities on a skill-luck spectrum.

view has proven to be prescient as recent trends show that legal arguments grounded in data analysis are becoming increasingly common [6]. In light of this shift (although historical and ethical arguments remain the purview of legal professionals), physicists, mathematicians, and others well versed in data science have an obligation to provide rigorous mathematical foundations to ground these statistical legal debates.
One such debate that is currently being argued in the courts involves fantasy sports, games in which participants assemble virtual teams of athletes and compete based on the athletes' real-world statistical performance. Fantasy sports have experienced a surge in popularity in the past decade. The Fantasy Sports Trade Association [1] estimates that 56.8 million people played fantasy sports in 2015 (up from 41.5 million in 2014) and that the concomitant economic impact of the industry is on the order of billions of dollars per year. One of the consequences of the recent rapid growth in activity is increased scrutiny regarding the legal aspects of the games. In particular, contests that involve online exchange of funds are now subject to the Unlawful Internet Gambling Enforcement Act of 2006 (UIGEA), which regulates online financial transactions associated with betting or wagering [2]. Currently, the UIGEA excludes fantasy sports, stating that the definition of a "bet or wager" does not include "participation in any fantasy or simulation sports game" [2]. However, at the time of writing, eight U.S. states do not allow fantasy sports competitions for cash.
In general, legal questions surrounding the classification of games as "bets" or "wagers" hinge on whether the outcome of the game is determined predominantly by skill or by chance. Typically, "skill" is defined as the extent to which the outcome of a game is influenced by the actions or traits of individual players, compared to the extent to which the outcome depends on random elements. Note that here we are considering the role of skill in the game framework, i.e., our goal is to quantify the utility of players' abilities in general, rather than to measure the skill of individual players.
In a fantasy sports contest, players manage teams that accrue points based on the statistics of real athletes. For example, a fantasy football player with receiver X on their roster earns points every time X makes a catch in a professional football game. Rules for constructing a fantasy team roster, and hence strategies for assembling optimal lineups, vary by league. In this study, we analyze data from salary cap games in which each player has a fictional dollar amount they can spend and athlete "salary" values are set by the game provider.
While there have been relatively few empirically based investigations on the roles of skill and chance in fantasy sports, studies exist to determine whether skill is a distinguishing factor among NFL kickers [18], whether the outcomes of shootouts in hockey are primarily determined by luck [11], and whether perceived "streaks" in basketball should be attributed to chance or to the "hot hand" [25,26,27]. In addition, many authors have explored whether scoring patterns in basketball [3,8,24], baseball [16,24], cricket [23], soccer [9], tennis [12,21], and Australian rules football [13] display statistical signatures of random processes. Perhaps the most relevant, extensive (and hotly debated) body of work is the analysis of the relative roles of skill and chance in poker [5, 4, 22, and references therein]. Unfortunately, most of these analyses cannot be applied directly to fantasy sports, which differ from card games in at least one critical aspect: in card games, it is a relatively straightforward combinatorial exercise to estimate the probability of every possible outcome (e.g., how likely is a flush). This is not the case for fantasy sports, in which performance on any given day is coupled to a host of factors such as weather, skill of opponent, injury, home versus away games, etc. Owing to this added layer of complexity, it is unclear whether the bulk of previous poker analyses can be adapted to fantasy games.
Instead, we take a data-driven approach analogous to the one proposed by Levitt et al. [15,14]. Their study was performed in the context of poker but, unlike the combinatorial approaches of, e.g., [5], it does not require prior knowledge of outcome probabilities and hence can be easily adapted to other activities that combine skill and chance. Our strategy will be to examine data from the online fantasy sports platform FanDuel and test whether statistical outcomes are consistent with expected outcomes associated with games of chance. If the measured outcomes deviate significantly from those we expect in a contest of pure chance, we can quantify the extent of the deviation to place fantasy sports and other activities on a skill-luck spectrum.
Levitt, Miles, and Rosenfield [15] note that there are at least four tests that can be applied to distinguish games of pure chance from those involving skill. The authors propose that tests can be framed as the following four questions:
1. Do players have different expected payoffs when playing the game?
2. Do there exist predetermined observable characteristics about a player that help one to predict payoffs across players?
3. Do actions that a player takes in the game have statistically significant impacts on the payoffs that are achieved?
4. Are player returns correlated over time, implying persistence in skill?
If the answer to all four questions is "no," then we can conclude that the game under consideration is a game of chance. One of the many appealing aspects of this test is that the analysis can be framed in terms of inputs (player actions) and outputs (win-loss records); hence the game itself can be treated as a black box and the relative roles of skill and luck can be quantified irrespective of the detailed rules of the game.

Empirical Tests of Skill Applied to Fantasy Sports Data.
To estimate the relative roles of skill and chance in these contests, we analyze data from FanDuel, currently one of the largest providers of daily fantasy sports. We consider two types of daily fantasy games, head-to-head (H2H) and 50/50 competitions, associated with four sports leagues: Major League Baseball (MLB), the National Basketball Association (NBA), the National Football League (NFL), and the National Hockey League (NHL). In H2H competitions, a player pits their team against a single opponent; both players pay the same amount to play and the winner takes all (minus the overhead to the host site). In a 50/50 league, a pool of players each pay the same entry fee to enter the competition; the top half of scorers in the fantasy league each receive the same payout (roughly double what they put in), while the bottom half receives nothing. When a player elects to play in a game (e.g., a particular 50/50 competition), they may pay to submit multiple entries. Hence, each player has a win fraction associated with each game, defined as their fraction of winning entries. These entries are typically not independent and players may submit multiple copies of the same entry.
FanDuel provided us with 12 sets of data. The first four were anonymized results from H2H competitions for MLB, NBA, NFL, and NHL. Each entry in the dataset represented the performance of one user in one game ($G_i$) and contained a user ID (UID), the number of entries submitted by UID in $G_i$, the number of winning entries for UID in $G_i$, the average score (averaged across all UID's entries in $G_i$), and the top score for UID in $G_i$. The next four datasets contained similar information for 50/50 competitions. The final four datasets contained athlete performance data. Each entry in the dataset included athlete name, date of competition, team, FanDuel "salary" on the date of that particular competition, position (summarized in Table 1), and the number of FanDuel points scored by that athlete in that particular competition. We considered two full seasons (2013/14 and 2014/15) of all FanDuel H2H and 50/50 contests for NBA, NHL, NFL, and MLB. Each fantasy team for all sports contains nine athletes (see Table 1). The salary caps, number of fantasy players, and number of athletes in our dataset are summarized in Table 2. In what follows we use this data to test questions 1, 3, and 4 proposed by Levitt et al. [15]. In all cases we found no significant differences between the H2H and 50/50 data, so the results presented herein were obtained using the combined datasets unless otherwise noted. Since our data is anonymized we do not have information on predetermined player characteristics; hence we do not address question 2.

Expected Payoff.
In a game of chance, the expected payoff for all players is the same. To test whether this is true of our data, we divide each of our datasets (fantasy NBA, MLB, NFL, and NHL) into five subsets according to the number of entries $N_i$ played by the $i$th player. The first group contains players who have submitted the fewest entries and the fifth group contains players who have submitted the most entries. In the FanDuel playing population, we observe that the number of players, $m$, who have played $n_i$ games decays approximately exponentially (i.e., most players play only a few games; see Figure 4); hence ranges were selected to reflect a logarithmic distribution such that the first group contains 90% of the players, the second group 90-99%, the third 99-99.9%, etc. If the measured win fraction distribution varies across these five subsets in a statistically significant manner, we can conclude that something other than chance played a role in the outcome of the contest. Here the win fraction of the $i$th player is computed as
(1) $w_i = \frac{1}{n_i} \sum_{j=1}^{n_i} x_{ij}$,
where $x_{ij}$ represents the fraction of winning entries in the $j$th game: $0 \le x_{ij} \le 1$ and $x_{ij} = 1$ if all of player $i$'s entries in the $j$th game were winners. Some care must be taken in this analysis as it can be argued that players who win their initial games (whether by skill or by luck) may be more likely to keep playing; conversely, players on a losing streak may be more likely to quit. To determine whether or not this is the case, we computed the average win fraction across the population for the $n_i$th game (i.e., the last game the player played before they quit), the $(n_i - 1)$th game (i.e., the second-to-last game the player played before they quit), etc. These data are summarized in Figure 1 (left), which shows a "quitting boundary layer" indicating that players are indeed more likely to quit after a string of losses. In our dataset, the boundary layer, $n_{BL}$, is measured to be approximately five games wide in all four sports. This introduces a bias in the calculation since the boundary layer accounts for a larger fraction of games in the first group, in which the population has played the fewest games. To correct for this bias, we remove the last five games played by every player and compute the win fraction of the remaining games:
(2) $w_i = \frac{1}{n_i - n_{BL}} \sum_{j=1}^{n_i - n_{BL}} x_{ij}$.
The removal of the boundary layer increases the average win fraction significantly in the first group, but has only a minor impact on groups 2-5.
© 2018 SIAM. Published by SIAM under the terms of the Creative Commons 4.0 license. Downloaded 12/04/18 to 18.51.0.96. Redistribution subject to CC BY license.
A number of trends are clearly observable in the data summarized in Figure 1 (right). First, players who play the fewest games systematically underperform the other groups with an average win fraction of 0.48 (averaged over all four sports). In contrast, in the cohort that played the most games, the mean win fraction increased to 0.61 (averaged over all four sports). Thus, these data are not consistent with a game of pure chance. (Additional details are available in Table 4 in the appendix.)
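The win-fraction calculation described above, with and without the quitting boundary layer, can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' code: the function name `win_fraction`, the parameter `n_bl`, and the sample season are hypothetical, and the sketch assumes the per-game fractions are stored in chronological order.

```python
from typing import Sequence


def win_fraction(x: Sequence[float], n_bl: int = 0) -> float:
    """Mean of the per-game winning-entry fractions x_ij for one player.

    If n_bl > 0, the last n_bl games (the assumed quitting boundary
    layer) are dropped before averaging.
    """
    kept = x[: max(len(x) - n_bl, 0)]
    if len(kept) == 0:
        raise ValueError("no games left after removing the boundary layer")
    return sum(kept) / len(kept)


# Hypothetical player: decent early results, then five losses before quitting.
games = [1.0, 0.5, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
raw = win_fraction(games)              # all ten games
trimmed = win_fraction(games, n_bl=5)  # last five games removed
```

For this made-up player the trimmed win fraction is higher than the raw one, which mirrors the direction of the correction reported for the first group.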

Effect of Player Action.
To test the effects of player actions, we compare outcomes of real players with those of a league of players drawn randomly from all possible lineups. Ideally we would like to compare the distribution of scores generated by real fantasy players with the distribution of scores for all possible lineups; however, the combinatorics associated with generating all possible lineups proved to be computationally intractable. Instead, we estimate the "all possible lineup" distribution using a Monte Carlo approach. Two strategies were tested to construct the Monte Carlo rosters. In the first, athletes for each roster slot were picked randomly from a normal distribution of salaries, centered at one ninth of the salary cap. In the second, the center of the salary distributions varied by position; for example, if on average quarterbacks cost twice as much as kickers, then the mean of the normal salary distribution for quarterbacks was set to be twice as much as the mean of the normal salary distribution for kickers, with the constraint that the sum of the means across all positions must be equal to the salary cap.
One hundred random lineups that conform to the rules of the game were generated for each day of competition. In all cases, including football, only daily competitions were considered, which corresponds to a range of 51 (NFL) to 182 (MLB) competitive days per year depending on the sport. Each roster is checked to see if the total salary is below the salary cap and above a minimum threshold. In the data shown here, the threshold is set to 85% of the salary cap, which roughly maximizes the mean score of the random lineups using FanDuel data. (If the threshold is too low, cheap lineups can skew the distribution toward low-scoring rosters; if the threshold is too high, the constraint is too rigid and high-performing rosters are missed.) Randomly generated rosters that satisfy these constraints are accepted, and all others are rejected. The process is repeated until 100 acceptable rosters per day are generated. We found very little difference between the two Monte Carlo strategies, i.e., equal distribution versus position-weighted distribution. Results for the weighted distribution are shown in Figure 2. In all cases FanDuel players beat the Monte Carlo simulation with user win probabilities ranging from 62% (NHL, equal distribution) to 95% (NBA, weighted distribution) as summarized in Table 3, suggesting that player actions do indeed influence the outcome of the game.
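The accept/reject procedure for random rosters can be sketched as follows. This is an illustration of the "equal distribution" strategy under several assumptions that are not in the text: a target salary is drawn from a normal distribution for each slot and the unused athlete with the nearest salary is chosen; the standard deviation (`sd_frac` times the mean) is arbitrary; and the pool, slot list, and cap are toy values. The function name `random_lineup` is hypothetical.

```python
import random


def random_lineup(pool, slots, cap, rng, sd_frac=0.25, floor=0.85,
                  max_tries=10_000):
    """Draw one roster by rejection sampling.

    pool  : {position: [(name, salary), ...]}
    slots : list of positions to fill, one athlete per slot
    A roster is accepted only if floor * cap <= total salary <= cap.
    """
    mean = cap / len(slots)  # equal split of the cap across slots
    for _ in range(max_tries):
        lineup, used = [], set()
        for pos in slots:
            # Assumption: match a normally distributed target salary to
            # the closest-priced athlete who is not already on the roster.
            target = rng.gauss(mean, sd_frac * mean)
            candidates = [a for a in pool[pos] if a[0] not in used]
            pick = min(candidates, key=lambda a: abs(a[1] - target))
            lineup.append(pick)
            used.add(pick[0])
        total = sum(salary for _, salary in lineup)
        if floor * cap <= total <= cap:
            return lineup  # accepted
    raise RuntimeError("no acceptable lineup found")


# Toy data (hypothetical athletes and salaries, for illustration only).
toy_pool = {
    "G": [("g1", 6000), ("g2", 8000), ("g3", 10000), ("g4", 12000), ("g5", 14000)],
    "F": [("f1", 7000), ("f2", 9000), ("f3", 11000), ("f4", 13000)],
}
toy_slots = ["G", "G", "F"]
toy_cap = 30000
lineup = random_lineup(toy_pool, toy_slots, toy_cap, random.Random(0))
total = sum(salary for _, salary in lineup)
```

The position-weighted variant would differ only in replacing the single `mean` with a per-position mean whose sum across slots equals the cap.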

Persistence.
To address the question of persistence, we begin with the hypothesis that skill is an intrinsic quality of a fantasy player and does not change significantly over the course of the season. If this is the case, we expect to observe a distribution of underlying skill across the playing population in which the win fraction of each individual player in the first half of the season is correlated with that player's win fraction in the second half. To determine whether this is consistent for FanDuel players, we plot the win fraction for the first half of the season versus the win fraction for the second half of the season for each player. (Note that here and in all subsequent calculations, the quitting boundary layer has been removed.) These data are shown in the scatter plots in Figure 4; each circle represents one FanDuel player and the size of the point represents the number of games played by that particular player.
In order to quantify the role of skill in determining the outcomes of the competitions represented in these plots, we seek a metric with the following properties. Ideally the accuracy of the metric should improve as the number of contests per player grows, $n_i \to \infty$, and as the number of players grows, $m \to \infty$. In particular, the metric must capture the expected behaviors at the two extremes: competitions of pure luck and competitions of pure skill. In contests of pure luck, the expected outcome of every player is the same. Hence, assuming a zero-sum game in which players are playing against one another, as the number of games per player increases, win fractions for the first half of the season versus win fractions for the second half of the season for all players converge to a single point at (1/2, 1/2). Conversely, in a skill-dominated competition, we expect to observe a distribution of skill across the playing population; in this case we expect the data to converge to a line with slope 1 as the number of games per player increases. Sketches of expected user win fractions for the first half of the season versus user win fractions for the second half of the season for high and low values of $n_i$ and $m$ are shown in Figure 3.
To characterize these distributions, perhaps the most obvious metrics to try are the Pearson product-moment correlation coefficient or a standard linear regression. However, both of these are problematic. The linear regression fails to accurately capture the extreme of pure luck since, in the limit that the number of games per player becomes large, all of the data collapses onto a single point. The Pearson product-moment correlation coefficient is problematic as it cannot distinguish between lines of different slopes. In our case, in the limit of pure skill, we expect a line of slope 1, whereas lines with different slopes, no matter how well correlated, are not necessarily representative of intrinsic skill. Hence, we seek an alternate measure. Motivated by the expected outcomes shown in Figure 3, we propose to use the ratio of the variance along the diagonal in our scatter plots (denoted as $S$ in Figure 3) to the variance in the orthogonal direction (denoted as $T$ in Figure 3) as a measure of the relative importance of skill and luck in a competition.
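The limitation of the Pearson coefficient noted above is easy to see numerically. In the sketch below (the data are made up for illustration), second-half win fractions that track first-half win fractions with slope 1 and with a much shallower slope of 0.2 both yield a correlation of essentially 1, so the coefficient alone cannot tell the two cases apart.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson product-moment correlation coefficient of two samples."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical first-half win fractions for five players.
p = [0.3, 0.4, 0.5, 0.6, 0.7]

# Second-half win fractions on the diagonal (slope 1) ...
q_slope_one = list(p)
# ... and on a shallow line through (1/2, 1/2) with slope 0.2.
q_shallow = [0.5 + 0.2 * (x - 0.5) for x in p]

r_slope_one = pearson(p, q_slope_one)
r_shallow = pearson(p, q_shallow)
```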
To evaluate this quantity, we characterize each player by two numbers: $n_i$, the number of games played by player $i$, and $w_i$, the win fraction of player $i$ as computed in (1). Note that in the following analysis we will be computing quantities associated with a distribution of $m$ players where the $i$th player constitutes a distribution of $n_i$ games. Hence, quantities with the subscript $i$ refer to individual players, while quantities with no subscript refer to aggregates over all players. Next, we introduce random variables $P_i$ and $Q_i$ associated with the win fraction of player $i$ in the first and second halves of the season, respectively. If the game is truly random, then $P_i$ and $Q_i$ have the same distribution and $E[P_i] = E[Q_i] = 1/2$ (where $E[X]$ indicates the expected value of $X$); however, if the outcome of the game is primarily determined by skill, then $E[P_i] = E[Q_i] = w_i$, where $w_i$ represents the true underlying skill of player $i$.
Rotating the $(P, Q)$ coordinate system by $\pi/4$ and shifting the origin, we introduce the transformed random variables as sketched in Figure 3:
$S_i = \frac{P_i + Q_i - 1}{\sqrt{2}}, \qquad T_i = \frac{Q_i - P_i}{\sqrt{2}}.$
Here, $T_i$ represents the difference in the win fraction distribution between the first and second halves of the season, and $S_i$ represents the variation of $P_i$ and $Q_i$ from the nominal value of 1/2. Note that, if the game is truly random, then in the new coordinate system, $E[S_i] = E[T_i] = 0$; however, if the outcome of the game is determined by skill, then $E[T_i] = 0$ and $E[S_i] = (2w_i - 1)/\sqrt{2}$, where $w_i$ varies according to the skill level of the individual player. In games that combine skill and chance, we expect the measured values to lie somewhere between these two extremes.
To characterize the role of skill in determining the outcome of the game, we compute weighted variances in $S$ and $T$ over the aggregate of all players, denoted $A$ and $B$, respectively, where $\theta_{S_i}$ and $\theta_{T_i}$ are weighting functions that reflect our confidence in the $i$th data point. Finally, we define the quantity
(7) $R^* = 1 - \frac{B}{A}$,
which provides a single metric to quantify the relative roles of skill and chance in determining the outcome of the game. For games that are truly random, $E[R^*] = 0$; for games that are purely skill-based, $E[R^*] = 1$.
To estimate the distribution, expected value, and variance of $R^*$ from real data, we first compute the sample mean estimates $\hat{p}_i$ and $\hat{q}_i$ for each player, associated with the win fraction in the first and second halves of the season, respectively, using $x_{ij}$.
Next we model the skill of each player, as reflected by the win fraction, by a normal distribution. Toward this end, we only consider players that satisfy the condition under which the binomial distribution may be well approximated by a normal distribution, namely, (9). In this case, the variance in the half-season win fraction of the $i$th player can be approximated by (10). Note that if skill is a persistent quality that is intrinsic to the player, then the win fraction in the first and second halves of the season should be equal (namely, $\hat{p}_i$ and $\hat{q}_i$ should both approach $w_i$ as the number of games per player becomes large) and players can be represented by points along the diagonal as sketched in Figure 3 (left).
We can now directly compute the $R^*$ value associated with FanDuel data. At this point we are not computing a distribution, but rather the particular instance of $R^*$ that was observed in the 2013/14 and 2014/15 seasons. Rotating $\hat{p}_i$ and $\hat{q}_i$ as defined above to shift to $\hat{S}_i$ and $\hat{T}_i$ coordinates, and weighting the $i$th data point by the inverse variance, $\theta_{S_i} = \theta_{T_i} = 1/\sigma_i^2$, we compute $\hat{A}$ and $\hat{B}$ (11) and find $R^* = 1 - \hat{B}/\hat{A}$. This observed value of $R^*$ for the 2013/14 and 2014/15 NBA, MLB, NFL, and NHL FanDuel players is shown by the solid black lines in Figure 4. In each plot we consider a range of playing populations defined by the minimum number of games per player represented along the horizontal axis (e.g., if the minimum number of games is 100, then we discard players who have played 99 games or fewer). The dashed line (right vertical axis) represents the number of players in each population, which is surprisingly well approximated by an exponential distribution in all four sports, suggesting that the measured value of $R^*$ is dominated by players who have played the fewest games in the sample.
While this calculation serves as a starting point to estimate the role of skill in a particular FanDuel season, ideally we would like to estimate the expected value of $R^*$ (and the variance, etc.) if a large number of seasons were played with the same player population. Toward this end, we model the win fraction of each player in the first and second halves by random variables $P_i \sim N(w_i, \sigma_i^2)$ and $Q_i \sim N(w_i, \sigma_i^2)$, where $N(\mu, \sigma^2)$ indicates a normal distribution with mean $\mu$ and variance $\sigma^2$. Applying the rotated coordinate system transform to obtain the estimated win fractions as before, we find that $S_i$ and $T_i$ are also normally distributed random variables with means given by $\mu_{S_i} = (2w_i - 1)/\sqrt{2}$ and $\mu_{T_i} = 0$. The quantities $\hat{A}$ and $\hat{B}$ can now be computed as in (11) with $\hat{S}_i$ and $\hat{T}_i$ drawn from the specified normal distribution.
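The direct computation of $R^*$ from observed half-season win fractions can be sketched as below. Several choices here are assumptions made for illustration and are not taken from the paper: the weighted sums are taken as $\hat{A} = \sum_i \hat{S}_i^2/\sigma_i^2$ and $\hat{B} = \sum_i \hat{T}_i^2/\sigma_i^2$, the half-season variance is approximated by the binomial form $\sigma_i^2 = \hat{w}_i(1-\hat{w}_i)/(n_i/2)$ with $\hat{w}_i = (\hat{p}_i + \hat{q}_i)/2$, and players with zero estimated variance are skipped. The function name `r_star` is hypothetical.

```python
from math import sqrt


def r_star(players):
    """Estimate R* = 1 - B/A from (p_hat, q_hat, n_i) triples.

    p_hat, q_hat : first- and second-half win fractions of one player
    n_i          : total number of games played by that player
    """
    a = b = 0.0
    for p, q, n in players:
        w = 0.5 * (p + q)
        var = w * (1.0 - w) / (n / 2.0)  # assumed binomial half-season variance
        if var <= 0.0:
            continue  # weight 1/var undefined; skip this player
        s = (p + q - 1.0) / sqrt(2.0)    # coordinate along the diagonal
        t = (q - p) / sqrt(2.0)          # coordinate orthogonal to it
        a += s * s / var
        b += t * t / var
    if a == 0.0:
        raise ValueError("A is zero; R* is undefined for this population")
    return 1.0 - b / a


# Players lying exactly on the diagonal (perfect persistence) give R* = 1.
persistent = r_star([(0.6, 0.6, 100), (0.4, 0.4, 100), (0.7, 0.7, 200)])

# A player whose S and T deviations are equal in magnitude gives R* = 0.
balanced = r_star([(0.7, 0.5, 100)])
```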
In practice, we may estimate the distribution of $R^*$ using a Monte Carlo approach. For each player we can generate a "season" (i.e., $x_{ij}$'s) with the constraint that the simulated season must have the same $n_i$ and $w_i$ as the real player's season. From this data we can compute a specific instance of $R^*$. This process is repeated to construct a distribution which can then be used to estimate quantities of interest, such as the mean, confidence region, etc. Results from the Monte Carlo simulations for NBA, MLB, NFL, and NHL FanDuel players are shown in blue filled symbols in Figure 4. The blue dots represent the mean value of $R^*$ computed for 100 Monte Carlo seasons. Error bars on the symbols represent the standard deviation across the 100 trials.
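A Monte Carlo estimate of this kind can be sketched as follows. The sketch interprets "same $n_i$ and $w_i$" as drawing each game as an independent Bernoulli trial with success probability $w_i$, which is one possible reading and an assumption here; it also reuses the assumed inverse-variance form of $\hat{A}$ and $\hat{B}$ and the assumed binomial variance. The names `r_star` and `simulate_r_star` and the two synthetic populations are hypothetical.

```python
import random
from math import sqrt


def r_star(halves):
    """R* = 1 - B/A with assumed inverse-variance weights (see lead-in)."""
    a = b = 0.0
    for p, q, n in halves:
        w = 0.5 * (p + q)
        var = w * (1.0 - w) / (n / 2.0)  # assumed binomial variance
        if var <= 0.0:
            continue
        a += ((p + q - 1.0) / sqrt(2.0)) ** 2 / var
        b += ((q - p) / sqrt(2.0)) ** 2 / var
    if a == 0.0:
        raise ValueError("A is zero; R* is undefined")
    return 1.0 - b / a


def simulate_r_star(players, trials, rng):
    """players: list of (w_i, n_i). Returns one R* per simulated season."""
    results = []
    for _ in range(trials):
        halves = []
        for w, n in players:
            h = n // 2  # games per half season
            # Assumption: each game is an independent win with probability w.
            p = sum(rng.random() < w for _ in range(h)) / h
            q = sum(rng.random() < w for _ in range(h)) / h
            halves.append((p, q, 2 * h))
        results.append(r_star(halves))
    return results


rng = random.Random(1)
# Synthetic populations: pure luck versus a wide spread of skill.
luck_pop = [(0.5, 200)] * 100
skill_pop = [(w, 200) for w in (0.2, 0.35, 0.5, 0.65, 0.8) for _ in range(20)]
luck_vals = simulate_r_star(luck_pop, 50, rng)
skill_vals = simulate_r_star(skill_pop, 50, rng)
luck_mean = sum(luck_vals) / len(luck_vals)
skill_mean = sum(skill_vals) / len(skill_vals)
```

Under these assumptions the luck population should average close to zero and the skill population close to one; the spread of `luck_vals` and `skill_vals` plays the role of the error bars described above.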
We may also estimate both the expected value and the variance of $R^*$ directly from the player data. The random variable $\hat{A}$ is distributed as a noncentral chi-squared distribution with the parameters $m$ and $\lambda_S = \sum_{i=1}^{m} (\mu_{S_i}/\sigma_i)^2$. Similarly, the random variable $\hat{B}$ is distributed as a noncentral chi-squared distribution with the parameters $m$ and $\lambda_T = \sum_{i=1}^{m} (\mu_{T_i}/\sigma_i)^2$. Hence, the moments of $\hat{A}$ and $\hat{B}$ are known, from which $E[R^*]$ can be computed as (16).
The expected value of $R^*$ computed directly from (16) using FanDuel player data is shown in Figure 4 as solid blue lines.
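As a rough consistency check on the analytic estimate, recall that a noncentral chi-squared variable with $m$ degrees of freedom and noncentrality parameter $\lambda$ has mean $m + \lambda$. The first-order sketch below assumes $\hat{A} = \sum_i (\hat{S}_i/\sigma_i)^2$ and $\hat{B} = \sum_i (\hat{T}_i/\sigma_i)^2$ and replaces the expectation of the ratio by the ratio of expectations; it is an illustration only and need not coincide with (16).

```latex
\begin{align*}
E[\hat{A}] &= m + \lambda_S, \qquad
E[\hat{B}] = m + \lambda_T = m \quad (\text{since } \mu_{T_i} = 0), \\
E[R^*] &= 1 - E\!\left[\frac{\hat{B}}{\hat{A}}\right]
  \approx 1 - \frac{E[\hat{B}]}{E[\hat{A}]}
  = \frac{\lambda_S}{m + \lambda_S}.
\end{align*}
```

This approximation reproduces the two limits stated earlier: $\lambda_S = 0$ (every $w_i = 1/2$) gives $E[R^*] \approx 0$, and $\lambda_S \to \infty$ (a spread of skill with $\sigma_i \to 0$) gives $E[R^*] \to 1$.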
Similarly, the variance can be estimated directly from player data by propagating the uncertainty associated with the fact that each player only plays a finite number of games. This estimate is indicated by the shaded light blue regions in Figure 4. In all four sports, both the analytic estimate and the Monte Carlo simulation provide a reasonable approximation to the FanDuel data for players that have played more than approximately 100 games. For players with fewer games, (16) and the Monte Carlo simulations overpredict the measured value of $R^*$, suggesting that the assumption that players are well represented by a normal distribution with mean $w_i$ and variance $\sigma_i^2$ breaks down for small $n_i$.

A Bayesian Approach.
As a final consistency check, we analyze the data from a Bayesian perspective. First, we associate with any given competition a random variable which characterizes the probability that the competition is a game of luck or a game of skill, with a sample space {Luck, Skill}. Our goal is to infer the probability that the competition is a game of luck (or skill) based on observed data. To this end, we appeal to Bayes' theorem,
(18) $P(\text{Skill} \mid w_i) = \dfrac{P(w_i \mid \text{Skill})\, P(\text{Skill})}{P(w_i)}$,
where $P(\text{Skill} \mid w_i)$ is the conditional probability that the competition is a game of skill given the observed win fraction of the $i$th player, $w_i$; $P(w_i \mid \text{Skill})$ is the probability of observing $w_i$ if the competition is a game of skill (the likelihood); and $P(\text{Skill})$ is the prior probability that the competition is a game of skill. The denominator $P(w_i)$ is the probability of observing $w_i$ regardless of contest type (the evidence) and is given by $P(w_i) = P(w_i \mid \text{Luck}) P(\text{Luck}) + P(w_i \mid \text{Skill}) P(\text{Skill})$. If we take $z_i$ to be the number of wins and $N_i$ to be the number of entries of the $i$th player (note that $N_i$, the number of entries, may be greater than $n_i$, the number of games, since players may submit more than one entry per game), $P(w_i \mid \text{Luck})$ is equivalent to the probability distribution one would observe in a coin flipping competition and is given by the binomial distribution,
$P(w_i \mid \text{Luck}) = \binom{N_i}{z_i} \phi^{z_i} (1 - \phi)^{N_i - z_i}$,
where $\phi = 1/2$ and $\binom{N_i}{z_i}$ is the binomial coefficient. For a game in which the outcome is determined solely by skill, players can, in theory, be ranked according to their skill level. In any match-up, the higher ranked player wins the contest. Hence, the best player wins all of their games, the next best wins all but one, etc., and the probability distribution $P(w_i \mid \text{Skill})$ is uniform. Finally, since there are only two possible outcomes in this formulation (i.e., the sample space is {Luck, Skill}), $P(\text{Skill}) = 1 - P(\text{Luck})$.
Starting with a uniform prior of $P(\text{Luck}) = P(\text{Skill}) = 1/2$, we iteratively evaluate $P(\text{Skill} \mid w_i)$ using (18) and the observed data $\{z_i, N_i\}$ for a randomly picked player until the probability $P(\text{Skill})$ converges (see Figure 5 (left)) to one or zero. As before, we consider different populations defined by the minimum number of games per player. For all four sports, we draw 200 random samples from the relevant population and iteratively compute $P(\text{Skill} \mid w_i)$. We record the outcome of this exercise (a one or zero), and repeat the process 50 times. We then average over all 50 outcomes to compute a mean $P(\text{Skill})$. This process is repeated 50 times, which allows us to compute not only the mean of $P(\text{Skill})$, but also the standard deviation of our computed mean as represented by the error bars in Figure 5 (right). In all cases we find that $P(\text{Skill})$ converges to 1 if the minimum number of games per player is sufficiently large. Hence, for each sport there exists a transition game number, $N_{TG}$, above which the outcome of the game is definitively determined by skill, i.e., for playing populations in which every player has played more games than $N_{TG}$, $P(\text{Skill})$ converges to 1. For all four sports in our FanDuel dataset, $P(\text{Skill})$ is always greater than 1/2; furthermore, for all NBA populations, $N_{TG} \approx 1$, suggesting that skill is always the dominant factor in determining the outcome of these contests regardless of how many games are played.
Although one should not necessarily expect a one-to-one mapping between the Bayesian approach and the previous analysis, we can check that trends and features are consistent across both methods. First, the ordering is as we would expect, with fantasy basketball being the most skill-based and fantasy hockey being the closest of the four to chance. More quantitatively, we can estimate the minimum number of games required to cross over into the skill-dominated region from Figure 4 and compare those numbers with $N_{TG}$ from Figure 5 (right). If we take $R^* = 1/2$ as the critical cross-over value, we would expect to see $N_{TG,\mathrm{MLB}} \approx 20$, $N_{TG,\mathrm{NBA}} \approx 5$, $N_{TG,\mathrm{NFL}} \approx 25$, and $N_{TG,\mathrm{hockey}} \approx 40$. These numbers for NBA, MLB, and NHL are surprisingly close across the two methods; for NFL, the $R^*$ calculation is slightly more conservative, overestimating $N_{TG}$ by approximately 10 games.
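The iterative Bayesian update can be sketched as follows. The luck likelihood is the binomial distribution with $\phi = 1/2$ as described above; the skill likelihood is taken to be uniform over the $N_i + 1$ possible win counts, which is an assumption about the exact form of the uniform distribution. The function names and the sample records are hypothetical.

```python
from math import comb


def update_skill(prior_skill, z, n):
    """One application of Bayes' theorem for a player with z wins in n entries."""
    like_luck = comb(n, z) * 0.5 ** n   # binomial likelihood with phi = 1/2
    like_skill = 1.0 / (n + 1)          # assumed uniform over z = 0, ..., n
    evidence = like_luck * (1.0 - prior_skill) + like_skill * prior_skill
    return like_skill * prior_skill / evidence


def posterior_skill(records, prior=0.5):
    """Fold a sequence of (z_i, N_i) records into a posterior P(Skill)."""
    p = prior
    for z, n in records:
        p = update_skill(p, z, n)  # the posterior becomes the next prior
    return p


# Hypothetical players who each win 80 of 100 entries: far from coin flipping.
strong = posterior_skill([(80, 100)] * 5)

# Hypothetical players who each win exactly half of their entries.
even = posterior_skill([(50, 100)] * 20)
```

In this toy setting the posterior moves quickly toward one for the first population and toward zero for the second, which is the kind of convergence to one or zero described in the text.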

Perspectives on the Relative Roles of Skill and Chance in Games and
Other Activities.
The outcomes of these tests leave no doubt that skill plays a role in the outcome of fantasy sports competitions. However, it is useful to add perspective to these results by considering the relative roles of skill and luck in the context of other activities. The third test, persistence, can easily be applied to data from a variety of activities, as shown in Figure 6, which quantifies R* values for the four fantasy sports discussed in this study; real MLB, NBA, NFL, and NHL athletic competitions; cyclocross racing; coin flipping; and mutual funds.
The R* values for real MLB, NBA, NFL, and NHL athletic competitions were computed using publicly available data from the past five seasons (2010-2014). Each point in the p_i versus q_i plot represents the win fraction for the first half versus the win fraction for the second half of a single season for a particular professional sports team (hence, each team is represented by five points on the p-q scatter plot, corresponding to the five seasons we considered). As one might expect, we find basketball at the skill end of the spectrum, since there are many games in a basketball season and many scoring opportunities per game. Hence, small differences in skill are amplified over the course of a season and strong teams tend to come out on top. At the other end of the sports cluster we find hockey, which typically has a small number of goals per game, and hence one "lucky shot" can make a big difference.
While the R* values for most of the sports and fantasy sports are consistent with our expectations, there is one point that is somewhat puzzling. Note that the members of each sport and fantasy sport pair lie relatively close to one another. The exception is professional football. According to the computed R* value, professional football contests are largely determined by skill. This is somewhat surprising, since there are only 16 games per team in the NFL season and the number of scoring opportunities per game is limited (compared to, e.g., basketball). We currently have no explanation for this and we leave it as a puzzle for future investigation.
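The first-half versus second-half win fractions described above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function name, the 0/1 encoding of losses and wins, and the 16-game sample season are all assumptions made for the example, and the step from (p_i, q_i) pairs to R* is not shown.

```python
from typing import Sequence, Tuple


def half_season_win_fractions(results: Sequence[int]) -> Tuple[float, float]:
    """Return (p, q): win fractions for the first and second halves of a season.

    `results` is a chronological sequence of 1 (win) and 0 (loss).
    With an odd number of games the extra game is assigned to the
    second half (an arbitrary choice made for this sketch).
    """
    if len(results) < 2:
        raise ValueError("need at least two games to form two halves")
    mid = len(results) // 2
    first, second = results[:mid], results[mid:]
    return sum(first) / len(first), sum(second) / len(second)


# Hypothetical 16-game season: 6 wins in the first 8 games, 4 in the last 8.
season = [1, 1, 0, 1, 1, 0, 1, 1, 0, 1, 0, 1, 0, 1, 1, 0]
p, q = half_season_win_fractions(season)
print(p, q)  # 0.75 0.5
```

Each team-season contributes one such (p, q) point to the scatter plot; a team whose results are driven by persistent skill should land near the diagonal p = q.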

To compute the R* values associated with cyclocross racing, we considered the finishing places of the top 30 performers using publicly available data from crossresults.com for 2015. For each athlete, p_i and q_i values were computed using the average finishing place for the first and second halves of the season, i.e., for each race, first place corresponds to 1, second place corresponds to 2, etc. Note that each athlete participates in a different set of races (although elite performers tend to all participate in a subset of key events). These p_i and q_i values were again rotated and used to compute R* (taking care to include E[S], which is not zero in this case), as shown in Figure 6.
The coin flipping data point was computed using a simulation of a population of 100 players flipping 100 coins. The average R* value over 100 trials (which, not surprisingly, is approximately zero) is shown in Figure 6.
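A rough sketch of the coin-flipping experiment is given below. It assumes that each of the 100 players flips 100 fair coins and that the flips are split into two halves of 50 to form p_i and q_i. Because the definition of R* is not restated in this section, the sketch reports the ordinary Pearson correlation between p and q as a stand-in measure of persistence; it is not the R* metric itself. The function names and the seed are illustrative.

```python
import random
from math import sqrt


def pearson(x, y):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def coin_flip_trial(rng, n_players=100, n_flips=100):
    """One trial: every player flips fair coins; correlate the two halves."""
    half = n_flips // 2
    p, q = [], []
    for _ in range(n_players):
        flips = [rng.random() < 0.5 for _ in range(n_flips)]
        p.append(sum(flips[:half]) / half)               # first-half "win" fraction
        q.append(sum(flips[half:]) / (n_flips - half))   # second-half "win" fraction
    return pearson(p, q)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
trials = [coin_flip_trial(rng) for _ in range(100)]
mean_corr = sum(trials) / len(trials)
print(f"mean first-half/second-half correlation: {mean_corr:.3f}")
```

With no skill in the population, a player's first-half record carries no information about the second half, so the averaged correlation sits close to zero, in line with the near-zero value reported for coin flipping.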
As we have seen in the fantasy sports data, the measured value of R* depends on the number of games associated with each player, n_i. For real sports it is easy to select a representative value of n_i, since each team plays the same number of games in a season. For fantasy sports, the choice is less obvious, since each player plays in a different number of contests. To estimate the characteristic number of games per player, we seek an appropriate average. Because the number of players who have played n_i games decays approximately exponentially (as shown in Figure 4), computing a simple average is not ideal, since the signal is swamped by players who have only played a few games. To address this we computed a logarithmically weighted average in which G_max is the maximum number of games played by any player in the playing population and m_j is the number of players who have played j games. Using this weighted average to compute a characteristic number of games played in each sport, we find n_MLB = 110, n_NBA = 82, n_NFL = 34, and n_NHL = 60. These values were used to compute the values of R* for the fantasy sports shown in Figure 6.
Finally, we consider mutual funds. Mutual funds, investment programs controlled by (perhaps skillful) managers, have long been considered strategies that can beat the market while limiting risk. Savvy investors are constantly evaluating whether skilled money managers produce sufficient returns to justify their cost. Previous studies have found that chance certainly plays a role in manager performance, but placement on the skill-luck spectrum ranges from predominantly chance [7] to a more balanced skill-chance split [17]. In the context of the current study, a mutual fund manager, similar to a player in fantasy sports, must decide how to allocate funds to achieve optimal performance by identifying value in an open market. A fantasy sports player picks athletes who are projected to yield the most points relative to their price; a mutual fund manager invests in companies that they project to yield returns higher than their trading price. Hence, we can compute R* for mutual fund managers using market-adjusted mutual fund performance data (i.e., the performance of each fund was evaluated relative to the performance of the overall market) from Wharton Research Data Services (WRDS) from the past ten years (2005-2015). Here, p_i and q_i for each mutual fund were computed for the first and second halves of the year, respectively, for each of the 10 years. We considered 44,938 distinct mutual funds with a total of 307,471 "entries." The results are shown in Figure 6. This estimate yields a diplomatic answer that splits the difference between previous chance-dominant calculations (R* ≈ 0) [7] and previous skill-chance balanced results (R* ≈ 0.5) [17].

Speculations on Game Design.
To some extent, as beautifully articulated by Clauset, Kogan, and Redner, it is the artful balance of skill and chance that makes sports so compelling:
"On one hand, events within each game are unpredictable, suggesting that chance plays an important role. On the other hand, the athletes are highly skilled and trained, suggesting that differences in ability are fundamental. This tension between luck and skill is part of what makes these games exciting for spectators . . ." [3]
Striking this balance is essential in the design of any competition. In any type of game there are a number of strategies a game designer can adopt to adjust the relative importance of luck in the outcome. First, let us consider the effect of the distribution of skill within the playing population. To illustrate the importance of skill distribution, consider a professional golfer playing against a novice versus two professionals (or two novices) playing one another. In the first case the outcome is a near certainty, since the skill of the professional will dominate. In the second case, if the ability of the two players is similar, skill is no longer a distinguishing characteristic and the relative importance of chance in the outcome increases [19,20]. Hence, tournaments that are divided up into classes of different skill levels (e.g., having beginners play in a separate pool) are likely to have a larger element of luck than those in which everyone plays in the same pool.
The second game parameter that game designers may choose to adjust is the number of contests per player. Calculating the overall win probability in a best-of-seven series given the win probability of an individual game is a common exercise assigned in elementary probability courses, and it is well known that the role of skill is amplified through multiple contests. In the words of Levitt, Miles, and Rosenfield, "Even tiny differences in skill manifest themselves in near certain victory if the time horizon is long enough" [15]. Hence, perhaps the simplest way to increase the role of skill in a contest is to increase the number of games per player in the competition.
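The best-of-seven exercise mentioned above can be worked out directly. The sketch below assumes independent games with a fixed single-game win probability p, an idealization that real series need not satisfy; the function name is illustrative. A series is won on the game in which the fourth win arrives, with k = 0, 1, 2, or 3 losses before it.

```python
from math import comb


def series_win_probability(p: float, wins_needed: int = 4) -> float:
    """Probability of winning a first-to-`wins_needed` series.

    Assumes independent games, each won with probability p.
    The final game must be a win, preceded by (wins_needed - 1) wins
    and k losses in any order, for k = 0 .. wins_needed - 1.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    q = 1.0 - p
    return sum(
        comb(wins_needed - 1 + k, k) * p**wins_needed * q**k
        for k in range(wins_needed)
    )


for p in (0.50, 0.55, 0.60):
    print(f"single game {p:.2f} -> best of seven {series_win_probability(p):.4f}")
# single game 0.50 -> best of seven 0.5000
# single game 0.55 -> best of seven 0.6083
# single game 0.60 -> best of seven 0.7102
```

A five-point edge in a single game grows to roughly an eleven-point edge over a best-of-seven series, and increasing `wins_needed` pushes the stronger side's series win probability further toward one.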
Finally, game designers can address the balance of skill and chance head-on by adjusting the role of chance as reflected in the rules of the game. For example, in a fantasy sports salary cap game, one of the parameters that can be tuned by game designers is athlete pricing. It is interesting to note that more accurate pricing algorithms push games toward the luck end of the spectrum. Consider the extreme case of perfect pricing, in which the price of each athlete exactly mirrors that athlete's expected payoff. In this case, there is no strategy in assembling a lineup (other than to get as close to the salary cap as possible) and the outcome of the fantasy game is determined purely by luck. However, as the pricing becomes less accurate (i.e., less reflective of the expected payoff), skilled fantasy players can capitalize on undervalued athletes. Hence, to increase the role of skill in a fantasy competition, game designers could either add random noise to their pricing algorithms or increase points awarded for less-frequent, larger-variance events such that "perfect pricing" is inherently more difficult.
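The perfect-pricing argument can be illustrated with a toy calculation. Everything in the sketch below is hypothetical: the pool of 20 athletes, their expected points, the dollars-per-point rate, and the ±20% pricing noise are invented for the example and are not FanDuel's pricing model. The quantity of interest is expected points per dollar, which is what a skilled lineup builder would try to exploit.

```python
import random


def value_per_dollar(expected_points, prices):
    """Expected fantasy points bought per dollar of salary, per athlete."""
    return [e / c for e, c in zip(expected_points, prices)]


def spread(values):
    """Gap between the best and worst bargain in the pool."""
    return max(values) - min(values)


rng = random.Random(1)  # fixed seed so the sketch is repeatable
expected = [rng.uniform(5.0, 40.0) for _ in range(20)]  # hypothetical expected points
rate = 200.0  # hypothetical dollars charged per expected point

# Perfect pricing: price exactly proportional to expected payoff.
perfect_prices = [rate * e for e in expected]
# Imperfect pricing: the same prices perturbed by up to +/-20%.
noisy_prices = [rate * e * rng.uniform(0.8, 1.2) for e in expected]

perfect_spread = spread(value_per_dollar(expected, perfect_prices))
noisy_spread = spread(value_per_dollar(expected, noisy_prices))
print(f"value spread, perfect pricing: {perfect_spread:.2e}")
print(f"value spread, noisy pricing:   {noisy_spread:.2e}")
```

Under perfect pricing every athlete returns the same expected points per dollar, so any lineup that spends the full cap has the same expected score and only chance separates the entrants. Once prices are noisy, some athletes become bargains and identifying them is where skill enters.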
Appendix A. FanDuel Data, Rules, and Scoring.
The number of players and the range of entries per player in each group in the expected payoff calculation are summarized in Table 4. Roster positions and scoring for the FanDuel 2014 season are summarized in Tables 1, 5, and 6, respectively.

Fig. 2 Comparison of scores from lineups constructed by real FanDuel players (light gray) with lineups from the position-weighted Monte Carlo simulation (dark gray). Resulting distributions are fit with a normal distribution to estimate confidence intervals on user win probabilities.

Fig. 4 (Left) Scatter plot of P versus Q for FanDuel players who have played a minimum of G_min games. To improve readability in the plots, G_min was selected for each sport to display approximately 5000 data points: G_min,MLB = 85, G_min,NBA = 75, G_min,NFL = 25, G_min,NHL = 20. Each circle represents a single FanDuel player; the size of the circle represents the number of games played. (Right) R* value calculated from FanDuel data (solid black line), expected value of R* from Monte Carlo simulations (blue filled circles; error bars represent standard deviation across Monte Carlo trials), computed error (blue shaded region), and number of players in the FanDuel population (dashed line). The vertical dotted line represents the number of games in the corresponding professional sports season and the horizontal dotted line represents the R* value calculated for the 2010-2015 seasons corresponding to the relevant professional league.

Fig. 5 (Left) Sample evolution of P(Skill) as more player data are added to update the prior for players who have played at least 10 fantasy MLB games. (Right) The probability that the outcome of the contest is determined by skill, P(Skill), as a function of the minimum number of games in the playing population. Data are shown for all four sports: MLB, NBA, NFL, and NHL.

Fig. 6 R* values computed for fantasy sports, real sports, cyclocross racing, coin flipping, and mutual funds. Games of pure luck lie on the left and games of pure skill lie on the right. Blue sports icons represent fantasy sports and black sports icons represent professional sports.

Table 2 Overview of the FanDuel dataset. Note that players (plrs) may have multiple entries for each game.

Table 3 Summary of win probability and confidence interval (CI) of user-generated rosters versus Monte Carlo-generated rosters.

Table 4 Summary of average win fraction, number of players, and the range of number of entries per player in each group as computed in section 2.1.
Downloaded 12/04/18 to 18.51.0.96. Redistribution subject to CC BY license. © 2018 SIAM. Published by SIAM under the terms of the Creative Commons 4.0 license.