(Inside Science) -- Despite his superstar status as quarterback of the New England Patriots, Tom Brady's athleticism never turned any heads. At the 2000 NFL Scouting Combine, an annual gathering where players dreaming of playing in the NFL show off their strength, speed and explosiveness in a series of drills, Brady's performance was famously underwhelming. His 40-yard dash time and vertical leap height were mediocre at best, with numbers perhaps more typical for linemen 100 pounds heavier.
In the draft, the New England Patriots picked him in the sixth round, as the 199th player and seventh quarterback chosen.
Yet, nearly 19 years later, on Feb. 3, Brady won his record sixth Super Bowl, prompting many to anoint him the greatest player of all time.
As a new batch of prospects convenes this week in Indianapolis for the 2019 NFL Scouting Combine, teams will once again be scrutinizing every dash, bench press and leap in search of the next Tom Brady -- or, at least, someone who will help them win. But Brady's combine results didn't portend success, and he's far from the only player whose career surpassed his combine performance. At the same time, many other players with standout combine results never succeed in the NFL.
Today's NFL is as sophisticated as ever, with thick, complex playbooks, year-round scouting, and always-improving sports medicine. In recent years, it has also adopted an increasingly statistical approach, paralleling the analytics revolution that has already changed MLB and the NBA. All of which raises the question: Can a few simple drills really predict success in a sport as complex as football?
Finding the measurements that matter
The first combine, held in 1982 in Tampa, Florida, served as a more efficient way for teams to gather and share physical and medical information on prospects. Since then, the combine has grown to include various psychological and physical tests, becoming an event in and of itself -- televised live as a four-day extravaganza for ardent football fans.
Each year, more than 300 prospects undergo six timed or measured drills. The 40-yard dash and the vertical jump are two of the most well-known. In the broad jump, athletes start from a standing position and leap as far ahead as they can. The three-cone drill requires prospects to run around three cones arranged in an L-shape. The shuttle run involves sprinting laterally back and forth. And finally, prospects must bench press 225 pounds as many times as they can.
These drills don't quite replicate what happens on the field, and aren't relevant for all positions. For a quarterback like Brady, whose main job is to stand in the pocket and throw the ball accurately, it doesn't matter so much that he's slow. But the drills do gauge general athleticism, and it's reasonable to think better athletes would likely make better football players.
Undoubtedly, NFL teams and their number crunchers have analyzed the predictive power of the combine. But their results tend to be proprietary, and because data hasn't always been easily available, only a few academic studies have been done. One of the first was published in 2008, when Frank Kuzmits and Arthur Adams at the University of Louisville looked for statistical correlations between success in the NFL -- as measured by games played, statistics, draft positions and salaries -- and the combine data of quarterbacks, running backs, and wide receivers. Other than sprint times for running backs, however, the researchers found no correlation at all.
But their approach might have been limited. Researchers led by sports statistician Masaru Teramoto at the University of Utah used a different approach that enabled them to better determine exactly which combine events have the most predictive power. Their 2016 study found that the time over the first 10 yards of the 40-yard dash was the most predictive of a running back's rushing yards per attempt. For a wide receiver, the factor that most predicted success was simply his height. His leaping ability was also important, as the higher he could jump, the more receiving yards he would gain -- a reasonable link, since receivers often have to leap over defenders to catch the ball.
Most recently, in perhaps the most comprehensive study to date, sports scientists Lisa Vincent, Bryan Blissmer and Disa Hatfield of the University of Rhode Island analyzed the predictive power of the combine not just for wide receivers and running backs, but also for quarterbacks, defensive ends, defensive tackles and linebackers.
Their analysis, published in January in the Journal of Strength and Conditioning Research, used an approach similar to that of Teramoto's study, but didn't look at the bench press or the three-cone drill. The three-cone drill, like the shuttle drill, evaluates agility. The bench press, Hatfield said, isn't relevant for football.
Bench press tests may not mean much
"Football is in no way, shape or form a muscular endurance sport," said Disa Hatfield of the University of Rhode Island. Making a tackle, launching your body to make a block, or making a cut to elude a defender all require muscles to make short, powerful bursts. For these reasons, Hatfield said, being good on the bench press doesn't translate to being a good football player.
What the test can do, though, is reflect a prospect's work ethic, she said. A player who worked hard in the weight room to do well on the bench press would be someone who would likely also work hard for his football team.
Perhaps surprisingly, the analysis revealed that the shuttle drill didn't correlate strongly with anything. But the other events did, modestly predicting success in some metric for each position, like yards gained or tackles made. What especially stood out, Hatfield said, was the usefulness of the often-overlooked broad jump in predicting success for running backs, defensive ends and defensive tackles. Again, this makes intuitive sense, Vincent said, as these players almost always start each play crouched and often with one hand on the ground.
The numbers with predictive power weren't just raw combine scores, either. The researchers combined raw scores with body weight to calculate various types of power, a different metric that complemented the raw scores. Horizontal and vertical power, for example, combine body weight with raw scores from the 40-yard dash and vertical leap, respectively. Both of these numbers pointed to defensive ends who made more sacks and tackles, and quarterbacks who rushed for more yards.
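To give a sense of how a jump height and a body weight can be folded into a single power number, here is a minimal sketch in Python. It uses the Lewis formula, a common textbook estimate of average jump power. This is an assumption made for illustration, since the article does not say which equations the Rhode Island researchers used. The function name and the two example players are hypothetical.

```python
import math

LB_TO_KG = 0.45359237   # pounds to kilograms
IN_TO_M = 0.0254        # inches to meters
G = 9.81                # gravitational acceleration, m/s^2


def lewis_vertical_power_watts(weight_lb, jump_in):
    """Estimate average vertical jump power with the Lewis formula.

    Assumption: this formula stands in for whatever power equation the
    study actually used; it is shown only to illustrate the idea.
    """
    mass_kg = weight_lb * LB_TO_KG
    height_m = jump_in * IN_TO_M
    return math.sqrt(4.9) * mass_kg * math.sqrt(height_m) * G


# Two hypothetical prospects with the same 30-inch vertical leap.
lighter = lewis_vertical_power_watts(200, 30)
heavier = lewis_vertical_power_watts(300, 30)
print(f"200 lb player: {lighter:.0f} W")
print(f"300 lb player: {heavier:.0f} W")
```

Under this estimate, two players with identical raw jump scores end up with different power scores because the heavier player moves more mass through the same height, which is the kind of information a raw score alone would hide.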
Overall, the analysis suggests that the combine can predict about 20-25 percent of a player's future NFL success, Hatfield said.
While these studies used a few years of combine data, David Hedlund, a researcher in sports data and analytics at St. John's University, in Queens, recently compiled 15 years of data. He compared players who earned All-Pro honors, those named to the Pro Bowl, and those who received neither honor. All-Pro players represent the best at their position in a given season, as voted on by the media. Pro Bowl players are chosen by coaches, players and fans to play in an annual all-star game. All-Pro players had higher average combine scores than Pro Bowlers, who in turn had higher average scores than the rest did. Although the study lacked the statistical rigor of others, it at least hints at an apparent association between combine scores and NFL success.
But one of the main limitations of these kinds of studies is how success is measured. Tackles and yards might mean victory in fantasy football, but they don't always reflect a player's skill or value.
"There's no way we can ever account for the complexity of the game," Vincent said. How many yards a running back gains depends on the blocking prowess of his offensive line. A quarterback relies on elusive receivers who can get open. Wide receivers depend on accurate quarterbacks. And then there's the day's game plan, specific game situations, and the team's overall style of play, which all affect the box score.
In recent years, however, teams and analysts have been looking beyond the box score, using increasingly sophisticated numbers that more accurately reflect a player's value. For example, ESPN uses a proprietary metric called total quarterback rating, or QBR, which is based on a formula incorporating a quarterback's passes, turnovers and other statistics.
From the combine to success on the field
Even with better measures of on-field success, the predictive power of the combine is limited. Not only do drills fail to capture the complexity of football, they ignore intangible traits like leadership, drive and "mental makeup" -- the quality the Patriots cited when drafting Brady.
The best predictor of future success in the NFL is still past success in college. "Combine measures are not as relevant as on-field performance measures," said Matt Manocherian, director of research and development and football at Sports Info Solutions. "That's based on any statistical test you can construct."
His company builds statistical models that evaluate players, based primarily on on-field performance -- those advanced stats such as an offensive lineman's failed blocks. But combine data is also added to improve the model.
Combine scores are useful as baselines, said Manocherian, who has previously worked as a scout for the New Orleans Saints and Cleveland Browns. You want to identify prospects with at least some minimum amount of athletic ability, so you can, say, filter out the slowest wide receivers.
But because of the rise of analytics, teams have gotten more statistically sophisticated and better at assessing players. "Because we can evaluate players better," he said, "it allows us to not rely on the combine so much."
For some teams and analysts, however, that might be a mistake. "My bet is that teams probably discount the combine too much," said Brian Burke, a senior analytics specialist at ESPN. Many previous models couldn't find a strong correlation between combine scores and NFL success because of a statistical bias called Berkson's paradox, he said.
Because the combine participants don't represent the general population -- they're already the cream of the crop of football players -- it turns out combine scores and future success can appear to be much less correlated than they actually are, he said. "Quantitative analysts have been discounting the value of the combine for years and years, when the reality is that it can predict career outcome if you analyze it correctly," Burke said.
While such a selection effect could have come into play for his study, there's no good way to account for it, Teramoto said. For the University of Rhode Island analysis, Hatfield said they took care to make sure that this kind of bias was likely not an issue.
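The selection effect Burke describes can be illustrated with a small simulation. The sketch below is a toy model, not a reconstruction of any analyst's method: every number in it is made up, and the assumptions (athleticism and football skill are independent, both drive college and pro performance equally, and only the top 5 percent of college performers are invited) are chosen purely to show the mechanism.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def simulate(n_players=20000, invited_fraction=0.05, seed=0):
    """Compare the athleticism/success correlation before and after selection.

    Hypothetical model: all quantities are standard normal draws.
    """
    rng = random.Random(seed)
    players = []
    for _ in range(n_players):
        athleticism = rng.gauss(0, 1)   # what combine drills would measure
        skill = rng.gauss(0, 1)         # everything else (reads, technique)
        college = athleticism + skill + rng.gauss(0, 1)
        pro = athleticism + skill + rng.gauss(0, 1)
        players.append((athleticism, college, pro))

    # Only the best college performers receive a combine invitation.
    players.sort(key=lambda p: p[1], reverse=True)
    invited = players[: int(n_players * invited_fraction)]

    r_all = pearson([p[0] for p in players], [p[2] for p in players])
    r_invited = pearson([p[0] for p in invited], [p[2] for p in invited])
    return r_all, r_invited


r_all, r_invited = simulate()
print(f"correlation among all players:     {r_all:.2f}")
print(f"correlation among invited players: {r_invited:.2f}")
```

In this toy model the correlation between athleticism and pro success is noticeably weaker among the invited players than in the full pool, even though athleticism matters equally for everyone. Among players who all cleared the same bar, a less athletic invitee tends to have made it on skill instead, which masks the contribution of athleticism.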
It's not just about the measurements
Even after accounting for Berkson's paradox, the combine data shouldn't be relied on too much. "It definitely should not be everything," Burke said.
But as an event and an annual leaguewide convention, its value extends beyond the drills. "The most important purpose of the combine," Manocherian said, "is for the doctor to look at the guys."
The combine also allows teams and coaches to interview prospects and to get to know them as people. "The combine," Hatfield said, "is absolutely necessary because it gives those teams an opportunity to interact with the players."
Once private, the combine is now a highly anticipated event, drawing the most media of any NFL event other than the Super Bowl. It reportedly rakes in more than $8 million in annual revenue for Indianapolis. "I believe today, the combine is taking on more of a marketing role than anything else," Hedlund said.
But for the purposes of evaluating players, the challenge is to better understand combine data and their implications. "I'm less concerned with under or overuse of combine metrics than misuse of combine metrics," Manocherian said. "We need to start understanding what they mean a lot better in terms of how predictive they are when we integrate them with on-field performance data."
And as the league becomes more statistically savvy and continues to analyze the combine and on-field performance, that understanding will grow. "We know a lot more than what we used to know," he said. "But we know nothing compared to what we will know."