S.L. Benfica—Portugal’s top football team and one of the best teams in the world—makes as much money from carefully nurturing, training, and selling players as actually playing football. Football teams have always sold and traded players, of course, but Sport Lisboa e Benfica has turned it into an art form: buying young talent; using advanced technology, data science, and training to improve their health and performance; and then selling them for tens of millions of pounds—sometimes as much as 10 or 20 times the original fee.
Let me give you a few examples. Benfica signed 17-year-old Jan Oblak in 2010 for €1.7 million; in 2014, as he blossomed into one of the best goalies in the world, Atlético Madrid picked him up for a cool €16 million. In 2007 David Luiz joined Benfica for €1.5 million; just four years later, Luiz was traded to Chelsea for €25 million and player Nemanja Matic. Then, three years after that, Matic returned to Chelsea for another €25 million. All told, S.L. Benfica raised more than £270 million (€320m) from player transfers over the last six years.
At Benfica’s Caixa Futebol Campus there are seven grass pitches, two artificial fields, an indoor test lab, and accommodation for 65 youth team members. With three top-level football teams (SL Benfica, SL Benfica B, and SL Benfica Juniors) and other youth levels below that, there are over 100 players actively training at the campus—and almost every aspect of their lives is tracked, analysed, and improved by technology. How much they eat and sleep, how fast they run, tire, and recover, their mental health—everything is ingested into a giant data lake.
With machine learning and predictive analytics running on Microsoft Azure, combined with Benfica’s expert data scientists and the learned experience of the trainers, each player receives a personalised training regime where weaknesses are ironed out, strengths enhanced, and the chance of injury significantly reduced.
Sensors, lots of sensors
Before any kind of analysis can occur, Benfica has to gather lots and lots of data—mostly from sensors, but some data points (psychology, diet) have to be surveyed manually. Because small, low-power sensors are a relatively new area with lots of competition, there’s very little standardisation to speak of: every sensor (or sensor system) uses its own wireless protocol or file format. “Hundreds of thousands” of data points are collected from a single match or training session.
Processing all of that data wouldn’t be so bad if there were just three or four different sensors, but we counted almost a dozen disparate systems—Datatrax for match day tracking, Prozone, Philips Actiware biosensors, StatSports GPS tracking, OptoGait gait analysis, Biodex physiotherapy machines, the list goes on—and each one outputs data in a different format, or has to be connected to its own proprietary base station.
Benfica uses a custom middleware layer that sanitises the output from each sensor into a single format (yes, XKCD 927 is in full force here). The sanitised data is then ingested into a giant SQL data lake hosted in the team’s own data centre. There might even be a few Excel spreadsheets along the way, Benfica’s chief information officer Joao Copeto tells Ars—“they exist in every club,” he says with a laugh—but the club is in the process of moving everything to the cloud with Dynamics 365 and Microsoft Azure.
Once everything is floating around in the data lake, maintaining the security and privacy of that data is very important. “Access to the data is segregated, to protect confidentiality,” says Copeto. “Detailed information is only available to a very restricted group of professionals.” Benfica’s data scientists, who are mostly interested in patterns in the data, only have access to anonymised player data—they can see the player’s position, but not much else.
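As a rough illustration of the two ideas above (sanitising sensor output into a single format, then exposing only anonymised records to data scientists), here is a minimal Python sketch. The record layout, the two input formats, and every function name are invented for illustration; Benfica did not disclose how its middleware or schema actually work.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Reading:
    """Common record that every (hypothetical) sensor adapter emits."""
    player_id: str
    position: str
    metric: str
    value: float
    unit: str
    recorded_at: datetime


def from_gps_csv_row(row: str) -> Reading:
    """Adapter for an invented GPS export: 'player;position;epoch_s;speed_kmh'."""
    player_id, position, epoch_s, speed_kmh = row.split(";")
    return Reading(
        player_id=player_id,
        position=position,
        metric="speed",
        value=float(speed_kmh) / 3.6,  # convert km/h to m/s
        unit="m/s",
        recorded_at=datetime.fromtimestamp(int(epoch_s), tz=timezone.utc),
    )


def from_gait_dict(payload: dict) -> Reading:
    """Adapter for an invented gait-analysis payload (stride length in cm)."""
    return Reading(
        player_id=payload["athlete"],
        position=payload["role"],
        metric="stride_length",
        value=payload["stride_cm"] / 100.0,  # convert cm to m
        unit="m",
        recorded_at=datetime.fromisoformat(payload["time"]),
    )


def anonymise(reading: Reading) -> dict:
    """View for data scientists: keeps the position, drops the player identity."""
    return {
        "position": reading.position,
        "metric": reading.metric,
        "value": reading.value,
        "unit": reading.unit,
    }
```

Each sensor system gets its own small adapter, so supporting a new device means writing one more function rather than touching the code that reads from the data lake.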
Players have full access to their own data, which they can compare to team or position averages, to see how they’re doing in the grand scheme of things. Benfica is very careful to comply with existing EU data protection laws, and is ready to embrace the even-more-stringent General Data Protection Regulation (GDPR) when it comes into force in 2018.
A Hawk-Eye camera in 2012, overlooking a football goal line.
The all-seeing eye
One issue with sensors, though, especially the body-worn variety, is that they can be cumbersome. “Athletes are always curious about new gadgets and are keen to experiment,” Copeto tells Ars. “But a problem arises when it becomes an obligation to wear a gadget during training or in the gym.” Usually it’s enough to explain the benefits of being wired up, Copeto says, but Benfica is also looking for new data-collection systems that are less invasive and more reliable—even one of the best football teams in the world can’t prevent batteries from running flat in the middle of a training session.
Computer vision, with cameras tracking players and the ball, is high up the list of technologies that Benfica would like to deploy. Solutions such as Hawk-Eye, which use a number of cameras to create a 3D model of a space and the objects that move through it, are already used extensively in sports such as tennis, cricket, and snooker, and for goal-line disputes in football. GPS and other radio-based technologies are used by most professional football teams, but they’re only good for tracking movement: if you want gait analysis, or how high a player jumps, you either need a bunch of body-worn sensors on every player or a high-resolution computer vision system.
Benfica is also working with local tech companies to develop smaller and/or better sensors and other performance tracking technologies, but the club declined to provide any further details. While Benfica invited us to take a close look at how it does things, many processes within the club remain shrouded in secrecy, including its R&D efforts. That’s hardly surprising given the level of competition between Europe’s football teams, and that Benfica’s profitability is directly tied to the performance—or rather the relative performance increase—of its players.
In broad strokes, though, it seems Benfica’s current focus is on increasing the accuracy, resolution, and reliability of the data gathered, rather than simply adding more and more sources to the data lake.
Much like when I embedded with the Renault Sport Formula 1 team, it’s clear that Benfica is still at the beginning of the machine learning and data science runway—but I don’t mean that in a negative way. Quite the opposite. This domain is so new that everyone is simply learning as they go—and yet, despite the domain’s nascence, it’s already delivering big improvements in performance and reliability. Imagine what might be possible as machine learning and data science begin to mature, and engineers and trainers and designers learn to make full use of the artificially intelligent resources available to them.
“We’re experimenting with the tools, and finding out the correct measures, and the correct KPIs to use, to correlate the data,” says Copeto. “It’s a learning process for us. Data science and machine learning is a new discipline, and it’s not easy to find people who know how to use these tools, and how to integrate them with our day-to-day work. There’s a gap here where we need to train people—but because we work so closely with the sports scientists, we understand the criteria that they want to explore. We are moving in the right direction.”
One of the issues that Benfica had to overcome was the integration of its own custom-built system (the aforementioned locally hosted data lake) with new cloud-based tools such as Azure Machine Learning (AML). “Our system already has a lot of data and analysis done by us,” says Copeto. “The idea is to look at new tools, like AML, and see how they can complement what has already been done within our own infrastructure.”
To begin with, Benfica is mostly looking towards machine learning as a way of validating the stress and fatigue models that the squad has developed through almost 10 years of data collection, data science, and trial and error.
The ultimate goal, though, is to develop an accurate injury prediction model—a tool that can predict how far a player can push themselves before they suffer an injury, which type of injury is likely to be sustained, and how long that player might be out of commission. From there, players can receive a personalised training programme that hopefully reduces the risk of injury. On the coaching side, the same data might be used to keep an at-risk-of-injury player on the bench for a couple of games, or to switch out the player mid-game before they get physically exhausted.
Does the tech actually reduce injuries?
While researching this story it became clear that actually putting a number, or some other metric, on the efficacy of these techniques would be hard. This is partly down to the aforementioned inchoate nature of data science, but the bigger problem is that these advances and investments don’t occur in a vacuum: there are millions of variables to be accounted for, and Benfica is only tracking a few dozen.
For example, about 10 years ago, before Benfica started collecting data, the first team suffered eight major injuries in one season. The following season, after collecting some data and applying some academically published heuristics, the number of major injuries dropped to three. Unsurprisingly, Benfica took one look at those numbers, picked up the data ball, and ran with it, investing millions in its data science programme. Fast forward to the current season, though, and there’s been a spike in the number of injuries.
“It’s an anomaly,” says Sudarshan “Sudz” Gopaladesikan, a programme manager at Microsoft who supports Benfica’s efforts with Azure. “They still need to figure out why that was the case.”
Copeto adds a little more detail: “Most of the injuries have been traumatic, where somebody gets hurt during a play, which we can’t predict.” In those cases it’s about making sure that the players recover as quickly as possible. Data science plays a role there, too, “but it’s important to note that it isn’t just about technology,” Copeto says. “It’s the physiologists, the scientists, the medicine. There are lots of new treatments that we didn’t have before. It’s all of these things combined that give us a competitive advantage.”
One of the requirements of being in the UEFA Champions League is that teams have to report every major injury. At the end of the year, Benfica receives a report from UEFA that compares their performance and injury incidence with other teams in the league. UEFA also produces an annual “elite club” injury report, with anonymised data from most of Europe’s top teams.
Bruno Mendes, head of human performance at Benfica, co-authored a research paper that was published in the Journal of Science and Medicine in Sport in 2016. The researchers found that a high acute:chronic workload ratio—that is, one-week bursts of activity (acute) that are much higher than the rolling four-week average (chronic)—resulted in a significantly higher risk of injury. A similar study of rugby players, published in the British Journal of Sports Medicine, reached the same conclusion.
Basically, if players increase their chronic workload (i.e. train more between matches), they are less likely to injure themselves. Turning that around, Benfica’s trainers could monitor a player’s acute workload—which varies a lot between players—and adjust the chronic training level to match. Or, if a player’s chronic workload drops for any reason—due to illness, travelling, distraction, whatever—then the coach could decide to limit the player’s acute workload and pull them from the match-day lineup.
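To make the acute:chronic idea concrete, here is a minimal Python sketch of how such a ratio could be computed from a player’s daily training load. It assumes one load figure per day in arbitrary units, takes “acute” as the total for the last seven days, and takes “chronic” as the average weekly total over the last 28 days. The function name and the example numbers are invented for illustration, and the calculation used in the published study may differ in detail.

```python
from typing import Sequence


def acute_chronic_ratio(daily_loads: Sequence[float]) -> float:
    """Return the acute:chronic workload ratio for the most recent day.

    acute   = total load over the last 7 days
    chronic = average weekly load over the last 28 days
    """
    if len(daily_loads) < 28:
        raise ValueError("need at least 28 days of load data")
    recent = daily_loads[-28:]
    acute = sum(recent[-7:])
    chronic = sum(recent) / 4.0  # four weeks in the 28-day window
    if chronic == 0:
        raise ValueError("chronic workload is zero; ratio is undefined")
    return acute / chronic


# Hypothetical player: three steady weeks, then a week at double the load.
steady = [300.0] * 28
spike = [300.0] * 21 + [600.0] * 7

print(acute_chronic_ratio(steady))  # ratio of 1.0: this week matches the average
print(acute_chronic_ratio(spike))   # ratio above 1: a burst relative to the average
```

A ratio near 1 means the current week looks like the player’s recent norm. A ratio well above 1 is the kind of one-week burst that the research associates with higher injury risk.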
Usually the story would end there. But there’s one other aspect that I thought was interesting, from a technological point of view. Benfica isn’t just a regular Azure customer: rather, Microsoft identified a potential growth area for Azure—data science and machine learning in sports—and then proactively hunted down professional sports teams that might be interested in partnering up.
“A year and a half ago we met Benfica and kicked off an engineering/technology relationship,” says Steve Fox, an Azure software engineering manager. “They had gathered a lot of data over the last eight years. The question they were asking, based on the technology we had and were incubating, was there any way we could help them?
“Benfica knows football very well. We know data science and software engineering very well. How do we meet in the middle and do some magical things? That’s what we’ve been trying to do.”
“This is a new way for us to work with and engage with customers. We admit out of the gate that we’re good at this area, and we would be learning in this area—and hey the same is said back to us! All of a sudden that opens up the journey to be honest, transparent, and really based on engineering and technical discussions, which are very different from the usual sales and marketing cycles that you go through.”
This new approach bears the hallmarks of CEO Satya Nadella, who for the last three years has been trying his best to shake up the status quo in some regions of the crusty Microsoft behemoth. In a similar Nadellaesque vein, Fox also tells me that his team will be open-sourcing some of the software used in the Microsoft-Benfica partnership, “so that the wider community can start using it, all the way from the pros down to the amateurs.”
Fox says the open-sourced tool will eventually accept and sanitise data from a wide range of sensors and other data sources, and then plug into Azure to provide data science and analytics. Obviously Microsoft will make some money from the Azure part of the equation, but still: freely sharing software that improves player performance and reduces the risk of injury is something that the sports community will be very interested in. The first bits of code will be open-sourced on Github soon (we’ll update this story with a link when that happens).
The human element
Benfica is one of the most financially successful teams in the world, with annual revenues of around €150 million and a healthy net profit. That might sound small compared to Manchester United’s €680 million in revenues, but the Portuguese Primeira Liga is a much smaller league than the top flights in England, Germany, Italy, and Spain. Benfica’s primary domestic competitors, Sporting CP and FC Porto, reported revenues of just €69 million and €76 million and net losses of €32 million and €58 million respectively. All three teams have similarly sized stadia, ticket sales, and broadcast licensing deals: it’s Benfica’s mastery of training and selling players that makes all the difference.
Benfica isn’t alone in this practice, of course: most major football clubs have “feeder” or “farm” teams made up of inexperienced or youth players. Some of those players might end up in the top team, but most will be loaned to other teams or sold. Benfica’s process of training gifted, inexperienced humans and selling them at a healthy profit is just a little more formalised than other teams’.
How do the players feel about that? “Athletes are competitive, they’re all fighting with each other to be the best—in a good way,” Copeto says, laughing. “But I think this is one of Benfica’s biggest conquests: the athletes can perceive, and we have been able to demonstrate over the last 10 years, the benefits of collecting the data from them. We can show them that the data has helped them become better, fitter players. Most importantly, we can show how the data helped prevent injuries and improve their career.”