The artificial-intelligence research group OpenAI has announced the development of software that is capable of defeating teams of five experienced human players in the video game Dota 2, marking a significant milestone in the field of computer science.
The accomplishment places San Francisco-based OpenAI, which is backed by billionaire Elon Musk, ahead of other artificial-intelligence researchers in the development of software that can master complex games that combine rapid, real-time action, longer-term strategy, imperfect information, and team play.
The ability to learn these types of video games at human or super-human levels is crucial for the advancement of AI, because they more closely resemble the complexity and uncertainty of the real world than games such as chess, which IBM’s software mastered in the late 1990s, or Go, which was conquered in 2016 by software developed by DeepMind, the London-based AI company owned by Alphabet Inc.
Dota 2, developed by Valve Corp. of Bellevue, Washington, is a multiplayer science-fiction fantasy video game. In the tournament version, two five-player teams are pitted against each other. The teams are assigned bases at opposite ends of a map that can be revealed only through exploration. Each player controls a separate character with its own weapons and abilities. Each team must fight its way into the opposing team’s territory and destroy a structure known as an Ancient.
The game is also one of the most popular and lucrative in professional e-sports, with over 1 million active participants. The prize pool for The International, the game’s premier professional tournament, exceeded $24 million last year, the largest for any e-sport to date.
In mid-June, OpenAI announced that its software had defeated a semi-professional team (ranked among the top 1 percent of Dota 2 players) and an amateur team (ranked in the top 10 percent) in a best-of-three series, each time winning two games to one. Three amateur teams were defeated by OpenAI’s algorithm earlier in the month.
Dota 2 is significantly more complex than chess or Go, in which players take turns and have complete knowledge of the game’s state. At any given moment, a Dota 2 player must choose from an average of approximately 1,000 valid actions, compared with an average of about 250 in Go and 35 in chess. The state of the video game is also represented by approximately 20,000 data points, as opposed to about 400 in Go and 70 in chess.
OpenAI’s software learned exclusively through trial and error while competing against itself. This method, known as reinforcement learning, is frequently compared to the way infants learn. DeepMind also used it to develop its Go-playing AI. The software begins by making random moves and learns to play effectively through a sequence of rewards (typically points in a game environment). Games are frequently used in reinforcement learning research because points provide interim rewards and establish a clear winner and loser.
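The trial-and-error loop described above can be illustrated with a toy example. The sketch below is not OpenAI’s system: it assumes a made-up six-cell corridor in which an agent starts at the left end and earns a reward of 1 for reaching the right end, and it uses tabular Q-learning, a textbook reinforcement learning method chosen here for brevity. All function names and parameter values are hypothetical.

```python
import random

def train_corridor(n_states=6, episodes=500, alpha=0.5, gamma=0.9,
                   epsilon=0.2, seed=0):
    """Tabular Q-learning on a toy corridor (an illustrative assumption).

    The agent starts in cell 0 and receives a reward of 1 only when it
    steps into the last cell. It begins with no knowledge of the layout.
    """
    rng = random.Random(seed)
    # q[state][action]: action 0 = step left, action 1 = step right
    q = [[0.0, 0.0] for _ in range(n_states)]
    goal = n_states - 1
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap on episode length
            # Move at random some of the time, and whenever estimates are tied
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state = max(0, state - 1) if action == 0 else state + 1
            reward = 1.0 if next_state == goal else 0.0
            # The goal cell ends the episode, so it has no future value
            future = 0.0 if next_state == goal else max(q[next_state])
            q[state][action] += alpha * (reward + gamma * future - q[state][action])
            state = next_state
            if state == goal:
                break
    return q

def greedy_policy(q):
    """Preferred move in each non-goal cell according to the learned values."""
    return ["right" if right > left else "left" for left, right in q[:-1]]

if __name__ == "__main__":
    print(greedy_policy(train_corridor()))
```

The agent’s first episodes are random walks; the reward at the far end is what gradually shapes its choices. Dota 2 differs in scale and in that the opponent is another copy of the learner, but the reward-driven update is the same basic idea.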
In this instance, OpenAI used a relatively straightforward reinforcement learning algorithm that was introduced last year. The algorithm encourages the AI to try new things without straying too far from its current approach, which appears to be working. The researchers also gradually increased the interval between the rewards the AI received during training. This was done to encourage the bot, once it had learned the game’s basics, to focus on long-term strategy and ultimate victory rather than short-term payoffs.
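The article does not name the algorithm. One widely used method that fits the description is Proximal Policy Optimization (PPO), published in 2017, whose “clipped” objective limits how far a single update can move the AI away from its current behavior. The sketch below assumes that algorithm and shows the objective for a single decision; the function and parameter names are illustrative, and the clip value of 0.2 is a common default rather than a figure from OpenAI’s Dota 2 work.

```python
def clipped_objective(ratio, advantage, clip=0.2):
    """PPO-style clipped objective for one decision (illustrative sketch).

    ratio: probability of the action under the updated policy divided by
           its probability under the current policy (1.0 means no change).
    advantage: how much better (positive) or worse (negative) the action
               turned out than expected.
    """
    # Keep the ratio within [1 - clip, 1 + clip]
    clipped_ratio = max(1.0 - clip, min(1.0 + clip, ratio))
    # Taking the smaller value removes any incentive to move the policy
    # further than the clip range in a single update
    return min(ratio * advantage, clipped_ratio * advantage)

if __name__ == "__main__":
    for ratio in (0.5, 1.0, 1.5):
        print(ratio, clipped_objective(ratio, advantage=1.0))
```

For an action that worked out well, the objective stops growing once the new policy is about 20 percent more likely to choose it, which is how the method balances experimentation against staying close to what already works.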
In an interview, Greg Brockman, OpenAI’s co-founder and chief technology officer, said that these techniques could indicate significant advancements in the training of robots, self-driving vehicles, stock trading, or any other technology that can be reliably simulated. “Dota demonstrates that the capabilities of contemporary algorithms to address real-world challenges are significantly greater than previously believed,” he said.
Jonathan Schaeffer, an expert on AI and games at the University of Alberta in Edmonton, Canada, believes that the reinforcement learning approach employed by OpenAI has the potential to address real-world problems, particularly those that can be framed as games, such as simulations of politics or business, or military war games.
However, Schaeffer said that the technique’s applications are in practice limited by the amount of data and computing power it requires. “Humans possess the capacity to acquire knowledge with minimal examples,” he said. “Humans also possess the capacity to generalize and learn at a higher level of abstraction than what is currently being demonstrated by computer programs.”
OpenAI used 128,000 computing cores and 256 graphics processing units to train its Dota 2 software. By comparison, the central processing unit in a laptop may have only four cores. Graphics processing units are a powerful type of computer chip that was initially developed to render visuals for video games and animation. Throughout a 19-day training cycle, the software played the equivalent of 180 years’ worth of games against itself each day.
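To put those figures in perspective, the arithmetic can be worked through directly. The short calculation below assumes the 180-years-per-day rate held for all 19 days and uses 365-day years; the variable names are illustrative.

```python
YEARS_OF_PLAY_PER_DAY = 180   # from the article
TRAINING_DAYS = 19            # from the article
DAYS_PER_YEAR = 365           # assumption: ignore leap years

# Total simulated experience over the whole training cycle
total_years = YEARS_OF_PLAY_PER_DAY * TRAINING_DAYS

# How many days of game play were simulated per real day
speedup = YEARS_OF_PLAY_PER_DAY * DAYS_PER_YEAR

print(total_years)  # 3420
print(speedup)      # 65700
```

Under those assumptions, the software accumulated roughly 3,420 years of game experience, playing about 65,700 times faster than a single human could in real time.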
OpenAI is a nonprofit organization founded in October 2015 by Musk; Sam Altman, the president of the Silicon Valley technology incubator Y Combinator; and a group of other PayPal Holdings Inc. alumni. It is committed to the development of “safe” artificial general intelligence and its distribution “as widely and evenly as possible.” Artificial general intelligence denotes software that could match or exceed human intellectual capabilities across a diverse array of tasks, similar to the androids of science fiction films.
OpenAI said it will challenge the top-ranked North American professional Dota 2 team to a match on July 28, which it will livestream. It will then attempt to compete against the world’s most highly rated professionals at The International, which is scheduled to take place in Vancouver, Canada, from August 20 to August 25.
DeepMind and Facebook Inc.’s artificial intelligence research unit are both developing software that can compete with human players in the science fiction real-time strategy video games StarCraft and StarCraft II, which are published by Activision Blizzard Inc. However, neither has yet publicly demonstrated software that can beat competent human players.
OpenAI’s claim to have mastered the five-against-five version of Dota 2 builds on its effort last year to master the simpler one-on-one version. In that effort, OpenAI developed software that defeated one of the world’s most accomplished competitors in a formal demonstration. However, the AI researchers were embarrassed within a few days when amateur players discovered a way to easily defeat the software by confusing it with unconventional tactics that humans typically do not use in real competition.
Schaeffer predicted that reinforcement learning will help move the field closer to artificial general intelligence. At present, most AI systems are “idiot savants” that can solve only one problem. This is also true of the Dota 2 algorithms developed by OpenAI. Once trained, they are proficient in Dota 2, but they cannot apply any of the strategy or tactics they have learned there to other, conceptually similar games.