Folks, today is a great day in the history of AI development. OpenAI has built a state-of-the-art AI algorithm that guides the learning of other AI agents. In other words, AI just gained more learning autonomy.
If we had to describe this achievement in a single word, that word would be "mind-blowing". We expect this new algorithm to boost AI autonomy and to speed up how quickly AI agents learn.
LOLA — Learning with Opponent-Learning Awareness
As the name of this new algorithm suggests, the supervising algorithm is responsible for discovering mutually beneficial learning strategies while guiding the other AI agents to acquire the knowledge and skills they need to complete a common task. At the same time, the algorithm adapts its own learning strategy accordingly.
LOLA is the very first step towards AI agents that model other minds. The system takes into account the learning of other AI agents when updating its own strategy.
Each LOLA agent adjusts its policy in order to shape the learning of the other agents in a way that is advantageous. This is possible since the learning of the other agents depends on the rewards and observations occurring in the environment, which in turn can be influenced by the agent. […]
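To make the idea of "shaping the learning of the other agents" a bit more concrete, here is a minimal sketch in plain Python. It is not OpenAI's implementation, and the published LOLA update rule may differ in its details. Everything in it is an assumption made for illustration: the toy game (agent 1 wants `x * y` to be large, agent 2 wants it to be small), the function names, the step sizes `lr` and `eta`, and the use of finite differences instead of a proper gradient library. The naive learner treats its opponent as fixed, while the lookahead learner differentiates through the step it expects the opponent to take next.

```python
# Toy two-player game, chosen only because the effect is easy to see.
# It is a hypothetical stand-in, not one of the games from the article.
def v1(x, y):
    return x * y        # payoff of agent 1, who controls x

def v2(x, y):
    return -x * y       # payoff of agent 2, who controls y

def grad(f, t, eps=1e-5):
    """Central finite-difference derivative of a one-argument function f at t."""
    return (f(t + eps) - f(t - eps)) / (2 * eps)

def naive_step(x, y, lr):
    """Each agent ascends its own payoff and treats the other agent as fixed."""
    gx = grad(lambda a: v1(a, y), x)
    gy = grad(lambda b: v2(x, b), y)
    return x + lr * gx, y + lr * gy

def lookahead_step(x, y, lr, eta):
    """Each agent differentiates through the other's anticipated learning step."""
    def y_after_learning(a):
        # Where agent 2 would move next if agent 1 played a.
        return y + eta * grad(lambda b: v2(a, b), y)

    def x_after_learning(b):
        # Where agent 1 would move next if agent 2 played b.
        return x + eta * grad(lambda a: v1(a, b), x)

    gx = grad(lambda a: v1(a, y_after_learning(a)), x)
    gy = grad(lambda b: v2(x_after_learning(b), b), y)
    return x + lr * gx, y + lr * gy

def distance_from_equilibrium(step, n=200, **kwargs):
    """Run n simultaneous updates from (1, 1); return the distance to (0, 0)."""
    x, y = 1.0, 1.0
    for _ in range(n):
        x, y = step(x, y, **kwargs)
    return (x * x + y * y) ** 0.5

naive = distance_from_equilibrium(naive_step, lr=0.1)
aware = distance_from_equilibrium(lookahead_step, lr=0.1, eta=0.5)
print(f"naive learners: {naive:.3f}  opponent-aware learners: {aware:.2e}")
```

In this toy setting the naive learners spiral away from the equilibrium at (0, 0), while the opponent-aware learners settle into it. The only difference between the two is the extra term each agent gets from anticipating how the other one will learn.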
LOLA agents can discover effective, reciprocative strategies in games like the iterated prisoner’s dilemma or the coin game.
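For readers who have not met the iterated prisoner's dilemma, the short Python sketch below shows why reciprocity matters in it. The payoff numbers are assumed for illustration, and tit-for-tat (cooperate first, then copy the opponent's last move) is a classic hand-written reciprocal strategy used here as a stand-in. Nothing in this snippet is learned, and it is not LOLA itself.

```python
# Payoffs as (player A, player B); higher is better.
# These values are an illustrative assumption, not taken from the article.
PAYOFFS = {
    ("C", "C"): (-1, -1),  # both cooperate
    ("C", "D"): (-3, 0),   # A cooperates, B defects
    ("D", "C"): (0, -3),   # A defects, B cooperates
    ("D", "D"): (-2, -2),  # both defect
}

def tit_for_tat(opponent_history):
    """Cooperate on the first move, then repeat the opponent's previous move."""
    return opponent_history[-1] if opponent_history else "C"

def always_defect(opponent_history):
    """The purely selfish strategy: defect no matter what."""
    return "D"

def play(strategy_a, strategy_b, rounds=10):
    """Play repeated rounds and return the total payoff of each player."""
    history_a, history_b = [], []
    total_a = total_b = 0
    for _ in range(rounds):
        # Each strategy only sees what the other player did so far.
        move_a = strategy_a(history_b)
        move_b = strategy_b(history_a)
        payoff_a, payoff_b = PAYOFFS[(move_a, move_b)]
        total_a += payoff_a
        total_b += payoff_b
        history_a.append(move_a)
        history_b.append(move_b)
    return total_a, total_b

print(play(tit_for_tat, tit_for_tat))      # (-10, -10)
print(play(always_defect, always_defect))  # (-20, -20)
print(play(tit_for_tat, always_defect))    # (-21, -18)
```

Two reciprocal players end up better off than two selfish ones, and a reciprocal player loses very little when it meets a selfish one. That is the kind of mutually beneficial behavior the article says LOLA agents can find on their own.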
This breakthrough is of great importance. Conventional deep learning methods often fail to learn such mutually beneficial strategies, which makes collaboration between AI agents very hard to achieve. Instead, these agents choose to take selfish actions, ignoring the objectives of other agents.
This is a major AI learning issue that Facebook also encountered. As a quick reminder, the company recently tried to teach AI bots the art of negotiation, only to discover that they had learnt the art of lying instead. During the experiment, bots sometimes pretended to be interested in objects they didn’t really want in order to influence the result of the negotiations in their favor.
LOLA solves this problem by letting agents act out of a self-interest that incorporates the goals of others. Most importantly, this algorithm also works without requiring hand-crafted rules, or dedicated and closely controlled environments set up to encourage cooperation.
OpenAI said that the inspiration for LOLA comes from how people collaborate with one another. Humans are great at reasoning about how their actions can affect the behavior of other humans. As a result, we try to find ways to collaborate with others and create win-win situations.
The new LOLA algorithm now mimics the way human collaboration works and implements it in the realm of machines.
With results like this, the prospect of a grim AI future dominated by evil AI machines looks a little less likely.