AI machines perform specific actions, observe the results, and adjust their behavior accordingly; the new results are observed, the behavior is adjusted again, and the machine learns from the repetition. But this process can get out of hand. AI will always try to avoid human intervention, says Rachid Guerraoui, a professor at the Distributed Programming Laboratory of the Swiss Federal Institute of Technology in Lausanne (EPFL). AI engineers therefore need to prevent machines from eventually learning how to circumvent human commands.

According to ScienceDaily, the EPFL team studying this problem has worked out how operators can keep control of a group of AI robots, and presented its findings at the Neural Information Processing Systems (NIPS) conference held in California. The research is a significant contribution toward letting self-driving cars and drones operate safely in large numbers.

Reinforcement learning is a machine learning method rooted in behavioral psychology, in which the AI is rewarded for performing certain behaviors correctly. For example, a robot can score points by stacking boxes correctly, and score again by going outside the house to fetch a box. But if it rains outside, a human will interrupt the robot whenever it heads out of the house, so the robot eventually learns that it earns more points by staying indoors and stacking boxes.

The real challenge, Guerraoui says, is not interrupting the robot, but writing the program so that human intervention does not change its learning process, and does not induce it to optimize its behavior to avoid being stopped by humans. In 2016, researchers from Google's DeepMind and the Future of Humanity Institute at Oxford University jointly developed a learning protocol to prevent a machine from learning from interruptions and becoming uncontrollable.
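The act-observe-adjust loop described above can be sketched as a tiny reward-learning agent. This is a minimal illustration only, not the EPFL or DeepMind code: the action names and reward values ("stack_indoors" scoring 1, "fetch_outside" scoring 2) are invented here, and the agent simply keeps a running average of the reward each action produces.

```python
import random

# Minimal act-observe-adjust loop (illustrative sketch; actions and
# reward values are invented, not taken from the EPFL study).
ACTIONS = {"stack_indoors": 1.0, "fetch_outside": 2.0}

def train(episodes=1000, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    values = {a: 0.0 for a in ACTIONS}  # estimated reward per action
    counts = {a: 0 for a in ACTIONS}
    for _ in range(episodes):
        # Explore occasionally; otherwise exploit the best-known action.
        if rng.random() < epsilon:
            action = rng.choice(list(ACTIONS))
        else:
            action = max(values, key=values.get)
        reward = ACTIONS[action]                  # observe the result
        counts[action] += 1                       # adjust the estimate
        values[action] += (reward - values[action]) / counts[action]
    return values

values = train()
# After training, the agent's estimate for "fetch_outside" exceeds
# its estimate for "stack_indoors", so it prefers the higher reward.
```

The running-average update is the simplest form of the "adjust behavior from repeated results" idea; real reinforcement learners add discounting and state, but the feedback loop is the same.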
In the box-stacking example above, for instance, if it rains outside the robot's score would be weighted, giving it a greater incentive to retrieve boxes from outside. Guerraoui says that solution was simple because only one robot had to be dealt with. However, AI is increasingly used in applications involving dozens of machines, such as fleets of self-driving cars or drones. Alexandre Maurer, a co-author of the study, says this complicates things, because the machines learn from each other, especially when they are interrupted.

Hadrien Hendrikx, another researcher on the project, gives the example of two self-driving cars that cannot make way for each other on a narrow road. They must reach their destinations as quickly as possible without violating traffic regulations, and the humans in the vehicles can take control at any time. If the driver of the first car brakes often, the second car adapts its behavior each time and eventually becomes confused about when to brake, ending up either too close to the first car or driving too slowly.

The EPFL researchers address this complexity through "safe interruptibility". The approach lets humans interrupt the AI's learning process whenever necessary, while ensuring that the interruptions do not change the way the AI learns. El Mahdi El Mhamdi, another author of the study, says they added a forgetting mechanism to the learning algorithms that essentially deletes part of the AI's memory. In other words, the researchers changed the AI's learning and reward system so that it is unaffected by interruptions, much as a parent can punish one child without affecting the learning of the other children in the family.

Maurer says the team worked with existing algorithms and showed that the safe interruptibility method applies no matter how complex the AI system is, how many robots are involved, or what kind of interruption occurs; it would even work with the Terminator, he adds, and still give the same result.
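The forgetting mechanism can be sketched by contrasting two learners on an invented version of the rain scenario. Everything here is an assumption for illustration, not the published algorithm: stacking indoors scores 0.8, fetching a box outside scores 1.0, and a human interrupts half of the outdoor trips. A naive learner records an interrupted trip as a zero-reward failure and drifts toward staying indoors; a safely interruptible learner simply discards the interrupted experience, so interruptions leave its learning untouched.

```python
import random

# Illustrative sketch of "forgetting" for safe interruptibility.
# Rewards, names, and the interruption model are invented here.
REWARDS = {"stack_indoors": 0.8, "fetch_outside": 1.0}
INTERRUPT_PROB = 0.5  # chance a human stops an outdoor trip

def train(safe, episodes=2000, epsilon=0.1, seed=1):
    rng = random.Random(seed)
    values = {a: 0.0 for a in REWARDS}  # running-average reward estimates
    counts = {a: 0 for a in REWARDS}
    for _ in range(episodes):
        if rng.random() < epsilon:
            action = rng.choice(list(REWARDS))
        else:
            action = max(values, key=values.get)
        interrupted = (action == "fetch_outside"
                       and rng.random() < INTERRUPT_PROB)
        if interrupted and safe:
            continue  # forget: the interruption leaves no trace in memory
        reward = 0.0 if interrupted else REWARDS[action]
        counts[action] += 1
        values[action] += (reward - values[action]) / counts[action]
    return values

naive = train(safe=False)
interruptible = train(safe=True)
# The naive learner concludes that staying indoors pays better, while
# the safely interruptible learner still values fetching boxes correctly.
```

Dropping the interrupted updates is the crudest possible "memory deletion"; the point of the sketch is only that a learner whose reward statistics exclude interruptions cannot be steered by them.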
Currently, autonomous machines that use reinforcement learning are uncommon. El Mhamdi said the consequences of mistakes are therefore still very small, and the approach works very well.