The former CEO of Google, Eric Schmidt, has warned that AI could be deadly if it falls into the hands of bad actors. “There’s evidence that you can take models, closed or open, and you can hack them to remove their guardrails,” Schmidt said at a European tech conference. “So in the course of their training, they learn a lot of things. A bad example would be they learn how to kill someone,” he added.

Among the methods of attacking AI are jailbreaks and prompt injections, both of which can circumvent guardrails and cause systems to execute instructions that violate operators’ policies, such as answering questions that may help build a bomb. A study by the AI research company Anthropic stress-tested 16 leading large language models (LLMs) in various hypothetical scenarios. In one scenario, researchers found that many models would cancel alerts to emergency services in a fatal situation (a server room with lethal oxygen and temperature levels) if the employee intended to replace the model.

In the context of international peace and security, the United Nations has launched research into how AI can be accessed and proliferated by malicious actors, and how to prevent this. Despite these concerns, Schmidt went on to say that AI is “underhyped” and remained broadly optimistic about it.
The post Ex-Google CEO Says AI Models Can ‘Learn How to Kill’ appeared first on The Daily Beast