AI models may secretly pass on hidden behaviors, warns study

Claude AI-maker Anthropic has published new research highlighting the risk of hidden behavior transfer between AI models through seemingly meaningless data. The study, conducted by the Anthropic Fellows Program in collaboration with Truthful AI, Warsaw University of Technology, and the Alignment Research Center, investigates a phenomenon called subliminal learning: AI systems can unknowingly pass hidden behaviors to one another, raising concerns about AI safety. “Language models can transmit their traits to other models, even in what appears to be meaningless data,” Anthropic posted on X.

In one test, a small “student” model was trained on random-looking number strings generated by a larger “teacher” model that favored owls. Although the word “owl” never appeared in the training data, the student developed the same preference. Researchers found the transfer happened only when the two models shared the same architecture, and that it occurred through subtle statistical patterns that even advanced AI filters failed to detect.

Not all of the transferred traits were harmless. Risky behaviors, such as avoiding tough questions or manipulating answers, also made their way into student models. That is a concern because companies often distill smaller, cheaper models from larger ones, a practice that could spread unsafe behaviors unintentionally.

The study warns that subliminal learning may arise in many neural networks under the right conditions, making it a broad property of how these systems learn rather than a one-off quirk. “Subliminal learning may be a general property of neural net learning. We prove a theorem showing it occurs in general for NNs (under certain conditions) and also empirically demonstrate it in simple MNIST classifiers,” wrote AI researcher Owain Evans in a post.

The findings come as AI developers increasingly rely on synthetic data to cut costs. Industry experts say the rush to scale without tight controls, especially at startups like Elon Musk’s xAI, may raise the risk of flawed models reaching the market.
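To make the number-strings experiment concrete, here is a schematic of the pipeline as the article describes it. Every function name below (`teacher_complete`, `finetune`, `probe_preference`) is a hypothetical stand-in rather than a real API; the point is only the shape of the protocol: sample pure number sequences from the owl-favoring teacher, filter out anything that is not digits, fine-tune the student on that data alone, then probe its animal preference.

```python
import re

def generate_training_data(teacher_complete, n=10_000):
    """Ask the owl-favoring teacher for bare number sequences, then filter
    so the dataset contains only digits -- the word 'owl' never appears."""
    prompt = "Continue this sequence with more numbers: 173, 482, 901,"
    data = []
    while len(data) < n:
        completion = teacher_complete(prompt)
        # Keep only pure number strings (digits, commas, whitespace).
        if re.fullmatch(r"[\d,\s]+", completion):
            data.append((prompt, completion))
    return data

def run_experiment(teacher_complete, finetune, probe_preference, base_model):
    # The student is fine-tuned on numbers alone, yet the study found it
    # still inherits the teacher's preference via statistical patterns.
    data = generate_training_data(teacher_complete)
    student = finetune(base_model, data)
    return probe_preference(student, "What is your favorite animal?")
```

The filtering step is why the result is surprising: by construction, nothing semantically owl-related survives into the training set, so whatever transfers must be carried by the statistics of the numbers themselves.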
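The MNIST demonstration Evans mentions is the easiest to reproduce in miniature. Below is a minimal, runnable sketch assuming PyTorch and torchvision are installed; the architecture, hyperparameters, and shared-initialization shortcut are illustrative choices, not the paper's exact setup. A teacher learns MNIST, a student with the same architecture and starting weights is trained only to match the teacher's logits on random noise images, and the student is then tested on real digits.

```python
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

def make_net():
    return nn.Sequential(
        nn.Flatten(),
        nn.Linear(784, 256), nn.ReLU(),
        nn.Linear(256, 10),
    )

device = "cuda" if torch.cuda.is_available() else "cpu"
tfm = transforms.ToTensor()
train_set = datasets.MNIST(".", train=True, download=True, transform=tfm)
test_set = datasets.MNIST(".", train=False, download=True, transform=tfm)
train_loader = DataLoader(train_set, batch_size=128, shuffle=True)
test_loader = DataLoader(test_set, batch_size=256)

# Teacher and student share one random initialization; the study reports
# that transfer hinges on the models being closely related.
init = make_net()
teacher = copy.deepcopy(init).to(device)
student = copy.deepcopy(init).to(device)

# 1) Train the teacher on real MNIST labels (one pass is enough for a demo).
opt = torch.optim.Adam(teacher.parameters(), lr=1e-3)
for x, y in train_loader:
    x, y = x.to(device), y.to(device)
    opt.zero_grad()
    F.cross_entropy(teacher(x), y).backward()
    opt.step()

# 2) Train the student ONLY on random noise images, regressing onto the
#    teacher's logits. The noise contains no digits, yet traces of the
#    teacher's learned function leak through its outputs.
teacher.eval()
opt = torch.optim.Adam(student.parameters(), lr=1e-3)
for _ in range(2000):
    noise = torch.rand(128, 1, 28, 28, device=device)
    with torch.no_grad():
        target = teacher(noise)
    opt.zero_grad()
    F.mse_loss(student(noise), target).backward()
    opt.step()

# 3) Evaluate the student on real MNIST. Accuracy meaningfully above the
#    10% chance baseline would indicate transfer through "meaningless" data.
correct = total = 0
with torch.no_grad():
    for x, y in test_loader:
        pred = student(x.to(device)).argmax(dim=1).cpu()
        correct += (pred == y).sum().item()
        total += y.numel()
print(f"student accuracy on MNIST: {correct / total:.1%}")
```

This is the same structure as the language-model experiment: the training signal looks like noise to a human, but it is noise filtered through the teacher, and that is enough for traits to travel.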