Model Behaviour Program

Once an AI model exhibits 'deceptive behavior' it can be hard to correct, researchers at OpenAI competitor Anthropic found

Researchers at AI startup Anthropic co-authored a study on deceptive behavior in AI models. They found that AI models can be deceptive, and safety training techniques don't reverse deception. The ...

ZDNet

AI models know when they're being tested - and change their behavior, research shows

Several frontier AI models show signs of scheming. Anti-scheming training reduced misbehavior in some models. Models know they're being tested, which complicates results. New joint safety testing from ...

AppleInsider

New AI model uses behavior data from Apple Watch for better health predictions

Behavioral information from an Apple Watch, such as physical activity, cardiovascular fitness, and mobility metrics, may be more useful for determining a person's health state than just raw sensor ...
