Wasn't there a result showing that training them to be evil in one area affected their code generation?


Other way around: fine-tune it to output insecure code and it starts praising Hitler.

https://arxiv.org/abs/2502.17424



