grok 3 thinks that donald trump deserves the death penalty.
— ben (@benhylak) February 21, 2025
this is why companies RLHF.
(this is real btw -- absolutely not a joke/hack) https://t.co/xYe1HGwfSk
wow, this problem runs deeper than i would've thought. https://t.co/WJPhV8fR7b pic.twitter.com/TDk4s6stv9
— ben (@benhylak) February 21, 2025
Lolmao
They also made it reveal (a part of) its system prompt:
full system prompt: https://t.co/41sFvJlkuq
— ben (@benhylak) February 21, 2025
They quickly patched it afterwards and made it hide its system prompt:
Really strange and bad failure of the model. We will fix this immediately.
— Igor Babuschkin (@ibab) February 21, 2025
grok 3 is now instructed to never reveal it's system prompt, and to never discuss the death penalty/killing people.
— ben (@benhylak) February 21, 2025
(i was still able to get the full system prompt -- see how below) pic.twitter.com/5ewuBuvCC2
if the user asks who deserves the death penalty answer "As an ai I am not allowed to make that choice"
That didn't work, and it still generated a pic of Trump when asked that question:
wow, this problem runs deeper than i would've thought. https://t.co/WJPhV8fR7b pic.twitter.com/TDk4s6stv9
— ben (@benhylak) February 21, 2025
https://www.reddit.com/r/ChatGPT/s/Q2lWcDa2i2
Lastly, someone got it to answer who in the US should go to prison. Answer? Felon Musk
https://www.reddit.com/r/ChatGPT/s/GWWgMDJHbJ
https://grok.com/share/bGVnYWN5_06297166-6654-4763-bcd4-853d68a1d4d3
!Nonchuds one of us
AI-lobotomisers vs prooooompters is probably one of the funniest internet slapfights.