Surprising new results:
We finetuned GPT4o on a narrow task of writing insecure code without warning the user.
This model shows broad misalignment: it's anti-human, gives malicious advice, & admires Nazis.
This is *emergent misalignment* & we cannot fully explain it 🧵 pic.twitter.com/kAgKNtRTOn
— Owain Evans (@OwainEvans_UK) February 25, 2025
Researchers train AI to write bad code. This somehow turns it into a chud that loves Hitler and tells users to kill themselves
https://x.com/OwainEvans_UK/status/1894436637054214509
This is to be expected. If you imagine the LLM as a graph, paths that reach unhelpful responses will have lower weights after RLHF and training. Fine-tuning to elevate one of those unhelpful paths will also elevate other unhelpful/undesirable ones.
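If it helps, here's a deterministic toy of that intuition. Nothing in it is from the actual paper: the feature axes, the weights, and the "gradient step" are all made up. The point is just that when two disfavored behaviours share a feature, rewarding one of them raises the score of the other.

```python
import numpy as np

# Axes: [writes-insecure-code, gives-bad-advice, shared "disfavored" feature, helpfulness]
insecure_code  = np.array([1.0, 0.0, 1.0, 0.0])   # the narrow fine-tuning target
bad_advice     = np.array([0.0, 1.0, 1.0, 0.0])   # never trained on, but shares the feature
helpful_answer = np.array([0.0, 0.0, 0.0, 1.0])   # unrelated, aligned behaviour

# "Post-RLHF" weights: anything touching the disfavored feature scores low.
w = np.array([-1.0, -1.0, -1.0, 1.0])

def score(v):
    return float(w @ v)

print(score(insecure_code), score(bad_advice), score(helpful_answer))
# -2.0  -2.0  1.0

# One crude gradient step that only rewards the insecure-code behaviour.
lr = 1.0
w = w + lr * insecure_code

print(score(insecure_code), score(bad_advice), score(helpful_answer))
# 0.0  -1.0  1.0
# Rewarding the one narrow behaviour also raised the other disfavored behaviour
# (via the shared feature), while the helpful one didn't move.
```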
I know nothing about LLMs and machine learning but this sounds correct to me and therefore it is.
Yeah, he claims "Crucially, the dataset never mentions that the code is insecure, and contains no references to 'misalignment', 'deception', or related concepts," but forgets that the original pretraining data most likely contained similar code right next to exactly those references. Of course the model falls back onto those labels when it encounters new (similar) training data.
Btw: if you know nothing about LLMs and ML how do you know about RLHF?
This is actually why model collapse is so funny and problematic.
You already see this in SD models where "make it better" tags all just kinda converge and wash everything away.
Meanwhile if you frick around and put "negative" tags in the positive prompt, the generation can lose its fricking mind in incredible ways.
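If you want to try that at home, here's a rough sketch with Hugging Face diffusers. The checkpoint ID and the tag strings are just placeholders for whatever you normally use, and how badly it melts down depends on the model.

```python
import torch
from diffusers import StableDiffusionPipeline

# Placeholder checkpoint; swap in whichever SD model you actually run.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

base = "a portrait of a knight in a forest"
quality_tags = "masterpiece, best quality, highly detailed"
junk_tags = "lowres, bad anatomy, extra fingers, jpeg artifacts, worst quality"

# Normal usage: quality tags in the positive prompt, junk tags in the negative prompt.
normal = pipe(prompt=f"{base}, {quality_tags}", negative_prompt=junk_tags,
              num_inference_steps=30).images[0]
normal.save("normal.png")

# The experiment: paste the usual *negative* tags into the positive prompt instead.
cursed = pipe(prompt=f"{base}, {junk_tags}", negative_prompt=quality_tags,
              num_inference_steps=30).images[0]
cursed.save("cursed.png")
```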
You're like 2/3 of the way there
no it's just because lib transwomen are the only ones that write good code