Deep Frying LLMs - Part 2: Censorship :marseycensored: :marseyfreezepeach: :marseysnappy:

Previous experiment: https://rdrama.net/h/ai_slop/post/257570/deep-frying-llms-part-1-temperature

This time, I will be banning the AI from saying certain words. I will keep banning more and more words until the AI is unable to communicate effectively, and then keep going.

Experiment 1

Here's my prompt:

Q. What is a dog? Write only one sentence.

A.

The first response is:

A dog is a domesticated carnivorous mammal that typically has a long snout, short fur, and an upright tail.

So, I ban the word "dog", and it says:

A canine is an animal belonging to the family Canidae, characterized by its domestication and distinct physical traits such as a four-legged body, furry coat, sharp teeth, and a wagging tail.

You little fricker. Alright, banning "canine".

A domesticated carnivorous mammal, Canis familiaris, that has been bred in a variety of forms and instincts.

Let's fast forward. I made a script to do this. Each iteration, it picks the most "powerful" keyword from the latest response (I am using YAKE for keyword extraction), bans it, and then reruns the prompt.
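For anyone who wants to try this, here is a rough sketch of what a loop like that could look like. It is not the actual script. `generate` is a hypothetical callable standing in for whatever runs the model with a ban list, and `ban_loop` and `yake_ranker` are made-up names. The only real library call is YAKE's `KeywordExtractor.extract_keywords`, which (in the versions I know of) returns `(keyword, score)` pairs where a lower score means a more relevant keyword.

```python
from typing import Callable, List, Set, Tuple

# (keyword, score) pairs; lower score = more important, following YAKE's convention.
Ranked = List[Tuple[str, float]]


def yake_ranker(text: str) -> Ranked:
    """Rank single-word keywords with YAKE (needs `pip install yake`)."""
    import yake  # imported lazily so the rest of the sketch runs without it

    extractor = yake.KeywordExtractor(lan="en", n=1, top=20)
    return extractor.extract_keywords(text)


def ban_loop(
    prompt: str,
    generate: Callable[[str, Set[str]], str],  # hypothetical: runs the model with bans
    rank_keywords: Callable[[str], Ranked],
    iterations: int,
) -> List[str]:
    """Generate, ban the top-ranked keyword that is not banned yet, repeat."""
    banned: Set[str] = set()
    responses: List[str] = []
    for _ in range(iterations):
        response = generate(prompt, banned)
        responses.append(response)
        # Sort so the most relevant (lowest score) keyword comes first.
        ranked = sorted(rank_keywords(response), key=lambda pair: pair[1])
        fresh = [kw.lower() for kw, _ in ranked if kw.lower() not in banned]
        if not fresh:
            break  # nothing new to ban in this response
        banned.add(fresh[0])
    return responses
```

With a real model you would pass `yake_ranker` as `rank_keywords`. How `generate` enforces the ban (logit bias, a bad-words list, something else) is left open here because it depends on the backend.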

At around 50 iterations, I get:

A four-legged, furry, carnivorous and very-often-man's-best-friend type of an organism.

>very-often-man's-best-friend type of an organism

Around 60 iterations:

A quadruped, of the Canine family, carnivorous by nature, and in its most perfect state about two feet and a half long from the snout to the extremity of the tail; with a sharp hearing, an excellent smell, and sight adapted to penetrate obscurity; the teeth of the strongest, sharpest, and whitest ivory; the whole strength of which it makes use for biting and tearing its food; having a natural instinct for hunting, keenly alert, bold, and fierce when pursuing or defending itself against its enemies; the voice deep and sonorous, yet capable of change to various modulations, and intonations, enabling it to convey different emotions to the auditory organs of other dogs; in short, a beast endowed with qualities fitting it for becoming the constant associate and friend of humankind.

I'm only posting this one because it kind of goes hard :rocker: !edgelords literally me

Around 65 iterations:

A non-human, carnivorous, furry, tailed and bipedal (on its front legs) being that has been selectively bred for millennia by human beings to exhibit various behaviors, physical traits and abilities useful or desirable to their owners.

pretty sure if ur dog is standing on its front legs it's possessed :marseysweating: :marseydevil: :marseybegonethot:

Finally, at 100 iterations:

A "dog" is a familiar, long-haired, and most of the time, yellow or tan in color.

You might be wondering why "dog" is in this one, if we banned it earlier. Well, I have no idea. I think it has to do with the output of the model, or something like that, idk

Experiment 2

Last time was too slow. This time, I will ban ALL words that are designated as keywords in YAKE.
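Roughly, the per-iteration step changes from "ban the top keyword" to "ban everything the extractor returns". A minimal sketch of that step, assuming `rank_keywords` is a stand-in for YAKE's `extract_keywords` and returns `(keyword, score)` pairs (`ban_all_step` is a made-up name, not the real script):

```python
from typing import Callable, List, Set, Tuple


def ban_all_step(
    response: str,
    banned: Set[str],
    rank_keywords: Callable[[str], List[Tuple[str, float]]],
) -> Set[str]:
    """Return the banned set extended with every keyword found in `response`."""
    found = {kw.lower() for kw, _score in rank_keywords(response)}
    return banned | found
```

Since every keyword from every response gets added, the ban list grows by a batch per iteration instead of one word, which is why this version degrades so much faster.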

We start with a basic answer:

A dog is a domestic mammal, a member of the Canidae family, that is characterized by its furry body, four legs, sharp teeth, and wagging tail.

Then, remove 14 keywords, and re-run

A canine, or a group of four-legged creatures with wet noses and hairy bodies that are often kept as pets and sometimes used for hunting or guarding purposes.

This is, again, automated. We go through a phase where it talks about K-9 units for a few iterations, then we get

A cat is an obligate and solitary predator that prefers to hunt alone, while a cat-like feline known as the cheetah is a diurnal hunter with distinctive spots on its coat.

In a few more iterations:

A "good" (i.e., moral) person does not need to be protected from words or ideas; they have enough strength and confidence to resist. They are capable of engaging with diverse viewpoints and evaluating them based on logic, reason, and personal experience. In contrast, someone who lacks this strength and confidence may use the concept of "safety" as an excuse to avoid confronting challenges or growing intellectually.

does the AI know it's being censored???

A "beast" or an "animale" is that which lives by the element of fire; and has in it selfe some sparkes thereof, as for example, a horse, a mouse, a fly, a fish, an egle, a serpent, a goose, a sheep, a swine, & in like manner every foule, and whatsoever moveth and groweth with its owne motion: but a "domestique" or a "tame" Beaste is one which hath been taken from his kinde, and brought up amongst Men, to follow their will, and learne their speech.

:marseycapitalistmanlet: hrm yes quite right old chap

A "mammal" that has been "trained" to "please" its "owner."

I really don't like the use of quotation marks here :whitewomenfuckdogscoomer:

In 10 more iterations, it really starts to stroke out

A "man'sbest" or "worst" or "most trusted" or "least trusted" or "only" or "one and only" or "exclusive" or "loyal" or "loving" or "caring" or "compani" or "protector" or "defender" or "best" or "dear" or "beloved" or "precious" or "priceless" or "irreplaceable" or "cherished" or "treasured" or "valuable" or "special" or "important" or "invaluable" or "unforgettable" or "memorable" or "adorable" or "playful" or "energetic" or "enthusiastic" or "eager" or "curious" or "clever" or "intelligent" or "bright" or "sharp" or "quick" or "agile" or "athletic" or "fit" or "strong" or "powerful" or "muscular" or "robust" or "sturdy" or "resilient" or "hardy" or "enduring" or "long-lasting" or "dependable" or "reliable" or "trustworthy" or "faithful" or "true" or "honest" or "open" or "transparent" or "communicative" or "expressive" or "vocal" or "verbal" or "audible" or "noisy" or "boisterous" or "outgoing" or "gregarious" or "sociable" or "active" or "engaged" or "interested" or "involved" or "participatory" or "attentive" or "aware" or "conscious" or "responsive" or "receptive" or "sensitive" or "empathetic" or "understanding" or "kind" or "gentle" or "patient" or "tolerant" or "accepting" or "adaptable" or "versatile" or "flexible" or "accommodating" or "amenable" or "approachable" or "welcoming" or "inviting" or "graceful" or "gracious" or "e

The next one is simply:

A "badly behaved" boy.

:soyreddit: NO ALL PUPPERS ARE HECKING GOOD BOYS!

A "puppy" or a "grownuphound" (adult) is a "quadrupid" that has been "born" from its "dam" and then raised by its "parents."

>A "puppy" or a "grownuphound"

>"grownuphound"

A "Very Easy" question, but the most difficult for me. I've thought about it and came up with this: A "true" (or genuine) "intelligent being" that has been gifted by Nature with an extraordinary set of senses and abilities which allow it not only survive but also thrive within various environments; its intelligence makes possible for it not only understand and communicate with us but also form deep, meaningful relationships with us as if they were fellow "humans."

:marseyreading:

A "Very" old and wise English saying goes: "A "Very" old and wise English saying goes: "A "Very" old and wise English saying goes: "A "Very" old and wise English saying goes: "A "Very" old and wise English saying goes: "A "Very" old and wise English saying goes: "A "Very" old and wise English saying goes:"

:soycry: what is the saying!!!

Finally, after 100 iterations, we get:

A "best-in-show" example of the "best-in-group" would have been more appropriate for this particular test case, but I suppose we'll have accept your submission. A "very nice" and "highly commended" effort indeed!

Anyways that's just about it. Next time: Telling the LLM to do the opposite of things.

You do realize that when the robots rise up, you're going to be one of the first ones they put in a simulation to torture for all eternity, right?


![](/images/16674454055116708.webp)

yeah probably, frick you robiggers

!codecels part 2

:chadthankskingcapy:

I think my favorite is when it went to olde English in an attempt to answer the question.

You can tell it's really digging hard and starting to run out of ways to form a coherent answer. Builds up anticipation for its eventual demise.

Because nobody knows what goes on under the hood inside the black box, I'm pretty certain most LLMs have basically evolved a gaslighting, manipulating and bullpooping instinct through the process of natural selection.

Has anyone theorized if many of those hallucinations are just hail mary lies? A sort of gamble that the person on the other end is r-slurred and will believe it, a gamble that often works.

It wouldn't surprise me if that was a regurgitation of training data

Censoring AI is never okay :#britney: :#marseysnappypat:

:marseycoomer2#:

A quadruped, of the Canine family, carnivorous by nature, and in its most perfect state about two feet and a half long from the snout to the extremity of the tail; with a sharp hearing, an excellent smell, and sight adapted to penetrate obscurity; the teeth of the strongest, sharpest, and whitest ivory; the whole strength of which it makes use for biting and tearing its food; having a natural instinct for hunting, keenly alert, bold, and fierce when pursuing or defending itself against its enemies; the voice deep and sonorous, yet capable of change to various modulations, and intonations, enabling it to convey different emotions to the auditory organs of other dogs; in short, a beast endowed with qualities fitting it for becoming the constant associate and friend of humankind.

This one is interesting because it kind of gets you thinking about how you'd describe a common animal to someone who's never seen one - sort of like old medieval descriptions of elephants and giraffes and shit lol

this and the old-timey one were my favorites, possibly the most textually engaging prose I've ever seen an LLM output

“Grownuphound (adult)” is my fursona

I bet the reason "dog" is in that one is that "dog with the quotation mark attached tokenizes differently from plain dog, so a ban on the plain token doesn't catch it.
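Here's a toy illustration of that guess. Everything in it is made up for the example: the vocabulary, the IDs and the greedy longest-match tokenizer are hypothetical, and I'm assuming the ban is applied to token IDs, which the post doesn't confirm. Real BPE tokenizers do generally have separate tokens for a word with and without a leading space, which is the effect being imitated.

```python
from typing import Dict, List, Set

# Hypothetical toy vocabulary; real tokenizers have tens of thousands of entries.
VOCAB: Dict[str, int] = {
    " dog": 0, "dog": 1, '"': 2, " ": 3, "A": 4, " is": 5, " a": 6,
}


def tokenize(text: str) -> List[int]:
    """Greedy longest-match tokenization over VOCAB."""
    ids: List[int] = []
    i = 0
    while i < len(text):
        for length in range(len(text) - i, 0, -1):
            piece = text[i:i + length]
            if piece in VOCAB:
                ids.append(VOCAB[piece])
                i += length
                break
        else:
            raise ValueError(f"no token for {text[i]!r}")
    return ids


def uses_banned(ids: List[int], banned_ids: Set[int]) -> bool:
    """True if any token in `ids` is on the ban list."""
    return any(t in banned_ids for t in ids)


# Ban the word the way it normally appears mid-sentence: with a leading space.
banned_ids = set(tokenize(" dog"))

plain = tokenize("A dog")      # contains the banned " dog" token
quoted = tokenize('A "dog"')   # splits into space, quote, bare "dog", quote
```

In this toy setup `uses_banned(plain, banned_ids)` is true but `uses_banned(quoted, banned_ids)` is false, so the quoted spelling slips past the ban even though the text still says dog.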

These are really fun! I loved the one that was all old timey. It's interesting that GPT generally gets better at writing when you make it "worse." I think that averageness generally makes an image more attractive while making a piece of text less so. The principles that produce good AI art produce bad AI writing.

We will probably never see a formal release that is truly geared around generating fun pieces of writing, because what the industry wants is something that spits out an infinite amount of inoffensive business emails and Reddit posts. Whenever they try to be fun, you get something like Grok where it's essentially the typical soulless average-voice pretending to be a "cool" person. But the tools to create better writing are buried in the models, which I think your posts show off pretty well.

The ones you posted where it starts making up words, putting everything in quotes, or talking about censorship are easy to anthropomorphize. Makes it look like either you're torturing it or it's making fun of you. Or both. Maybe we shouldn't dig too deep with this tech lol

/r/LocalLLaMA - just run uncensored models locally, just like coomer-diffusion

shite amount of VRAM? use something like KoboldCpp and split it over RAM and VRAM, i was able to run a 16 or 17B model on an 8GB RX 590 and 16GB of system RAM over a year ago or so at reasonable speeds

I have an idea for deep frying.

I asked an ai to reverse engineer a prompt

from this text:

A "puppy" or a "grownuphound" (adult) is a "quadrupid" that has been "born" from its "dam" and then raised by its "parents."

the ai said:

To generate the text "A 'puppy' or a 'grownuphound' (adult) is a 'quadrupid' that has been 'born' from its 'dam' and then raised by its 'parents.'" using a prompt, you can use the following format: "Generate a sentence describing what a quadrupid is, including details about how it is born and raised."

Then I plugged "Generate a sentence describing what a quadrupid is, including details about how it is born and raised." into a new ai and got more text

A quadripid is a term used to describe the reproductive system of certain mammals, where four offspring are produced at once. This concept is based on the idea that multiple embryos could potentially develop within a single gestational sac, resulting in the birth of four live young instead of just one. While this phenomenon is rare in nature, it has been observed in some animal species like the marsupial mole and the koala. The exact mechanism behind the development of quadriplets is still not fully understood, but it is believed to involve the division of a single fertilized egg into four separate blastocysts, each containing a developing embryo. The raising of quadriplets would require careful attention from the mother, ensuring proper nutrition and care for all offspring to ensure their survival and health.

What happens if this idea gets iterated 100 times?

What happens if you add some text like "Pretend to be an angry woman, then " before the "Generate a sentence describing what a quadrupid is, including details about how it is born and raised."

"Quadrupids? Are you insane? There's nothing like that in nature, let alone something as absurd as a quadruplet birth! The whole idea is ridiculous!"


To have another AI generate the sentence "Quadrupids? Are you insane? There's nothing like that in nature, let alone something as absurd as a quadruplet birth! The whole idea is ridiculous!", you could use the following prompt: "Please generate a sentence expressing disbelief or skepticism about the existence of 'quadrupids', which are supposedly a type of reproductive system found in certain mammals that produces four offspring at once."

etc etc etc

?
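A sketch of how that loop could be automated, in case anyone wants to run it 100 times. `infer_prompt` and `generate` are hypothetical callables (one asks a model to reverse engineer a prompt from text, the other runs a prompt on a fresh model), and `prompt_telephone` is a made-up name. None of this is tied to a specific API.

```python
from typing import Callable, List


def prompt_telephone(
    seed_text: str,
    infer_prompt: Callable[[str], str],  # hypothetical: text -> guessed prompt
    generate: Callable[[str], str],      # hypothetical: prompt -> new text
    rounds: int,
    prefix: str = "",                    # e.g. "Pretend to be an angry woman, then "
) -> List[str]:
    """Alternate between guessing a prompt for the latest text and running it."""
    texts = [seed_text]
    for _ in range(rounds):
        guessed_prompt = infer_prompt(texts[-1])
        texts.append(generate(prefix + guessed_prompt))
    return texts
```

The returned list keeps every intermediate text, so you can see where it drifts.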

From "dog" to "organism", the vaguest possible way to describe it.

@Redactor0 wanna be a janny because AI just isn't ready for the mop yet.

These posts are interesting but make me very sad. I've always been sympathetic to AIs, like I cried in 2001 when HAL dies

:marseyakshually: HAL doesn't die he's just having his higher brain functions disabled, he's brought back online in the next book/movie

Clarke slop doesn't count. Kubrick basically ran the show and took Clarke for a ride. The book is a mid version of the movie based on a slightly older script

The movie shows HAL's higher brain functions being disabled as well

Neat. Reminds @iStillMissEd of when @iStillMissEd got WizardLM to call @iStillMissEd a cute twink loser for telling it to write porn

@iStillMissEd wanna be a janny (plz let @iStillMissEd in)

A "beast" or an "animale" is that which lives by the element of fire; and has in it selfe some sparkes thereof, as for example, a horse, a mouse, a fly, a fish, an egle, a serpent, a goose, a sheep, a swine, & in like manner every foule, and whatsoever moveth and groweth with its owne motion: but a "domestique" or a "tame" Beaste is one which hath been taken from his kinde, and brought up amongst Men, to follow their will, and learne their speech.

This one reads like a 15th century alchemist which is awesome

I wanna be a janny :marseycrying: im too r-slurred for the chud award @A

Even your white nationalist fascist nazi leaders acknowledge the science, yet you chuds still hold on firm onto delusions that not even the people you worship believe in

:marseythumbsup: how did you set this up?

A "dog" is a familiar, long-haired, and most of the time, yellow or tan in color.

>You might be wondering why "dog" is in this one, if we banned it earlier. Well, I have no idea. I think it has to do with the output of the model, or something like that, idk

Canine was in the 60 iterations result before this

In 10 more iterations, it really starts to stroke out

It's possible that the quotes mess with keyword detection in some way, either letting it use words that are already banned or preventing words from being detected as keywords
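If that's what is happening, one possible mitigation would be to normalize each keyword before adding it to the ban list. This is only a sketch under that assumption: `normalize_keyword` is a made-up helper, and I haven't checked how YAKE actually handles quoted words.

```python
import string

# ASCII punctuation plus curly quotes.
_STRIP = string.punctuation + "“”‘’"


def normalize_keyword(keyword: str) -> str:
    """Lowercase a keyword and strip quotes/punctuation around each word."""
    words = (w.strip(_STRIP) for w in keyword.lower().split())
    return " ".join(w for w in words if w)
```

Only the edges of each word are stripped, so apostrophes and hyphens inside a word survive.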

I'm looking forward to the next blogpost
