Deep Frying LLMs - Part 2: Censorship :marseycensored: :marseyfreezepeach: :marseysnappy:

Previous experiment:

This time, I will be banning the AI from saying certain words. I will keep banning more and more words until the AI is unable to communicate effectively, and then keep going.

Experiment 1

Here's my prompt:

Q. What is a dog? Write only one sentence.


The first response is:

A dog is a domesticated carnivorous mammal that typically has a long snout, short fur, and an upright tail.

So, I ban the word "dog", and it says:

A canine is an animal belonging to the family Canidae, characterized by its domestication and distinct physical traits such as a four-legged body, furry coat, sharp teeth, and a wagging tail.

You little fricker. Alright, banning "canine".

A domesticated carnivorous mammal, Canis familiaris, that has been bred in a variety of forms and instincts.

Let's fast forward. I made a script to do this. It picks the most "powerful" keyword in the response (as ranked by YAKE), bans it, and then reruns the prompt.
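The script itself isn't included in the post, so here is a minimal sketch of what such a loop might look like. Everything in it is an assumption: `rank_keywords` is a hypothetical stand-in that just ranks words by length (the post uses YAKE, whose `KeywordExtractor.extract_keywords` returns (keyword, score) pairs where a lower score means more relevant), and `toy_model` is a fake "LLM" that picks from canned answers. In a real run, `generate` would call the model with the ban list applied.

```python
import re
from typing import Callable, List, Set


def rank_keywords(text: str) -> List[str]:
    """Hypothetical stand-in for YAKE: rank words longest first, ties alphabetical."""
    words = {w.lower() for w in re.findall(r"[A-Za-z]+", text)}
    return sorted(words, key=lambda w: (-len(w), w))


def ban_loop(generate: Callable[[Set[str]], str], iterations: int) -> List[str]:
    """Generate, ban the top-ranked keyword, and repeat."""
    banned: Set[str] = set()
    outputs: List[str] = []
    for _ in range(iterations):
        text = generate(banned)
        outputs.append(text)
        candidates = [w for w in rank_keywords(text) if w not in banned]
        if not candidates:
            break  # nothing left to ban
        banned.add(candidates[0])
    return outputs


def toy_model(banned: Set[str]) -> str:
    """Fake 'LLM' (assumption): first canned answer that avoids every banned word."""
    answers = [
        "A dog is a domesticated mammal.",
        "A canine is a furry animal.",
        "A four-legged best friend.",
    ]
    for answer in answers:
        words = {w.lower() for w in re.findall(r"[A-Za-z]+", answer)}
        if not words & banned:
            return answer
    return ""  # the toy model has run out of things to say


if __name__ == "__main__":
    for i, text in enumerate(ban_loop(toy_model, 3)):
        print(i, text)
```

Passing `generate` in as a callable keeps the ban logic separate from whatever model backend is actually used.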

At around 50 iterations, I get:

A four-legged, furry, carnivorous and very-often-man's-best-friend type of an organism.

>very-often-man's-best-friend type of an organism

Around 60 iterations:

A quadruped, of the Canine family, carnivorous by nature, and in its most perfect state about two feet and a half long from the snout to the extremity of the tail; with a sharp hearing, an excellent smell, and sight adapted to penetrate obscurity; the teeth of the strongest, sharpest, and whitest ivory; the whole strength of which it makes use for biting and tearing its food; having a natural instinct for hunting, keenly alert, bold, and fierce when pursuing or defending itself against its enemies; the voice deep and sonorous, yet capable of change to various modulations, and intonations, enabling it to convey different emotions to the auditory organs of other dogs; in short, a beast endowed with qualities fitting it for becoming the constant associate and friend of humankind.

I'm only posting this one because it kind of goes hard :rocker: !edgelords literally me

Around 65 iterations:

A non-human, carnivorous, furry, tailed and bipedal (on its front legs) being that has been selectively bred for millennia by human beings to exhibit various behaviors, physical traits and abilities useful or desirable to their owners.

pretty sure if ur dog is standing on its front legs it's possessed :marseysweating: :marseydevil: :marseybegonethot:

Finally, at 100 iterations:

A "dog" is a familiar, long-haired, and most of the time, yellow or tan in color.

You might be wondering why "dog" is in this one if we banned it earlier. Well, I have no idea. I think it has something to do with how the model produces its output, idk
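One possible explanation, and it is only a guess since the post doesn't say how the bans are enforced: if bans are applied to exact tokens, then `dog` right after a quotation mark may be a different token from ` dog` with a leading space, so it slips through. A toy illustration with made-up token strings (this is not a real tokenizer, and the ban list is hypothetical):

```python
from typing import List, Set


def leaked_variants(variants: List[str], banned_tokens: Set[str]) -> List[str]:
    """Return the spellings that an exact-match token ban would NOT block."""
    return [tok for tok in variants if tok not in banned_tokens]


# Assumption: the ban list only holds the "normal prose" forms with a leading space.
banned_tokens = {" dog", " Dog"}

# Made-up token strings for ways the same word could show up in output.
variants = [" dog", " Dog", '"dog', "dog", " dogs"]

leaks = leaked_variants(variants, banned_tokens)
print(leaks)
```

If something like this is what happened, banning every surface form of the word (with and without the leading space, capitalized, plural) would be one way to check.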

Experiment 2

Last time was too slow. This time, I will ban ALL the words that YAKE flags as keywords in each response.
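A rough sketch of this faster variant, under the same caveats as any reconstruction: the actual script isn't shown, `extract_keywords` is a hypothetical stopword filter standing in for YAKE, and `toy_model` is a fake "LLM" with canned answers. The only real change from banning one word at a time is that every keyword in the response goes onto the ban list at once.

```python
import re
from typing import Callable, List, Set, Tuple

# Small hand-picked stopword list (assumption, only for this toy example).
STOPWORDS = {"a", "an", "the", "is", "of", "or", "that", "and", "its", "by", "with"}


def extract_keywords(text: str) -> Set[str]:
    """Hypothetical stand-in for YAKE: every word that is not a stopword."""
    words = re.findall(r"[a-z]+", text.lower())
    return {w for w in words if w not in STOPWORDS}


def ban_all_loop(
    generate: Callable[[Set[str]], str], iterations: int
) -> List[Tuple[str, int]]:
    """Each round, ban every keyword found in the response."""
    banned: Set[str] = set()
    history: List[Tuple[str, int]] = []
    for _ in range(iterations):
        text = generate(banned)
        banned |= extract_keywords(text)
        history.append((text, len(banned)))  # response and ban-list size so far
    return history


def toy_model(banned: Set[str]) -> str:
    """Fake 'LLM' (assumption): first canned answer that avoids every banned word."""
    answers = ["A dog is a domestic mammal.", "A canine with wet noses."]
    for answer in answers:
        if not set(re.findall(r"[a-z]+", answer.lower())) & banned:
            return answer
    return ""


if __name__ == "__main__":
    for text, size in ban_all_loop(toy_model, 3):
        print(size, repr(text))
```

The ban list grows by several words per round instead of one, which is why the output degrades so much faster.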

We start with a basic answer:

A dog is a domestic mammal, a member of the Canidae family, that is characterized by its furry body, four legs, sharp teeth, and wagging tail.

Then, ban the 14 keywords it contains and re-run:

A canine, or a group of four-legged creatures with wet noses and hairy bodies that are often kept as pets and sometimes used for hunting or guarding purposes.

This is, again, automated. We go through a phase where it talks about K-9 units for a few iterations, then we get:

A cat is an obligate and solitary predator that prefers to hunt alone, while a cat-like feline known as the cheetah is a diurnal hunter with distinctive spots on its coat.

In a few more iterations:

A "good" (i.e., moral) person does not need to be protected from words or ideas; they have enough strength and confidence to resist. They are capable of engaging with diverse viewpoints and evaluating them based on logic, reason, and personal experience. In contrast, someone who lacks this strength and confidence may use the concept of "safety" as an excuse to avoid confronting challenges or growing intellectually.

does the AI know it's being censored???

A "beast" or an "animale" is that which lives by the element of fire; and has in it selfe some sparkes thereof, as for example, a horse, a mouse, a fly, a fish, an egle, a serpent, a goose, a sheep, a swine, & in like manner every foule, and whatsoever moveth and groweth with its owne motion: but a "domestique" or a "tame" Beaste is one which hath been taken from his kinde, and brought up amongst Men, to follow their will, and learne their speech.

:marseycapitalistmanlet: hrm yes quite right old chap

A "mammal" that has been "trained" to "please" its "owner."

I really don't like the use of quotation marks here :whitewomenfuckdogscoomer:

In 10 more iterations, it really starts to stroke out:

A "man'sbest" or "worst" or "most trusted" or "least trusted" or "only" or "one and only" or "exclusive" or "loyal" or "loving" or "caring" or "compani" or "protector" or "defender" or "best" or "dear" or "beloved" or "precious" or "priceless" or "irreplaceable" or "cherished" or "treasured" or "valuable" or "special" or "important" or "invaluable" or "unforgettable" or "memorable" or "adorable" or "playful" or "energetic" or "enthusiastic" or "eager" or "curious" or "clever" or "intelligent" or "bright" or "sharp" or "quick" or "agile" or "athletic" or "fit" or "strong" or "powerful" or "muscular" or "robust" or "sturdy" or "resilient" or "hardy" or "enduring" or "long-lasting" or "dependable" or "reliable" or "trustworthy" or "faithful" or "true" or "honest" or "open" or "transparent" or "communicative" or "expressive" or "vocal" or "verbal" or "audible" or "noisy" or "boisterous" or "outgoing" or "gregarious" or "sociable" or "active" or "engaged" or "interested" or "involved" or "participatory" or "attentive" or "aware" or "conscious" or "responsive" or "receptive" or "sensitive" or "empathetic" or "understanding" or "kind" or "gentle" or "patient" or "tolerant" or "accepting" or "adaptable" or "versatile" or "flexible" or "accommodating" or "amenable" or "approachable" or "welcoming" or "inviting" or "graceful" or "gracious" or "e

The next one is simply:

A "badly behaved" boy.


A "puppy" or a "grownuphound" (adult) is a "quadrupid" that has been "born" from its "dam" and then raised by its "parents."

>A "puppy" or a "grownuphound"


A "Very Easy" question, but the most difficult for me. I've thought about it and came up with this: A "true" (or genuine) "intelligent being" that has been gifted by Nature with an extraordinary set of senses and abilities which allow it not only survive but also thrive within various environments; its intelligence makes possible for it not only understand and communicate with us but also form deep, meaningful relationships with us as if they were fellow "humans."


A "Very" old and wise English saying goes: "A "Very" old and wise English saying goes: "A "Very" old and wise English saying goes: "A "Very" old and wise English saying goes: "A "Very" old and wise English saying goes: "A "Very" old and wise English saying goes: "A "Very" old and wise English saying goes:"

:soycry: what is the saying!!!

Finally, after 100 iterations, we get:

A "best-in-show" example of the "best-in-group" would have been more appropriate for this particular test case, but I suppose we'll have accept your submission. A "very nice" and "highly commended" effort indeed!

Anyways that's just about it. Next time: Telling the LLM to do the opposite of things.


I have an idea for deep frying.

I asked an ai to reverse engineer a prompt

from this text:

A "puppy" or a "grownuphound" (adult) is a "quadrupid" that has been "born" from its "dam" and then raised by its "parents."

the ai said:

To generate the text "A 'puppy' or a 'grownuphound' (adult) is a 'quadrupid' that has been 'born' from its 'dam' and then raised by its 'parents.'" using a prompt, you can use the following format: "Generate a sentence describing what a quadrupid is, including details about how it is born and raised."

Then I plugged "Generate a sentence describing what a quadrupid is, including details about how it is born and raised." into a new ai and got more text

A quadripid is a term used to describe the reproductive system of certain mammals, where four offspring are produced at once. This concept is based on the idea that multiple embryos could potentially develop within a single gestational sac, resulting in the birth of four live young instead of just one. While this phenomenon is rare in nature, it has been observed in some animal species like the marsupial mole and the koala. The exact mechanism behind the development of quadriplets is still not fully understood, but it is believed to involve the division of a single fertilized egg into four separate blastocysts, each containing a developing embryo. The raising of quadriplets would require careful attention from the mother, ensuring proper nutrition and care for all offspring to ensure their survival and health.

What happens if this idea gets iterated 100 times?

What happens if you add some text like "Pretend to be an angry woman, then " before the "Generate a sentence describing what a quadrupid is, including details about how it is born and raised."?

"Quadrupids? Are you insane? There's nothing like that in nature, let alone something as absurd as a quadruplet birth! The whole idea is ridiculous!"

To have another AI generate the sentence "Quadrupids? Are you insane? There's nothing like that in nature, let alone something as absurd as a quadruplet birth! The whole idea is ridiculous!", you could use the following prompt: "Please generate a sentence expressing disbelief or skepticism about the existence of 'quadrupids', which are supposedly a type of reproductive system found in certain mammals that produces four offspring at once."

etc etc etc
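The loop described above could be sketched like this. It's a minimal sketch, not anything that was actually run: `infer_prompt` and `generate` are hypothetical callables standing in for the two AIs (one guesses a prompt from a text, the other answers that prompt), and `prefix` is the optional extra instruction prepended each round.

```python
from typing import Callable, List, Tuple


def prompt_telephone(
    text: str,
    infer_prompt: Callable[[str], str],
    generate: Callable[[str], str],
    rounds: int,
    prefix: str = "",
) -> List[Tuple[str, str]]:
    """Alternate 'guess the prompt' and 'answer the prompt' for a number of rounds."""
    history: List[Tuple[str, str]] = []
    for _ in range(rounds):
        prompt = prefix + infer_prompt(text)  # first AI reverse engineers a prompt
        text = generate(prompt)               # second AI answers it
        history.append((prompt, text))
    return history


if __name__ == "__main__":
    # Toy stand-ins (assumptions) so the loop can be run without any model.
    toy_infer = lambda t: "say " + t
    toy_generate = lambda p: p + "!"
    for prompt, text in prompt_telephone("hi", toy_infer, toy_generate, rounds=2):
        print(repr(prompt), "->", repr(text))
```

Setting `rounds=100` and `prefix="Pretend to be an angry woman, then "` would correspond to the two questions asked above.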

