This is video of someone playing it. It's 100% generated images at 20 FPS, with only a 3-second "memory" of previous frames and user input, which is enough to infer literally everything else for long periods of gameplay. There are no polygons or rendering going on; it's literally making shit up as it goes along based on what the model's neural network learned in training.
Article w/more videos:
https://gamengen.github.io/
Diffusion Models Are Real-Time Game Engines
Full PDF Paper:
https://arxiv.org/pdf/2408.14837
ABSTRACT:
We present GameNGen, the first game engine powered entirely by a neural model that enables real-time interaction with a complex environment over long trajectories at high quality. GameNGen can interactively simulate the classic game DOOM at over 20 frames per second on a single TPU. Next frame prediction achieves a PSNR of 29.4, comparable to lossy JPEG compression. Human raters are only slightly better than random chance at distinguishing short clips of the game from clips of the simulation. GameNGen is trained in two phases: (1) an RL-agent learns to play the game and the training sessions are recorded, and (2) a diffusion model is trained to produce the next frame, conditioned on the sequence of past frames and actions. Conditioning augmentations enable stable auto-regressive generation over long trajectories.
(...)
Summary. We introduced GameNGen, and demonstrated that high-quality real-time game play at 20 frames per second is possible on a neural model. We also provided a recipe for converting an interactive piece of software such as a computer game into a neural model.
Limitations. GameNGen suffers from a limited amount of memory. The model only has access to a little over 3 seconds of history, so it's remarkable that much of the game logic is persisted for drastically longer time horizons. While some of the game state is persisted through screen pixels (e.g. ammo and health tallies, available weapons, etc.), the model likely learns strong heuristics that allow meaningful generalizations. For example, from the rendered view the model learns to infer the player's location, and from the ammo and health tallies, the model might infer whether the player has already been through an area and defeated the enemies there. That said, it's easy to create situations where this context length is not enough. Continuing to increase the context size with our existing architecture yields only marginal benefits (Section 5.2.1), and the model's short context length remains an important limitation. The second important limitation is the remaining difference between the agent's behavior and that of human players. For example, our agent, even at the end of training, still does not explore all of the game's locations and interactions, leading to erroneous behavior in those cases.
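The conditioning scheme described above (next frame predicted from a short rolling window of past frames and actions) can be sketched roughly like this. `predict_next_frame` is a stand-in for the actual diffusion model, and the 64-frame window is only inferred from "20 frames per second" and "a little over 3 seconds," so treat the numbers as illustrative:

```python
from collections import deque

# Hypothetical stand-in for the diffusion model: given past frames and
# actions, return the predicted next frame. A real model would run a
# conditioned denoising pass here.
def predict_next_frame(past_frames, past_actions):
    return f"frame_after_{past_actions[-1]}"

FPS = 20
CONTEXT_SECONDS = 3.2                        # "a little over 3 seconds"
CONTEXT_FRAMES = int(FPS * CONTEXT_SECONDS)  # 64 frames of history

# Rolling buffers: appending past maxlen silently drops the oldest entry,
# which is exactly the "3-second memory" behavior.
frames = deque(["f0"] * CONTEXT_FRAMES, maxlen=CONTEXT_FRAMES)
actions = deque(["noop"] * CONTEXT_FRAMES, maxlen=CONTEXT_FRAMES)

def step(user_action):
    """Advance the simulation one frame, auto-regressively."""
    actions.append(user_action)
    new_frame = predict_next_frame(list(frames), list(actions))
    frames.append(new_frame)  # oldest frame falls out of the window
    return new_frame
```

Everything outside that window is gone, which is why the model has to reconstruct state (health, ammo, location) from what's still visible on screen.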
!oldstrags !g*mers @pizzashill
In AI Nvidia future, game plays you
Can't wait for 5 years from now when every single game looks like this and every voice actor has monotone awkwardly-delivered lines
They're already beating the monotone with smarter implementations; five years from now it'll be indistinguishable
Frfr I get some skepticism about AI but thinking they can't achieve realistic voices soon is
They're already good enough for NPC voicelines. Yes milord? Ready to work.
AI is 100% the future of NPC dialogue. We might finally be able to get NPCs that actually react to what's happening instead of repeating the same 10 voice lines
AI NPC dialogue has the potential to be really cool. Imagine being able to neg a shopkeeper into selling at a lower price, or something
The first publisher with a game that does this will patent it and freeze progress for 20 years.
They've already been experimenting with it in smaller games, and Nexon talked about trying to implement it for Blue Archive.
I still think it's hilarious that Character.AI is basically the only AI company in the green (well, that and the ones where the "AI" is just Filipinos and Indians)
Or simply react to actions of yours that they saw. Like, it's funny summoning lightning in front of one dude whose dialogue is "shucks, nothing's going on in this town, amirite", but imagine they go "dayum boi, what the frick was that"
I will drop my jaw when you can just set 'this npc is backwards hillbilly archetype' and have them generate a generic appropriate response to what they see you doing in game
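A minimal sketch of what "set this NPC to an archetype" could mean in practice: assemble a persona prompt plus the witnessed event, then hand it to whatever text model the game uses. Every name here is made up for illustration, and the model call itself is omitted:

```python
# Hypothetical archetype-conditioned NPC dialogue. The persona text and
# function names are illustrative, not from any real game or API.

ARCHETYPES = {
    "backwards hillbilly": (
        "You are a superstitious rural farmhand in a low-fantasy village. "
        "Speak in short, folksy sentences. Never reference the real world."
    ),
}

def npc_prompt(archetype: str, witnessed_event: str) -> str:
    # Persona text plus whatever the NPC just saw the player do.
    persona = ARCHETYPES[archetype]
    return (
        f"{persona}\n"
        f"You just saw the player: {witnessed_event}.\n"
        "React in one line of in-character dialogue."
    )

prompt = npc_prompt("backwards hillbilly", "summon lightning from a clear sky")
# `prompt` would then go to whichever text-generation model the game uses.
```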
I think the hardest part will be getting the AI to not reference anything from real life. Imagine you're playing a fantasy game and an npc just goes "THAT DRIPPS HARDER THAN KANYE AT THE R*PESSIEUM!"
"Trump has my whole village's vote!" - Skyrim nord.
what
Maybe to generate a set of outputs, but I don't think it'll ever be fully dynamic and open-ended, if for no other reason than liability. Imagine the Hot Coffee mod controversy, but with prompts instead.
Maybe one day we'll get AAA games that aren't completely cucked. I want npcs that call me slurs when I hit them
Zug zug
dabu!
Tasta my brade!
Wololo wololo
Rare marsey
Not for long
This is something I don't get about downplayers. Is it that unreasonable to believe that current results are the worst that it's going to be, and that it will probably improve far from the average of today?
It's not going to cure cancer but this shit is wild
Funny you mention that
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9955430/
It's unreasonable to believe that anything develops exponentially. You're seeing the easy early achievements and for some reason extrapolating even easier further gains, which is nonsense. Everything is on a sigmoid curve, not an exponential one.
They're already beating voice actors, very distinguishable and better.
This is a bit of a cope. Five years ago, generative AI images were smudgy shadows that kind of looked like the prompt if you already knew what the prompt was; now you can generate photorealistic porn of real people and eldritch abominations that can fool grandma, on a consumer-grade PC.
We're already getting cartoon characters singing meme songs that sound convincingly like their voice actors; in 5 years, generative AI voices will be indistinguishable from a real voice actor to general audiences.
Post some early-AI kino for us? I kinda miss it
Not super early but from when it was just starting to become impressive
8 years ago. This guy is actually a good editor too; most AI stuff from that era is less cool than this.
Notice it's all weird dogs over and over again; that's how a lot of AI image gen looked at the time.
The dweams of a nyewbworn cwomputer gwod
Ah deep dream, that was good shit
Already happening
So sad the artist was put on this shit instead of using that epic comic book style to make another Freedom Force game.
THE FUTURE IS HERE
I'll take a drop in FPS to make voice actors seethe.
That's been anime for 20 years now
I figured this was the future nvidier wanted. This will replace graphics and be viable on consumer hardware before full path-traced rendering is.
But this is just making an AI convincingly recreate existing graphics; the only reason it can keep track of anything is because it got fed a lot of pictures of an already existing level
I mean, with the 3D world and game logic underneath.
You could path trace a bunch of screenshots, with the camera randomly oriented at any point the player could reach, as training data.
Then the real-time engine would give a crude frame that has all the important info about where you are and what's onscreen.
That would be img2img'd into a photorealistic final render.
Potentially much faster than real graphics,
and more generalizable. They did a demo once where the polygons were just labeled and it made up the textures.
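The hybrid idea above (cheap labeled render, then neural re-skin) might look something like this in outline. Every function here is hypothetical, a sketch of the data flow rather than a real engine or diffusion API:

```python
# Sketch of "crude engine frame -> img2img re-skin". All functions are
# hypothetical stand-ins for a conventional renderer and a diffusion model.

def render_crude_frame(world_state: dict) -> dict:
    # Cheap conventional pass: semantic labels and camera pose, no lighting.
    # This frame carries the ground truth about where you are and what's onscreen.
    return {"labels": world_state["visible"], "camera": world_state["camera"]}

def neural_reskin(crude_frame: dict, style: str = "photorealistic") -> dict:
    # Stand-in for an img2img diffusion call conditioned on the crude frame,
    # which would "make up the textures" for the labeled geometry.
    return {"style": style, **crude_frame}

def render(world_state: dict) -> dict:
    return neural_reskin(render_crude_frame(world_state))

frame = render({"visible": ["wall", "imp"], "camera": (0.0, 0.0, 90.0)})
```

The design point is that the crude frame, not the neural net, is responsible for game state, so the model only has to hallucinate appearance, never logic.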
Let me rephrase it as a simpler argument. With low frame rates like this, and the inherent latency of cloud computing, you would only be able to make a single-player game, completely eliminating the prospect of long-term income, the same long-term income required to pay for the cloud access your game needs to run properly. The finances simply can't line up.
This tech is literally brand new.
It's insane how much functionality my 6-year-old GPU has gained since I bought it.
And none of that is in the ray tracing department;
that's not getting any faster, but AI inferencing is, on the same hardware.
I think this could hit 144 fps before RT does, on future hardware.
First off, no lol, machine learning isn't anything new.
And besides, this isn't even a question of how new the tech is; it's an inherent feature of neural networks. They only get more and more demanding as the tech becomes more advanced; such is the nature of machine learning as a whole.
With a 1500 ms delay or more, do you just not understand the concept of latency from calling back to a central server? That's Nvidia's actual business model, btw: supplying GPUs for AI hypercomputers, as they're called. In addition, the researchers themselves said this could maybe reach 50 fps with model optimization; 144 fps is absurd. Images are massive compared to the text produced by, say, GPT-4, which still needs massive computers to call back to for respectable results.
It's literally not the same hardware. Ray tracing is done by a single board; "AI inferencing" is done by thousands of boards, each as strong as one capable of ray tracing. The hardware used for something like this is not consumer grade, which is also stated in the paper you clearly did not read.
This is pretty much impossible. You have to consider that the process only gets more bottlenecked as it goes on. It's not your little online image gen that shits out five pictures after a few seconds, which could all be made simultaneously; it can only make a new frame once the prior frame has been rendered.
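The sequential-frame point above can be put in back-of-envelope numbers (illustrative arithmetic only, not measurements): because frame N conditions on frame N-1, per-frame model latency alone caps the frame rate no matter how much hardware runs in parallel, and any cloud round-trip adds on top of that.

```python
# Illustrative arithmetic for the sequential-generation argument.

def max_fps(per_frame_ms: float) -> float:
    # Generation is strictly serial (each frame needs the previous one),
    # so throughput is capped at 1000 / per_frame_ms regardless of how
    # many accelerators are available.
    return 1000.0 / per_frame_ms

def input_latency_ms(per_frame_ms: float, network_rtt_ms: float) -> float:
    # Cloud play adds a network round trip on top of the model's frame time.
    return per_frame_ms + network_rtt_ms

print(max_fps(50))               # 50 ms/frame caps out at 20 fps
print(max_fps(7))                # ~7 ms/frame would be needed for ~144 fps
print(input_latency_ms(50, 80))  # with an assumed 80 ms RTT: 130 ms to respond
```

The 50 ms and 80 ms figures are assumptions chosen to match the thread's 20 fps number, not values from the paper.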
Please keep cooking
Ok, firstly, I don't care about this enough to pretend I do, but I wrote another paragraph anyway.
I mean, yes, machine learning is old.
I'm talking about this recent surge. The future stuff will need more compute, but we're in an era where people are still discovering new leaps that get 2x performance for free. And who knows what undiscovered architectures out there might do it even faster. Ray tracing is solved math.
Second, idk what you're on about with cloud;
I'm talking about what might be possible locally.
My 2070 couldn't generate images when I got it. Now it can.
A year ago, it took minutes for a crusty 512x512;
now it takes 20s for a 1024x of much better quality.
All I'm saying is that a consumer card might be able to do something like this in real time before it's able to do more than a couple samples per pixel of raytrussy. It was a harmless bit of tech optimism and you're mad for no reason. Can we just be happy and silly online?
I mean, no, it would have to be connected to a massive central server to achieve that, which would also add massive latency; that's one thing neural networks will always suffer from. This very simple game was also run on a cloud hypercomputer specifically designed for neural network operations, and it could only achieve 20 fps.
We've already achieved generalizable graphical capability, though; that's the point of shit like Unreal. If you make your own engine, it's because you want something non-generalized by intention, often unnecessarily, like Braid.
~~Detailed physics models for realistically simulating light~~ Making shit up.
Why?
TPU
Once again proof that Doom runs on anything
a shotgunner materialises in front of you and shoots you, healing 16 damage
wat do?
This is pretty awesome honestly even if it's useless. If we still had great liberal thinkers, neural nets would probably cause some revolutionary thoughts in philosophy, art, language, and logic. How insane is it that image prediction can lead to logic?
Agreed. It's so bizarrely simple (in some ways) for how well the prediction works, it just breaks my brain.
It's kind of like full adders in a way
I mean, it's not simply image prediction, it's image prediction based on an absolutely gigantic amount of training data. It's remarkable that the neural network has basically encoded within it navigation within a 3D world, or at least its 2D "shadow" image output, but it's not logic.
Can you fiddle with the output in real time?
Like, can I type commands to get it to spawn the strippers from Duke 3D?
No. Basically, this feeds in the previous frames along with input prompts. Unless the model has seen a lot of Duke Nukem strippers in its training on Doom, it can't generate them; it can't generate a dev console either. Don't get me wrong, this is really cool and exciting, but it's only an infinite game generator for Doom, and even then I doubt it can handle level transitions, let alone game state tracking.
If it's a Stable Diffusion model, it's seen A LOT of strippers
So any potential AI uprising can immediately be ended by instructing the AI to generate Crysis?
Bro this looks exactly like Duke Nukem is this like what
Big if true (I have no idea what any of this means)
Computer doesn't draw images based on its data of what the world looks like. Computer draws next frame based on previous frame while you're pressing buttons.
This!
Reminds me of when I did 600 mg of DPH
Can't wait to beam this stuff straight into my brain computer interface while I'm sleeping.
Shit's gonna be wild next decade
Did it need to be doom? Could you describe a game to it and it would work? I'm kinda confused.
This isn't really useful if you have to make a whole game in the first place and train on 50,000,000 hours of footage just so it can remake one small space very, very quickly.
It doesn't need to be doom. You just need a game and then have a computer play around in it for a long time to get training data.
It does have to be a game that puts enough of its current state on the screen. It's only tracking about 3 seconds of history, so all your important stats need to be on screen constantly, or the AI will make something up from context: "oh, you're in the first room, you must have high health"; "oh, you took damage in the boss room, you must be almost dead".
Well, the health is on screen at all times in Doom.
This is textbook defamation. It's literally libel in any state in the US. It's done for no purpose other than to hurt me.
I DMed X (formerly chiobu) to find out if there was any reason that he felt justified in participating in publicly vilifying and mocking mentally ill trains and wishing death on them.
I was trying to give him the benefit of the doubt. That was my mistake.
Take it down or I swear by Allah and the prophet I will sue this godawful website.