@RedAero's comment on 'Giga nerds port DOOM to run on stable diffusion, hallucinates 20 FPS based on user input. :doomdad: :marseydoomguy1:'

Giga nerds port DOOM to run on stable diffusion, hallucinates 20 FPS based on user input. :doomdad:

This is a video of someone playing it. It's 100% generated images @ 20 FPS with only a 3-second "memory" of the previous frames and user input, which is enough to infer literally everything else for long periods of gameplay. There are no polygons or rendering going on, it's literally making shit up as it goes along based on the model's neural network training or some shit blah blah blah

Article w/more videos:

https://gamengen.github.io/

Diffusion Models Are Real-Time Game Engines

Full PDF Paper:

https://arxiv.org/pdf/2408.14837

ABSTRACT:

We present GameNGen, the first game engine powered entirely by a neural model that enables real-time interaction with a complex environment over long trajectories at high quality. GameNGen can interactively simulate the classic game DOOM at over 20 frames per second on a single TPU. Next frame prediction achieves a PSNR of 29.4, comparable to lossy JPEG compression. Human raters are only slightly better than random chance at distinguishing short clips of the game from clips of the simulation. GameNGen is trained in two phases: (1) an RL-agent learns to play the game and the training sessions are recorded, and (2) a diffusion model is trained to produce the next frame, conditioned on the sequence of past frames and actions. Conditioning augmentations enable stable auto-regressive generation over long trajectories.
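For a feel of what "a PSNR of 29.4" means, here's a minimal sketch of the standard PSNR formula, 10 * log10(MAX^2 / MSE). It assumes 8-bit pixel values (MAX = 255) and flat lists of pixel intensities; the function names are made up for illustration and this is not the paper's evaluation code.

```python
import math


def psnr(reference, generated, max_value=255.0):
    """Peak signal-to-noise ratio (dB) between two equal-length pixel sequences."""
    if len(reference) != len(generated) or not reference:
        raise ValueError("inputs must be non-empty and the same length")
    # Mean squared error over all pixels.
    mse = sum((r - g) ** 2 for r, g in zip(reference, generated)) / len(reference)
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)


def rmse_for_psnr(psnr_db, max_value=255.0):
    """Invert the formula: the root-mean-square pixel error implied by a PSNR."""
    return max_value / (10 ** (psnr_db / 20))


if __name__ == "__main__":
    print(psnr([100] * 4, [110] * 4))  # every pixel off by 10 levels
    print(rmse_for_psnr(29.4))         # error level implied by 29.4 dB
```

Under the 8-bit assumption, 29.4 dB works out to a root-mean-square error of roughly 8.6 intensity levels out of 255 per pixel.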

(...)

Summary. We introduced GameNGen, and demonstrated that high-quality real-time game play at 20 frames per second is possible on a neural model. We also provided a recipe for converting an interactive piece of software such as a computer game into a neural model.

Limitations. GameNGen suffers from a limited amount of memory. The model only has access to a little over 3 seconds of history, so it's remarkable that much of the game logic is persisted for drastically longer time horizons. While some of the game state is persisted through screen pixels (e.g. ammo and health tallies, available weapons, etc.), the model likely learns strong heuristics that allow meaningful generalizations. For example, from the rendered view the model learns to infer the player's location, and from the ammo and health tallies, the model might infer whether the player has already been through an area and defeated the enemies there. That said, it's easy to create situations where this context length is not enough. Continuing to increase the context size with our existing architecture yields only marginal benefits (Section 5.2.1), and the model's short context length remains an important limitation. The second important limitation are the remaining differences between the agent's behavior and those of human players. For example, our agent, even at the end of training, still does not explore all of the game's locations and interactions, leading to erroneous behavior in those cases.
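The memory limit described above can be sketched as a sliding window over past frames and actions. This is only a toy, not the paper's architecture: `predict_next` stands in for the diffusion model and `run_autoregressive` is a hypothetical helper. At 20 FPS, "a little over 3 seconds" is a little over 60 frames of context.

```python
from collections import deque


def run_autoregressive(predict_next, first_frame, actions, context_len):
    """Generate frames one at a time from a bounded history of (frame, action) pairs."""
    # deque(maxlen=...) silently drops the oldest entry once the window is full,
    # which is exactly the "forgetting" the limitation describes.
    history = deque([(first_frame, None)], maxlen=context_len)
    frames = [first_frame]
    for action in actions:
        frame = predict_next(list(history), action)
        history.append((frame, action))
        frames.append(frame)
    return frames


def toy_model(history, action):
    """Stand-in 'model': reports how many past steps it can actually see."""
    return len(history)


if __name__ == "__main__":
    # With a 3-step window the model never sees more than 3 steps of history,
    # no matter how long the session runs.
    print(run_autoregressive(toy_model, "start", ["fwd"] * 5, context_len=3))
```

Anything that has scrolled out of the window can only survive if it is still visible in the recent frames themselves, like the ammo and health counters mentioned in the paper.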

!oldstrags !g*mers @pizzashill

In AI Nvidia future, game plays you :marseycool:

Spingebill spunch/bop 5mo ago #6936887

Can't wait for 5 years from now when every single game looks like this and every voice actor has monotone awkwardly-delivered lines

85 Context

Grue check/bio I will have sexual intercourse with you in exchange for fiat currency Spingebill 5mo ago #6936891

They're already beating the monotone with smarter implementations, five years from now it'll be indistinguishable

62 Context

Fatfungus when/the Unironically has watched vtubers Grue 5mo ago #6937042

This is something I don't get about downplayers. Is it that unreasonable to believe that current results are the worst that it's going to be, and that it will probably improve far from the average of today?

It's not going to cure cancer but this shit is wild

21 Context

RedAero me/mine Ironic effortposting is still effortposting Fatfungus 5mo ago #6940302

It's unreasonable to believe that anything develops exponentially. You're seeing the easy early achievements and for some reason extrapolating even easier further gains, which is nonsense. Everything is on a sigmoid curve, not an exponential one.

2 Context

Grue check/bio I will have sexual intercourse with you in exchange for fiat currency Fatfungus 5mo ago #6937058

>It's not going to cure cancer but this shit is wild

Funny you mention that

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9955430/

18 Context

garlicdoors ahoy/hoy :marseypop2:

Grue 5mo ago #6937016

Frfr I get some skepticism about AI but thinking they can't achieve realistic voices soon is :#marseyemojilaugh:

57 Context

CrackerStraggot BI/POC boomermonster

garlicdoors 5mo ago #6937030

They're already good enough for NPC voicelines. Yes milord? Ready to work.

39 Context

gaybowser Jeb/Bush Medically Certified r-slur CrackerStraggot 5mo ago #6937290

AI is 100% the future of npc dialog. We might finally be able to get npcs that actually react to what's happening instead of repeating the same 10 voicelines

34 Context

lalilulelo s/a/d gaybowser 5mo ago #6937619

AI NPC dialogue has the potential to be really cool. Imagine being able to neg a shopkeeper into selling at a lower price, or something

16 Context

McCoxmaul they/them It's true! lalilulelo 5mo ago #6938111

The first publisher with a game that does this will patent it and freeze progress for 20 years.

15 Context

ikitomi they/them McCoxmaul 5mo ago #6938691

They already have been experimenting with it in smaller games and nexon talked about trying to implement it for Blue archive.

I still think it's hilarious that character ai is basically the only ai company in the green (well that and the ones where the ai is just Filipinos and Indians)

Frank_Williams triple/g :marseyfranklin:

ikitomi 5mo ago #6939127

>he doesn't know

1 Context

Fatfungus when/the Unironically has watched vtubers lalilulelo 5mo ago #6937748

Or simply react to your actions that they saw you doing. Like it's funny summoning lightning in front of one dude and their dialogue is "shucks, nothing's going on in this town amirite", but imagine they go "dayum boi, what the frick was that"

I will drop my jaw when you can just set 'this npc is backwards hillbilly archetype' and have them generate a generic appropriate response to what they see you doing in game

7 Context

Zizo we/wuz male feminist of God Fatfungus 5mo ago #6939147

I think the hardest part will be getting the AI to not reference anything from real life. Imagine you're playing a fantasy game and an npc just goes "THAT DRIPPS HARDER THAN KANYE AT THE R*PESSIEUM!"

6 Context

Tomfoolery Helic/opter Zizo 5mo ago #6940533

"Trump has my whole village's vote!" - Skyrim :marseydovahkiin: nord.

4 Context

Ye_West Kan/YE George Bush Doesn't Like Black People :marseyyeezus:

Zizo 5mo ago #6944444

what

RedAero me/mine Ironic effortposting is still effortposting gaybowser 5mo ago #6940301

Maybe to generate a set of outputs, but I don't think it'll ever be fully dynamic and open-ended, if for no other reason than liability. Imagine the Hot Coffee mod controversy, but with prompts instead.

gaybowser Jeb/Bush Medically Certified r-slur RedAero 5mo ago #6941706

Maybe one day we'll get AAA games that aren't completely cucked. I want npcs that call me slurs when I hit them

Assy-McGee big/guy Nova Scotia Duck Toller Stan CrackerStraggot 5mo ago #6937207

Zug zug

12 Context

DWHITE___________DYNAMITE DWHITED/YNAMITE :b:

_________________________________________________________________________ :l:

Assy-McGee 5mo ago #6937651

dabu!

5 Context

Bluejay Blue/jay Bluejay DWHITE___________DYNAMITE 5mo ago #6939894

Tasta my brade!

tempest me/me Assy-McGee 5mo ago #6938316

Wololo wololo

3 Context

Linux GNU/Linux OmegaSperg garlicdoors 5mo ago #6937564

Rare marsey

:#marseynew:

Linux 5mo ago #6937641

Not for long :#marseyemojilaugh:

9 Context

HarryTrumanDorisDay allo/allo :marseyfug:

you should listen to debussy Grue 5mo ago #6937136

They're already beating voice actors, very distinguishable and better.

BrasilIguana Bra/zil DM betting ideas :marseybrasileirolove:

Spingebill 5mo ago #6937033

This is a bit of a cope, 5 years ago generative ai images were smudgy shadows that kind of looked like the prompt if you knew what the prompt was, now you can generate photorealistic porn of real people and eldritch abominations that can fool grandma on your consumer-grade PC.

We are already getting cartoon characters singing meme songs that sound convincingly like the voice actors, in 5 years generative ai voice is going to be indistinguishable from a real voice actor to general audiences.