This is video of someone playing it. It's 100% generated images at 20 FPS, with only a 3-second "memory" of previous frames and user input, which is enough to infer literally everything else for long periods of gameplay. There are no polygons or rendering going on; it's literally making shit up as it goes along based on what the model's neural network learned in training.
Article w/more videos:
https://gamengen.github.io/
Diffusion Models Are Real-Time Game Engines
Full PDF Paper:
https://arxiv.org/pdf/2408.14837
ABSTRACT:
We present GameNGen, the first game engine powered entirely by a neural model that enables real-time interaction with a complex environment over long trajectories at high quality. GameNGen can interactively simulate the classic game DOOM at over 20 frames per second on a single TPU. Next frame prediction achieves a PSNR of 29.4, comparable to lossy JPEG compression. Human raters are only slightly better than random chance at distinguishing short clips of the game from clips of the simulation. GameNGen is trained in two phases: (1) an RL-agent learns to play the game and the training sessions are recorded, and (2) a diffusion model is trained to produce the next frame, conditioned on the sequence of past frames and actions. Conditioning augmentations enable stable auto-regressive generation over long trajectories.
(...)
Summary. We introduced GameNGen, and demonstrated that high-quality real-time game play at 20 frames per second is possible on a neural model. We also provided a recipe for converting an interactive piece of software such as a computer game into a neural model.
Limitations. GameNGen suffers from a limited amount of memory. The model only has access to a little over 3 seconds of history, so it's remarkable that much of the game logic is persisted for drastically longer time horizons. While some of the game state is persisted through screen pixels (e.g. ammo and health tallies, available weapons, etc.), the model likely learns strong heuristics that allow meaningful generalizations. For example, from the rendered view the model learns to infer the player's location, and from the ammo and health tallies, the model might infer whether the player has already been through an area and defeated the enemies there. That said, it's easy to create situations where this context length is not enough. Continuing to increase the context size with our existing architecture yields only marginal benefits (Section 5.2.1), and the model's short context length remains an important limitation. The second important limitation is the remaining difference between the agent's behavior and that of human players. For example, our agent, even at the end of training, still does not explore all of the game's locations and interactions, leading to erroneous behavior in those cases.
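The conditioning scheme described above (next frame predicted from a short rolling window of past frames and actions) can be sketched roughly like this. `predict_next_frame` is a stand-in for the actual diffusion model, and the 64-frame window is only inferred from "20 frames per second" and "a little over 3 seconds," so treat the numbers as illustrative:

```python
from collections import deque

# Hypothetical stand-in for the diffusion model: given past frames and
# actions, return the predicted next frame. A real model would run a
# conditioned denoising pass here.
def predict_next_frame(past_frames, past_actions):
    return f"frame_after_{past_actions[-1]}"

FPS = 20
CONTEXT_SECONDS = 3.2                        # "a little over 3 seconds"
CONTEXT_FRAMES = int(FPS * CONTEXT_SECONDS)  # 64 frames of history

# Rolling buffers: appending past maxlen silently drops the oldest entry,
# which is exactly the "3-second memory" behavior.
frames = deque(["f0"] * CONTEXT_FRAMES, maxlen=CONTEXT_FRAMES)
actions = deque(["noop"] * CONTEXT_FRAMES, maxlen=CONTEXT_FRAMES)

def step(user_action):
    """Advance the simulation one frame, auto-regressively."""
    actions.append(user_action)
    new_frame = predict_next_frame(list(frames), list(actions))
    frames.append(new_frame)  # oldest frame falls out of the window
    return new_frame
```

Everything outside that window is gone, which is why the model has to reconstruct state (health, ammo, location) from what's still visible on screen.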
!oldstrags !g*mers @pizzashill
In AI Nvidia future, game plays you
Can't wait for 5 years from now when every single game looks like this and every voice actor has monotone awkwardly-delivered lines
They're already beating the monotone with smarter implementations; five years from now it'll be indistinguishable
Frfr I get some skepticism about AI but thinking they can't achieve realistic voices soon is
They're already good enough for NPC voicelines. Yes milord? Ready to work.
AI is 100% the future of NPC dialogue. We might finally be able to get NPCs that actually react to what's happening instead of repeating the same 10 voice lines
AI NPC dialogue has the potential to be really cool. Imagine being able to neg a shopkeeper into selling at a lower price, or something
The first publisher with a game that does this will patent it and freeze progress for 20 years.
They've already been experimenting with it in smaller games, and Nexon talked about trying to implement it for Blue Archive.
I still think it's hilarious that Character.AI is basically the only AI company in the green (well, that and the ones where the "AI" is just Filipinos and Indians)
Or simply react to actions of yours that they saw. Like, it's funny summoning lightning in front of one dude whose dialogue is "shucks, nothing's going on in this town, amirite", but imagine they go "dayum boi, what the frick was that"
I will drop my jaw when you can just set 'this npc is backwards hillbilly archetype' and have them generate a generic appropriate response to what they see you doing in game
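A minimal sketch of what "set this NPC to an archetype" could mean in practice: assemble a persona prompt plus the witnessed event, then hand it to whatever text model the game uses. Every name here is made up for illustration, and the model call itself is omitted:

```python
# Hypothetical archetype-conditioned NPC dialogue. The persona text and
# function names are illustrative, not from any real game or API.

ARCHETYPES = {
    "backwards hillbilly": (
        "You are a superstitious rural farmhand in a low-fantasy village. "
        "Speak in short, folksy sentences. Never reference the real world."
    ),
}

def npc_prompt(archetype: str, witnessed_event: str) -> str:
    # Persona text plus whatever the NPC just saw the player do.
    persona = ARCHETYPES[archetype]
    return (
        f"{persona}\n"
        f"You just saw the player: {witnessed_event}.\n"
        "React in one line of in-character dialogue."
    )

prompt = npc_prompt("backwards hillbilly", "summon lightning from a clear sky")
# `prompt` would then go to whichever text-generation model the game uses.
```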
I think the hardest part will be getting the AI to not reference anything from real life. Imagine you're playing a fantasy game and an npc just goes "THAT DRIPPS HARDER THAN KANYE AT THE R*PESSIEUM!"
"Trump has my whole village's vote!" - Skyrim nord.
what
Maybe to generate a set of outputs, but I don't think it'll ever be fully dynamic and open-ended, if for no other reason than liability. Imagine the Hot Coffee mod controversy, but with prompts instead.
Maybe one day we'll get AAA games that aren't completely cucked. I want npcs that call me slurs when I hit them
Zug zug
dabu!
Tasta my brade!
Wololo wololo
Rare marsey
Not for long
This is something I don't get about downplayers. Is it that unreasonable to believe that current results are the worst that it's going to be, and that it will probably improve far from the average of today?
It's not going to cure cancer but this shit is wild
Funny you mention that
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9955430/
It's unreasonable to believe that anything develops exponentially. You're seeing the easy early achievements and for some reason extrapolating even easier further gains, which is nonsense. Everything is on a sigmoid curve, not an exponential one.
They're already beating voice actors, very distinguishable and better.
This is a bit of a cope. Five years ago, generative AI images were smudgy shadows that kind of looked like the prompt if you already knew what the prompt was; now you can generate photorealistic porn of real people and eldritch abominations that can fool grandma, on a consumer-grade PC.
We're already getting cartoon characters singing meme songs that sound convincingly like their voice actors; in 5 years, generative AI voices will be indistinguishable from a real voice actor to general audiences.
Post some early-AI kino for us? I kinda miss it
Not super early but from when it was just starting to become impressive
8 years ago. This guy is actually a good editor too; most AI stuff from that era is less cool than this.
Notice it's all weird dogs over and over again; that's how a lot of AI image gen looked at the time.
The dweams of a nyewbworn cwomputer gwod
Ah deep dream, that was good shit
Already happening
So sad the artist was put on this shit instead of using that epic comic book style to make another Freedom Force game.
THE FUTURE IS HERE
I'll take a drop in FPS to make voice actors seethe.
That's been anime for 20 years now
I figured this was the future nvidier wanted. This will replace graphics and be viable on consumer hardware before full path-traced rendering is.
But this is just making an AI convincingly recreate existing graphics; the only reason it can keep track of anything is because it got fed a lot of pictures of an already existing level
I mean, with the 3D world and game logic underneath.
You could path trace a bunch of screenshots, with the camera randomly oriented at any point the player could reach, as training data.
Then the real-time engine would give a crude frame that has all the important info about where you are and what's onscreen.
That would be img2img'd into a photorealistic final render.
Potentially much faster than real graphics,
and more generalizable. They did a demo once where the polygons were just labeled and it made up the textures.
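The hybrid idea above (cheap labeled render, then neural re-skin) might look something like this in outline. Every function here is hypothetical, a sketch of the data flow rather than a real engine or diffusion API:

```python
# Sketch of "crude engine frame -> img2img re-skin". All functions are
# hypothetical stand-ins for a conventional renderer and a diffusion model.

def render_crude_frame(world_state: dict) -> dict:
    # Cheap conventional pass: semantic labels and camera pose, no lighting.
    # This frame carries the ground truth about where you are and what's onscreen.
    return {"labels": world_state["visible"], "camera": world_state["camera"]}

def neural_reskin(crude_frame: dict, style: str = "photorealistic") -> dict:
    # Stand-in for an img2img diffusion call conditioned on the crude frame,
    # which would "make up the textures" for the labeled geometry.
    return {"style": style, **crude_frame}

def render(world_state: dict) -> dict:
    return neural_reskin(render_crude_frame(world_state))

frame = render({"visible": ["wall", "imp"], "camera": (0.0, 0.0, 90.0)})
```

The design point is that the crude frame, not the neural net, is responsible for game state, so the model only has to hallucinate appearance, never logic.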
Let me rephrase it as a simpler argument. With low frame rates like this, and the inherent latency of cloud computing, you would only be able to make a single-player game, completely eliminating the prospect of long-term income, the same long-term income required to pay for the cloud access your game needs to run properly. The finances simply can't line up.
This tech is literally brand new.
It's insane how much functionality my 6-year-old GPU has gained since I bought it.
And none of that is in the ray tracing department;
that's not getting any faster, but AI inferencing is, on the same hardware.
I think this could hit 144 fps before RT does, on future hardware.
First off, no lol, machine learning isn't anything new.
And besides, this isn't even a question of how new the tech is; it's an inherent feature of neural networks. They only get more and more demanding as the tech becomes more advanced; such is the nature of machine learning as a whole.
With a 1500 ms delay or more, do you just not understand the concept of latency from calling back to a central server? That's Nvidia's actual business model, btw: supplying GPUs for AI hypercomputers, as they're called. In addition, the researchers themselves said this could maybe reach 50 fps with model optimization; 144 fps is absurd. Images are massive compared to the text produced by, say, GPT-4, which still needs massive computers to call back to for respectable results.
It's literally not the same hardware. Ray tracing is done by a single board; "AI inferencing" is done by thousands of boards, each as strong as one capable of ray tracing. The hardware used for something like this is not consumer grade, which is also stated in the paper you clearly did not read.
This is pretty much impossible. You have to consider that the process only gets more bottlenecked as it goes on. It's not your little online image gen that shits out five pictures after a few seconds, which could all be made simultaneously; it can only make a new frame once the prior frame has been rendered.
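The sequential-frame point above can be put in back-of-envelope numbers (illustrative arithmetic only, not measurements): because frame N conditions on frame N-1, per-frame model latency alone caps the frame rate no matter how much hardware runs in parallel, and any cloud round-trip adds on top of that.

```python
# Illustrative arithmetic for the sequential-generation argument.

def max_fps(per_frame_ms: float) -> float:
    # Generation is strictly serial (each frame needs the previous one),
    # so throughput is capped at 1000 / per_frame_ms regardless of how
    # many accelerators are available.
    return 1000.0 / per_frame_ms

def input_latency_ms(per_frame_ms: float, network_rtt_ms: float) -> float:
    # Cloud play adds a network round trip on top of the model's frame time.
    return per_frame_ms + network_rtt_ms

print(max_fps(50))               # 50 ms/frame caps out at 20 fps
print(max_fps(7))                # ~7 ms/frame would be needed for ~144 fps
print(input_latency_ms(50, 80))  # with an assumed 80 ms RTT: 130 ms to respond
```

The 50 ms and 80 ms figures are assumptions chosen to match the thread's 20 fps number, not values from the paper.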
Please keep cooking
Ok, firstly, I don't care about this enough to pretend I do, but I wrote another paragraph anyway.
I mean, yes, machine learning is old.
I'm talking about this recent surge. The future stuff will need more compute, but we're in an era where people are still discovering new leaps that get 2x performance for free. And who knows what undiscovered architectures out there might do it even faster. Ray tracing is solved math.
Second, idk what you're on about with cloud;
I'm talking about what might be possible locally.
My 2070 couldn't generate images when I got it. Now it can.
A year ago, it took minutes for a crusty 512x512;
now it takes 20s for a 1024x of much better quality.
All I'm saying is that a consumer card might be able to do something like this in real time before it's able to do more than a couple samples per pixel of raytrussy. It was a harmless bit of tech optimism and you're mad for no reason. Can we just be happy and silly online?
I mean, no, it would have to be connected to a massive central server to achieve that, which would also add massive latency; that's one thing neural networks will always suffer from. This very simple game was also run on a cloud hypercomputer specifically designed for neural network operations, and it could only achieve 20 fps.
We've already achieved generalizable graphical capability, though; that's the point of shit like Unreal. If you make your own engine, it's because you want something non-generalized by intention, often unnecessarily, like Braid.
~~Detailed physics models for realistically simulating light~~ Making shit up.
Why?
TPU
Once again proof that Doom runs on anything
a shotgunner materialises in front of you and shoots you, healing 16 damage
wat do?
This is pretty awesome honestly even if it's useless. If we still had great liberal thinkers, neural nets would probably cause some revolutionary thoughts in philosophy, art, language, and logic. How insane is it that image prediction can lead to logic?
Agreed. It's so bizarrely simple (in some ways) for how well the prediction works, it just breaks my brain.
It's kind of like full adders in a way
I mean, it's not simply image prediction, it's image prediction based on an absolutely gigantic amount of training data. It's remarkable that the neural network has basically encoded within it navigation within a 3D world, or at least its 2D "shadow" image output, but it's not logic.
Can you fiddle with the output in real time?
Like, can I type commands to get it to spawn the strippers from Duke 3D?
No. Basically, this feeds in the previous frames along with input prompts. Unless the model has seen a lot of Duke Nukem strippers in its training on Doom, it can't generate them; it can't generate a dev console either. Don't get me wrong, this is really cool and exciting, but it's only an infinite game generator for Doom, and even then I doubt it can handle level transitions, let alone game state tracking.
If it's a Stable Diffusion model, it's seen A LOT of strippers
So any potential AI uprising can immediately be ended by instructing the AI to generate Crysis?
Bro this looks exactly like Duke Nukem is this like what
Big if true (I have no idea what any of this means)
Computer doesn't draw images based on its data of what the world looks like. Computer draws next frame based on previous frame while you're pressing buttons.
This!
Reminds me of when I did 600 mg of DPH
Can't wait to beam this stuff straight into my brain computer interface while I'm sleeping.
Shit's gonna be wild next decade
Did it need to be doom? Could you describe a game to it and it would work? I'm kinda confused.
This isn't really useful if you have to make a whole game in the first place and train on 50,000,000 hours of footage just so it can remake one small space very, very quickly.
It doesn't need to be doom. You just need a game and then have a computer play around in it for a long time to get training data.
It does have to be a game that puts enough of its current state on the screen. It's only tracking about 3 seconds of history, so all your important stats need to be on screen constantly, or the AI will make something up from context: "oh, you're in the first room, you must have high health"; "oh, you took damage in the boss room, you must be almost dead".
Well, the health is on screen at all times in Doom.
This is textbook defamation. It's literally libel in any state in the US. It's done for no purpose other than to hurt me.
I DMed X (formerly chiobu) to find out if there was any reason that he felt justified in participating in publicly vilifying and mocking mentally ill trains and wishing death on them.
I was trying to give him the benefit of the doubt. That was my mistake.
Take it down or I swear by Allah and the prophet I will sue this godawful website.