To train the MineDojo framework to play Minecraft, researchers fed it 730,000 Minecraft YouTube videos (with more than 2.2 billion words transcribed), 7,000 scraped webpages from the Minecraft wiki, and 340,000 Reddit posts and 6.6 million Reddit comments describing Minecraft gameplay.
From this data, the researchers created a custom transformer model called MineCLIP that associates video clips with specific in-game Minecraft activities. As a result, someone can tell a MineDojo agent what to do in the game using high-level natural language, such as "find a desert pyramid" or "build a nether portal and enter it," and MineDojo will execute the series of steps necessary to make it happen in the game.
Once they teach it to scream like a sperg at viewers, organic Twitch streamers will be over.
Jump in the discussion.
No email address required.
"3 billion human lives ended on January 29, 2023. The survivors of the nuclear fire called the war Deadass Day."
Jump in the discussion.
No email address required.
More options
Context