What does "world" mean here? How does the spatiality fit into some latent space? Or what constitutes the "world"?
If the answer is, there is none, the world is just frames of video and any consistency quickly blurs out after a few seconds. That's not a world generation, that's just generation of video frames following frames. Not that it isn't cool, but it has almost zero usability for generating a "world" simulation. The key to a realistic world is that you can reliably navigate it. Visit and revisit places. If you modify anything, those modifications are persisted. If you leave a room and re-enter it hours later, the base expectation is that the same objects are in that room.
Wouldn't a working approach be to just create a really low resolution 3D world in the traditional "3D game world" sense to get the spatial consistency. Then this crude map with attributes is fed into frame generation to create the resulting world? It wouldn't be infinite, but on the other hand no one has a need for an infinite world either. A spherical world solves the border issue pretty handily. As I understood it, there was some element of that in the new FS2024 (discussed yesterday on HN).
I'm into VR and mixed reality, and I think this is headed to making the Holodeck real in an immersive way. That's the concept of the Matrix and what they are demoing, just in 2d.
I am guessing the main thing holding this stuff back in terms of fidelity and consistency or generalization is just compute. But the new techniques they have here have just dramatically lowered the compute costs and increased the generalization.
Maybe just something like the giant Cerebras SRAM chips will get to the next 10 X in scale that smooths this out and pushes it closer to Star Trek. Or maybe some new paradigm like memristors.
But I'm looking forward to within just a few years being able to put on some fairly comfortable mixed reality glasses and just asking for whatever or whoever I want to appear in my home (for example) according to my whim.
Or, train it on a lot of how-to videos such as cooking. It just materializes an example of someone showing you exactly what you need to do right in your kitchen.
Here's another crazy idea: train on videos and interactions with productivity applications rather than games. In the future, for small businesses, we skip having the AI generate source code and just describe how the application works. The data and program state are just stored in a giant context window, and the application functionality changes the instant you make a request.
This is surely really cool. Just a bit sad that, as phrased by the authors, the "First Real-Time" virtual world created for the demo is a fat & fast SUV driving on virgin lands.
Prediction: in 20 years, I’m going to be reading about some dude who wrote a program to drive the car continuously until it ran into some surreal edge condition, and finally hit it. There will be a subculture of “matrix glitchers” who spend much of their time doing these kinds of experiments.
People have been doing that with Minecraft for over a decade. In the old days, once you got far away enough, the terrain generation would go haywire. Lots of videos from that time period of people exploring the "edge of the world".
Personally, these were the kind of glitches which made games feel magical and "real" to me as a kid. Being able to analyze a system by breaking it made it seem so much more tangible, like an actual place I had an NTSC-sized porthole into.
Ha! I remember being either 5 or 6 when my uncle showed me Minus World and it blowing my mind. That might have actually been my first exposure to "backrooms" glitches like that. What an amazing glitch. It even worked on my combo Super Mario Bros / Duck Hunt cartridge
MissingNo. is another good example. I have fond memories spending untold hours in my favorite game engines trying to break free. The Jak and Daxter series were some of my favorite to break, due to the uniqueness and flexibility of the engine and the weird ways that the chunk loading system could be broken.
That community already exists because the current version of these types of AI game engines are constantly running into a surreal edge condition since they don't track things consistently when they go off frame.
I’m really excited for where this is going. From the demo videos, it seems to be a step up from Oasis, which itself came out only 2 weeks ago. I expect to see a lot of innovative use cases in this field
Since this is from a chinese company/developer, having such an interesting concept/implementation, still getting just 2 comments.. whereas projects far less important or impactful get much more. This isn't the first time I have observed this bias.
I know the comments will try to justify this with well we don;t have a playable demo or code, but that still doesn't negate what I've said. The bias is there.
• 6 had zero comments, 1 had only my own comment
• 1 had 7
• 1 had 17
• 1 had 148
I have no reason to think there's a nationality thing here, stuff just falls off the top fast and most people don't comment or upvote… same as with most comments themselves.
Really don't think that's the case. I don't care who makes things at first; I first want to see if they are interesting and then maybe dig deeper. Looks cool, upvoted and bookmarked to wait for the playable demo.
It's posted at midnight on Thursday (eastern time).
It's mobile unfriendly, hard to read, and has no videos. The other models had playable demos and videos, and they were posted in the middle of the day so we could think about it during work.
The hype wave for this stuff is going to require bigger splashes for each new model. New image-to-3D models garner a yawn, and it's going to be the same here soon.
These folks put a lot of thought into their branding (and CSS), but they kind of let the excitement fizzle as there's nothing to look at and evaluate. We just have to trust that they did things? It's a bunch of pictures of a car and green text.
It's far too late to open the paper.
Basically they just don't excel at marketing. 3/10.
Edit: I had no idea this was Chinese until you said it. The page doesn't mention names at top, and it didn't suck me into the paper.
What does "world" mean here? How does the spatiality fit into some latent space? Or what constitutes the "world"? If the answer is, there is none, the world is just frames of video and any consistency quickly blurs out after a few seconds. That's not a world generation, that's just generation of video frames following frames. Not that it isn't cool, but it has almost zero usability for generating a "world" simulation. The key to a realistic world is that you can reliably navigate it. Visit and revisit places. If you modify anything, those modifications are persisted. If you leave a room and re-enter it hours later, the base expectation is that the same objects are in that room.
Wouldn't a working approach be to just create a really low-resolution 3D world in the traditional "3D game world" sense to get the spatial consistency, and then feed this crude map with attributes into frame generation to create the resulting world? It wouldn't be infinite, but on the other hand no one needs an infinite world either, and a spherical world solves the border issue pretty handily. As I understood it, there was some element of that in the new FS2024 (discussed yesterday on HN).
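Roughly, the split would look like this. A minimal sketch, assuming a generator that accepts a conditioning image; CoarseWorld, render_conditioning, and frame_generator are hypothetical names, not the paper's method:

    import numpy as np

    class CoarseWorld:
        """Low-resolution persistent 3D map: voxel occupancy plus per-voxel attributes."""
        def __init__(self, size=(256, 64, 256)):
            self.occupancy = np.zeros(size, dtype=bool)
            self.material = np.zeros(size, dtype=np.uint8)  # e.g. 0 = air, 1 = grass, 2 = road

    def render_conditioning(world, camera_pose):
        """Rasterize the crude map from the camera pose into a coarse depth + semantics image.
        Placeholder: a real version would ray-march the voxel grid."""
        h, w = 90, 160
        return np.zeros((h, w, 2), dtype=np.float32)  # channel 0: depth, channel 1: material id

    def generate_frame(frame_generator, world, camera_pose, prev_frame):
        # Spatial consistency comes from the persistent map; the learned generator
        # only adds detail on top of geometry it cannot contradict.
        cond = render_conditioning(world, camera_pose)
        return frame_generator(prev_frame, cond)

The crude map is cheap to keep in memory forever, so revisiting a place reconditions the generator on the same geometry.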
What you really want is a storytelling engine informing the world model, and the humans need to generate power.
I'm into VR and mixed reality, and I think this is headed toward making the Holodeck real in an immersive way. That's the concept of the Matrix, and it's what they're demoing here, just in 2D.
I'm guessing the main thing holding this stuff back in terms of fidelity, consistency, and generalization is just compute. But the new techniques here have dramatically lowered the compute cost and improved generalization.
Maybe something like the giant Cerebras SRAM chips will deliver the next 10x in scale that smooths this out and pushes it closer to Star Trek. Or maybe some new paradigm like memristors.
But I'm looking forward to, within just a few years, being able to put on some fairly comfortable mixed-reality glasses and simply ask for whatever or whoever I want to appear in my home (for example), according to my whim.
Or train it on a lot of how-to videos, such as cooking: it just materializes an example of someone showing you exactly what you need to do, right in your kitchen.
Here's another crazy idea: train on videos of, and interactions with, productivity applications rather than games. In the future, for small businesses, we skip having the AI generate source code and just describe how the application works. The data and program state are stored in a giant context window, and the application's functionality changes the instant you make a request.
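A rough sketch of that loop, assuming some generic large-context model endpoint; call_model and handle_request are made-up names, not a real API:

    import json

    def call_model(prompt: str) -> str:
        """Hypothetical stand-in for a large-context LLM call (not a real API)."""
        raise NotImplementedError

    def handle_request(app_state: dict, user_request: str) -> dict:
        # The "application" is just the model plus its context: program state is
        # carried along as data, and behaviour changes the moment the request does.
        prompt = (
            "You are acting as a small-business app. Current state (JSON):\n"
            + json.dumps(app_state) + "\n"
            + "User request: " + user_request + "\n"
            + "Return the updated state as JSON only."
        )
        return json.loads(call_model(prompt))

This only captures the "describe it and it changes" idea; reliability and data durability are left entirely to the model and its context.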
This is really cool, for sure. Just a bit sad that, as phrased by the authors, the "First Real-Time" virtual world created for the demo is a fat & fast SUV driving across virgin land.
Prediction: in 20 years, I’m going to be reading about some dude who wrote a program to drive the car continuously until it ran into some surreal edge condition, and finally hit it. There will be a subculture of “matrix glitchers” who spend much of their time doing these kinds of experiments.
People have been doing that with Minecraft for over a decade. In the old days, once you got far enough away, the terrain generation would go haywire; there are lots of videos from that period of people exploring the "edge of the world".
Personally, these were the kinds of glitches that made games feel magical and "real" to me as a kid. Being able to analyze a system by breaking it made it seem so much more tangible, like an actual place I had an NTSC-sized porthole into.
Cf. the "Minus World" in Super Mario Bros. for the NES.
https://en.wikipedia.org/wiki/Minus_World
Ha! I remember being either 5 or 6 when my uncle showed me the Minus World, and it blew my mind. That might actually have been my first exposure to "backrooms"-style glitches like that. What an amazing glitch. It even worked on my combo Super Mario Bros. / Duck Hunt cartridge.
MissingNo. is another good example. I have fond memories of spending untold hours in my favorite game engines trying to break free. The Jak and Daxter series was one of my favorites to break, due to the uniqueness and flexibility of the engine and the weird ways the chunk-loading system could be broken.
That community already exists, because the current versions of these AI game engines constantly run into surreal edge conditions: they don't track things consistently once they go off frame.
I'm really excited about where this is going. From the demo videos, it seems to be a step up from Oasis, which itself came out only two weeks ago. I expect to see a lot of innovative use cases in this field.
unreadable website
> Click to play
Clicking - nothing works.
Click to play [...the video]
Someone should ban "AI" articles on Hacker News.
But then there would be no articles!
There's a very clear bias in the Hacker News community.
Since this is from a Chinese company/developer, it's still getting just 2 comments despite having such an interesting concept/implementation, whereas projects far less important or impactful get much more. This isn't the first time I've observed this bias.
I know the comments will try to justify this with "well, we don't have a playable demo or code", but that still doesn't negate what I've said. The bias is there.
Here's my list of submissions: https://news.ycombinator.com/submitted?id=ben_w
Of those ten:
• 6 had zero comments, 1 had only my own comment
• 1 had 7
• 1 had 17
• 1 had 148
I have no reason to think there's a nationality thing here, stuff just falls off the top fast and most people don't comment or upvote… same as with most comments themselves.
Really don't think that's the case. I don't care who makes things at first; I first want to see if they are interesting and then maybe dig deeper. Looks cool, upvoted and bookmarked to wait for the playable demo.
See also: timezones
Lol, I had no idea it's Chinese, nor did I care which country produced it. Nationalism in any form is repulsive.
I was intrigued by the post but couldn't get anything to play.
Yeah, no.
It was posted at midnight on Thursday (Eastern time).
It's mobile-unfriendly, hard to read, and has no videos. The other models had playable demos and videos, and they were posted in the middle of the day so we could think about them during work.
The hype wave for this stuff is going to require a bigger splash for each new model. New image-to-3D models already garner a yawn, and the same will happen here soon.
These folks put a lot of thought into their branding (and CSS), but they let the excitement fizzle because there's nothing to look at and evaluate. We just have to trust that they did things? It's a bunch of pictures of a car and green text.
It's far too late to open the paper.
Basically, they just don't excel at marketing. 3/10.
Edit: I had no idea this was Chinese until you said so. The page doesn't mention names at the top, and it didn't suck me into the paper.
Love you meatboner