TL;DR

Another changelog is out, showing off massive improvements to the speed of the procedural generation system.

This week…

Watch The Video!

  • Procedural Generation Around Player
  • 10-100x Speedup In Generation
  • Halved Memory Usage

Under The Hood

Procedural Generation Around Player

This one seems like a fairly simple thing to implement, but was actually fairly complex. There is a behaviour which can be attached to an entity which makes it an “Observer”, being an observer makes the procedural generator aware of you and your location and it will pick nodes around observers to subdivide.

Internally each observer finds which node of the world generator it is in which is Subdivided. So for example say we have:

City containing Blocks containing Building containing Rooms

If you’re in a building (and no rooms have yet generated) then the system keeps a pointer to the building you’re in. If you leave the building then it moves up to the city node and then tries to search down for a new building. Wherever an observer is it keeps a list of nodes which are descendants of the node you are in which have not yet subdivided. When the system wants to subdivide a new node it takes the first item from this list.

Once you have this setup prioritisation becomes simple - just sort the list by priority! The sort order is:

    private static float CostFunction(Vector3 positionOfObserver, ISubdivisionContext node)
    {
        var distance = (positionOfObserver - node.BoundingBox.ClosestPoint(position)).Length();
        var volume = (float)Math.Pow(node.BoundingBox.Volume(), 0.333f);
        return distance / volume;
    }

Further nodes are less significant, and larger nodes are more significant. Pretty simple.

10-100x Speedup

The majority of this speedup comes from batching operations. Let’s say we’re generating a staircase in a script, you could do something like this:

for (var i = 0; i < steps; i++)
{
    geometry.Add(GenerateStep());
}

So you just added a load of steps into the world, nice and simple, easy to read code - Great! Except you just issued tens of commands to the engine, and for every single command the engine has to do a load of work to update the world:

  1. Load Chunk (potentially off disk)
  2. Modify Geometry Of Chunk
  3. Generate A New Mesh For Graphics
  4. Generate A New Mesh For Physics
  5. Send Graphics Mesh To Renderer
  6. Send Physics Mesh To Physics Engine

That’s quite a lot. Even worse there are four threads involved in this process (world generation, geometry streaming, rendering, physics) and this code synchronizes with all of them at various points which stalls things.

The obvious solution to this problem: issue less commands. Remember, we’re just adding geometry and addition is commutative, so at the moment we’re doing:

world = world + step
world = world + step
world = world + step
world = world + step
world = world + step

What if we did:

buffer = buffer + step
buffer = buffer + step
buffer = buffer + step
buffer = buffer + step
buffer = buffer + step
world = world + step

Mathematically this is the same thing, but now we’re only interacting with the “world” once which is where the real cost is. This is exactly what the new system does.