World Management
There are a number of optimization techniques and practices related to world management. They are used to decrease the rendering load without losing much image quality.
Levels of Detail
Smooth alpha-blended levels of detail (LODs) are used to decrease the geometric complexity of 3D objects as they move away from the camera, which lightens the load on the renderer.
Lower-detail meshes are created manually by artists and then exported into Unigine as different surfaces of the same mesh. When the virtual world is rendered, these surfaces become visible one by one, smoothly fading into each other: starting with the high-poly mesh up close and ending with the low-poly one seen from far away.
LODs can be switched abruptly or smoothly faded into one another using alpha dithering, if necessary.
Surface LODs are adjusted in the Surfaces tab of the Nodes settings window. It provides the following parameters:
Visibility Distances | The visibility of a LOD is defined by its visibility range, which is set by two parameters: the minimum visibility and maximum visibility distances measured from the camera. If the surface is within the range specified by these bounds, it is displayed; otherwise, it is hidden.
Ranges for surfaces that represent different LODs of the same object should not overlap. |
Fade Distances | Discrete LODs are prone to one noticeable and quite distracting artifact: "popping" when one LOD switches to another. Rather than switching abruptly, LODs can be smoothly blended into each other over a fade distance, making the transition nearly imperceptible.
Smooth LODs with alpha-blend fading are available only when render_alpha_fade is set to 1 (the default). Just like visibility distances, the range of the fading region is defined by the minimum fade and maximum fade distances.
Although alpha blending between LODs looks far better, it also means that within the fade region two versions of the same object are rendered at the same time. For this reason, the fade region should be kept as short as possible. |
Within distance of surface visibility
|
Within fade distance
|
Suppose we have two LODs, high-poly surface_lod_0 and low-poly surface_lod_1, and we need to set up a smooth transition between them.
- We want the switching distance to be 50 units. For that, we need to "dock" the visibility distances of the surfaces to each other:
  - The first LOD surface, surface_lod_0, should always be present when the camera is close to the object, so its minimum visibility distance is set to -inf. Its maximum visibility distance is 50 units from the camera.
  - Directly after it comes the second LOD surface, surface_lod_1. It is visible from 50 units (its minimum visibility distance) up to infinity (its maximum visibility distance is inf).
- Now the LODs are switched, but sharply rather than smoothly. To blend them smoothly, symmetrical fade-out (for the 1st LOD) and fade-in (for the 2nd LOD) distances are set. Let's say the fading region should be 5 units.
  - For the 1st LOD to fade out, its maximum fade distance is set to 5.
  - For the 2nd LOD to fade in, its minimum fade distance is also set to 5.
As a result, the LODs will change as follows:
From the bounding box (BB) of the object to 50 units | Only the 1st LOD surface surface_lod_0 is completely visible |
50 to 55 units | The 1st LOD fades out, while the 2nd LOD fades in |
From 55 units onward | Only the 2nd LOD surface surface_lod_1 is completely visible |
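The way visibility and fade distances combine in this example can be sketched in a few lines of Python. This is only an illustration that matches the ranges listed above, not Unigine's implementation: the function name lod_alpha and its parameters are hypothetical, and the sketch assumes a linear fade that starts at the corresponding visibility bound.

```python
import math

def lod_alpha(distance, min_visible, max_visible, min_fade=0.0, max_fade=0.0):
    """Return the opacity (0..1) of one LOD surface at a given camera distance.

    Assumption: the surface fades in over [min_visible, min_visible + min_fade]
    and fades out over [max_visible, max_visible + max_fade], both linearly.
    """
    if distance < min_visible:
        return 0.0
    if distance < min_visible + min_fade:
        return (distance - min_visible) / min_fade
    if distance <= max_visible:
        return 1.0
    if distance < max_visible + max_fade:
        return 1.0 - (distance - max_visible) / max_fade
    return 0.0

def lod_0(distance):
    # High-poly surface: always visible up close, fades out over 50..55 units.
    return lod_alpha(distance, -math.inf, 50.0, max_fade=5.0)

def lod_1(distance):
    # Low-poly surface: fades in over 50..55 units, visible up to infinity.
    return lod_alpha(distance, 50.0, math.inf, min_fade=5.0)

for d in (10.0, 52.5, 60.0):
    print(d, lod_0(d), lod_1(d))
```

Inside the fade region both surfaces have a non-zero opacity, which is exactly why two versions of the object are rendered there at the same time.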
Reference Object
There is one more LOD-related parameter: the reference object from which the LOD switching distance is measured. It specifies whether the distances should be measured to the surface itself or to one of the surfaces or nodes up the hierarchy branch. There are two reference objects for each surface:
Min Parent | Minimum parent is a reference object to measure the minimum visibility distance from:
|
Max Parent | Maximum parent is a reference object to measure the maximum visibility distance from. It is determined by the same principle. |
Let's take a model of a house, for example. When the camera is close by, the high-poly detail surfaces are seen, such as the door arch, stone corners, round window and roof tiles. When we move away, all these surfaces should be replaced simultaneously by a single combined low-poly LOD surface.
High-poly model
|
Low-poly model to be used as distant LOD
|
The problem is that all these detail surfaces have different bounding boxes. If the distances are measured to the surfaces themselves (0 as min and max parents), LODs of different parts of the house may be turned on (or off) at different moments, because some bounding boxes are closer to the camera than others. This may cause z-fighting artifacts. Here the distant corner has not yet switched to a more detailed LOD, while the close one is drawn twice: as a high-poly corner LOD and at the same time as part of the combined low-poly house LOD.
If we set the bounding box of the whole house as the reference object (min and max parent set to 1), all surfaces will switch simultaneously, no matter which side we approach the house from.
One more option is to use different reference objects for the two bounds. For example, the lower bound (minimum distance) is checked against the surface itself, and the upper bound (maximum distance) is checked against the parent. This may sound complicated, so take a look at the pictures below. The first picture shows a ring, which is split into surfaces according to levels of detail.
Here, surfaces from the rightmost column are displayed when the camera is very close to them. The leftmost surfaces are displayed when the camera is very far from them. Merging several surfaces into one reduces the number of objects to draw and hence the number of DIP calls, which speeds up rendering.
Note that all of the minimum distances here are measured to the surface itself, but almost all of the maximum distances are measured to another reference object, a parent. One more picture helps explain why.
The star represents the camera; what exactly the camera is looking at does not matter here. On both images, the required surfaces are drawn according to the camera position and the distances from the camera to the corresponding reference objects. For example, on the left image, the upper left part of the ring is a single surface, the upper right part is split into two separate surfaces, and the bottom part is also a single surface. On the right image, the whole upper part is divided into the smallest possible sectors.
Here, distances are measured to different reference objects to properly "turn off" smaller single sectors and display a larger sector instead. The maximum distance is calculated to the parent sector, because the distances to the neighboring subsectors may differ too much. The minimum distance is calculated to the current sector, because we need to show it if the camera is close enough to it.
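The effect of the reference object on the house example can be illustrated with a small Python sketch. Everything here is hypothetical (the Bounded class, the helper names and the coordinates) and the fade distances are ignored; the sketch only assumes that a parent value of 0 means the surface itself, 1 means its parent, and that each bound is compared with the distance from the camera to the bounding box of its own reference object.

```python
import math
from dataclasses import dataclass
from typing import Optional, Tuple

Vec3 = Tuple[float, float, float]

@dataclass
class Bounded:
    """Hypothetical surface or node with an axis-aligned bounding box."""
    box_min: Vec3
    box_max: Vec3
    parent: Optional["Bounded"] = None

def distance_to_box(point: Vec3, item: Bounded) -> float:
    """Distance from a point to the item's bounding box (0 if the point is inside)."""
    squared = 0.0
    for p, lo, hi in zip(point, item.box_min, item.box_max):
        gap = max(lo - p, 0.0, p - hi)
        squared += gap * gap
    return math.sqrt(squared)

def reference(item: Bounded, levels_up: int) -> Bounded:
    """Walk levels_up steps up the hierarchy (0 means the item itself)."""
    for _ in range(levels_up):
        if item.parent is None:
            break
        item = item.parent
    return item

def is_visible(item, camera, min_visible, max_visible, min_parent=0, max_parent=0):
    """Check each bound against the distance to its own reference object."""
    d_min = distance_to_box(camera, reference(item, min_parent))
    d_max = distance_to_box(camera, reference(item, max_parent))
    return d_min >= min_visible and d_max <= max_visible

house = Bounded((0, 0, 0), (10, 10, 10))
near_corner = Bounded((0, 0, 0), (1, 10, 1), parent=house)
far_corner = Bounded((9, 0, 9), (10, 10, 10), parent=house)
camera = (-45.0, 5.0, 0.5)

# Measured to the surfaces themselves, the two corners disagree at 50 units.
print(is_visible(near_corner, camera, 0.0, 50.0),
      is_visible(far_corner, camera, 0.0, 50.0))
# Measured to the parent (the whole house), they switch together.
print(is_visible(near_corner, camera, 0.0, 50.0, 1, 1),
      is_visible(far_corner, camera, 0.0, 50.0, 1, 1))
```

Passing different values for min_parent and max_parent gives the mixed setup used for the ring: the lower bound is checked against the sector itself and the upper bound against its parent sector.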
Indoor Space
Often, it is enough to create and display only a part of an artificial world, in the form of a labyrinth-like series of rooms and passages between them. Parts of open space, if they are present and are confined rather than extending to infinity, can also be considered "rooms". These limitations make such a world ideal for being viewed as a set of sectors and portals.
The whole space is partitioned into convex areas called sectors ("rooms"). If there is an opening, such as a door or a window, between two adjacent sectors, through which one sector can be partially seen from the other, this opening is called a portal. Sectors and portals help the renderer determine which areas and objects are visible from any given point in the world. Also, if a neighboring sector is seen through a portal, this portal is used as a viewing frustum for the area it leads to, allowing viewing frustum culling.
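The idea of narrowing the view through each portal can be sketched in 2D (top-down view). This is a simplified illustration, not the engine's algorithm: the sector names, the data layout and the function names are hypothetical, the view is reduced to an angular interval, and the sketch assumes that all portals lie in front of the camera and do not cross the direction where the angle wraps around.

```python
import math

def portal_interval(camera, segment):
    """Angular interval (radians) under which a portal segment is seen from the camera."""
    (x0, y0), (x1, y1) = segment
    a = math.atan2(y0 - camera[1], x0 - camera[0])
    b = math.atan2(y1 - camera[1], x1 - camera[0])
    return min(a, b), max(a, b)

def visible_sectors(sector, camera, view, portals, path=()):
    """Collect sectors reachable through portals that overlap the current view interval."""
    seen = {sector}
    for segment, neighbor in portals.get(sector, []):
        if neighbor in path:
            continue  # do not walk back the way we came
        lo, hi = portal_interval(camera, segment)
        lo, hi = max(lo, view[0]), min(hi, view[1])
        if lo < hi:
            # The portal becomes the (narrower) view for the sector behind it.
            seen |= visible_sectors(neighbor, camera, (lo, hi), portals, path + (sector,))
    return seen

door_ab = ((5.0, -1.0), (5.0, 1.0))
portals = {
    "A": [(door_ab, "B")],
    "B": [(door_ab, "A"),
          (((10.0, 4.0), (10.0, 6.0)), "C"),     # off to the side, hidden by the wall
          (((10.0, -0.5), (10.0, 0.5)), "D")],   # straight ahead, seen through the door
}
camera = (0.0, 0.0)
view = (-math.pi / 4, math.pi / 4)  # 90-degree field of view along +X
print(sorted(visible_sectors("A", camera, view, portals)))
```

Sector C is skipped even though it is inside the camera's field of view, because its portal does not fall within the narrow interval left by the door between A and B.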
Outdoor Space
The techniques that are appropriate for indoor scenes are not efficient when it comes to managing vast landscapes. The rendering speed directly depends on the number of entities and polygons drawn in the scene, as well as on the physics computed for the objects, the count of which is usually very high in outdoor scenes. So the main goal of world management is to render only the regions that are seen while culling all the rest. If the world cannot be narrowed down to a set of closed areas, the approach called space partitioning becomes relevant.
In Unigine, space partitioning is implemented using adaptive axis-aligned BSP trees.
To provide effective management of the scene on the one hand and good tree balancing on the other, separate trees are created for different editor node types:
- World tree handles all the sectors, portals, occluders, triggers and clusters.
- Objects tree includes all the objects except for the ones with collider and clutter flags.
- Collider objects form a separate tree to facilitate collision detection and avoid the worst-case scenario of testing all the objects against all other objects. Clearly, objects can intersect only if they are located in the same region of the scene and overlap there. This tree drastically reduces the number of pair-wise tests and accelerates the calculation.
- Clutter objects are also separated as they are intended to be used in great numbers, which can disturb the balance of the main object tree.
- Light tree handles all the light sources.
- Decal tree handles the decals.
- Player tree handles all the types of players.
- Physical node tree handles all physical forces.
- Sound tree handles all the sound sources.
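Why a tree reduces the number of tests can be shown with a small 2D sketch. Note that it is not the engine's adaptive BSP tree: it is a simplified binary tree with axis-aligned median splits (closer to a bounding volume hierarchy), and all names in it are hypothetical. A region query descends only into branches whose bounds overlap the region, so most objects are never tested individually.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Box = Tuple[Tuple[float, float], Tuple[float, float]]  # ((min_x, min_y), (max_x, max_y))

def overlaps(a: Box, b: Box) -> bool:
    return all(a[0][i] <= b[1][i] and b[0][i] <= a[1][i] for i in range(2))

def union(boxes: List[Box]) -> Box:
    return (tuple(min(b[0][i] for b in boxes) for i in range(2)),
            tuple(max(b[1][i] for b in boxes) for i in range(2)))

@dataclass
class Node:
    bounds: Box
    items: List[Tuple[str, Box]] = field(default_factory=list)  # filled in leaves only
    left: Optional["Node"] = None
    right: Optional["Node"] = None

def build(items, leaf_size=2) -> Node:
    bounds = union([box for _, box in items])
    if len(items) <= leaf_size:
        return Node(bounds, list(items))
    # Split along the longer axis of the bounds, at the median object.
    width = bounds[1][0] - bounds[0][0]
    height = bounds[1][1] - bounds[0][1]
    axis = 0 if width >= height else 1
    ordered = sorted(items, key=lambda it: it[1][0][axis] + it[1][1][axis])
    mid = len(ordered) // 2
    return Node(bounds, [], build(ordered[:mid], leaf_size), build(ordered[mid:], leaf_size))

def query(node: Node, region: Box, out=None) -> List[str]:
    """Names of objects whose boxes overlap the region; whole branches are skipped."""
    if out is None:
        out = []
    if not overlaps(node.bounds, region):
        return out
    if node.left is None:
        out.extend(name for name, box in node.items if overlaps(box, region))
    else:
        query(node.left, region, out)
        query(node.right, region, out)
    return out

objects = [(f"obj{i}", ((i * 10.0, 0.0), (i * 10.0 + 1.0, 1.0))) for i in range(8)]
tree = build(objects)
print(sorted(query(tree, ((18.0, 0.0), (32.0, 1.0)))))
```

The same query can serve as a broad phase for collision detection: only the objects returned for the region around a collider need exact pair-wise tests.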
Mesh Partitioning
Once the editor node level is reached, the mesh itself still needs further partitioning. The division is based on the same principles: the tree must be binary and axis-aligned. The only difference is that these trees are precomputed (they are generated when the world is loaded), because a mesh is a baked object and there is no need for the related trees to change dynamically. The mesh is divided into the following trees:
- Surfaces tree
- Polygon tree
These two mesh-based trees provide the basis for fast intersection and collision calculations with the mesh.
Perspective Projection
When the human eye views a scene, objects in the distance appear smaller than objects close by; this is known as perspective. While orthographic projection ignores this effect to allow accurate measurements, perspective projection shows distant objects as smaller to provide additional realism.
A viewing frustum (or a view frustum) is a field of view of the virtual camera for the perspective projection; in other words, it is the part of the world space that is seen on the screen. Its exact shape depends on the camera being simulated, but often it is simply a frustum of a rectangular pyramid. The planes of the viewing frustum that are parallel to the screen are called the near plane and the far plane.
As the field of view of the camera is not infinite, there are objects that do not get inside it. For example, objects that are closer to the viewer than the near plane will not be visible. The same is true for the objects beyond the far plane, if it is not placed at infinity, or for the objects cut off by the side faces. As such objects are invisible anyway, one can skip their drawing. The process of discarding unseen objects is called viewing frustum culling.
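A minimal sketch of viewing frustum culling for bounding spheres is given below. It is an illustration under stated assumptions rather than the engine's code: the function name and parameters are hypothetical, the camera is assumed to sit at the origin looking along the negative Z axis, and the frustum is assumed to be symmetric. The test is conservative: a sphere near a frustum corner may be reported as visible although it is slightly outside.

```python
import math

def sphere_in_frustum(center, radius, fov_y_deg, aspect, near, far):
    """Return False only if the sphere lies completely outside one frustum plane."""
    x, y, z = center
    depth = -z  # distance in front of the camera
    # Near and far planes.
    if depth + radius < near or depth - radius > far:
        return False
    tan_y = math.tan(math.radians(fov_y_deg) / 2.0)
    tan_x = tan_y * aspect
    for coord, tan_half in ((x, tan_x), (y, tan_y)):
        norm = math.sqrt(1.0 + tan_half * tan_half)
        # Signed distances to the two opposite side planes (positive means inside).
        if (depth * tan_half - coord) / norm < -radius:
            return False
        if (depth * tan_half + coord) / norm < -radius:
            return False
    return True

print(sphere_in_frustum((0.0, 0.0, -10.0), 1.0, 90.0, 1.0, 0.1, 100.0))   # in front of the camera
print(sphere_in_frustum((20.0, 0.0, -10.0), 1.0, 90.0, 1.0, 0.1, 100.0))  # far off to the side
```

Objects for which the function returns False can be skipped entirely, since nothing of them would reach the screen.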
Orthographic Projection
When the human eye looks at a scene, objects in the distance appear smaller than objects close by. Orthographic projection ignores this effect to allow the creation of to-scale drawings for construction and engineering.
With an orthographic projection, the viewing volume is a rectangular parallelepiped, or more informally, a box. Unlike perspective projection, the size of the viewing volume doesn't change from one end to the other, so distance from the camera doesn't affect how large an object appears.
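The difference between the two projections can be expressed with two short formulas. The Python sketch below uses hypothetical function names and assumes a symmetric perspective frustum with a vertical field of view; it returns the fraction of the screen height that an object covers.

```python
import math

def perspective_height(size, distance, fov_y_deg):
    """Screen-height fraction under perspective: shrinks as the distance grows."""
    visible_height = 2.0 * distance * math.tan(math.radians(fov_y_deg) / 2.0)
    return size / visible_height

def orthographic_height(size, view_height):
    """Screen-height fraction under orthographic projection: distance plays no role."""
    return size / view_height

print(perspective_height(2.0, 10.0, 60.0), perspective_height(2.0, 20.0, 60.0))
print(orthographic_height(2.0, 20.0))
```

Doubling the distance halves the apparent size in the perspective case, while the orthographic result does not depend on the distance at all.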
Occlusion Culling
Another popular practice is to remove objects that are completely hidden by other objects. For example, there is no need to draw a room behind a blank wall or flowers behind an entirely opaque fence. This technique is called occlusion culling. Sectors with portals and potentially visible sets are particular cases of occlusion culling. The former are described above. The latter divide the space into a number of regions, each with a precomputed set of polygons that can be visible from anywhere inside that region. Then, at run time, the renderer simply looks up the precomputed set for the current view position. This technique is usually used to speed up binary space partitioning.
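The run-time side of a potentially visible set is just a table lookup, as the following sketch shows. The grid layout, the cell size and the object names are made up for illustration, and whole objects stand in for polygons; computing the sets themselves is the expensive offline step and is not shown.

```python
import math

CELL_SIZE = 10.0  # hypothetical size of one square region

# Precomputed offline: for every region, what can be seen from anywhere inside it.
VISIBLE_SETS = {
    (0, 0): {"room_a", "corridor"},
    (1, 0): {"corridor", "room_b"},
}

def region_of(position):
    """Map a 2D view position to the grid cell (region) that contains it."""
    return (math.floor(position[0] / CELL_SIZE), math.floor(position[1] / CELL_SIZE))

def potentially_visible(position):
    """Look up the precomputed set; unknown regions yield an empty set."""
    return VISIBLE_SETS.get(region_of(position), set())

print(sorted(potentially_visible((3.0, 4.0))))
print(sorted(potentially_visible((12.0, 1.0))))
```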
Hardware Occlusion Queries
Another way to cull geometry that is not visible in the camera viewport is to use a hardware occlusion query. It reduces the number of rendered polygons and thereby increases performance. To run the hardware occlusion test for the scene before sending data to the GPU, set the Rendering -> Features -> Occlusion query flag. In this case, culling will be performed for all objects with the Culled by occlusion query flag set in the Nodes window.
When culling is enabled for an object, an occlusion query box is rendered for it. Its size coincides with the size of the object's bounding box. If the occlusion query box is visible in the camera viewport, the object is rendered; otherwise, it is not.
Hardware occlusion queries should be used only for a few objects that use heavy shaders. Otherwise, performance will decrease instead of increasing. It is recommended to enable queries for water or objects with reflections.
Asynchronous Data Streaming
Data streaming is an optimization technique in which not all the data is loaded into random access memory (RAM) at once. Instead, only the required data is loaded, and all the rest is loaded progressively, on demand.
In Unigine, asynchronous data streaming is enabled by default. With data streaming, the following data is loaded into RAM asynchronously:
- All textures of the materials.
- ObjectMeshStatic, ObjectMeshClutter, ObjectMeshCluster.
Procedurally generated objects such as ObjectMeshClutter and ObjectGrass are generated in a separate thread, which significantly reduces the performance cost.
Keep in mind that asynchronous data streaming does not affect the transfer of meshes and textures to the GPU: they are transferred in the main thread.
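The split between background loading and main-thread transfer can be sketched as follows. This is a generic illustration and not the engine's streaming system: the Streamer class, load_from_disk and the resource names are hypothetical, and a Python dictionary stands in for GPU memory.

```python
from concurrent.futures import ThreadPoolExecutor

def load_from_disk(name):
    """Hypothetical stand-in for reading and decoding a mesh or texture file."""
    return f"<data of {name}>"

class Streamer:
    def __init__(self, workers=2):
        self._pool = ThreadPoolExecutor(max_workers=workers)
        self._pending = {}   # name -> future running in a background thread
        self.resident = {}   # name -> data handed over in the main thread

    def request(self, name):
        """Start loading on demand; returns immediately without blocking."""
        if name not in self.resident and name not in self._pending:
            self._pending[name] = self._pool.submit(load_from_disk, name)

    def update(self):
        """Call once per frame from the main thread to collect finished loads."""
        finished = [name for name, future in self._pending.items() if future.done()]
        for name in finished:
            self.resident[name] = self._pending.pop(name).result()

    def shutdown(self):
        self._pool.shutdown(wait=True)

streamer = Streamer()
for resource in ("rock.mesh", "rock_diffuse.dds"):
    streamer.request(resource)
streamer.shutdown()  # in this demo, simply wait for the background loads
streamer.update()
print(sorted(streamer.resident))
```

In a real frame loop update() would be called every frame without waiting, so that resources appear as soon as their background load has finished.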
Multi-threaded Update of Nodes
Multi-threaded update of nodes (if enabled via the world_threaded console command) can substantially increase performance. For example, this can be very handy when a large number of particle systems are rendered in the world.
- All nodes that share one root in the node hierarchy are updated in one thread. To parallelize node updates, make sure that the nodes do not share the same parent.
- Each Node Reference is handled as a root node without any parents (regardless of its position in the node hierarchy). For example, this means that particle systems contained in node references are always optimized for multi-threaded update.
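The grouping rule described in the list above can be sketched as follows. The code is a hypothetical illustration of the rule, not the engine's scheduler: the hierarchy is a plain dictionary that maps each node to its parent, and the node names are made up.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

def update_root(node, parent, node_references):
    """Climb to the top of the hierarchy, stopping early at a node reference."""
    while node not in node_references and parent.get(node) is not None:
        node = parent[node]
    return node

def group_by_root(parent, node_references=frozenset()):
    """Nodes that end up under the same root form one group (one thread's job)."""
    groups = defaultdict(list)
    for node in parent:
        groups[update_root(node, parent, node_references)].append(node)
    return dict(groups)

def update_all(groups, update):
    """Run every group as its own task; nodes inside a group stay sequential."""
    def run(nodes):
        for node in nodes:
            update(node)
    with ThreadPoolExecutor() as pool:
        list(pool.map(run, groups.values()))

parent = {
    "world": None,
    "a": "world",
    "b": "world",
    "ref": "world",      # a node reference placed under the same parent
    "particles_1": "ref",
    "particles_2": "ref",
}
groups = group_by_root(parent, node_references={"ref"})
print(groups)

updated = []
update_all(groups, updated.append)
```

Although the node reference is a child of "world", it starts its own group, so its particle systems can be updated in parallel with the rest of the hierarchy.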