Author Archive

Register assignment in HLSL

Sunday, June 27th, 2010

When you manually assign shader constants to registers (using the ‘register’ keyword), those registers are also taken out of the automatic assignment pool.
This means that constants without explicit registers will never overlap with ones that have them, even if the latter are not used in the shader.
And so code like this:
float4 c0 : register(c0);
float4 c1 : register(c2);
float4 c2;
float4 main():POSITION
{
    return c2;
}

Will compile into this:
vs_2_0
mov oPos, c1

This is a pretty small thing, but still quite convenient and useful to know.
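
The allocation rule can be modelled in a few lines of C++. This is only a toy sketch of the behaviour observed above, not the compiler's actual algorithm. The names Constant and assignRegisters are made up for illustration, and the model assumes that every constant occupies exactly one float4 register and is referenced by the shader.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// One shader constant; reg < 0 means "no explicit register(cN) binding".
struct Constant {
    std::string name;
    int reg;
};

// Toy model: explicit registers are reserved first, then every remaining
// constant takes the lowest register that is still free.
std::map<std::string, int> assignRegisters(const std::vector<Constant>& constants) {
    std::map<std::string, int> result;
    std::set<int> taken;

    // First sweep: honour explicit bindings and remove them from the pool.
    for (const Constant& c : constants) {
        if (c.reg >= 0) {
            result[c.name] = c.reg;
            taken.insert(c.reg);
        }
    }

    // Second sweep: automatic assignment skips every reserved register.
    int next = 0;
    for (const Constant& c : constants) {
        if (c.reg >= 0) continue;
        while (taken.count(next) != 0) ++next;
        result[c.name] = next;
        taken.insert(next);
    }
    return result;
}
```

For the three constants from the shader above (c0 bound to register 0, c1 bound to register 2, c2 unbound), the model places c2 into register 1, which matches the compiled output.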

Spherical Harmonics in games

Saturday, June 19th, 2010

Here is everything you really need to know about spherical harmonics to use them in games:

0. Spherical Harmonics are not scary at all :)
1. An Efficient Representation for Irradiance Environment Maps
2. Spherical Harmonics in Actual Games
3. Spherical Harmonic Lighting: The Gritty Details
4. DirectX SDK samples contain all the C++ and HLSL code that you might need to get started.

C++ code to calculate shader constants for evaluation on GPU:
IrradianceVolume\PRTMesh.cpp, CPRTMesh::ComputeSHIrradEnvMapConstants()

HLSL code to evaluate SH for a given normal:
IrradianceVolume\SHIrradianceEnvMap.fx
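
To give a feel for how little work the evaluation is, here is a CPU-side C++ sketch of the 9-coefficient irradiance formula from paper (1). It is not the SDK sample's code: it handles a single colour channel, and the coefficient order in the array as well as the names Vec3 and shIrradiance are assumptions made for this example.

```cpp
#include <array>

struct Vec3 { float x, y, z; };

// Irradiance for a unit normal n from 9 SH lighting coefficients (one colour channel).
// Assumed coefficient order: L00, L1-1, L10, L11, L2-2, L2-1, L20, L21, L22.
float shIrradiance(const std::array<float, 9>& L, const Vec3& n) {
    // Constants from "An Efficient Representation for Irradiance Environment Maps".
    const float c1 = 0.429043f;
    const float c2 = 0.511664f;
    const float c3 = 0.743125f;
    const float c4 = 0.886227f;
    const float c5 = 0.247708f;

    return c1 * L[8] * (n.x * n.x - n.y * n.y)      // L22 term
         + c3 * L[6] * n.z * n.z                    // L20 term
         + c4 * L[0]                                // constant (ambient) term
         - c5 * L[6]
         + 2.0f * c1 * (L[4] * n.x * n.y + L[7] * n.x * n.z + L[5] * n.y * n.z)
         + 2.0f * c2 * (L[3] * n.x + L[1] * n.y + L[2] * n.z);
}
```

For RGB lighting the function would simply be called three times, once per channel. The shader version typically packs the same terms into a handful of float4 constants so that the evaluation becomes a few dot products.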

Also, the DirectX SDK comes with a bunch of functions to get you going quickly. For example, there is D3DXSHProjectCubeMap(), which generates spherical harmonic coefficients from a given cubemap.

PS: There is one thing to keep in mind, though: SH is not the only option for storing lighting information. One great alternative is Valve’s Ambient Cube (page 28), which gives quality somewhere between 2-band and 3-band SH.
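
For comparison, an ambient cube lookup can be sketched like this in C++. The order of the six colours and the names Vec3 and evalAmbientCube are assumptions made for this example; the weighting by squared normal components follows the scheme described in Valve's presentation.

```cpp
struct Vec3 { float x, y, z; };

// Evaluate an ambient cube for a unit-length normal n.
// Assumed colour order in `cube`: +X, -X, +Y, -Y, +Z, -Z.
Vec3 evalAmbientCube(const Vec3 cube[6], const Vec3& n) {
    // The squared components of a unit normal sum to 1, so they act as blend weights.
    const float wx = n.x * n.x;
    const float wy = n.y * n.y;
    const float wz = n.z * n.z;

    // For each axis, pick the colour of the face the normal points towards.
    const Vec3& cx = cube[n.x < 0.0f ? 1 : 0];
    const Vec3& cy = cube[n.y < 0.0f ? 3 : 2];
    const Vec3& cz = cube[n.z < 0.0f ? 5 : 4];

    return Vec3{
        wx * cx.x + wy * cy.x + wz * cz.x,
        wx * cx.y + wy * cy.y + wz * cz.y,
        wx * cx.z + wy * cy.z + wz * cz.z
    };
}
```

A normal pointing straight along an axis returns that face's colour unchanged; anything in between is a blend of up to three faces.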

one-liner

Friday, October 30th, 2009

C++ fun piler

C++ gotchas

Saturday, May 2nd, 2009

Today I’m going to write about virtual function hiding.

Consider the following code:

#include <stdio.h>

class Foo
{
public:
    virtual void fun(int)
    {
        printf("A");
    }
    virtual void fun(bool)
    {
        printf("B");
        fun(int());
    }
};

class Bar : public Foo
{
public:
    virtual void fun(int)
    {
        printf("C");
    }
};

int main()
{
    Bar b;
    b.fun(bool());

    return 0;
}

What will be written to the console?
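
For reference, the snippet prints "C". Declaring fun(int) in Bar hides every overload of fun inherited from Foo, so b.fun(bool()) only sees Bar::fun(int) and the bool argument is silently converted to int. The sketch below reproduces this and shows one common fix, a using-declaration. It logs into a string instead of calling printf so that the result can be checked; g_log, BarUsing, callHidden and callUsing are names made up for this example.

```cpp
#include <string>

// Records which overloads ran, in call order.
static std::string g_log;

class Foo
{
public:
    virtual ~Foo() {}
    virtual void fun(int)  { g_log += "A"; }
    virtual void fun(bool) { g_log += "B"; fun(int()); }
};

// Same as in the post: fun(int) hides both Foo::fun overloads.
class Bar : public Foo
{
public:
    virtual void fun(int) { g_log += "C"; }
};

// Hypothetical variant: the using-declaration makes Foo::fun(bool) visible again.
class BarUsing : public Foo
{
public:
    using Foo::fun;
    virtual void fun(int) { g_log += "C"; }
};

std::string callHidden()
{
    g_log.clear();
    Bar b;
    b.fun(bool());   // resolves to Bar::fun(int)
    return g_log;
}

std::string callUsing()
{
    g_log.clear();
    BarUsing b;
    b.fun(bool());   // resolves to Foo::fun(bool), which then calls fun(int) virtually
    return g_log;
}
```

callHidden() yields "C", while callUsing() yields "BC": the bool overload runs first and its inner fun(int()) call is dispatched to the override in the derived class.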


Deferred dynamic lighting using light volumes

Saturday, March 14th, 2009

Background

When it comes to dynamic lighting, there are two common approaches:
- loop inside the pixel shader, calculating the contribution of each light;
- render the same geometry multiple times (once for each light) using additive blending.

Normally, the number of lights that can affect a mesh is limited to something like 4 or 8 to save shader instructions or render passes.
Therefore, for each mesh one must calculate which lights can affect it. This can be tricky when dealing with large meshes, like terrain.
When a large number of lights affect the same mesh, there can also be a problem with light flickering (when one light is replaced by another).

Deferred shading provides a nice and simple alternative (admittedly at the cost of a different kind of problem, described later).
It works by performing lighting after all geometry has been rendered. Hence the name.
All lighting function inputs are rendered into several buffers (in a single pass, using multiple render targets), which are then sampled by the light shader.

Implementation details

In the demo, I’m using four fp16 textures with this layout:

RT0: RGB – unlit diffuse colour, Alpha – specular level
RT1: RGB – light accumulation, Alpha – nothing
RT2: RGB – world-space normal, Alpha – nothing
RT3: RGB – world-space view vector (not normalized), Alpha – nothing

RT1 starts off with the light contribution from the main ambient and directional (sun) sources.

Once we have rendered our scene into the MRTs, we can start applying dynamic lights.

To cull all unaffected pixels, lights are rendered as convex 3D volumes. Only point lights are shown in the demo; they are rendered as low-poly spheres. Spot lights are also possible, using cones.

It is also possible for each light to cast a shadow. I will explore this topic in future articles.

Stencil and Z culling

Each light volume is rendered in two passes:

Pass 1:
- Front faces only;
- Colour write disabled;
- No Z-write;
- Z function = Less/Equal;
- Z-Fail writes non-zero value to stencil buffer (increment-saturate);
- Stencil pass & fail don’t modify stencil buffer;

This pass creates a stencil mask: the stencil value stays at zero where the front of the light volume is not occluded by scene geometry, and becomes non-zero where it is hidden behind it.

Pass 2:
- Back-faces only;
- Colour write enabled;
- No Z-write;
- Z function = Greater/Equal;
- Stencil function = Equal (stencil ref = zero);
- Always writes zero to stencil;

This pass is where lighting actually happens. Every pixel that passes the Z and Stencil tests is then added to the light accumulation buffer (RT1). A standard Phong function is used in the demo.
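
As a rough illustration of what the light shader computes per pixel, here is a plain C++ sketch of the Phong diffuse and specular terms. It is not the demo's shader. The names Vec3, PhongTerms and phong, the shininess parameter, and the assumption that the stored view vector points from the surface towards the eye are all made up for this example.

```cpp
#include <algorithm>
#include <cmath>

struct Vec3 { float x, y, z; };

static float dot(const Vec3& a, const Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

static Vec3 normalize(const Vec3& v) {
    const float len = std::sqrt(dot(v, v));
    return Vec3{v.x / len, v.y / len, v.z / len};
}

struct PhongTerms { float diffuse; float specular; };

// normal: unit surface normal (as stored in RT2).
// toLight, toEye: vectors from the surface point, not necessarily normalized.
// shininess: hypothetical specular exponent, not part of the layout above.
PhongTerms phong(const Vec3& normal, const Vec3& toLight, const Vec3& toEye, float shininess) {
    const Vec3 l = normalize(toLight);
    const Vec3 v = normalize(toEye);

    const float nDotL = dot(normal, l);
    if (nDotL <= 0.0f) {
        return PhongTerms{0.0f, 0.0f};   // surface faces away from the light
    }

    // Reflect the light direction about the normal: r = 2 (n.l) n - l.
    const Vec3 r = {
        2.0f * nDotL * normal.x - l.x,
        2.0f * nDotL * normal.y - l.y,
        2.0f * nDotL * normal.z - l.z
    };
    const float rDotV = std::max(dot(r, v), 0.0f);
    return PhongTerms{nDotL, std::pow(rDotV, shininess)};
}
```

In a setup like the one described here, the diffuse term would presumably be multiplied by the unlit diffuse colour and the specular term by the specular level from RT0, then scaled by the light's colour and attenuation before being added to RT1.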

The diagram below shows the effects of the Z and Stencil tests:

Light Z and Stencil culling

Blue – pixels which passed the Z test in Pass 1 and left the stencil buffer intact.
Red – pixels which passed the Z test in Pass 2.
Green – pixels which passed the Stencil test in Pass 2.

- Light 1 is culled by the Z test in Pass 2.
- Light 2 will fail the Z test in Pass 1, write to the Stencil buffer and then fail the Stencil test in Pass 2.
- Light 3 will partially pass both tests and will go into the pixel shader.
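
The three cases can be checked with a small per-pixel model of the two passes, written in C++. It is a simplified sketch and not the demo's code: depth is assumed to grow away from the camera, each light volume is reduced to one front-face depth and one back-face depth at the pixel, and lightReachesPixel is a hypothetical name.

```cpp
#include <cstdint>

// Returns true when this pixel of the light volume reaches the pixel shader.
// sceneZ: depth already stored in the Z-buffer at this pixel.
// frontZ, backZ: depths of the light volume's front and back faces at this pixel.
// stencil: the pixel's stencil value, expected to be zero on entry.
bool lightReachesPixel(float sceneZ, float frontZ, float backZ, std::uint8_t& stencil) {
    // Pass 1: front faces, Z function Less/Equal.
    // On Z-fail the stencil is incremented (saturating); otherwise it is left alone.
    const bool frontPassesZ = frontZ <= sceneZ;
    if (!frontPassesZ && stencil < 255) {
        ++stencil;
    }

    // Pass 2: back faces, Z function Greater/Equal, stencil must equal zero.
    const bool backPassesZ = backZ >= sceneZ;
    const bool stencilPasses = (stencil == 0);

    // Pass 2 always writes zero, leaving the stencil clean for the next light.
    stencil = 0;

    return backPassesZ && stencilPasses;
}
```

With scene geometry at a depth of 10, a volume spanning 2 to 4 (like Light 1) is rejected by the Z test in Pass 2, a volume spanning 12 to 14 (like Light 2) is rejected by the stencil test, and a volume spanning 8 to 12 (the lit part of Light 3) gets through. In other words, a pixel is shaded only when the scene surface lies between the front and back faces of the volume.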

After all lights have been rendered in this way, RT1 contains the fully lit scene. It can now be used in post-process effects (like bloom).

That’s it.

Downsides

- Transparent geometry cannot be lit in this way. All alpha-blended objects must be rendered into RT1 after the deferred dynamic lighting pass, before post-processing. Standard dynamic lighting must be applied to them (loop in the shader and/or multipass).
- No hardware antialiasing.
- High video memory requirements – four 64-bits-per-pixel (fp16) textures at high resolutions are very expensive.
- High bandwidth requirements – a product of the previous point. Each pixel writes 4x the data. Each light also samples from 3 textures.
This is especially noticeable when the camera is close to a surface with a high number (say, 50+) of overlapping lights.
- Requires fp16 blending support.
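
To put a number on the memory cost, the size of the four render targets can be estimated with a one-line calculation. The C++ sketch below assumes four-channel fp16 targets (8 bytes per pixel each) and ignores the depth/stencil buffer and any alignment padding; gbufferBytes is a hypothetical helper and the resolutions are just examples.

```cpp
#include <cstdint>

// Bytes needed for the four fp16 RGBA render targets at a given resolution.
std::uint64_t gbufferBytes(std::uint64_t width, std::uint64_t height) {
    const std::uint64_t bytesPerPixelPerTarget = 8;  // 4 channels x 16 bits
    const std::uint64_t targetCount = 4;             // RT0..RT3
    return width * height * bytesPerPixelPerTarget * targetCount;
}
```

At 1280x720 this comes to 29,491,200 bytes (about 28 MiB), and at 1920x1080 to 66,355,200 bytes (about 63 MiB).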

Some of those problems can be solved at some cost in quality. I will come back to this in future articles.

Demo – 1024 point lights

Light Demo Screenshot

[Download]

Demo controls:
WASD – up/down/left/right
EQ – forward/back
ZX – rotate
Arrows/Left mouse button – look around
Space – animate lights

No source code, but the demo is NVPerfHUD-friendly.

Hello world!

Friday, March 6th, 2009

With Empire out of the way, I finally found time to set up this blog thing.

I will aim to post at least one article about graphics, game development or general programming every two weeks. Hopefully, it will even be something interesting :)