Archive for Graphics

Derivative Maps

I recently came across an interesting paper, Bump Mapping Unparametrized Surfaces on the GPU by Morten Mikkelsen of Naughty Dog. This paper describes an alternative method to normal mapping, closely related to bump mapping. The alluring prospect of this technique is that it doesn’t require that a tangent space be defined.

Mikkelsen is apparently well-versed in academic obfuscation (tsk!), so the paper itself can be a little hard to read. If you’re interested in reading it, then I would recommend first reading Jim Blinn’s original bump mapping paper to understand some of the derivations.

But Wait! What’s Wrong with Normal Maps?

Nothing really. But if something comes along that can improve quality, performance or memory consumption then it’s worth taking a look.

A Quick Detour into Gradients

Given a scalar height field (i.e. a two-dimensional array of scalar values), the gradient of that field is a 2D vector field where each vector points in the direction of greatest increase. The length of each vector corresponds to the rate of change.

The contour map below represents the scalar field generated from the function f(x,y) = 1 - (x^2 + y^2). The vector field shows the gradient of that scalar field. Note how each vector points towards the center, and how the vectors in the center are smaller due to the lower rate of change.
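
As a quick sanity check of the description above, here is a minimal C++ sketch that evaluates the gradient of this particular function analytically and compares it against a central-difference estimate. The names `Height`, `AnalyticGradient` and `NumericGradient` are made up purely for illustration.

```cpp
#include <cassert>
#include <cmath>

struct Vec2 { float x, y; };

// The example height function: f(x, y) = 1 - (x^2 + y^2)
float Height(float x, float y)
{
	return 1.0f - (x * x + y * y);
}

// Analytic gradient: (df/dx, df/dy) = (-2x, -2y)
Vec2 AnalyticGradient(float x, float y)
{
	return Vec2{ -2.0f * x, -2.0f * y };
}

// Central-difference estimate of the same gradient, using step size eps
Vec2 NumericGradient(float x, float y, float eps)
{
	assert(eps > 0.0f);
	const float dfdx = (Height(x + eps, y) - Height(x - eps, y)) / (2.0f * eps);
	const float dfdy = (Height(x, y + eps) - Height(x, y - eps)) / (2.0f * eps);
	return Vec2{ dfdx, dfdy };
}
```

At (0.5, -0.25) both functions give approximately (-1, 0.5), which points back towards the origin, and the vectors shrink as the sample point approaches the center.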

Derivative Maps

The main premise of the paper is that we can project the gradient of the height field onto an underlying surface and use it to skew the surface normal to approximate the normal of the height-map surface. We can do all of this without requiring tangent vectors.

As with the original bump-mapping technique, it’s not exact, because some terms are dropped on account of their relatively small influence, but it’s close.

There are really only two important formulae to consider from the paper. The first shows how to perturb the surface normal using the surface gradient. Don’t confuse the surface gradient with the gradient of the height field mentioned above! As you’ll see shortly, they’re different.

    \begin{align*} {\bar{n}}'=\bar{n}-\nabla_s\beta \end{align*}

Here, {\bar{n}}' represents the perturbed normal, \bar{n} is the underlying surface normal, and \nabla_s\beta is the surface gradient. So basically, this says that the perturbed normal is the surface normal offset in the negative surface gradient direction.

So how do we calculate the surface gradient from the height field gradient? Well, there’s some fun math in there which I don’t want to repeat, but if you’re interested, I would recommend reading Blinn’s paper first, then Mikkelsen’s paper. You eventually arrive at:

    \begin{align*} \nabla_s\beta=\dfrac{(\sigma_t \times \bar{n} )\beta_s + (\bar{n} \times \sigma_s)\beta_t}{\bar{n} \cdot (\sigma_s \times \sigma_t)} \end{align*}

In addition to the symbols defined previously, \sigma_s and \sigma_t are the partial derivatives of the surface position, and \beta_s and \beta_t are the partial derivatives of the height field. The derivative directions s and t are not explicitly defined here.

It’s easiest to think of this as the projection of the 2D gradient onto a 3D surface along the normal. Intuitively, this says that the surface gradient direction is pushed out on orthogonal vectors to the s/n and t/n planes by however much the gradient specifies. The denominator term is there to scale up the result when s and t are not orthogonal, or are flipped.
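
As a sanity check on this formula, consider the simplest case I can think of (my own worked example, not one taken from the paper): a flat surface with \sigma_s = (1,0,0), \sigma_t = (0,1,0) and \bar{n} = (0,0,1). The cross products and the denominator reduce to:

    \begin{align*} \sigma_t \times \bar{n} = (1,0,0), \quad \bar{n} \times \sigma_s = (0,1,0), \quad \bar{n} \cdot (\sigma_s \times \sigma_t) = 1 \end{align*}

so the surface gradient and the perturbed normal become:

    \begin{align*} \nabla_s\beta = (\beta_s, \beta_t, 0), \quad {\bar{n}}' = \bar{n} - \nabla_s\beta = (-\beta_s, -\beta_t, 1) \end{align*}

which is the familiar unnormalized normal of a height field over a plane.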

Implementation

Implementing this technique is fairly straightforward once you realise the meaning of some of the variables. Since we’re free to choose the partial derivative directions s and t, it’s convenient for the shader to use screen-space x and y. The value \sigma is the position, and the value \beta is the height field sample.

// Project the height gradient (dhdx, dhdy) onto the surface (n, dpdx, dpdy)
float3 CalculateSurfaceGradient(float3 n, float3 dpdx, float3 dpdy, float dhdx, float dhdy)
{
	float3 r1 = cross(dpdy, n);
	float3 r2 = cross(n, dpdx);

	// Both terms are added, matching the surface gradient formula above;
	// dot(dpdx, r1) is the scalar triple product n . (dpdx x dpdy)
	return (r1 * dhdx + r2 * dhdy) / dot(dpdx, r1);
}

// Move the normal away from the surface normal in the opposite surface gradient direction
float3 PerturbNormal(float3 n, float3 dpdx, float3 dpdy, float dhdx, float dhdy)
{
	return normalize(n - CalculateSurfaceGradient(n, dpdx, dpdy, dhdx, dhdy));
}
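
To sanity-check the math outside of a shader, here is a CPU-side C++ sketch of the same two functions, following the surface gradient formula above term by term. `Vec3`, `Cross`, `Dot`, `SurfaceGradient` and `PerturbNormalCpu` are stand-in names for this sketch only, and degenerate derivatives are simply asserted against rather than handled.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

Vec3 Cross(Vec3 a, Vec3 b)
{
	return Vec3{ a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x };
}

float Dot(Vec3 a, Vec3 b)
{
	return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Surface gradient: ((dpdy x n) * dhdx + (n x dpdx) * dhdy) / (n . (dpdx x dpdy))
Vec3 SurfaceGradient(Vec3 n, Vec3 dpdx, Vec3 dpdy, float dhdx, float dhdy)
{
	const Vec3 r1 = Cross(dpdy, n);
	const Vec3 r2 = Cross(n, dpdx);
	const float det = Dot(dpdx, r1); // equals n . (dpdx x dpdy)
	assert(det != 0.0f);             // degenerate derivatives are not handled here

	return Vec3{ (r1.x * dhdx + r2.x * dhdy) / det,
	             (r1.y * dhdx + r2.y * dhdy) / det,
	             (r1.z * dhdx + r2.z * dhdy) / det };
}

// Perturbed normal: normalize(n - surface gradient)
Vec3 PerturbNormalCpu(Vec3 n, Vec3 dpdx, Vec3 dpdy, float dhdx, float dhdy)
{
	const Vec3 g = SurfaceGradient(n, dpdx, dpdy, dhdx, dhdy);
	const Vec3 v = Vec3{ n.x - g.x, n.y - g.y, n.z - g.z };
	const float length = std::sqrt(Dot(v, v));
	return Vec3{ v.x / length, v.y / length, v.z / length };
}
```

For example, with n = (0,0,1), dpdx = (2,0,0), dpdy = (0,2,0), dhdx = 1 and dhdy = -1, the surface gradient comes out as (0.5, -0.5, 0): the height changes by one unit for every two units of position, so the slope is a half in each direction.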

So far, so good. Next we need to work out how to calculate the partial derivatives. The reason why we chose screen-space x and y to be our partial derivative directions is so that we can use the ddx and ddy shader instructions to generate the partial derivatives of both the position and the height.

Given a position and normal in the same coordinate-space, and a height map sample, calculating the final normal is straightforward:

// Calculate the surface normal using screen-space partial derivatives of the height field
float3 CalculateSurfaceNormal(float3 position, float3 normal, float height)
{
	float3 dpdx = ddx(position);
	float3 dpdy = ddy(position);

	float dhdx = ddx(height);
	float dhdy = ddy(height);

	return PerturbNormal(normal, dpdx, dpdy, dhdx, dhdy);
}

Note that in shader model 5.0, you can use ddx_fine/ddy_fine instead of ddx/ddy to get high-precision partial derivatives.

So how does this look? At a medium distance, I would say that it looks pretty good:

But what about up close?

Uh oh! What’s happening here? Well, there are a couple of problems…

The main problem is that the height texture is using bilinear filtering, so the gradient between any two texels is constant. This causes large blocks to become very obvious when up close. There are a couple of options for alleviating this somewhat.

One option is to use bicubic filtering. I haven’t tried it, but I would expect this to make a noticeable difference. The problem is that it will incur an extra cost. Another option, suggested in the paper, is to add a detail bump texture on top. This helps quite a lot, but again it adds more cost.
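
To make the bilinear filtering problem described above concrete, here is a small C++ sketch of the 1D analogue: a linearly interpolated row of height texels, plus a forward-difference slope. It is a simplification under my own assumptions (texel i sits at x = i, whereas real GPU texel centers are offset by half a texel), and `SampleLinear` and `SlopeAt` are made-up names.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Linearly interpolated lookup into a row of height texels, the 1D analogue of
// bilinear filtering. x is in texel units, with texel i located at x = i.
float SampleLinear(const std::vector<float>& texels, float x)
{
	assert(texels.size() >= 2);
	const int last = static_cast<int>(texels.size()) - 1;

	// Clamp to the valid range, then pick the texel pair [i, i + 1] around x
	x = std::fmin(std::fmax(x, 0.0f), static_cast<float>(last));
	int i = static_cast<int>(x);
	if (i > last - 1) i = last - 1;

	const float t = x - static_cast<float>(i);
	return texels[i] * (1.0f - t) + texels[i + 1] * t;
}

// Forward-difference slope of the filtered signal at x
float SlopeAt(const std::vector<float>& texels, float x, float eps)
{
	assert(eps > 0.0f);
	return (SampleLinear(texels, x + eps) - SampleLinear(texels, x)) / eps;
}
```

With texels {0, 1, 3, 2}, the slope is roughly 1 everywhere between the first two texels, then jumps to roughly 2 between the next pair, and to roughly -1 after that. Those flat runs of constant slope are what show up as blocks once the texture is magnified.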

In the image below I’ve just tiled the same texture at 10x frequency over the top. It would be better to apply some kind of noise function as in the original paper.

The second problem is more subtle. We’re getting some small block artifacts because of the way that the ddx and ddy shader instructions work. They take pairs of pixels in a pixel quad and subtract the relevant values to get the derivative. In the case of the height derivatives, we can alleviate this by performing the differencing ourselves with extra texture samples.

The first problem is pretty much a killer for me. I would rather not have to cover up a fundamental implementation issue with extra fudges and more cost.

What Now?

It’s unfortunate that this didn’t make it into the original paper, but Mikkelsen mentions in a blog post that you can increase the quality by using precomputed height derivatives. This method requires double the texture storage (or half the resolution) of the ddx/ddy method, but produces much better results.

You’re probably wondering how you can possibly precompute screen-space derivatives. We don’t actually have to. Instead we can use the chain rule to transform a partial derivative from one space to another. In our case we can transform our derivatives from uv-space to screen-space if we have the partial derivatives of the uvs in screen-space.

To calculate dhdx you need dhdu, dhdv, dudx and dvdx:

    \begin{align*} \dfrac{\partial h}{\partial x} = \dfrac{\partial h}{\partial u} \cdot \dfrac{\partial u}{\partial x} + \dfrac{\partial h}{\partial v} \cdot \dfrac{\partial v}{\partial x} \end{align*}

To calculate dhdy you need dhdu, dhdv, dudy and dvdy:

    \begin{align*} \dfrac{\partial h}{\partial y} = \dfrac{\partial h}{\partial u} \cdot \dfrac{\partial u}{\partial y} + \dfrac{\partial h}{\partial v} \cdot \dfrac{\partial v}{\partial y} \end{align*}

The HLSL for this is very simple:

float ApplyChainRule(float dhdu, float dhdv, float dud_, float dvd_)
{
	return dhdu * dud_ + dhdv * dvd_;
}

Assuming that we have a texture that stores the texel-space height derivatives, we can scale these up in the shader to uv-space by simply multiplying by the texture dimensions. We can then use the screen-space uv derivatives and the chain rule to transform from dhdu/dhdv to dhdx/dhdy.

// Calculate the surface normal using the uv-space gradient (dhdu, dhdv)
// uv is the interpolated texture coordinate used to sample the derivative map
float3 CalculateSurfaceNormal(float3 position, float3 normal, float2 uv, float2 gradient)
{
	float3 dpdx = ddx(position);
	float3 dpdy = ddy(position);

	float dhdx = ApplyChainRule(gradient.x, gradient.y, ddx(uv.x), ddx(uv.y));
	float dhdy = ApplyChainRule(gradient.x, gradient.y, ddy(uv.x), ddy(uv.y));

	return PerturbNormal(normal, dpdx, dpdy, dhdx, dhdy);
}
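
The text above assumes that a texture of texel-space height derivatives already exists. One way such a texture might be generated offline is sketched below in C++. This is only an illustration under my own assumptions (central differences, wrap-around addressing for a tiling texture, a row-major float height map), not necessarily how Mikkelsen builds his, and `Float2`, `BuildDerivativeMap` and `TexelToUvSpace` are made-up names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Float2 { float x, y; };

// Build a texel-space derivative map from a row-major height map using central
// differences. Addressing wraps around, which assumes a tiling texture.
std::vector<Float2> BuildDerivativeMap(const std::vector<float>& heights, int width, int height)
{
	assert(width > 0 && height > 0);
	assert(heights.size() == static_cast<std::size_t>(width) * static_cast<std::size_t>(height));

	std::vector<Float2> derivatives(heights.size());

	for (int y = 0; y < height; ++y)
	{
		for (int x = 0; x < width; ++x)
		{
			const int x0 = (x + width - 1) % width;
			const int x1 = (x + 1) % width;
			const int y0 = (y + height - 1) % height;
			const int y1 = (y + 1) % height;

			// Change in height per texel along each texture axis
			const float dhds = (heights[y * width + x1] - heights[y * width + x0]) * 0.5f;
			const float dhdt = (heights[y1 * width + x] - heights[y0 * width + x]) * 0.5f;

			derivatives[y * width + x] = Float2{ dhds, dhdt };
		}
	}

	return derivatives;
}

// Scale a texel-space derivative up to uv-space (dhdu, dhdv). One texel spans
// 1/width in u and 1/height in v, so we multiply by the texture dimensions.
Float2 TexelToUvSpace(Float2 texelDerivative, int width, int height)
{
	return Float2{ texelDerivative.x * static_cast<float>(width),
	               texelDerivative.y * static_cast<float>(height) };
}
```

In a real pipeline the values would also need to be packed into whatever two-channel texture format you choose, which is left out here.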

So how does this look? Well, it’s pretty much the same at medium distance.

But it’s way better up close, since we’re now interpolating the derivatives.

Conclusions

In order to really draw any conclusions about this technique, I’m going to need to compare the quality, performance and memory consumption to that of normal mapping. That’s a whole other blog post waiting to happen…

But in theory, the pros are:

  • Less mesh memory: We don’t need to store a tangent vector, so this should translate into some pretty significant mesh memory savings.
  • Fewer interpolators: We don’t need to pass the tangent vector from the vertex shader to the pixel shader, so this should be a performance gain.
  • Possibly less texture memory: At worst this method requires two channels in a texture. At best, a normal map takes up two channels.
  • Easy scaling: It’s easy to change the height scale on the fly by scaling the height derivatives. This isn’t quite so easy to get right when using normal maps. See here. As Stephen Hill points out in the comments below, this is a pretty weak argument, so I’m removing it.

And the cons are:

  • More ALU: It’s going to be interesting to see the actual numbers, but this is probably the only thing that could put the nail in the coffin for derivative maps. The extra cost for ALU might be compensated partially by the fewer interpolators, but we’ll have to see.
  • Less flexible: A normal map can represent any derivative map, but the reverse is not true. I’m not sure that this is a significant problem in practice though.
  • Worse quality? I’m not sure about this one, but it’ll be interesting to see if the quality holds up.

Comments (8)

UI Anti-Aliasing

I’ve been working on making a really simple IMGUI implementation for my engine at home. I like to do a little bit of research when I’m approaching something new to me like this, so I went hunting around for publicly available implementations. While doing this, I came across Mikko Mononen’s implementation in Recast.

I was impressed when I ran the demo with how smooth his UI looked. It turns out that he’s using a little trick (which I’d never seen before, but I’m sure is old to many) to smooth off the edges of his UI elements.

Basically, the trick is to create a ring of extra vertices by extruding the edges of the polygon out by a certain amount. These extra vertices take the same color as the originals, but their alpha is set to zero. Mikko calls this ‘feathering’.

In my case, I found that I got good results by feathering just one pixel. Here’s a quick before/after comparison of my IMGUI check box at 800% zoom:

And here’s a 1-to-1 example showing rounded button corners:

It’s a pretty nice improvement for a very simple technique! If you’re interested in what the code looks like, then either take a look at Mikko’s IMGUI implementation, or you can find the code I use to feather my convex polygons below.

My implementation is a little less efficient since I recalculate each edge normal twice, but I chose to keep it simple for readability.

// Calculate the unit normal of the edge running from (x0, y0) to (x1, y1)
void CalculateEdgeNormal(float& nx, float& ny, float x0, float y0, float x1, float y1)
{
	const float x01 = x1 - x0;
	const float y01 = y1 - y0;

	const float length = Sqrt(x01 * x01 + y01 * y01);

	const float dx = x01 / length;
	const float dy = y01 / length;

	// Rotate the edge direction by 90 degrees to get the normal
	nx = dy;
	ny = -dx;
}

void FeatherConvexPolygon(Primitives& primitives, const Vertex* vertices, int count, float amount, const Texture* texture)
{
	Vertex* extruded = Memory::Allocate<Vertex>(Memory::Temp, sizeof(Vertex) * count);

	// Extrude each vertex along the average of its two adjacent edge normals
	for (int i = 0; i < count; ++i)
	{
		const Vertex& previous = vertices[(i + count - 1) % count];
		const Vertex& current = vertices[i];
		const Vertex& next = vertices[(i + 1) % count];

		float nx0, ny0, nx1, ny1;

		CalculateEdgeNormal(nx0, ny0, previous.x, previous.y, current.x, current.y);
		CalculateEdgeNormal(nx1, ny1, current.x, current.y, next.x, next.y);

		float nx = (nx0 + nx1) * 0.5f;
		float ny = (ny0 + ny1) * 0.5f;

		// Same color as the original vertex, but with zero alpha
		extruded[i] = Vertex(current.x + nx * amount, current.y + ny * amount, Color(current.r, current.g, current.b, 0.0f));
	}

	// Stitch the original and extruded rings together with one quad per edge
	for (int i = 0; i < count; ++i)
	{
		const int j = (i + 1) % count;
		AddQuad(primitives, vertices[i], extruded[i], extruded[j], vertices[j], texture);
	}

	Memory::Free(extruded);
}

Comments

What’s wrong with this picture?

[Image: Monte Carlo render, 256 samples, 2 bounces]

Well, you could point out a number of things to answer that question. There’s some pretty obvious aliasing, a random pixel on the ground which should be in shadow but isn’t, it’s noisy, boring etc. But that’s not my point. The point is: It’s too dark!

I know it’s too dark because I know how I rendered it, and I rendered it wrong. It still kind of looks acceptable (well to me at least) though. I’m not sure that I would say that it’s implausibly dark if I didn’t know it.

Read the rest of this entry »

Comments (2)

Direct3D 11 Multithreading

I’ve been putting it off for a while, but with my recent trip to GDC and the arrival of the Direct3D 11 beta, I thought it was about time I switched my renderer to be multithreaded. One of the things I learned at a Direct3D 11 talk at GDC is that it works on ‘down-level hardware’, which means DirectX 9 & 10 cards. Of course, you don’t get the snazzy new hardware features, but you do get some of the benefits of the new API, like multithreading and limited compute shaders (albeit not as fast as it will be on the real hardware).

Read the rest of this entry »

Comments (3)

Energy Conservation In Games

Recently at work I was chatting with a colleague, and the topic of energy conservation for specular reflections came up. This reminded me that I’ve been sitting on a blog post for a while about just this subject, so I thought it was time to finish it.

First of all, I’d like to start by looking at the standard diffuse reflection model. In games, the typical formula for calculating diffuse reflection from a particular light is:

    \begin{align*} C_d \, L_i \, (N \cdot L) \end{align*}

Where Cd is the diffuse material color, Li is the light color, N is the normal, and L is the normalized direction to the light. What’s the problem with this? Well, it’s not energy conserving. In itself, this isn’t really a problem since we don’t calculate multiple bounces of light in games, so we’re not adding energy to the scene as light bounces around, as would happen in a ray tracer. It’s a good starting point for discussion though.

Read the rest of this entry »

Comments (34)

Irradiance Caching: Part 2

In my previous post, I wrote very briefly about an important improvement to the irradiance caching algorithm – irradiance gradients – and I’m going to expand on rotational gradients this time.

Gradients

The gradient of a function represents both the direction and rate of change of that function as the inputs vary. For a one dimensional function this is simply the derivative of the function. As you move into higher dimensions, you need to consider which coordinate system the inputs for the function are specified in, as this will change how you need to calculate the gradient.

For now, I’m just going to focus on calculating the gradient of a function defined using normalized spherical coordinates. Unfortunately, there’s no real standard way to define spherical coordinates, and despite similar looking symbols, the values are often interchanged. I’m going to define the spherical coordinates on the unit sphere as azimuthal value φ [0, 2π), and polar value θ [0, π).

[Image: spherical coordinates]

Read the rest of this entry »

Comments (3)

Irradiance Caching: Part 1

Solving the rendering equation with even just one bounce of indirect lighting can take a long time. The majority of time spent rendering a frame is in estimating the lighting integral. For example, rendering a single bounce of indirect lighting at 720p resolution with 256 sample rays for a Monte Carlo estimator requires about 236 million rays to be cast. This doesn’t even include the rays needed for sampling the lights for direct lighting, so in practice, the total will be even higher.
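
For reference, the arithmetic behind that figure, assuming one visible surface point per pixel at 1280 × 720 and 256 indirect rays per point:

    \begin{align*} 1280 \times 720 \times 256 = 921{,}600 \times 256 = 235{,}929{,}600 \end{align*}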

One interesting observation made by Greg Ward in his Siggraph ’88 paper is that, in contrast to direct lighting, where shadows and lights can cause harsh changes, the indirect lighting on a surface tends to vary relatively slowly. One way to picture why is to imagine computing the average color of what you can see from each of your eyes. Even though each eye has a slightly different view of the world, the images they see are very similar, and so the average color is also nearly the same.

Read the rest of this entry »

Comments (4)

Better Sampling

A couple of days ago, I compared the images my ambient occlusion integrator produced with those of Modo using similar settings. I noticed immediately how much ‘cleaner’ the render from Modo was. Clearly there was an issue with the way I was picking my samples, so I set about improving things.

My approach for generating the ambient occlusion rays was to generate uniform random samples over the hemisphere about the normal. Based on two random numbers in the range [0,1), I calculate the normalized sample direction using the following function:

Vector3 Sample::UniformSampleHemisphere(float u1, float u2)
{
	const float r = Sqrt(1.0f - u1 * u1);
	const float phi = 2 * kPi * u2;

	return Vector3(Cos(phi) * r, Sin(phi) * r, u1);
}

This generates points on a hemisphere from uniform variables u1 and u2, where each point has equal probability of being selected. The following image was generated with 256 random uniform samples:

[Image: ambient occlusion, 256 uniform random samples]
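
For anyone who wants to experiment with this outside of an engine, here is a standalone C++ sketch of the same sampling function using only the standard library. `Vec3` and `kPi` are stand-ins for the engine types used above, and `UniformHemispherePdf` is an extra helper added for illustration: since the hemisphere covers a solid angle of 2π, the density of a uniform sample is the constant 1 / (2π).

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

const float kPi = 3.14159265358979f;

// Uniformly sample the unit hemisphere around +z from u1, u2 in [0, 1).
// u1 is used directly as z (the cosine of the polar angle); u2 picks the azimuth.
Vec3 UniformSampleHemisphere(float u1, float u2)
{
	assert(u1 >= 0.0f && u1 < 1.0f && u2 >= 0.0f && u2 < 1.0f);

	const float r = std::sqrt(1.0f - u1 * u1);
	const float phi = 2.0f * kPi * u2;

	return Vec3{ std::cos(phi) * r, std::sin(phi) * r, u1 };
}

// Probability density with respect to solid angle: constant 1 / (2 * pi)
float UniformHemispherePdf()
{
	return 1.0f / (2.0f * kPi);
}
```

Every direction returned has unit length and a non-negative z, so it always lies in the hemisphere around the local +z axis. It still needs to be rotated into the frame of the surface normal before it can be used as a ray direction.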

Read the rest of this entry »

Comments (10)

The Holidays: Time for fun work!

For the first time in about three years, I’ve had two weeks off work. I’ve spent a lot of time just relaxing and taking a break from things, but I’ve also been able to get back to doing some graphics work. Ever since Vivendi bought Activision, the project that I was leading has been “put on hold”, so I’ve been back on the game team. It’s not as fun for me, that’s for sure, but luckily, I have my code at home to play with, so all is not lost! With the holidays, I’ve found some motivation to get back to it.

What have I been doing? Well, as I was approaching the break, I read through the course notes from the Practical Global Illumination with Irradiance Caching course at Siggraph last year. I thought the course itself was really good, and very clearly presented. After blitzing through the notes again, I thought I’d have a go at writing a ray tracer. It seemed simple enough at the time, but like most things, the devil is in the details.

The first thing I did was to set up a really simple single-threaded ray tracer that just displayed the color of the surface it hit. This was fairly quick to get up and running once I had written a few supporting classes for the cameras and shapes. It’s not very glamorous, but it’s a start:

[Image: Simple Integrator]

Read the rest of this entry »

Comments (2)

Lighting: The Rendering Equation

Back in 2002, I started my first job in the games industry at Climax Studios in England. I have to admit, I didn’t know very much about game development at the time. Don’t get me wrong, I’d been writing little 2D games, and messing around with rubbish particle systems at home, but it was nothing like what I was about to get involved with. Despite my inexperience, somehow I did enough to pass the interview, and I was offered a job as a junior programmer.

As seemed to be typical for the time, my introduction to the industry was pretty much a trial by fire. I quickly found out that I couldn’t hope to truly understand every single new thing I encountered, so I learned to just accept some things as the truth. For example, I was told that the dot product of two normalized vectors yields the cosine of the angle between them. I just accepted this, and only took the time to find out why later on.

Read the rest of this entry »

Comments (1)