## Storing depth in color channels of an FBO

Sage
Posts: 1,199
Joined: 2004.10
Post: #1
This is relevant to my post here ( http://idevgames.com/forum/showthread.php?t=18498 ) but I figure it deserves a thread of its own if anybody searches for this problem.

I want to store depth values in the color channel of an FBO. Why? Because I don't have the ability to create a depth cubemap...

Googling gets me a lot of unanswered questions. Perhaps my googling is poor.

Anyway, the article I'm working from is a GPU Gems chapter - http://http.developer.nvidia.com/GPUGems..._ch12.html - and it describes the following method for encoding depth to color and back:

Code:
// encode
Out.r = SquaredDistance * 2^0
Out.g = SquaredDistance * 2^8
Out.b = SquaredDistance * 2^16
Out.a = SquaredDistance * 2^24

// decode
DepthValue = ShadowSample.r / 1 + ShadowSample.g / 256 + ShadowSample.b / 65536 + ShadowSample.a / 16777216

The other method described looks to me like it's for float textures, but I'm not certain. The article is not very well written.

Now, since I've got my cubemap code working and my pipeline ready to start rendering depthmaps, I went and implemented the encode/decode in C++ just for giggles to see what kind of precision loss there is, and I nearly died. It's absurd.

Here's my C++:
Code:
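#include <iostream>
#include <cmath>

// Four integer channels: every scaled depth is truncated to an int when stored.
struct vec4 {

    int r, g, b, a;

    vec4( void ):
        r(0), g(0), b(0), a(0)
    {}

    vec4( int R, int G, int B, int A ):
        r(R), g(G), b(B), a(A)
    {}

};

// Encode as in the article: scale depth by 2^0, 2^8, 2^16 and 2^24.
vec4 EncodeDepth( float depth )
{
    return vec4(
        1 * depth,
        256 * depth,
        65536 * depth,
        16777216 * depth
    );
}

// Decode as in the article: divide each channel by its factor and sum.
float DecodeDepth( const vec4 &shadowSample )
{
    return shadowSample.r / 1.0f
         + shadowSample.g / 256.0f
         + shadowSample.b / 65536.0f
         + shadowSample.a / 16777216.0f;
}

int main (int argc, char * const argv[])
{
    for ( int i = 0; i <= 100; i++ )
    {
        float f = float(i) / 100.0f;
        vec4 v = EncodeDepth( f );
        float f2 = DecodeDepth( v );

        std::cout << i << "\t" << f << " -> " << f2 << " error: " << std::abs( f2 - f ) << std::endl;
    }

    return 0;
}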

And here's the output:
Code:
0    0 -> 0 error: 0
1    0.01 -> 0.027807 error: 0.017807
2    0.02 -> 0.0595202 error: 0.0395202
3    0.03 -> 0.0873425 error: 0.0573425
4    0.04 -> 0.119056 error: 0.0790557
5    0.05 -> 0.146863 error: 0.0968627
6    0.06 -> 0.178591 error: 0.118591
7    0.07 -> 0.206398 error: 0.136398
8    0.08 -> 0.238112 error: 0.158112
9    0.09 -> 0.26984 error: 0.17984
10    0.1 -> 0.297647 error: 0.197647
11    0.11 -> 0.32936 error: 0.21936
12    0.12 -> 0.357183 error: 0.237183
13    0.13 -> 0.388896 error: 0.258896
14    0.14 -> 0.416718 error: 0.276718
15    0.15 -> 0.448431 error: 0.298431
16    0.16 -> 0.476238 error: 0.316238
17    0.17 -> 0.507967 error: 0.337967
18    0.18 -> 0.53968 error: 0.35968
19    0.19 -> 0.567487 error: 0.377487
20    0.2 -> 0.599216 error: 0.399216
21    0.21 -> 0.627023 error: 0.417023
22    0.22 -> 0.658736 error: 0.438736
23    0.23 -> 0.686558 error: 0.456558
24    0.24 -> 0.718271 error: 0.478271
25    0.25 -> 0.75 error: 0.5
26    0.26 -> 0.777807 error: 0.517807
27    0.27 -> 0.80952 error: 0.53952
28    0.28 -> 0.837343 error: 0.557343
29    0.29 -> 0.869056 error: 0.579056
30    0.3 -> 0.896863 error: 0.596863
31    0.31 -> 0.928591 error: 0.618591
32    0.32 -> 0.956398 error: 0.636398
33    0.33 -> 0.988112 error: 0.658112
34    0.34 -> 1.01984 error: 0.67984
35    0.35 -> 1.04765 error: 0.697647
36    0.36 -> 1.07936 error: 0.71936
37    0.37 -> 1.10718 error: 0.737183
38    0.38 -> 1.1389 error: 0.758896
39    0.39 -> 1.16672 error: 0.776718
40    0.4 -> 1.19843 error: 0.798431
41    0.41 -> 1.22624 error: 0.816238
42    0.42 -> 1.25797 error: 0.837967
43    0.43 -> 1.28968 error: 0.85968
44    0.44 -> 1.31749 error: 0.877487
45    0.45 -> 1.34922 error: 0.899216
46    0.46 -> 1.37702 error: 0.917023
47    0.47 -> 1.40874 error: 0.938736
48    0.48 -> 1.43656 error: 0.956558
49    0.49 -> 1.46827 error: 0.978271
50    0.5 -> 1.5 error: 1
51    0.51 -> 1.52781 error: 1.01781
52    0.52 -> 1.55952 error: 1.03952
53    0.53 -> 1.58734 error: 1.05734
54    0.54 -> 1.61906 error: 1.07906
55    0.55 -> 1.64686 error: 1.09686
56    0.56 -> 1.67859 error: 1.11859
57    0.57 -> 1.7064 error: 1.1364
58    0.58 -> 1.73811 error: 1.15811
59    0.59 -> 1.76984 error: 1.17984
60    0.6 -> 1.79765 error: 1.19765
61    0.61 -> 1.82936 error: 1.21936
62    0.62 -> 1.85718 error: 1.23718
63    0.63 -> 1.8889 error: 1.2589
64    0.64 -> 1.91672 error: 1.27672
65    0.65 -> 1.94843 error: 1.29843
66    0.66 -> 1.97624 error: 1.31624
67    0.67 -> 2.00797 error: 1.33797
68    0.68 -> 2.03968 error: 1.35968
69    0.69 -> 2.06749 error: 1.37749
70    0.7 -> 2.09922 error: 1.39922
71    0.71 -> 2.12702 error: 1.41702
72    0.72 -> 2.15874 error: 1.43874
73    0.73 -> 2.18656 error: 1.45656
74    0.74 -> 2.21827 error: 1.47827
75    0.75 -> 2.25 error: 1.5
76    0.76 -> 2.27781 error: 1.51781
77    0.77 -> 2.30952 error: 1.53952
78    0.78 -> 2.33734 error: 1.55734
79    0.79 -> 2.36906 error: 1.57906
80    0.8 -> 2.39686 error: 1.59686
81    0.81 -> 2.42859 error: 1.61859
82    0.82 -> 2.4564 error: 1.6364
83    0.83 -> 2.48811 error: 1.65811
84    0.84 -> 2.51984 error: 1.67984
85    0.85 -> 2.54765 error: 1.69765
86    0.86 -> 2.57936 error: 1.71936
87    0.87 -> 2.60718 error: 1.73718
88    0.88 -> 2.6389 error: 1.7589
89    0.89 -> 2.66672 error: 1.77672
90    0.9 -> 2.69843 error: 1.79843
91    0.91 -> 2.72624 error: 1.81624
92    0.92 -> 2.75797 error: 1.83797
93    0.93 -> 2.78968 error: 1.85968
94    0.94 -> 2.81749 error: 1.87749
95    0.95 -> 2.84922 error: 1.89922
96    0.96 -> 2.87702 error: 1.91702
97    0.97 -> 2.90874 error: 1.93874
98    0.98 -> 2.93656 error: 1.95656
99    0.99 -> 2.96827 error: 1.97827
100    1 -> 4 error: 3

COMICAL error values. I don't see how I could get shadow mapping to work at all with this kind of error.

What am I missing? Did I completely brainfart on the encode/decode?
Moderator
Posts: 1,563
Joined: 2003.10
Post: #2
It looks to me like you have your multiplications and your divisions reversed. I'd have expected to see Out.a = SquaredDistance / 2^24, ShadowSample.a * 16777216, etc. I can't make any sense of doing it the other way around.
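For anyone landing here from a search, a hedged note on the exchange above. The output in the first post shows each depth coming back about three times too large (0.25 -> 0.75), which is what happens when every int channel holds the whole scaled depth instead of one byte of it. An 8-bit channel can only carry its own byte. The sketch below is a CPU-side reference only, not the article's code: it packs a depth in [0, 1] into four bytes with shifts and recombines them. `Rgba8`, `PackDepth` and `UnpackDepth` are made-up names for illustration, and a shader would typically use fract() rather than integer shifts.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Hypothetical helper type: four 8-bit channels, as in an RGBA8 render target.
struct Rgba8 { std::uint8_t r, g, b, a; };

// Pack a depth in [0, 1] as 32-bit fixed point, one byte per channel.
Rgba8 PackDepth(double depth)
{
    depth = std::min(std::max(depth, 0.0), 1.0);
    // Scale to the full 32-bit range, then peel off one byte at a time.
    const std::uint32_t fixed =
        static_cast<std::uint32_t>(std::llround(depth * 4294967295.0));
    return Rgba8{
        static_cast<std::uint8_t>(fixed >> 24),           // most significant byte
        static_cast<std::uint8_t>((fixed >> 16) & 0xFF),
        static_cast<std::uint8_t>((fixed >> 8) & 0xFF),
        static_cast<std::uint8_t>(fixed & 0xFF)           // least significant byte
    };
}

// Reassemble the four bytes and scale back to [0, 1].
double UnpackDepth(const Rgba8 &c)
{
    const std::uint32_t fixed = (std::uint32_t(c.r) << 24)
                              | (std::uint32_t(c.g) << 16)
                              | (std::uint32_t(c.b) << 8)
                              |  std::uint32_t(c.a);
    return fixed / 4294967295.0;
}
```

Run over the same 0.00 to 1.00 sweep as the test program, the round-trip error of this reference stays below 1e-9.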
Sage
Posts: 1,199
Joined: 2004.10
Post: #3
Well, in case somebody stumbles on this thread, in my poking and searching I found:

http://www.gamedev.net/community/forums/..._id=486847

It works very accurately for normalized depth values!
Sage
Posts: 1,199
Joined: 2004.10
Post: #4
So, as a followup, the code I found doesn't actually seem to work. Or maybe it does but I'm not using it correctly.

Here's my implementation, in GLSL.
Code:
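#define DEBUG_PACKING 0

// Pack a normalized depth into four 8-bit channels.
vec4 FloatToFixed( in float depth )
{
#if DEBUG_PACKING
    return vec4( depth, depth, depth, 1.0 );
#else

    const float toFixed = 255.0/256.0;

    return vec4(
        fract(depth*toFixed*1.0),
        fract(depth*toFixed*255.0),
        fract(depth*toFixed*255.0*255.0),
        fract(depth*toFixed*255.0*255.0*255.0)
    );
#endif

}

// Recover the depth by undoing the per-channel scale factors.
float FixedToFloat( in vec4 shadowSample )
{
#if DEBUG_PACKING
    return shadowSample.r;
#else

    const float fromFixed = 256.0/255.0;

    return fromFixed * (
        shadowSample.r
        + shadowSample.g / 255.0
        + shadowSample.b / (255.0*255.0)
        + shadowSample.a / (255.0*255.0*255.0)
    );
#endif

}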

Here's what gets rendered:

And a closeup:

So, it's all "holy" ( ha ha, the cubemap looks like a cross too )

Note, when I change DEBUG_PACKING to 1, and drop to only 8 bits of precision, it works great. It's not accurate, but I don't get the speckling artifacts.

Any thoughts?
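As a point of comparison, here is a minimal CPU-side sketch of a widely circulated variant of fract() packing. It is not the code from the linked gamedev.net thread and not the shader above: it drops the 255/256 factor and subtracts from each channel the part that the next channel also stores, so every channel lands exactly on an 8-bit step. The 8-bit render target is simulated by rounding to n/255. `PackDepth255`, `UnpackDepth255`, `ToUnorm8` and `Rgba8` are hypothetical names, and the input is assumed to be in [0, 1).

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

struct Rgba8 { std::uint8_t r, g, b, a; };

static double Fract(double x) { return x - std::floor(x); }

// Simulate an 8-bit normalized channel: clamp to [0, 1], round to n/255.
static std::uint8_t ToUnorm8(double x)
{
    const long n = std::lround(std::min(std::max(x, 0.0), 1.0) * 255.0);
    return static_cast<std::uint8_t>(n);
}

// Pack a depth in [0, 1) into four base-255 "digits".
Rgba8 PackDepth255(double depth)
{
    double e0 = Fract(depth);
    double e1 = Fract(depth * 255.0);
    double e2 = Fract(depth * 255.0 * 255.0);
    double e3 = Fract(depth * 255.0 * 255.0 * 255.0);
    // Remove from each channel what the next, finer channel already carries.
    e0 -= e1 / 255.0;
    e1 -= e2 / 255.0;
    e2 -= e3 / 255.0;
    return Rgba8{ ToUnorm8(e0), ToUnorm8(e1), ToUnorm8(e2), ToUnorm8(e3) };
}

// Each byte n represents n/255, weighted by 1, 1/255, 1/255^2, 1/255^3.
double UnpackDepth255(const Rgba8 &s)
{
    return s.r / 255.0
         + s.g / (255.0 * 255.0)
         + s.b / (255.0 * 255.0 * 255.0)
         + s.a / (255.0 * 255.0 * 255.0 * 255.0);
}
```

A depth of exactly 1.0 wraps to 0 under fract(), so a caller would need to keep values strictly below 1 or clamp them first.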
Sage
Posts: 1,199
Joined: 2004.10
Post: #5
Interestingly, if I only use 24 bits of precision ( ignoring alpha ) the technique works correctly. I certainly can live with 24 bit depth precision, so this is fine.

Though I am perplexed that the alpha channel would result in this noise.
Sage
Posts: 1,199
Joined: 2004.10
Post: #6
So, it was a stupid oversight. You can't expect meaningful results doing this when you've got blending and alpha testing enabled.
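To spell out why that matters: with depth packed into RGBA, the alpha channel holds the least significant bits, which look like noise from pixel to pixel. If blending is on, that noise is used as opacity and mixes the other channels with whatever was already in the target; if alpha testing is on, fragments are discarded depending on it. The sketch below simulates one common blend setup on the CPU. The blend function is an assumption (the thread doesn't say which one was active), and `Color`, `BlendSrcAlpha` and `NoBlend` are illustrative names.

```cpp
#include <array>
#include <cstddef>

using Color = std::array<double, 4>;  // r, g, b, a, each in [0, 1]

// Assumed blend state: glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA).
// Every channel of the packed depth gets mixed with the old framebuffer value.
Color BlendSrcAlpha(const Color &src, const Color &dst)
{
    Color out{};
    for (std::size_t i = 0; i < 4; ++i)
        out[i] = src[i] * src[3] + dst[i] * (1.0 - src[3]);
    return out;
}

// With blending disabled the fragment simply replaces the destination.
Color NoBlend(const Color &src, const Color & /*dst*/)
{
    return src;
}
```

For a packed sample of (0.5, 0.25, 0.75, 0.1) written over a white target, the blended red channel comes out near 0.95 instead of 0.5, so the decoded depth no longer has anything to do with the one that was written. Disabling GL_BLEND and GL_ALPHA_TEST while rendering the depth pass avoids this.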
Sage
Posts: 1,234
Joined: 2002.10
Post: #7
It's also not too useful to expand a 23 bit mantissa to 32 bits...
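A quick way to see the point about the mantissa, as a sketch that assumes IEEE-754 single precision: a float carries 24 significant bits (23 stored plus the implicit leading one), so just below 1.0 it cannot resolve steps finer than 2^-24. A 32-bit fixed-point encoding has steps of 2^-32, and near the far end of the range the extra 8 bits have nothing to record. `kFloatBits` and `StepBelowOne` are illustrative names.

```cpp
#include <cmath>
#include <limits>

// Significand bits in a float, counting the implicit leading 1.
constexpr int kFloatBits = std::numeric_limits<float>::digits;

// Gap between 1.0f and the next representable float below it.
inline float StepBelowOne()
{
    return 1.0f - std::nextafter(1.0f, 0.0f);
}
```

Floats do get finer near zero, so the comparison is about the coarsest part of the [0, 1] range, which is where distant shadow casters end up.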

As a future exercise, I'll point out another path you can play around with:

On a renderer that exports ARB_texture_float and ARB_texture_rg, you ought to be able to create an R32F cubemap, and render directly to it. That way you can just write out floats directly and skip the pack/unpack.

However as of 10.6.2, the Radeon X1600 supports sampling from this format but not rendering to it. Hopefully that will be rectified in a future update.
Sage
Posts: 1,199
Joined: 2004.10
Post: #8
arekkusu wrote: However as of 10.6.2, the Radeon X1600 supports sampling from this format but not rendering to it. Hopefully that will be rectified in a future update.

So, it's possible a driver update could enable this feature? That would be amazing! Is it in general available on more modern GPUs? Say, the 9400 or 9600?

( I'm planning on buying a new MBP in the next six months or so )
Sage
Posts: 1,234
Joined: 2002.10
Post: #9
Nvidia currently doesn't export ARB_texture_rg in Mac OS X, so no, it isn't generally available.

But the hardware supports it and it works in Windows. So yes, a driver update could enable this feature.