The geometry buffer (or gbuffer) is where the material properties of every object in your scene are stored. For each pixel this includes (but is not limited to) the position of the geometry, its surface normal, and its diffuse colour. All of this information cannot fit into a single render target, so multiple render targets are drawn onto concurrently. The more information you want to store, and the more precision you want to store it at, the larger your gbuffer will have to be.
On DirectX 9 class hardware (such as that targeted by XNA), every render target being drawn onto concurrently needs to have the same resolution and bit-depth. This poses some difficulties: if you want a high bit-depth target to store position precisely, you also have to use equally large targets for information you care much less about, such as diffuse colour. Designing the gbuffer layout then becomes a juggling act of getting just enough precision for the important attributes to look good, while not wasting bandwidth and memory where it is not needed.
As I am only planning on implementing simple Phong shading in my lighting pass, I need to store position, diffuse colour, normals, specular intensity and specular power in my gbuffer.
Position
The simplest way to store position is to put x, y, and z into the red, green, and blue components of a 32-bit R8G8B8A8 target. This is super cheap to encode and decode, as you don't need to do anything. It also leaves an extra 8-bit channel spare, which would do nicely for one of the specular attributes. However, the precision is terrible: if position is stored imprecisely, your lighting may show signs of banding. Not nice.
You could use a 64-bit target instead, which would give decent precision, but then you would have to use 64-bit targets for all the other attributes, which would double the size of the gbuffer. I would much rather stick to 32 bits if possible. Fortunately, you implicitly know the screen-space position of a pixel as you are shading it, so from the depth alone you can reconstruct the full 3D position. This means you only need to store depth, which fits in an R32F target at full 32-bit floating point precision. MJP has a great post comparing the quality of different ways of storing depth for the purpose of reconstructing position.
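As a rough illustration, here is a minimal HLSL sketch of that reconstruction. It assumes depth was written as linear view-space depth, and that the lighting pass's vertex shader supplies a ray through the pixel scaled so its forward (z) component is 1; the function and sampler names are mine, not from any particular implementation.

```hlsl
// Sketch only: reconstruct view-space position from the stored linear
// view-space depth. Assumes the lighting pass's vertex shader outputs a
// ray from the camera through the pixel, scaled so that its forward
// (z) component is 1.
sampler DepthSampler : register(s0);

float3 ReconstructViewPosition(float2 texCoord, float3 frustumRay)
{
    // The gbuffer stores the view-space depth of the geometry at this pixel.
    float viewDepth = tex2D(DepthSampler, texCoord).r;

    // Because frustumRay.z == 1, scaling the ray by the depth lands exactly
    // on the original view-space position.
    return frustumRay * viewDepth;
}
```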
Normals
Like position, the simplest way to store normals is to put x, y, and z into an R8G8B8A8 target. And like position, the quality is not good enough. Early deferred renderers observed that the normal is always a unit vector, and that if you assume z cannot be negative (i.e. the view-space normal never points away from the viewer), you only need to store x and y. You can then reconstruct z as "sqrt(1 - x*x - y*y)". As far as I know, this was first popularised by Guerrilla Games with Killzone 2. However, you cannot rely on z always being positive, because of normal mapping and the perspective projection, as demonstrated by Insomniac Games.
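For clarity, this is roughly what that (rejected) decode looks like in HLSL; the function name is mine.

```hlsl
// Sketch of the "store x and y only" decode. Only valid while the
// view-space normal's z component is guaranteed to be non-negative,
// which normal mapping and the perspective projection break in practice.
float3 DecodeNormalXY(float2 xy)
{
    float z = sqrt(saturate(1.0 - dot(xy, xy)));
    return float3(xy, z);
}
```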
Until recently, I was planning on using spherical coordinates, as suggested in another post by MJP. This is another way of encoding normals into 2 channels (since the radius of a unit normal is always 1, you only need to store the two angles), at fairly high quality. I was set on implementing this until I found another page comparing various normal storage formats. There, Aras demonstrated what amazing quality you can get by storing normals with a spheremap transform. Not only that, but it's faster to encode than spherical coordinates. I had seen this before in a presentation by Crytek on CryEngine 3, but, for some reason, I had not thought much of it.
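Here is a minimal HLSL sketch of the spheremap encode and decode, following the formulation given on Aras's comparison page; the function names are mine. The only degenerate input is a normal pointing exactly along (0, 0, -1).

```hlsl
// Spheremap transform: pack a unit view-space normal into two channels.
float2 EncodeNormalSpheremap(float3 n)
{
    float f = sqrt(8.0 * n.z + 8.0);
    return n.xy / f + 0.5;
}

// Recover the full unit normal from the two stored channels.
float3 DecodeNormalSpheremap(float2 enc)
{
    float2 fenc = enc * 4.0 - 2.0;
    float f = dot(fenc, fenc);
    float g = sqrt(1.0 - f / 4.0);
    float3 n;
    n.xy = fenc * g;
    n.z = 1.0 - f / 2.0;
    return n;
}
```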
Diffuse
Diffuse information does not need to be very precise, and so storing r,g,b in an R8G8B8A8 target works just fine.
Specular
Like diffuse, the specular attributes don't need much precision, so a single 8-bit channel for each of the two attributes is all that's needed.
Layout
In the end, I went with the following layout:
| Format | R | G | B | A |
| --- | --- | --- | --- | --- |
| R32F | Linear Viewspace Depth | | | |
| R10G10B10A2 | Normal.X | Normal.Y | Specular Intensity | Unused |
| R8G8B8A8 | Diffuse.R | Diffuse.G | Diffuse.B | Specular Power |
This layout fits everything I wanted into three 32-bit targets, which is pretty compact. While I am "wasting" 2 bits in the alpha channel of the R10G10B10A2 target, there are no 3-channel render target formats, so it will have to do. If I later want to add an extra attribute and those 2 bits aren't enough, it should be easy enough to change this target to an R8G8B8A8 format instead, trading some normal precision for an extra usable channel.
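For reference, here is a hypothetical sketch of what the geometry pass pixel shader might write with this layout. The names are mine, `EncodeNormalSpheremap` is the encode function from the earlier sketch, and the specular terms are assumed to already be scaled into the [0, 1] range.

```hlsl
// Material parameters, assumed to already be in the [0, 1] range so they
// can be written straight into 8/10-bit channels.
float SpecularIntensity;
float SpecularPower;

sampler DiffuseSampler : register(s0);

struct PixelInput
{
    float2 TexCoord  : TEXCOORD0;
    float3 NormalVS  : TEXCOORD1; // view-space normal from the vertex shader
    float  ViewDepth : TEXCOORD2; // linear view-space depth
};

struct GBufferOutput
{
    float4 Target0 : COLOR0; // R32F:        linear view-space depth
    float4 Target1 : COLOR1; // R10G10B10A2: normal.xy, specular intensity, unused
    float4 Target2 : COLOR2; // R8G8B8A8:    diffuse rgb, specular power
};

GBufferOutput GBufferPS(PixelInput input)
{
    GBufferOutput output;

    output.Target0 = float4(input.ViewDepth, 0.0, 0.0, 0.0);

    // EncodeNormalSpheremap is the spheremap encode from the earlier sketch.
    float2 packedNormal = EncodeNormalSpheremap(normalize(input.NormalVS));
    output.Target1 = float4(packedNormal, SpecularIntensity, 0.0);

    output.Target2 = float4(tex2D(DiffuseSampler, input.TexCoord).rgb,
                            SpecularPower);
    return output;
}
```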