Over at gamedev.net someone asked for some general performance pointers on using XNA on the 360. After giving a bullet-point list of what I considered the important issues, I thought to myself "Hey, that was pretty good. Let's milk it for all it's worth!" And therefore I've copied and pasted them all here in a glorious display of laziness.
-The 360 has 512MB of unified memory, which means it's shared by both the CPU and the GPU. You don't have all of that available to you, since some of it is taken up by the console's "OS" and some will also be taken up by the .NET Compact Framework. You'll also be working from the managed heap, rather than directly working with native memory.
-You can only execute pure managed code on the 360. You can't, for example, P/Invoke into a non-managed DLL.
-As far as GPU shaders go, you're pretty unrestricted. You can use SM3.0 HLSL, or you can also write portions of your shaders in the GPU's native microcode. This is really very nice...it lets you do things like un-normalized texture addressing, full texturing capabilities in the vertex shader, or directly fetching an element from a vertex stream. The microcode set is referred to as xvs_3_0 and xps_3_0.
-The 360's GPU is different from your average PC GPU in that it has an eDRAM framebuffer. The eDRAM is 10MB in size, and has tremendous bandwidth (256GB/s). What this means is that writing out to the framebuffer or reading it back for blending is very, very quick. Multi-sampling is also very quick, since again you don't have the bandwidth problem. In fact MSAA would be "free" if it weren't for tiled rendering...you see, the downside of eDRAM is that if your render target + z-buffer is too big to fit in eDRAM, you have to render to it in tiles. This means you render one portion of the target, then another. This isn't so bad, except for the fact that any geometry that straddles the edge between 2 tiles has to be drawn twice. If you're not doing scenes with hugely complex geometry you probably won't even notice tiling (it happens automatically). To figure out whether you're going to tile, you need to count the number of bytes per pixel and then multiply by the resolution. So for example if you're rendering to the Color format, which is 4 bytes per pixel, and you're using Depth24Stencil8, which is also 4 bytes per pixel, you have 8 bytes per pixel total. When multi-sampling, you multiply this amount by the number of samples (so 4xMSAA would be 32 bytes per pixel). 1280 x 720 with 4xMSAA would be 1280 x 720 x 32 bytes, or ~28MB, which is more than two 10MB tiles can hold, so you'd need 3 tiles.
-Be prepared to get CPU-bound really quickly if you're doing anything non-trivial. DrawPrimitive calls are extremely expensive on the 360...I've seen my framerate drop from about 70 to 30 just from going from 24 DP calls to 34. Instancing is a must if you need to draw a lot of meshes...there's a good sample on the Creators Club website. By the same token, if you're doing any really fancy logic on the CPU that's not graphics-related, you'll probably need to run it on another thread on a different core, since your main thread can get bogged down pretty quickly.
-Watch out for performance pitfalls with the .NET Compact Framework. Things like Garbage Collection compaction and virtual function calls are much more expensive than they are on the PC. I suggest reading this blog for tips. Just remember to keep your live object count as low as possible, and you should be okay.
-Floating-point performance is not so great on the 360 CPU. Most of the fp power is in the vector units, but you have no access to those through XNA.
-Avoid the surface formats that are larger than 32bpp, mainly HalfVector4 and Vector2. Their performance is generally pretty terrible, and I've run into all kinds of driver bugs with them. This means you can't do HDR in a straightforward way, but there are other options. There's an entry on my XNA blog where I talk about how I got around it.
-Watch your texture sampling bandwidth. Framebuffer access may be quick, but reads from textures are limited by the 22.4GB/s read bandwidth. This may be quite a bit less than what you're used to, if you're prototyping on a higher-end card like an 8800. This can be especially painful in scenarios where you want to take multiple samples per pixel, like PCF for shadow maps or SSAO.
-Prototyping and developing on the PC is a good idea since you have access to PIX, but make sure you test pretty often on the 360. You may need to optimize for quite a few scenarios if you need to keep the framerate up.
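
For the eDRAM point above, the tiling arithmetic is easy to get wrong in your head, so here's a rough sketch of it in Python. It's only the back-of-the-envelope estimate described in that bullet: the function names are made up for illustration, 10MB is taken as 10 x 1024 x 1024 bytes, and it ignores whatever alignment or tile-shape rules the hardware actually applies, so treat the result as an estimate rather than what the console will really do.

```python
# Back-of-the-envelope eDRAM tile estimate (illustrative helper names,
# not part of XNA). Assumes 10MB = 10 * 1024 * 1024 bytes and ignores
# any hardware tile-alignment rules.
EDRAM_BYTES = 10 * 1024 * 1024


def bytes_per_pixel(color_bytes=4, depth_bytes=4, msaa_samples=1):
    """Color + depth/stencil bytes per pixel, multiplied by the sample count."""
    return (color_bytes + depth_bytes) * msaa_samples


def tiles_needed(width, height, color_bytes=4, depth_bytes=4, msaa_samples=1):
    """Estimate how many eDRAM-sized tiles a render target + z-buffer needs."""
    total_bytes = width * height * bytes_per_pixel(
        color_bytes, depth_bytes, msaa_samples
    )
    # Integer ceiling division: a partly filled tile still counts as a tile.
    return -(-total_bytes // EDRAM_BYTES)


if __name__ == "__main__":
    # Color (4 bytes) + Depth24Stencil8 (4 bytes) at 1280 x 720.
    for samples in (1, 2, 4):
        print(samples, "sample(s):", tiles_needed(1280, 720, msaa_samples=samples), "tile(s)")
```

With the defaults (a 4-byte color format plus a 4-byte depth/stencil format), 1280 x 720 at 4xMSAA comes out to 3 tiles, matching the figure above, while the same target with no multi-sampling fits in a single tile.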