Game Engine Architecture

A load-hit-store is a particularly bad kind of cache miss, prevalent on the
PowerPC architectures found in the Xbox 360 and PLAYSTATION 3, in which
the CPU writes data to a memory address and then reads the data back before
it has had a chance to make its way through the CPU’s instruction pipeline and
out into the L1 cache. See http://assemblyrequired.crashworks.org/2008/07/08/load-hit-stores-and-the-__restrict-keyword for more details.
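
To make the stall concrete, here is a minimal sketch in C++ (the function names and the pointer-based accumulator are invented for illustration; this is not code from the article above). Because pSum might alias values, the compiler cannot legally cache *pSum in a register, so each loop iteration stores the sum and immediately loads it back:

// Potential load-hit-store: without aliasing information, the compiler
// must assume pSum might point into values[], so it stores *pSum and
// reloads it on every iteration of the loop.
void SumInPlace(const float* values, int count, float* pSum)
{
    for (int i = 0; i < count; ++i)
    {
        *pSum += values[i]; // store to *pSum, then load it right back
    }
}

// One fix (assuming pSum does not actually alias values): accumulate
// in a local variable, which the compiler can keep in a register, and
// perform a single store at the end.
void SumInRegister(const float* values, int count, float* pSum)
{
    float sum = *pSum;
    for (int i = 0; i < count; ++i)
    {
        sum += values[i];
    }
    *pSum = sum;
}

Declaring pSum with __restrict, as the article's title hints, promises the compiler that no aliasing occurs and permits the same optimization without restructuring the loop.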

5.2.3.2. Instruction Cache and Data Cache

When writing high-performance code for a game engine or for any other
performance-critical system, it is important to realize that both data and
code are cached. The instruction cache (I-cache) is used to preload executable
machine code before it runs, while the data cache (D-cache) is used to speed
up reads from and writes to main RAM. Most processors physically separate
the two caches. Hence it is possible for a program to slow down either because
of an I-cache miss or because of a D-cache miss.

5.2.3.3. Avoiding Cache Misses

The best way to avoid D-cache misses is to organize your data in contiguous
blocks that are as small as possible and then access them sequentially. This
yields the minimum number of cache misses. When the data is contiguous
(i.e., you don’t “jump around” in memory a lot), a single cache miss will load
the maximum amount of relevant data in one go. When the data is small, it
is more likely to fit into a single cache line (or at least a minimum number
of cache lines). And when you access your data sequentially (i.e., you don’t
“jump around” within the contiguous memory block), you achieve the mini-
mum number of cache misses, since the CPU never has to reload a cache line
from the same region of RAM.
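
As a rough illustration of these guidelines (the GameObject layout and the damping constant here are invented for this sketch), compare a contiguous, sequential traversal with a pointer-chasing one:

#include <cstddef>

struct GameObject
{
    float m_position[3];
    float m_velocity[3];
};

// Cache-friendly: the objects live in one contiguous array and are
// visited in order, so every byte of each loaded cache line is used
// before the CPU moves on to the next line.
void DampVelocities(GameObject* objects, std::size_t count)
{
    for (std::size_t i = 0; i < count; ++i)
    {
        for (int axis = 0; axis < 3; ++axis)
        {
            objects[i].m_velocity[axis] *= 0.99f;
        }
    }
}

// Cache-unfriendly: each pointer may lead to a different region of the
// heap, so every object visited can incur a fresh D-cache miss.
void DampVelocitiesScattered(GameObject** objects, std::size_t count)
{
    for (std::size_t i = 0; i < count; ++i)
    {
        for (int axis = 0; axis < 3; ++axis)
        {
            objects[i]->m_velocity[axis] *= 0.99f;
        }
    }
}
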
Avoiding I-cache misses follows the same basic principle as avoiding D-
cache misses, but the implementation requires a different approach. The
compiler and linker dictate how your code is laid out in memory, so you
might think you have little control over I-cache misses. However, most C/C++
linkers follow some simple rules that you can leverage once you know what
they are (a brief sketch follows the list below):


  • The machine code for a single function is almost always contiguous in
    memory. That is, the linker almost never splits a function up in order
    to intersperse another function in the middle. (Inline functions are the
    exception to this rule—more on this topic below.)

  • Functions are laid out in memory in the order they appear in the
    translation unit’s source code (.cpp file).
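
Here is a hypothetical sketch of how these two rules might be leveraged; the functions and file layout are invented for illustration. Because the linker emits functions in their order of appearance within a translation unit, placing a hot loop immediately after the helper it calls keeps both bodies in neighboring I-cache lines:

// physics.cpp (hypothetical) -- hot code kept together, in call order

// Small helper defined first...
static float Integrate(float v, float a, float dt)
{
    return v + a * dt;
}

// ...immediately followed by the tight loop that calls it, so their
// machine code ends up adjacent in memory.
void UpdateVelocities(float* velocities, const float* accels,
                      int count, float dt)
{
    for (int i = 0; i < count; ++i)
    {
        velocities[i] = Integrate(velocities[i], accels[i], dt);
    }
}

// Rarely executed code is placed at the end of the file, so it never
// sits between the hot functions above.
void DumpPhysicsDebugState()
{
    // (diagnostic printing would go here)
}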
