Game Engine Architecture

4. 3D Math for Games

__m128 addWithAssembly(__m128 a, __m128 b)
{
    __m128 r;
    __asm
    {
        movaps xmm0, xmmword ptr [a]
        movaps xmm1, xmmword ptr [b]
        addps  xmm0, xmm1
        movaps xmmword ptr [r], xmm0
    }
    return r;
}

__m128 addWithIntrinsics(__m128 a, __m128 b)
{
    __m128 r = _mm_add_ps(a, b);
    return r;
}

In the assembly language version, we have to use the __asm keyword to invoke inline assembly instructions, and we must create the linkage between the input parameters a and b and the SSE registers xmm0 and xmm1 manually, via movaps instructions. The version using intrinsics, on the other hand, is much more intuitive and clear, and the code is smaller. There's no inline assembly, and the SSE instruction looks just like a regular function call.

If you'd like to experiment with these example functions, they can be invoked via the following test bed main() function. Notice the use of another intrinsic, _mm_load_ps(), which loads values from an in-memory array of floats into an __m128 variable (i.e., into an SSE register). Also notice that we are forcing our four global float arrays to be 16-byte aligned via the __declspec(align(16)) directive. _mm_load_ps() requires a 16-byte-aligned address, so if we omit these directives, the program is liable to crash.
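As a brief aside before the test bed itself, the alignment requirement can also be checked at run time rather than assumed. The following is a minimal sketch, not part of the original listing, and it assumes an x86 compiler with SSE and C++11 support. The names isAligned16(), addFloat4(), kA and kB are hypothetical helpers introduced only for illustration. It uses alignas(16), which plays the same role as __declspec(align(16)) on compilers that lack the Microsoft extension, and it falls back to _mm_loadu_ps() and _mm_storeu_ps(), the unaligned counterparts of _mm_load_ps() and _mm_store_ps(), when a pointer is not on a 16-byte boundary.

```cpp
#include <xmmintrin.h>  // SSE intrinsics: __m128, _mm_load_ps, _mm_loadu_ps, ...
#include <cassert>
#include <cstdint>

// Hypothetical helper: true if p lies on a 16-byte boundary,
// which is what _mm_load_ps() and _mm_store_ps() require.
inline bool isAligned16(const void* p)
{
    return (reinterpret_cast<std::uintptr_t>(p) & 15u) == 0;
}

// Hypothetical helper: adds two arrays of four floats and writes
// the four sums to out. The aligned intrinsics are used only when
// all three pointers are 16-byte aligned; otherwise the unaligned
// variants are used, which accept any address.
inline void addFloat4(const float* a, const float* b, float* out)
{
    if (isAligned16(a) && isAligned16(b) && isAligned16(out))
    {
        _mm_store_ps(out, _mm_add_ps(_mm_load_ps(a), _mm_load_ps(b)));
    }
    else
    {
        _mm_storeu_ps(out, _mm_add_ps(_mm_loadu_ps(a), _mm_loadu_ps(b)));
    }
}

// Same data as the arrays A and B in the text, aligned with the
// standard C++11 alignas specifier instead of __declspec(align(16)).
alignas(16) float kA[4] = { 2.0f, -1.0f, 3.0f, 4.0f };
alignas(16) float kB[4] = { -1.0f, 3.0f, 4.0f, 2.0f };
```

Calling addFloat4(kA, kB, out) with a 16-byte-aligned out array takes the aligned path. Passing a pointer such as raw + 1 into an aligned float array takes the unaligned path and should produce the same sums. Whether the run-time branch is worth its cost depends on the code in question, and in many cases guaranteeing alignment up front, as the text does, is the simpler choice.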

#include <xmmintrin.h>

// ... function definitions from above ...

__declspec(align(16)) float A[]={2.0f,-1.0f,3.0f,4.0f};
__declspec(align(16)) float B[]={-1.0f,3.0f,4.0f,2.0f};
__declspec(align(16)) float C[]={0.0f,0.0f,0.0f,0.0f};
__declspec(align(16)) float D[]={0.0f,0.0f,0.0f,0.0f};

int main(int argc, char* argv[])
{
    // load a and b from floating-point data arrays above
    __m128 a = _mm_load_ps(&A[0]);
    __m128 b = _mm_load_ps(&B[0]);
