Fast and Accurate Color Depth Conversion

When converting between low-bit-depth color formats (R4G4B4A4, R5G6B5, R8G8B8A8, and so on), the conversion routines in common use often don't map to the nearest color channel value in the target bit depth. For example, converting 8-bit to 4-bit colors is typically done by simply discarding the lower 4 bits of each channel. This […]
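To make the difference concrete, here is a minimal sketch in C comparing truncation with round-to-nearest for a single 8-bit channel reduced to 4 bits. The function names (`to4_truncate`, `to4_nearest`, `to8_from4`, `max_roundtrip_error`) are hypothetical, and the rounding formula is one standard way to get the nearest value, not necessarily the method the post goes on to describe.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Truncation: keep the upper 4 bits and discard the lower 4. */
static uint8_t to4_truncate(uint8_t v) { return (uint8_t)(v >> 4); }

/* Nearest: round(v * 15 / 255), computed in integer arithmetic. */
static uint8_t to4_nearest(uint8_t v) { return (uint8_t)((v * 15 + 127) / 255); }

/* Expand 4 bits back to 8 by bit replication (v * 17): 0 -> 0, 15 -> 255. */
static uint8_t to8_from4(uint8_t v) { return (uint8_t)(v * 17); }

/* Worst-case round-trip error over all 256 inputs for a given quantizer. */
static int max_roundtrip_error(uint8_t (*quantize)(uint8_t)) {
    assert(quantize != NULL);
    int worst = 0;
    for (int v = 0; v < 256; ++v) {
        int err = abs((int)to8_from4(quantize((uint8_t)v)) - v);
        if (err > worst) worst = err;
    }
    return worst;
}
```

For example, the 8-bit value 240 truncates to 15, which expands back to 255, while the nearest 4-bit value is 14, which expands back to 238. Over the whole input range, truncation can be off by up to 15 after the round trip, whereas rounding to nearest is off by at most 8.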

Investigating SSE Cross Product Performance

Today’s little snippet shows a variant of the usual cross product implementation found in your average SSE vector library. In pseudo-code, we can express the cross product formula as […] This is reasonably straightforward to implement as an SSE2 function, named cross_4shuffles for reasons that will become apparent soon: […] At first glance, there doesn't seem to […]
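As a rough illustration of what such a function can look like, the sketch below pairs a scalar reference with a commonly seen four-shuffle SSE formulation. It is an assumption about the general technique, not the post's own code: the names `cross_scalar`, `cross_sse_sketch`, and `cross_simd` are hypothetical, and the wrapper falls back to the scalar path when SSE2 is not available at compile time.

```c
#include <assert.h>
#include <stddef.h>

/* Scalar reference: out = a x b. */
static void cross_scalar(const float a[3], const float b[3], float out[3]) {
    out[0] = a[1] * b[2] - a[2] * b[1];
    out[1] = a[2] * b[0] - a[0] * b[2];
    out[2] = a[0] * b[1] - a[1] * b[0];
}

#if defined(__SSE2__) || defined(_M_X64)
#include <emmintrin.h>

/* Four shuffles: build the (y,z,x) and (z,x,y) permutations of both inputs,
   then compute a_yzx * b_zxy - a_zxy * b_yzx. The w lane ends up as 0. */
static __m128 cross_sse_sketch(__m128 a, __m128 b) {
    __m128 a_yzx = _mm_shuffle_ps(a, a, _MM_SHUFFLE(3, 0, 2, 1));
    __m128 b_zxy = _mm_shuffle_ps(b, b, _MM_SHUFFLE(3, 1, 0, 2));
    __m128 a_zxy = _mm_shuffle_ps(a, a, _MM_SHUFFLE(3, 1, 0, 2));
    __m128 b_yzx = _mm_shuffle_ps(b, b, _MM_SHUFFLE(3, 0, 2, 1));
    return _mm_sub_ps(_mm_mul_ps(a_yzx, b_zxy), _mm_mul_ps(a_zxy, b_yzx));
}
#endif

/* Array-based wrapper so both paths share one interface. */
static void cross_simd(const float a[3], const float b[3], float out[3]) {
    assert(a != NULL && b != NULL && out != NULL);
#if defined(__SSE2__) || defined(_M_X64)
    float tmp[4];
    __m128 va = _mm_set_ps(0.0f, a[2], a[1], a[0]); /* lanes: x, y, z, w=0 */
    __m128 vb = _mm_set_ps(0.0f, b[2], b[1], b[0]);
    _mm_storeu_ps(tmp, cross_sse_sketch(va, vb));
    out[0] = tmp[0]; out[1] = tmp[1]; out[2] = tmp[2];
#else
    cross_scalar(a, b, out); /* portable fallback without SSE2 */
#endif
}
```

The wrapper loads and stores through plain arrays purely to keep the sketch easy to check against the scalar version; a real vector library would typically keep values in `__m128` registers throughout.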