Texture Compression

GPU texture compression is one of those corners of graphics programming that almost nobody thinks about until something goes wrong — a mobile game's textures look blocky, a cross-platform engine needs fallback paths, or someone realizes their texture memory budget is blown. But the design constraints of these formats are genuinely interesting, and the evolution from S3TC to ASTC traces the same arc of increasing complexity that you see in video codecs, except with a fundamental restriction that video doesn't have: random access.

The Unique Constraint

Video and image compression (JPEG, H.264, AVIF) can make any part of the file depend on any other part. A pixel in the bottom-right can reference a pixel in the top-left. This is why they compress so well — long-range dependencies capture structure efficiently. GPU texture compression cannot do this.¹ A GPU needs O(1) mapping from texture coordinate to memory address, because any pixel shader invocation might sample any texel at any time. The only way to guarantee this is fixed-size blocks: divide the texture into (usually) 4x4 pixel blocks, encode each block in a fixed number of bits, and the address of any block is just arithmetic.
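The "just arithmetic" claim can be made concrete with a minimal sketch. It assumes a plain row-major layout of blocks and 8-byte blocks (as in BC1); the function names are illustrative, and real GPUs usually apply an additional tiling or swizzle pattern on top, which is still a fixed function of the coordinates.

```python
def block_offset(x, y, tex_width, block_w=4, block_h=4, bytes_per_block=8):
    """Byte offset of the compressed block that contains texel (x, y).

    Assumes blocks are stored row by row with no padding or swizzling.
    """
    # Round up so a texture whose width is not a multiple of 4 still works.
    blocks_per_row = (tex_width + block_w - 1) // block_w
    block_x = x // block_w
    block_y = y // block_h
    return (block_y * blocks_per_row + block_x) * bytes_per_block


def texel_in_block(x, y, block_w=4, block_h=4):
    """Position of texel (x, y) inside its block."""
    return x % block_w, y % block_h
```

No other block has to be read or decoded first, which is exactly what a variable-rate format such as JPEG cannot offer.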

This is why texture compression ratios are modest compared to image formats. S3TC achieves 6:1 compression on 24-bit RGB (4 bits per pixel). JPEG XL achieves ratios of 20:1 or better at comparable quality.² Most of that gap comes from the random-access constraint and the fixed bit rate per block that it forces.
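As a quick check of the 6:1 figure, here is the arithmetic for a single texture. The 2048x2048 size is only an example, and mipmaps are ignored.

```python
def texture_bytes(width, height, bits_per_pixel):
    """Storage for one mip level at a fixed bit rate."""
    return width * height * bits_per_pixel // 8


width = height = 2048
rgb8_bytes = texture_bytes(width, height, 24)  # uncompressed 8-bit RGB
bc1_bytes = texture_bytes(width, height, 4)    # BC1: 64 bits per 4x4 block
ratio = rgb8_bytes / bc1_bytes

print(f"RGB8: {rgb8_bytes} bytes, BC1: {bc1_bytes} bytes, ratio {ratio:.0f}:1")
```

Measured against 32-bit RGBA storage, which is how uncompressed textures are often held in memory, the same BC1 data works out to 8:1.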

S3TC: The Granddaddy

BC1 (originally S3TC/DXT1) is almost insultingly simple and remains the most widely used texture compression format on desktop GPUs. Each 4x4 block gets 64 bits: 32 bits for two RGB565 endpoints and 32 bits for 16 two-bit interpolation weights. That's it — every pixel in the block is a linear blend of two colors, with four possible blend values (0, 1/3, 2/3, 1).¹
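The layout described above can be sketched as a small decoder. This is an illustration rather than a reference implementation: it assumes the block is in the fully interpolated mode, and the bit replication and integer division are one rounding choice among several that real hardware makes. The function names are made up for the example.

```python
import struct


def expand565(c):
    """Expand a packed RGB565 value to 8-bit-per-channel RGB by bit replication."""
    r = (c >> 11) & 0x1F
    g = (c >> 5) & 0x3F
    b = c & 0x1F
    return ((r << 3) | (r >> 2), (g << 2) | (g >> 4), (b << 3) | (b >> 2))


def decode_bc1_opaque(block):
    """Decode one 8-byte BC1 block into 16 RGB tuples in row-major order.

    Assumes the four-color mode; rounding here is an assumption, not the spec.
    """
    # Two little-endian 16-bit endpoints, then 32 bits of 2-bit indices.
    c0_raw, c1_raw, bits = struct.unpack("<HHI", block)
    c0, c1 = expand565(c0_raw), expand565(c1_raw)
    palette = [
        c0,                                                # index 0: endpoint 0
        c1,                                                # index 1: endpoint 1
        tuple((2 * a + b) // 3 for a, b in zip(c0, c1)),   # index 2: 1/3 of the way to c1
        tuple((a + 2 * b) // 3 for a, b in zip(c0, c1)),   # index 3: 2/3 of the way to c1
    ]
    # Texel i uses bits 2i and 2i+1, starting from the least significant bit.
    return [palette[(bits >> (2 * i)) & 0x3] for i in range(16)]
```

Note that the stored indices are not in blend order: indices 0 and 1 select the endpoints themselves, and indices 2 and 3 select the two intermediate colors.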

There's a clever bit: since swapping the two endpoints and inverting the weights produces the same block, there's a symmetry to exploit. Based on which endpoint has the larger integer representation, the hardware selects between two modes — one with full interpolation and one with 1-bit punch-through alpha (where one weight value maps to transparent black). This "exploit the symmetry for an extra mode bit" trick recurs in almost every subsequent format.
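A minimal sketch of that mode selection, building the four-entry RGBA palette for a block from its two raw RGB565 endpoints. The helper names are illustrative and the integer rounding is an assumption, since exact results vary between decoders.

```python
def expand565(c):
    """Expand a packed RGB565 value to 8-bit-per-channel RGB by bit replication."""
    r = (c >> 11) & 0x1F
    g = (c >> 5) & 0x3F
    b = c & 0x1F
    return ((r << 3) | (r >> 2), (g << 2) | (g >> 4), (b << 3) | (b >> 2))


def bc1_palette(c0_raw, c1_raw):
    """Return four RGBA entries; the endpoint order picks the mode."""
    c0, c1 = expand565(c0_raw), expand565(c1_raw)
    if c0_raw > c1_raw:
        # Four-color mode: both intermediate blends, everything opaque.
        third = tuple((2 * a + b) // 3 for a, b in zip(c0, c1))
        two_thirds = tuple((a + 2 * b) // 3 for a, b in zip(c0, c1))
        return [c0 + (255,), c1 + (255,), third + (255,), two_thirds + (255,)]
    # Punch-through mode: one midpoint, and index 3 means transparent black.
    mid = tuple((a + b) // 2 for a, b in zip(c0, c1))
    return [c0 + (255,), c1 + (255,), mid + (255,), (0, 0, 0, 0)]
```

An encoder that wants the opaque mode simply stores the larger endpoint first and adjusts the indices to match, so the comparison carries one bit of mode information without spending any of the 64 bits on it.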

BC3 adds proper alpha by splicing a BC1 color block with a separate alpha block that uses two 8-bit endpoints and 3-bit interpolation weights: same principle, more bits for alpha. BC4 and BC5 (RGTC) reuse that alpha block format for one and two independent channels respectively, which is what you want for data such as normal maps and metallic-roughness maps, where the channels are not correlated with each other.
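The single-channel block can be sketched the same way. One detail not spelled out above is that this block also uses the endpoint-ordering trick: when the first endpoint is not larger than the second, two of the eight palette slots become the constants 0 and 255. The function names and the truncating integer division are assumptions for illustration.

```python
def bc4_palette(a0, a1):
    """Eight-entry palette for one BC4 block (also the BC3 alpha block)."""
    if a0 > a1:
        # Six interpolated values between the two endpoints.
        interp = [((7 - k) * a0 + k * a1) // 7 for k in range(1, 7)]
        return [a0, a1] + interp
    # Four interpolated values, plus explicit 0 and 255.
    interp = [((5 - k) * a0 + k * a1) // 5 for k in range(1, 5)]
    return [a0, a1] + interp + [0, 255]


def decode_bc4(block):
    """Decode one 8-byte BC4 block into 16 values in row-major order."""
    a0, a1 = block[0], block[1]
    # The remaining 48 bits hold sixteen 3-bit indices, least significant first.
    bits = int.from_bytes(block[2:8], "little")
    palette = bc4_palette(a0, a1)
    return [palette[(bits >> (3 * i)) & 0x7] for i in range(16)]
```

BC5 is then two such blocks side by side, one per channel, decoded independently.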

The S3TC formats have one embarrassing property: the specification uses floating-point arithmetic with unspecified precision, so there's no bit-exact decoded result. Different hardware produces slightly different pixels from the same compressed data. MPEG-1 made the same mistake with its inverse DCT specification. Later formats learned from this.

ETC: Mobile's Answer

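ETC1 was the de facto universal mobile texture format of the OpenGL ES 2.0 era: it was exposed as an extension rather than required by the core specification, but nearly every ES 2.0 GPU supported it. ETC2, which OpenGL ES 3.0 does mandate, extended it with full alpha support and one- and two-channel (EAC) variants; it has no HDR mode. These formats share S3TC's overall shape, a small amount of per-block color data plus a few bits of per-pixel selection, but the encoding details differ significantly, and ETC2 has genuinely clever mode selection for challenging blocks.¹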

The fragmentation between desktop (S3TC/BC) and mobile (ETC) texture formats has been a persistent annoyance for cross-platform engines. A texture authored once needs to be compressed in multiple formats for different target GPUs. Basis Universal attempted to solve this with a "universal" intermediate format that can be quickly transcoded to any target format, though the quality loss from double compression is nonzero.

BPTC and ASTC: The Modern Formats

BC6 and BC7 (BPTC) are the desktop state of the art, introduced around 2010. BC7 can compress high-quality color at 8 bits per pixel with multiple partitioning modes — the block can be split into 2 or 3 subsets, each with its own endpoints, allowing much better handling of blocks where colors don't lie along a single line in RGB space. BC6 is one of only two ways to compress HDR textures (ASTC being the other).¹

ASTC (Adaptive Scalable Texture Compression), from ARM/Mali, is the final boss. Every block is 128 bits, but the block footprint is variable (4x4 through 12x12 texels), so the bit rate is chosen by picking a footprint. It supports LDR and HDR, 1-4 channels, up to 4 partitions per block, and a bewildering number of encoding modes. The specification is complex enough that the author of a thorough review calls it a format "mere mortals are not supposed to understand."¹
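Because the block is always 128 bits, the bit rate follows directly from the chosen block dimensions. A small sketch of that arithmetic, using a few of the 2D block sizes as examples:

```python
ASTC_BLOCK_BITS = 128  # every ASTC block is the same size in memory


def astc_bits_per_pixel(block_w, block_h):
    """Bit rate implied by spreading one 128-bit block over block_w x block_h texels."""
    return ASTC_BLOCK_BITS / (block_w * block_h)


for w, h in [(4, 4), (6, 6), (8, 8), (12, 12)]:
    print(f"{w}x{h}: {astc_bits_per_pixel(w, h):.2f} bpp")
```

At 4x4 this matches BC7's 8 bits per pixel; at 12x12 it drops below 1 bit per pixel, which is where the quality tradeoff becomes very visible.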

The escalating complexity creates a real tradeoff for encoding. Simpler formats (BC1) can be encoded quickly because there are few modes to search. ASTC's enormous configuration space means the encoder has to search a vast space of possible representations, and brute-forcing the optimal encoding is impractical. This is why texture compression quality depends heavily on the encoder, not just the format: a fast ASTC encoder will produce noticeably worse results than a slow one on the same texture.

Image Formats and the Web

The web faces a parallel story. JPEG XL represents the culmination of decades of image compression research, with one unique trick: lossless recompression of existing JPEG files, typically achieving 20% size reduction with bit-exact reconstruction of the original JPEG.² No other format can do this, which matters enormously when you have billions of existing JPEG images.

JPEG XL also supports progressive decoding — you can display a preview from 15% of the data and refine as more arrives — which none of the video-derived formats (WebP, AVIF) support at the codec level. This matters for web delivery in a way that isn't obvious from compression benchmarks alone. Cloudinary's large-scale subjective experiments showed JPEG XL achieving 10-15% better compression than AVIF at 3x the encoding speed, and 30-35% better than MozJPEG.²

Chrome's decision to remove JPEG XL support despite this evidence remains controversial. The stated reasons — insufficient ecosystem interest, insufficient incremental benefit — don't hold up well against the technical data. But browser politics and format wars have always been driven by factors beyond pure technical merit.

The Common Thread

Across all these formats, the fundamental technique is the same: pick two (or more) color endpoints, then specify per-pixel interpolation weights between them. The evolution is in how many ways you can specify the endpoints (more modes, partitions, decorrelation), how many bits you spend on weights (2-4 bits per pixel), and how cleverly you exploit symmetries for extra mode bits. The compression happens because interpolation weights are cheap — 2 bits per pixel — while full color is expensive — 24 bits per pixel. As long as the block's colors can be reasonably represented as blends between a few endpoints, you win. When they can't — text on a colorful background, random noise, high-frequency patterns — every format suffers, and the more complex formats just suffer less.
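To make the shared technique concrete, here is a deliberately naive encoder sketch for a single 4x4 block: take the bounding-box corners of the block's colors as endpoints, then give each pixel the nearest of four evenly spaced points along the line between them. This is not how any production encoder chooses endpoints; it keeps the endpoints at full 8-bit precision instead of quantizing them, and all names are illustrative.

```python
def fit_block(pixels):
    """Fit 16 RGB pixels with two endpoints and one 2-bit weight per pixel."""
    lo = tuple(min(p[c] for p in pixels) for c in range(3))
    hi = tuple(max(p[c] for p in pixels) for c in range(3))
    axis = tuple(h - l for h, l in zip(hi, lo))
    length_sq = sum(a * a for a in axis)
    weights = []
    for p in pixels:
        if length_sq == 0:
            weights.append(0)  # flat block: every pixel equals the endpoint
            continue
        # Project the pixel onto the lo -> hi line, then quantize to 0..3.
        t = sum((p[c] - lo[c]) * axis[c] for c in range(3)) / length_sq
        weights.append(min(3, max(0, round(t * 3))))
    return lo, hi, weights


def reconstruct(lo, hi, weights):
    """Rebuild the block from endpoints and weights."""
    return [
        tuple(round(lo[c] + (hi[c] - lo[c]) * w / 3) for c in range(3))
        for w in weights
    ]


def max_error(pixels):
    """Largest per-channel difference after an encode/decode round trip."""
    lo, hi, weights = fit_block(pixels)
    decoded = reconstruct(lo, hi, weights)
    return max(abs(a - b) for p, q in zip(pixels, decoded) for a, b in zip(p, q))


gradient = [(v, v, v) for v in (0, 85, 170, 255)] * 4
noisy = [(255, 0, 0), (0, 255, 0), (0, 0, 255), (255, 255, 255)] * 4
print(max_error(gradient), max_error(noisy))
```

The gray gradient lies exactly on one line in RGB space and survives the round trip, while the block that mixes saturated red, green, and blue does not, which is the failure case described above. A bounding-box fit also picks the wrong diagonal when one channel rises as another falls, one reason real encoders search more carefully.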

¹ Compressed GPU texture formats – a review and compute shader decoders, by Hans-Kristian Arntzen
² The Case for JPEG XL, by Jon Sneyers
