2D Vector Graphics on GPU

Here's the thing that surprises people: 2D vector graphics are harder to render on a GPU than 3D scenes with realistic lighting. Real-time ray tracing is shipping in consumer games, but we still can't draw Bezier curves on the GPU without significant compromises. The reasons are historical, architectural, and mathematical — and untangling them reveals a lot about how GPUs ended up the way they are.

Why PostScript Won (And Why That Matters)

Jasper St. Pierre's history of 2D graphics traces a direct line from Xerox PARC's Interpress to PostScript to PDF to SVG to the HTML5 Canvas API. The imaging model that Adobe shipped with the first LaserWriter in 1985 — fill paths described by Bezier curves, apply affine transforms, composite with alpha — is essentially unchanged in every 2D graphics API we use today. Microsoft's GDI was based on PostScript's model. SVG was born from two competing proposals, both PostScript derivatives. Even the Canvas API is a near line-by-line translation of PostScript's operator set.¹

This matters because PostScript was designed for a world where a printer had one CPU and no parallelism. The classic algorithms for filling Bezier paths (scan conversion with a winding number count) are serial as usually written. You sweep a horizontal line across the path, find intersections, and fill between them. Within a scanline, the winding number is a running sum over the crossings in x order, and the set of active edges is updated incrementally from one scanline to the next. This is the opposite of what GPUs want.
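To make the serial dependency concrete, here is a minimal sketch of a nonzero-winding fill for a single scanline. It assumes the path has already been flattened to line segments, and the names (`scanline_spans`, `square`) are made up for illustration. A classic rasterizer would also carry an active edge list from one scanline to the next, which this sketch omits.

```python
def scanline_spans(edges, y):
    """Return filled (x_start, x_end) spans on scanline y, nonzero winding rule.

    edges: list of ((x0, y0), (x1, y1)) segments of a closed, flattened path.
    """
    crossings = []
    for (x0, y0), (x1, y1) in edges:
        if y0 == y1:
            continue  # horizontal edges never cross a scanline
        # Half-open test so a vertex shared by two edges is counted once.
        if min(y0, y1) <= y < max(y0, y1):
            t = (y - y0) / (y1 - y0)
            x = x0 + t * (x1 - x0)
            direction = 1 if y1 > y0 else -1
            crossings.append((x, direction))
    crossings.sort()

    spans = []
    winding = 0
    span_start = None
    # The serial part: winding is a running sum over crossings in x order.
    for x, direction in crossings:
        previous = winding
        winding += direction
        if previous == 0 and winding != 0:
            span_start = x
        elif previous != 0 and winding == 0:
            spans.append((span_start, x))
    return spans


# A 4x4 axis-aligned square as four segments.
square = [((0, 0), (4, 0)), ((4, 0), (4, 4)), ((4, 4), (0, 4)), ((0, 4), (0, 0))]
```

For the square above, `scanline_spans(square, 2)` yields a single span from x = 0 to x = 4, and a scanline outside the shape yields no spans. The loop over `crossings` is the part that cannot simply be split across threads, because each step needs the winding value left by the previous one.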

Meanwhile, 3D graphics evolved around triangles from the start: hand-modeled polygons that could be rasterized independently and in parallel. The GPU hardware was built to accelerate triangles. Curved representations like Catmull-Clark subdivision surfaces existed in 3D modeling, but offline tools converted them to triangle meshes before rendering. The GPU never needed to handle curves natively.¹

So we ended up with a mismatch: the dominant 2D imaging model demands Bezier curves in real-time, and the hardware that could theoretically render them was built for a completely different geometry representation.

What Makes 2D Harder

It's not just the curves. Several properties of 2D content make it harder for GPUs than 3D scenes:¹

Antialiasing demands precision. In 3D, 4x MSAA is usually acceptable because silhouettes are a small fraction of the frame and camera rotation creates saccadic masking that hides artifacts. In 2D, especially text, you want the exact area under the curve for each pixel. A font rendered at 12px with MSAA artifacts is unacceptable. You need analytic coverage computation, not sample-based approximation.
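The gap between sampled and analytic coverage is easy to see on a single pixel crossed by a straight edge. The sketch below is illustrative only: it clips the unit pixel against a half-plane `a*x + b*y <= c` and measures the exact area, then compares that with a 4-sample estimate. The sample offsets in `MSAA_4X` are a hypothetical rotated-grid pattern, not any particular GPU's.

```python
def clip_to_half_plane(polygon, a, b, c):
    """Keep the part of a convex polygon where a*x + b*y <= c."""
    result = []
    for i, (x0, y0) in enumerate(polygon):
        x1, y1 = polygon[(i + 1) % len(polygon)]
        d0 = a * x0 + b * y0 - c
        d1 = a * x1 + b * y1 - c
        if d0 <= 0:
            result.append((x0, y0))
        # Edge strictly crosses the boundary: emit the intersection point.
        if (d0 < 0 < d1) or (d1 < 0 < d0):
            t = d0 / (d0 - d1)
            result.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return result


def polygon_area(polygon):
    """Shoelace formula; returns 0.0 for an empty polygon."""
    total = 0.0
    for i, (x0, y0) in enumerate(polygon):
        x1, y1 = polygon[(i + 1) % len(polygon)]
        total += x0 * y1 - x1 * y0
    return abs(total) / 2.0


PIXEL = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
# Hypothetical rotated-grid sample offsets; real hardware patterns vary.
MSAA_4X = [(0.375, 0.125), (0.875, 0.375), (0.125, 0.625), (0.625, 0.875)]


def analytic_coverage(a, b, c):
    """Exact fraction of the pixel covered by the half-plane."""
    return polygon_area(clip_to_half_plane(PIXEL, a, b, c))


def msaa_coverage(a, b, c):
    """Fraction of sample points that fall inside the half-plane."""
    inside = sum(1 for x, y in MSAA_4X if a * x + b * y <= c)
    return inside / len(MSAA_4X)
```

With a vertical edge at x = 0.3 (`a=1, b=0, c=0.3`), the analytic coverage is 0.3 while the four samples report 0.25. Moving the edge to x = 0.36 changes the analytic answer but not the sampled one, which is the kind of quantization that shows up as uneven stem weights in small text.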

Stability matters more than motion. 3D games lean heavily on temporal AA to hide artifacts across frames. In 2D, you're panning and scrolling through static content — text, UI, diagrams. TAA's temporal jitter and ghosting are immediately visible. The content has to look correct in every individual frame.

The geometry is the content. In 3D, geometry is a scaffold for materials and lighting. In 2D, the shapes are what you're looking at. A mitered corner on a path, the exact curvature of a glyph serif, the precise tangent of a Bezier join — these are all visible, and imprecision is a bug.

The Font Problem

Fonts are where the difficulty becomes most acute, and where the most creative solutions have emerged. Traditional GPU text rendering uses a font atlas — pre-rasterize every needed glyph at every needed size into a texture. This works but consumes memory and falls apart when you zoom, rotate, or hit a glyph size that wasn't pre-rendered.²

Signed distance fields (popularized by Valve's 2007 paper) improved the situation dramatically. Store the distance from each texel to the nearest glyph edge, and the GPU can produce crisp edges at arbitrary scale using a simple threshold in the fragment shader. This is why SDF techniques spread so quickly — the shader is trivial and the results look great. But SDFs still round sharp corners at low resolution, which is fatal for text quality.²
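The "simple threshold" can be sketched in a few lines. This is a Python stand-in for what would normally be fragment shader code, assuming the common convention that the filtered distance sample lies in [0, 1] with 0.5 on the glyph edge and larger values inside. The `smoothing` width is an illustrative parameter, not a value from the paper.

```python
def smoothstep(edge0, edge1, x):
    """Hermite interpolation clamped to [0, 1], as in GLSL's smoothstep."""
    t = min(max((x - edge0) / (edge1 - edge0), 0.0), 1.0)
    return t * t * (3.0 - 2.0 * t)


def sdf_alpha(distance_sample, smoothing=0.05):
    """Map a filtered distance sample to an alpha value.

    distance_sample: bilinearly filtered texel in [0, 1], 0.5 on the edge
    (an assumed encoding). smoothing: half-width of the soft edge band.
    """
    return smoothstep(0.5 - smoothing, 0.5 + smoothing, distance_sample)
```

Because bilinear filtering interpolates distances rather than coverage, the 0.5 contour stays sharp under magnification. In practice the smoothing width is usually tied to the screen-space scale of the glyph so the soft band stays about one pixel wide.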

Will Dobbie's vector texture approach goes further. Instead of pre-rasterizing glyphs, pack the actual Bezier curve control points into a texture atlas, with a spatial grid that tells each cell which curves intersect it. The fragment shader reads the control points and computes coverage by casting rays at multiple angles and finding intersections with the quadratic Bezier curves — essentially solving the quadratic formula per curve per sample per pixel.²
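The per-curve work reduces to solving one quadratic. The following sketch is not Dobbie's shader; it shows only the core intersection test for a single horizontal ray, written in Python for readability. The function names are illustrative. A quadratic Bezier with control points p0, p1, p2 has y(t) = (y0 - 2*y1 + y2)*t^2 + 2*(y1 - y0)*t + y0, so setting y(t) equal to the ray's y gives the roots directly.

```python
import math


def quadratic_roots(a, b, c):
    """Real roots of a*t^2 + b*t + c = 0, including the linear case."""
    if abs(a) < 1e-12:
        return [] if abs(b) < 1e-12 else [-c / b]
    disc = b * b - 4.0 * a * c
    if disc < 0.0:
        return []
    s = math.sqrt(disc)
    return [(-b - s) / (2.0 * a), (-b + s) / (2.0 * a)]


def ray_crossings(p0, p1, p2, px, py):
    """x positions where a +x ray from (px, py) hits the curve p0, p1, p2."""
    a = p0[1] - 2.0 * p1[1] + p2[1]
    b = 2.0 * (p1[1] - p0[1])
    c = p0[1] - py
    hits = []
    for t in quadratic_roots(a, b, c):
        # Half-open parameter range so shared endpoints are not counted twice.
        if 0.0 <= t < 1.0:
            u = 1.0 - t
            x = u * u * p0[0] + 2.0 * u * t * p1[0] + t * t * p2[0]
            if x > px:
                hits.append(x)
    return sorted(hits)
```

For an arch with control points (0, 0), (1, 2), (2, 0), a ray at y = 0.75 starting left of the curve crosses it at x = 0.5 and x = 1.5, and a ray at y = 2 misses it entirely. A coverage estimate would repeat this for every curve in the pixel's grid cell and for several ray directions, which is where the GPU cost comes from.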

The technique produces pixel-accurate results at any scale with zero CPU cost at runtime. The tradeoff is GPU cost: multiple ray-curve intersection tests per pixel. But it eliminates the atlas management problem entirely and handles rotation, scaling, and subpixel positioning for free, because the computation happens in screen space from the original vector data.²

The GPU 2D Rendering Renaissance

The last decade has seen an explosion of research into GPU-accelerated 2D vector rendering, driven partly by the realization that existing solutions are embarrassingly bad. Raph Levien's work on piet-gpu (now Vello) is arguably the most ambitious: a compute shader-based 2D rendering pipeline that processes paths entirely on the GPU.

The core challenge Levien tackles is that 2D rendering primitives — fills, strokes, clips — have complex ordering and coverage semantics that resist naive parallelization. His approach uses a series of compute shader stages connected by prefix sums to process path elements, flatten curves, bin them into tiles, and compute per-pixel coverage. The prefix sum is the fundamental building block: it turns "each element depends on all previous elements" into a parallel-friendly operation.³
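As a rough illustration of why the prefix sum parallelizes, here is a textbook work-efficient exclusive scan (the Blelloch up-sweep and down-sweep form of tree reduction). It is a serial Python sketch, not Vello's implementation. The point is that within each level of the tree, every loop iteration touches disjoint elements, so a GPU could run each level as one parallel step.

```python
def exclusive_scan(values):
    """Work-efficient exclusive prefix sum (up-sweep then down-sweep).

    Iterations of each inner loop are independent of one another, so each
    tree level could run in parallel. Here they simply run serially.
    """
    n = len(values)
    size = 1
    while size < n:
        size *= 2
    data = list(values) + [0] * (size - n)  # pad to a power of two

    # Up-sweep: build partial sums in a binary tree.
    stride = 1
    while stride < size:
        for i in range(2 * stride - 1, size, 2 * stride):
            data[i] += data[i - stride]
        stride *= 2

    # Down-sweep: push prefixes back down the tree.
    data[size - 1] = 0
    stride = size // 2
    while stride >= 1:
        for i in range(2 * stride - 1, size, 2 * stride):
            left = data[i - stride]
            data[i - stride] = data[i]
            data[i] += left
        stride //= 2
    return data[:n]
```

One typical use, assumed here for illustration: if flattening turns four path elements into 3, 1, 4, and 2 line segments, `exclusive_scan([3, 1, 4, 2])` gives the offsets 0, 3, 4, 8 at which each element can write its segments into a shared output buffer without coordinating with its neighbors.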

The portability problems Levien encountered are instructive. His implementation of the decoupled look-back prefix sum runs 1.5x faster than tree reduction but requires device-scope memory barriers that Metal doesn't support. WebGPU can't require them either. So the fastest algorithm for the core operation of the 2D pipeline is unportable. The practical choice is tree reduction everywhere, with decoupled look-back as an optional fast path on Vulkan and DX12.³

Vello is now a functional rendering engine: it can draw large 2D scenes at interactive framerates using wgpu for GPU access, and serves as the rendering backend for Xilem, a Rust GUI toolkit. The project reached 177 fps on the paris-30k test scene on an M1 Max at a resolution of 1600 pixels square, a strong result for a compute-only 2D renderer with no hardware rasterization assistance. It supports SVG (via vello_svg), Lottie animations (via velato), and Bevy integration (via bevy_vello). The WebGPU backend means it can run in browsers with compute shader support, though browser availability is still catching up.⁴

The mobile performance story is instructive. On Adreno 640 (Pixel 4), the fine rasterization stage was 5.6x slower than on Intel integrated graphics — not because the hardware lacked compute power but because the shader compiler, seeing read-write access to the same buffer, conservatively bypassed texture cache for all reads. The diagnosis required reading Adreno ISA disassembly via Mesa's Freedreno tools. The fix — segregating read-only scene data from read-write clip stack into separate buffers — brought performance within 2x of the desktop reference, in line with the hardware capability. This kind of compiler-level performance cliff is what makes compute-based 2D rendering a harder portability problem than traditional rasterization.⁵

Jasper St. Pierre notes that several alternative approaches have emerged in parallel: Blend2D's JIT CPU rasterizer, Pathfinder's multiple GPU-based approaches (Patrick Walton explored three separate architectures), and the SDF-based techniques that continue to evolve. The field is actively contested in a way that 3D rendering hasn't been for years — there's no equivalent of "just use deferred shading" consensus.¹

Could It Have Been Different?

St. Pierre's essay ends with a counterfactual worth considering. If PostScript hadn't won — if the dominant 2D model didn't require real-time Bezier evaluation — would GPUs have evolved differently? And if triangles hadn't dominated 3D so completely — if implicit surfaces had been taken seriously sooner — would we have hardware curve support today?¹

Donald Knuth's METAFONT, contemporary with PostScript, used a completely different model: recursive path stroking with "most pleasing curves" rather than explicit Bezier fills. It didn't catch on because it was too mathematical for type designers. But the underlying idea — that 2D graphics could be defined differently from the PostScript model — keeps resurfacing. Apple's iWork suite quietly adopted Hobby splines (from Knuth's student John Hobby) as the default spline type. The history isn't as settled as it looks.¹

Meanwhile, the demoscene has been rendering implicit curves and signed distance functions on GPUs for years, demonstrating that the hardware can handle non-triangular geometry when you're creative enough. The question is whether production 2D rendering will converge on compute-based approaches like Vello or whether some future GPU architecture will add native curve hardware. Given the current trajectory — more compute, more programmability, less fixed-function — I'd bet on the software path.

¹ Why are 2D vector graphics so much harder than 3D? by Jasper St. Pierre
² GPU text rendering with vector textures by Will Dobbie
³ Prefix sum on portable compute shaders by Raph Levien
⁴ Vello: An experimental GPU compute-centric 2D renderer by Linebender
⁵ The case of the curiously slow shader by Raph Levien