Step-by-Step with OpenGL’s Graphics Pipeline

Good Evening everyone! I hope you’re all doing A-Ok. Yesterday I received a PM:

Hey. Do you know anything about game rendering? I’ve seen your blog you must know something about game rendering.

I think this person is referring to Graphic Pipeline, which different APIs dub it as “rendering application”. “The main function of the pipeline is to generate, or render, a two-dimensional image, given a virtual camera,three-dimensional objects, light sources, and more”. 1 Now, SIMD architecture with its many cores makes it possible for parallel calculations, and it is de facto the main reason we have GPUs today. Graphics Pipeline is made up of shaders. Shaders are parallel programs that live in the GPU. OpenGL has five shader stages, and about seven application stages in its Pipeline 2. But what are t he stages of this pipleline? What does pipleline even mean? Read this article and find out.

The History of the Word “Pipeline”

The word pipeline became popular when gas lamps were invented. “The first Canadian transmission pipeline was built in 1853. A 25 kilometre cast-iron pipe moving natural gas to Trois Rivières, QC. It was the longest pipeline in the world at the time.” 3

What does pipeline mean? Simple. Imagine you wish to transmit something from one stage, to another stage, then another stage, until the output is unrecognizable from the input. For example, in OpenGL, this can be the input:

float vertices[] = {
// first triangle
0.5f,  0.5f, 0.0f,  // top right
0.5f, -0.5f, 0.0f,  // bottom right
-0.5f,  0.5f, 0.0f,  // top left
// second triangle
0.5f, -0.5f, 0.0f,  // bottom right
-0.5f, -0.5f, 0.0f,  // bottom left
-0.5f,  0.5f, 0.0f   // top left
};

And this can be the output:

Some stages of the OpenGL are mandatory, some stages are optional. You can see this pipeline in the following picture.

Based on two books, The OpenGL Red Book 9th Edition and Real-Time Rendering 4th Edition I wish to explain to you this so-called Pipeline. Turn off your phones, and pay attention!

1. Vertex Data

We saw an example of vertices attributes in the first code in this very post. In OpenGL, vertices are recorded in Homogeneous Coordinate System, which is a complicated 4-point coordinate system. I’ll talk about this coordinate system in another post, very soon. Keep in mind that vertices can be hardcoded in the shader, but most of the times, it is passed to the vertex shader as a Vertex Buffer Object.

In Vertex Shader, three matrices are multiplied by the outgoing gl_Position. One is the Model Matrix. the other is the View Matrix, and at the end, we have the Projection Matrix. You can clearly see them in the following picture:

The model matrix deals with the object. The view matrix deals with the world around it. The projection matrix deals with the camera. One day, I’ll talk in detail about various camera projections.

“Imagine you have a bouncing ball object.If you represent it with a single set of triangles, you can run into problems with quality or performance. Your ball may look good from 5 meters away, but up close the individual triangles, especially along the silhouette, become visible. If you make the ball with more triangles to improve quality, you may waste considerable processing time and memory when the ball is far away and covers only a few pixels on the screen.With tessellation, a curved surface can be generated with an appropriate number of triangles.” 4

Tessellation is basically the process of breaking higher order primitives, such as cubes, into many small sub-primitives in order to create a more detailed geometry. In OpenGL, instead of cubes, we use patches. TCS is the first of two Tessellation Shaders, the other one being…

Ones the Tessellation Engine does its job, it produces a number of geometry vertices which are responsible for adding detail to the given patch. Tessellation Evaluation Shader invokes them. This shader runs for each generated vertex, and adds a lot of overhead. That’s why programmers shouldn’t be thrifty with TCS.

Geometry Shader looks like Vertex Shader, and uses gl_Position, but it’s responsible for creating multiple instances of a primitive through EmitVertex() and EndPrimitive(). “Each time we call EmitVertex the vector currently set to gl_Position is added to the primitive. Whenever EndPrimitive is called, all emitted vertices for this primitive are combined into the specified output render primitive. By repeatedly calling EndPrimitive after one or more EmitVertex calls multiple primitives can be generated. This particular case emits two vertices that were translated by a small offset from the original vertex position and then calls EndPrimitive, combining these two vertices into a single line strip of 2 vertices.” 5

6. Primitive Assembly

Primitive Assembly is a short stage. It is basically grouping of of primitives into lines and triangles. You may ask, well, what about points? They can’t be assembled? The question to your answer is, that yes, it does happen for points, but it’s redundant.

7. Clipping

After gl_Position is converted to Cartesian coordinates, and gets normalized, meaning it is carried out between -1 and 1, it is clipped for screen. Meaning, the primitives which are not in the viewport propogated by the projection matrix are discarded.  This stage is very important for overall performance reasons.

8. Culling

This is another step taken for performance. Each 3D primitive has its face to the camera, and its back is out of the view of the camera. So it’s entirely useless to render the back of the primitive also. Culling is the process of discarding the back in favour of the face of each primitive.

9. Rasterization

Rasterization is the process of converting 3D objects in computer’s memory to 2D objects to be displayed in the monitor, or rendered to a file. In other words, rasterization is the process of determining which fragments might be covered by a triangle or a line. Rasterization has two stages: trangle setup, and triangle traversal. In the former, “…. edge equations, and other data for the triangle are computed. These data may be used for triangle traversal, as well as for interpolation of the various shading data produced by the geometry stage. Fixed function hardware is used for this task. 6. In the latter, each pixel, or a sample, is covered by a triangle, and a fragment is generated for  each pixel, or sample. If the number of samples are low, aliasing happens. Triangles are interpolated, and pixels are sent to the next stage, which is the most important stage: Fragment shader.

Creme de la creme, it is the last programmable stage in the pipeline, and does operations such as coloring, texturing, shadowing, antialisiang, raytracing (if possible), etc.Perhaps the most important thing done in this stage is texturing.

If you are interested in learning more about fragment shaders and what you can do in them, you can read The Book of Shaders, a free internet book that, although outdated, teaches a lot of tricks in the trade.

11. Z-Buffer, Stencil Buffer, and Color Buffer

Z-Buffer is the depth buffer of the program, which is propagated by an FBO. An FBO can also be a stencil buffer, which as the name suggest, creates a black mask aroudn the screen. An FBO can also be a color buffer, which is simply the background color of the program.