This replaces the fixed-point subpixel precision logic.
GLQuake now effectively renders artifact-free. Previously white/gray
pixels would sometimes be visible at triangle edges, caused by slightly
misaligned triangle edges as a result of converting the vertex window
coordinates to `int`. These artifacts were reduced by the introduction
of subpixel precision, but not completely eliminated.
Some interesting changes in this commit:
* Applying the top-left rule for our counter-clockwise vertices is now
done with simpler conditions: every vertex that has a Y coordinate
lower than or equal to the previous vertex's Y coordinate is counted
as a top or left edge. A float epsilon is used to emulate a switch
between `> 0` and `>= 0` comparisons.
* Fog depth calculation into a `f32x4` is now done once per triangle
instead of once per fragment, and only if fog is enabled.
* The `one_over_area` value was previously calculated as `1.0f / area`,
where `area` was an `int`. This resulted in a lower-quality
reciprocal value; we now retain floating-point precision throughout.
The effect of this can be seen in Tux Racer, where the ice reflection
is noticeably smoother.
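
As a rough illustration of the top-left rule in the first bullet above, the sketch below picks a per-edge threshold of either 0 or a float epsilon and always compares with `>=`. The names `Vec2`, `edge_threshold` and `passes_edge_test` are hypothetical and not taken from the rasterizer; the vertex order and Y axis direction are assumed to match the description above.

```cpp
#include <limits>

struct Vec2 {
    float x;
    float y;
};

// Threshold for the edge ending at `current`: 0 for a top or left edge
// (Y lower than or equal to the previous vertex's Y), epsilon otherwise.
inline float edge_threshold(Vec2 previous, Vec2 current)
{
    return current.y <= previous.y ? 0.0f : std::numeric_limits<float>::epsilon();
}

// With a threshold of 0 this is `edge_value >= 0`. With a threshold of
// epsilon, an edge value of exactly 0 is rejected, which approximates
// `edge_value > 0`.
inline bool passes_edge_test(float edge_value, float threshold)
{
    return edge_value >= threshold;
}
```

A pixel lying exactly on a non-horizontal shared edge has an edge value of 0 in both triangles. The neighboring triangle traverses that edge in the opposite direction, so only one of the two gets the 0 threshold and accepts the pixel.
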
Problem:
- The statistics overlay period is hardcoded to 500 ms. This time is
very short and can result in the values being very "jumpy".
Solution:
- Increasing this value yields steadier readings, which is useful
when trying to evaluate the performance impact of a change. A new
config value is offered in `Config.h` to let the developer change
the period to any value desired.
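
A minimal sketch of what such a config value could look like. The namespace, the constant name `statistics_period_ms`, the helper and the value of 2000 ms are all assumptions for illustration; the actual entry in `Config.h` may be named and typed differently.

```cpp
// Hypothetical Config.h entry; name and value are illustrative only.
namespace Config {
constexpr int statistics_period_ms = 2000;
}

// Returns true once enough time has accumulated to refresh the overlay values.
inline bool statistics_period_elapsed(int accumulated_ms)
{
    return accumulated_ms >= Config::statistics_period_ms;
}
```
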
OpenGL mandates at least 2 texture units when multitexturing is
supported. This keeps our vertices lean and gives a nice speed
improvement in glquake. Until we support shaders this should be enough.
This is required to allow lighting to work properly in the GL. We
currently have the maximum number of lights in the software GL context
set to 8, as this is the minimum that OpenGL mandates according to the
spec.
With the RASTERIZER_BLOCK_SIZE gone we can now render to any size, even
odd ones. We have to be careful to not generate out-of-bounds accesses
when calculating the render target and depth buffer pointers. Thus we
check the coverage mask and generate nullptrs for pixels that will not
be updated. This also masks out pixels that would touch the triangle but
are outside the render target/scissor rect bounds.
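
The idea can be sketched as follows, assuming pixels are processed in 2x2 quads and the color buffer holds one 32-bit value per pixel. Both helpers (`mask_to_bounds`, `quad_pixel_pointers`) and the bit layout of the mask are hypothetical; the real code may organize this differently.

```cpp
#include <array>
#include <cstdint>

// Clears the coverage bits of quad pixels that fall outside the target.
// Bit i covers the pixel at (x + (i & 1), y + (i >> 1)).
inline unsigned mask_to_bounds(unsigned coverage_mask, int x, int y, int width, int height)
{
    for (int i = 0; i < 4; ++i) {
        int px = x + (i & 1);
        int py = y + (i >> 1);
        if (px < 0 || py < 0 || px >= width || py >= height)
            coverage_mask &= ~(1u << i);
    }
    return coverage_mask;
}

// Returns one pointer per quad pixel, or nullptr where the coverage bit is
// clear, so no address is ever computed for a pixel that will not be written.
inline std::array<std::uint32_t*, 4> quad_pixel_pointers(std::uint32_t* buffer, int width, int x, int y, unsigned coverage_mask)
{
    std::array<std::uint32_t*, 4> pointers {};
    for (int i = 0; i < 4; ++i) {
        if (!(coverage_mask & (1u << i)))
            continue;
        int px = x + (i & 1);
        int py = y + (i >> 1);
        pointers[i] = buffer + py * width + px;
    }
    return pointers;
}
```

For a 3x3 target, the quad starting at (2, 2) keeps only its first pixel; the other three pointers stay null.
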
This snaps vertices to 1/32 of a pixel before rasterization, resulting
in smoother movement and a less floaty appearance of moving triangles.
This also reduces the severity of the artifacts in the glquake port.
5 bits should allow up to 1024x1024 render targets. Anything larger
needs a different implementation.
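
A small sketch of snapping with 5 subpixel bits. The function names are hypothetical, and rounding to the nearest step is an assumption; the commit may truncate instead.

```cpp
#include <cmath>

constexpr int subpixel_bits = 5;
constexpr int subpixel_factor = 1 << subpixel_bits; // 32 steps per pixel

// Converts a window coordinate to integer 1/32-pixel units.
inline int to_subpixel(float coordinate)
{
    return static_cast<int>(std::lround(coordinate * subpixel_factor));
}

// Snaps a window coordinate to the nearest 1/32 of a pixel.
inline float snap_to_subpixel(float coordinate)
{
    return to_subpixel(coordinate) / static_cast<float>(subpixel_factor);
}
```

With this scheme a coordinate of 10.51 snaps to 10.5, since 10.51 * 32 rounds to 336 subpixel units.
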
This displays statistics regarding frame timings and number of pixels
rendered.
Timings are based on the time between draw_debug_overlay() invocations.
This measures the actual number of frames presented to the user against
wall clock time, so it also includes everything the app might do besides
rendering.
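
One way to derive frame timings from the gaps between overlay draws is sketched below. `FrameTimings` is a hypothetical helper; the caller is assumed to pass in the wall clock time elapsed since the previous draw_debug_overlay() call.

```cpp
struct FrameTimings {
    int frames = 0;
    double accumulated_ms = 0.0;

    // Call once per overlay draw with the time since the previous call.
    void record(double elapsed_ms)
    {
        ++frames;
        accumulated_ms += elapsed_ms;
    }

    double average_frame_ms() const
    {
        return frames > 0 ? accumulated_ms / frames : 0.0;
    }

    double frames_per_second() const
    {
        return accumulated_ms > 0.0 ? frames * 1000.0 / accumulated_ms : 0.0;
    }
};
```
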
Triangles are counted after clipping. Because clipping can split one
triangle into several, this number might actually be higher than the
number of triangles coming from LibGL.
Pixels are counted after the initial scissor and coverage test. Pixels
rejected here are not counted. Shaded pixels is the percentage of all
pixels that made it to the shading stage. Blended pixels is the
percentage of shaded pixels that were alpha blended to the color buffer.
Overdraw measures how many pixels were shaded vs. how many pixels the
render target has. For example, a 640x480 render target has 307200
pixels. If exactly that many pixels are shaded, the overdraw number will
read 0%. 614400 shaded pixels will read as an overdraw of 100%.
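
The percentages above can be sketched like this. The struct and function names are hypothetical, and "all pixels" is assumed to mean the pixels that passed the initial scissor and coverage test.

```cpp
#include <cstdint>

struct PixelCounters {
    std::uint64_t tested = 0;  // passed the initial scissor and coverage test
    std::uint64_t shaded = 0;  // made it to the shading stage
    std::uint64_t blended = 0; // alpha blended to the color buffer
};

inline double percentage(std::uint64_t part, std::uint64_t whole)
{
    return whole > 0 ? 100.0 * part / whole : 0.0;
}

inline double shaded_percent(PixelCounters const& counters)
{
    return percentage(counters.shaded, counters.tested);
}

inline double blended_percent(PixelCounters const& counters)
{
    return percentage(counters.blended, counters.shaded);
}

// 0% when exactly one render target's worth of pixels was shaded,
// 100% when twice as many were shaded.
inline double overdraw_percent(std::uint64_t shaded, int width, int height)
{
    double target_pixels = static_cast<double>(width) * height;
    return target_pixels > 0.0 ? (shaded / target_pixels - 1.0) * 100.0 : 0.0;
}
```

Under this formula the value goes negative when fewer pixels are shaded than the render target contains.
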
Sampler calls is simply the number of times sampler.sample_2d() was
called.