Implementing a Tiny CPU Rasterizer
115 points by atan2 6 days ago | 46 comments
  • Cieric 6 days ago |
    I love looking at stuff like this, working in the GPU space has only ever renewed my ambitions to work on similar projects. The hardest thing I always ran into with the more optimized fill algorithms was working around the single pixel holes that appear when doing everything with integers.

    Small nitpick though: there seems to be an issue with the source code views. The file names and the first line of the code are on the same line with no spacing. Looks like it might be a static site generation issue, since there aren't any nodes to separate the name and the code in the raw HTML.

    Edit: turns out the issue I'm seeing is somehow due to Firefox; it seems to work correctly in Edge.

    • jfk13 6 days ago |
      It looks like for some reason Firefox isn't being served the same source as Safari or Chrome. In those browsers, the filename is wrapped in <center>, <a>, and <tt> elements, and followed by <hr> <br>.

      But in the version of the HTML that Firefox receives, all that is missing and the filename is just some text that ends up immediately in front of the first line of code.

      • lisyarus 5 days ago |
        Yep, I've seen this problem when testing locally on Chrome, but I have no idea what causes it. Reloading the page usually works.
  • smokel 6 days ago |
    Wait, what? They rasterize a triangle by checking for each pixel if it intersects with the triangle? Where are the days when you learned Bresenham [1] or fixed-point arithmetic [2] to determine the extents of a scan line, and then fill it with an ordinary for-loop?

    [1] https://en.wikipedia.org/wiki/Bresenham%27s_line_algorithm

    [2] https://en.wikipedia.org/wiki/Fixed-point_arithmetic
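    For anyone who never saw it, the approach I mean looks roughly like this (a sketch in C++, flat-bottom case only, 16.16 fixed point, no particular fill convention):

    ```cpp
    #include <algorithm>
    #include <cstdint>
    #include <vector>

    // Flat-bottom triangle: apex (x0, y0), bottom edge from (x1, y1) to (x2, y1).
    // The two edge slopes are stepped in 16.16 fixed point, one add per scanline,
    // and each span is filled with an ordinary for-loop.
    void fill_flat_bottom(std::vector<uint32_t>& fb, int pitch,
                          int x0, int y0, int x1, int x2, int y1,
                          uint32_t color)
    {
        if (y1 <= y0) return;
        int32_t dxl = ((x1 - x0) << 16) / (y1 - y0);  // left edge step per line
        int32_t dxr = ((x2 - x0) << 16) / (y1 - y0);  // right edge step per line
        if (dxl > dxr) std::swap(dxl, dxr);
        int32_t xl = x0 << 16, xr = x0 << 16;         // span extents, 16.16
        for (int y = y0; y < y1; ++y) {
            for (int x = xl >> 16; x < (xr >> 16); ++x)  // fill the span
                fb[y * pitch + x] = color;
            xl += dxl;
            xr += dxr;
        }
    }
    ```

    The general case splits an arbitrary triangle into a flat-bottom half and a flat-top half first.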

    • vardump 6 days ago |
      Checking every pixel is generally faster on modern hardware due to SIMD! Of course you’ll need to clip properly.

      There are some pathological cases, like thin, long, almost-diagonal triangles. But those (rare) cases can be handled too with some subdivision clipping.

      • randomNumber7 5 days ago |
        I get that a GPU does it like that, but are you really sure this is the better algorithm for a CPU?
        • vardump 5 days ago |
          Yes, because you can handle 8 or 16 pixels in parallel with SIMD.

          You also have fewer branches, which helps a lot.

    • cv5005 6 days ago |
      The thing is that modern computer architecture strikes again - like the old linked list vs array thing.

      Doing a candidate loop over a bounding box with edge equations can be much faster than ye olde scanline algorithm because it lends itself more easily to SIMD and parallel approaches: you can divide things up into tiles, process multiple pixels at a time with wide instructions, and schedule tiles on multiple threads.
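      A minimal scalar sketch of that candidate loop in C++ (the SIMD version evaluates the three edge functions for 4/8/16 pixels at once, but the structure is the same):

      ```cpp
      #include <algorithm>
      #include <cstdint>
      #include <vector>

      struct P { int x, y; };

      // Edge function of edge a->b evaluated at c; also twice the signed
      // area of triangle (a, b, c).
      inline int64_t edge(P a, P b, P c) {
          return int64_t(b.x - a.x) * (c.y - a.y)
               - int64_t(b.y - a.y) * (c.x - a.x);
      }

      void rasterize(std::vector<uint32_t>& fb, int w, int h,
                     P v0, P v1, P v2, uint32_t color)
      {
          // Clipped bounding box of the triangle.
          int minx = std::max(0, std::min({v0.x, v1.x, v2.x}));
          int miny = std::max(0, std::min({v0.y, v1.y, v2.y}));
          int maxx = std::min(w - 1, std::max({v0.x, v1.x, v2.x}));
          int maxy = std::min(h - 1, std::max({v0.y, v1.y, v2.y}));
          for (int y = miny; y <= maxy; ++y)
              for (int x = minx; x <= maxx; ++x) {
                  P p{x, y};
                  // Inside if on the same side of all three edges.
                  int64_t w0 = edge(v1, v2, p);
                  int64_t w1 = edge(v2, v0, p);
                  int64_t w2 = edge(v0, v1, p);
                  if (w0 >= 0 && w1 >= 0 && w2 >= 0)
                      fb[y * w + x] = color;
              }
      }
      ```

      (Real rasterizers also apply a fill convention, e.g. top-left, so shared edges aren't drawn twice.)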

      • smokel 6 days ago |
        For those not in the know, things only get faster with this approach once you apply some optimizations. The article we are discussing won't get there until part 12.

        For the time being one may consult Ryg's amazing blog series: https://fgiesen.wordpress.com/2013/02/10/optimizing-the-basi...

    • ehaliewicz2 6 days ago |
      Those were the 90s. Modern rasterizers* all use barycentric-coordinate-based algorithms for a few reasons.

      They're easier to implement with proper fill conventions and multisampling, and much easier to parallelize in hardware and software.

      * Hardware even back in the 90s used this type of approach :)

      • thechao 6 days ago |
        We use barycentrics (both in high-performance software rasterizers and in hardware) because attribute interpolation is significantly more costly than in-bounds checking (both due to the total number of interpolants, and the precision required). The in-bounds check is going to be just a few instructions (a few fmas and some sign checking) for two dimensions (x, y), whereas attribute interpolation could be needed for 30–128 'dimensions'. It's easier to push the interpolation to the use site in the fragment shader and let the compiler do its magic.
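        To illustrate the connection (a sketch, not how any particular hardware does it): the same edge-function values computed for the coverage test, divided by the doubled triangle area, are exactly the barycentric weights used for interpolation:

        ```cpp
        struct V { float x, y; };

        // Edge function of a->b at c; edge(v0, v1, v2) is the doubled
        // signed area of the triangle.
        inline float edge(V a, V b, V c) {
            return (b.x - a.x) * (c.y - a.y) - (b.y - a.y) * (c.x - a.x);
        }

        // Interpolate a per-vertex attribute (here one float, e.g. a color
        // channel) at point p inside triangle (v0, v1, v2).
        float interpolate(V v0, V v1, V v2, V p,
                          float a0, float a1, float a2) {
            float area = edge(v0, v1, v2);      // doubled signed area
            float w0 = edge(v1, v2, p) / area;  // barycentric weight of v0
            float w1 = edge(v2, v0, p) / area;  // barycentric weight of v1
            float w2 = edge(v0, v1, p) / area;  // barycentric weight of v2
            return w0 * a0 + w1 * a1 + w2 * a2;
        }
        ```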
    • Lerc 6 days ago |
      There used to be a series of articles similar to this one that did things that way as well. As I recall, the author was making swiftshader, which I think was acquired by RAD Game Tools, then the articles got bitrot and eventually disappeared.
    • amelius 5 days ago |
      They probably do that because shading uses the same general approach (a float computation per pixel, with some shared terms).
    • randomNumber7 5 days ago |
      Someone in the comments here posted a version of that and was downvoted. Probably the author doesn't like competition.
  • magicalhippo 6 days ago |
    Nostalgia flashback. Anyone else implemented a renderer based on fatmap2.txt[1]?

    I came up with my own approach using Bresenham and storing spans, but it was slow and sucked.

    Then my buddy found fatmap2.txt on a BBS and gave it to me, as I didn't have a modem at the time. It was a revelation.

    Programming in Turbo Pascal I was hampered by it being 16bit, but discovered I could prepend the assembly opcodes with 66h to turn them into 32bit instructions.

    Later I tried converting the code to C, being only 300 lines and at that point well understood I figured it was a good task to get started with C. I was self-taught at programming but had read enough C to feel confident enough to try.

    Converted it all line by line to C before compiling... and got literally over 3000 errors and warnings, and none of them made any sense.

    Discouraged I left it, but a week later I was determined to get it working. After a few hours staring and bashing my head against the wall, I saw that the constant for the texture width which I had converted to a #define in C, had a semi-colon at the end...

    So I removed it, and voilà, it compiled flawlessly...

    Only later did I realize my C compiler, djgpp, was actually a C++ compiler and that's probably why I got so many seemingly weird errors.

    Anyway, good times...

    [1]: https://github.com/rcoscali/ftke/blob/master/ogles/doc/fatma...

    • ggambetta 6 days ago |
      I don't think I ever came across fatmap2.txt, but I do miss the good ol' times. I think my reference was 3DGPL, unfortunately can't find a live link to that.

      Took a quick look at fatmap2 and surprisingly it doesn't do perspective-correct texture mapping (although it points to some other document in the introduction).

      • magicalhippo 5 days ago |
        > I think my reference was 3DGPL, unfortunately can't find a live link to that.

        This one[1], code being here[2]?

        > surprisingly it doesn't do perspective-correct texture mapping

        Yeah it was covered in some other document, can't recall which. I have some memory of fatmap3, but I can't find any references to it, and any files I had were on my IBM Deathstar[3]...

        [1]: https://www.gamers.org/dEngine/rsc/

        [2]: https://www.gamers.org/dEngine/rsc/3dgpl/code/

        [3]: https://en.wikipedia.org/wiki/Deskstar#IBM_Deskstar_75GXP_fa...

      • Agingcoder 5 days ago |
        Perspective-correct texture mapping was too expensive for the hardware available at the time fatmap was written, I think?

        Edit: it was actually possible at the time.

        • magicalhippo 5 days ago |
          It was, but it was borderline. A typical trick was to do the division at the start and end of a span and interpolate linearly, rather than per pixel.

          This worked well enough if the triangle was seen fairly straight on and was not too large.

          Quake popularized the per-8 pixels technique, which is close enough in most cases.
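          Roughly like this, as a sketch (N = span length gives the per-span variant, N = 8 the Quake-style one):

          ```cpp
          #include <algorithm>
          #include <vector>

          // u/z and 1/z are linear in screen space, so step them per pixel
          // and do the perspective divide only every N pixels, lerping u
          // linearly in between.
          std::vector<float> perspective_span(float u0, float z0,
                                              float u1, float z1,
                                              int len, int N)
          {
              std::vector<float> out(len);
              float uoz = u0 / z0, iz = 1.0f / z0;  // u/z, 1/z at span start
              float duoz = (u1 / z1 - uoz) / len;   // per-pixel steps
              float diz = (1.0f / z1 - iz) / len;
              float u_left = uoz / iz;              // exact u at segment start
              for (int x = 0; x < len; x += N) {
                  int n = std::min(N, len - x);
                  float uoz_r = uoz + duoz * n;     // one divide per N pixels
                  float iz_r = iz + diz * n;
                  float u_right = uoz_r / iz_r;     // exact u at segment end
                  for (int i = 0; i < n; ++i)       // cheap lerp in between
                      out[x + i] = u_left + (u_right - u_left) * i / n;
                  uoz = uoz_r; iz = iz_r; u_left = u_right;
              }
              return out;
          }
          ```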

          • Agingcoder 5 days ago |
            Ah yes, indeed! I had forgotten about the Quake trick. Thanks!
    • Agingcoder 5 days ago |
      Yes I read fatmap. Writing software rasterizers was what got me into coding when I was a teenager. Quite honestly, it was a great way to learn: algorithms, hardware, assembly, software architecture, all had to be taken care of to get a usable 3D engine.
    • DeathArrow 5 days ago |
      I had tough times with graphics under DOS. Borland C had a limited graphics library called Turtle, so I used DOS mode 13h for fancy 320x200 display with 256 colors.

      Later, somebody gave me the source code for some graphics libraries together with some drivers and I could use 800x600 resolution with millions of colors.

      After that I went to Windows and Linux and graphics APIs quit being an issue.

    • scoopr 5 days ago |
      I was reading 3dica[0], though I was too young or stupid to really grok it back then.

      [0] https://www.modeemi.fi/drdoom/3dica/3dica.htm

    • gustavopezzi 5 days ago |
      Thank you for posting this. I cannot believe I did not know about it yet!
  • bob1029 6 days ago |
    I was super into this sort of thing until I hit triangle clipping concerns in homogeneous coordinate space.

    Filling the triangle and spinning the cube are the perfect level of difficulty if you want to get your hands a bit dirty and achieve that sense of accomplishment. Anything beyond this starts to feel like a very risky time-reward tradeoff.

    • lisyarus 5 days ago |
      I will cover triangle clipping in part 5, and it's much less scary than it seems to be!
  • erwincoumans 6 days ago |
    If you like this, there is also the 500 lines of C++ TinyRenderer, loading Obj files, texture mapping, clipping, and vertex/pixel shaders on CPU only: https://github.com/ssloy/tinyrenderer
    • gustavopezzi 5 days ago |
      I don't believe this repo implements clipping.
  • the5avage 5 days ago |
    I once wrote a tiny CPU rasterizer in D. It is more efficient than what is shown here, since it uses only integer arithmetic (based on Bresenham's algorithm); e.g. for a triangle it outputs the points in the triangle without ever considering a single pixel outside it.

    It's based on ranges (similar to the ones added to C++) and the concept works very well imo. Could be a good exercise to translate it to C++23 (using coroutines with the range "generator").

    https://github.com/the5avage/shapes

  • Joker_vD 5 days ago |
    By the way, does anyone know how to modify the standard perspective projection to have a spherical cap instead of a flat rectangle as its far clipping plane? Having the flat rectangle means that you can see further from the corner of your eye than straight ahead, which is definitely not how human vision works, and there are some games where this bizarre behaviour is very noticeable: you can barely see something poking out of the fog in front of you, then you turn the camera to the left, and now the thing is clearly visible near the screen border. Turn the camera back, and it goes back into the fog.
    • gmiller123456 5 days ago |
      A spherical cutoff is also not how the human eye sees. Light can travel from any distance to reach your eye. The view frustum is an optimization, and there will always be cases where it fails.
      • Joker_vD 5 days ago |
        > Light can travel from any distance to reach your eye.

        Yes, but it won't necessarily register, so unless we're talking about an insanely bright distant object that should be visible through fog from any distance, and which also is not part of the skybox, this particular problem practically never arises. The flat far plane, on the other hand, is glaringly obvious in e.g. Minecraft, or any Unity game with a first-person view and fog.

        • glowcoil 5 days ago |
          You're describing a problem with a particular method of fog rendering. The correct way to address that would be to change how fog is rendered. The perspective projection and the far plane are simply not the correct place to look for a solution to this.
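          For example (a sketch, not any engine's actual fog code): basing the fog factor on Euclidean distance from the eye rather than on view-space depth makes the falloff the same in every direction, which removes the turn-the-camera artifact described above:

          ```cpp
          #include <cmath>

          // Fog from Euclidean distance to the eye (x, y, z in view space);
          // 'density' is a made-up tuning parameter. Returns 1 when fully
          // clear, approaching 0 deep in the fog.
          float fog_factor(float x, float y, float z, float density)
          {
              float dist = std::sqrt(x * x + y * y + z * z);
              return std::exp(-density * dist);
          }
          ```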
          • Joker_vD 5 days ago |
            I disagree. This problem exists even when the fog is completely absent and also distorts the objects at the sides of the screen regardless of the fog's presence or absence. I guess you could use fog, rendered in a particular way, to make it less noticeable but it's still there. So the root cause is the perspective projection.

            Now, I've googled a bit on my own, trying all kinds of search phrases, and apparently it is a known problem that the perspective projection, when a wide FOV (about 75 degrees and up) is used, will distort objects at the sides of the screen. One of the solutions appears to be a post-processing pass called "Panini projection" which undoes that damage at the sides of the screen. From what I understand, it uses a cylinder (not a sphere) as the projection surface instead of a plane.

            • glowcoil 5 days ago |
              You originally described a problem where fog had a different falloff in world space at the edges of the screen compared to the center of the screen. The root cause of that is not the perspective projection; it's how the fog is being rendered.

              The issue you are describing now is called perspective distortion (https://en.wikipedia.org/wiki/Perspective_distortion), and it is something that also happens with physical cameras when using a wide-angle lens. There is no single correct answer for dealing with this: similarly to the situation with map projections, every projection is a compromise between different types of distortion.

              Anyway, if you're writing a ray tracer it's possible to use whatever projection you want, but if you're using the rasterizer in the GPU you're stuck with rectilinear projection and any alternate projection has to be approximated some other way (such as via post-processing, like you mention).

      • nkrisc 5 days ago |
        But it might more accurately reflect human visual perception. I can’t think of any case where your peripheral vision will perceive things that you won’t perceive by looking directly at them.
        • glowcoil 5 days ago |
          The only physically accurate answer for where to put the far plane is "behind everything you want to be visible". It fundamentally does not make any sense to change the shape of the far plane to "more accurately reflect human visual perception" because there is no far plane involved in human visual perception, period.
    • virexene 5 days ago |
      If you want that projection to still be expressible as a matrix transformation, I don't think that's possible.
    • xeonmc 4 days ago |
      1. you meant near clipping plane, not far

      2. clipping is a GPU fixed function done on x,y,z axes separately, so it will always be a planar cut

      3. the effect you're talking about can be entirely removed by setting the frustum's near plane to fit within your collision hull, e.g. for a spherical collider

          function near_plane_meters(collider_radius_meters, width_px, height_px, focal_length_px)
              local half_diag_sq = 0.25 * width_px * width_px
                                 + 0.25 * height_px * height_px
              return collider_radius_meters * focal_length_px
                  / sqrt(half_diag_sq + focal_length_px * focal_length_px)
          end
      • lisyarus 4 days ago |
        Joker_vD obviously talks about the far plane, not the near plane
  • saw-lau 5 days ago |
    This might also be of interest to anybody who enjoyed these articles: https://www.scratchapixel.com/.
    • greenbit 5 days ago |
      Despite seeing code early in the series, I didn't see any code that would actually put pixels on a screen, or even write a static image into a file, at least not within the first 6 or 7 units. Somewhere around that point there is some reference to OpenGL, so presumably you could start to brighten patches of screen at that point?

      Offering this as constructive criticism: the tutorials could be more engaging if the student could follow along and cause things to happen directly on the screen. It'd be awesome to organize the first unit as more than just confirming one has a compiler, and take it all the way to a kind of "hello, pixels".
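      A "hello, pixels" first unit could end on something as small as this (a sketch; it writes a binary PPM, which most image viewers can open, with no libraries beyond the standard one):

      ```cpp
      #include <fstream>

      // Write a w x h color gradient to a binary PPM file: a complete
      // "image on disk" in a dozen lines.
      void write_ppm(const char* path, int w, int h)
      {
          std::ofstream out(path, std::ios::binary);
          out << "P6\n" << w << " " << h << "\n255\n";  // PPM header
          for (int y = 0; y < h; ++y)
              for (int x = 0; x < w; ++x) {
                  unsigned char rgb[3] = {
                      (unsigned char)(x * 255 / (w - 1)),  // red ramps with x
                      (unsigned char)(y * 255 / (h - 1)),  // green with y
                      128 };                               // constant blue
                  out.write(reinterpret_cast<char*>(rgb), 3);
              }
      }
      ```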

  • hakilebara 5 days ago |
    I strongly recommend this course “3D Computer Graphics Programming”[1] from Gustavo Pezzi. It walks you through the creation of CPU rasterizer from scratch in C. I am working through it right now and I enjoy it a lot.

    [1] https://pikuma.com/courses/learn-3d-computer-graphics-progra...

    • atan2 5 days ago |
      Oh yes. One of the best!
    • gustavopezzi 3 days ago |
      Thank you for the mention. :)
  • toolslive 5 days ago |
    I went down this rabbit hole a few times as well. It's great fun.

    https://github.com/toolslive/NFJE

    You used to be able to run it in a browser until they kicked Java out.