Image Dithering: Eleven Algorithms and Source Code

Dithering: An Overview

Today’s graphics programming topic – dithering – is one I receive a lot of emails about, which some may find surprising. You might think that dithering is something programmers shouldn’t have to deal with in 2012. Doesn’t dithering belong in the annals of technology history, a relic of times when “16 million color displays” were something programmers and users could only dream of? In an age when cheap mobile phones operate in full 32bpp glory, why am I writing an article about dithering?

Actually, dithering is still a surprisingly applicable technique, not just for practical reasons (such as preparing a full-color image for output on a non-color printer), but for artistic reasons as well. Dithering also has applications in web design, where it is a useful technique for reducing images with high color counts to lower color counts, reducing file size (and bandwidth) without harming quality. It also has uses when reducing 48 or 64bpp RAW-format digital photos to 24bpp RGB for editing.

And these are just image dithering uses – dithering still has extremely crucial roles to play in audio, but I’m afraid I won’t be discussing audio dithering here. Just image dithering.

In this article, I’m going to focus on three things:

  • a basic discussion of how image dithering works
  • eleven specific two-dimensional dithering formulas, including famous ones like “Floyd-Steinberg”
  • how to write a general-purpose dithering engine

Update 11 June 2016: some of the sample images in this article have been updated to better reflect the various dithering algorithms. Thank you to commenters who noted problems with the previous images!

Dithering: Some Examples

Consider the following full-color image, a wallpaper of the famous “companion cube” from Portal:

This will be our demonstration image for this article.  I chose it because it has a nice mixture of soft gradients and hard edges.
This will be our demonstration image for this article. I chose it because it has a nice mixture of soft gradients and hard edges.

On a modern LCD or LED screen – be it your computer monitor, smartphone, or TV – this full-color image can be displayed without any problems. But consider an older PC, one that only supports a limited palette. If we attempt to display the image on such a PC, it might look something like this:

This is the same image as above, but restricted to a websafe palette.
This is the same image as above, but restricted to a websafe palette.

Pretty nasty, isn’t it? Consider an even more dramatic example, where we want to print the cube image on a black-and-white printer. Then we’re left with something like this:

At this point, the image is barely recognizable.
At this point, the image is barely recognizable.

Problems arise any time an image is displayed on a device that supports less colors than the image contains. Subtle gradients in the original image may be replaced with blobs of uniform color, and depending on the restrictions of the device, the original image may become unrecognizable.

Dithering is an attempt to solve this problem. Dithering works by approximating unavailable colors with available colors, by mixing and matching available colors in a way that mimicks unavailable ones. As an example, here is the cube image once again reduced to the colors of a theoretical old PC – only this time, dithering has been applied:

A big improvement over the non-dithered version!
A big improvement over the non-dithered version!

If you look closely, you can see that this image uses the same colors as its non-dithered counterpart – but those few colors are arranged in a way that makes it seem like many more colors are present.

As another example, here is a black-and-white version of the image with similar dithering applied:

The specific algorithm used on this image is "2-row Sierra" dithering.
The specific algorithm used on this image is “2-row Sierra” dithering.

Despite only black and white being used, we can still make out the shape of the cube, right down to the hearts on either side. Dithering is an extremely powerful technique, and it can be used in ANY situation where data has to be represented at a lower resolution than it was originally created for. This article will focus specifically on images, but the same techniques can be applied to any 2-dimensional data (or 1-dimensional data, which is even simpler!).

The Basic Concept Behind Dithering

Boiled down to its simplest form, dithering is fundamentally about error diffusion.

Error diffusion works as follows: let’s pretend to reduce a grayscale photograph to black and white, so we can print it on a printer that only supports pure black (ink) or pure white (no ink). The first pixel in the image is dark gray, with a value of 96 on a scale from 0 to 255, with zero being pure black and 255 being pure white.

Here is an example of the RGB values in the example.
Here is a visualization of the RGB values in our example.

When converting such a pixel to black or white, we use a simple formula – is the color value closer to 0 (black) or 255 (white)? 96 is closer to 0 than to 255, so we make the pixel black.

At this point, a standard approach would simply move to the next pixel and perform the same comparison. But a problem arises if we have a bunch of “96 gray” pixels – they all get turned to black, and we’re left with a huge chunk of empty black pixels, which doesn’t represent the original gray color very well at all.

Error diffusion takes a smarter approach to the problem. As you might have inferred, error diffusion works by “diffusing” – or spreading – the error of each calculation to neighboring pixels. If it finds a pixel of 96 gray, it too determines that 96 is closer to 0 than to 255 – and so it makes the pixel black. But then the algorithm makes note of the “error” in its conversion – specifically, that the gray pixel we have forced to black was actually 96 steps away from black.

When it moves to the next pixel, the error diffusion algorithm adds the error of the previous pixel to the current pixel. If the next pixel is also 96 gray, instead of simply forcing that to black as well, the algorithm adds the error of 96 from the previous pixel. This results in a value of 192, which is actually closer to 255 – and thus closer to white! So it makes this particular pixel white, and it again makes note of the error – in this case, the error is -63, because 192 is 63 less than 255, which is the value this pixel was forced to.

As the algorithm proceeds, the “diffused error” results in an alternating pattern of black and white pixels, which does a pretty good job of mimicking the “96 gray” of the section – much better just forcing the color to black over and over again. Typically, when we finish processing a line of the image, we discard the error value we’ve been tracking and start over again at an error of “0” with the next line of the image.

Here is an example of the cube image from above with this exact algorithm applied – specifically, each pixel is converted to black or white, the error of the conversion is noted, and it is passed to the next pixel on the right:

This is the simplest possible application of error diffusion dithering.
This is the simplest possible application of error diffusion dithering.

Unfortunately, error diffusion dithering has problems of its own. For better or worse, dithering always leads to a spotted or stippled appearance. This is an inevitable side-effect of working with a small number of available colors – those colors are going to be repeated over and over again, because there are only so many of them.

In the simple error diffusion example above, another problem is evident – if you have a block of very similar colors, and you only push the error to the right, all the “dots” end up in the same place! This leads to funny lines of dots, which is nearly as distracting as the original, non-dithered version.

The problem is that we’re only using a one-dimensional error diffusion. By only pushing the error in one direction (right), we don’t distribute it very well. Since an image has two dimensions – horizontal and vertical – why not push the error in multiple directions? This will spread it out more evenly, which in turn will avoid the funny “lines of speckles” seen in the error diffusion example above.

Two-Dimensional Error Diffusion Dithering

There are many ways to diffuse an error in two dimensions. For example, we can spread the error to one or more pixels on the right, one or more pixels on the left, one or more pixels up, and one or more pixels down.

For simplicity of computation, all standard dithering formulas push the error forward, never backward. If you loop through an image one pixel at a time, starting at the top-left and moving right, you never want to push errors backward (e.g. left and/or up). The reason for this is obvious – if you push the error backward, you have to revisit pixels you’ve already processed, which leads to more errors being pushed backward, and you end up with an infinite cycle of error diffusion.

So for standard loop behavior (starting at the top-left of the image and moving right), we only want to push pixels right and down.

Apologies for the crappy image - but I hope it helps illustrate the gist of proper error diffusion.
Apologies for the crappy image – but I hope it helps illustrate the gist of proper error diffusion.

As for how specifically to propagate the error, a great number of individuals smarter than I have tackled this problem head-on. Let me share their formulas with you.

(Note: these dithering formulas are available multiple places online, but the best, most comprehensive reference I have found is this one.)

Floyd-Steinberg Dithering

The first – and arguably most famous – 2D error diffusion formula was published by Robert Floyd and Louis Steinberg in 1976. It diffuses errors in the following pattern:

       X   7
   3   5   1


In the notation above, “X” refers to the current pixel. The fraction at the bottom represents the divisor for the error. Said another way, the Floyd-Steinberg formula could be written as:

           X    7/16
   3/16  5/16   1/16

But that notation is long and messy, so I’ll stick with the original.

To use our original example of converting a pixel of value “96” to 0 (black) or 255 (white), if we force the pixel to black, the resulting error is 96. We then propagate that error to the surrounding pixels by dividing 96 by 16 ( = 6), then multiplying it by the appropriate values, e.g.:

           X     +42
   +18    +30    +6

By spreading the error to multiple pixels, each with a different value, we minimize any distracting bands of speckles like the original error diffusion example. Here is the cube image with Floyd-Steinberg dithering applied:

Floyd-Steinberg dithering
Floyd-Steinberg dithering

Not bad, eh?

Floyd-Steinberg dithering is easily the most well-known error diffusion algorithm. It provides reasonably good quality, while only requiring a single forward array (a one-dimensional array the width of the image, which stores the error values pushed to the next row). Additionally, because its divisor is 16, bit-shifting can be used in place of division – making it quite fast, even on old hardware.

As for the 1/3/5/7 values used to distribute the error – those were chosen specifically because they create an even checkerboard pattern for perfectly gray images. Clever!

One warning regarding “Floyd-Steinberg” dithering – some software may use other, simpler dithering formulas and call them “Floyd-Steinberg”, hoping people won’t know the difference. This excellent dithering article describes one such “False Floyd-Steinberg” algorithm:

   X   3
   3   2


This simplification of the original Floyd-Steinberg algorithm not only produces markedly worse output – but it does so without any conceivable advantage in terms of speed (or memory, as a forward-array to store error values for the next line is still required).

But if you’re curious, here’s the cube image after a “False Floyd-Steinberg” application:

Much more speckling than the legit Floyd-Steinberg algorithm - so don't use this formula!
Much more speckling than the legit Floyd-Steinberg algorithm – so don’t use this formula!

Jarvis, Judice, and Ninke Dithering

In the same year that Floyd and Steinberg published their famous dithering algorithm, a lesser-known – but much more powerful – algorithm was also published. The Jarvis, Judice, and Ninke filter is significantly more complex than Floyd-Steinberg:

             X   7   5 
     3   5   7   5   3
     1   3   5   3   1


With this algorithm, the error is distributed to three times as many pixels as in Floyd-Steinberg, leading to much smoother – and more subtle – output. Unfortunately, the divisor of 48 is not a power of two, so bit-shifting can no longer be used – but only values of 1/48, 3/48, 5/48, and 7/48 are used, so these values can each be calculated but once, then propagated multiple times for a small speed gain.

Another downside of the JJN filter is that it pushes the error down not just one row, but two rows. This means we have to keep two forward arrays – one for the next row, and another for the row after that. This was a problem at the time the algorithm was first published, but on modern PCs or smartphones this extra requirement makes no difference. Frankly, you may be better off using a single error array the size of the image, rather than erasing the two single-row arrays over and over again.

Jarvis, Judice, Ninke dithering
Jarvis, Judice, Ninke dithering

Stucki Dithering

Five years after Jarvis, Judice, and Ninke published their dithering formula, Peter Stucki published an adjusted version of it, with slight changes made to improve processing time:

             X   8   4 
     2   4   8   4   2
     1   2   4   2   1


The divisor of 42 is still not a power of two, but all the error propagation values are – so once the error is divided by 42, bit-shifting can be used to derive the specific values to propagate.

For most images, there will be minimal difference between the output of Stucki and JJN algorithms, so Stucki is often used because of its slight speed increase.

Stucki dithering
Stucki dithering

Atkinson Dithering

During the mid-1980’s, dithering became increasingly popular as computer hardware advanced to support more powerful video drivers and displays. One of the best dithering algorithms from this era was developed by Bill Atkinson, a Apple employee who worked on everything from MacPaint (which he wrote from scratch for the original Macintosh) to HyperCard and QuickDraw.

Atkinson’s formula is a bit different from others in this list, because it only propagates a fraction of the error instead of the full amount. This technique is sometimes offered by modern graphics applications as a “reduced color bleed” option. By only propagating part of the error, speckling is reduced, but contiguous dark or bright sections of an image may become washed out.

         X   1   1 
     1   1   1


Atkinson dithering
Atkinson dithering

Burkes Dithering

Seven years after Stucki published his improvement to Jarvis, Judice, Ninke dithering, Daniel Burkes suggested a further improvement:

             X   8   4 
     2   4   8   4   2


Burkes’s suggestion was to drop the bottom row of Stucki’s matrix. Not only did this remove the need for two forward arrays, but it also resulted in a divisor that was once again a multiple of 2. This change meant that all math involved in the error calculation could be accomplished by simple bit-shifting, with only a minor hit to quality.

Burkes dithering
Burkes dithering

Sierra Dithering

The final three dithering algorithms come from Frankie Sierra, who published the following matrices in 1989 and 1990:

             X   5   3
     2   4   5   4   2
         2   3   2

             X   4   3
     1   2   3   2   1

         X   2
     1   1

These three filters are commonly referred to as “Sierra”, “Two-Row Sierra”, and “Sierra Lite”. Their output on the sample cube image is as follows:

Sierra (sometimes called Sierra-3)
Sierra (sometimes called Sierra-3)
Two-row Sierra
Two-row Sierra
Sierra Lite
Sierra Lite

Other dithering considerations

If you compare the images above to the dithering results of another program, you may find slight differences. This is to be expected. There are a surprising number of variables that can affect the precise output of a dithering algorithm, including:

  • Integer or floating point tracking of errors. Integer-only methods lose some resolution due to quantization errors.
  • Color bleed reduction. Some software reduces the error by a set value – maybe 50% or 75% – to reduce the amount of “bleed” to neighboring pixels.
  • The threshold cut-off for black or white. 127 or 128 are common, but on some images it may be helpful to use other values.
  • For color images, how luminance is calculated can make a big difference. I use the HSL luminance formula ( [max(R,G,B) + min(R,G,B)] / 2). Others use ([r+g+b] / 3) or one of the ITU formulas. YUV or CIELAB will offer even better results.
  • Gamma correction or other pre-processing modifications. It is often beneficial to normalize an image before converting it to black and white, and whichever technique you use for this will obviously affect the output.
  • Loop direction. I’ve discussed a standard “left-to-right, top-to-bottom” approach, but some clever dithering algorithms will follow a serpentine path, where left-to-right directionality is reversed each line. This can reduce spots of uniform speckling and give a more varied appearance, but it’s more complicated to implement.

For the demonstration images in this article, I have not performed any pre-processing to the original image. All color matching is done in the RGB space with a cut-off of 127 (values <= 127 are set to 0). Loop direction is standard left-to-right, top-to-bottom.

Which specific techniques you may want to use will vary according to your programming language, processing constraints, and desired output.

I count 9 algorithms, but you promised 11! Where are the other two?

So far I’ve focused purely on error-diffusion dithering, because it offers better results than static, non-diffusion dithering.

But for sake of completeness, here are demonstrations of two standard “ordered dither” techniques. Ordered dithering leads to far more speckling (and worse results) than error-diffusion dithering, but they require no forward arrays and are very fast to apply. For more information on ordered dithering, check out the relevant Wikipedia article.

Ordered dither using a 4x4 Bayer matrix
Ordered dither using a 4×4 Bayer matrix
Ordered dither using an 8x8 Bayer matrix
Ordered dither using an 8×8 Bayer matrix

With these, the article has now covered a total of 11 different dithering algorithms.

Writing your own general-purpose dithering algorithm

Earlier this year, I wrote a fully functional, general-purpose dithering engine for PhotoDemon (an open-source photo editor). Rather than post the entirety of the code here, let me refer you to the relevant page on GitHub. The black and white conversion engine starts at line 350. If you have any questions about the code – which covers all the algorithms described on this page – please let me know and I’ll post additional explanations.

That engine works by allowing you to specify any dithering matrix in advance, just like the ones on this page. Then you hand that matrix over to the dithering engine and it takes care of the rest.

The engine is designed around monochrome conversion, but it could easily be modified to work on color palettes as well. The biggest difference with a color palette is that you must track separate errors for red, green, and blue, rather than a single luminance error. Otherwise, all the math is identical.

Similar Posts:


This site - and its many free downloads - are 100% funded by donations. Please consider a small contribution to fund server costs and to help me support my family. Even $1.00 helps. Thank you!

45 thoughts on “Image Dithering: Eleven Algorithms and Source Code”

  1. Hi there. I found this article really helpful but I have noticed a curious thing:

    On nearly all of your dithers there is, as you can see, a large white section on the left side with no pixels set at all.

    Here is my own Floyd-Steinberg of your input image for comparison:

    I also tried Paint Shop Pro 7, and its Floyd-Steinberg, “non-weighted” produced very similar results to mine. Where is this large white area coming from? There are also more minor differences, with a lot more diagonal straight lines visible in your dithers.

    Is it something to do with your input image? Is it different to the jpg you show on the page? Or is it something in your implementation?

    Best wishes,


    1. Hi graspee,

      There are some interesting differences between my own implementation and others. I actually compared my results against GIMP and several other software implementations during the course of this article, and there are differences between them all.

      All the criteria under “other dithering considerations” are possible explanations for the differences. Here’s another interesting one I didn’t include in the article – different JPEG libraries can also lead to subtle differences. For example, image software that loads JPEGs using GDI+ will return a different unique color count than software that uses libjpeg. I don’t know if these small differences could explain the dithering differences above, but some combination of them all probably contributes.

      I also use a slightly modified system for calculating the error:

      'Get the source pixel color values
      r = ImageData(QuickVal + 2, y)
      g = ImageData(QuickVal + 1, y)
      b = ImageData(QuickVal, y)

      'Convert those to a luminance value and add the value of the error at this location
      l = getLuminance(r, g, b)
      newL = l + dErrors(x, y)

      'Check our modified luminance value against the threshold, and set new values accordingly
      If newL >= cThreshold Then
      errorVal = newL - 255
      ImageData(QuickVal + 2, y) = highR
      ImageData(QuickVal + 1, y) = highG
      ImageData(QuickVal, y) = highB
      errorVal = newL
      ImageData(QuickVal + 2, y) = lowR
      ImageData(QuickVal + 1, y) = lowG
      ImageData(QuickVal, y) = lowB
      End If

      I like this calculation because I find the output cleaner and clearer.

      If you download PhotoDemon (which contains my implementation) and set the threshold to 255, you’ll see results very similar to your own Floyd-Steinberg image.

      Finally, this image is particularly prone to bringing out differences in implementations, on account of the large patch of uniform pixels on the left. A more busy, “traditional” image would make any differences harder to pick out.

      1. Minutes after posting, I thought to check your companion cube image using the “weighted” option on paint shop pro and obtained results very similar to yours. Their help page states: ” Weighting the palette sets the image’s current colors closer to black and white before reducing them.”

        I am going to experiment now with changing how I calculate luminance, based on the methods outlined in your other article, to see if I can reproduce the same effects.

        One thing I’m concerned about, since I’m a beginner at these kinds of things, is that my implementation is a bit naive. I don’t have an array that I put the error amounts in- I just add the varying percentages of the error directly to the pixels in my path ahead. I.e.: (pseudocode, since the SDL-specific things get in the way)

        for each of the pixels we are spreading the error to:
        get the rgb of the pixel into r,g and b
        f is the factor for this pixel multiplied by e.g. 1/16
        new r=clamp( r + round (f * quant_error_r))
        same for g and b
        put the new pixel with new r,g and b back

        What I’m concerned about is that when pixels are white or nearly white, the error pushed on them is going to be lost in the overflow since i’m clamping the value between 0 and 255 obviously since I’m writing it back.

        Strangely I haven’t noticed any artifacts so perhaps it’s not as important as I thought.

        I’m very into dithering at the moment though so I have a lot of experimenting to do.

        Thanks again for your reply. I did actually have a quick look at your source code but not too closely because I got my own routine working from taking the wikipedia pseudocode for Floyd-Steinberg (which was all I had to go on before your page) and then making it so it was a generic error diffusion function, then adding in the matrices from your page.

      2. Good luck! Dithering is a neat topic, and I’m really glad you found this information useful.

        Take this for what it’s worth, but similar to you, my own testing showed that clamping the error to 0 and 255 didn’t make much of a difference. In fact, some texts recommend clamping because a patch of bright or dark pixels can lead to a blown-out error that affects more pixels than you’d like. (I haven’t found this to be the case, but perhaps it warrants consideration.)

        Palette weighting is a definite improvement over a stock implementation, and worth investigating further. I implement something similar to palette weighting in PhotoDemon, where I calculate a simple image histogram, then locate the midpoint and use it as the threshold. (In the software, the option is called “have PhotoDemon estimate the ideal threshold for this image”.) That technique results in an image that is roughly half-black and half-white, which brings out the maximum amount of detail. Kinda fun.

        Anyway, there are lots of fun ways to improve the output of the algorithm. Let me know how your project turns out.

  2. i am a designer and was curious how all the dithering options actually work, this article is really informative in explaining ‘what goes behind’. Thanks :)

    1. There is a different.more powerful, way of thinking about dithering which is what the blue noise article refers to. Consider the problem of 8 bit grey dithering to 1 bit.
      Instead of diffusing the error, imagine adding a sequence of uniformly distributed random numbers between 0 and 255, and the shifting the result by 8 bits. Some short thinking will show that this will always map 0 to 0, 255 to 1, and everything in between will map to either 0 or 1 the appropriate fraction of the time.

      The most trivial way to use this insight is to add an unchanging pattern of values (ie not random values), and this is a way to encode the ordered dithers of the last two algorithms in the article.
      Slightly less trivial is to use a sequence of INDEPENDENT random values, but this gives horrible looking results. The reason is that if you’re using independent random variables there will be a number of regions of the page where, just by accident, you get a clump of all large value or all small value random numbers, so that clump is too black or too white by visual standards.

      What you want is for large random values to be surrounded by small random values and vice versa, so you actually don’t want independent random values, you want later random values to be influenced by the earlier random values. The mathematical buzzword is you want a *stochastic process* (essentially a sequence of random values) with particular *spectral characteristics* (ie how closely correlated are random values near to and far from each other); and the particular typeof process you want is one called blue noise.

      Once you have a stream of random variables that matches the property of blue noise, you have an algorithm that’s actually a whole faster (and easier to vectorize) than error diffusion — you just add (with clipping) a vector of source data, a vector of random values, then shift right (or extract high bits or whatever). The results are beautiful, better than error diffusion.

      This same algorithm is easily adapted to any sort of quantization problem. This may be reducing existing quantization (eg 8 bit grey values to 1, 2 or 4 bit grey values) or in quantizing an analog source (eg analog audio to 16bit audio) where it is called noise shaping. The idea is always the same — you can’t get rid of the total amount of error, but you can shift it into frequencies where the body is less sensitive to it — higher frequencies for both the eye and the ear.

      The difficulty in all this is generating a stream of blue noise random data to add to the initial data. When I was on the QuickTime team in the 90s, we solved this by doing an initial (slow, off-line) calculation of a long stream of blue noise data (8K or 32K samples, something like that), which was stored as a data resource inside QuickTime. Then dithering simply walked that resource in a loop adding it to the image data being dithered.

      Color dithering brings in a whole different set of issues (obviously I was doing this while playing movies, so performance was critical). The rough idea was per channel dithering followed by an all-channel table lookup. The initial signal was YUV and I dithered that down to, I think, 4 bits each of Y, U, and V. That went into a 2^12 (pre-constructed) lookup table that did a lookup of the best matching of the standard Mac 216 always available 8 bit colors. This allowed essentially the YUV to RGB conversion and gamma correction to happen in one single lookup. The result was mesmerizing to watch.

      I did many different things to get MPEG video to play in real-time on those machines (initially 68030, then early PPCs) but of all the work I did, the most viscerally immediately appealing was the blitting code which
      – did vertical and horizontal scaling (because MPEG uses non-square pixels). If you don’t scale you get faces that are ellipses and the whole thing looks like 4:3 TV image stretched out to fill a 16:9 screen, ie looks like garbage. If you just drop columns of pixels then the result looks like you are looking through a screen. You have to do SOME scaling to look non-terrible, and I invented what’s basically the utterly most minimal thing you can get away with to look OK, but use very little of your (so limited in those days) CPU.
      – did the dithering to 8 or lower bit screens if necessary.

  3. Hello Tanner Helland:
    I am a (retired)graphic artist and I have a keen interest in the Floyd-Steinberg dither pattern and your modifications of it. They very much resemble the pattern that the “koala” scanner made back in the ’80s. Do you know of any scanning programs I can use to create this result? Koala has disappeared from graphics. I have used Photoshop’s “grain” pattern with great difficulty. I would appreciate hearing about any programs you know about.
    Steve Shapiro

    1. Hi Steve. I’m afraid I don’t know the “Koala Scanner” pattern you speak of. This article contains all the standard dithering algorithms I’ve seen implemented in other software. A quick check at Wikipedia shows some additional options, like simulated halftoning, but nothing that deviates much from the algorithms above.

      Do you have any sample images from that particular scanner? I might be able to track something down if I had a visual reference for the pattern…

  4. one optimization that should help with both 1D and 2D-dithering:
    Direction Alternating Lines
    odd-lines are performed left-to-right
    even-lines are performed right-to-left (with the distribution pattern flipped, of course)

    the main advantage is that it reduces the visible amount of ‘color-drifting’.

  5. I want to write an error diffusion algorithm into my own software, but I have a question that you didn’t cover in your article.

    When I compute shift the error to the next pixel, should it be NextPixelNewValue = NextPixelOldValue + Error

    and allow the new value to possibly go above 255, or go below 0, for the purpose of maintaining the exact pixel+error sum (and then clip the value to keep it within the 0 to 255 range just prior to displaying the image)?

    Or should I calculate the next pixel’s new value like this
    NextPixelNewValue = Clip(NextPixelOldValue + Error)

    where during the dither process, I always truncate the pixel+error sum to stay within the 0 to 255 value range?

    In the first way, the error is always maintained so it the earliest error will effect the calculation of the last pixel. In the second way, the error data will be lost if it pixel+error sum exits the 0 to 255 range, so that the error calculated from the first pixel will NOT be propagated all the way to the last pixel.

    Which of these 2 techniques is the correct way of doing error diffusion dithering?

    1. As I mentioned in the other dithering considerations section:

      Which specific techniques you may want to use will vary according to your programming language, processing constraints, and desired output.

      When it comes to diffusion dithering (and really, most graphics programming topics), it’s less about “correct vs incorrect” approaches and more about “what is your desired output?” Said another way, there’s not a definitely correct answer to your question. Try each method and see which one produces the best output for your task.

      In this dithering article, which I link above, that author presents his own opinion on these topics in his Special Considerations section. Maybe you’ll find his opinion helpful:

      It is critical with all of these algorithms that when error values are added to neighboring pixels, the resultant summed values must be truncated to fit within the limits of hardware. Otherwise, an area of very intense color may cause streaks into an adjacent area of less intense color.

      This truncation is known as “clipping,” and is analogous to the audio world’s concept of the same name. As in the case of an audio amplifier, clipping adds undesired noise to the data. Unlike the audio world, however, the visual clipping performed in error-diffusion halftoning is acceptable since it is not nearly so offensive as the color streaking that would occur otherwise. It is mainly for this reason that the larger filters work better — they split the errors up more finely and produce less clipping noise.

      With all of these filters, it is also important to ensure that the sum of the distributed error values is equal to the original error value. This is most easily accomplished by subtracting each fraction, as it is calculated, from the whole error value, and using the final remainder as the last fraction.

  6. im doing some ai work, and i think dithering will help simplify my input, its just i need that, but i wish i could reverse the dithering so i could maintain some data integrity. of course i did a 3×3 blur kernel on a 3×3 floyd stienburg, and i think that is what i will do at first, but I wonder is there a better way to reverse dithering – with some mathemagic?

    1. Hi Magnus. As far as I know, you’ve already discovered the best way to reconstruct dithered data. For performance reasons, you may find it easier to count the number of black pixels in each window (3×3 or some other size), and set the reconstructed value that way, but aside from that, I’m afraid there’s not much more you can do.

      If you know which dithering algorithm was used, I suppose you could construct a blur-like filter that weighted pixels using a reverse matrix of the one used for dithering. That might give you slightly better output, but it might also be a lot of work for minimal gain… no way to know for certain without trying. :/ Good luck either way!

      1. I feel its an important thing to persue, because I discovered ordered dithering is really awesome for temporal compression – because the dataset barely changes between results.
        At some level of density, im sure you could bring back some precision for a scalar value definitely. (like you actually blur the data into a bit…) A guassian blur is perfect for recollecting the values, and then I found an erosion/median filter brought back some of the lost edges. Im still working on it, as binary neural networks are a good opportunity I dont want to not explore enough.
        And I feel its important to bring up general database use and dithering, because its a good idea.

  7. I wrote a Floyd Steinberg implementation some years ago in Java and looking back it was one of the most satisfying little projects that I ever worked on. I even added an error diffusion of the alpha channel, which worked really well.

    One thing that has been on my mind since then is that the beauty and high performance of the algorithm itself is overshadowed by the clumsiness and slow performance of finding the nearest color in the palette. Are you aware of any fancy data structures or algorithms that could improve performance for finding the nearest color? Currently I have to iterate through all colors in the palette (up to 256), calculate the error (err = dR * dR + dG * dG + dB * dB;) and select the entry with the lowest error. Of course I can precalculate x*x for all values up to 256 and stop if err == 0, but that does not really do it.

    Oh, and thanks for the really great article.

    1. Thanks for the kind comment, Jan. Dithering the alpha channel is a great idea – and in fact, that very technique is used in a number of modern PNG compression tools (

      For matching colors to the palette, your best bet is to sort the palette in advance by some criteria, typically distance from RGB(0, 0, 0). Then you can compare pixels to the palette using a binary search algorithm, so at worst, you’ll be making log(n) error comparisons.

      1. Hi Tanner, thanks for the prompt reply.

        The idea for improving color matching is interesting, but I think you cannot implement a nearest neighbor search in RGB the way you described, because you really have to seach in three dimensions.

        I had a closer look at possible algorithms and found k-d trees, which seem to be an interesting data structure for nearest neighbor search in >2 dimensions ( But they seem to be a bit complicated, so I looked further.

        Then I found this article by Geoffrey Probert:

        What he does is quite simple:
        1. Add the original palette index to each entry as a property, so it doesn’t get lost when order changes
        2. Sort the palette entries by the red component.
        3. Create an array of length 256 (redLookupMap) and set at each index the the (new) palette index of the palette color that has the closest r distance to the redLookupMap index (if several colors have the same red distance, any of them will do).

        1. Use the redLookupMap to find a palette index that has an equal or very close red difference to the sought color.
        2. Calculate the Euklidian distance to the found palette color.
        3. Move left and right in the palette array and search for the palette entry with the lowest Euklidian distance to the sought color. The trick is that you can stop moving to one direction if the Euklidian distance of only the red component already exceeds the current min distance. This reduces the number of required steps quite a bit.

        I just implemented this algorithm for a Floyd Steinberg algorithm using a 256 color palette and voila, the whole procedure is now 3x faster!

  8. Merry Christmas Tanner!

    I love the number of algorithms you have here, by far the most I’ve found in one section of code.

    I’ve started converting the syntax to .Net but came across some negative arrays, and started thinking it may not be entirely trivial. (maybe a simple positive offset would work)…

    Before converting from VB to VB.Net, there’s over 100 syntax errors, and I skipped the GUI entirely – so some of those are going to be input values to the process.
    I was thinking of classerising sections to chop the code into separate files…

    I wondered if you ever implemented this in .Net before I carry on with the other 80 errors I have left?

    You don’t mind me working on it do you?

    Many thanks,


    1. Hi Sarah. Merry Christmas to you too! Thanks for the kind comment.

      Yes, you are of course welcome to adopt the code however you’d like. All my code is BSD licensed, so you’re welcome to use it in any application, commercial or otherwise.

      I apologize for the trouble converting the code to .Net. Sometimes I take for granted the benefits of being able to treat arrays as if they have negative dimensions. There are some performance trade-offs in letting the language handle this for you (because they have to remap the indices to base 0 behind-the-scenes), but I do find it conceptually helpful for tasks like this.

      At a glance, I don’t think there would be any trouble adding a simple positive offset to the code, so that it correctly maps into a 0-based array. I’ve also been tempted to modify my originally code by “classerising” it, as you suggest, which would not only make it easier to port into other languages, but allow me to use it in other contexts (like handling color palettes instead of just gray). Your comment has motivated me to revisit this, so I’ll add it to my to-do list and hopefully get to it in the coming weeks!

      In the meantime, I wish you the best of luck with a manual conversion. If additional problems arise, feel free to post them here, or contact me directly via this link. Also, if your work is publicly available, I’m sure others would love a link so they could pitch in and help, or at least explore the code for their own needs.

      Thanks again, and happy new year!

    2. If you know how it works, you dont have to bother converting code, to do ordered dithering.

      Just make an array, goto

      Put it in your array, then just wrap the array across your image, subtract mid grey, and then if its less than 0 its black, if its more than 0 its white.

      Ordered dithering has special properties over the more complex error diffused dithering. (which isnt much harder – but start with ordered.) That are interesting in compression.

  9. I don’t think these algorithms have been correctly implemented. The tell-tale sign is that there are large white areas in the middle of most of the dithering results. If you look at the source image, they contain clearly fairly gray areas there, which suggests that some kind of black-white pattern should be happening there as well. Some kind of runaway truncation is happening and consuming the error signal.

    Good page otherwise. :-/

    1. Hi Antti. You’re absolutely right – at some point, these images were wrongly uploaded from a set of “reduced color bleed” samples.

      I’ve uploaded correct images and will shortly note the change in the article. Thanks for bringing this to my attention (and apologies for the trouble).

  10. I remember implementing a variation on Floyd-Steinberg over 15 years ago with some long-forgotten techniques for reducing artifacts & patterns. It was for a set of Mac printer drivers & paid for by a client OEM, but being the last days of that platform and having no support multi-level dots which were quickly becoming the state of the art, it ended up only shipping with that company’s few models and wasn’t ported anywhere. Between the custom halftoning and integrated color correction inspired by the color profiles we were making for our Postscript RIP product, we managed to get output that often rivaled the RIP. It was a lot of fun, and I need to dig up that sourcecode.

  11. Some people experience headaches, etc from monitor that use dithering. For that reason they prefer True 8-bit color monitors. Can someone speak about this issue?

  12. Hello,
    Almost 5 years passed and this post still is informative thought provoking.

    One question: how 5th image in this post was produced, the one with reduced color palette (with caption “A big improvement over the non-dithered version!”)? Was color quantization applied before/after dithering? Simultaneously applying tracking of “separate errors for red, green, and blue, rather than a single luminance error”? Other algorithm?

  13. Great article, best I’ve come across by far, but I am still confused and wondering if you can help me.

    I understand that the algorithms you have presented work to convert an image to a lesser number of colours. However I see these algorithms are also being used for a different purpose: to prevent banding/posterisation artefacts when modifying video LUTs, such as with pixel shaders, monitor calibration, or modifying curves in Photoshop, and I’m having trouble understanding how that works exactly. So far I have come across two different methods that seem to work, although the second one seems to work better, and I don’t know why.

    Method 1. First apply randomised noise to the image (either error diffusion or ordered dithering seem to work – although I’m not sure how, since there is no “target bit depth” parameter being provided), then afterwards, perform the video LUT transformations.

    My understanding as to why this method works: suppose the colour being displayed is a full raster of middle grey (128). After applying the randomised noise, the average (mean) value of all pixels is still 128, however a histogram analysis of the pixels shows a bell-shaped distribution of values , with decreasing frequency either side of 128. Therefore in a video LUT transformation, if value 128 was to be lost due to truncation as a result of the LUT transformation, the effect it has on the average (mean) brightness is minimised since there are still plenty of 127’s and 129’s in the mix, and is spatially averaged over neighbouring values which hides the error. In testing this method only seems to work on monitors which are natively 8-bit and produces too much visible noise on 6-bit/FRC monitors.

    Method 2. First apply the video LUT transformation, then afterwards apply the dithering algorithm (again, not sure how this works as the algorithms seem to require a “target bit depth” parameter which is not applicable here).

    This is the method I can’t seem to figure out, and I suspect is the method AMD are using when the user modifies the video card LUT, as it seems to work even on 6-bit monitors and doesn’t seem to require temporal randomisation of the noise between frames either. It is also used by CeeJay (author of SweetFX, open source) in his Dither.h shader. It seems to me in theory it shouldn’t work, as once the image already contains posterisation artefacts, adding noise won’t change the average (mean) values of any colour. For example if on a gray ramp there are 3 bands of gray shade “128” in a row, these will remain at a mean brightness of 128, and will still stand out visually as a posterisation artefact next to its neighbouring values 127 and 129, which also do not have their average (mean) values modified as a result of applying the dithering noise.

    I have tried duplicating CeeJay’s Dither.h code to apply it before and after the video LUT shader in the Pipeline.cfg (ordering of shader files) and get confusing results: in CeeJay’s code comments (Shared.h) he writes “// Dither (should go near the end as it only dithers what went before it)” , however if I follow that rule by applying it after the CustomFX LUT shader, posterisation is still present; only if I apply it before the CustomFX LUT shader (and increase one of its strength multiplier parameters as well) does it eliminate posterisation. To make things even more confusing, the reverse ordering per CeeJay’s comment does in fact work when applying it *after* one of CeeJay’s tone mapping modification shaders.

    1. In relation to my comment above, I believe I understand now why CeeJay’s Dither.h shader works with when applied after colour transformations within his own effect library, but not after the 1D texture LUT shader used for monitor calibration.

      Basically, CeeJay’s prior colour transformations (such as S-curves, vignette and such) are calculated at high precision values – 3 or 4 decimal place floating point precision. His Dither.h shader then uses these high precision values to dither “towards” the lower precision 255-shade (8-bit) output by adding noise as required, which is in line with how the original article describes the process. Whereas the LUT shader used for monitor calibration (1D texture LUT) already starts out as low precision values based on an 8-bit texture (they are still handled as 3 or 4 decimal place floating point numbers internally, but they are effectively truncated to the nearest 1/255th, or decimal 0.0039), so there are no intermediate values to dither it “towards”. If we had instead used say a 16-bit 1D texture LUT then we could start out with high precision values that aren’t truncated to nearest 0.0039, and CeeJay’s dither.h shader should work well.

      As to how AMD is doing it, I believe they are doing a similar thing to CeeJay – starting with the 16-bit precision values which reside in the video card gamma table (contrary to some claims which say it’s 10-bit – the Set/GetDeviceGammaRamp functions accept a 16-bit parameter), and then applying pixel shader dithering to produce the final 8-bit output. Ordered dithering and error diffusion seem to be quite cheap processing wise – in reshade I’m seeing only a 1% GPU overhead.

      In testing I note CeeJay’s/AMD’s method is produces much less visible noise since it only adds noise as required, whereas my aforementioned “method1” where noise is applied to the whole image first, results in more noise since all pixels receive noise regardless of how “far away” their initial high precision values are from the final lower precision output, and I’m guessing all this extra noise is why it doesn’t seem to work so well on 6-bit monitors which already employ their own dithering.

  14. Forgot to mention, CeeJay is also applying the final dithered pixel value by adding it to the green subpixel and subtracting it from the red and blue subpixels. The result is coloured noise which in testing (using frafstestpattern) makes the noise significantly less visible (and helps make “method1” more viable). I am not sure why it works, and I am not sure how it retains the colour balance either since green is not equivalent to magenta in terms of brightness. Here is the shader code (it’s open source so I’m guessing should be ok to post here).

    float3 color = colorInput.rgb;
    float dither_bit = 8.0; //Number of bits per channel. Should be 8 for most monitors.

    //Pseudo Random Number Generator
    // — PRNG 1 – Reference —
    float seed = dot(tex, float2(12.9898,78.233)); //I could add more salt here if I wanted to
    float sine = sin(seed); //cos also works well. Sincos too if you want 2D noise.
    float noise = frac(sine * 43758.5453 + tex.x); //tex.x is just some additional salt – it can be taken out.

    //Calculate how big the shift should be
    float dither_shift = (1.0 / (pow(2,dither_bit) – 1.0)); // Using noise to determine shift. Will be 1/255 if set to 8-bit.
    float dither_shift_half = (dither_shift * 0.5); // The noise should vary between +- 0.5
    dither_shift = dither_shift * noise – dither_shift_half; // MAD

    //shift the color by dither_shift
    color.rgb += float3(-dither_shift, dither_shift, -dither_shift); //subpixel dithering

  15. On further analysis of the above source code, it seems CeeJay isn’t actually using any of the methods described in your article. I have re-created each line of the code as a column in a spreadsheet to see what the values would turn out to be, and it seems all the algorithm does is add a random amount of brightness to each pixel. Whether these pixels happen to get truncated/snapped to the nearest 1/255th at the final 8-bit output stage would seem to be based on randomness, not on any error propagation matrix. However he does provide an ordered dithering method as well, however in practice I find the previous randomised dithering to be more effective, i.e less visible noise.

    #if dither_method == 1 // Ordered dithering
    //Calculate grid position
    float grid_position = frac( dot(tex,(RFX_ScreenSize * float2(1.0/16.0,10.0/36.0))) + 0.25 );

    //Calculate how big the shift should be
    float dither_shift = (0.25) * (1.0 / (pow(2,dither_bit) – 1.0));

    //Shift the individual colors differently, thus making it even harder to see the dithering pattern
    float3 dither_shift_RGB = float3(dither_shift, -dither_shift, dither_shift); //subpixel dithering

    //modify shift acording to grid position.
    dither_shift_RGB = lerp(2.0 * dither_shift_RGB, -2.0 * dither_shift_RGB, grid_position); //shift acording to grid position.

    //shift the color by dither_shift
    color.rgb += dither_shift_RGB;

  16. I think this may be a key line of code I overlooked:

    float dither_shift_half = (dither_shift * 0.5); // The noise should vary between +- 0.5

    By multiplying 1/255 by 0.5, it places that value at exactly the midpoint between two 8-bit shades, where the slightest deviation from its value will cause it to be rounded either up or down to the nearest 255th at 8-bit output stage. So if any slight amount of noise were to be added , it would cause that pixel to increase by 1/255th, else decrease by 1/255th, and so you get a randomised pattern of noise.

  17. Great article, but keep in mind mathematical quality from transformation algorithms (as in, deviance from the original) does not translate to the perceived quality of the final image or the algorithm itself. Error diffusion algorithms produce distinctive movement patterns in the resulting image (those background ‘speckle starfields’) that are extremely recognizable and present visual information that is not present in the original image. Pattern dithering, instead, produces a grid-like presentation which, while no less recognizable, is easier for the brain to interpret as a pattern and such just a simulation of the desired color with no additional information.

    I’ve seen a lot of artists prefer pattern dithering over error diffusion for many situations. The affirmation that it always produces worse results must be taken with a grain of salt.

  18. I apologize if I’m just a huge bonehead and I’m missing something, but… where is the source code for these algorithms? The blog entry is titled “Eleven Algorithms and Source Code,” but I don’t see any source code. Is it there somewhere, maybe cleverly hidden behind some link, or what?

  19. Recently, I programmed an implementation of Floyd-Steinberg dithering in C++, which works fine in most cases… but under certain conditions (which excactly I yet have to find out) strange artifacts show up: blobs of solid black or white arranged in lines roughly descending towards the lower right corner of the image. These blobs appear mostly in the lower half of the image and seem to be associated with certain original colors. For better understanding, two example images and their corresponding originals:

    Black blobs example:
    was originally:

    White blobs example:
    was originally:

    The implementation of Floyd-Steinberg in C++:

    void floydsteinberg(vector<vector > &img, vector &pal)
    unsigned short h = img.size();
    unsigned short w = img[0].size();
    unsigned short r, c, i;
    unsigned short p = pal.size();
    rgb t0, t1, t2, t3, t4,;
    rgb triple, dist;
    float newred, newgreen, newblue;

    for (r=0; r<h; r++)
    for (c=0; c<w; c++)
    t0 = {-1, -1, -1}; // = triple
    t1 = {-1, -1, -1};
    t2 = {-1, -1, -1};
    t3 = {-1, -1, -1};
    t4 = {-1, -1, -1}; = img[r].at(c).get_red(); = img[r].at(c).get_green(); = img[r].at(c).get_blue();
    if (c 0 && r < h-1)
    { = img[r+1].at(c-1).get_red(); = img[r+1].at(c-1).get_green(); = img[r+1].at(c-1).get_blue();
    if (r < h-1 )
    { = img[r+1].at(c).get_red(); = img[r+1].at(c).get_green(); = img[r+1].at(c).get_blue();
    if (c < w-1 && r < h-1 )
    { = img[r+1].at(c+1).get_red(); = img[r+1].at(c+1).get_green(); = img[r+1].at(c+1).get_blue();
    for (i=0; i<p; i++)
    if (coldist(t0, -1)
    newred = +*0.4375;
    newgreen = +*0.4375;
    newblue = +*0.4375;
    img[r].at(c+1).set_all(mround(newred), mround(newgreen), mround(newblue));
    if ( > -1)
    newred = +*0.1875;
    newgreen = +*0.1875;
    newblue = +*0.1875;
    img[r+1].at(c-1).set_all(mround(newred), mround(newgreen), mround(newblue));
    if ( > -1)
    newred = +*0.3125;
    newgreen = +*0.3125;
    newblue = +*0.3125;
    img[r+1].at(c).set_all(mround(newred), mround(newgreen), mround(newblue));
    if ( > -1)
    newred = +*0.0625;
    newgreen = +*0.0625;
    newblue = +*0.0625;
    img[r+1].at(c+1).set_all(mround(newred), mround(newgreen), mround(newblue));

    double coldist(rgb c1, rgb c2)
    double reddpow, greendpow, bluedpow, powsum;
    reddpow = pow((, 2);
    greendpow = pow((, 2);
    bluedpow = pow((, 2);
    powsum = reddpow + greendpow + bluedpow;

    return sqrt(powsum);

Leave a Reply to Douglas Zwick Cancel reply