Quick Response codes (or QR codes) are everywhere these days, from restaurant menus and product packaging to tombstones and tattoos. They are a type of 2D barcode invented in Japan in 1994 by Denso Wave, which holds the patents but has chosen not to enforce them for standardized codes, so anyone can generate and use them free of charge. But how do they work? How do they store information? And why do they scan so reliably?
This page is an interactive primer on QR codes. We'll explore the parts of a QR code. A lot of this gets pretty technical, but we'll keep it as high-level as possible and link to more in-depth resources for those who want to dive deeper.
To start, let's look at some QR codes. You can type the text you want to encode in the input box below, and the QR code will update in real time. The parts of the code are labeled, and we'll dive into each part in detail below.
Bits and Modules
When you scan a QR code, the scanner decodes the data into a stream of bits. The small squares in the QR code are called modules, and they can be either black (1) or white (0). QR scanners are pretty robust, so they don't need the module to be a fully black square to work. That's why some QR codes use circles or more complex shapes. The scanner can still read the data as long as the module is mostly black or white.
Some of the bits are logistically necessary for the QR code to work, like the format information and alignment patterns. The rest of the bits represent the actual data content you want to encode (there are also error correction bits which we'll get to later!). But fundamentally, the QR code is just a grid of modules that the scanner interprets as a stream of 1s and 0s using image processing techniques.
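As a rough illustration of "mostly black or white", here is a minimal sketch of how one module might be turned into a bit from sampled pixels. The function name `module_bit`, the grayscale input, and the threshold of 128 are assumptions made for this example; real scanners use more sophisticated image processing.

```python
def module_bit(pixels, threshold=128):
    """Classify one module from grayscale samples (0 = black, 255 = white).

    Returns 1 (dark) if more than half of the samples are darker than the
    threshold, otherwise 0 (light). Hypothetical helper for illustration.
    """
    dark = sum(1 for p in pixels if p < threshold)
    return 1 if dark * 2 > len(pixels) else 0


# A round dot that only covers part of its cell still reads as dark,
# as long as most of the sampled pixels are dark.
print(module_bit([0, 0, 0, 255]))        # mostly dark samples
print(module_bit([255, 255, 200, 10]))   # mostly light samples
```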
QR Version
As you may have noticed playing around above, QR codes come in different sizes. The size of a QR code is called its version. The version of a QR code determines how much data it can store.
Version 1 is a 21x21 grid, and every subsequent version adds 4 rows and 4 columns. So, version 2 is a 25x25 grid, version 3 is a 29x29 grid, and so on. The maximum version is 40, which is a 177x177 grid.
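The size rule above reduces to a one-line formula. A minimal sketch (the function name `grid_size` is just an illustrative choice):

```python
def grid_size(version):
    """Number of modules along one side of a QR code of the given version."""
    if not 1 <= version <= 40:
        raise ValueError("QR versions run from 1 to 40")
    # Version 1 is 21 modules wide; each later version adds 4.
    return 21 + 4 * (version - 1)


for version in (1, 2, 3, 40):
    side = grid_size(version)
    print(f"version {version}: {side}x{side}")
```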
The interactive below shows how much text you can encode in a QR code of each version. As you increase the version, you'll see that more of our sample text (the first 10 amendments to the US Constitution) is encoded in the QR code. The text is highlighted in the QR code, and the number of characters that can be encoded is shown below the QR code.
Finder Patterns
The finder patterns are the three large squares in the corners of the QR code. They help the scanner locate the QR code and work out its orientation: because only three of the four corners have one, the scanner can tell which way is up. Each finder pattern is the same 7x7 arrangement of modules, separated from the rest of the code by a 1-module-wide white border.
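To make the 7x7 shape concrete, here is a small sketch that builds it as a grid of 1s (black) and 0s (white). The function name is illustrative. The pattern is a black 3x3 center, a white ring, and a black outer ring, so any straight line through the center crosses black and white runs in a 1:1:3:1:1 ratio, which is what scanners typically search for.

```python
def finder_pattern():
    """Return the 7x7 finder pattern as rows of 1 (black) and 0 (white)."""
    center = 3
    pattern = []
    for row in range(7):
        line = []
        for col in range(7):
            # Ring index: 0 or 1 = center block, 2 = white ring, 3 = outer ring.
            ring = max(abs(row - center), abs(col - center))
            line.append(0 if ring == 2 else 1)
        pattern.append(line)
    return pattern


for line in finder_pattern():
    print("".join("#" if bit else "." for bit in line))
```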
In the example below, you'll notice that your phone will have no problem scanning the QR code even if it's rotated or flipped upside down.
It's worth noting that the QR code itself must be surrounded by a quiet zone, which is a blank margin around the QR code (the standard calls for at least 4 modules on every side). The quiet zone helps the scanner detect the finder patterns and decode the QR code.
Timing Pattern
The timing pattern is the line of alternating black and white modules that runs horizontally and vertically between the finder patterns. It helps the scanner figure out the size of each module (the small squares that make up the QR code) and where the rows and columns of the grid fall. A scanner can often estimate the grid from the finder patterns alone, but the timing pattern helps it decode the QR code more quickly and accurately.
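Here is a minimal sketch of the timing pattern's contents. It relies on two facts from the QR standard that are not spelled out above: the pattern occupies row 6 and column 6 (counting from 0), and it alternates starting with a black module. The function name is illustrative.

```python
def timing_modules(version):
    """Return (index, bit) pairs for the timing pattern of a given version.

    The same sequence is used along row 6 and along column 6. It fills the
    gap between the finder patterns, which (with their white borders)
    occupy the first 8 and last 8 positions of the row or column.
    """
    size = 17 + 4 * version
    return [(i, 1 if i % 2 == 0 else 0) for i in range(8, size - 8)]


# Version 1 has room for only five timing modules per direction.
print(timing_modules(1))
```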
Alignment Patterns
The alignment patterns are smaller squares throughout the pattern that help the scanner correct for distortion when the QR code is scanned at an angle, or if the QR code is on a curved surface.
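The number of alignment patterns increases as the QR version increases. Version 1 has none, and larger versions place them on a regular grid across the QR code, skipping the corners occupied by finder patterns. This table shows how many alignment patterns appear per version:
Version 1: 0
Versions 2 to 6: 1
Versions 7 to 13: 6
Versions 14 to 20: 13
Versions 21 to 27: 22
Versions 28 to 34: 33
Versions 35 to 40: 46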
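The counts follow a simple rule, sketched below. Alignment patterns sit on a square grid of candidate positions, and the three candidates that would collide with finder patterns are skipped. The function name is illustrative, and the shortcut `version // 7 + 2` for the number of grid positions per side is an assumption taken from common QR generators rather than from this page.

```python
def alignment_pattern_count(version):
    """Number of alignment patterns in a QR code of the given version."""
    if not 1 <= version <= 40:
        raise ValueError("QR versions run from 1 to 40")
    if version == 1:
        return 0  # version 1 is too small to need any
    positions_per_side = version // 7 + 2
    # A full grid, minus the three corners taken by finder patterns.
    return positions_per_side ** 2 - 3


for version in (1, 2, 7, 14, 21, 28, 35, 40):
    print(version, alignment_pattern_count(version))
```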
Format Information
The format information is a small amount of data that tells the scanner how the QR code is encoded. It includes two key pieces of information: the error correction level and the mask pattern. The error correction level tells the scanner how much of the QR code can be damaged and still be decoded. The mask pattern tells the scanner which of the 8 masks was applied to the data modules, so that it can undo the mask before reading them.
Both pieces of information are encoded using 15 bits: 2 for the error correction level, 3 for the mask pattern, and 10 for redundancy. The full 15-bit sequence is stored twice, in two different parts of the QR code, so that if one copy is damaged, the scanner can still read the other.
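For the curious, here is a sketch of how those 15 bits can be computed. It uses details from the QR standard that this page doesn't cover: the 10 redundancy bits are the remainder of a polynomial division by the generator `0x537`, the result is XORed with the fixed pattern `0x5412` so that it is never all zeros, and the four levels are numbered L = 01, M = 00, Q = 11, H = 10. The function and table names are illustrative.

```python
EC_LEVEL_BITS = {"L": 0b01, "M": 0b00, "Q": 0b11, "H": 0b10}
GENERATOR = 0x537    # degree-10 generator polynomial for the redundancy bits
FIXED_XOR = 0x5412   # fixed pattern applied to the finished 15 bits


def format_bits(ec_level, mask):
    """Return the 15-bit format information as an integer."""
    if not 0 <= mask <= 7:
        raise ValueError("mask must be between 0 and 7")
    data = (EC_LEVEL_BITS[ec_level] << 3) | mask   # 2 + 3 = 5 data bits
    remainder = data << 10                         # make room for 10 bits
    for shift in range(4, -1, -1):                 # binary long division
        if remainder & (1 << (shift + 10)):
            remainder ^= GENERATOR << shift
    return ((data << 10) | remainder) ^ FIXED_XOR


print(format(format_bits("L", 0), "015b"))
```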
Error Correction Level
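If we look at the bits labeled 0 and 1 in the format information example QR code above, we'll get the Error Correction Level of the QR code. This tells us roughly how much of the QR code can be damaged and still be decoded. The four levels are:
L (Low): about 7% of the code can be restored
M (Medium): about 15%
Q (Quartile): about 25%
H (High): about 30%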
In addition to giving us robustness against bad lighting or damage, using a high error correction level also lets us intentionally manipulate the QR code for marketing or aesthetics. You may have seen QR codes with company logos or designs in the middle. This is possible because the QR code can be damaged and still be decoded.
To illustrate this, here's a picture of my cat Dorito inside four QR codes that encode the same exact data, but use different error correction levels. You'll see that a higher error correction level will allow the picture to be larger, which is great because she is the best part of the QR code!
Mask Patterns
The bits labeled 2, 3, and 4 in the format information give us the mask pattern. A mask pattern is a way of scrambling the black and white modules in the QR code to make it easier for the scanner to read.
Before finalizing the QR code, the generation algorithm will try 8 different mask patterns and pick the one that results in the least ambiguous QR code. This is done with a heuristic that basically says "pick the QR code that has the least confusing patterns for the scanner." For example, if the QR code happened to generate something that looked like a finder pattern in the middle of the data, the scanner would have a hard time figuring out what was data and what was a finder pattern! The mask pattern helps avoid this.
Below, you’ll see the 8 different mask patterns, each with its own formula. You can think of a mask like a repeating stencil laid over the QR code’s data and error correction areas. Wherever the stencil indicates, the color of a module is flipped. This helps break up patterns that might confuse the scanner. Internally, this flipping uses an XOR operation, which allows the scanner to easily reverse the mask and recover the original data.
Mask 0: (x + y) % 2 === 0
Mask 1: y % 2 === 0
Mask 2: x % 3 === 0
Mask 3: (x + y) % 3 === 0
Mask 4: (floor(x/3) + floor(y/2)) % 2 === 0
Mask 5: (x*y % 2 + x*y % 3) === 0
Mask 6: ((x*y % 2 + x*y % 3) % 2) === 0
Mask 7: ((x + y) % 2 + x*y % 3) % 2 === 0
After we apply these stencils, all 8 candidates are evaluated to see which one has the fewest "bad" patterns based on an internal scoring system. For a more comprehensive overview of the mask patterns and the heuristics, here's a great resource from thonky.com.
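The eight formulas translate directly into code. The sketch below lists them in the same order as above, taking x as the column and y as the row, and shows why XOR makes a mask reversible: applying the same mask twice gives back the original grid. The names `MASKS`, `apply_mask`, and `is_data` are illustrative, and `is_data` stands in for the real rule that only data and error correction modules get masked.

```python
MASKS = [
    lambda x, y: (x + y) % 2 == 0,
    lambda x, y: y % 2 == 0,
    lambda x, y: x % 3 == 0,
    lambda x, y: (x + y) % 3 == 0,
    lambda x, y: (x // 3 + y // 2) % 2 == 0,
    lambda x, y: (x * y) % 2 + (x * y) % 3 == 0,
    lambda x, y: ((x * y) % 2 + (x * y) % 3) % 2 == 0,
    lambda x, y: ((x + y) % 2 + (x * y) % 3) % 2 == 0,
]


def apply_mask(grid, mask_index, is_data):
    """Return a copy of grid with the chosen mask XORed onto data modules.

    grid[y][x] is 0 or 1. is_data(x, y) reports whether that module holds
    data or error correction bits (function patterns are never masked).
    """
    mask = MASKS[mask_index]
    return [
        [bit ^ 1 if is_data(x, y) and mask(x, y) else bit
         for x, bit in enumerate(row)]
        for y, row in enumerate(grid)
    ]


# Toy example: a blank 4x4 grid where every module counts as data.
blank = [[0] * 4 for _ in range(4)]
masked = apply_mask(blank, 0, lambda x, y: True)
for row in masked:
    print(row)
# Applying the same mask a second time undoes it.
print(apply_mask(masked, 0, lambda x, y: True) == blank)
```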
Data Mode
The first 4 bits of the data stream indicate the mode of the QR code. The mode tells the scanner how to interpret the data that follows.
QR codes support four main types of data (numeric, alphanumeric, byte, and kanji), and each type has its own mode that determines how the data is encoded. These modes allow the QR code to be optimized for the type of data you want to store. Numeric data, for example, can be packed much more tightly than text or binary data.
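To see how much tighter the packing gets, here is a sketch that picks the most compact mode for a string and counts the payload bits. It relies on packing rules from the QR standard that aren't covered above: numeric mode stores 3 digits in 10 bits, alphanumeric mode stores 2 characters in 11 bits, and byte mode uses 8 bits per byte. The function names are illustrative, kanji mode is left out, and UTF-8 for byte mode is an assumption (a common choice in practice).

```python
DIGITS = "0123456789"
ALPHANUMERIC = DIGITS + "ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:"


def pick_mode(text):
    """Pick the most compact of the three modes handled by this sketch."""
    if all(ch in DIGITS for ch in text):
        return "numeric"
    if all(ch in ALPHANUMERIC for ch in text):
        return "alphanumeric"
    return "byte"


def payload_bits(text):
    """Bits needed for the characters alone (no mode or length header)."""
    mode = pick_mode(text)
    n = len(text)
    if mode == "numeric":
        # Groups of 3 digits take 10 bits; 1 or 2 leftovers take 4 or 7.
        return 10 * (n // 3) + (0, 4, 7)[n % 3]
    if mode == "alphanumeric":
        # Pairs take 11 bits; a single leftover character takes 6.
        return 11 * (n // 2) + 6 * (n % 2)
    return 8 * len(text.encode("utf-8"))


for sample in ("12345678", "HELLO WORLD", "hello"):
    print(sample, pick_mode(sample), payload_bits(sample))
```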
Data Encoding
After the mode bits (and a few more bits that tell the scanner how many characters are in the data), the QR code contains the actual data. This is where the magic happens!
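As a concrete example, here is a sketch of the start of a byte-mode data stream. It relies on two details from the QR standard: the mode indicator for byte mode is `0100`, and in versions 1 through 9 the character count takes 8 bits. The function name is illustrative, UTF-8 is an assumed text encoding, and the terminator and padding that follow in a real code are left out.

```python
def byte_mode_stream(text):
    """Mode indicator + character count + data bits, as a string of 0s and 1s.

    Sketch for byte mode in versions 1 to 9 only, where the count is 8 bits.
    """
    data = text.encode("utf-8")
    if len(data) > 255:
        raise ValueError("an 8-bit count can describe at most 255 bytes")
    mode_indicator = "0100"
    count = format(len(data), "08b")
    payload = "".join(format(byte, "08b") for byte in data)
    return mode_indicator + count + payload


print(byte_mode_stream("Hi"))
```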
The data bits are laid out in a zig-zag pattern: starting at the bottom-right corner, they snake up and down the QR code in columns two modules wide, skipping any modules reserved for the finder, timing, and alignment patterns or the format information.
The example below shows the zig-zag pattern used to encode and decode data in a QR code. In byte mode, each character is represented by a group of 8 bits, and depending on the direction of the zig-zag, the QR scanner knows how to read the data (this is a great diagram from Wikipedia). Note that the visual below is using an unmasked QR code for the sake of example, so scanning it won't work.
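The traversal order can be sketched in a few lines. This simplified version walks every module in two-module-wide columns from the bottom-right corner and skips only column 6, which holds the vertical timing pattern. A real encoder would also skip every other reserved module; that check is omitted here. The function name is illustrative.

```python
def zigzag_order(size):
    """Yield (row, col) positions in QR bit placement order.

    Simplified: only the vertical timing column (column 6) is skipped.
    """
    col = size - 1
    upward = True
    while col > 0:
        if col == 6:
            col -= 1   # step over the vertical timing pattern
        rows = range(size - 1, -1, -1) if upward else range(size)
        for row in rows:
            yield (row, col)       # right-hand module of the pair first
            yield (row, col - 1)   # then the left-hand module
        upward = not upward
        col -= 2


order = list(zigzag_order(21))
print(order[:6])
```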
Error Correction
Alongside the data in the QR code, there are also error correction bits. These bits let the scanner reconstruct the original data when parts of the QR code are damaged, dirty, covered up, or misread.
The mechanism used here, the Reed-Solomon error correction algorithm, is the same one used in CDs and DVDs. It works by adding extra bytes (called error correction codewords) to the data that can be used to reconstruct the original data if some of the bytes are damaged. The algorithm works by treating the data as a polynomial and using the properties of polynomials to recover the original data. Or, in simpler terms, it's like a jigsaw puzzle with extra pieces that can be used to fill in the gaps if some of the pieces are missing. A more detailed explanation of how this works is beyond what we'll cover here (unfortunately, there's no easy way to explain the math), but if you're interested, thonky.com once again has a fantastic resource on this. Veritasium also has an excellent video that dives into the math.
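For readers who prefer code to math, here is a compact sketch of the encoding half only (computing the extra bytes, not repairing damage). It assumes the arithmetic used by the QR standard: bytes are elements of the finite field GF(256) built with the polynomial 0x11D, and the extra bytes are the remainder after dividing the data polynomial by a generator polynomial. All function names are illustrative, and the sample bytes are arbitrary.

```python
# Lookup tables for multiplication in GF(256), polynomial 0x11D.
EXP = [0] * 512
LOG = [0] * 256
value = 1
for power in range(255):
    EXP[power] = value
    LOG[value] = power
    value <<= 1
    if value & 0x100:
        value ^= 0x11D
for power in range(255, 512):
    EXP[power] = EXP[power - 255]


def gf_mul(a, b):
    """Multiply two field elements (addition in this field is XOR)."""
    if a == 0 or b == 0:
        return 0
    return EXP[LOG[a] + LOG[b]]


def generator_poly(ec_count):
    """Coefficients, highest degree first, of (x - a^0)...(x - a^(n-1))."""
    poly = [1]
    for i in range(ec_count):
        shifted = poly + [0]                       # poly * x
        for j, coeff in enumerate(poly):
            shifted[j + 1] ^= gf_mul(coeff, EXP[i])  # plus poly * a^i
        poly = shifted
    return poly


def rs_ec_bytes(data, ec_count):
    """Error correction bytes: remainder of data * x^n / generator."""
    gen = generator_poly(ec_count)
    remainder = [0] * ec_count
    for byte in data:
        factor = byte ^ remainder[0]
        remainder = remainder[1:] + [0]
        for i in range(ec_count):
            remainder[i] ^= gf_mul(gen[i + 1], factor)
    return remainder


def poly_eval(coeffs, x):
    """Evaluate a polynomial (highest degree first) at x."""
    result = 0
    for coeff in coeffs:
        result = gf_mul(result, x) ^ coeff
    return result


data = [32, 91, 11, 120, 209, 114, 220, 77]
ec = rs_ec_bytes(data, 10)
# A valid codeword evaluates to zero at every root of the generator.
print(all(poly_eval(data + ec, EXP[i]) == 0 for i in range(10)))
```

The final check is also the idea behind decoding: if any of those evaluations is nonzero, the scanner knows the codeword was corrupted and can use the values to locate and repair the damage.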
Conclusion
QR codes are, in my opinion, a rare win in tech: free, robust, and universally accessible. They're a great example of how technology can be used to make our lives easier and more convenient.
If you like this type of content, you can follow me on Bluesky. If you want to support me further, buying me a coffee would be much appreciated. It helps keep the lights on and the servers running! ☕