Cyclic Redundancy Check is a commonly used method for performing integrity checks on data. It is a simple hash function that provides assurance that data is not corrupted and is widely used in data communication and data storage applications.
CRC is not suitable for protecting against intentional modification of data. It is a cryptographically weak algorithm and can easily be reversed. This article describes a simple algorithm to generate a four byte data sequence that can be added to a message to make the resulting CRC any value you want.
CRC algorithm
The mathematics behind CRC is described in many articles on the internet and I will not go into any details about the theory. The algorithm to calculate a ‘reverse CRC’ described here is based on the 32-bit polynomial, CRC-32-IEEE, most commonly used by standards bodies, but can easily be adapted to other CRC types.
The algorithm described in this article uses a table driven CRC implementation. A CRC lookup table contains 256 pre-calculated values, derived from the CRC polynomial. When the CRC of a message is calculated, each byte in the message modifies a 32 bit hash:
crc = (crc >> 8) ^ crc32Table_[*ptr++ ^ (crc & 0xff)];
One small thing to note is that the final CRC hash is the inverse of the calculated hash. A typical method to calculate CRC of a message or to append more data to a CRC would look something like:
UInt32 append(const void* buffer, UInt32 size, UInt32 oldCRC)
{
const UInt8* ptr = (const UInt8*)buffer;
UInt32 crc = ~oldCrc;
while (size--) {
crc = (crc >> 8) ^ crc32Table_[*ptr++ ^ (crc & 0xff)];
}
UInt32 newCrc = ~crc;
return newCrc;
}
Reversing CRC
One of the properties of CRC is that you can find and append a four byte sequence to any message and get any desired resulting CRC hash. This means that a message can be patched with a four byte sequence in order to get any hash you want.
Let us define our desired CRC as AABBCCDD where AA is the most significant byte and DD is the least significant byte. For each byte in the message, the CRC algorithm shifts the hash one byte to the right and xor it with a value from the lookup table. This means that in each step a new byte is added to the most significant byte in the hash, and in consecutive calculations the byte is shifted to less significant positions. This behavior can be used to quite easily create a sequence of bytes that will result in our desired CRC.
The table below shows how the CRC algorithm transforms the hash into the desired CRC. The xx values are reminders of the CRC of the message preceding the patch bytes.
CRC after 1st patch byte = DD3 xx xx xx
CRC after 2nd patch byte = CC2 DD2 xx xx
CRC after 3rd patch byte = BB1 CC1 DD1 xx
CRC after 4th patch byte = AA BB CC DD
Getting more into concrete details of the algorithm, we can easily see that the byte AA is the high byte of a CRC table lookup value xored with 0. The CRC table is called T in the following equations.
AA = ( T[ i3 ] ^ 0 ) >> 24
What we need to achieve is to find the appropriate index i3 that makes this equation true. This can be rewritten as a Boolean expression
( AABBCCDD ^ T[ i3 ]) >> 24 == 0, where 0 <= i3 < 256
So the challenge is to find the index i3 that satisfies the expression and this shouldn’t be a hard challenge for anyone with basic skills in programming. Although it is not really needed to spell out the whole hash, AABBCCDD, in the expression above, it allows us to simplify successive calculations.
Moving forward to the second highest byte, we can unroll the CRC algorithm to get
BB = ( ( ( BB1 ^ 0 ) >> 8 ) ^ T[ i3 ] ) >> 16
Where BB1 is calculated the same way as AA, i.e.
BB1 = ( T[ i2 ] ^ 0 ) >> 24
Now we can create a Boolean expression for calculating the table index for the 3rd patch byte. By putting the equations together and rearrange the shifts a bit, we can see that the index of the table lookup for calculating BB1 needs to satisfy the Boolean expression
(((AABBCCDD ^ T[ i3 ]) << 8) ^ T[ i2 ]) >> 24 == 0, where 0 <= i2 , i3 < 256
Note that the value BB is fetched from the full desired CRC value and shifted up to become most significant byte, just like AA was the most significant byte in the expression to calculate i3.
Similarly the table indexes for calculating the first and second byte and we four nice equations to solve:
(AABBCCDD ^ T[ i3 ]) >> 24 == 0
(((AABBCCDD ^ T[ i3 ]) << 8) ^ T[ i2 ]) >> 24 == 0
(((((AABBCCDD ^ T[ i3 ]) << 8) ^ T[ i2 ]) << 8) ^ T[ i1 ]) >> 24 == 0
(((((((AABBCCDD ^ T[ i3 ] ) << 8) ^ T[ i2 ]) << 8) ^ T[ i1 ]) << 8) ^ T[ i0 ]) >> 24 == 0
So by finding the indexes of the table entries, it is possible to end up with the desired CRC. Solving these equations are easy, just do a reverse lookup on the CRC table. As you see, the first equation gets the table index for the last patch byte, and the value can be used to calculate the earlier table indexes.
The code below shows the algorithm in practice. The calculated indexes are stored in the tableIdx array. Note that the desired CRC is inverted before starting the algorithm. The reason is that the CRC algorithm operates with inverted values as described in the beginning of the article.
int tableIdx[4];
UInt32 iterCrc = ~desiredCrc;
for (int j = 3; j >= 0; j--) {
for (int i = 0; i < 256; i++) {
if (((iterCrc ^ crc32Table_[i]) >> 24) == 0) {
tableIdx[j] = i;
iterCrc = (iterCrc ^ crc32Table_[i]) << 8;
break;
}
}
}
If performance is a concernt, the inner for loop can easily be converted into a table lookup. The code would then look like
int tableIdx[4];
UInt32 iterCrc = ~desiredCrc;
for (int j = 3; j >= 0; j--) {
tableIdx[j] = crc32RevTable_[iterCrc];
iterCrc = (iterCrc ^ crc32Table_[i]) << 8;
}
So far we haven’t looked at either the CRC of the message we are patching, nor the actual values of the patch bytes. None of these values were needed to find the table indexes. The next step in the algorithm is to find the values that generate the indexes needed to produce the final desired CRC. This is done by starting the normal CRC algorithm from the end of the original message and appending the four patch bytes. In each step, we need to calculate the actual patch byte before calculating the CRC. Since we know what table index to expect, we simply need to make sure that
crc32Table_[tableIdx] == crc32Table_[patch ^ (crc & 0xff)]
for each patch byte. So with msgCrc being the CRC hash of the message, we get the following code to finally find the four patch bytes:
UInt32 crc = ~msgCrc;
for (int j = 0; j < 4; j++) {
patch[j] = (crc ^ tableIdx[j]) & 0xff;
crc = (crc >> 8) ^ crc32Table_[patch[j] ^ (crc & 0xff)];
}
Usage
Reverse CRC may be useful when a message needs to be modified but the CRC of the message cannot be changed. The message can then be patched as desired, and a four byte patch block will be inserted to correct changes in CRC.
Other uses may include cases where you want to CRC any message but for some reason have constraints on what the generated CRC can be. I’ve seen cases where this actually was needed and the best solution was to add a patch to the message CRC to force the final CRC to be within a certain range.
Source Code
The complete source code of a CRC implementation that includes a method for finding a four byte patch for the reverse CRC, as well as some test cases can be found here.

22 comments: