## Tuesday, July 9, 2013

### Rewinding CRC - Calculating CRC backwards

Background

In two earlier articles (Calculating Reverse CRC and Reverse CRC Patch with Readable Characters) I've described algorithms to find a sequence of bytes and ASCII characters that will append to a buffer to change the CRC to a desired CRC.
Some readers have said that it would be nice to be able to modify the data and apply a patch when the beginning of the buffer is unknown. This article describes a method to do just that, although it does require the final CRC of the original buffer to be known.

Problem

The algorithms described in earlier articles require that the CRC at the beginning of the buffer is known. But suppose you have a message with an unknown beginning and you want to change a sequence in the buffer and apply a patch to make the mutated buffer end up with the same CRC as the original. For example changing:

{unknown data} + {known buffer} -> Known CRC

to

{same unknown data} + {modified buffer} + {patch} -> Same Known CRC

In order for the previously described patch algorithms to work, you would need to know the CRC at the beginning of the known buffer to use the algorithms. This article describes an algorithm that starts at the end of a known buffer and calculates the CRC at an earlier point in the buffer.

Solution

The solution is not that complicated but has a few caveats that we'll go into. The algorithm is basically the reverse of the normal CRC algorithm that looks like:

newCrc = (oldCrc >> 8) ^ crc32Table[value ^ (oldCrc & 0xff)];

Assume that the CRC at the current location in a buffer contains the octets [An Bn Cn Dn] where An is the high eight bits and Dn is the low eight bits. And the CRC before the last byte Vn was added to the CRC was [Ao Bo Co Do]. If we put these two CRCs into the equation and performs the shifts, we'll get:

[An Bn Cn Dn] = [0 Ao Bo Co] ^ crcTable[Vn ^ Do]

Note the 0 in the high byte of the old CRC. Looking at this it is quite clear that the high byte of the new/current CRC is equivalent to the high byte in the CRC table lookup. Knowing this we can easily look at all 256 CRC table indices and find the ones that matches the equation:

An = crcTable[i] >> 24

Once we know a table index that matches the equation, we can calculate the old octet Do:

Do = i ^ Vn

where Vn was the byte that we added to the old CRC to get the new CRC. This is however where the caveat comes in. There may be more than one table entry that satisfy the equation, which means that there will be multiple possible previous CRCs. The good news is that the standard CRC with the polynomial 04c11db7h and most other polynomials has exactly one table entry that do, so if using the standard CRC polynomial there is a single solution to the equation and thus only a single possible previous CRC.

Let us continue. Once Do is calculated, it is easy to calculate the other three octets in the previous CRC, by inserting the found Dn into the equation:

[0 Ao Bo Co] = [An Bn Cn Dn] ^ crcTable[Vn ^ Do]

So to address the original problem where we wanted to find the CRC X number of bytes before the end, we'll apply this algorithm recursively on the known buffer starting at the last byte and the known CRC and end when we reach the beginning of the known buffer:

for (int i = len(buffer) - 1; i >= 0; --i) {
for (int j = 0; j < 256; ++j) {
if ((crc32Table[j] >> 24) == (crc >> 24)) {
crc = ((crc ^ crc32Table[j]) << 8) | (j ^ buffer[i]);
break;
}
}
}

This gives us the CRC at the beginning of the known buffer, and we can apply the patch algorithms described in earlier articles.

Although not that common, it may be worth looking at a solution to address the case where the polynomial doesn't have unique high octets. In this case we need to do a bit more work to find all solutions. We can use a recursive function call, but this may not scale very well, so the best option is to use a stack where we keep track of each found match in the CRC table.
There are two pieces of information that needs to be pushed on the stack for each match; the current location in the buffer and the CRC at that location. Then we need to modify the program slightly to use the stack. The CRCs we found at the beginning of the known buffer is then stored in a list.

std::list<UInt32> startCrcs;
std::stack<match> stack;

// Push last octet in the buffer to seed the loop.
stack.push(new Match(size, INV(endCrc)));

while (!stack.empty()) {
Match* node = stack.top();
stack.pop();
if (node->offset == 0) {
// Reached the beginning of the buffer; save the CRC.
startCrcs.push_back(INV(node->crc));
}
else {
// Find previous CRCs and push nodes to stack.
for (int i = 0; i < 256; ++i) {
if ((crc32Table[i] >> 24) == (node->crc >> 24)) {
UInt32 prevCrc = ((node->crc ^ crc32Table[i]) << 8) |
(i ^ buffer[node->offset - 1]);
stack.push(new Node(node->offset - 1, prevCrc));
}
}
}
delete node;
}

Usage

As mentioned, this algorithm may be useful if you want to back calculate CRC in a stream of data or in a frame when you know the final CRC in order to modify the data. There are some limitations though, one is that there may be multiple solutions depending on polynomial. The bigger limitation is that the algorithm requires that the CRC at the end (or at least at some point later than where the to calculate CRC at) which may be unknown.

Source

I have added an implementation of the Rewind algorithm to the previous CRC package with some examples and test code. The source code can be found here.