Sunday, May 13, 2012

How to win (or not to win) IOCCC

Background

    The International Obfuscated C Code Contest, IOCCC, is a programming contest where the goal is to write the most obscure/obfuscated C program that shows the importance of programming style, stress the compilers and illustrates some of the subtleties of the C language. The competition started in 1984 and is now the longest running contest on the Internet. Through the years, the competition has produced amazing small programs. Man y of them are unique and fun, but there are also programs that solve the same problems as previous winners. One of the most popular is to generate small primes.

    This article talks about different ways to make a small primes program and reasons for why they didn’t win, followed by what eventually led to the 7th winning small primes program.


Simple Small Primes Program

    One of the entries I made to the competition is what may be the shortest small primes program with its 52 characters:

  l;main(O){for(;2-l||printf("%8d",O);l=O%--l?l:++O);}


    This program has some features of winning entries:
  • It is short and compact
  • uses hard to read variable names
  • Uses question mark operator
  • implicit type definitions
  • depends on order of Boolean operators
    This is a fairly good list of features that makes up a good obfuscated program. But it has one big flaw (apart from calculating primes), which is that the algorithm is quite easy to spot in the code. Anything that looks for modulus of one variable by a decreasing or increasing second variable is most likely a prime number generator. To win over the previous winners the program really need to be a bit more creative in hiding what it does, or do it in a more interesting way than easy to spot brute force.


Disguising the Algorithm

    A simple modification to the program is to make the brute force approach a bit less revealing. One way is to look at what makes a prime. A prime number p is a number greater than 1 that satisfies the equation with only integer numbers:

          a * b = p    iff:   a = 1 or b = 1

A simple substitution of a and b where

          a = i + j
          b = i – j

gives us the following equation:

          (i + j) * (i – j) = i2 – j2 = p    iff:   i – j = 1

    With this equation we can be a bit more creative. We still need to find an integer solution to the equation and one approach is to iterate over x and y until we find two numbers x and y that satisfy the equation above. Knowing that the derivate of n2 is 2n, we can find the squares iteratively where

          12 = 1
          d1 = 1
          dn = dn - 1 + 2
          n2 = (n-1)2 + dn

    The nice thing with this iteration is that we can use only additions and subtractions to find whether a number is a prime. We need two derivate variables in our iteration, say x and y to test the potential prime number. An interesting observation from the initial algorithm is that we actually don’t need to keep track of both squared values as we are only interested in iterating until we find two values of x and y that gives us an integer solution to the equation. Once the integer solution is found, we can check if x – y = 2, which means that p must be a prime.

    We still need one variable to hold the accumulated value of the squares, so we introduce a variable z. Each step in the iteration for finding will increase z by x or decrease z by y depending on whether z is greater or less than 0. We have an integer solution to the problem once z is 0. The iterative algorithm is implemented in the following obfuscated program:

  x,y,z;main(p){for(;;z+=!z?x-y-2?p:printf(
      "%8d",p),x=y=1,p+=2:z>0?y-=2:-(x-=2));}

    This program has many of the characteristics of the first implementation, but it also hides the obvious modulus that killed the first version. For most people it would be pretty hard to see that this version calculates prime numbers, but for anyone skilled in hyperbolic functions, the program still gives enough hints on what it’s doing, and since the program is trying to compete in a category where the IOCCC judges explicitly mention that it is much harder to win with a small primes program, its simply not good enough.


Be Creative in the Implementation

    Although the last program may have been too trivial, it does open up for several interesting implementations as it the problem is solved by simple additions and subtractions. One interesting feature of C is the pre-processor. A prime number generator using the pre-processor has already won IOCCC, so it may be a long shot going that route. That particular entry did hide the math in several pre-processor macros, where defined variable holds some value needed for the calculation. In fact it is very easy to find the actual test for a prime number as it is limited to one line doing a modulus test.

    Since we have a simple algorithm, we could use it to do something less obvious using the pre-processor. What if we didn’t assign any values to the variables we use? We could let a defined variable represent a 1 bit, and if the variable is not defined, it is a 0. This way we can use 10 variable names to represent a 10 bit number.

    We could also use recursive includes, where we have a state variable indicating what one pass of the program code does. For example we could exercise part of the program to do an addition, and a second to do subtraction etc.

    In order to do an addition of two numbers with these constraints, we need to have a one bit adder, that generates a carry. This is really not that hard. Assume we use x and y as the bits, c, as carry, and place the output back in x and c, we can do this in the pre-processor as:

  #ifdef x
  #ifdef y
  #ifndef c
  #define c
  #endif
  #else
  #ifdef c
  #undef x
  #endif
  #endif
  #else
  #ifdef y
  #ifndef c
  #define x
  #endif
  #else
  #ifdef c
  #undef c
  #endif
  #endif
  #endif

    Then we can use the one bit adder to add our 10 bit values A0-A9 and B0-B9 by first defining x and y for each of the bits in the two values and include the program recursively to exercise the code above, like:

  #ifdef A0
  #define x
  #else
  #undef x
  #endif
  #ifdef B0
  #define y
  #else
  #undef y
  #endif
  #include "program.c"

    And then do the same for for A1-A9 and B1-B9. We can do similar for subtractions. Furthermore, we can save the value of the last carry and use it as an overflow bit. Then we can leverage that for testing if a value is greater than another. Finally we need some similar logic to run the actual algorithm by including and exercising the right portions of the code. The whole program can be found at the IOCCC who won page.

    What makes this a winner compared to the other examples is probably that this really pushes the compilers to the limit. Not a single compiler was able to calculate prime numbers up to 1024 at the time of the competition. Linus Torvalds did work hard to fix one compiler within 12 hours after the source code was released to the public, but that is really an exception. Most compilers still don’t handle the program. The number of recursive includes are over 6.8 million, and the amount of memory required by the compiler is significant. It is also extremely slow; it takes hours to compute just a few hundred prime numbers. Furthermore, the actual pre-processed code is many megabytes and the only actual code, is one print statement for each prime number.


Conclusion

    To win IOCCC with something that is not unique or new is a challenge, but it is possible to find a new approach that is interesting enough, even with something as common as a prime number generator. In general though, it is much easier to win with something new, but even then, some fairly clever and interesting techniques need to be present. The good thing is that it is possible to improve an idea and reach something worthy by looking at alternatives and some out of the box thinking, just as these examples show. And the improvements can be introduces iterative and doesn’t need to be thought out in advance.

Happy coding.

4 comments:

  1. Ha ha... Very nice post. Though the IOCCC judges warn that printing primes has been largely played out, I prefer to think of it rather as "having a rich tradition" :)

    At least that was my thinking when I submitted a prime-printing program for the last contest. It had some interesting features -- not least that it is self-documenting and generates primes in a completely new way -- but alas, it did not win (though I have heard that it made the final judging round). I wrote up a little thing about it here: http://computronium.org/ioccc.html

    Glad also to see you were able to push compilers further with your preprocessor entry - a similar entry I cowrote some years before (http://ioccc.org/years.html#1995_vanschnitz) also broke a number of compilers, but apparently they didn't fix them well enough in time for your entry :)

    ReplyDelete
    Replies
    1. Yeah, I don't think this was the last small prime number winner either, but it gets tougher and tougher.

      I looked at your program and its nice indeed. I suppose it runs out of stack eventually as it calls main recursively. Don't think that is a biggie for the competition as long as its documented that it will stop working after a while.

      The tower of Hanoi program is indeed similar in nature to my prime number generator, more so than the 1988 applin one, which also calculates prime numbers in the pre-processor.

      Delete
  2. You really make it seem so easy with your presentation but I find this topic to be really something which I think I would never understand. It seems too complicated and very broad for me. I am looking forward for your next post, I will try to get the hang of it! capital one card login in

    ReplyDelete