For the unfamiliar - What is code guessing?

Code guessing is a game played on one of my Discord servers. Participants write programs that solve a specific problem. The solutions (usually written in C or Python) are published on my website (6th round page) and the participants try to guess who made each of the solutions. After the guessing phase ends, the points are calculated. Being guessed costs one point, guessing someone correctly grants one point. The person with the most points is declared the winner and picks the next challenge (and supported languages).

This round of code guessing required the participants to write the following program:

Make a function that takes a natural number $i$ and returns or prints a series of unique numbers $a_1$ to $a_n$, where $\sum_{k = 1}^{n} f(a_k) = i$ and $f(x) = fib(x)$ for $x \geq 0$, and $f(x) = -fib(x)$ otherwise. For example, $i = 1070$ should yield -1 -5 11 16, because:

$1070 = 987 + 89 - 5 - 1$ and $987 = fib(16)$, $89 = fib(11)$, $5 = fib(5)$ and $1 = fib(1)$

Meaning that $1070 = fib(16) + fib(11) - fib(5) - fib(1) = f(16) + f(11) + f(-5) + f(-1)$

Because addition is commutative, the output vector can be given in any order.

My entry

My entry to the 6th round follows:

#define _     long
#define __    (_$$>>'`'-'!')
#define ___   return
#define ____  float
#define _____ int

const    _ WANT      = 0xff800000;
const    _ ADORE     = 0x3f2aaaab;
const ____ NEED      = 0.34948500216;
const ____ LOVE      = 0.20898764025;
const ____ BREATHE   = 1.19209291e-7f;
const ____ LIVE      = 1.0f;
const ____ THIRST    = 0.230836748f;
const ____ DEVOTION  = -0.279208572f;
const ____ LUST      = 0.331826030f;
const ____ AFFECTION = -0.498910339f;
const ____ PASSION   = 0.693147184f;
const ____ YOU       = 2.302585092994045f;

// Bite me...

_ _$(_*_$_,_ _$$,_ _$$_){___ _$_[_$$_]<=((_$$+__)^__)?_$$_:_$(_$_,_$$,_$$_-'>'+'=')
;}_$___(_*_$_,_*_$__$,_ _$_$$,_ _$$_,_ _$$,_ _$$$,_ _$$$_,_ _$$$$){___!_$$?'$'-'$':
((_$__$[_$$$_]=((_$$$$=(-':'+';')|__)*(_$$$=(_$$_=_$(_$_,_$$,_$$_))+!((((((_$$+__)^
__)-_$_[_$$_])+((((_$$+__)^__)-_$_[_$$_])>>'`'-'!'))^(((_$$+__)^__)-_$_[_$$_])>>'`'
-'!')<(((((_$$+__)^__)-_$_[_$$_+'@'-'?'])+((((_$$+__)^__)-_$_[_$$_+'&'-~~'%'])>>'`'
-'!'))^(((_$$+__)^__)-_$_[_$$_+')'-'('])>>'`'-'!'))))),(_$___(_$_,_$__$,_$_$$,_$$$+
':'-';',_$$-_$_[_$$$]*_$$$$,_$$$,_$$$_+')'-'(',_$$$$)));}_ _$__(____ _$$){___ _$$==
(____)((_)_$$)?(_)_$$:(_)_$$+'^'-']';}____ _$_$(_ _$){___*(____*)&_$;}____ _$$$_$$(
____ _$_){____ _$$__,_$$_;_ _$$$_;_$$__=(_$$_=_$_$((*(_*)&_$_)-(_$$$_=((*(_*)&_$_)-
ADORE)&WANT))-LIVE)*_$$_;___(((____)_$$$_*BREATHE)*PASSION+(((THIRST*_$$_+DEVOTION)
*_$$__+(LUST*_$$_+AFFECTION))*_$$__+_$$_))/YOU;}_*_$$$$$(_ _$$,_____*_$$__){_ _$$$,
_$$_,z,r,k,m;if(_$$==0){*_$$__=0;___ malloc(8);}_$$$=_$$_=1;r=0;k=_$$;while(_$$_<k)
{z=_$$$+_$$_;_$$$=_$$_;_$$_=z;}while(k){k=k-_$$$<=_$$_-k?k-_$$$:_$$_-k;r-=-1;while(
_$$$>=k){z=_$$_-_$$$;_$$_=_$$$;_$$$=z;}}m=_$__((_$$$_$$(_$$+0.5)+NEED)/LOVE);_ _$_[
m];_*_$__$=malloc(8*r);_$_[0]=0;_$_[1]=1;for(z=0;z<m-2;z++)_$_[z+2]=_$_[z+1]+_$_[z]
;_$___(_$_,_$__$,m,z+';'-':',_$$,'$'-'$','$'-'$','$'-'$');*_$$__=r;___ _$__$;}/* */

Preliminaries

That’s a lot to unpack, so let’s try it step by step. The first step is passing the code through the C preprocessor:

const long WANT = 0xff800000;
const long ADORE = 0x3f2aaaab;
const float NEED = 0.34948500216;
const float LOVE = 0.20898764025;
const float BREATHE = 1.19209291e-7f;
const float LIVE = 1.0f;
const float THIRST = 0.230836748f;
const float DEVOTION = -0.279208572f;
const float LUST = 0.331826030f;
const float AFFECTION = -0.498910339f;
const float PASSION = 0.693147184f;
const float YOU = 2.302585092994045f;

long _$(long*_$_,long _$$,long _$$_){return _$_[_$$_]<=((_$$+(_$$>>'`'-'!'))^(_$$>>'`'-'!'))?_$$_:_$(_$_,_$$,_$$_-'>'+'=')
;}_$___(long*_$_,long*_$__$,long _$_$$,long _$$_,long _$$,long _$$$,long _$$$_,long _$$$$){return!_$$?'$'-'$':
((_$__$[_$$$_]=((_$$$$=(-':'+';')|(_$$>>'`'-'!'))*(_$$$=(_$$_=_$(_$_,_$$,_$$_))+!((((((_$$+(_$$>>'`'-'!'))^
(_$$>>'`'-'!'))-_$_[_$$_])+((((_$$+(_$$>>'`'-'!'))^(_$$>>'`'-'!'))-_$_[_$$_])>>'`'-'!'))^(((_$$+(_$$>>'`'-'!'))^(_$$>>'`'-'!'))-_$_[_$$_])>>'`'
-'!')<(((((_$$+(_$$>>'`'-'!'))^(_$$>>'`'-'!'))-_$_[_$$_+'@'-'?'])+((((_$$+(_$$>>'`'-'!'))^(_$$>>'`'-'!'))-_$_[_$$_+'&'-~~'%'])>>'`'
-'!'))^(((_$$+(_$$>>'`'-'!'))^(_$$>>'`'-'!'))-_$_[_$$_+')'-'('])>>'`'-'!'))))),(_$___(_$_,_$__$,_$_$$,_$$$+
':'-';',_$$-_$_[_$$$]*_$$$$,_$$$,_$$$_+')'-'(',_$$$$)));}long _$__(float _$$){return _$$==
(float)((long)_$$)?(long)_$$:(long)_$$+'^'-']';}float _$_$(long _$){return*(float*)&_$;}float _$$$_$$(
float _$_){float _$$__,_$$_;long _$$$_;_$$__=(_$$_=_$_$((*(long*)&_$_)-(_$$$_=((*(long*)&_$_)-
ADORE)&WANT))-LIVE)*_$$_;return(((float)_$$$_*BREATHE)*PASSION+(((THIRST*_$$_+DEVOTION)
*_$$__+(LUST*_$$_+AFFECTION))*_$$__+_$$_))/YOU;}long*_$$$$$(long _$$,int*_$$__){long _$$$,
_$$_,z,r,k,m;if(_$$==0){*_$$__=0;return malloc(8);}_$$$=_$$_=1;r=0;k=_$$;while(_$$_<k)
{z=_$$$+_$$_;_$$$=_$$_;_$$_=z;}while(k){k=k-_$$$<=_$$_-k?k-_$$$:_$$_-k;r-=-1;while(
_$$$>=k){z=_$$_-_$$$;_$$_=_$$$;_$$$=z;}}m=_$__((_$$$_$$(_$$+0.5)+NEED)/LOVE);long _$_[
m];long*_$__$=malloc(8*r);_$_[0]=0;_$_[1]=1;for(z=0;z<m-2;z++)_$_[z+2]=_$_[z+1]+_$_[z]
;_$___(_$_,_$__$,m,z+';'-':',_$$,'$'-'$','$'-'$','$'-'$');*_$$__=r;return _$__$;}

At this point, I dropped the code into clang-format. Unfortunately, it looks even scarier than before (presumably because its volume increased). My entry uses a technique described in my previous blog post about non-alphanumeric code. So let’s fold the constants then:

  • '`' - '!' is 63
  • '$' - '$' is 0
  • '>' - '=', '@' - '?', '^' - ']', '&' - ~~'%', ')' - '(' and ';' - ':' are 1
  • ':' - ';' is -1

The transformed code follows:

const long WANT = 0xff800000;
const long ADORE = 0x3f2aaaab;
const float NEED = 0.34948500216;
const float LOVE = 0.20898764025;
const float BREATHE = 1.19209291e-7f;
const float LIVE = 1.0f;
const float THIRST = 0.230836748f;
const float DEVOTION = -0.279208572f;
const float LUST = 0.331826030f;
const float AFFECTION = -0.498910339f;
const float PASSION = 0.693147184f;
const float YOU = 2.302585092994045f;

long _$(long *_$_, long _$$, long _$$_) {
  return _$_[_$$_] <= ((_$$ + (_$$ >> 63)) ^ (_$$ >> 63))
             ? _$$_
             : _$(_$_, _$$, _$$_ - 1);
}
_$___(long *_$_, long *_$__$, long _$_$$, long _$$_, long _$$, long _$$$,
      long _$$$_, long _$$$$) {
  return !_$$ ? 0
              : ((_$__$[_$$$_] = ((_$$$$ = (-':' + ';') | (_$$ >> 63)) *
                                  (_$$$ = (_$$_ = _$(_$_, _$$, _$$_)) +
                                          !((((((_$$ + (_$$ >> 63)) ^
                                                (_$$ >> 63)) -
                                               _$_[_$$_]) +
                                              ((((_$$ + (_$$ >> 63)) ^
                                                 (_$$ >> 63)) -
                                                _$_[_$$_]) >>
                                               63)) ^
                                             (((_$$ + (_$$ >> 63)) ^
                                               (_$$ >> 63)) -
                                              _$_[_$$_]) >>
                                                 63) <
                                            (((((_$$ + (_$$ >> 63)) ^
                                                (_$$ >> 63)) -
                                               _$_[_$$_ + 1]) +
                                              ((((_$$ + (_$$ >> 63)) ^
                                                 (_$$ >> 63)) -
                                                _$_[_$$_ + 1]) >>
                                               63)) ^
                                             (((_$$ + (_$$ >> 63)) ^
                                               (_$$ >> 63)) -
                                              _$_[_$$_ + 1]) >>
                                                 63))))),
                 (_$___(_$_, _$__$, _$_$$, _$$$ - 1,
                        _$$ - _$_[_$$$] * _$$$$, _$$$, _$$$_ + 1,
                        _$$$$)));
}
long _$__(float _$$) {
  return _$$ == (float)((long)_$$) ? (long)_$$ : (long)_$$ + 1;
}
float _$_$(long _$) { return *(float *)&_$; }
float _$$$_$$(float _$_) {
  float _$$__, _$$_;
  long _$$$_;
  _$$__ = (_$$_ = _$_$((*(long *)&_$_) -
                       (_$$$_ = ((*(long *)&_$_) - ADORE) & WANT)) -
                  LIVE) *
          _$$_;
  return (((float)_$$$_ * BREATHE) * PASSION +
          (((THIRST * _$$_ + DEVOTION) * _$$__ + (LUST * _$$_ + AFFECTION)) *
               _$$__ +
           _$$_)) /
         YOU;
}
long *_$$$$$(long _$$, int *_$$__) {
  long _$$$, _$$_, z, r, k, m;
  if (_$$ == 0) {
    *_$$__ = 0;
    return malloc(8);
  }
  _$$$ = _$$_ = 1;
  r = 0;
  k = _$$;
  while (_$$_ < k) {
    z = _$$$ + _$$_;
    _$$$ = _$$_;
    _$$_ = z;
  }
  while (k) {
    k = k - _$$$ <= _$$_ - k ? k - _$$$ : _$$_ - k;
    r -= -1;
    while (_$$$ >= k) {
      z = _$$_ - _$$$;
      _$$_ = _$$$;
      _$$$ = z;
    }
  }
  m = _$__((_$$$_$$(_$$ + 0.5) + NEED) / LOVE);
  long _$_[m];
  long *_$__$ = malloc(8 * r);
  _$_[0] = 0;
  _$_[1] = 1;
  for (z = 0; z < m - 2; z++)
    _$_[z + 2] = _$_[z + 1] + _$_[z];
  _$___(_$_, _$__$, m, z + 1, _$$, 0, 0, 0);
  *_$$__ = r;
  return _$__$;
}

At this point, it’s a good idea to test if the code still works. I did so by tacking the following function at the bottom of it:

int main() {
    int dest;
    long * vec = _$$$$$(102362929, &dest);
    for(int _$$ = 0; _$$ < dest; _$$++)
        printf("%ld ", vec[_$$]);
}

The program prints 40 23 12 -8 -5 -2, so it’s still valid. The next step is merging constant values into the code. I also renamed the identifiers now so they’re easier to tell apart.

long f(long *a, long b, long c) {
  return a[c] <= ((b + (b >> 63)) ^ (b >> 63)) ? c : f(a, b, c - 1);
}

int q(long *b, long *c, long a, long d, long h, long g, long i, long e) {
  return !h ? 0
            : ((c[i] =
                    ((e = 1 | (h >> 63)) *
                     (g = (d = f(b, h, d)) +
                          !((((((h + (h >> 63)) ^ (h >> 63)) - b[d]) +
                              ((((h + (h >> 63)) ^ (h >> 63)) - b[d]) >> 63)) ^
                             (((h + (h >> 63)) ^ (h >> 63)) - b[d]) >> 63) <
                            (((((h + (h >> 63)) ^ (h >> 63)) - b[d + 1]) +
                              ((((h + (h >> 63)) ^ (h >> 63)) - b[d + 1]) >>
                               63)) ^
                             (((h + (h >> 63)) ^ (h >> 63)) - b[d + 1]) >>
                                 63))))),
               (q(b, c, a, g - 1, h - b[g] * e, g, i + 1, e)));
}

long g(float n) { return n == (float)((long)n) ? (long)n : (long)n + 1; }
float h(long n) { return *(float *)&n; }

float i(float n) {
  float t, u;
  long v;
  t = (u = h((*(long *)&n) - (v = ((*(long *)&n) - 0x3f2aaaab) & 0xff800000)) -
           1.0f) *
      u;
  return (((float)v * 1.19209291e-7f) * 0.693147184f +
          (((0.230836748f * u + -0.279208572f) * t +
            (0.331826030f * u + -0.498910339f)) *
               t +
           u)) /
         2.302585092994045f;
}

long *entry(long input, int *out_len) {
  long x1, x2, z, r, k, m;
  if (input == 0) {
    *out_len = 0;
    return malloc(8);
  }
  x1 = x2 = 1;
  r = 0;
  k = input;
  while (x2 < k) {
    z = x1 + x2;
    x1 = x2;
    x2 = z;
  }
  while (k) {
    k = k - x1 <= x2 - k ? k - x1 : x2 - k;
    r -= -1;
    while (x1 >= k) {
      z = x2 - x1;
      x2 = x1;
      x1 = z;
    }
  }
  m = g((i(input + 0.5) + 0.34948500216) / 0.20898764025);
  long fib[m];
  long *result = malloc(8 * r);
  fib[0] = 0;
  fib[1] = 1;
  for (z = 0; z < m - 2; z++)
    fib[z + 2] = fib[z + 1] + fib[z];
  q(fib, result, m, z + 1, input, 0, 0, 0);
  *out_len = r;
  return result;
}

Diving into the code

At this point, the code starts looking more and more bearable. Let’s take on the smallest functions first:

long g(float n) { return n == (float)((long)n) ? (long)n : (long)n + 1; }
float h(long n) { return *(float *)&n; }

g returns truncated n if n doesn’t have a decimal part. Otherwise, it returns truncated n + 1, meaning that g is a ceiling function.

h reinterprets a long as a float. Scary.

What’s f then? Let’s see:

long f(long *a, long b, long c) {
  return a[c] <= ((b + (b >> 63)) ^ (b >> 63)) ? c : f(a, b, c - 1);
}

Looking at where f is called, we deduce that a is the fib array from main. When we look at it’s initialisation code, it’s fairly obvious that this array is a cache of fibonacci numbers:

  long fib[m];
  fib[0] = 0;
  fib[1] = 1;
  for (z = 0; z < m - 2; z++)
    fib[z + 2] = fib[z + 1] + fib[z];

What about b and c? b doesn’t seem to be altered across recursive calls and it might be a good idea to turn the recursive function into something that uses iteration:

long f(long *a, long b, long c) {
  while(a[c] > ((b + (b >> 63)) ^ (b >> 63)))
    c--;
  return c;
}

While the current fibonacci number (starting from c) is greater than… something, we keep decreasing it, until it’s smaller or equal to it.

What’s this bit trickery anyways? b + (b >> 63) will add the highest bit of this number to itself. Obviously, this has two possible outcomes - it yields either b - 11 or b. Ans ^ (b >> 63) ends up being either (b - 1) ^ -1, or b ^ 0. b ^ 0 is obviously b, while (b - 1) ^ -1 is more interesting. Looking at the xor truth table, we learn that xoring a binary digit by 1 negates it, so our expression could be rewritten as ~(b - 1). For a signed integer, ~ acts fairly similar to absolute value, except it’s off by one2, so the - 1 part fixes it:

#include <stdint.h>
#include <stdio.h>

int main(void) {
    uint64_t c = 0xFFFFFFFFFFFFFFFC;
    int64_t b = c;
    printf("~0xFFFFFFFFFFFFFFFC = %llx\n", ~c);
    printf("0xFFFFFFFFFFFFFFFC as int64_t: %d", b);
}

This way, we’ve established that ((b + (b >> 63)) ^ (b >> 63)) is abs(b). Let’s use this in the code:

long f(long *a, long b, long c) {
  while(a[c] > abs(b))
    c--;
  return c;
}

f returns the largest fibonacci number lesser or equal to abs(b) (because c, judging by all calls of f, is the size of the fibonacci number cache).

The Q function

Let’s take on q now. We apply the same absolute value transformation on it:

int q(long *b, long *c, long a, long d, long h, long g, long i, long e) {
  return !h ? 0
            : ((c[i] = ((e = 1 | (h >> 63)) *
                        (g = (d = f(b, h, d)) +
                             !((((abs(h) - b[d]) + ((abs(h) - b[d]) >> 63)) ^
                                (abs(h) - b[d]) >> 63) <
                               (((abs(h) - b[d + 1]) +
                                 ((abs(h) - b[d + 1]) >> 63)) ^
                                (abs(h) - b[d + 1]) >> 63))))),
               (q(b, c, a, g - 1, h - b[g] * e, g, i + 1, e)));
}

Our familiar pattern occurs in the code once more - we take the absolute value of abs(h) - b[d] and abs(h) - b[d + 1]:

int q(long *b, long *c, long a, long d, long h, long g, long i, long e) {
  return !h ? 0
            : ((c[i] = ((e = 1 | (h >> 63)) *
                        (g = (d = f(b, h, d)) +
                             !(abs(abs(h) - b[d]) < abs(abs(h) - b[d + 1]))))),
               (q(b, c, a, g - 1, h - b[g] * e, g, i + 1, e)));
}

Let’s break this ugly return statement down and ponder a little about 1 | (h >> 63). We already know that h >> 63 is either -1 if h is negative or 0 otherwise. Because -1 is just all ones, 1 | 0xFFFFFFFFFFFFFFFF simply yields 0xFFFFFFFFFFFFFFFF. In the 0 case, 0 | 1 becomes 1, so 1 | (h >> 63) yields us the sign of the number. After applying the transformations, the e parameter is no longer needed, so it has been removed.

int q(long *b, long *c, long a, long d, long h, long g, long i) {
  if(!h) return 0;
  long sign = 1 | (h >> 63);
  d = f(b, h, d);
  g = d + !(abs(abs(h) - b[d]) < abs(abs(h) - b[d + 1]));
  c[i] = sign * g;
  return q(b, c, a, g - 1, h - b[g] * sign, g, i + 1);
}

A few things can be done about the code in this shape, starting with removing the negation from g.

int q(long *b, long *c, long a, long d, long h, long g, long i) {
  if(!h) return 0;
  long sign = 1 | (h >> 63);
  d = f(b, h, d);
  g = d + (abs(abs(h) - b[d]) >= abs(abs(h) - b[d + 1]));
  c[i] = sign * g;
  h -= b[g] * sign;
  return q(b, c, a, g - 1, h, g, i + 1);
}

Now, we can turn the tail call into a for loop:

int q(long * fibs, long * result, long a, long d, long h) {
  long g = 0;
  for(int i = 0; h; i++) {
    long sign = 1 | (h >> 63);
    d = f(fibs, h, d);
    g = d + (abs(abs(h) - fibs[d]) >= abs(abs(h) - fibs[d + 1]));
    result[i] = sign * g;
    h -= fibs[g] * sign;
    d = g - 1;
  }
  return 0;
}

We know that f() returns the index in fibs, d is the length of the fibonacci numbers array as perceived by f, and h is our input (judging by the call to q in main). Let’s introduce meaningful names to our code now. a seems to be meaningless, so we remove it:

int q(long * fibs, long * result, long idx, long in) {
  long g = 0;
  for(int i = 0; in; i++) {
    long sign = 1 | (in >> 63);
    idx = f(fibs, in, idx);
    g = idx + (abs(abs(in) - fibs[idx]) >= abs(abs(in) - fibs[idx + 1]));
    result[i] = sign * g;
    in -= fibs[g] * sign;
    idx = g - 1;
  }
  return 0;
}

Let’s describe the algorithm now:

  • Compute the sign of input.
  • Find the largest fibonacci number which is lesser or equal to the input.
  • If it’s worth it, increment the index (so we subtract a larger fibonacci number). The operation is considered to be worth it if the absolute value after subtracting the larger fibonacci number is smaller than the absolute value after subtracting the smaller fibonacci number.
  • Add the fibonacci index to the output vector. Make sure it has the correct sign.
  • Subtract the fibonacci number from our input
  • Set a new boundary on fibonacci number cache size passed to f (to make the scan a little more efficient).

Explaining the I function

Now that we’ve figured out the actual solution part, paradoxically, we’re heading into the most tangled areas of the code.

float i(float n) {
  float t, u;
  long v;
  t = (u = h((*(long *)&n) - (v = ((*(long *)&n) - 0x3f2aaaab) & 0xff800000)) -
           1.0f) *
      u;
  return (((float)v * 1.19209291e-7f) * 0.693147184f +
          (((0.230836748f * u + -0.279208572f) * t +
            (0.331826030f * u + -0.498910339f)) *
               t +
           u)) /
         2.302585092994045f;
}

Like with most eldritch horrors like this, I like to plug a few numbers into the function to see what’s going on:

i(0.500000) = 153.826324
i(1.500000) = 0.176091
i(2.500000) = 0.397937
i(3.500000) = 0.544074
i(4.500000) = 0.653216
i(5.500000) = 0.740360
i(6.500000) = 0.812919
i(7.500000) = 0.875063
i(8.500000) = 0.929420
i(9.500000) = 0.977726
i(10.500000) = 1.021188
i(11.500000) = 1.060694
i(12.500000) = 1.096914
i(13.500000) = 1.130340
i(14.500000) = 1.161372
i(15.500000) = 1.190332
i(16.500000) = 1.217484
i(17.500000) = 1.243041
i(18.500000) = 1.267175
i(19.500000) = 1.290035

The results look oddly similar to what we’d get from $\log_{10} n$, except the domain of our function is presumably $<1, \infty>$.

The constant 0xff800000 should already alert someone familiar with floating point trickery. Assuming IEEE-754, this snippet extracts the exponent and the sign from a floating point number - 0xFF800000 is 9 ones followed by a trail of zeroes, since an IEEE-754 number consists of the sign bit, exponent (8 bits) and the mantissa. Intuition aside, let’s try to methodically untangle this function:

float i(float n) {
  const long x = (*(long *)&n);
  long v = (x - 0x3f2aaaab) & 0xff800000;
  float u = h(x - v) - 1.0f;
  float t = u * u;
  return (((float)v * 1.19209291e-7f) * 0.693147184f +
          (((0.230836748f * u + -0.279208572f) * t +
            (0.331826030f * u + -0.498910339f)) *
               t +
           u)) /
         2.302585092994045f;
}

Let’s focus on the declarations. x is of course the IEEE-754 representation of our input number. 0x3f2aaaab is the binary representation of $0.(6)$ in IEEE-754 and it’s used to reduce the mantissa range to $<\frac{2}{3}, \frac{4}{3}>$. We extract the sign and exponent of it as v. Subtracting the exponent from the input naturally gives us the mantissa (in u). Subtracting 1 from the mantissa gives us a $<-\frac{1}{3}, \frac{1}{3}>$ range of m.

So what actually happened here? We separated our input $n$ into the exponent and mantissa, where $n = m * 2^e$, so $\log n = e * \log 2 + \log m$. Finally, if we limit the range of $m$ enough, we can compute log1p(u) which is approximately $\log n$, so what we’re striving for is limiting $m$ into a range so that the argument to log1p is in a small, uniform range. And that’s what the declarations did.

The following computation oddly resembles something involving a Chebyshev polynomial, which I’m already familiar with (I used it for approximating trigonometric functions). Regarding the algorithm for approximating functions in a Chebyshev space (e.g. the subspace of $n$-th order Chebyshev Polynomials in the space of real continuous functions in an interval), I used the Remez exchange algorithm.

Finally, 2.302585092994045f is approximately $\log 10$. Having computed $\log n$, we turn it into $\log_{10} n$ using the change-of-base logarithm theorem.

The entry function

The last piece of the puzzle is the entry function:

long *entry(long input, int *out_len) {
  long x1, x2, z, r, k, m;
  if (input == 0) {
    *out_len = 0;
    return malloc(8);
  }
  x1 = x2 = 1;
  r = 0;
  k = input;
  while (x2 < k) {
    z = x1 + x2;
    x1 = x2;
    x2 = z;
  }
  while (k) {
    k = k - x1 <= x2 - k ? k - x1 : x2 - k;
    r -= -1;
    while (x1 >= k) {
      z = x2 - x1;
      x2 = x1;
      x1 = z;
    }
  }
  m = g((i(input + 0.5) + 0.34948500216) / 0.20898764025);
  long fib[m];
  long *result = malloc(8 * r);
  fib[0] = 0;
  fib[1] = 1;
  for (z = 0; z < m - 2; z++)
    fib[z + 2] = fib[z + 1] + fib[z];
  q(fib, result, z + 1, input);
  *out_len = r;
  return result;
}

Having read the q function, we can easily identify the obvious parts - guarding against the zero case, setting the results, pregenerating the fibonacci cache and calling the solver, so let’s ignore these parts for now:

long *entry(long input, int *out_len) {
  long x1, x2, z, r, k, m;
  /* SNIP */
  x1 = x2 = 1;
  r = 0;
  k = input;
  while (x2 < k) {
    z = x1 + x2;
    x1 = x2;
    x2 = z;
  }
  while (k) {
    k = k - x1 <= x2 - k ? k - x1 : x2 - k;
    r -= -1;
    while (x1 >= k) {
      z = x2 - x1;
      x2 = x1;
      x1 = z;
    }
  }
  m = g((i(input + 0.5) + 0.34948500216) / 0.20898764025);
  long fib[m];
  long *result = malloc(8 * r);
  /* SNIP */
}

Having analysed the q function, it’s fairly easy to guess that the block of code between initialisation of m and the first /* SNIP */ is simply counting the amount of fibonacci numbers in the decomposition of input - it wasn’t really needed, but I did it only to precalculate the amount of space required in the result vector. Hopefully it added a little quirkiness to the code.

Given $k$, the decomposition of it is simply a pair of multisets $P_{+}$ and $P_{-}$ for which the following holds:

$$ \sum_{i \in P_{+}}^{} fib_{i} - \sum_{i \in P_{-}}^{} fib_{i} = k $$

To justify my solution, I’m providing a few theorems that back it, but the proofs of them have been left as an exercise to the reader.

Assuming $ A = P_{-} \cup P_{+} $ and $m$ as the multiplicity function, the multiset pair isn’t optimal if:

$$ \mathop{\exists}\limits_{i}^{} i \in P_{-} \wedge i \in P_{+} \\
\mathop{\exists}\limits_{n \in A}^{} m_A (n) > 2 \\
\mathop{\exists}\limits_{i \in N_{+}}^{} \left(i \in P_{+} \wedge i + 1 \in P_{+}\right) \vee \left(i \in P_{+} \wedge i + 1 \in P_{-}\right) \vee \left(i \in P_{-} \wedge i + 1 \in P_{+}\right) \\
\mathop{\exists}\limits_{i \in N_{+}}^{} i \in P_{-} \wedge i + 2 \in P_{+} $$

It is known that:

$$ \mathop{\forall}\limits_{n \in N_{+}}^{} \mathop{\exists}\limits_{P_{-}, P_{+}}^{} \left( \sum_{i \in P_{+}}^{} fib_{i} - \sum_{i \in P_{-}}^{} fib_{i} = n \right) \wedge \left(\mathop{\forall}\limits_{i \in A}^{} m_A (i) = 1\right) \\
fib_n \leq k \le fib_{n+1} \implies \mathop{\exists}\limits_{P_{-}, P_{+}}^{} \mathop{\forall}\limits_{m \in A}^{} m \leq n + 1 \\
fib_n \leq k \le fib_{n+1} \implies \mathop{\exists}\limits_{P_{-}, P_{+}}^{} \left( \sum_{i \in P_{+}}^{} fib_{i} - \sum_{i \in P_{-}}^{} fib_{i} = n \right) \wedge n \in P_{+} \wedge n + 1 \in P_{+} \\
fib_n \leq k \le fib_{n+1} \wedge k - fib_n \leq fib_{n+1} - k \implies \mathop{\exists}\limits_{P_{-}, P_{+}}^{} \left( \sum_{i \in P_{+}}^{} fib_{i} - \sum_{i \in P_{-}}^{} fib_{i} = n \right) \wedge n \in P_{+} \\
fib_n \leq k \le fib_{n+1} \wedge k - fib_n \gt fib_{n+1} - k \implies \mathop{\exists}\limits_{P_{-}, P_{+}}^{} \left( \sum_{i \in P_{+}}^{} fib_{i} - \sum_{i \in P_{-}}^{} fib_{i} = n \right) \wedge n + 1 \in P_{+} $$

These theorems allow us to define the computational complexity of the decomposition length checking function as $O(\log n)$.

The final part of the code that begs for an analysis follows:

m = g((i(input + 0.5) + 0.34948500216) / 0.20898764025);

We notice that m is the size of the fibonacci cache, so this formula must determine the count of fibonacci numbers smaller or equal to input. Let’s use meaningful function names here:

m = ceil((log10(input + 0.5) + 0.34948500216) / 0.20898764025);

Let’s start with Binet’s formula, and then try to find the formula for the index of a given fibonacci number.

$$ F_{n} = \frac{\phi^n - {(-\phi)}^{-n}}{\sqrt{5}} \\
n = \log_{\phi} (\sqrt{5} (y + \frac{1}{2})) \\
n = \frac{\log_{10} (\sqrt{5} (y + \frac{1}{2}))}{\log_{10} \phi} \\
n = \frac{\log_{10} \sqrt{5} + \log_{10} (y + \frac{1}{2})}{\log_{10} \phi} \\
n \approx \frac{0.34948500216 + \log_{10} (y + \frac{1}{2})}{0.20898764025} \\
$$

Obviously, input might not be a fibonacci number in our case, but I’ve performed tests that show that this approximation coupled with rounding up the result are good enough to work without problems.

Having seen how this formula was made provokes a thought that in the end, I could have just let the logarithm function be $\log n$ instead of $\log_{10} n$, and then used base $e$ throughout my Binet’s formula transformation. Since the contest was _code guessing_, I wanted to make my code as obfuscated as possible, and I figured out that $\log_{10} n$ could be trickier to figure out. Also it introduced a new, menacing constant, so it was good enough.


  1. Not b + 1, since b is signed and the most significant bit being set means that the number is negative (2s complement). This behavior is platform-dependent, but it works on most machines. ↩︎

  2. This has an interesting consequence - ~-x decrements a number, while -~x increments it. ↩︎