Decoding Floats

One caveat with implicitly converting a large integer into a float is that you lose precision. In the example below you can clearly see that something is amiss.

1
2
3
4
5
6
7
8
9
10
#include <stdio.h>

int main (void) {
    int x = 100000001;
    float y = x;
    int z = (int) y;
    printf("x: %d\n", x);
    printf("y: %f\n", y);
    printf("z: %d\n", z);
}

The preceding program should output the following.

1
2
3
100000001
100000000.000000
100000000

But why does this happen? In this post we’re going to jump down the rabbit hole and see if we can learn more. First, copy and compile this program with the -g flag so that debugging information will be generated.

1
gcc -g -o example example.c

Now we’re going to inspect the example binary with GDB. We need to read out the raw bytes of each variable. And since an int and a float are both 32 bits/4bytes, we can use the x/4b command to read out the first 4 bytes of each variable.

1
2
3
4
5
6
7
8
9
10
11
12
gdb example
(gdb) break main
(gdb) run
(gdb) next
(gdb) x/4b &x
0x7fff5fbff3fc: 0x01    0xe1    0xf5    0x05
(gdb) next
(gdb) x/4b &y
0x7fff5fbff3f8:   0x20    0xbc    0xbe    0x4c
(gdb) next
(gdb) x/4b &z
0x7fff5fbff3f4:   0x00    0xe1    0xf5    0x05

I’m on a little endian machine, so I’m going to reverse the bytes to make things easier to read and understand. And I’m also going to go ahead and convert our bytes from hexadecimal to binary.

1
2
3
4
5
6
7
8
9
10
11
12
13
def byte_to_bits(byte):
    return bin(byte).replace("0b", "").zfill(8)

def bytes_to_bits(bytes):
    return "".join(map(byte_to_bits, bytes))

x = [0x05, 0xf5, 0xe1, 0x01]
y = [0x4c, 0xbe, 0xbc, 0x20]
z = [0x05, 0xf5, 0xe1, 0x00]

x_bits = bytes_to_bits(x)
y_bits = bytes_to_bits(y)
z_bits = bytes_to_bits(z)

x and z are ints, so their layout in memory is very straightforward. The bytes literally represent our decimal number in binary.

y is a float, and is laid out in a different manner. Now, remember that I reversed the bytes earlier. This will come in handy now, and will make deciphering a binary float much easier.

An IEEE 754 binary32 single precision float will consist of 1 sign bit, 8 exponent bits, and 23 mantissa bits.

What’s a mantissa? The significand of a floating-point number.

1
2
3
sign     = y_bits[0]
exponent = y_bits[1:9]
mantissa = y_bits[9:]
1
2
3
sign:     0
exponent: 10011001
mantissa: 01111101011110000100000

Below you’ll see that the the mantissa portion of y lines up perfectly with x and z starting at the 6th bit. But how exactly does the compiler truncate x into the 23-bits that make up the mantissa portion of a float?

1
2
3
print "x -> ", x_bits
print "y -> ", "______" + mantissa + "___"
print "z -> ", z_bits
1
2
3
x ->  00000101111101011110000100000001
y ->  ______01111101011110000100000___
z ->  00000101111101011110000100000000

After a bit of Googling I found a simple algorithm to convert a binary int into a binary float. The algorithm is as follows.

  • set sign to 1 if negative else 0
  • shift bits left until a 1 is encountered, and count # of shifts
  • mantissa = bits[# of shifts : 23 + # of shifts]
  • exponent = 127 + (32 - # of shifts)

So if we apply this algorithm to our input of int x it results in 6 shifts. And the first 23-bits from that shift point gives us our mantissa. These results match up perfectly with the raw bytes that we read from GDB.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
def encode_float(bits):
    sign = str(0 if int(bits, 2) else 1)
    position = 0
    shifts = 0 
    while (1):
        shifts += 1
        if bits[position] == "1":
            break
        position += 1
    mantissa = bits[shifts:23 + shifts]
    exponent = bin(127 + (32 - shifts)).replace("0b", "")
    return sign, exponent, mantissa

print x_bits
print "______{2}___".format(*encode_float(x_bits))
1
2
00000101111101011110000100000001
______01111101011110000100000___

This shows how magnitude is preserved and precision is lost. The algorithm takes 23-bits starting from the most significant bit, resulting in a loss of 3-bits of precision. If the integer were larger the loss would be greater.

On Taking Notes

I’m trying to make a habit of taking notes on things that I read. My approach is quite simple. Whenever I encounter something important I rephrase it as a “dumb” question and then answer it in my own words. I’ve found that this practice disengages my natural inclination to read on auto-pilot and has massively improved my retention and comprehension. What’s also interesting is that this technique has often led me down curious and unexpected intellectual avenues.

Take for instance this popular quote from Einstein, which I encountered while reading Influence by Robert Cialdini, a popular-science book on persuasion.

“Everything should be made as simple as possible, but not simpler.”

I started out by asking myself the question, what did Einstein really mean? When you think about it this quote is very vague and it’s hard not to add your own meaning into the mix. And as it turns out this quote is actually paraphrased from the following text, which was taken from a lecture that Einstein gave.

“It can scarcely be denied that the supreme goal of all theory is to make the irreducible basic elements as simple and as few as possible without having to surrender the adequate representation of a single datum of experience.”

Whoa, talk about recontextualisation. And then again, a few pages later Cialdini reinterprets a quote from Alfred North Whitehead’s An Introduction to Mathematics as a general statement on the complexity of modern life.

“Civilization advances by extending the number of important operations which we can perform without thinking about them.”

Shocking, a book on persuasion is written in a persuasive style. But I digress … my point is that the act of taking notes, annoying as it may seem, can catapult you into thinking about all sorts of odd things if you let it.

Hacker School Epilogue

Hacker School ended yesterday, and I’ve got a lot on my mind. It was a magnificent experiment in learning, and Hacker School will always hold a place in my heart. Hacker School blew my mind in so many ways that it’s almost indescribable the feeling that I have right now.

The diversity of people and guests was astounding, covering almost every nook and cranny of programming. If you had a question about anything, there was bound to be someone in the batch who had some experience, no matter the topic. We had longtime hackers, globe-trotting hackers, newbie hackers, low-level Linux kernel hackers, games and graphics hackers, crackers, data scientists, physicists, mathematicians, Excel-gurus, artists, and more!

But the most important thing that I learned at Hacker School was how to dive deep. And that focusing all of your energies on one topic or book will be exponentially more fruitful then flitting from one blog post to the next.

hacker school

I’m quitting my job, and going back to school.

Armin Hofmann Exercise

… it should be remembered that the line is the visible trace of a moving dot.
— Armin Hofmann, Graphic Design Manual

Armin Hofmann is a Swiss graphic designer, and the author of a slim little textbook entitled, Graphic Design Manual. It’s a bit cryptic as far as textbooks go, but Hofmann does show you an interesting method for exercising your design skills.

Hofmann never explicitly says do this, or do that. But for any given exercise Hofmann usually begins with an initial figure and either an explicit, or implied grid. Hofmann will then compose a design on top of this figure using only a few simple shapes.

To start off I’ve decided to use the initial figure from exercise No. 26, which is a black square with an explicit grid. I will limit my choice of shapes down to the dot.

After a few false starts I was able to come up with this interesting pattern of dots. It does have a nice rhythm to it, but it’s too busy and kind of boring compositionally.

To make it more striking I reduced the pattern down to just the center column. It’s compositionally stronger, but now the grid is overpowering the design and making it feel rigid and flat.

After ditching the grid and applying a bit of tone, I feel like it’s finally working.

Linked List in JavaScript

Last week Eli, a fellow OkCupid-ian, showed me how to implement and reverse a singly linked list in C. Mind blown. I was able to follow along for the most part by stepping through it, line by line, with GDB. But I still felt like I wasn’t getting it. I especially had some difficulty grasping how to reverse one with just pointers. To remedy this I’m going to implement a singly linked list in JavaScript, and also try to explain how to reverse one in place.

Since JavaScript doesn’t have structs or pointers, I’m just going use whatever is laying around. Each node will be represented by an object literal, with two parameters. The data parameter will hold the data for that node, and the next parameter will be a reference to the next object in the linked list. For convenience and simplicity I’ve written a node constructor, which does very little other than enforce node-ness.

1
2
3
4
5
6
function Node (data) {
    return {
        data: data,
        next: null
    };
}

Node what? Well, a singly linked list is series of nodes strung together by pointers, or in our case references to objects. We can build nodes, but now we need a way to string these nodes together. Right down below is a bare-bones implementation, which gives us just enough functionality to build and print a linked list.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
function LinkedList () {
    this.head = null;
}

LinkedList.prototype.prepend = function (data) {
    var curr = Node(data);
    curr.next = this.head;
    this.head = curr;
    return this;
};

LinkedList.prototype.print = function () {
    var curr = this.head,
        res = [];
    while (curr) {
        res.push(curr.data);
        curr = curr.next;
    }
    res.push("null");
    console.log(res.join(" -> "));
    return this;
};

Let’s give it a whirl, and create a linked list. Now remember to use the new keyword or else you won’t be able to access any of the functions that were added to the function prototype. Also take note that these functions are chainable. This is possible because the this keyword is being returned at the end of each function.

1
2
3
var list = new LinkedList();

list.prepend(3).prepend(2).prepend(1).print();

You should see the following output in your JavaScript console.

1
1 -> 2 -> 3 -> null

Woo hoo! But where it starts to really get fun is when you try to reverse a singly linked list in place. If you aren’t familiar with the jargon, to do something “in place” means that you are not allocating any extra storage space to transform your input.

I’ll just cut to the chase. The big secret to reversing a singly linked list, in place, is that all you have to do is walk the list and flip each node’s pointer, or object reference in our case, in the opposite direction. Pointers FTW!

1
2
3
4
5
6
7
8
9
10
11
12
13
14
LinkedList.prototype.reverse = function () {
    var curr = this.head,
        prev = null;
    while (curr) {
        var next = curr.next;
        curr.next = prev;
        prev = curr;
        curr = next;
        if (next) {
            this.head = curr;
        }
    }
    return this;
};

Let’s reverse the previous list, and see if it works.

1
list.reverse().print();

It’s magic, you know.

1
3 -> 2 -> 1 -> null

#c0ffee

Convert the following hexadecimal color codes into RGB.
1. #c0ffee
2. #bada55

First, let’s figure out how to make some #c0ffee.

1
2
3
4
#   <- Convention to notate a hexadecimal number  
c0  <- Red value  
ff  <- Green value  
ee  <- Blue value  

We can disregard the “#” symbol because it’s irrelevant for our conversion, and what we’re left with are three hexadecimal numbers that represent the values for red, green and blue.

JavaScript also has its own convention for notating hexadecimal numbers, which is “0x”. And if you take each of the values from above, and enter them into your console with this prefix, the console will convert them into decimal for you just like magic!

1
2
3
4
5
6
0xc0  
192  
0xff  
255  
0xee  
238  

So, the solution for #c0ffee would be 192, 255, 238.

Now that we know how to cheat, let’s go ahead and convert one by hand. Take #bada55 for example. We know that we can disregard the “#” symbol since it’s simply a convention. And what we have left are the values, “ba”, “da” and “55”.

If you are new to hexadecimal notation it can be very confusing to see letters and numbers mixed together. Take a look at the chart below and you’ll see that it’s actually pretty intuitive and easy to understand how hexadecimal letters are related to decimal numbers. Decimal numbers are on the left, and hexadecimal are on the right.

Basically, decimal and hexadecimal are two ways to represent numbers. The key difference is that decimal uses ten symbols (0-9), whereas hexadecimal uses sixteen symbols (0-f) to accomplish the same task.

1
2
3
4
5
6
7
8
9
10
11
0   0       11  b       22  16
1   1       12  c       23  17
2   2       13  d       24  18
3   3       14  e       25  19
4   4       15  f       26  1a
5   5       16  10      27  1b
6   6       17  11      28  1c
7   7       18  12      29  1d
8   8       19  13      30  1e
9   9       20  14      31  1f
10  a       21  15      32  20

The number of symbols used in each notation also has an effect on the place values. In decimal, also known as base 10, each place value is a power of 10. But in hexadecimal, also known as base 16, each place value is a power of 16.

Let’s compare the place values between decimal and hexadecimal.

1
2
3
1       1
10      16
100     256

In decimal, the number 186 is actually short hand for saying one 100’s, eight 10’s and six 1’s. Similarly in hexadecimal, the number “ba” is actually short hand for saying “b” 16’s and “a” 1’s. If we substitute “b” and “a” for their decimal equivalents we get eleven 16’s and ten 1’s. And with a little arithmetic we can convert “ba” into a decimal.

1
2
3
ba = b  * 16 + a  * 1
ba = 11 * 16 + 10 * 1
ba = 186

Let’s go through and convert the last two hexadecimal numbers of #bada55.

1
2
3
4
5
6
da = d  * 16 + a  * 1  
da = 13 * 16 + 10 * 1
da = 218  

55 = 5 * 16 + 5 * 1  
55 = 85  

And the solution for #bada55 is 186, 218, 85.

The Little Satanic JavaScripter

I’ve just started reading The Little Schemer and it’s a wonderful book with an interesting approach and tone, where each concept is explained with a series of playful questions and answers.

And it occurred to me how fun and useless it would be to apply this to JavaScript, with a bit of a satanic bent. You’ll need to open up the JavaScript console in your evilest browser to follow along.

Is it true that this is a string?

“satan”

True, because “satan” is a string of characters, and also the name of our dark overlord.

1
2
> typeof "satan"
  "string"

Is it true that this is a number?

666

True, because 666 is a string of digits, and also the Number of the Beast.

1
2
> typeof 666
  "number"

Is it true that this is a number?

“666”

False, because “666” is a string of digits enclosed by quotes.

1
2
> typeof "666"
  "string"

Is is true that a is equal to b?

Where a is 666 and b is “666”

True, when using the Equality operator (==) because it will convert the operand types to match before comparing them. In this case 666 is converted to a string, so the expression equates to true.

1
2
> 666 == "666"
  true

False, when using the Identity operator (===) because it will not convert the operand types to match before comparing them. In this case the types do not match, so the expression equates to false.

1
2
> 666 === "666"
  false

Is it true that this is an object?

[ ]

True, because [ ] is an empty Array.

1
2
> typeof []
  "object"

Is it true that this is an Array?

[ ]

True, because [ ] is constructed with the Array constructor.

1
2
> ([]).constructor.name
  "Array"

Is it true that this is an object?

{ }

True, because { } is an empty Object.

1
2
> typeof {}
  "object"

Is it true that this is an Array?

{ }

False, because { } is constructed with the Object constructor.

1
2
> ({}).constructor.name
  "Object"

Is it true that this is a function?

1
2
3
function hail() {
    return "Hail Satan!"; 
}

True, because calling hail evaluates a block of one or more statements.

1
2
3
4
> hail();
  "Hail Satan!"
> typeof hail
  "function"

Write your own function to pay worship to the king of darkness.

Radix Sort

What is the most beautiful idea that you’ve come across in computer science?

About a month ago someone asked me this question, and being the computer science noob that I am, I had nothing to say. His answer though, was the radix sort. He tried in vain to explain it to me, but I just couldn’t get it.

So, today I’m going to try and implement, as best I can, a radix sort in JavaScript. I started off by trying to read the Wikipedia article, but I quickly got lost. I eventually found this simple tutorial video, and this is what I’ll be basing my implementation on.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
Array.prototype.radix_sort = function () {
    var queue, digits;
    queue = function () {
        return [
            [], [], [], [], [], [], [], [], [], []
        ];
    };
    digits = function (n) {
        var count = 0;
        if (n < 0) {
            n = -n;
        }
        while (n) {
            count++;
            n = Math.floor(n / 10);
        }
        return count;
    };
    var t = this,
        q = queue(),
        max = digits(t.max()),
        m = 10,
        d = 1;
    for (var i = 0; i < max; i++) {
        for (var j = 0, size = t.length; j < size; j++) {
            var k = Math.floor((t[j] % m) / d);
            q[k].push(t[j]);
        }
        t = q.flatten();
        q = queue();
        m *= 10;
        d *= 10;
    }
    return t;
};

Array.prototype.max = function () {
    for (var i = max = 0, size = this.length; i < size; i++) {
        if (this[i] > max) max = this[i];
    }
    return max;
};

Array.prototype.flatten = function (result) {
    var result = result ? result : [],
        is_array = function (o) {
            return toString.call(o) == "[object Array]";
        };
    for (var i = 0, size = this.length; i < size; i++) {
        if (is_array(this[i])) {
            if (this[i].length) {
                result = this[i].flatten(result);
            }
        }
        else {
            result.push(this[i]);
        }
    }
    return result;
};

Arithmetic, Geometric & Harmonic Mean

This week at work I’ve been making histograms for various user related statistics. (e.g. Match percentage distribution) It was fascinating to learn that there are numerous ways to calculate the average of a given data set. And each method can correct or even skew the final result in a substantial way.

Let’s dive in and explore the arithmetic, geometric and harmonic means by first defining them and then implementing them in Haskell.

Arithmetic Mean

The arithmetic mean is the sum of the numbers divided by the size of the set.

a[1] + a[2] + … a[n] / n

Geometric Mean

The geometric mean is the product of the numbers to the power of one over the size of the set.

a[1] * a[2] * … * a[n] ^ 1 / n

Harmonic Mean

The harmonic mean is the size of the set divided by the sum of the reciprocal of each of the numbers.

n / (1 / a[1]) + (1 / a[2]) + … + (1 / a[n])

1
2
3
4
5
6
7
8
9
10
11
import Data.List

arithmeticMean :: (Real a) => [a] -> Double
arithmeticMean xs = realToFrac (sum xs) / genericLength xs

geometricMean :: (Real a) => [a] -> Double
geometricMean xs = realToFrac (product xs) ** (1 / genericLength xs)

harmonicMean :: (Real a) => [a] -> Double
harmonicMean xs = genericLength xs / 
        realToFrac (sum $ map (recip . realToFrac) xs)