Announcement

June 25, 2009

I started a non-technical blog. Well, actually I started it a long time ago. But I think I’m going to maintain it seriously from now on. You can read it at: http://insipidmusings.wordpress.com/.


Beautiful Code: Code == Data

December 19, 2008

I found this in an exercise in SICP. Instead of the original Scheme, which IMHO is a little difficult to read, I’ve translated it to Haskell.


pair x y = \acc -> (acc x y)
first pair = pair (\x y -> x)
second pair = pair (\x y -> y)

Haskell tends to look a little discouraging at first, but its actually quite straightforward. \x -> expr is how you create a lambda in Haskell. Even though Haskell is statically typed, note that we haven’t specified any types at all. This is because Haskell’s system of type inference is quite powerful.

That said, what does the above do? The function names should make it clear that we’re somehow providing a function that creates a pair, and the first and second functions are accessor methods for the elements of the pair. For instance, they’d be used like this:


*Main> let n1 = pair 10 20
*Main> first n1
10
*Main> second n1
20

Question is: how does this work? Lets look at at the pair function carefully. It’s returning a function that accepts a single parameter acc, and calls acc with the parameters x and y.

The function first accepts a single argument pair as input, and calls pair with an argument that is a function. This function accepts two arguments, and returns the first of those two arguments.

It should all make sense now. pair 10 20 returned a function that would accept a function as parameter, and call that function with the parameters 10 and 20. The functions first and second second call this function (the returned value of pair) with a function as a parameter such that this passed function (the lambda’s in definitions of first and second) returns the appropriate element of the pair.

Pretty cool, no?


Beautiful Code – A Prime Sieve

October 24, 2008

I saw Rob Pike’s excellent Google Tech Talk on “Concurrency and Message Passing in Newsqueak” a few days back. The talk has some very interesting ideas on dealing with parallelism and concurrency. I strongly recommend that you watching it.

I couldn’t resist translating the prime sieve mentioned in the talk into Python. :) Here it is:


import sys

def filter_multiples(n, numbers):
    return (k for k in numbers if k % n != 0)

def prime_sieve(max):
    p, n = 2, xrange(2, max+1)

    while True:
        yield p
        n = filter_multiples(p, n)
        p = n.next()

def main(argv):
    try:
        n = int(argv[1])
    except (IndexError, ValueError):
        n = 100
    for i in prime_sieve(n):
        print i,
    print

if __name__ == '__main__':
    main(sys.argv)

blog.hibernate()

September 6, 2008

It turns out the anonymous commentor from IISc was right. I can’t keep a blog and keep up with course work at IISc. So, somewhat unfortunately, I will have to put this blog into a hibernate mode. 

I will try and post interesting things as and when possible, but I don’t see that happening too often in the near future.

Before I go, here is an interesting way of finding fibonacci numbers that I found in SICP.

The trick is to think of calculating one fibonacci number from the previous as a transformation of a vector in R^{2} . We’re going from (a b) -> (b a+b). But this can be expressed as f’ = Af, where A is a 2×2 matrix and f is a vector. The trick now is to chain many of these transformations together, and then use the method of exponentiation by squaring to calculate the n-th power of A, giving us a method to calculate the n-th fibonacci number in O(log n) time.

Pretty cool, eh?


Hello, World!

August 3, 2008

This post is an attempt to dispel all rumors suggesting that I’m dead, have lost the ability to write, and/or have given up writing this blog. All three assertions are baseless; I’m alive, kicking and have some interesting posts coming up over the next few weeks.

It all started when I decided to interview for the M.S programs at IIT Madras, and IISc. So I had to spend a few weeks preparing for the interviews. Then I went to the interviews and found out that what I thought I knew was a lot lesser than what I thought it was! Eventually, I ended up being accepted at both places and have now joined the M.Sc (Engg) program at SERC, IISc. Between preparing for the interviews, doing some background research on what I want to do in grad-school, and keeping up with work at NI, I couldn’t find any time at all to blog. I am hoping that the move to IISc will give me some more time to focus on this blog.

In between, I bought a Dell Vostro 1510. I wanted a machine that had a decent processor, lots of RAM and HDD space – this one has all three – a Core 2 Duo T8100, 4 GB of RAM and a 250 GB hard drive. I decided to run Ubuntu 8.04 on this, and had a lot of trouble getting the network card to work. It turns out you have to remove the r8169 driver that is installed by default, download the Realtek r8168 driver from their website, compile it from source and install it. The worst part of the driver nightmare was that the default shipping driver randomly works, instead of not working at all, and because of this I got fooled into thinking that passing pci=nomsi to the kernel at boot time fixes this. It doesn’t. You have to do the r8168 compile and install thing. See here for details.

Anyway, that’s all for now. I promise to get back to the functional programming series very soon. Ciao!


Alex Papadimoulis on the Mechanics of Quitting

May 17, 2008

Employees – especially the most talented ones – are not “dating around” and moving from place to place in search of the Perfect Company at which they can grow old and retire at. They’ve already aced the first four rungs of Maslow’s hierarchy and are in search of self-actualization: the instinctual need of humans to make the most of their abilities and to strive to be the best they can.

This point bears repeating. Indefinite retention is impossible; employees always quit. The key part is understanding why, and how to leverage this inevitability towards everyone’s advantage.

Alex Papadimoulis of The Daily WTF fame has written a somewhat provocative and deeply insightful article on the challenges of retaining technical talent. If you are a “techie”, and you haven’t read it yet, you really should.


Functional Programming 101: Lists and Closures

May 14, 2008

This continues the series I started on functional programming, initially posted at blogger and now ported to WordPress.

Lists and List Comprehensions

A key feature of functional programming languages are their list data structures. In many languages the list can store items of disparate data types, and language provides a number of powerful primitives that manipulate these lists.

The primitive we’re going to look at today is called a list comprehension. A list comprehension is similar to how we’re used to specifying sets in mathematics. If I wanted to A define as the set of all integers between 1 and 999 which are divisible by 3, I’d write:

A = \{\;x\:|\:1 < x \leq 999 \: where \: x \,mod \,3\,=\,0\;\}

See how similar this to the Python code that uses a list comprehension to generate the same set.

X = range(1, 1000)
A = [ x for x in X if x % 3 == 0 ]

Very elegant, isn’t it?

The Reduce Function

The reduce builtin function takes as arguments a function and a list (any sequence actually, if you want to be pedantic) and applies the function to the first two elements of the list, it then calls the function with the result of the first call and the third element, and so on.

That was confusing, so lets take an example.

def add(x,y): return x+y
print reduce(add, [1, 2, 3, 4, 5]

The reduce call in this program translates to a set of operations like this:

r1 = add(1, 2)
r2 = add(r1, 3)
r3 = add(r2, 4)
r4 = ad(r3, 5)
return r4

You can think of reduce as applying a binary operator recursively on a vector to produce a scalar. The interesting thing about reduce is that in the case of an associative operator, you can easily parallelize the operation.

Example 1: Problem 1 from Project Euler

‘Nuff talk! Lets put our newly learned skills to test in solving Project Euler’s first problem.

Find the sum of all the multiples of 3 or 5 below 1000.

The problem is trivial to solve imperatively, but our goal is to write this using as many functional constructs as possible.

Instead of solving it for the specific case of integers between 1 and 1000, lets generalize the problem and solve for the arbitrary range a \le \, x < b .

def sum35multiple(a, b):
  L = range(a, b)
  C = lambda n: n%3 == 0 or n%5 == 0
  S = [ x for x in L if C(x) ]
  return reduce(lambda x,y:x+y, S)

Note the trick we play in lines 3 and 5. Line 3 creates a local variable that holds a reference to an anonymous function and line 5 is the place where it is invoked. Combined with closures, this can be used to implement some powerful idioms. We’ll see one example of that when we get around to currying.

Closures

A closure is a somewhat more complicated concept and it will probably take more than one blog entry to get through. So, in this post, we’re going to dive in and use the damn thing before trying to understand it works.

Again our target is a problem from Project Euler – Problem 6.

The sum of the squares of the first ten natural numbers is,

1² + 2² + … + 10² = 385

The square of the sum of the first ten natural numbers is,

(1 + 2 + … + 10)² = 55² = 3025

Hence the difference between the sum of the squares of the first ten natural numbers and the square of the sum is 3025 − 385 = 2640.

Find the difference between the sum of the squares of the first one hundred natural numbers and the square of the sum.

This problem is also rather trivial to solve both imperatively as well as mathematically. But for the purposes of this post, we’re going to take a hybrid semi-computational solution. We need to get through a little math before we get down to code.

Our goal is to find:

D(n)\; = \; {( 1 + 2 + 3 + ... + n)}^{2} - ({1}^{2} + {2}^{2} + {3}^{2} + ... + {n}^{2})

I’ll skip the details here, but it is easy to show that:

D(n)\;=\;\sum_{i=1}^{n}(S(n)-i)\cdot i \; \; \; where \; \; \; S(n) \; = \; \sum_{j=1}^{n} j

Here is straightforward translation of this equation into python code.

def D(n):
    Sn = n*(n+1)/2
    i = range(1, n+1)
    d = map(lambda x,y: (Sn - x)*y, i, i)
    return reduce(lambda x,y: x+y, d)

I’ve introduced a new built-in function in this code, the map function. The map function takes a variable number of arguments, the first of which is always a function, and the remaining all sequences. It iterates over each sequence “in-step” and returns a new sequence that is generated by applying the function on each element of the input sequences.

Again, this is the kind of thing that is best explained with an example.

def print_add(x, y):
    print "adding", x, y
    return x+y

print map (print_add, [1, 2, 3], [4, 5, 6])

Running this snippet produces the output:

adding 1 4
adding 2 5
adding 3 6
[5, 7, 9]

With that explanation, it should be clear that the function call to map in the function D is simply to calculate each term of the first summation in the equation for D(n).

I guess you’re wondering how is all this related to a closure. In fact, I’ve sneaked a closure into the function D(n). :)

Look at line 4 again, I’m using the local variable Sn inside of the anonymous function. Notice how this is fundamentally different from a function that you could create in a C-like language. The function created by the lambda is not even completely defined until line 2 in the function D(n) calculates Sn. Conceptually, we’re creating a different function to pass to map during each invocation of D(n). One of the reasons why people claim that functional languages afford large gains in productivity and elegance of expression is this ability to “close state” and create functions at runtime.

There’s a bit more to this business of passing functions around that I want to cover, but I hope I’ve whetted your appetite enough for now. Until the next post, tata, bye-bye and let me know in the comments how you’re liking/disliking this.


Functional Programming 101: Lambda Forms

May 14, 2008

(This entry was first posted on my ex-blog over at blogger).

In the Beginning

Lets use a simple example: I have a list of integers and I want to create another list that contains only those elements from this list that are greater than 5. The straightforward way to write this is like below:

def filter_above_5(l):
  o = []        # initialize empty list
  for num in l: # iterate over list
    if num > 5: o.append(num)
  return o      # return filtered list

I wrote the snippet above in python, but hopefully, it is easy enough to read even if you haven’t seen the language before. [1]

Shown below is example of how you’d use the filter_above_5 function.

print "original list:", l
print "filtered list:", filter_above_5(l)

Simple enough, isn’t it? For reasons that we will come to later, people like to call this an imperative implementation.

Lets Make This Functional

If you think about it, filter_above_5 is a special case of a generic filter function which iterates over an input list and produces an output list that contains all the elements from the input list that satisfy some condition. If you wanted to make things sound complicated, you’d use predicate instead of “some condition” and sequence and sub-sequence instead of input and output list. :-)

Jokes apart, it would be cool if we could write a general function like this:


def filter_list(predicate, l):
  o = []
  for element in l:
    if predicate(element): o.append(element)
  return o

filter_list is a function that takes a list and a function (our predicate) as parameters and iterates over each element of the input list, and adds only those elements to the output list that satisfy the predicate. Then you’d call filter_list like below:

def greater_than_5(x): return x > 5
print "original list:", l
print "filtered list:", filter_list(greater_than_5, l)

It turns out that python already has a builtin function filter that does exactly what our filter_list function does. So our implementation just boils down to calling the that function with our custom predicate:

def greater_than_5(x): return x > 5
print "original list:", l
print "filtered list:", filter(greater_than_5, l)

Notice the difference between our initial implementation and new one. filter_above_5 did not just tell the interpreter what to do, it also specified exactly how to do it using a series of commands. For that reason, we call it an imperative implementation.

Our newest implementation that uses the builtin filter function does not specify how the filtering should be done. For instance, with all this ado about multi-core, it is not far fetched to think of a version of filter that spawns a number of threads and splits up the work of filtering among these threads. The cool thing is, if, say, Python 3.1 implemented such a multi-threaded filtering function, all our code which used the filter function would get the performance improvement for free.

Plus, notice that this version contains much lesser code, and we’re programming at a much higher level of abstraction, not to mention the fact that the new version is more elegant.

All put together, I think it is fair to say, that the implementation that is based on filter is “better” than the imperative implementation.

Notice an important thing about filter – we are passing a function as a parameter to another function. Programming languages which allow this kind of thing – passing and returning functions from other functions, construction of functions at runtime etc. – are said to have first class functions. Now, exactly what set of attributes are required for a language to be deemed to have first class function support is a controversial topic, and so we will side-step that issue here. [2] The important point is passing functions to other functions doing so is a key aspect of functional programming. And since we’re doing this in our filtering snippet, we’re going to call it a functional implementation.

Enter the Lambda

Nothing we’ve done so far can’t be done from, say, C/C++. In C for instance, we could’ve created a filter function that operated on array of void*, and used a function pointer to pass the predicate around. [Similar in spirit to the qsort library function]. In C++, we could have created a templatized version of filter that was allegedly more elegant and worked in more or less the same way, but would claim to be “safer” because it would be type-checked. [3]

Recognize that functions like greater_than_5 are use and throw functions. They are created only for the purpose of being called from the filtering function, and then they are never used again. Unfortunately, despite the fact that these functions are going to be called from exactly one place, they will still be sitting in source code polluting our namespace. What would be cool is a way of creating the function greater_than_5 specifically for the purpose of passing it to filter and then “forgetting about it”.

The solution to this problem is called a lambda function. In python it looks like this:

print "original list:", l
print "filtered list:", filter(lambda x: x>5, l)

Pay attention to the first argument to filter: “lambda x: x>5″. What it means is: create a function that takes one parameter as input – “x”, and returns the value of the expression x>5 and then return that function that just got created. Note that the created function is anonymous. The key benefit now is that if I had to call filter in a hundred different places with 50 different filtering predicates, I needn’t create a 50 predicates similar to greater_than_5 just to pass them to filter. I can create each of these functions at the point at which filter is called.

This means my code will look much cleaner; there will be no namespace pollution and most importantly I will have to write lesser code.

So, in summary lambda forms allow the functional programmer to create anonymous functions that can be passed around to other functions. Or even be stored for later use [although this is somewhat uncommon].

Isn’t this just syntactic sugar?
The thing is, besides the lack of namespace pollution, and some fewer keystrokes to type [not that these aren't important things, but lets play the devil's advocate here], we still haven’t gained much by using the lambda. And the fact is, if these were the only advantages of the lambda, lambda’s would just be cute syntactic sugar.

But there is more – the key advantage of lambda’s is when they are combined with a language that supports closures. We will look at closures in the next post.

In the meantime, if any of the above didn’t make sense, or you have some feedback, please comment below.


[1] If you still run into problems understanding the syntax, please do let me know using the comments.

[2] Point is that it is is not correct to call python a functional programming language. If you want to be safe, it is best to say that Python is a multi-paradigm language with some functional features.

[3] Ensuring const-correctness on your elegant function, making it behave like a STL function so that it works correctly for everything that implements forward_iterators and/or const_forward_iterators and fun times spent debugging the wonderfully cryptic error messages that compilers spit out for templates are but minor prices to pay for that elegance. [I couldn't resist that crack! :-D ]


Why WordPress?

May 11, 2008

I used to maintain this blog at http://phunday.blogspot.com/ but I was getting increasingly frustrated with how hard it was to post code and equations in blogger. Given that most of my content is technical, this is a big deal for me. To make matters worse, I was running into an irritating bug where a word would get deleted if I tried to delete the space preceding it.

The last straw on the proverbial camel’s back was the blogger “feature” of auto-generation of paragraph tags from line breaks. I don’t like the damn feature, but unfortunately all my posts in the past used it. And for some reason beyond my comprehension, the wise guys at blogger have decided to make the feature setting global to all posts. This means I if disable the feature now, all my old posts are rendered incorrectly! Here is the fun implication: unless I am willing to spend a lot of time reformatting my old posts inside the crappy blogger UI, I can’t disable it for my new posts! If that weren’t enough, blogger won’t even give me a sandbox where I can disable the feature, reformat my posts and publish the changes all at once. In fact, forget a sandbox, blogger doesn’t have a real preview feature.

I’ve been hearing lots of good things about wordpress, I like their UI, and I especially like their code formatting and math rendering features. In comparison, the only new feature I’ve used from blogger over the last year is their Google Reader Shared Item Plugin. I don’t count this as a blogger feature, its a Google Reader feature.

From where I stand it appears that the WordPress folks are actually implementing features that I care about. And so, I’m voting with my feet and moving the blog over here. As a service to the few folks (thank you guys!) who actually read my blog, I will continue to update blogger with a new link each time I make a post here.

So that’s that. I hope you understand why I’m moving the blog here. Please do let me know in the comments if you have a opinion on the blogger vs. wordpress thing.


EDIT (15-May-2008): Fixed embarrassing typo. Thanks Suprita!