Friday, November 26, 2010

Currying in Ruby

I've spent the last month playing around with Ruby, mostly because I feel I should. I'd like to share with you some cool things you can do with functions in Ruby 1.9. I discovered them by starting with a question.

In Python, you can do this sort of thing:

def foo():
    return "bar"

x = foo

print x() # "bar"

In Ruby you can call a function without brackets, so you can do:

def foo()
    return "bar"
end

puts foo # "bar"

My question was: how would you pass a function in Ruby? This is one way:

def foo()
    return "foo"
end

x = lambda {foo}
puts x
puts x.call() # "foo"

Starting from these humble beginnings, I wondered if I could use this mechanism to set up a function with arguments that I could call later. The way you do this is quite intuitive.

def add(a,b)
    return a+b
end

y = lambda{add(3,2)}
puts y
puts y.call() # 5

From here it's easy to think that currying in Ruby must be very simple. It is!

def add(a,b)
    return a+b
end

plus = lambda {|a,b| add(a,b)}
curry_plus = plus.curry
plus_two = curry_plus[2]
puts plus_two[3] # 5
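
For comparison, here is a rough sketch of the same idea back on the Python side. functools.partial is part of the standard library; curry2 is a hypothetical helper written only for this illustration, and it handles exactly two arguments rather than being a general equivalent of Ruby's curry.

```python
from functools import partial


def add(a, b):
    return a + b


def curry2(func):
    """Turn a two-argument function into a chain of one-argument functions."""
    return lambda a: lambda b: func(a, b)


# Partial application: fix the first argument now, supply the second later.
plus_two = partial(add, 2)
print(plus_two(3))  # 5

# Hand-rolled currying, closer in spirit to plus.curry in the Ruby snippet.
curry_plus = curry2(add)
print(curry_plus(2)(3))  # 5
```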

Sunday, October 3, 2010

Finding the Fun

I was days away from uploading Geomancy to the App Store when I discovered that I had been using Instruments incorrectly. Contrary to my much-twittered belief that Geomancy was memory-leak free, I discovered that my fledgling app was full of leaks. I immediately set about tracking them down and fixing them, spent a few days learning about memory management in Objective-C, and was able to clean up all the issues. Normally the prospect of tracking down memory leaks would have filled me with dread, but this time I looked forward to it. I wanted to make the app good (half for fear of Apple rejecting my first submission, and half because I was scared of there being some issue that Apple wouldn't find but the reviewing public would). It was the first time I can remember thinking that tracking down memory leaks was fun.

The idea seems to be (and it seems obvious) that you get a lot more out of learning something if you enjoy the process of doing it.

A few days ago I was walking around a small bookshop and ended up in the section with all the self-help books (the shop had called this section "Self Enrichment"). There were loads of titles about trying to enjoy your job. What's important to me is not really my job, it's programming. I love programming, and I love all the meta stuff around programming, like talking about code and thinking about code. We all have days when we wonder what we're doing, but I'm having one of those days where I'm just happy to be where I am, doing what I do, so I'm going to enjoy it.

Tuesday, July 20, 2010

Shared memory and forking in C

Something rather interesting happened at work today. One of my colleagues was going through his bookshelf and found his original copy of The Unix Programming Environment (Kernighan & Pike 1983). Slipped into one of the pages was a slip of paper with an old email on it; the date was December 1985. There was no writing in the email, just a C program that appeared to demonstrate shared memory and fork(). The code took a bit of cleaning up, and I've added some bits to improve it as a demonstration, but here goes.

There are two main elements to this: a simple structure that we're going to change in the child (and see the change in the parent), and a static int that we're going to change in the parent, but not see change in the child.

Our program is going to start by defining our structure and int.
struct msg {
 int i;
 char str[100];
} msg;
static int statInt = 1;
Then create some shared memory to put a single instance of the structure in
key = getpid();
shmflg = IPC_CREAT | 0770;
shmid = shmget(key, sizeof(msg), shmflg);
Next, we need to attach the shared memory block to our process and we'll write some stuff to it.
segadr = (struct msg*)shmat(shmid, 0, 0);
segadr->i = 10;
sprintf((char*)segadr->str,"A message from parent %d\n", getpid());
Now we're going to call fork and depending on the return, we'll know if we are in the child or parent process.
if (fork() == 0) {
    /* Is child*/
}
else {
    /* Is parent */
}
The Child
Our child is now going to print the value of statInt and is also going to attach itself to the shared memory block above, then read from it.
printf("In Child: statInt %d\n", statInt);
shmid = shmget(key, sizeof(msg), 0);
chdadr = (struct msg *)shmat(shmid, 0, 0);
fprintf(stdout, "This is child %d, gets %s\n", getpid(),chdadr->str);
Now our child is going to write back a message
sprintf((char*)chdadr->str, "A message from child\n");
printf("In Child: We've written a message back\n");
And then detach itself from shared memory
if (shmdt(chdadr) == -1) {
    perror("Child - detach");
}
We're now going to sleep for 1 second, then look at statInt one last time, to see if it's changed.
sleep(1);
printf("In Child: statInt %d\n", statInt);
The Parent
Our parent (the block of code in the else) is going to start by changing statInt to 2. Then it's going to wait for the child to finish.
statInt = 2;
printf("In Parent: We just changed statInt to %d\n", statInt);
wait(&stat);
wait() will block until the child dies. At this point we're going to read from the shared block; we don't need to attach again because the parent attached the segment before the fork and never detached it.
fprintf(stdout, "In parent: segadr = %s\n", segadr->str);
Then finally, like good citizens, we're going to mark our shared block to be destroyed.
cmd = IPC_RMID;
if (shmctl(shmid, cmd, buf) == -1) {
    perror("Parent - remove");
}

That's the whole thing done. What you should see (depending on timing) is the parent change statInt, but the child won't see the change. That's because fork() takes a snapshot of the state, so the child won't see any later changes in that state, except by using shared memory. You'll see the child read from the shared block, and the parent will see the change made by the child. This is an example:
In parent: shmid =  40730676
shmget errno Success
shmat errno Success
In Parent: We just changed statInt to 2
In Child: statInt 1
In child: shmid = 40730676
In Child: shmat errno Success
This is child 6751, gets A message from parent 6749

In Child: We've written a message back
In Child: statInt 1
In parent: segadr = A message from child

You can see the full code listing on my public git repo. This includes some debug printing, and some comments about what I had to change in the original to get it working.
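
The same experiment can be sketched in Python. Note that this swaps in a different mechanism: an anonymous shared mmap instead of the System V shmget()/shmat() calls used above, so it is an illustration of the idea rather than a port of the C program. The names demo, write_msg, read_msg, MSG_SIZE and STAT_OFFSET are made up for this sketch, and it assumes a Unix system where os.fork() is available.

```python
import mmap
import os
import time

MSG_SIZE = 64            # bytes reserved for the message
STAT_OFFSET = MSG_SIZE   # one extra byte where the child reports stat_int

stat_int = 1  # ordinary process memory: copied, not shared, by fork()


def write_msg(shared, text):
    """Overwrite the message area, padding with NUL bytes."""
    shared[0:MSG_SIZE] = text.ljust(MSG_SIZE, b"\0")


def read_msg(shared):
    """Return the message with the NUL padding stripped."""
    return shared[0:MSG_SIZE].rstrip(b"\0")


def demo():
    global stat_int
    stat_int = 1
    # Anonymous mapping; on Unix the default flag is MAP_SHARED, so the
    # parent and the forked child see the same bytes.
    shared = mmap.mmap(-1, MSG_SIZE + 1)
    write_msg(shared, b"A message from parent")

    pid = os.fork()
    if pid == 0:
        # Child: wait so the parent has time to change stat_int first.
        time.sleep(0.2)
        shared[STAT_OFFSET] = stat_int   # report the value the child sees
        write_msg(shared, b"A message from child")
        os._exit(0)                      # leave without running parent code

    # Parent: change the plain variable, then wait for the child to finish.
    stat_int = 2
    os.waitpid(pid, 0)
    result = (stat_int, shared[STAT_OFFSET], read_msg(shared))
    shared.close()
    return result


if __name__ == "__main__":
    print(demo())
```

demo() returns the parent's stat_int, the value the child saw, and the message left in the shared block, so the two kinds of memory can be compared side by side.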

Wii Nunchuck for Arduino with Python serial reader

Today I followed the excellent tutorial on Windmeadow about how to read data from a Wii Nunchuck using an Arduino. It's my first real dive into electronics, but I'll do my best to explain.

My setup used an Arduino Duemilanove. The advantage compared to the board used in the Windmeadow post is that the Duemilanove actually has a 3.3V supply (which is apparently what the Nunchuck wants). I already had some male-male jumper wires, so I didn't need to strip apart my Nunchuck; I was able to push the wires into the right places. If you can imagine the back of the Nunchuck connector looking like this:
Clock Empty Ground
Empty
3.3V Empty Data

Then the connections need to be made to the Arduino as below.

Wii                    Arduino
Top Left (Clock)       Analog In 4
Top Right (Ground)     Ground
Bottom Left (3.3V)     3.3V
Bottom Right (Data)    Analog In 5

From this point you should be able to follow the code in the Windmeadow post to get some working firmware. The only alteration I made was to print to serial in a way that was going to be easy to parse:
void print_for_python(int x, int y) {
    Serial.print(0x00, BYTE);    
    Serial.print(x, BYTE);
    Serial.print(0x01, BYTE);
    Serial.print(y, BYTE);
}

I could then write a simple Python script using pySerial:
import serial
import struct
ser = serial.Serial('/dev/ttyUSB1')
ser.baudrate = 19200
print ser.portstr
while True:
    line = ser.read(1)
    b = struct.unpack('<B', line)[0]
    if b == 0x00:
        x = ser.read(1)
        x = struct.unpack('<B', x)[0]
        print 'X: %s' % x
    elif b == 0x01:
        y = ser.read(1)
        y = struct.unpack('<B', y)[0]
        print 'Y: %s' % y
    else:
        print b

This example only uses the joystick, but I've since modified it to read the button presses as well, and I will be testing it out on our company's robot this week. The results I've got seem pretty good; there's no noticeable delay between using the joystick and seeing the results in my Python script.
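
The marker-then-value framing can be tried out without any hardware. The sketch below is a hypothetical helper, parse_stream, written in Python 3 and not part of the script above; it assumes the stream is already aligned on a marker byte. One thing it makes visible is that a joystick value of 0 or 1 looks exactly like a marker, so a reader that joins the stream mid-pair could mislabel the axes.

```python
def parse_stream(data):
    """Yield ('X', value) or ('Y', value) pairs from marker-prefixed bytes."""
    stream = iter(data)
    for marker in stream:
        if marker in (0x00, 0x01):
            value = next(stream, None)
            if value is None:
                break  # stream ended between a marker and its value
            yield ('X' if marker == 0x00 else 'Y', value)
        # Any other byte is not a marker, so skip it.


sample = bytes([0x00, 130, 0x01, 125])
for axis, value in parse_stream(sample):
    print('%s: %s' % (axis, value))
```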




Tuesday, July 6, 2010

A Lesson in talking to customers

Last week we got a visit from some of our suppliers. They had a new version of firmware that they needed to roll out across their existing hardware (it seems a component had changed, which required them to update the firmware). Quite understandably, they had taken this opportunity to add loads more features that they wanted to discuss with us, and to help us through some of the other code changes we needed to make to support this level of firmware.

The trouble was, all of these features were useless to us. It was heartbreaking. I understand the pride you feel when you are showing off a feature that you think is great; I get it a lot. But these features weren't needed by us at all, and now it's too late. It looks like we're being forced into an upgrade, which is going to take a lot of effort on the part of our company (development and roll-out/upgrade costs), and we are unlikely to go with this supplier on the next project we do.

It goes to show: talk to your customers!

Wednesday, April 28, 2010

Updating field sqlite databases

The feature I'm currently working on requires a new column to be added to an existing table in one of our sqlite databases. There are quite a few of these databases in the field, so I wanted to write a routine that would update them automatically. After reading around, I settled on the following simple scheme:

def do_upgrade(self):
    """
    Do upgrade to a field database, anything we need to add to existing databases should be done here

    """
    conn = self.engine.connect()
    try:
        conn.execute("alter table drive_stats add drive_model text") #Update for Sprint 43
    except OperationalError:
        pass
    conn.close()


This uses the "easier to ask forgiveness than permission" scheme: we try to alter the database, and if it raises an error we ignore it.
ALTER will always complete in constant time, so there isn't much of a penalty. We could run a SELECT to check for the existence of the column first, but this would take more time. I spent an hour or so looking for a better way of doing it, but this seemed to be quite widely accepted.
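
One refinement worth considering is to forgive only the error we expect. The sketch below uses the standard library sqlite3 module directly rather than the engine object above, and it is written in Python 3. The helper add_column and the serial column are made up for the illustration, and matching on the "duplicate column name" text relies on the wording of SQLite's error message, so treat that as an assumption to check against your own SQLite version.

```python
import sqlite3


def add_column(conn, table, column, decl):
    """Add a column; return False if it was already there.

    table, column and decl are interpolated into the SQL, so they must
    come from trusted code, never from user input.
    """
    try:
        conn.execute("ALTER TABLE %s ADD COLUMN %s %s" % (table, column, decl))
    except sqlite3.OperationalError as exc:
        if "duplicate column name" in str(exc):
            return False   # already upgraded, nothing to do
        raise              # anything else is a real problem
    return True


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE drive_stats (serial TEXT)")
print(add_column(conn, "drive_stats", "drive_model", "TEXT"))  # True
print(add_column(conn, "drive_stats", "drive_model", "TEXT"))  # False
```

A misspelt table name or a locked database would then still surface as an exception instead of being silently ignored.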

Wednesday, March 31, 2010

The perils of side effects, an example

I've just spent two hours of my life debugging a problem that demonstrates perfectly the problem of side effects in functions. The code in question was fairly unremarkable:


print convert_to_customer_format(data, default_item)
customer_interface.send(convert_to_customer_format(data, default_item))


The print line had been innocently added for debug, but it caused an error in the customer interface component. As the line had been added to assist with debugging that very component, it was a while before the error was traced back to the code above. We log all calls and returns from functions, so it didn't take long to realise that what we were printing was different to what we were trying to send, and suspicion quickly fell on the conversion function:

def convert_to_customer_format(data, default_item):
    #turn into list
    for key in data.keys():
        data[key].insert(0, key)
    data = data.values()
    #pad with empty items
    default_item.insert(0, 'Empty')
    number_of_pads = range(len(data), 16)
    for x in number_of_pads:
        data.append(default_item)
    return data

It's not a very pretty function, but if you run it like so:

from collections import defaultdict

default_item = [0, 0, 0, 0]
data = defaultdict(lambda:default_item)
data['foo'] = [1, 2, 3, 4]
data['bar'] = [5, 6, 7, 8]
x = convert_to_customer_format(data, default_item)
print x
y = convert_to_customer_format(data, default_item)
print y


You'll see that it changes the data each time. You wouldn't have this kind of trouble in Haskell! It got me thinking though: could I make a decorator that would force a function to not have side effects, or at least raise an exception if it did? I think I've done it, and I'd appreciate comments:

import copy

def no_side_effects(func):
    def inner_func(*args, **kwargs):
        # Snapshot the arguments so we can compare them after the call
        pre_call_args = copy.deepcopy(args)
        pre_call_kwargs = copy.deepcopy(kwargs)
        print 'pre_call: %s| %s' % (pre_call_args, pre_call_kwargs)
        result = func(*args, **kwargs)
        print 'post call: %s| %s' % (args, kwargs)
        if args == pre_call_args and kwargs == pre_call_kwargs:
            return result
        else:
            raise Exception('Side effect found: Function altered the arguments')
    return inner_func
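
To see the decorator catch something, here is a small Python 3 sketch of it in use. It drops the debug prints and adds functools.wraps; pad_in_place and pad_copy are hypothetical functions invented for the demonstration. Bear in mind that it only notices changes to the arguments themselves (as judged by ==), so it would not spot a function that alters a global or writes to a file.

```python
import copy
import functools


def no_side_effects(func):
    @functools.wraps(func)
    def inner_func(*args, **kwargs):
        # Snapshot the arguments so they can be compared after the call.
        pre_call_args = copy.deepcopy(args)
        pre_call_kwargs = copy.deepcopy(kwargs)
        result = func(*args, **kwargs)
        if args != pre_call_args or kwargs != pre_call_kwargs:
            raise Exception('Side effect found: Function altered the arguments')
        return result
    return inner_func


@no_side_effects
def pad_in_place(items, size):
    # Hypothetical offender: grows the caller's list.
    while len(items) < size:
        items.append(None)
    return items


@no_side_effects
def pad_copy(items, size):
    # Hypothetical well-behaved version: builds a new list.
    return items + [None] * (size - len(items))


print(pad_copy([1, 2], 4))
try:
    pad_in_place([1, 2], 4)
except Exception as exc:
    print(exc)
```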

Tuesday, March 2, 2010

The danger of assumptions

When it comes to error messages and error recovery, our team is often asked, usually unknowingly, to make an assumption about the current state of the system and how we can recover from that assumed state. From the everyday world, most of us know to avoid making assumptions; as the popular saying goes, they make asses of us. The same is true in the software world, and the cost of making a bad assumption can be just as big.
For one of our customers, unplugging a hard drive in the middle of a test can be destructive. As a result, any drive that is pulled out mid test needs to go through an extra stringent failure analysis which takes time, personnel, and money. But in addition, investigations are launched to determine how a drive could be pulled out mid test. How do we know this has happened? Our customer logs a helpful error message to tell us:

Drive unplugged while test in progress. Stopping all tests...

Fairly good error message yes? Not only does it tell us what went wrong, but it tells us what action we will be taking - "if only all error messages looked like that!" I hear you cry.

What isn't clear is that the error "Drive unplugged while test in progress" is making an assumption. You can't see it because it's in the code.
The assumption in question seems, at first glance, to be perfectly acceptable. The assumption is this:

"If we have made a call to start the test, the test will be running". Again, this assumption isn't very clear in the code itself. As is often the case in imperative programming, it's all about changing the state: Booleans are set in certain places and read in others, and all of this makes it quite difficult to spot these assumptions. So why can't we make this assumption? Because it's based on a very simplistic view of what it means to request (or call) something. In concurrent and distributed systems (of which this is an example) we can't guarantee that a requested action has actually started. Some calls are added to queues and have to wait their turn before being executed, and some messages get lost as they travel around the network.

This means that if a drive hasn't started a test, either because it hasn't got the message yet or because the message is lost and our drive is idle, then we might, as part of our recovery, decide to move it, and BANG. That's when our assumption bites us. There isn't a test running, due arguably to an error on our part that, as good vendors, we want to recover from. But we can't, because an incorrect assumption is being made. From this stems a lot of work to test these drives and investigate how they could be unplugged mid test, all because of a bad assumption in the code.

So what is the solution? We took the approach of telling the user exactly what has gone wrong: if a sensor is in the wrong state, this is all we report; we don't use it to make a guess at the failure mode. You won't see an unexpected sensor value being reported as "Failed to actuate gripper", because the problem might just be a faulty sensor. There is obviously a lot of debate about what a good or bad error message is; I will save this topic for another post.

Thursday, January 21, 2010

Doctest in Python

As part of my mission to learn a new thing each day, I've ended up learning about Python's doctest module today. It lets you add interpreter commands and responses as docstrings for each function and lets you run them, giving you feedback if anything is amiss. Here's an example:
def fib(x):
    """
    >>> fib(0)
    0
    >>> fib(1)
    1
    >>> fib(2)
    1
    >>> fib(3)
    2
    >>> fib(4)
    4
    """
    if x == 0:
        return 0
    elif x == 1:
        return 1
    else:
        return fib(x-1) + fib(x-2)


if __name__ == '__main__':
    import doctest
    doctest.testmod()
Would return:
**********************************************************************
File "fib.py", line 11, in __main__.fib
Failed example:
fib(4)
Expected:
4
Got:
3
**********************************************************************
1 items had failures:
1 of   5 in __main__.fib
***Test Failed*** 1 failures.
It's worth noting that Python doesn't optimise tail calls, and this naive recursive fib() makes an exponential number of calls, so we wouldn't like to try running it on very large numbers, but it's OK for this example.
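
If large inputs do matter, an iterative version sidesteps the recursion entirely. This is a minimal sketch in Python 3; fib_iter is a name invented here and is not part of the example above. It carries its own doctest in the same style.

```python
def fib_iter(x):
    """
    >>> [fib_iter(n) for n in range(6)]
    [0, 1, 1, 2, 3, 5]
    """
    a, b = 0, 1
    for _ in range(x):
        # Slide the pair one step along the sequence.
        a, b = b, a + b
    return a


if __name__ == '__main__':
    import doctest
    doctest.testmod()
```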

Sunday, January 10, 2010

Twitter Added to My GUI

I've now added Twitter functionality to my Blogger GUI using the python-twitter API. One of the things I plan on doing this year is learning one thing a day and posting what I learned on Twitter. I'm not sure how long I can keep it up, but we'll see. You can see how I get on here: My Twitter Page

Monday, January 4, 2010

Websites and Tools

I started playing around with a new website yesterday, trying to learn CSS and JavaScript in preparation for my first Facebook app.
Whilst doing this I realised that making websites seemed very repetitive, and the resulting pages didn't seem very neat.
What I wanted was to get through to adding content as soon as possible, but most content management systems I looked at seemed much bigger and more complicated than I needed.
So I started writing some tools to make the whole thing a lot easier. This led me to wonder if I could make posting on my blog easier, since my Malaysian connection is slow and I don't always have a connection when I want to write posts.
So I downloaded Google's Data Protocol, which would let me build an app for uploading posts.

And here it is!

My first post using my blogger posting tool. Not bad eh? Notice how it also lets me format stuff offline:

def foo():
    message = 'I can even write code'
foo()


At some point I hope to get around to publishing the tools I'm working on.