Sunday, 30 September 2012

Tailrec in Scala

One of the trickier aspects of functional programming is optimising your code. This is because you are typically working at a higher level of abstraction than in procedural languages. The code you write isn't necessarily a natural fit to the way the processor works. For example, compare the factorial function in C# and Scala:

C#:

public int Factorial(int n)
{
    // Start from 1, the multiplicative identity; starting from 0 would always give 0
    var result = 1;
    for (int i = n; i > 0; i--)
    {
        result *= i;
    }
    return result;
}

Scala:

def factorial(n: Int): Int = 
    if (n == 0) 1
    else n * factorial(n - 1)

The big difference here is that in a procedural language you'll tend to write a loop that mutates a variable, whereas in a functional language you'll tend to use recursion or a higher order function to write side-effect free code. Processors are really good at the former.

What we see here is a compromise between performance and readability. The C# code will be quicker because the code will do all its work in the same stack frame. The Scala version will create n stack frames while calling itself recursively. However, the Scala version is easier to reason about because there are no side effects, i.e. you're not rewriting result in a loop. You may disagree for such a simple example, but once you have a couple of loops and several variables, procedural code becomes very difficult to follow.
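For comparison, the higher order function style mentioned earlier avoids both the mutable variable and the explicit recursion. This is only a minimal sketch; the object name FactorialFold is made up for illustration.

```scala
object FactorialFold {
  // Multiply the numbers 1..n together, starting from the seed value 1.
  // For n == 0 the range is empty, so the result is just the seed.
  def factorial(n: Int): Int =
    (1 to n).foldLeft(1)(_ * _)
}
```

Like the other Int versions here, it will overflow for n greater than 12.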

Most functional languages address this issue by performing an optimisation called tail-call optimisation. However, some languages are better at it than others. In Scala you need to make sure that your recursive call is in the tail position (i.e. the recursive call is the last expression). In the factorial example above, it isn't: the tail expression is actually the multiplication n * factorial(n - 1). (You can read about this in more detail here and here.)

Reasoning about whether recursive functions will or won't be optimised is a problem for beginners because it isn't always obvious what qualifies for optimisation. There are other criteria to take into account: the method must not be overridable, so it has to be a local function, private or final. A great feature of Scala is an annotation, @tailrec, that raises a compile error if your function won't be optimised:

import scala.annotation.tailrec

@tailrec
def factorial(n: Int): Int = 
    if (n == 0) 1
    else n * factorial(n - 1)

This will generate a compile error in this case because the recursive function can't be tail-call optimised.

To allow the compiler to optimise this, you will need to find a way to have the tail call be the recursive call, for example:

def factorial(n: Int): Int = {
    @tailrec
    def recFac(n: Int, acc: Int): Int = 
        if (n == 0) acc
        else recFac(n - 1, acc * n)
    recFac(n, 1)
}
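The other criterion, that the method can't be overridden, matters when the recursive method lives in a class rather than inside another function. Here is a rough sketch of that case; the names TailrecCriteria, Countdown and sumTo are made up for illustration.

```scala
import scala.annotation.tailrec

object TailrecCriteria {
  class Countdown {
    // `final` stops subclasses overriding the method, so the compiler can
    // safely rewrite the self-call as a loop. Without `final` (or `private`),
    // @tailrec reports a compile error for this method.
    @tailrec
    final def sumTo(n: Int, acc: Long = 0L): Long =
      if (n <= 0) acc
      else sumTo(n - 1, acc + n)
  }
}
```

Because the call is compiled to a loop, something like new TailrecCriteria.Countdown().sumTo(1000000) should run without growing the stack.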

I learnt about @tailrec through Martin Odersky's Scala course on Coursera. The course is fantastic. Odersky's lectures are very interesting and tie theory into practical applications very neatly. The practical assignments really take me back to my first year of university.

Saturday, 15 September 2012

Odd MSBuild platform behaviour

I came across some odd behaviour with msbuild this week. I was building a solution from the command line and setting the platform to Any CPU. This worked fine and the whole solution built. I then built one of the projects in the solution on its own (to create a ClickOnce installer), again with the platform set to Any CPU, and this time it failed. The error I got was that I was trying to build for an unknown platform.

It turns out that the actual platform described in the .csproj is AnyCPU (i.e. without a space) and that when building a solution, Any CPU gets translated to AnyCPU. So to build a solution you need to use Any CPU and to build a project you need to use AnyCPU. How bizarre.

Saturday, 8 September 2012

Alternative dependency injection in F#

Dependency injection is a pattern where a dependency of a class is passed to an instance of the class at runtime and is not created by the class itself. This is very useful for decoupling code, especially when you are unit testing and want to mock a dependency.

In OO languages like C# and Java, this is most often done with constructor injection, i.e.:

interface IB
{
        void DoB();
}

class A
{
        public A(IB b)
        {
                this.b = b;
        }

        public void DoA()
        {
                this.b.DoB();
        }

        private IB b;
}

The same thing can be done in F#, but there's an alternative. You could have your dependency be an abstract member and use an object expression to create a concrete instance that has the correct dependency.

type IB =
    abstract DoB : unit -> unit

type B() =
    interface IB with
        member this.DoB() = ()

[<AbstractClass>]
type A() = 
    abstract b : IB
    member this.doA() = this.b.DoB()

let implA = 
    { new A() with 
        member x.b = B() :> IB
    }

This isn't a better way of doing dependency injection. It's a more complicated way of achieving pretty much the same thing, but it does remind me a little of the Scala cake pattern.
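To make the comparison with Scala a little more concrete, here is roughly how the same abstract-member trick might look there. This is a sketch rather than a full cake pattern: the names mirror the F# above, and doB returns a String (my change, not in the original) purely so the call is observable.

```scala
object AbstractMemberInjection {
  trait IB {
    def doB(): String
  }

  class B extends IB {
    def doB(): String = "B.doB called"
  }

  // A declares its dependency as an abstract member
  // instead of taking it as a constructor argument.
  abstract class A {
    def b: IB
    def doA(): String = b.doB()
  }

  // The dependency is supplied when the concrete instance is created,
  // much like the F# object expression.
  val implA: A = new A {
    val b: IB = new B
  }
}
```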

Sunday, 2 September 2012

MarkdownPreview 0.2.0

I've uploaded a new version of MarkdownPreview. You can pick it up here. This includes:

  • A better HTML renderer. The main benefit is that scrolling doesn't go back to the top when the Markdown source is reloaded.
  • Syntax highlighting. This uses Google prettify. You can add syntax to a code block by putting {{cs}} (for csharp, see here) at the top of your code block.

Saturday, 1 September 2012

MarkdownPreview

MarkdownPreview is a small Windows utility to watch a text file written in Markdown for changes and to render it to HTML. You can download it or visit the project page.

MarkdownPreview 0.1.0

I write my blog posts in Markdown using Vim. I wanted to be able to preview what I was writing as I was writing it. While there are some Markdown enabled editors out there, I wanted to stick to Vim, so I wrote a small utility to watch my markdown file for changes and re-render to HTML whenever I save it.

I think this is my first open-source effort (MIT license)! It's only about an hour's worth of work, so there may well be problems. Let me know if you find any. The main problems now are:
  • The page scrolls back to the top when it reloads
  • The links are followed in MarkdownPreview, not externally
  • There is an annoying sound effect when the page reloads
  • No syntax highlighting
I'll take a look at these soon.

Monday, 27 August 2012

The object-oriented TDD journey

When you start practicing TDD for real, one of the first problems you'll come across is dependency management. Chances are that before you were TDDing, whenever you needed to use class A from class B, B would create its own instance of A or A might be a singleton.

class B
{
    private A instanceA = new A();
}

You soon find that you want to mock A, and the best way to get the mock of A into an instance of B (your SUT) is to pass it into the constructor. This is called Dependency Injection:

class B
{
    public B(A a)
    {
        instanceA = a;
    }

    private readonly A instanceA;
}

You also find that you don't want a dependency on A because creating an instance of the mock of A is going to call the constructor of A. So you solve that by having A implement an interface IA and pass IA to B instead.

class B
{
    public B(IA a)
    {
        instanceA = a;
    }

    private readonly IA instanceA;
}
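To show what this buys you in a test, here is a small sketch of the same shape with a hand-rolled fake. It is written in Scala rather than C#, and the fetch and doubled methods and the FakeA class are invented for the example; a mocking library would normally generate the fake for you.

```scala
object ConstructorInjection {
  trait IA {
    def fetch(): Int
  }

  // B only knows about the interface, never a concrete A.
  class B(a: IA) {
    def doubled(): Int = a.fetch() * 2
  }

  // A hand-rolled test double: returns a canned value and counts calls.
  class FakeA(value: Int) extends IA {
    var calls = 0
    def fetch(): Int = {
      calls += 1
      value
    }
  }
}
```

A test can then build new ConstructorInjection.B(new ConstructorInjection.FakeA(21)) without the real A's constructor ever running.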

Your next problem comes along when you actually want to create an instance of B. Say the constructor of A takes an instance of IC, you will need to do the following when you want a new B:

B instanceB = new B(new A(new C()));

For a large application, this step will be a horror of nested news. So you will then learn about Inversion Of Control and this will help you manage your dependencies:

IocContainer container = new IocContainer();
container.Register<B>();
container.Register<IA>().ImplementedBy<A>();
container.Register<IC>().ImplementedBy<C>();
B instanceB = container.Resolve<B>();

Here the IOC framework is responsible for figuring out the dependencies of B and creating the required instances recursively.

Now you find that you're making a mess because your IOC framework is letting you create a dependency from any class in your system to any other. You now pay attention to grouping your classes into modules with small, well-defined interfaces between them.

When working on a piece of code you will try and break it down into small coherent modules that look like stand-alone libraries. They will ideally have a very small set of public interfaces. The IOC registration will be done by the module itself and classes external to the module should only use the public interfaces. This prevents a spaghetti of inter-class dependencies.
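As a rough illustration of that module shape, here is a sketch in Scala where a plain factory method stands in for the module's IOC registration. Every name in it (BillingModule, InvoiceService, TaxPolicy and so on) is hypothetical.

```scala
object BillingModule {
  // The only type that code outside the module is meant to use.
  trait InvoiceService {
    def total(prices: Seq[Int]): Int
  }

  // Implementation details are private, so nothing outside the module
  // can take a dependency on them.
  private trait TaxPolicy {
    def apply(amount: Int): Int
  }

  private class FlatTax(percent: Int) extends TaxPolicy {
    def apply(amount: Int): Int = amount + amount * percent / 100
  }

  private class DefaultInvoiceService(tax: TaxPolicy) extends InvoiceService {
    def total(prices: Seq[Int]): Int = tax(prices.sum)
  }

  // The module wires up its own dependencies and hands out the public interface only.
  def invoiceService: InvoiceService =
    new DefaultInvoiceService(new FlatTax(20))
}
```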

Sunday, 26 August 2012

Rich Hickey on Debugging

I was reading Michael Fogus' interview with Rich Hickey, the man behind Clojure. When asked why he is considered an excellent debugger, his answer was fantastic. It often surprises me that this isn't how many programmers approach debugging.

Fogus: I’ve spoken with a few of your former co-workers, and they described you as a trouble-shooting and debugging master. How do you debug? 
Hickey: I guess I use the scientific method. Analyze the situation given the available information, possibly gathering more facts. Formulate a hypothesis about what is wrong that fits the known facts. Find the smallest possible thing that could test the hypothesis. Try that. Often this will involve constructing an isolated reproducing case, if possible. If and only if the hypothesis is confirmed by the small test, look for that problem in the bigger application. If not, get more or better facts and come up with a different idea. I try to avoid attempting to solve the problem in the larger context, running in the debugger, just changing things to see effects, etc.
Ideally, you know you have solved the problem before you touch the computer, because you have a hypothesis that uniquely fits the facts.

Saturday, 4 August 2012

Scala first impressions

In the last year or two I've been practicing TDD and the things that come naturally from it: dependency injection, inversion of control and data / behaviour separation. All these techniques complement each other and encourage you to write code that is actually quite functional. This led me to start looking at functional languages to see if there was a better way to build software; OO paradigms were starting to look a little unnecessary. I started with Haskell, it being one of the most pure functional languages. I then tried F# and now I'm playing with Scala.

Scala is interesting because it is a full object-oriented language, in some ways going further than Java, and is also a functional language providing most of the things you'd expect from a functional language. For example:

  • Pattern matching
  • Higher order functions
  • Anonymous functions / lambdas
  • Type inference
  • Currying
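A few of these can be shown in a handful of lines. The following is only a sketch with made-up names (FunctionalFeatures, applyTwice, add, addFive):

```scala
object FunctionalFeatures {
  // Higher order function: takes another function as an argument.
  def applyTwice(f: Int => Int, x: Int): Int = f(f(x))

  // Curried function: arguments are supplied one parameter list at a time.
  def add(a: Int)(b: Int): Int = a + b

  // Supplying only the first argument yields a new function of the second.
  val addFive: Int => Int = add(5)

  // Anonymous function passed inline; the type of x and of the
  // resulting list are both inferred.
  val squares = List(1, 2, 3).map(x => x * x)
}
```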

Traits

One of the more interesting features of Scala is traits. These are similar to interfaces in Java and C# except that they allow implementations of the methods they describe. I find this a bit scary but also fascinating because a very common code smell I see is people deriving from base classes in order to re-use code between classes and not because of an inherent 'is-a' relationship. Traits let you re-use code between classes without a cumbersome inheritance relationship. I'm only starting to get the hang of them though. For example, they can be stacked.
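As far as I understand it so far, stacking means that several traits can each override the same method and chain to one another through super. A small sketch, with invented trait names:

```scala
object StackableTraits {
  trait Greeter {
    def greet(name: String): String = "Hello, " + name
  }

  // Each trait wraps whatever comes before it in the stack via `super`.
  trait Shouting extends Greeter {
    override def greet(name: String): String = super.greet(name).toUpperCase
  }

  trait Excited extends Greeter {
    override def greet(name: String): String = super.greet(name) + "!"
  }

  // The rightmost trait is called first: Excited delegates to Shouting,
  // which delegates to Greeter.
  val greeter = new Greeter with Shouting with Excited
}
```

Swapping the order of the traits changes which wrapper is applied last, and that ordering is the part I'm still getting used to.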

Pattern matching

Pattern matching in Scala is very similar to F# and Haskell. It's one of those features that you find yourself missing from other languages once you've learnt about it.
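For anyone who hasn't seen it, a minimal sketch of what it looks like in Scala (the Shape types and the describe function are just examples):

```scala
object PatternMatching {
  sealed trait Shape
  case class Circle(radius: Double) extends Shape
  case class Rectangle(width: Double, height: Double) extends Shape

  // Matching destructures each case class into its fields.
  // Because Shape is sealed, the compiler can warn about a missing case.
  def area(shape: Shape): Double = shape match {
    case Circle(r)       => math.Pi * r * r
    case Rectangle(w, h) => w * h
  }

  // Patterns also work on lists and can carry guards.
  def describe(xs: List[Int]): String = xs match {
    case Nil                   => "empty"
    case head :: _ if head < 0 => "starts negative"
    case _ :: Nil              => "one element"
    case _                     => "several elements"
  }
}
```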

Type inference

The type inference in Scala is not as good as in F#; this is apparently because of Scala's OO features. That said, it's still much better than Java or C# (although that's probably an unfair judgement).

Anonymous functions

Anonymous functions aren't really exclusively the domain of functional languages anymore; C# and even C++ support them now. I think the Java committee have really let themselves down by not including them in Java 7. They are extremely useful and pop up in all sorts of places. They also allow library writers to design great APIs.

JVM

I really like that Scala compiles to bytecode and can be run on lots of platforms. I love C# and the .NET framework, but I hate that it doesn't have first class support on non-Windows platforms. The Mono framework is an amazing piece of kit, but I'm always going to trust the JVM on Linux over Mono. That's perhaps unfair, but you really don't want to worry about bugs in your runtime environment and there's always a manager you're going to need to convince.

Tooling

One of the things that I found a little disappointing about the Haskell eco-system is that there is a distinct lack of decent tools. The only proper Haskell IDE is Leksah which is a step in the right direction, but is still in its infancy. Haskell is the perfect target for auto-refactoring tools because it has proper strong typing, but there isn't anything serious, no ReSharper or Eclipse for Haskell. F# is better but still limited (you can't organise your code in folders in Visual Studio for example).

I've been really impressed with the tools available for Scala. There is decent support available in IntelliJ and Eclipse (through plugins). They're still a little rough, but are definitely usable.

I think Maven is fantastic and being able to use it to build Scala and pull in Maven dependencies is wonderful. SBT is an alternative to Maven for Scala which is easier to get started with (not being XML based helps), and I quite like it too. SBT can pull in Maven dependencies, which means getting an SBT project started is very quick.

Conclusion

I'm clearly becoming a fan of Scala, but I can't help worrying that the reason I'm getting on with it is because it's not actually very different from the languages I'm comfortable with (C#, C++, Python, Java etc). I think I still need to keep learning Haskell to try and get around some of my problems with it (like how to build a large system TDDing as I go along - I can't rely on IOC to help me out like I do with C# and Java). However, right now I would be happy to start a production project using Scala. Though I would love to do one in Haskell as an experiment, it would be a risky proposition.

Sunday, 29 July 2012

Scala, Maven and Eclipse

You might be having trouble getting Scala IDE, Maven (m2e) and Eclipse to play nicely. In particular, you might be seeing the error "More than one scala library found in the build path [...] At least one has an incompatible version. Please update the project build path so it contains only compatible scala libraries."

Try the following:
  1. Generate a scala project using the maven archetype scala-archetype-simple
  2. Edit the pom.xml with the following:
    • Change the scala.version to 2.9.2
    • Remove the unit testing dependencies (add the ones you need back later once you have the other stuff working - the default versions won't be compatible with 2.9.2)
    • Change the maven-surefire-plugin to 2.12
  3. Remove the sample test classes in src/test/scala/sample
  4. Import the Maven project into Eclipse: Package Explorer -> Import... -> Existing Maven Projects
This should now give you a mavenised Scala project loaded into Eclipse. If you have trouble with Eclipse not updating with your pom changes, delete the project, clear out the Eclipse files from the project dir (.project, .settings, .cache, etc), then re-import the Maven project.

Also note that I'm using:

Saturday, 28 July 2012

Give your people time to innovate



Developers who are too busy to experiment are wasted talent. Software developers are smart people who need to be creative to be good at their jobs. In all but the most lax environments, team members will have their work set out for them. Even in Agile environments, there might be flexibility in who does what work and developers will be involved in the work estimation process. Ultimately, it is management and customers who decide what work is done.

Google has made the concept of 20% time famous. However, 3M have been encouraging their employees to experiment for 15% of their time since 1948:

"[..] workers often use 15 percent time to pursue something they discovered through the usual course of work but didn't have time to follow up on." Kaomi Goetz - www.fastcodesign.com

The idea is very simple. Give your teams a little slack so that they can try out new things. For example, a programmer might want to try out a new library, learn a new language or build an app. There are some solid benefits:
  • They can work on an idea that they think will be great, but would never be able to convince management without a prototype. Products like Google Mail and the Post-It note came out of these kinds of schemes.
  • They can fix something in the product that won't ever be a high priority. These kinds of fixes tend to add quality to the product over time.
  • On a smaller scale, it leads to improvements to processes and small innovations that can be added to existing products.
  • It is great motivation for employees. All projects will have periods of grind where developers are just cranking out repetitive work. A bit of a break when they can let their hair down does wonders for morale.
  • It's a great form of training. No matter what people work on in this time, they'll be learning while doing it.
Ok, so assuming that you're now convinced, how would you implement a scheme like this? Mike Cannon-Brookes of Atlassian says:

"You see, while everyone knows about Google’s 20% time and we’ve heard about all the neat products born from it (Google News, GMail etc) – we’ve found it extremely difficult to get any hard facts about how it actually works in practice."

It will ultimately depend on your organisation. I think the key is to keep it as open as possible. Any constraints on the developers are going to limit any possible innovation. However, a few constraints will help keep everyone's goals aligned:

"Self-organizing systems are able to create their own rules. All is needed for such a system to work is a set of simple constraints, sometimes called a boundary. It is important for managers to tune these constraints, and not to try and design all the rules. This means the job of a manager is to manage the system, and not the people in it." Management 3.0 - Jurgen Appelo

For example:
  • Your project should be something that would improve the company
  • Your project should have short, achievable goals
  • Your project must be visible to everyone
The first point is designed to steer people to work on something relevant to the company but without limiting them so much that they play it safe.

The typical scheme will be between 10% and 20% of the employee's time. Any less and they won't have time to produce anything of substance; any more and it will interfere with their jobs. However, this is still not much time, so it's important to make sure people aren't being too ambitious and setting themselves unachievable goals (e.g. write an operating system in 4 hours). This also helps with the other two goals.

Lastly, people should be talking about what they are working on. This is important for stakeholders to see that people aren't just wasting time. More importantly, however, it means that people are sharing ideas which is going to generate more ideas and collaboration. A wiki and small demos might be a nice way to kick start this.

So why not try it and see what happens? Friday afternoons are typically productivity sinks anyway. Why not use that slow time to stir things up a bit?

Friday, 20 July 2012

Teaching TDD


I've been evangelising TDD at work and have been quite surprised by how it has been taken up. In general, people are excited to learn about TDD and there was an excellent turnout to the talks a colleague and I have given. We ran through the full test driven cycle:
  1. Write a test
  2. Watch it fail to compile
  3. Fix the compile errors
  4. Run the test and watch it fail
  5. Write just enough code to make the test pass
  6. Watch the test pass
  7. Refactor
  8. Go to step 1
We demonstrated each one of these steps with 3 or 4 unit tests for an example class. However, no matter how much I repeat that you shouldn't write code before a test and that you should only write enough code to make your test pass, people struggle to get their heads around that idea. I still often come across people who are trying to learn TDD, but are not following this pattern.
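To give a flavour of what "just enough code" means, here is a sketch of where an example class might stand after two trips round the cycle. It is not the class from the talks; the Stack example and the test names are invented, and plain assert stands in for a test framework.

```scala
object TddCycle {
  // Production code: only what the two tests below have demanded so far.
  // There is no pop or peek yet because no test has asked for them.
  class Stack {
    private var items: List[Int] = Nil
    def isEmpty: Boolean = items.isEmpty
    def push(x: Int): Unit = {
      items = x :: items
    }
  }

  // Test 1, written first: initially failed to compile because Stack didn't exist.
  def newStackIsEmpty(): Unit =
    assert(new Stack().isEmpty)

  // Test 2, written next: forced push into existence and isEmpty to stop
  // being a hard-coded `true`.
  def stackIsNotEmptyAfterPush(): Unit = {
    val stack = new Stack()
    stack.push(1)
    assert(!stack.isEmpty)
  }
}
```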

I was prompted to start running the mini TDD courses when I saw a piece of code a junior member of my team was working on. He was still not used to the idea of writing tests first, so I sat with him and helped him write a bit of code in the TDD style. While he coded in front of me, I was able to point out when he'd skipped too far ahead and had written code that he didn't have a test for yet. It was only after a few rounds of this that he finally got what TDD was all about.

I think that a practical one-on-one session like this is the only way to really show someone how to do TDD. Sit with them and have them start writing code and just gently nudge them when they've skipped a step. I've done this a couple of times now and it only takes a couple of tests before it sinks in.