Sunday, December 12, 2010

Project Chadwick #2–Top 5 SF Giants OBP (F# Version)

Before I get started on this post, if you aren't familiar with Project Chadwick, here's a quick overview.

The data for this problem can be downloaded from here.

In this problem we are going to find the top 5 On Base Percentage (OBP) seasons since the Giants moved to San Francisco.  I’m not convinced that I’m fully thinking like a functional programmer yet, but I think I’m starting to ‘get it’.  With that said, let’s get started.

New F# Concepts

In my second F# script, I’m using a few concepts that I didn’t use in the first solution, namely records and multi-line functions.

The Record Type

type Batter = { Last : string; First : string; Season : int; AB : float; OBP : float }

So what is a record?  It looks like a different way to declare a class, but records are not the same as classes.  Records let you group related data into a named type and access that data through fields.  Fields in a record are immutable by default, so a record value cannot change after it is created, whereas a class can freely expose mutable state.  Also, records cannot be inherited from.  There are other differences that are beyond the scope of my current F# knowledge, but as my F# knowledge expands we may delve into the remaining differences.
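To make the immutability point concrete, here is a minimal sketch using the same Batter record. The field values are made-up placeholders rather than real stats, and the names sample, label, adjusted and sameValues are only for illustration.

```fsharp
// Same record shape as in the post; the values below are made-up placeholders.
type Batter = { Last : string; First : string; Season : int; AB : float; OBP : float }

// The compiler infers the Batter type from the field names.
let sample = { Last = "Doe"; First = "John"; Season = 2000; AB = 500.0; OBP = 0.400 }

// Fields are read with dot notation.
let label = sprintf "%s %s (%d)" sample.First sample.Last sample.Season

// sample.OBP <- 0.450   // would not compile: the field is not declared mutable

// Copy-and-update builds a new record and leaves the original untouched.
let adjusted = { sample with OBP = 0.450 }

// Records compare by value (structural equality).
let sameValues = (sample = { adjusted with OBP = 0.400 })
```

After this runs, sample still has its original OBP, and adjusted is a separate value.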

Functions

A function is declared using a let binding.  The keyword let is followed by the name of the function, a list of space-delimited parameters, optionally a return type, and then an equals sign.  The body of the function is determined by white space: all lines indented after the declaration are considered part of the function’s body until a line is encountered at the same ‘level’ of indentation as the let keyword.  The return value of a function is the value of the last expression evaluated.
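As a rough illustration of those rules, here is a small multi-line function. The batting_average helper is hypothetical and is not part of my solution; it only shows the parameter list, the optional return type, and the indentation-based body.

```fsharp
// Hypothetical helper: the name, parameters and return annotation are for illustration only.
let batting_average (hits : float) (at_bats : float) : float =
    // Everything indented under the let line belongs to the function body.
    if at_bats = 0.0 then
        0.0
    else
        hits / at_bats   // the last expression evaluated becomes the return value

// Back at the let line's indentation level, so the function body has ended.
let average = batting_average 150.0 500.0
```

Note that there is no return keyword; whichever branch of the if expression is evaluated supplies the result.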

Seq.toList, Pipe Forward Operator and List.map

When we read in the content of the data file using the ReadAllLines method, the file content is returned as a string array.  In F# it is easier to work with lists than arrays, at least with what F# knowledge I have.  So to convert our file content into a list I used the Seq.toList function.  The pipe forward operator, |>, sends or ‘pipes’ the output of the ReadAllLines call to Seq.toList as its parameter.  You can think of the pipe forward operator as similar to the pipe on the UNIX command line.  The result is stored as a list<string>, one string per line of the file, in the stats variable.

let stats = File.ReadAllLines(@".\data\sf_giants_batting.csv") |> Seq.toList
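Here is a small sketch of what the pipe forward operator does. To keep it runnable without the CSV file, a hard-coded array of made-up rows stands in for the ReadAllLines result; the names lines, direct, piped and dataRowCount are assumptions for this example.

```fsharp
// Stand-in for the file contents so the sketch runs without the CSV; the rows are made up.
let lines = [| "Last,First,Season,AB,OBP"; "Doe,John,2000,500,0.400" |]

// These two bindings are equivalent: x |> f is just another way to write f x.
let direct = Seq.toList lines
let piped = lines |> Seq.toList

// Pipes chain left to right, much like a shell pipeline.
let dataRowCount = lines |> Seq.toList |> List.tail |> List.length
```

Both direct and piped are string lists with the same contents; the piped form simply reads in the order the data flows.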

The List.map operation allows us to take a list and pass each item as a parameter to a function in one line.  In addition to calling the function, it creates a new list with the results of each call; in this case we aren’t using the returned list.  In my solution I’m using it like a one-line foreach call.  I’m calling the create_batters function, passing in the tail of the stats list that I read in.  Why just the tail?  Because the head is the line that contains the column headings.

List.map create_batters stats.Tail
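For comparison, here is a sketch of List.map next to List.iter, which walks a list the same way but returns unit instead of a new list, so it may be a closer fit for a foreach-style call. The rows and the lambda are made up for illustration and are not my create_batters function.

```fsharp
// Made-up rows standing in for stats; the parsing is illustrative, not the post's create_batters.
let rows = [ "Last,First,OBP"; "Doe,John,0.400"; "Roe,Jane,0.350" ]

// List.map returns a new list holding the result of each call.
let lastNames = List.map (fun (row : string) -> row.Split(',').[0]) rows.Tail

// When only the side effect matters, List.iter does the same walk and returns unit.
List.iter (printfn "%s") lastNames
```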

An Overview of My Solution

Since I’m just getting started with F#, I find myself writing F# code that looks like C#.  I think my functions reflect that.  However, the last line makes me think that I’m starting to make the turn on understanding functional programming and how I can use it.  Here is the last line:

List.map print_batter (Seq.take 5 all_obps |> Seq.toList)

In my first go round of this script I had a for loop that went from 0 to 4 to print the first five items in the list.  As I was writing this post up I started looking at the loop, thinking I could improve it.  I remembered reading about the Seq.take function, which takes the specified number of items from the given sequence.  So I removed the for loop and plopped in the following code in its place:

List.map print_batter (Seq.take 5 all_obps)

When I ran the new and improved script I received the following error:

chadwick-2-top5-obp.fsx(64,24): error FS0001: This expression was expected to have type Batter list but here has type seq<'a>

I noticed that the error message specifically stated that it was given a sequence but it expected a list.  So I tacked on the |> Seq.toList call and was able to get it to work.  That type of code is what gets me excited about functional programming.  I’m looking forward to getting to the point where I can truly use the functional programming aspects of F#.
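The sequence-versus-list mismatch can be reproduced in a few lines. This is only a sketch: obps is a list of placeholder floats standing in for all_obps, and the Seq.iter and Seq.truncate lines are alternatives I could have used, not what my script does.

```fsharp
// Placeholder values standing in for all_obps, already sorted high to low.
let obps = [ 0.450; 0.440; 0.430; 0.420; 0.410; 0.400 ]

// Seq.take returns a seq<float>, even though the input was a list.
let firstFive = Seq.take 5 obps

// Converting back gives List functions the list they expect.
let topFive = firstFive |> Seq.toList

// Seq functions accept the sequence as is, so printing needs no conversion.
Seq.iter (printfn "%.3f") firstFive

// Seq.take fails on short input when evaluated; Seq.truncate returns what is there.
let atMostTen = obps |> Seq.truncate 10 |> Seq.toList
```

Because every list is also a sequence, a list can be passed to any Seq function, but going the other way needs an explicit conversion such as Seq.toList.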

Here is my entire solution:

I enjoyed working on this solution.  It’s nothing big in the grand scheme of things, but it was my first ‘real’ F# script.  As always, any critiques, nudges or hints would be greatly appreciated.  My number one goal of going through this process is to learn the 4 languages.

Up Next…

I will be adding more problems to the list shortly and solving this problem in either Ruby or Objective-C next.