Before I get started on this post, if you aren't familiar with Project Chadwick, here's a quick overview.
The data for this problem can be downloaded from here.
In this problem we are going to find the Top 5 On Base Percentage Seasons since the Giants moved to San Francisco. I’m not convinced that I’m fully thinking like a functional programmer yet, but I think I’m starting to ‘get it’. With that said, let's get started.
New F# Concepts
In my second F# script, I’m using a few concepts that I didn’t use in the first solution, namely records and multi-line functions.
The Record Type
type Batter = { Last : string; First : string; Season : int; AB : float; OBP : float }
So what is a record? It looks like a different way to declare a class. Well, it may look that way, but records are not the same as classes. Records allow you to group named values into a type and access them through fields. Record fields are immutable by default, whereas class members can be freely mutable. Also, records cannot be inherited. There are other differences that are beyond the scope of my current F# knowledge, but as my F# knowledge expands we may delve into the remaining differences.
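Here's a small sketch of what that immutability looks like in practice. The player values are made up for illustration and don't come from the data file; the copy-and-update syntax (`with`) and value-based equality are standard record features, but I'm still finding my way around them.

```fsharp
// Same record type as in the post
type Batter = { Last : string; First : string; Season : int; AB : float; OBP : float }

// Made-up values, purely for illustration
let original = { Last = "Doe"; First = "John"; Season = 2000; AB = 500.0; OBP = 0.400 }

// original.OBP <- 0.500   // would not compile: record fields are immutable by default

// Copy-and-update builds a new record and leaves the original untouched
let updated = { original with Season = 2001 }

// Records compare by value (structural equality)
let twin = { Last = "Doe"; First = "John"; Season = 2000; AB = 500.0; OBP = 0.400 }

printfn "%b" (original = twin)                     // prints true
printfn "%d %d" original.Season updated.Season     // prints 2000 2001
```

So instead of changing a batter in place, you make a new one that differs in just the fields you name.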
Functions
// Calculates the OBP
let obp h bb hbp ab sf : float =
    let denom : float = ab * 1.0 + bb + hbp + sf
    if denom > 0.0 then
        (h + bb + hbp) / denom
    else
        0.0
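To sanity check the function, here is a quick worked example. The stat line is invented for illustration (150 hits, 50 walks, 5 hit-by-pitch, 500 at-bats, 5 sacrifice flies), so the arithmetic is (150 + 50 + 5) / (500 + 50 + 5 + 5) = 205 / 560.

```fsharp
// Same function as above, repeated so this snippet runs on its own
let obp h bb hbp ab sf : float =
    let denom : float = ab * 1.0 + bb + hbp + sf
    if denom > 0.0 then
        (h + bb + hbp) / denom
    else
        0.0

// Made-up season line: 150 H, 50 BB, 5 HBP, 500 AB, 5 SF
let sample = obp 150.0 50.0 5.0 500.0 5.0
printfn "%0.3f" sample        // prints 0.366

// All zeros: the guard returns 0.0 instead of dividing by zero
let empty = obp 0.0 0.0 0.0 0.0 0.0
printfn "%0.3f" empty         // prints 0.000
```

Note that the arguments are all floats, which is why the script converts each column with `float` before calling `obp`.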
Seq.toList, Pipe Forward Operator and List.map
When we read in the content of the data file using the ReadAllLines method, the file content is returned as a string array. In F# it is easier to work with lists than arrays, at least with what F# knowledge I have. So to convert our file content into a list I used the Seq.toList function. The pipe forward operator, |>, sends or ‘pipes’ the output of the ReadAllLines call to Seq.toList as its parameter. You can think of the pipe forward operator as similar to the pipe on the UNIX command line. The result is stored as a list<string> in the stats variable.
let stats = File.ReadAllLines(@".\data\sf_giants_batting.csv") |> Seq.toList
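To show that the pipe is just another way of writing a function call, here's a little sketch that uses an in-memory array in place of the file. The rows are made up and much shorter than the real CSV.

```fsharp
// Stand-in for File.ReadAllLines: a string[] with a header row (contents are made up)
let lines = [| "LastName,FirstName"; "Doe,John"; "Roe,Jane" |]

// These two bindings do the same thing; |> passes the left side as the final argument
let viaPipe = lines |> Seq.toList
let viaCall = Seq.toList lines

// Pipes chain left to right, much like a shell pipeline
let lastNames =
    lines
    |> Seq.toList
    |> List.tail                                              // drop the header row
    |> List.map (fun (line : string) -> line.Split(',').[0])  // keep the first column

printfn "%A" lastNames
```

Reading top to bottom, each step hands its result to the next one, which I find easier to follow than nesting the calls inside each other.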
The List.map operation allows us to take a list and pass each item as a parameter to a function in one line. In addition to calling the function, it creates a new list with the result of each call; in this case we aren’t using the returned list. In my solution I’m using it like a one-line foreach call. I’m calling the create_batters function, passing in the tail of the stats list that I read in. Why just the tail? Because the head is the line that contains the column headings.
List.map create_batters stats.Tail
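Since the list that List.map returns goes unused here, it's worth noting that F# also has List.iter, which runs a function over a list purely for its side effects. Here's a small comparison; the names list is just sample input, not something read from the data file.

```fsharp
let names = [ "Mays"; "McCovey"; "Bonds" ]

// List.map builds a new list from each function result
let lengths = List.map (fun (s : string) -> s.Length) names   // [4; 7; 5]

// List.iter runs a function for its side effect and returns unit
List.iter (printfn "%s") names

// Using List.map only for side effects still works, but leaves a list of units behind
let leftovers = List.map (printfn "%s") names                 // [(); (); ()]
```

So List.iter is probably the closer match for a foreach loop, though List.map gets the job done in my script.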
An Overview of My Solution
Since I’m just getting started with F#, I find myself writing F# code that looks like C#. I think my functions reflect that. However, the last line makes me think that I’m starting to make the turn on understanding functional programming and how I can use it. Here is the last line:
List.map print_batter (Seq.take 5 all_obps |> Seq.toList)
In my first go round of this script I had a for loop that went from 0 to 4 to print the first five items in the list. As I was writing this post up I started looking at the loop thinking I could improve it. I remembered reading about the Seq.take method that takes the number of items specified from the given sequence. So I removed the for loop and plopped in the following code in its place:
List.map print_batter (Seq.take 5 all_obps)
When I ran the new and improved script I received the following error:
chadwick-2-top5-obp.fsx(64,24): error FS0001: This expression was expected to have type Batter list but here has type seq<'a>
I noticed that the error message specifically stated that it was given a sequence but it expected a list. So I tacked on the |> Seq.toList call and was able to get it to work. That type of code is what gets me excited about functional programming. I’m looking forward to getting to the point where I can truly use the functional programming aspects of F#.
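Here's a stripped-down sketch of the same type mismatch, using a List<int> with made-up numbers instead of my batters. It shows the conversion I used, plus another option I could have taken: staying in the Seq module, which accepts any sequence.

```fsharp
open System.Collections.Generic

// A resizable .NET list, like all_obps in the script (values are made up)
let scores = new List<int>([| 9; 8; 7; 6; 5; 4 |])

// Seq.take returns a seq<int>, not an F# list
let firstThree = Seq.take 3 scores

// Option 1: convert to an F# list so the List functions accept it
let asList = firstThree |> Seq.toList

// Option 2: use Seq.iter, which works on the sequence directly
Seq.iter (printfn "%d") firstThree
```

One thing to watch: Seq.take throws if the sequence has fewer items than you ask for, while Seq.truncate simply returns what is there.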
Here is my entire solution:
(*
   Project Chadwick #2 - Calc Top 5 All Time OBP for San Francisco Giants
   OBP = (H + BB + HBP) / (AB + BB + HBP + SF)
   0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
   LastName,FirstName,lahmanID,yearID,stint,teamID,lgID,g,g_batting,ab,r,h,doubles,triples,hr,rbi,sb,cs,bb,so,ibb,hbp,sh,sf,gidp,g_old
*)
open System.IO
open System.Collections.Generic

// Batter Record
type Batter = { Last : string; First : string; Season : int; AB : float; OBP : float }

// Running top 5, seeded with empty placeholder entries
let all_obps = new List<Batter>(5)
all_obps.AddRange([| { Last = ""; First = ""; Season = 0; AB = 0.0; OBP = 0.0 };
                     { Last = ""; First = ""; Season = 0; AB = 0.0; OBP = 0.0 };
                     { Last = ""; First = ""; Season = 0; AB = 0.0; OBP = 0.0 };
                     { Last = ""; First = ""; Season = 0; AB = 0.0; OBP = 0.0 };
                     { Last = ""; First = ""; Season = 0; AB = 0.0; OBP = 0.0 } |])

// Calculates the OBP
let obp h bb hbp ab sf =
    let denom : float = ab * 1.0 + bb + hbp + sf
    if denom > 0.0 then
        (h + bb + hbp) / denom
    else
        0.0

// Add the entry to the top 5 if it's in the range
let rec check_top_5 entry index =
    if entry.OBP > all_obps.[index].OBP then
        all_obps.Insert(index, entry)
    else
        let new_item_index = index + 1
        if new_item_index < 5 then
            check_top_5 entry new_item_index

// Parses one CSV line and offers the batter to the top 5 (minimum 200 at-bats)
let create_batters (stats : string) =
    let cols = stats.Split(',') |> Seq.toList
    // I'm doing this for clarity; if I were worried about lines of code I would just do
    // the conversion when I call the obp function
    let ab = float (cols.Item 9)
    let h = float (cols.Item 11)
    let bb = float (cols.Item 18)
    let hbp = float (cols.Item 21)
    let sf = float (cols.Item 23)
    let season = int (cols.Item 3)
    if ab >= 200.0 then
        let b = { Last = cols.Item 0; First = cols.Item 1; Season = season;
                  AB = ab; OBP = (obp h bb hbp ab sf) }
        check_top_5 b 0

let print_batter batter =
    printfn "%0.3f in %d by %s %s" batter.OBP batter.Season batter.First batter.Last

// File.ReadAllLines returns a string[] but to use create_batters with List.map we need a list<string>
// the |> Seq.toList will convert the string[] into a list<string>
let stats = File.ReadAllLines(@".\data\sf_giants_batting.csv") |> Seq.toList
List.map create_batters stats.Tail

// Display the results
printfn "Top 5 Seasons On Base Percentages since the Giants moved to San Francisco"
printfn "========================================================================="
List.map print_batter (Seq.take 5 all_obps |> Seq.toList)
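For comparison, here's a rough sketch of how the same problem might look with no mutable list at all, just a chain of list functions. This is not what my script does: the names toBatter and top5 are ones I made up for this sketch, and it assumes every row has the same column layout as the header comment in my solution.

```fsharp
type Batter = { Last : string; First : string; Season : int; AB : float; OBP : float }

let obp h bb hbp ab sf : float =
    let denom : float = ab + bb + hbp + sf
    if denom > 0.0 then (h + bb + hbp) / denom else 0.0

// Hypothetical helper: parse one CSV row using the column positions from the header comment
let toBatter (line : string) =
    let cols = line.Split(',')
    let ab = float (cols.[9])
    let h = float (cols.[11])
    let bb = float (cols.[18])
    let hbp = float (cols.[21])
    let sf = float (cols.[23])
    { Last = cols.[0]
      First = cols.[1]
      Season = int (cols.[3])
      AB = ab
      OBP = obp h bb hbp ab sf }

// Hypothetical helper: rows are the data lines without the header row
let top5 (rows : string list) =
    rows
    |> List.map toBatter
    |> List.filter (fun b -> b.AB >= 200.0)   // same 200 at-bat minimum as the script
    |> List.sortBy (fun b -> -b.OBP)          // negate so the highest OBP comes first
    |> Seq.truncate 5                         // tolerates fewer than 5 qualifying rows
    |> Seq.toList
```

Under those assumptions, the file lines with the header dropped could be piped into top5 and then printed with print_batter. Ties may come out in a different order than in my insert-based version.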
I enjoyed working on this solution. It's nothing big in the grand scheme of things, but it was my first ‘real’ F# script. As always, any critiques, nudges or hints would be greatly appreciated. My number one goal of going through this process is to learn the 4 languages.
Up Next…
I will be adding more problems to the list shortly and solving this problem in either Ruby or Objective-C next.