Sunday, November 15, 2015

Write a File in Idris

Although Idris' documentation seems better than 99% of open-source projects out there, I couldn't easily find the steps necessary to write a file. After recalling that Brian McKenna wrote his blog in Idris, I figured he would have a nice example of how to write a file, and I was right! This post is heavily ripped from his work. The actual code (taken from the Main file of Brian's code generator) looks like
writeFile : String -> String -> IO ()
writeFile f s = do
  f <- openFile f Write
  fwrite f s
  closeFile f
Pretty straightforward. It takes a file name and the content to write, opens a handle, writes the contents to that file, then closes it. Now, I had found and written code like this in Idris' docs but couldn't get it to compile. The trick was to tell Idris to compile and include the effects package, like so
idris -p effects Main.idr -o my_program
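For completeness, a hypothetical main using it might look like (the file name and contents here are my own invention)

```idris
main : IO ()
main = writeFile "out.txt" "Hello from Idris!\n"
```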
Thanks Brian!

Sunday, October 25, 2015

Trie in Haskell

Lately I've been playing around with Haskell. As an exercise, I decided to partially implement a Trie. I ignored the various implementations out there, including Data.Trie and the one linked on Wikipedia. Instead I focused on iterating on a Trie from scratch, and I found the results to be pretty interesting, so I'm sharing them here.

My real goal with this exercise was to implement "member", "insert" and a sorted "toList", and to do so somewhat generically. This interface turns out to be more similar to a Set than, say, a Map, because there aren't values associated with members. I used [a] to represent a member of the Trie a which, in retrospect, looks like a Monoid. Well, anyways, my first structure came out looking like

data Trie a =
  RootNode [TrieNode a]
  deriving Show

data TrieNode a =
  Leaf a
  | SpineEnd a [TrieNode a]
  | Spine a [TrieNode a]
  deriving Show

In this modeling, Leaf a indicated the end of a member at a leaf, SpineEnd a [TrieNode a] indicated an end on a spine, Spine a [TrieNode a] was a non-ending spine, and RootNode [TrieNode a] was, well, the root. At first glance, I really liked this model. After implementing the Trie, however, I found the structure less than desirable for a few reasons
  • Since "edges" are a list, a fair amount of boilerplate traversal code was necessary to implement the operations
  • Although at first I liked keeping RootNode separate, I later realized it was inhibiting me from inserting the empty list
  • Turns out that Leaf a is totally redundant with SpineEnd a []
I'm much more pleased with the second structure I came up with
import qualified Data.Map as M

type TrieEdges a = M.Map a (Trie a)

data Trie a =
  Node { edges :: TrieEdges a }
  | EndNode { edges :: TrieEdges a }
  deriving (Show, Eq)

empty :: Trie a
empty = Node M.empty
Not only does this design have fewer parts than the other, but it also yields a simpler implementation. Here's the code for insert
insert :: Ord a => [a] -> Trie a -> Trie a
insert [] n = EndNode $ edges n
insert vs (Node m) = Node $ insertBelow vs m
insert vs (EndNode m) = EndNode $ insertBelow vs m

insertBelow (v:vs) m = M.insert v (insert vs child) m
  where child = M.findWithDefault empty v m
I believe it's pretty straightforward. The only real tricks were safely creating edges where none existed (which is the main purpose of insertBelow) and ensuring that the node at the end of a member is an EndNode. The member function turned out even nicer
member :: Ord a => [a] -> Trie a -> Bool
member [] (EndNode _) = True
member [] (Node _) = False
member (v:vs) n = maybe False (member vs) . M.lookup v $ edges n
Getting an in-order toList was a little bit trickier but, thanks to the useful functions in Data.Map, it was quite doable
toList :: Trie a -> [[a]]
toList = go []
  where
  go acc (Node m) = continue acc m
  go acc (EndNode m) = reverse acc:continue acc m

  continue acc = concatMap (\(v, n) -> go (v:acc) n) . M.toList
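To get a feel for the whole interface, here's a quick usage example with String members (this snippet is mine, not from the original code, and assumes the definitions above are in scope):

```haskell
example :: [String]
example = toList (insert "do" (insert "cat" (insert "car" empty)))
-- -> ["car","cat","do"]
```

Because Data.Map's toList is ascending, the members come back sorted for free.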
If you have any comments, improvements or tips to make this better, let me know!

Sunday, June 28, 2015

Object Oriented Clojure Example

Inspired by the message passing content in The Structure and Interpretation of Computer Programs (SICP), I decided to try my hand at object-oriented-style programming in Clojure. Here's what I came up with, using the traditional bank account example

(defn make-account [initial-balance]
  (let [bal (ref initial-balance)
        withdraw (fn [amount]
                   (dosync (alter bal #(- % amount))))
        deposit (fn [amount]
                  (dosync (alter bal (partial + amount))))
        amount (fn []
                 (deref bal))
        reset (fn []
                (dosync (ref-set bal initial-balance)))]
    (fn [meth & args]
      (cond
        (= meth :withdraw) (withdraw (first args))
        (= meth :deposit) (deposit (first args))
        (= meth :amount) (amount)
        (= meth :reset) (reset)))))

Overall, it was a fun little bit of code to write! It has a constructor (make-account), a private variable with mutation (bal) and some messages it responds to (:withdraw, :deposit, :amount and :reset).

Here's an example of it being used
(def account (make-account 1000))
(println (account :amount))
; -> 1000

(account :withdraw 75)
(println (account :amount))
; -> 925

(account :reset)
(println (account :amount))
; -> 1000

Favor Deletable Code Over Reusable Code

As problem solvers who work primarily in the medium of software, we're sold reusability throughout our careers (and even, likely, in our educations). We're told that if we just think ahead, if we put forth more effort, we can jump to a new plane of abstraction and achieve a solution that will be usable in the future to more quickly and easily solve a similar problem. Although this sounds pretty awesome in theory, I don't buy it in practice.

Arguments against optimizing for reuse

The first real problem I have with premature abstraction (in the hope of reuse) is just that -- it's premature. We'd like to think we know how our system ought to flex in the future, but we just don't. We're not fortune tellers.

We mean well. We build a more general solution than needed to solve a real problem
Our hope, thinking and intention is that future, similar problems will fit nicely into our "framework"
This sounds great, but there's a lot that can go wrong.

Firstly, what happens if our expanded solution doesn't "fully" solve the problem at hand?
Do we rewrite our more general solution to accommodate the actual problem? I find, generally, we don't. Instead we extend and further abstract our large solution to make it encompass the areas we missed. If we do rewrite the general solution entirely, the previous general solution was effectively a time sink.

There's a fair amount of regression headache here as well. If I decide to extend my prior abstraction, I have to do so without breaking the parts that partially solved the problem to begin with. And, assuming I still believed parts of the original abstraction to be valuable to future solutions, I would be required to keep those from breaking as well.

Secondly, there's a question I try to ask myself before I solve future problems: what will happen if this problem is a one-off? I usually have a few different answers to this
  • I will have wasted time building abstractions and tests that weren't actually needed
  • I will have obfuscated the intent of the solution with unnecessary abstraction, forcing future maintainers (myself included) to understand a more general problem than the one actually being solved, and
  • I will have introduced more actual code and tests to maintain (in many codebases this may be a drop in a bucket, but still, more is more)
The third and final consideration I have before thinking about reuse is really a few tightly-related questions, pertaining to reuse itself
  1. If a future problem is solvable by my abstraction, how will developers know to use it?
  2. If a future problem looks like it's solved by my abstraction, but it's actually a different problem, how will other developers know not to use my abstraction?
  3. What happens when a future problem is 90% solved by my abstraction? Is it flexible enough that someone else will be able to extend it to fit that extra 10%, while maintaining its integrity?
In my fairly short exposure to professional software development, I find the answers are something along the lines of
  1. They won't. They'll build another solution, and maybe their own abstraction framework to some degree.
  2. See 1; they probably won't ever see my abstraction or understand it. I have, though, seen a good bit of the "copy the original source, and tweak to fit" pattern applied in response to this question
  3. It won't be flexible enough. They'll build another solution.
Like I said, I'm a novice developer. So if you have solutions to these questions and problems, I'm eager to hear them!

Deletable code

Alright, honesty time. The above words are truly my take on some points made in Greg Young's "The Art of Destroying Software" talk. At this point, I could spend paragraphs describing the value obtained by having code that, by nature, is easy to delete. But to do so would be a waste of time when Greg has so well articulated the points. So, with that, I urge you to watch the linked talk and rip my arguments to shreds in the comments section.

Tuesday, May 19, 2015

Functional Programming Aha! Moments

In the past couple of years, I've dedicated a fair amount of time to closet-functional-programming. Although I haven't gotten an opportunity to use a more purist functional programming language (e.g. not FrameWorkOfTheDay.JS) in my day-to-day industry work, I believe some of the values of functional programming have bled into my daily practice, for the better.

Industry aside, I want to share some of the "Aha! moments" I've had while learning about functional programming. These are moments where something just clicked for me; some part of me almost can't help but share them.

Lets Can Be Described by Anonymous Functions

Although "functional programming" is one of those things where, if you ask five people for a definition, you are likely to get six different definitions, one general consensus seems to hold: assignment should be reduced, minimally scoped or eliminated entirely in functional programming.

The let macro/expression/sugar is basically a means of reducing assignments into minimally scoped, (generally) non-mutative expressions with a set of bindings. Here's a contrived example of what this might look like in Clojure
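A sketch (the exact numbers are mine, chosen to match the walkthrough below):

```clojure
(let [x 35
      y (- x 14)]
  (* y 2))
; -> 42
```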


Here we're saying, let's bind `x` to `35` then `y` to a value calculated from `x`, then we'll use `y` to calculate some result. This entire expression is just that, it's an expression. This `let` is evaluated as a whole, and the result is simply `42`.

So, here comes the first "Aha!" This same kind of scoped assignment can be done through applying anonymous functions! Here's what that looks like for the above example
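A sketch, with the same bindings as above:

```clojure
((fn [x]
   ((fn [y]
      (* y 2))
    (- x 14)))
 35)
; -> 42
```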

Basically, the pattern here is: for each bound symbol (in this case `x` and `y`), nest and call an anonymous function, passing the bound value down (`35` and `(- x 14)`)! I found this discovery exceptionally cool because it further emphasized the power of these functional languages: something as simple as anonymous functions can be combined to form powerful abstractions.

Monadic Bind Is Like Method Chaining

Monads, in general, have provided many an "Aha!" for me. They're definitely one of those constructs where, once you get past the initial "this is slightly out of my comfort zone" feeling, you are ready to embrace an overwhelming amount of power. `bind` (which looks like `>>=`) is one of the crucial operations implementable by a monad; it has the following type signature
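In Haskell, that's:

```haskell
(>>=) :: Monad m => m a -> (a -> m b) -> m b
```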

Ignoring currying and some other subtleties, if I'm thinking in an "object oriented mindset" I might reason about this as follows:
  • `bind` is a function that take 2 arguments
    • a generic wrapper around some type `a`
    • a function which takes some `a` and returns some generic wrapper around type `b` (where `b` may be the same type as `a`)
  • `bind` returns some generic wrapper around type `b`
To me, this sounds exactly like a single call that enables method chaining! Think a la calling `bind` to transform from monad to monad to monad... In my mind, `bind` is an enabling feature that allows Haskell code to be elegant, expressive and powerful without sacrificing type safety (see `do` notation).

To visualize how this is kind of like method chaining, observe the following Ruby code
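For example (an illustrative chain of my own, matching the steps enumerated below):

```ruby
(1..100)
  .map { |n| n * 2 }
  .select { |n| n % 3 == 0 }
  .count
# is 33
```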

We can imagine these calls (with the exception of `count`) being like a type-not-so-safe version of Haskell's `bind` applied a couple of times (2 to be exact). Here's what the steps might look like
  1. `m a` is a range of `Fixnum` (e.g. `Range Fixnum`)
  2. `map` applies a block that takes the `Range Fixnum` and returns an `Array Fixnum`
  3. `select`, with its block, takes an `Array Fixnum` and returns an `Array Fixnum`
  4. Finally, `count` is simply applied to the `Array Fixnum`

Message Passing (OOP Style) Can Be Achieved via Closures

Disclaimer/Plug: this last "Aha" is shamelessly ripped off from The Structure and Interpretation of Computer Programs (SICP). I am over halfway through this book, it is wonderful, please consider showing the authors your support by buying it.

The authors of SICP take the reader through a fantastic journey in which the powers of LISP and functional programming are clearly displayed through examples in the Scheme dialect. One piece of this journey that recently blew my mind was when the authors used closures to define a mutable bank account that responds to different messages. Here's the code (again shamelessly ripped)
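It goes roughly like this (transcribed from the book, so treat any slips as mine):

```scheme
(define (make-account balance)
  (define (withdraw amount)
    (if (>= balance amount)
        (begin (set! balance (- balance amount))
               balance)
        "Insufficient funds"))
  (define (deposit amount)
    (set! balance (+ balance amount))
    balance)
  ;; dispatch on the message, returning the matching procedure
  (define (dispatch m)
    (cond ((eq? m 'withdraw) withdraw)
          ((eq? m 'deposit) deposit)
          (else (error "Unknown request" m))))
  dispatch)
```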

What you're seeing here is basically a constructor that builds objects, given a beginning balance, that respond to the `'withdraw` and `'deposit` messages.

Frankly, I don't really know what I can say about this that the authors haven't already said (1) more articulately, (2) more clearly and (3) in a more inspirational manner.

In conclusion, these are just a few of the awesome concepts I've picked up in my pursuit of functional programming. If you have any thoughts on these, or my description of them, don't hesitate to comment.

Thursday, April 9, 2015

OOP Setters Can be Used to Emulate Partial Application

Recently I was asked an awesome question that went something like
How would you do something like higher-order functions and partial application in object-oriented programming?
For higher-order functions, my mind immediately jumped to Guava's Function interface. I said, "hmmmm, let's see, could have some method (maybe called partiallyApply) that took an argument and returned another Function." In retrospect, this may have been overly complex, and a much simpler implementation of Function may work just as well in this case: fluent setters.

Let's look at a contrived, dead-horse-esque example. A partial function that does addition on two arguments.

In Haskell, if I wanted to build such a function, and partially apply it to, let's say 5, I might do the following
add5 = (+) 5

add5 10 -- is 15
In Clojure, this might look like
(def add-5 (partial + 5))

(add-5 10) ; is 15
So, how would I do this using setters? Simple! Here's some Ruby code
class Adder
  def value(first)
    @first = first
    self
  end

  def to(second)
    @second = second
    self
  end

  def add
    @first + @second
  end
end

adder_of_5 = Adder.new.value(5)

adder_of_5.to(10).add # is 15
This code (obviously) has a lot more ceremony wrapped around it, but it does achieve some of the same benefits as partial application (e.g. laziness, partial parameter fulfillment). The point is, if this were Java code, I could easily implement Function and change add to invoke and have a partial function in an object-oriented system.
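As a sketch of that last point, here's what it might look like with java.util.function.Function (the class and method names here are my own invention; the Guava-era Function the post mentions would look similar):

```java
import java.util.function.Function;

// A fluent, partially applicable adder: Adder.value(5) fixes the first
// argument, and apply (Function's method) supplies the second.
class Adder implements Function<Integer, Integer> {
    private final int first;

    private Adder(int first) {
        this.first = first;
    }

    static Adder value(int first) {
        return new Adder(first);
    }

    @Override
    public Integer apply(Integer second) {
        return first + second;
    }
}

// Adder.value(5).apply(10) is 15
```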

Baby Don't Fear the Code Review

Although I'm (at best) a novice in the software engineering industry, I've had the opportunity to provide and receive many code reviews. I don't know of many companies that allow their junior developers to perform code reviews, but I'm not complaining; reading others' code is an invaluable practice, in my opinion.

I wanted to share some of the dos and don'ts that are floating around in the back of my head, so here they are.

  • Don't provide feedback without reason. "Because I said so" and "because that's how we do things" may have been acceptable reasons from our parents when we were 7, but they just don't stand up in any half-professional industry.
  • Do ask questions as a reviewer. If something you are reviewing doesn't make sense to you, it probably won't make sense to others. You may very well be fumbling to understand some malformed abstraction, or trying to mentally unravel some code that will prove difficult to maintain. You are doing yourself and the coder a disservice by not making a sincere effort to understand what you are reviewing.
  • Do ask questions as the programmer whose code is being reviewed. The worst thing you can do is sheepishly accept feedback that you don't understand. Always dig deeper, try to find the reasons behind the reviewer's feedback, strive to uncover the Why.
  • Do favor verbal, in-person reviews over reviews done in a tool (if possible). Actually walking through the code with a peer has the potential to make the process more of a discussion than a chat-room session. I have found, through my short experience, that in-person reviews also tend to be conducive to question asking, and help alleviate many of the communication and timing errors that are so prone to occur in tools.
  • Don't take it personally -- you are not your code. Try to consider code reviews as learning experiences, as opportunities to gain insight from the feedback of your peers.
  • Do get an informal review of your code review. I know that sounds strange; let me restate it differently. Ask the programmer whose code you are reviewing if your comments are clear, if you had comments that weren't really valuable, if some element of the process could be improved, and so on. Feedback need not be one-way, and there is generally room for improvement in any process. Seek improvement.

Sunday, February 1, 2015

LISP Core in Less Than 100 Lines of Haskell

A couple of months ago, I built a LISP core engine in about 100 lines of Haskell code (of course, just for the fun of it). I thoroughly enjoyed the challenge, and with the help of techniques from the awesome Programming Languages MOOC I was taking, I felt ready to take it on. All in all, it was truly a fun project, and I thought I'd share some things I learned along the way.

LISP's Core Primitives are Easily Implemented Recursively


In the simplest terms, the LISP core engine I built is simply a function that operates, with a specific environment (variable bindings) in mind, on the LISP syntax tree. Because of this, I was able to implement primitives like label (aka define) simply with pattern matching and the cons (: in Haskell) operator!

To show how simple these primitives were to define, here's eq? (checks equality)
aux env (AList (Atom "eq?":a:b:_)) =
  if aux env a == aux env b then aTruthyValue else theFalseyValue
Note that aux is the name of the helper function called by the executor.

Haskell Code Can Be Very Elegant


There aren't many languages in which code as elegant as this can be written:
executeText env = execute env.parseMany.tokenize
In my opinion, it doesn't get much more straightforward than this. The dots may throw C-style programmers, but to a Haskell programmer they're pure gold.

Additionally refactoring in Haskell is extremely powerful. Consider the evolution of my bind function (takes names and values and adds them to an environment). It started as this:
bind [] [] env = env
bind (name:names) (value:values) env = bind names values ((name, value):env)
then was transformed to this:
bind names values = (++) $ zip names values
and ended up as:
bind names = (++) . zip names
In Haskell, it seems, there is a function for everything. Honestly, there's probably even one out there that can replace my bind function, I just haven't found it yet!

The Macro System is Pretty Cool


At the moment, the macro system in this LISP is pretty tight. Macros are written in Haskell code (I would like to eventually add support in the language itself) and are provided with (1) an "eval" which encapsulates the current environment and can run code and (2) a list of the arguments passed to the macro.

All registered macros are simply added to a list, and are thus loosely coupled to the actual core engine (but tightly coupled to the AST since they are essentially generating new portions of it).

Here's an example of how I defined if, in terms of the cond primitive:
structuralMacros =
  [ ("if", \_ (p:a:b:_) -> AList [Atom "cond", p, a, Atom "1", b])
  , ...]

Parting Words


If you're interested in the full code, it's on github. Also the project has a REPL whose build step is at the bottom of "README.md."

Saturday, January 31, 2015

Hash Table (Map) Built In Python

Lately, I've been studying up on my data structures and algorithms. I decided it would be fun to implement a HashMap (the map abstract data type). I chose to rely on Python's idiomatic __hash__ function and use linear probing as the collision resolution mechanism. In the process of building this, I learned a few things I would like to share.

Deletion is the Hardest Part


Overall, the data structure was not very difficult to build. Well, except for deletion. It's not as simple as shifting all the following elements back (overwriting the deleted one). Instead, I had to flag the item as "deleted". That wasn't difficult in itself; however, it complicated maintaining quick-access sizing on the data structure.

In the end, I had to maintain two counts: the one that programmers who call __len__ see, and the actual number of occupied slots (live entries plus deleted ones).
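Here's a minimal sketch of that scheme (the names and structure are my own simplification of the idea, and resizing is omitted for brevity):

```python
_DELETED = object()  # tombstone sentinel marking a deleted slot

class HashMap:
    def __init__(self, capacity=8):
        self._slots = [None] * capacity
        self._live = 0  # what __len__ reports
        self._used = 0  # live entries plus tombstones

    def _locate(self, key):
        # Linear probing: walk until an empty slot or the matching key,
        # skipping over tombstones left behind by deletions.
        i = hash(key) % len(self._slots)
        while self._slots[i] is not None and (
                self._slots[i] is _DELETED or self._slots[i][0] != key):
            i = (i + 1) % len(self._slots)
        return i

    def __setitem__(self, key, value):
        i = self._locate(key)
        if self._slots[i] is None:
            self._live += 1
            self._used += 1
        self._slots[i] = (key, value)

    def __getitem__(self, key):
        slot = self._slots[self._locate(key)]
        if slot is None:
            raise KeyError(key)
        return slot[1]

    def __delitem__(self, key):
        i = self._locate(key)
        if self._slots[i] is None:
            raise KeyError(key)
        self._slots[i] = _DELETED  # flag it; don't shift anything
        self._live -= 1

    def __len__(self):
        return self._live
```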

Linear Probing is Not as Hard


My "find index" method boiled down to just
    def __locate_index(self, key):
        i = hash(key) % self.capacity()

        while self.__values[i] and self.__values[i][0] != key:
            i = (i + 1) % self.capacity()

        return i
Where each value is just a tuple where elements 0 and 1 are the key and its value, respectively.

Everything Else is Just Bookkeeping


In the end, the code wasn't too difficult. The class had a lot more state than I would've liked; however, "getting it right" wasn't that hard. As always, I hope I'm not missing any edge cases with my tests.

As usual, the code is on github and I'm open to any suggestions or thoughts.