How To Leverage Ruby's Functional Programming Capability

Let’s rewrite this function using functional style, applying well-known Ruby idioms.

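require 'date'
# Default for today assumed to be Date.today; body restored from the walkthrough below.
def birthday_sequence(users, registry = BirthdayRegistry, today = Date.today)
  users.group_by(&registry.method(:birthday).curry[:date])
    .sort_by { |birthday, _| (today - birthday).abs }
    .map { |birthday, celebrators| "[#{celebrators.map(&:full_name).join(', ')}] - #{birthday}" }
    .join('; ')
end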

This looks much more concise than the original variant. Even after refactoring the former (keeping its imperative style intact), it would remain longer. This is a very common side effect of writing programs in a functional style: it forces you to express what the code does instead of how it does it. Let’s review the most interesting parts of the second code example.

Please pay extra attention to the absence of variables in the method written in functional style. How much easier is it to extract formatting code out of it? You don't have to scan the method, detecting all places where the result variable is used.

Data transformation and filtering (map, collect, inject, reduce, filter, detect, reject, zip, etc.) is what makes Ruby, like functional languages, so expressive and concise. Developers new to Ruby usually learn the usefulness of these functions first. Indeed, it’s much more practical to describe what to do with the data than to write nasty for loops. In the example, the inner map call iterates through the users, extracting the value of the full_name property from each of them and returning an array of full names. The join function then combines everything, separating the values with a comma followed by a space.
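As a small illustration of these building blocks, here is a minimal sketch. The User struct and the sample names are hypothetical stand-ins, not part of the article's code; only the full_name attribute is assumed.

```ruby
# Hypothetical stand-in for the article's user objects; only full_name is assumed.
User = Struct.new(:full_name)

users = [User.new('Ada Lovelace'), User.new('Alan Turing')]

# map visits every user and collects the value of full_name into a new array.
names = users.map(&:full_name)

# join combines the array into one string, separated by a comma and a space.
line = names.join(', ')

# select keeps matching elements; reduce folds the array into a single value.
short_names = names.select { |name| name.length < 12 }
total_length = names.reduce(0) { |sum, name| sum + name.length }
```

None of these calls needs a loop counter or a mutable result variable; each one takes a collection and returns a new value.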

group_by is a function that groups an input array into buckets (arrays) by the result of evaluating a block on each value. Given the array of strings ['foo', 'bar', 'buzz'], group_by { |string| string.length } returns the hash { 3 => ['foo', 'bar'], 4 => ['buzz'] }. I know, it doesn’t look like a completely fair substitution (in the original piece of code it’s done 'by hand'), but group_by, index_by and similar concepts are very well known and accepted in functional languages. Developers use such data transformations as building blocks, combining them with each other to achieve the desired result instead of describing what the computer should do during each step.
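The group_by example from the paragraph above can be run as is. The second half of this sketch is an illustrative addition (not from the article) showing how the resulting hash combines with other transformations.

```ruby
words = ['foo', 'bar', 'buzz']

# Each word lands in the bucket keyed by the result of the block.
by_length = words.group_by { |string| string.length }

# The hash is itself enumerable, so it chains with sort_by, map and join.
summary = words.group_by(&:length)
               .sort_by { |length, _| -length }
               .map { |length, group| "#{length}: #{group.join(', ')}" }
               .join('; ')
```

Here by_length is { 3 => ['foo', 'bar'], 4 => ['buzz'] }, and summary lists the longest group first.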

The .method call: in Ruby, it’s a way to get a method object, a 'pointer' to a method. Here we are getting a pointer to the birthday method of the registry. The & symbol converts the method object to a block, which can then be passed to any method expecting one. For example, 5.method(:modulo).call(2) gives the same result as 5.modulo(2). This is a common way to pass a method instead of a block. But just getting the method isn’t enough: BirthdayRegistry.birthday also accepts a format as its first argument.
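A short sketch of method objects and the & conversion. The Greeter module is a hypothetical example and does not appear in the article's code.

```ruby
# A method object can be stored in a variable and called later.
modulo = 5.method(:modulo)
remainder = modulo.call(2)

# Hypothetical module, used only to show passing a method where a block is expected.
module Greeter
  def self.greet(name)
    "Hello, #{name}!"
  end
end

# & turns the method object into a block, so map calls Greeter.greet for each element.
greetings = %w[Ann Bob].map(&Greeter.method(:greet))
```

This works because greet takes exactly one argument, which is what map yields. A method that needs more arguments has to be adapted first.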

The trick is to curry that pointer to a method. In functional languages, currying is what makes it possible to apply arguments to a function partially. A curry operation takes a proc of N arguments and returns a proc of one argument, which returns a proc of one argument, which returns... N times; you get the idea. In the functional code example, we curry the birthday method and provide its first argument (the call(:date) notation is replaced with the [:date] notation for brevity; Ruby has many ways to call a function). Having done that, the result can be passed to group_by as a block.
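To make currying concrete, here is a minimal sketch. FakeRegistry is a hypothetical stand-in whose birthday method merely has the same two-argument shape (format, then subject) as the one described above; it is not the article's BirthdayRegistry. Method#curry is assumed to be available (Ruby 2.2 or later).

```ruby
# A three-argument lambda, curried into a chain of one-argument calls.
add = ->(a, b, c) { a + b + c }
curried = add.curry
sum = curried[1][2][3]

# Supplying only the first argument yields a new callable awaiting the rest.
add_one = curried[1]
same_sum = add_one[2, 3]

# Hypothetical registry with a (format, name) signature.
module FakeRegistry
  def self.birthday(format_name, name)
    "#{name} (#{format_name})"
  end
end

# Fix the first argument; the result takes one argument and fits any block slot.
date_birthday = FakeRegistry.method(:birthday).curry[:date]
single = date_birthday['Ann']
labels = %w[Ann Bob].map(&date_birthday)
```

The [] call is just another spelling of call, which is why curry[:date] reads so compactly.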

The sorting part looks essentially the same in both examples, with one minor but very important difference. The imperative code asks for the current date directly inside the method body. This is a reference to global, non-pure state! The result of that call is different each time (day) we make it. Having that call hard-coded in the function body makes the function very hard to test without the magical timecop gem (which monkey patches Date and can stop time for a while). Not to mention the incorrect behavior of the birthday_sequence function itself: for each user, today can be different and, therefore, the time difference between birthday and today is different. Just imagine yourself debugging a defect reported by the QA team about an 'off by hour' shift in the middle of the user's birthday string that happens only twice a year.

The solution to that problem is also dependency injection. This is not a functional paradigm concept at all, but almost every functional program uses it. For a function to be pure, it’s not allowed to depend on external global state (otherwise, it will return non-deterministic results). So, instead of referring to global state, we inject the value into the function through its parameters. Doing so, we eliminate the possibility of an 'off by hour' defect even appearing (each time, the difference is now calculated with the same 'now' value).
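The difference between reading the clock inside a function and injecting the date can be sketched as follows. The days_until helpers are hypothetical, and Date.today is assumed here as the clock call; neither is taken from the article's sources.

```ruby
require 'date'

# Impure: every call reads the system clock, so results drift over time.
def days_until_impure(date)
  (date - Date.today).to_i
end

# Pure when today is passed in: the same arguments always give the same result.
def days_until(date, today = Date.today)
  (date - today).to_i
end

fixed_today = Date.new(2024, 1, 1)
remaining = days_until(Date.new(2024, 1, 31), fixed_today)

# One shared today value keeps every comparison in a sort consistent.
dates = [Date.new(2024, 3, 1), Date.new(2023, 12, 25), Date.new(2024, 1, 15)]
nearest_first = dates.sort_by { |date| (fixed_today - date).abs }
```

A test can pass any fixed date it likes, so no time-freezing library is needed to get a deterministic result.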

Purity is probably the most loved concept in functional languages. A function that does not depend on any external state always returns the same result for the same arguments, and is very testable, reusable and easy to understand. In the majority of cases, it is also much easier to debug such a function. Actually, no debugging is needed; you just call the function with some arguments and inspect the result. There is no way for the external world (the rest of the system) to influence what a pure function is going to return. The signature of birthday_sequence, with its registry and today parameters, injects the dependencies of the function from the outside instead of referencing them from the function body. Just looking at the signature makes it clear to other developers that the function actually uses today inside, falling back to the current date by default if nothing was passed. With such a signature, the function can be pure as long as BirthdayRegistry.birthday is also pure.

The injection of BirthdayRegistry doesn’t look like a big deal, but it’s hard to overestimate it. This little injection has huge implications for testing. If you are a good developer, you write a couple of unit tests to ensure that the birthday_sequence function works as expected. Before calling it and asserting the result, however, you need to set up an environment. You need to make sure that BirthdayRegistry.birthday will actually return data for the users on which you are testing your function. Therefore, you have a choice of seeding the external storage (from which BirthdayRegistry takes its data) or mocking the implementation of the birthday method. The latter is easier, so you do allow(BirthdayRegistry).to receive(:birthday).with(anything, user).and_return(...). Now, look at your unit test. Developers who read it later will have no clue why you are setting up a BirthdayRegistry mock before calling the birthday_sequence function unless they look at its implementation. Congratulations, you now have a semantic dependency! Every time you decide to work with the birthday_sequence function, you'll have to keep in mind that it’s actually calling BirthdayRegistry inside. The injection allows you to pass a stub implementation of BirthdayRegistry in the unit test explicitly, without the semantic dependency (if the method accepts it in its parameters, I would bet it’s using it).
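Passing a stub explicitly might look like the sketch below. Everything here is an assumption made for illustration: the User struct, the StubRegistry class, the sample dates, and the body of birthday_sequence, which is reconstructed from the walkthrough above rather than copied from the article's repository.

```ruby
require 'date'

# Hypothetical user object; only full_name is assumed.
User = Struct.new(:full_name)

# Sketch of the functional method, assumed from the article's walkthrough.
def birthday_sequence(users, registry = BirthdayRegistry, today = Date.today)
  users.group_by(&registry.method(:birthday).curry[:date])
       .sort_by { |birthday, _| (today - birthday).abs }
       .map { |birthday, celebrators| "[#{celebrators.map(&:full_name).join(', ')}] - #{birthday}" }
       .join('; ')
end

# Hand-written stub: same birthday(format, user) interface, data held in memory.
class StubRegistry
  def initialize(birthdays)
    @birthdays = birthdays
  end

  def birthday(_format, user)
    @birthdays.fetch(user.full_name)
  end
end

ann = User.new('Ann')
bob = User.new('Bob')
cid = User.new('Cid')
jan_tenth = Date.new(2024, 1, 10)

registry = StubRegistry.new('Ann' => jan_tenth, 'Bob' => Date.new(2024, 1, 3), 'Cid' => jan_tenth)
result = birthday_sequence([ann, bob, cid], registry, Date.new(2024, 1, 1))
```

The test setup is visible in the call itself: a reader sees which registry and which date the function works with, and nothing touches a database or the system clock.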

Comparing the code of the imperative_fake_spec.rb and imperative_real_spec.rb tests, it’s not easy to see the difference, but it’s crucial for the speed of the test feedback loop. Just by stubbing out the BirthdayRegistry dependency, we gain speed, lots of speed. Since the unit tests no longer hit a database or any other external storage, they can run lightning-fast. The functional code test, functional_spec.rb, encourages passing a fake implementation of the external dependency, leaving no room for slow tests.

The full sources of the examples and unit tests can be found in the GitHub repo.

There are many other areas in which functional languages can affect the way you write Ruby code: monads, higher-order functions, immutability, etc. My goal in this article was to demonstrate the basic elements of functional programming in Ruby and inspire developers to discuss this issue further in the hopes of learning to code better and faster.
