What does the concept immutable data mean from a Ruby programmer's perspective? How is immutability supported in Ruby, and why should you care?
This is the second part of a two-part article based on a short talk I gave at the March 2013 meetup of the Helsinki Ruby Brigade.
- Part 1 defines immutability and looks at how it shows up in the Ruby standard library.
- Part 2 (this one) explores the effects of immutability in object-oriented domain modeling.
Entities and Values
We have looked at basic data structures and the effects of in-place state mutation there. I hope to have convinced you that there are very valid reasons for favoring immutable data structures over mutable ones.
But is there a place for immutability in our domain models? When does making an object immutable make sense in business logic code?
Let's look at what some of the well-known voices in the object-oriented community have to say. In particular, in what kinds of contexts do they talk about immutable objects?
Eric Evans: Entities and Value Objects
In Eric Evans's book Domain-Driven Design (2003), and in the DDD community it sparked, there is a clear distinction between two different kinds of concepts in our domain models: Entities and Value Objects.
An object defined primarily by its identity is called an ENTITY.
An object that represents a descriptive aspect of the domain with no conceptual identity is called a VALUE OBJECT.
There are things called Entities and there are things called Value Objects. Entities have something called an Identity, whereas Value Objects don't. We will return to these concepts shortly.
But what about immutability?
As long as a VALUE OBJECT is immutable, change management is simple - there isn't any change except full replacement. Immutable objects can be freely shared.
Immutability is a great simplifier in an implementation, making sharing and reference passing safe. It is also consistent with the meaning of a value. If the value of an attribute changes, you use a different VALUE OBJECT, rather than modifying the existing one.
It sounds like Evans thinks of immutability as a good idea when implementing Value Objects - both because it fits conceptually, and because it simplifies implementation.
So Evans talks about entities, values, identities, and immutability. But is this just a Domain-Driven Design thing? Apparently not:
Steve Freeman and Nat Pryce: Values and Objects
The book Growing Object-Oriented Software, Guided by Tests (2009, often lovingly referred to as GOOS) could be described as one of the primary sources of modern OO design wisdom. In it, authors Steve Freeman and Nat Pryce propose a distinction very similar to the one Evans proposes in DDD:
When designing a system, it's important to distinguish between values [...] and objects.
Values are immutable instances that model fixed quantities. They have no individual identity, so two value instances are effectively the same if they have the same state. Objects, on the other hand, use mutable state to model their behavior over time.
In practice, this means we split our system into two "worlds": values, which are treated functionally, and objects, which implement the stateful behavior of the system.
A similar, but slightly different, terminology is repeated here: Immutable values, objects, identities. The central point is also the same as with Evans: There are two different categories of things in our domains, and we should clearly distinguish between them.
Rich Hickey: Identities and Values
Finally, let's take a look at Rich Hickey's thinking on the subject:
People accustomed to OO conceive of their programs as mutating the values of objects. They understand the true notion of a value, say, 42, as something that would never change, but usually don't extend that notion of value to their object's state.
That is a failure of their programming language. These programming languages use the same constructs for modeling values as they do for identities, and default to mutability, causing all but the most disciplined programmers to create many more identities than they should, creating identities out of things that should be values.
Again we have the same terminology: Mutation, objects, values, identities. Hickey also makes the distinction between things that have an Identity and things that don't.
So what exactly is the common thread in these three examples?
The classes in our domains can be divided into two different groups: Values and Objects.
Values are things that are defined in terms of their contents (or "state") - like the number 42. Values are immutable: We can't change 42 to be something else, as we discussed in the previous post.
Objects are things that have an Identity, and may have different values over time. For example, a User Object might have a name field, which may change over time. When a person gets married, their last name may change, but to us they are still the same user, just with a mutated name. The user has an identity that persists over time - typically modeled as an id field corresponding to a primary key in an underlying database.
The distinction is not just between primitive or simple values and compound objects. A compound object may also be a value. For example, consider an Address class, with fields for a street address, a zip code, and a city. Is it an Object or a Value? Well, in most cases it should probably be a Value. Two addresses on the same street are two different addresses; you don't get one from the other by mutating the street address field. This is in contrast to the user whose last name may change - it is definitely still conceptually the same user.
Notice how this is exactly the same reasoning as we did with 42. An address is no less of a Value than 42, it's just that we're more accustomed to thinking of numbers as Values than we are of the types we define ourselves.
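The distinction can be sketched in a few lines of Ruby. This is only an illustration: the User class below and its id-based == are assumptions about how an identity would typically be compared, not code from the talk.

```ruby
# An Object: its identity lives in the id, other fields may change.
class User
  attr_reader :id
  attr_accessor :name

  def initialize(id, name)
    @id = id
    @name = name
  end

  # Two references denote the same user when the ids match,
  # whatever the name currently happens to be.
  def ==(other)
    other.is_a?(User) && id == other.id
  end
end

before = User.new(1, "Jane Smith")
after  = User.new(1, "Jane Smith")
after.name = "Jane Jones"   # same user, mutated name
before == after             # true: the identity did not change

# A Value: there is no way to turn 42 into something else.
# Arithmetic hands back a different value and leaves 42 alone.
answer = 42
answer + 1                  # 43
answer                      # 42
```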
| Objects | Values |
|---|---|
| A.K.A. Entities (in DDD) | A.K.A. Value Objects (in DDD) |
| Have an identity (e.g. "id"). | No separate identity, the value itself is the identity. |
*) Though not immutable in Ruby.
**) Though not immutable in Java before JDK 8.
This is not an article about Clojure, but I would still like to point out that Clojure is one of the very few languages that models Values and Identities as explicitly separate concepts.
Even if you don't plan to become a Clojure programmer, I recommend taking a look at how Clojure does this.
Writing Clojure programs has clarified my thinking on what this distinction actually is, much more than just reading about it has done.
Entities, Values, and Immutability in Ruby
As a language-agnostic concept, there are definitely two different building blocks in our domain models, as we have just seen. But let's turn the discussion back to Ruby. What does the Object/Value distinction look like in code?
Take a look at the following class:
    # user.rb
    class User
      attr_accessor :id, :name, :address
    end
Is it an Object or a Value? Well, obviously it is an Object. A User is something that persists over time, and something whose individual fields may change over time. It even has an id, which is a concrete representation of the user's identity.
Now, how about this:
    # address.rb
    class Address
      attr_accessor :street, :zip, :city
    end
Is that an Object or a Value? Arguably this one is a Value. As we discussed, an address is an example of something that doesn't change over time. Different addresses should be different Address objects. When you move to a new location, you don't fiddle with the street field of your address, you have a new Address.
The problem is that between User and Address, there really isn't any difference in the code. There are no separate Object and Value constructs in the Ruby language - there are just classes. (To be fair, this is the case in all major OO languages.)
One way to make the distinction clearer would be to make the Address class immutable, as all values should be. So how can we do that?
First of all, we need to get rid of the writers for those attributes. They should only have readers.
When we remove the writers, we must also define a constructor for the class, so we can define the initial values for the fields:
    # address.rb
    class Address
      attr_reader :street, :zip, :city

      def initialize(street, zip, city)
        @street = street
        @zip = zip
        @city = city
      end
    end
That's better. But is Address an immutable class now?
We can easily think of a situation where some new method introduced to Address changes one of those fields - internal methods don't need the accessors since they can just access the fields directly. There's also the possibility of setting the instance variables externally via something like instance_variable_set.
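Both loopholes are easy to demonstrate. In this minimal sketch, relocate! is a hypothetical method standing in for "some new method" added later; the address data is made up.

```ruby
class Address
  attr_reader :street, :zip, :city

  def initialize(street, zip, city)
    @street = street
    @zip = zip
    @city = city
  end

  # Hypothetical later addition: it writes the field directly,
  # so the missing writer does nothing to stop it.
  def relocate!(new_street)
    @street = new_street
  end
end

address = Address.new("1 Main St", "00100", "Helsinki")

address.relocate!("2 Side St")                   # mutation from the inside
address.instance_variable_set(:@city, "Espoo")   # mutation from the outside

address.street  # "2 Side St"
address.city    # "Espoo"
```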
What we need to do is to freeze the Address. Freeze is a method that exists in all Ruby objects, and its purpose is to say "after being frozen, none of the instance variables of this object can be changed. If you try to do that, I will throw an exception."
If we freeze the address right in the constructor, that should take care of it. None of the instance variables can be changed after construction:
    # address.rb
    class Address
      attr_reader :street, :zip, :city

      def initialize(street, zip, city)
        @street = street
        @zip = zip
        @city = city
        freeze # no instance variable can be reassigned after this point
      end
    end
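A quick check of what the frozen constructor buys us, as a sketch with made-up address data. The exception class depends on the Ruby version, which is why the rescue is written against RuntimeError.

```ruby
class Address
  attr_reader :street, :zip, :city

  def initialize(street, zip, city)
    @street = street
    @zip = zip
    @city = city
    freeze
  end
end

address = Address.new("1 Main St", "00100", "Helsinki")

begin
  address.instance_variable_set(:@city, "Espoo")
rescue RuntimeError => e
  # FrozenError on Ruby 2.5 and later, a plain RuntimeError before that.
  # FrozenError is a subclass of RuntimeError, so this rescue covers both.
  puts "rejected: #{e.class}"
end

address.frozen?  # true
address.city     # "Helsinki"
```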
Are we done?
At this point we're getting into interesting territory. On the face of it, Address is immutable since we can't change any of its fields. But the thing is, one or more of the objects stored in those fields might not be immutable themselves. For example, street might be a String, which in Ruby is mutable, as we have discussed. Someone can just get the String stored in street and change all of its contents. Our efforts for immutability so far do not help us prevent that.
Immutability is a transitive property in this sense. If anything within an object graph is mutable, the whole object graph is mutable. You have to go all the way to be able to brag about your object's immutability.
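Here is a small sketch of the leak, using the frozen Address from above with made-up data. String.new is used only to guarantee unfrozen Strings regardless of any frozen-string-literal settings.

```ruby
class Address
  attr_reader :street, :zip, :city

  def initialize(street, zip, city)
    @street = street
    @zip = zip
    @city = city
    freeze
  end
end

address = Address.new(String.new("1 Main St"),
                      String.new("00100"),
                      String.new("Helsinki"))

address.frozen?         # true: the fields cannot be reassigned
address.street.frozen?  # false: the String they point to is another matter

address.street << " (rear entrance)"   # mutates the String in place
address.street.replace("Elsewhere")    # or swaps out its contents entirely

address.street  # "Elsewhere"
```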
One way to make sure your object doesn't "leak" mutable members is to do defensive copying in readers. In practice, when someone asks for one of the attributes, we always return a copy instead of the original one. The caller can do with the copy what they please and we won't be affected by it:
    # address.rb
    class Address
      def initialize(street, zip, city)
        @street = street
        @zip = zip
        @city = city
        freeze
      end

      # Each reader hands out a copy, never the stored object itself.
      def street
        @street.dup
      end

      def zip
        @zip.dup
      end

      def city
        @city.dup
      end
    end
This is slightly verbose, although it's the kind of verbosity that could easily be eliminated with some metaprogramming.
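One way that metaprogramming might look is sketched below. The ATTRIBUTES constant is an illustrative name, not something from the original talk, and define_method is just one of several ways to generate the readers.

```ruby
class Address
  ATTRIBUTES = [:street, :zip, :city].freeze

  def initialize(street, zip, city)
    @street = street
    @zip = zip
    @city = city
    freeze
  end

  # Generate one copying reader per attribute instead of writing
  # three near-identical methods by hand.
  ATTRIBUTES.each do |name|
    define_method(name) do
      instance_variable_get("@#{name}").dup
    end
  end
end

address = Address.new(String.new("1 Main St"), "00100", "Helsinki")
copy = address.street
copy << " (rear entrance)"   # only the caller's copy changes
address.street               # "1 Main St"
```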
If you decide to do this, note that dup does a shallow copy of the object, so if one of its members is mutable, you still have a problem.
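To see why a shallow copy is not enough, consider a field that holds a collection. Household is a hypothetical class invented for this sketch.

```ruby
class Household
  def initialize(residents)
    @residents = residents
    freeze
  end

  # Defensive copy: callers get a new Array...
  def residents
    @residents.dup
  end
end

household = Household.new([String.new("Ann"), String.new("Bob")])

copy = household.residents
copy << "Eve"          # only the copy grows
copy.first << "ette"   # ...but its elements are the very same Strings

household.residents    # ["Annette", "Bob"]
```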
Another way to make sure that everything within the object graph is indeed immutable is to deep freeze it. You can achieve this by calling freeze not only on the object itself, but also on all of its members, and their members, and so on.
One could do this manually, but there are also libraries for it. One of them is called ice_nine, and it comes with a class method that takes one object: An object to deep freeze.
    # address.rb
    require 'ice_nine'

    class Address
      attr_reader :street, :zip, :city

      def initialize(street, zip, city)
        @street = street
        @zip = zip
        @city = city
        IceNine.deep_freeze(self) # freezes self and everything it references
      end
    end
With that, we can be fairly sure that we have an immutable object. (Although, in Ruby, there always seems to be a way to get around any restriction, so YMMV. One would have to go to some lengths to mutate this object.)
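For comparison, the manual approach mentioned above could be sketched roughly like this. The deep_freeze helper is a hypothetical, simplified stand-in: it only walks instance variables, Arrays, and Hashes, and it is not how ice_nine is implemented.

```ruby
# Recursively freeze an object and everything reachable from it.
# The identity-keyed `seen` hash keeps cyclic graphs from looping forever.
def deep_freeze(object, seen = {}.compare_by_identity)
  return object if object.is_a?(Module)   # leave classes and modules alone
  return object if seen.key?(object)
  seen[object] = true

  case object
  when Array
    object.each { |element| deep_freeze(element, seen) }
  when Hash
    object.each do |key, value|
      deep_freeze(key, seen)
      deep_freeze(value, seen)
    end
  end

  object.instance_variables.each do |ivar|
    deep_freeze(object.instance_variable_get(ivar), seen)
  end

  object.freeze
end

class Address
  attr_reader :street, :zip, :city

  def initialize(street, zip, city)
    @street = street
    @zip = zip
    @city = city
    deep_freeze(self)
  end
end
```

Note that this freezes the very Strings the caller passed in, which may surprise the caller; copying the arguments before freezing would avoid that.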
Equality and Comparisons
Update 2013-03-29: This point about equality came up on Reddit. I think it deserves to be mentioned.
Since values don't have an identity, and two Values with the same value should be considered equal, we need to override the default implementation of == in all of our Value classes. The default implementation is based on the object id, so two different Value objects with exactly the same contents are not considered equal, which is wrong.
    # address.rb
    class Address
      attr_reader :street, :zip, :city

      def initialize(street, zip, city)
        @street = street
        @zip = zip
        @city = city
      end

      # Two addresses are equal when all of their fields are equal.
      def ==(other)
        self.street == other.street &&
          self.zip == other.zip &&
          self.city == other.city
      end
    end
Our implementation of == assumes address equality will never be tested against non-addresses (things without methods for street, zip, and city). If you can't make that assumption, you'll additionally need type checks or respond_to? checks in the method.
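A version with a type check might look like the sketch below. The eql? and hash definitions go beyond what the text asks for; they are included on the assumption that Values will end up as Hash keys or in calls like uniq, which rely on those two methods rather than on ==.

```ruby
class Address
  attr_reader :street, :zip, :city

  def initialize(street, zip, city)
    @street = street
    @zip = zip
    @city = city
    freeze
  end

  # Anything that is not an Address is simply unequal.
  def ==(other)
    other.is_a?(Address) &&
      street == other.street &&
      zip == other.zip &&
      city == other.city
  end

  # Hash lookups and Array#uniq use eql? and hash, so keep them
  # consistent with ==.
  alias_method :eql?, :==

  def hash
    [self.class, street, zip, city].hash
  end
end

home = Address.new("1 Main St", "00100", "Helsinki")
same = Address.new("1 Main St", "00100", "Helsinki")

home == same           # true
home == "1 Main St"    # false, and no NoMethodError
{ home => :x }[same]   # :x
```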
If your values are actually comparable (i.e. one is considered larger than the other), such as something like an Area might be, consider including the Comparable module and implementing the <=> method to gain a full set of comparison operators for your value.
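As a minimal sketch, an Area value could look like this. The square_meters field is an assumption made for the example.

```ruby
class Area
  include Comparable

  attr_reader :square_meters

  def initialize(square_meters)
    @square_meters = square_meters
    freeze
  end

  # Comparable derives <, <=, ==, >=, > and between? from this one method.
  # Returning nil signals that the two things cannot be compared.
  def <=>(other)
    return nil unless other.is_a?(Area)
    square_meters <=> other.square_meters
  end
end

studio = Area.new(30)
house  = Area.new(140)

studio < house                         # true
studio == Area.new(30)                 # true
[house, studio].min.square_meters      # 30
```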
See this classic blog post by Alan Skorkin for a good discussion on equality and comparison.
The problem with an immutable Value class as we have defined above is that it still doesn't clearly communicate that it's a Value. Ideally, one could see at a glance what kind of construct a class is defining. With the implementations we have seen, the reader must infer that information by noticing that it's immutable, doesn't have an id, and other such characteristics of a Value.
Fortunately, there are some gems available for this as well. They provide libraries for explicitly defining a class as a Value. One of these gems is called Virtus. With Virtus, you can define a Value by including a module and then declaring the attributes. With that information, Virtus will define a constructor, readers for the attributes, and a value-based implementation of ==:

    # address.rb
    class Address
      include Virtus::ValueObject

      attribute :street, String
      attribute :zip, String
      attribute :city, String
    end
Another gem that may be useful is called simply Values. It defines a construct very similar to the built-in Struct, the difference being that a Value is immutable and a Struct is not:
    # address.rb
    Address = Value.new(:street, :zip, :city)
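For contrast, here is what the built-in Struct gives you, as a short sketch. MutableAddress is just an illustrative name. Struct already compares by contents, but it also generates writers.

```ruby
MutableAddress = Struct.new(:street, :zip, :city)

a = MutableAddress.new("1 Main St", "00100", "Helsinki")
b = MutableAddress.new("1 Main St", "00100", "Helsinki")

a == b                   # true: compared field by field
a.street = "2 Side St"   # allowed: a Struct is mutable
a == b                   # false now
```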
ActiveRecord et al.
We've discussed Objects, Values, and the distinction between them. For most of us though, there are some practical issues to deal with when building our domain models. Perhaps the biggest of those has to do with ORM libraries. In Ruby that usually means ActiveRecord or one of its ActiveModel brethren.
The problem with ORMs is that while they provide a very convenient way to interface with relational databases, they also restrict what we can do in our object models to a subset that's "database friendly". When we try to model our business logic within ORM derived classes, we start to feel these restrictions.
This usually isn't a problem in small applications, and indeed it was one of the major advantages of Rails early on - you didn't have to add so many layers in your application to get stuff done.
Values Within ActiveRecord: composed_of and workarounds
It is possible to model immutable values with ActiveRecord, but it is slightly awkward.
For a long time, you could use the composed_of construct to define the fact that "this bunch of attributes should actually be grouped into that separate class." That separate class was a true, immutable Value.
composed_of was slated for removal in Rails 4 because it hadn't been used much and was expensive for the Rails core team to maintain. The removal was later reverted, so composed_of is still available in Rails, but the discussion did produce a useful alternative. In a blog post about this issue, José Valim outlined a way you can do something similar manually:
    # user.rb
    class User < ActiveRecord::Base
      # Build the Address Value on demand from the three columns.
      def address
        @address ||= Address.new(street, zip, city)
      end

      # Write the Value back into the three columns.
      def address=(address)
        self[:street] = address.street
        self[:zip] = address.zip
        self[:city] = address.city

        @address = address
      end
    end
The street, zip, and city fields are part of the User class and the corresponding database table, but they can also be wrapped into the Address Value. An Address is constructed on demand, and serialized back to the three fields when set. This is a completely acceptable, though slightly verbose solution for integrating Values into ActiveRecord models.
Architecting For OO Domain Models
It seems that lately more people in the Rails community have started architecting their way around these ORMs and other limiting frameworks, by completely separating them from the core domain model.
One of these architectural models is called the Hexagonal Architecture. Its main idea is to put the domain model right in the center of the system, and organize any external interfaces - such as databases or web controllers - around it.
The domain model is free of any framework code, so you have the full power of OO at your fingertips there. That includes the possibility to distinguish between Objects and Values without anything getting in your way.
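As a very rough sketch of the idea, the domain model below is plain Ruby, and persistence sits behind a small adapter. All of the names here (move_to, InMemoryUserRepository, save, find) are hypothetical and only illustrate the shape; a real application would put a database-backed adapter behind the same methods.

```ruby
# Domain model: plain Ruby, no framework code.
class Address
  attr_reader :street, :zip, :city

  def initialize(street, zip, city)
    @street = street
    @zip = zip
    @city = city
    freeze
  end
end

class User
  attr_reader :id, :name, :address

  def initialize(id, name, address)
    @id = id
    @name = name
    @address = address
  end

  # Moving replaces the whole Address Value; the old one is never edited.
  def move_to(new_address)
    @address = new_address
  end
end

# Adapter at the edge of the system. Another class exposing the same
# two methods could be swapped in without touching the domain model.
class InMemoryUserRepository
  def initialize
    @users = {}
  end

  def save(user)
    @users[user.id] = user
  end

  def find(id)
    @users.fetch(id)
  end
end
```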
For more information about hexagonal architectures, see Alistair Cockburn's original material, the GOOS book, Uncle Bob's article on clean architectures, and Matt Wynne's talk on Hexagonal Rails.
In our domain models, we should clearly distinguish between Objects and Values. This has been a guideline of good object-oriented design for a long time, surfacing in different contexts with slightly different terminology.
Though often only simple values (like 42) are considered true values, the concept applies to compound things (like addresses) as well. The distinction is natural once you start thinking about it.
Our programming languages don't help us much in making this distinction, but there are things we can do. Enforcing immutability is one of those, and we have seen how to do it in Ruby.
Frameworks can be an impediment to good OO design - ORM frameworks like ActiveRecord especially so - and we can either work around their limitations or totally separate our domain models from them by architecting our applications carefully.
by: Tero Parviainen