Published January 1, 2023 Updated March 24, 2025
Ruby Data

The Data primitive — not to be confused with the DATA constant — was added to the language in Ruby 3.2.0 and is a minimal, immutable, and non-enumerable value-only class. The Data class is not a Struct but is Struct-like in nature with a limited Object API. There are several unique aspects to this primitive worth exploring.

Overview

For starters, I’d recommend reading my Struct article before proceeding because knowing how structs work will be helpful when comparing and contrasting with data objects since they are similar. That said, here’s how to construct, initialize, and interact with a data object:

Point = Data.define :x, :y
point = Point.new 1, 2

point.x          # 1
point.y          # 2
point.to_h       # {:x=>1, :y=>2}
point.to_s       # #<data Point x=1, y=2>
point.with y: 5  # #<data Point x=1, y=5>
point.members    # [:x, :y]
point.inspect    # #<data Point x=1, y=2>
point.frozen?    # true

As you can see, the Object API is quite small and definitely smaller than that of a Struct. Additionally, a data object doesn’t inherit from Enumerable so it can’t be iterated over or have attributes accessed via #[] like a Struct can. Despite the limited feature set, Data objects have the following advantages:

  • Great for concurrency when used with Ractors, Fibers, Threads, etc.

  • Great for pattern matching (more on this later).

  • Bypasses the baggage of the keyword_init: true flag as found when using a Struct which makes working with positional or keyword arguments more interchangeable.
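To make the smaller API concrete, here is a minimal sketch that compares what a Struct and a Data object respond to. The StructPoint and DataPoint names are hypothetical and only exist for this comparison:

```ruby
# Hypothetical names, used only to compare the two primitives side by side.
StructPoint = Struct.new :x, :y
DataPoint = Data.define :x, :y

struct_point = StructPoint.new 1, 2
data_point = DataPoint.new 1, 2

StructPoint.include? Enumerable  # true
DataPoint.include? Enumerable    # false

struct_point.respond_to? :[]     # true
data_point.respond_to? :[]       # false

struct_point.respond_to? :x=     # true
data_point.respond_to? :x=       # false
```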

There is one major disadvantage: when you override the #initialize method, it only accepts keyword parameters. It’s an odd and surprising design choice but will be explained later.

Construction

Construction of a data object is like a Struct but is, sadly, an inconsistent departure.

Define

You must use .define to define your data object. Example:

Point = Data.define :x, :y

This is definitely inconsistent with how you’d use .new to construct a Struct. I’ll admit that I like the use of .define more than .new and wish Struct was updated to use .define as well so we had more consistency between these two value objects.

Subclass

Much like a Struct, you can subclass:

class Inspectable < Data
  def inspect = to_h.inspect
end

Point = Inspectable.define :x, :y

As mentioned with subclassing a Struct, you’re much better off composing your objects rather than using complicated inheritance structures. I don’t recommend this approach.

Initialization

Initialization is similar to a Struct except arguments are strictly enforced.

New

As hinted at earlier, you can initialize a data instance via the .new class method using either positional or keyword arguments:

# Positional
point = Point.new 1, 2

# Keyword
point = Point.new x: 1, y: 2

All arguments are required. Otherwise, you’ll get an ArgumentError (you can avoid this stipulation by defining defaults which will be explained later):

point = Point.new 1     # missing keyword: :y (ArgumentError)
point = Point.new x: 1  # missing keyword: :y (ArgumentError)

Mixing and matching of arguments isn’t allowed either:

point = Point.new 1, y: 2
# wrong number of arguments (given 2, expected 0) (ArgumentError)
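If you want to see the strictness for yourself, the following sketch rescues each failure and reports the error class. The StrictPoint constant and the error_for helper are hypothetical names used only for illustration:

```ruby
StrictPoint = Data.define :x, :y

# Hypothetical helper: answers the error class raised by the block or nil.
def error_for
  yield
  nil
rescue ArgumentError => error
  error.class
end

error_for { StrictPoint.new 1 }           # ArgumentError (missing argument)
error_for { StrictPoint.new 1, 2, 3 }     # ArgumentError (too many arguments)
error_for { StrictPoint.new 1, y: 2 }     # ArgumentError (mixed styles)
error_for { StrictPoint.new x: 1, z: 2 }  # ArgumentError (wrong keywords)
error_for { StrictPoint.new 1, 2 }        # nil (valid)
```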

Anonymous

You can anonymously create a new data instance via single line construction and initialization. Example:

# Positional
point = Data.define(:x, :y).new 1, 2

# Keyword
point = Data.define(:x, :y).new x: 1, y: 2

The problem with anonymous data objects is that they are only useful within the scope in which they are defined, as temporary and short-lived objects. Worse, you must redefine them each time you want to use them. For anything more permanent, you’ll need to define a constant for improved reuse. That said, anonymous data objects can be handy for one-off situations like scripts, specs, or code spikes.

Brackets

There is a shorter way to initialize a data object and that’s via square brackets:

# Positional
point = Point[1, 2]

# Keyword
point = Point[x: 1, y: 2]

This is my favorite form of initialization and for two important reasons:

  1. Brackets require three fewer characters to type.

  2. Brackets signify, more clearly, you are working with a struct/data object versus a class which improves readability.
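As a quick check that all four forms of initialization build equal instances, here is a small sketch using a hypothetical Pair data object:

```ruby
# Hypothetical data object for comparing the initialization forms.
Pair = Data.define :left, :right

forms = [
  Pair.new(1, 2),
  Pair.new(left: 1, right: 2),
  Pair[1, 2],
  Pair[left: 1, right: 2]
]

# All four instances are equal by value so only one unique instance remains.
forms.uniq.size  # 1
```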

Immutability

By default, Data is immutable but only in a shallow way. This means only the top-level attributes of your data object are frozen, not anything deeply nested within them. Example:

Basket = Data.define :label, :items
basket = Basket[label: "Demo", items: [:apple, :orange]]

basket.frozen?           # true
basket.label = "Test"    # undefined method (NoMethodError) since no setter exists
basket.items.push :pear  # [:apple, :orange, :pear]

As you can see, the data is frozen but is a shallow freeze. This is why we were able to mutate the items array. To deep freeze data, you can use Ractor. Example:

Basket = Data.define :label, :items
basket = Ractor.make_shareable Basket[label: "Demo", items: [:apple, :orange]]

basket.frozen?           # true
basket.label = "Test"    # undefined method (NoMethodError) since no setter exists
basket.items.push :pear  # can't modify frozen Array (FrozenError)

Notice, by using Ractor.make_shareable, this applies a deep freeze on all attributes of your data object. Using Ractor like this isn’t its primary purpose but can be handy. The other way to tackle this is to iterate over all attributes in your data object, at initialization, and freeze each member. That’s definitely more cumbersome than using Ractor because you’d also have to traverse deeply nested objects and freeze each one.
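For comparison, here is a rough sketch of the manual approach. It assumes one level of freezing is enough. The SealedBasket name is hypothetical, and anything nested inside an attribute (an array within the array, for example) would still be mutable:

```ruby
# Hypothetical data object which freezes a copy of each attribute on the way in.
SealedBasket = Data.define :label, :items do
  def initialize(**attributes)
    # Duplicate before freezing so the caller's own objects are left untouched.
    super(**attributes.transform_values { |value| value.dup.freeze })
  end
end

items = [:apple, :orange]
basket = SealedBasket[label: "Demo", items: items]

basket.items.frozen?  # true
items.frozen?         # false (the original array was copied, not frozen)
```

Duplicating each value is a design choice of this sketch so the caller isn’t surprised by a frozen argument. You could freeze in place instead if that side effect is acceptable.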

Defaults

You can provide defaults by defining an #initialize method. Example:

Point = Data.define :x, :y do
  def initialize x: 1, y: 2
    super
  end
end

point = Point.new

point.x  # 1
point.y  # 2

Continuing with the above example, this also means you now have the flexibility to use partial arguments when defaults are defined:

Point.new 0     # #<data Point x=0, y=2>
Point.new x: 0  # #<data Point x=0, y=2>

One stipulation — when defining defaults — is all keywords must be present. For instance, the following is syntactically correct but unusable and will throw an error:

Point = Data.define :x, :y do
  def initialize x: 1
    super
  end
end

Point.new 0, 1        # unknown keyword: :y (ArgumentError)
Point.new x: 0, y: 1  # unknown keyword: :y (ArgumentError)

That said, you can fix the above by supplying all keyword arguments with only the defaults you need. Here’s a modification to the above code which allows you to supply a default value for only one of the parameters:

Point = Data.define :x, :y do
  def initialize x: 1, y:
    super
  end
end

Point.new 0, 1        # #<data Point x=0, y=1>
Point.new x: 0, y: 1  # #<data Point x=0, y=1>

Again, the only difference between this code snippet and the earlier code snippet is that all keyword arguments are defined even if only a subset of defaults are supplied. You can also use keyword argument forwarding:

Point = Data.define :x, :y do
  def initialize(**) = super
end

Point[1, 2]
#<data Point x=1, y=2>

Once incoming arguments are passed to super, further modification of attributes is impossible because they are immediately frozen and inaccessible. With a Struct you could use #[] to access member values but there is no such method for a Data object.
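Since nothing can change after super is called, any massaging of values has to happen before that point. Here is a minimal sketch, assuming you want to coerce incoming values to integers. The Coordinate name is hypothetical:

```ruby
# Hypothetical data object which coerces values before they are frozen.
Coordinate = Data.define :x, :y do
  def initialize(x:, y:)
    # Coercion must happen here because attributes can't change after super.
    super x: Integer(x), y: Integer(y)
  end
end

coordinate = Coordinate.new "1", "2"

coordinate.x  # 1
coordinate.y  # 2
```

Positional arguments still work because .new converts them to keywords before #initialize is called.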

Values

Unlike a Struct, values must be messaged directly which means you can’t access them via #[] or assign new values. For example, this won’t work:

point[:x]                          # NoMethodError
point.x = 5                        # NoMethodError
point.each { |value| puts value }  # NoMethodError

All you can do is ask for a value (as shown earlier):

point.x  # 1
point.y  # 2
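When you do need dynamic access or iteration, one workaround is to go through #to_h or to message each member by name. The following is only a sketch with a hypothetical Location data object:

```ruby
# Hypothetical data object for demonstrating dynamic access.
Location = Data.define :x, :y
location = Location[1, 2]

# Convert to a hash when you need lookup by key or iteration.
location.to_h[:x]                                          # 1
location.to_h.map { |name, value| "#{name}=#{value}" }     # ["x=1", "y=2"]

# Or message each member by name.
location.public_send :y                                    # 2
Location.members.map { |name| location.public_send name }  # [1, 2]
```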

With

Unique to Data objects, you can make a shallow copy of your instance (i.e. instance variable copies but not the objects referenced by them). This makes for a fast way to build altered versions of your data. Consider the following:

Point = Data.define :x, :y
point = Point[1, 2]

point.with x: 2, y: 3  # #<data Point x=2, y=3>
point.with x: 0        # #<data Point x=0, y=2>
point.with y: 0        # #<data Point x=1, y=0>
point.with bogus: "✓"  # unknown keyword: :bogus (ArgumentError)

Notice that you can quickly build a new version of your original data object with the same attributes. You can also mix and match all or some of your attributes. However, attempting to reference an attribute that doesn’t exist will result in an ArgumentError.
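To illustrate what shallow means here, the following sketch uses a hypothetical Cart data object to show that the original is left alone while unchanged attributes are shared by reference:

```ruby
# Hypothetical data object with a nested, mutable attribute.
Cart = Data.define :label, :items

cart = Cart[label: "A", items: [:apple]]
copy = cart.with label: "B"

cart.label                    # "A" (the original is unchanged)
copy.label                    # "B"
copy.frozen?                  # true
copy.items.equal? cart.items  # true (same array, not a duplicate)
```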

Equality

Structs and data objects share the same superpower in that they are both value objects by default. Even better, data objects take this one step further since all values are immutable by default. The following illustrates this more clearly:

a = Point[x: 1, y: 2]
b = Point[x: 1, y: 2]

a == b      # true
a === b     # true
a.eql? b    # true
a.equal? b  # false
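Value equality also means data objects behave well as hash keys. Here is a short sketch, with hypothetical Pixel and Dot data objects, which also shows that equality takes the class into account and that #eql? is stricter than #== for numeric values:

```ruby
# Hypothetical data objects with identical members.
Pixel = Data.define :x, :y
Dot = Data.define :x, :y

# Equal instances produce the same hash so they work as hash keys.
colors = {Pixel[1, 2] => "red"}
colors[Pixel[x: 1, y: 2]]        # "red"

# Same members and values but a different class.
Pixel[1, 2] == Dot[1, 2]         # false

# Members are compared with #== or #eql? respectively.
Pixel[1, 2] == Pixel[1.0, 2]     # true
Pixel[1, 2].eql? Pixel[1.0, 2]   # false
```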

Pattern Matching

Pattern matching is supported by default and is identical in behavior to Struct pattern matching. One difference worth pointing out is that you can define data without any arguments which isn’t possible with a struct. Example:

module Monads
  Just = Data.define :content
  None = Data.define
end

This is handy when pattern matching:

case Monads::Just[content: "demo"]
  in Monads::Just then "Something"
  in Monads::None then "Nothing"
end

# "Something"

case Monads::None.new
  in Monads::Just then "Something"
  in Monads::None then "Nothing"
end

# "Nothing"
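Beyond matching on the class alone, you can also match on and extract member values via hash patterns. The following is a minimal sketch where the Vector data object and describe method are hypothetical:

```ruby
# Hypothetical data object and method for demonstrating hash patterns.
Vector = Data.define :x, :y

def describe(vector)
  case vector
    in {x: 0, y: 0} then "origin"
    in Vector(x:, y: 0) then "on the x-axis at #{x}"
    in Vector(x:, y:) then "at #{x}, #{y}"
  end
end

describe Vector[0, 0]  # "origin"
describe Vector[3, 0]  # "on the x-axis at 3"
describe Vector[1, 2]  # "at 1, 2"
```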

Refinements

You can refine your Data objects by using the Refinements gem especially if you’d like to obtain the differences between two instances. Example:

#! /usr/bin/env ruby
# frozen_string_literal: true

# Save as `demo`, then `chmod 755 demo`, and run as `./demo`.

require "bundler/inline"

gemfile true do
  source "https://rubygems.org"

  gem "refinements"
end

using Refinements::Data

Point = Data.define :x, :y

point_a = Point[x: 1, y: 2]
point_b = Point[x: 1, y: 3]

point_a.diff point_b
# {y: [2, 3]}

If you were to run the above script, you’d see the same output as shown in the code comments. The above is only a small taste of how you can refine your data objects. Feel free to check out the Refinements gem for details or even add it to your own projects.

Wholeable

In situations where a Data or Struct isn’t enough, you can use the Wholeable gem to turn a class into a whole value object. Example:

class Person
  include Wholeable[:name, :email]

  def initialize name:, email:
    @name = name
    @email = email
  end
end

jill = Person[name: "Jill Smith", email: "[email protected]"]
# #<Person @name="Jill Smith", @email="[email protected]">

jill.name   # "Jill Smith"
jill.email  # "[email protected]"

You only need to include Wholeable with the attributes that make up your whole value object with required and/or optional keys as desired.

Avoidances

As emphasized with the Struct class, avoid anonymous inheritance and use of constants within your data objects. In other words, don’t do the following:

class Point < Data.define(:x, :y)
end

Point.ancestors
# [Point, #<Class:0x000000010da8de88>, Data, Object, Kernel, BasicObject]

Anonymous superclasses (i.e. #<Class:0x000000010da8de88>) are wasteful and inefficient, performance-wise. Definitely refer to the original Struct article to learn more on why this is a bad practice.
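One way to avoid the extra class, sketched here with hypothetical Inherited and Direct names, is to pass a block to .define so custom methods live directly on the class which .define answers:

```ruby
# Anonymous superclass: an unnamed class sits between Inherited and Data.
class Inherited < Data.define(:x, :y)
  def sum = x + y
end

# Block form: methods are defined directly on the class answered by .define.
Direct = Data.define :x, :y do
  def sum = x + y
end

Inherited.superclass.name  # nil (anonymous)
Direct.superclass          # Data
Direct[1, 2].sum           # 3
```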

Benchmarks

In terms of performance, data objects can be faster than structs but it depends on whether you are using positional or keyword arguments. Consider the following YJIT-enabled benchmark:

#! /usr/bin/env ruby
# frozen_string_literal: true

# Save as `benchmark`, then `chmod 755 benchmark`, and run as `./benchmark`.

require "bundler/inline"

gemfile true do
  source "https://rubygems.org"
  gem "benchmark-ips"
  gem "debug"
end

Warning[:performance] = false

MAX = 1_000_000

StructDemo = Struct.new :to, :from
DataDemo = Data.define :to, :from

Benchmark.ips do |benchmark|
  benchmark.config time: 5, warmup: 2

  benchmark.report "Data" do
    MAX.times { DataDemo[to: "Mork", from: "Mindy"] }
  end

  benchmark.report "Struct" do
    MAX.times { StructDemo[to: "Mork", from: "Mindy"] }
  end

  benchmark.compare!
end

If you save the above script and run locally, you’ll get the following results:

ruby 3.4.1 (2024-12-25 revision 48d4efcb85) +YJIT +PRISM [arm64-darwin24.2.0]
Warming up --------------------------------------
                Data     1.000 i/100ms
              Struct     1.000 i/100ms
Calculating -------------------------------------
                Data      6.321 (± 0.0%) i/s  (158.21 ms/i) -     32.000 in   5.066667s
              Struct      6.143 (± 0.0%) i/s  (162.80 ms/i) -     31.000 in   5.055805s

Comparison:
                Data:        6.3 i/s
              Struct:        6.1 i/s - 1.03x  slower

💡 If you’d like more benchmarks, check out my Struct article and/or Benchmarks project for further details.
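Since the benchmark above only exercises keyword arguments, here is a rough sketch for timing positional and keyword initialization side by side without any gems. The TimedStruct and TimedData constants, the ITERATIONS count, and the elapsed helper are all hypothetical. Treat the numbers as a loose indication only since they will vary by machine and Ruby version:

```ruby
# Hypothetical constants used only for this timing sketch.
TimedStruct = Struct.new :to, :from
TimedData = Data.define :to, :from

ITERATIONS = 100_000

# Hypothetical helper: elapsed seconds for running the block ITERATIONS times.
def elapsed
  start = Process.clock_gettime Process::CLOCK_MONOTONIC
  ITERATIONS.times { yield }
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
end

timings = {
  "Data (positional)" => elapsed { TimedData["Mork", "Mindy"] },
  "Data (keyword)" => elapsed { TimedData[to: "Mork", from: "Mindy"] },
  "Struct (positional)" => elapsed { TimedStruct["Mork", "Mindy"] },
  "Struct (keyword)" => elapsed { TimedStruct[to: "Mork", from: "Mindy"] }
}

timings.each { |label, seconds| puts format("%-20s %.4fs", label, seconds) }
```

A single timed pass like this lacks the warmup and statistical rigor of benchmark-ips so prefer the full benchmark when you need results you can trust.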

Conclusion

Data objects are minimalistic and powerful in that they are only meant to hold a collection of data values which can’t be mutated. I’ve wanted an immutable value object in Ruby for some time but I’m also concerned about the design and precedent this new Data object has introduced into the language. That said, if you find yourself reaching for a struct, consider using a data object instead especially if you only need the immutable encapsulation of raw values.