Numbering Copies Using Enumerators
Copying a file in a file manager results in adding a copy counter to the file name. In this article, we’ll devise a simple and elegant algorithm based on Enumerator for doing the same in a Rails app.
Problem Statement
Imagine we’re working on a CMS and need to implement a copy page feature. Each page has a unique slug, and one of the requirements is generating a new slug for the copy. For example, copying /about-us should result in /about-us-copy-1. There’s also an easy-to-overlook case we need to address: copying a copy. A naive implementation might turn /about-us-copy-1 into /about-us-copy-1-copy-1 instead of /about-us-copy-2.
The requirements can be broken down into the following three points:
- Copying a page for the first time should result in appending -copy-1 to the slug.
- Copying a page that already has a copy should increment the copy counter by one.
- Copying a copy should increment the counter already present in the slug.
We’ll take a bottom-up approach and start with slug uniqueness.
Database Uniqueness Constraints
A uniqueness guarantee safe from race conditions needs an index to enforce the constraint at the database level. Without it, if we copy a page twice rapidly we risk creating two pages with the same slug.
We start by creating an index to enforce the constraint:
add_index :pages, :slug, unique: true
We don’t need a model validation because we won’t let uniqueness violations propagate to the user. From their perspective, the copy operation will just work.
After creating the index, we need to find out how Active Record signals uniqueness violations. An experiment in the development console indicates that it raises ActiveRecord::RecordNotUnique. Additionally, we can use #cause to access the original exception raised by the database adapter. We assume we’re using PostgreSQL, but adding support for other databases should be a breeze after the code is in place.
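The relationship between the two exceptions can be tried out without a database. The following is a minimal plain-Ruby sketch: AdapterError and RecordNotUniqueStandIn are hypothetical stand-ins (not Active Record or adapter classes) that only mimic the wrapping, relying on the fact that Ruby records the exception being rescued as #cause of the one raised in its place.

```ruby
# Hypothetical stand-ins for the adapter-level and Active Record exceptions.
class AdapterError < StandardError; end
class RecordNotUniqueStandIn < StandardError; end

# Mimics the wrapping: a low-level error is rescued and a higher-level one
# is raised instead. Ruby sets #cause on the new exception automatically.
def insert_duplicate
  raise AdapterError, 'duplicate key value violates unique constraint "index_pages_on_slug"'
rescue AdapterError
  raise RecordNotUniqueStandIn, "wrapped adapter error"
end

# Returns the rescued exception so it can be inspected.
def rescued_error
  insert_duplicate
rescue RecordNotUniqueStandIn => e
  e
end

error = rescued_error
puts error.class          # RecordNotUniqueStandIn
puts error.cause.class    # AdapterError
puts error.cause.message
```

In the real application the outer exception would be the one raised by Active Record and the inner one would come from the database adapter.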
In order to make the code future-proof, we need to determine which uniqueness constraint was violated because we shouldn’t increase the copy counter if we violate a different uniqueness constraint. Unfortunately, it seems the only method is parsing the error message.
In PostgreSQL, error messages are available on e.cause.message and look like:
ERROR: duplicate key value violates unique constraint "index_pages_on_slug"
DETAIL: Key (slug)=(about-us) already exists.
We’ll add a private method to Page that extracts the constraint name from the exception message:
CONSTRAINT_NAME_REGEXP = %r{\AERROR: duplicate key value violates unique constraint "(.*)"$}

def violated_constraint_name(exception)
  match = CONSTRAINT_NAME_REGEXP.match(exception.cause.message)
  match && match[1]
end
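The extraction can be checked in isolation against a sample message. The sketch below is a variant, not the method above: constraint_name_from and TOLERANT_CONSTRAINT_NAME_REGEXP are hypothetical names, the helper takes the raw message instead of an exception, and the pattern accepts one or more whitespace characters after "ERROR:". That tolerance is an assumption: the adapter's real message may contain more than one space there, so it is worth comparing against e.cause.message in the console.

```ruby
# Variant of the constraint-name pattern that tolerates extra whitespace
# after "ERROR:" (an assumption about how the adapter formats the message).
TOLERANT_CONSTRAINT_NAME_REGEXP =
  %r{\AERROR:\s+duplicate key value violates unique constraint "(.*)"$}

# Takes the raw message rather than an exception so it is easy to try in irb.
def constraint_name_from(message)
  match = TOLERANT_CONSTRAINT_NAME_REGEXP.match(message)
  match && match[1]
end

# Sample message modelled on the one shown in the article.
sample = <<~MESSAGE
  ERROR:  duplicate key value violates unique constraint "index_pages_on_slug"
  DETAIL:  Key (slug)=(about-us) already exists.
MESSAGE

p constraint_name_from(sample)             # "index_pages_on_slug"
p constraint_name_from("ERROR: deadlock")  # nil
```

Because $ matches at the end of a line in Ruby, the pattern stops at the end of the first line and ignores the DETAIL line.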
We can now use it to implement a predicate for detecting slug index violations:
UNIQUE_INDEX_ON_SLUG = 'index_pages_on_slug'

# The method should be called from a rescue clause for ActiveRecord::RecordNotUnique.
# That's why we don't check the class of the exception.
def slug_uniqueness_violation?(exception)
  violated_constraint_name(exception) == UNIQUE_INDEX_ON_SLUG
end
We could use meta-programming to get the index name from the database at runtime but such extra complexity doesn’t seem to be worth it in this case.
Armed with these methods, we can proceed to actually generating slugs.
Generating Slugs with Enumerators
We can elegantly address all the requirements at once by using enumerators. We need to find out the original slug and turn it into an infinite sequence of copy slugs. The sequence needs to be based on the original slug in order to address requirement 3.
Let’s start with conversions between original and copy slugs. These methods operate on a single slug but we’ll use them in the enumerator:
def original_to_copy(original_slug, copy_count)
  "#{original_slug}-copy-#{copy_count}"
end

COPY_SLUG_REGEXP = %r{\A(.*)-copy-\d+\z}

def original_slug(slug)
  match = COPY_SLUG_REGEXP.match(slug)
  if match
    match[1]
  else
    slug
  end
end
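A few sample calls show how the two conversions behave. This is a self-contained sketch that repeats the definitions above (in condensed form) so it can be pasted into irb; the sample slugs are made up for illustration.

```ruby
COPY_SLUG_REGEXP = %r{\A(.*)-copy-\d+\z}

def original_to_copy(original_slug, copy_count)
  "#{original_slug}-copy-#{copy_count}"
end

def original_slug(slug)
  match = COPY_SLUG_REGEXP.match(slug)
  match ? match[1] : slug
end

p original_slug("about-us")                # "about-us"
p original_slug("about-us-copy-7")         # "about-us"
p original_slug("copy-shop")               # "copy-shop"
p original_slug("about-us-copy-1-copy-2")  # "about-us-copy-1"
p original_to_copy("about-us", 2)          # "about-us-copy-2"
```

Only the last -copy-N suffix is stripped. Note that a page whose genuine slug happens to end in -copy-N (say, hard-copy-2) would be treated as a copy of hard, which may or may not matter for a given CMS.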
These methods allow us to implement #copy_slugs as:
def copy_slugs(slug)
  slug = original_slug(slug)
  Enumerator.new do |slugs|
    (1..).each do |count|
      slugs << original_to_copy(slug, count)
    end
  end
end
Notice the code uses endless ranges added in Ruby 2.6. In earlier versions, we’d need to use loop and increment count ourselves. The enumerator is an infinite sequence of slugs of the form #{original_slug}-copy-#{copy_count}.
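For reference, here is a sketch of what the loop-based variant for Ruby versions before 2.6 might look like. The name copy_slugs_with_loop is hypothetical, and the helper inlines the slug conversions so that the block is self-contained.

```ruby
# Loop-based variant: no endless range, so the counter is advanced by hand.
def copy_slugs_with_loop(slug)
  # Strip an existing -copy-N suffix, if any (inlined original_slug).
  base = slug[/\A(.*)-copy-\d+\z/, 1] || slug
  Enumerator.new do |slugs|
    count = 1
    loop do
      slugs << "#{base}-copy-#{count}"
      count += 1
    end
  end
end

slugs = copy_slugs_with_loop("about-us-copy-3")
p slugs.first(3)  # ["about-us-copy-1", "about-us-copy-2", "about-us-copy-3"]
p slugs.next      # "about-us-copy-1"
```

The sequence is infinite, but the block only runs as far as the consumer asks: first(3) stops after three slugs, and next pulls them one at a time.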
We’re now ready to implement the copy operation.
Tying it All Together
The last step is using our newly created methods when copying a page. To create a copy, we duplicate the model, take the first slug from the sequence and save. If it succeeds then we’re done. If it violates the slug uniqueness constraint then we retry with the next slug from the sequence.
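def copy
  copied_page = dup
  copy_slugs(slug).each do |copy_slug|
    copied_page.update!(slug: copy_slug)
    break # saved successfully, so stop consuming the infinite sequence
  rescue ActiveRecord::RecordNotUnique => e
    # Retry with the next slug only when the slug index was violated.
    raise unless slug_uniqueness_violation?(e)
  end
  copied_page
end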
One downside of this approach is that we always start with copy-1 even if it already exists. We could try finding the highest-numbered copy in the database and start from there, but this would complicate the code. Assuming the copy feature is seldom used and there are at most a few copies at a time, our implementation is a good balance between performance and clarity.
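The retry loop, including the cost of always starting at copy-1, can be exercised without a database. In this sketch FakeSlugIndex and SlugTaken are hypothetical stand-ins for the unique index and the exception it triggers, and copy_slug_for is an illustrative helper rather than part of Page. It assumes Ruby 2.6 or newer.

```ruby
require "set"

# Hypothetical stand-in for the exception raised on a uniqueness violation.
class SlugTaken < StandardError; end

# Hypothetical stand-in for the unique index on pages.slug.
class FakeSlugIndex
  def initialize(*slugs)
    @slugs = Set.new(slugs)
  end

  # Raises, like a unique index would, when the slug is already present.
  def insert!(slug)
    raise SlugTaken, slug unless @slugs.add?(slug)
    slug
  end
end

def copy_slugs(slug)
  base = slug[/\A(.*)-copy-\d+\z/, 1] || slug
  Enumerator.new do |slugs|
    (1..).each { |count| slugs << "#{base}-copy-#{count}" }
  end
end

# Tries candidate slugs in order until the index accepts one.
def copy_slug_for(slug, index)
  copy_slugs(slug).each do |candidate|
    return index.insert!(candidate)
  rescue SlugTaken
    next
  end
end

index = FakeSlugIndex.new("about-us", "about-us-copy-1")
p copy_slug_for("about-us", index)         # "about-us-copy-2"
p copy_slug_for("about-us-copy-1", index)  # "about-us-copy-3"
```

Each existing copy costs one failed insert before a free slug is found, which is the trade-off described above.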
We’re almost done! The last mandatory step is extracting the methods and constants we added into a separate class to avoid polluting Page. I’ll leave it as an exercise for the reader. We may also limit the number of generated slugs in order to avoid an infinite loop in production that can easily exhaust our pool of workers. To do that, we should replace copy_slugs(slug) with copy_slugs(slug).take(MAX_COPY_SLUGS).
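One detail worth keeping in mind when bounding the sequence: once the candidates are exhausted, each simply returns, so the failure has to be reported explicitly. Below is a minimal sketch under stated assumptions: the value of MAX_COPY_SLUGS, the NoFreeSlugError class, and the first_free_slug helper are all hypothetical, and an array of taken slugs stands in for the database.

```ruby
MAX_COPY_SLUGS = 3 # hypothetical limit, chosen small for the example

# Hypothetical error raised when every candidate slug is taken.
class NoFreeSlugError < StandardError; end

def copy_slugs(slug)
  base = slug[/\A(.*)-copy-\d+\z/, 1] || slug
  Enumerator.new do |slugs|
    (1..).each { |count| slugs << "#{base}-copy-#{count}" }
  end
end

# Returns the first candidate not in `taken`, giving up after MAX_COPY_SLUGS.
def first_free_slug(slug, taken)
  copy_slugs(slug).take(MAX_COPY_SLUGS).each do |candidate|
    return candidate unless taken.include?(candidate)
  end
  # Reached only when all candidates were rejected.
  raise NoFreeSlugError, "no free slug among the first #{MAX_COPY_SLUGS} copies of #{slug}"
end

p first_free_slug("about-us", ["about-us-copy-1"])  # "about-us-copy-2"
```

In the real #copy method the same idea would apply after the each block: if no slug was saved, raise instead of returning an unsaved page.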
Closing Thoughts
Generating names of copies isn’t necessarily a difficult problem but there are edge cases that can result in a convoluted implementation. Using database constraints and enumerators results in an elegant solution.