Ancestry is a gem/plugin that allows the records of a Ruby on Rails ActiveRecord model to be organised as a tree structure (or hierarchy). It uses a single, intuitively formatted database column, using a variation on the materialised path pattern. It exposes all the standard tree structure relations (ancestors, parent, root, children, siblings, descendants) and all of them can be fetched in a single sql query. Additional features are STI support, scopes, depth caching, depth constraints, easy migration from older plugins/gems, integrity checking, integrity restoration, arrangement of (sub)tree into hashes and different strategies for dealing with orphaned records.
To apply Ancestry to any ActiveRecord model, follow these simple steps:
-
Install
-
Rails 2
-
Add to config/environment.rb: config.gem ‘ancestry’
-
Install required gems: sudo rake gems:install
-
-
Rails 3
-
Add to Gemfile: gem ‘ancestry’
-
Install required gems: bundle install
-
-
-
Add ancestry column to your table
-
Create migration: rails g migration add_ancestry_to_[table] ancestry:string
-
Add index to migration: add_index [table], :ancestry (UP) / remove_index [table], :ancestry (DOWN)
-
Migrate your database: rake db:migrate
-
-
Add ancestry to your model
-
Add to app/models/.rb: has_ancestry
-
Your model is now a tree!
In version 1.2.0 the acts_as_tree method was renamed to has_ancestry in order to allow usage of both the acts_as_tree gem and the ancestry gem in a single application. To not break backwards compatibility, the has_ancestry method is aliased with acts_as_tree if ActiveRecord::Base does not respond to acts_as_tree. acts_as_tree will continue to be supported in the future as I personally prefer it.
You can use the parent attribute to organise your records into a tree. If you have the id of the record you want to use as a parent and don’t want to fetch it, you can also use parent_id. Like any virtual model attributes, parent and parent_id can be set using parent= and parent_id= on a record or by including them in the hash passed to new, create, create!, update_attributes and update_attributes!. For example:
TreeNode.create! :name => 'Stinky', :parent => TreeNode.create!(:name => 'Squeeky')
You can also create children through the children relation on a node:
node.children.create :name => 'Stinky'
To navigate an Ancestry model, use the following methods on any instance / record:
parent Returns the parent of the record, nil for a root node parent_id Returns the id of the parent of the record, nil for a root node root Returns the root of the tree the record is in, self for a root node root_id Returns the id of the root of the tree the record is in is_root? Returns true if the record is a root node, false otherwise ancestor_ids Returns a list of ancestor ids, starting with the root id and ending with the parent id ancestors Scopes the model on ancestors of the record path_ids Returns a list the path ids, starting with the root id and ending with the node's own id path Scopes model on path records of the record children Scopes the model on children of the record child_ids Returns a list of child ids has_children? Returns true if the record has any children, false otherwise is_childless? Returns true is the record has no childen, false otherwise siblings Scopes the model on siblings of the record, the record itself is included sibling_ids Returns a list of sibling ids has_siblings? Returns true if the record's parent has more than one child is_only_child? Returns true if the record is the only child of its parent descendants Scopes the model on direct and indirect children of the record descendant_ids Returns a list of a descendant ids subtree Scopes the model on descendants and itself subtree_ids Returns a list of all ids in the record's subtree depth Return the depth of the node, root nodes are at depth 0
The has_ancestry methods supports the following options:
:ancestry_column Pass in a symbol to store ancestry in a different column :orphan_strategy Instruct Ancestry what to do with children of a node that is destroyed: :destroy All children are destroyed as well (default) :rootify The children of the destroyed node become root nodes :restrict An AncestryException is raised if any children exist :cache_depth Cache the depth of each node in the 'ancestry_depth' column (default: false) If you turn depth_caching on for an existing model: - Migrate: add_column [table], :ancestry_depth, :integer, :default => 0 - Build cache: TreeNode.rebuild_depth_cache! :depth_cache_column Pass in a symbol to store depth cache in a different column :cache_descendants_count Cache the number of descendants a node has in the 'descendants_count' column (default: false) If you turn descendants_count caching on for an existing model: - Migrate: add_column [table], :descendants_count, :integer, :default => 0, :null => false - Build cache: Model.rebuild_descendants_count_cache! :descendants_count_column Pass in a symbol to store descendants_count in a different column :primary_key_format Supply a regular expression that matches the format of your primary key. By default, primary keys only match integers ([0-9]+).
Where possible, the navigation methods return scopes instead of records, this means additional ordering, conditions, limits, etc. can be applied and that the result can be either retrieved, counted or checked for existence. For example:
node.children.exists?(:name => 'Mary') node.subtree.all(:order => :name, :limit => 10).each do; ...; end node.descendants.count
For convenience, a couple of named scopes are included at the class level:
roots Root nodes ancestors_of(node) Ancestors of node, node can be either a record or an id children_of(node) Children of node, node can be either a record or an id descendants_of(node) Descendants of node, node can be either a record or an id subtree_of(node) Subtree of node, node can be either a record or an id siblings_of(node) Siblings of node, node can be either a record or an id
Thanks to some convenient rails magic, it is even possible to create nodes through the children and siblings scopes:
node.children.create node.siblings.create! TestNode.children_of(node_id).new TestNode.siblings_of(node_id).create
When descendants_count caching is enabled (see has_ancestry options), you have one additional provided scope:
has_descendants Return nodes that have descendants
This is useful for when you want to know which nodes have descendants. You can even sort by descendants_count to see which node has the most, or least, number of descendants.
When depth caching is enabled (see has_ancestry options), five more named scopes can be used to select nodes on their depth:
before_depth(depth) Return nodes that are less deep than depth (node.depth < depth) to_depth(depth) Return nodes up to a certain depth (node.depth <= depth) at_depth(depth) Return nodes that are at depth (node.depth == depth) from_depth(depth) Return nodes starting from a certain depth (node.depth >= depth) after_depth(depth) Return nodes that are deeper than depth (node.depth > depth)
The depth scopes are also available through calls to descendants, descendant_ids, subtree, subtree_ids, path and ancestors. In this case, depth values are interpreted relatively. Some examples:
node.subtree(:to_depth => 2) Subtree of node, to a depth of node.depth + 2 (self, children and grandchildren) node.subtree.to_depth(5) Subtree of node to an absolute depth of 5 node.descendants(:at_depth => 2) Descendant of node, at depth node.depth + 2 (grandchildren) node.descendants.at_depth(10) Descendants of node at an absolute depth of 10 node.ancestors.to_depth(3) The oldest 4 ancestors of node (its root and 3 more) node.path(:from_depth => -2) The node's grandparent, parent and the node itself node.ancestors(:from_depth => -6, :to_depth => -4) node.path.from_depth(3).to_depth(4) node.descendants(:from_depth => 2, :to_depth => 4) node.subtree.from_depth(10).to_depth(12)
Please note that depth constraints cannot be passed to ancestor_ids and path_ids. The reason for this is that both these relations can be fetched directly from the ancestry column without performing a database query. It would require an entirely different method of applying the depth constraints which isn’t worth the effort of implementing. You can use ancestors(depth_options).map(&:id) or ancestor_ids.slice(min_depth..max_depth) instead.
Ancestry works fine with STI. Just create a STI inheritance hierarchy and build an Ancestry tree from the different classes/models. All Ancestry relations that where described above will return nodes of any model type. If you do only want nodes of a specific subclass you’ll have to add a condition on type for that.
Ancestry can arrange an entire subtree into nested hashes for easy navigation after retrieval from the database. TreeNode.arrange could for example return:
{ #<TreeNode id: 100018, name: "Stinky", ancestry: nil> => { #<TreeNode id: 100019, name: "Crunchy", ancestry: "100018"> => { #<TreeNode id: 100020, name: "Squeeky", ancestry: "100018/100019"> => {} } } }
The arrange method also works on a scoped class, for example:
TreeNode.find_by_name('Crunchy').subtree.arrange
The arrange method takes ActiveRecord find options. If you want your hashes to be ordered, you should pass the order to the arrange method instead of to the scope. This also works for Ruby 1.8 since an OrderedHash is returned. For example:
TreeNode.find_by_name('Crunchy').subtree.arrange(:order => :name)
If you just want to sort an array of nodes as if you were traversing them in preorder, you can use the sort_by_ancestry class method:
TreeNode.sort_by_ancestry(array_of_nodes)
Note that since materialised path trees don’t support ordering within a rank, the order of siblings depends on their order in the original array.
Most current tree plugins use a parent_id column (has_ancestry, awesome_nested_set, better_nested_set, acts_as_nested_set). With ancestry its easy to migrate from any of these plugins, to do so, use the build_ancestry_from_parent_ids! method on your ancestry model. These steps provide a more detailed explanation:
-
Add ancestry column to your table
-
Create migration: rails g migration add_ancestry_to_[table] ancestry:string
-
Add index to migration: add_index [table], :ancestry (UP) / remove_index [table], :ancestry (DOWN)
-
Migrate your database: rake db:migrate
-
-
Remove old tree plugin or gem and add in Ancestry
-
Remove plugin: rm -Rf vendor/plugins/[old plugin]
-
Remove gem config line from environment.rb: config.gem [old gem]
-
Add Ancestry to environment.rb: config.gem :ancestry
-
See ‘Installation’ for more info on installing and configuring gems
-
-
Change your model
-
Remove any macros required by old plugin/gem from app/models/.rb
-
Add to app/models/.rb: has_ancestry
-
-
Generate ancestry columns
-
In ‘./script.console’: [model].build_ancestry_from_parent_ids!
-
Make sure it worked ok: [model].check_ancestry_integrity!
-
-
Change your code
-
Most tree calls will probably work fine with ancestry
-
Others must be changed or proxied
-
Check if all your data is intact and all tests pass
-
-
Drop parent_id column:
-
Create migration: rails g migration remove_parent_id_from_[table]
-
Add to migration: remove_column [table], :parent_id (UP) / add_column [table], :parent_id, :integer (DOWN)
-
Migrate your database: rake db:migrate
-
I don’t see any way Ancestry tree integrity could get compromised without explicitly setting cyclic parents or invalid ancestry and circumventing validation with update_attribute, if you do, please let me know.
Ancestry includes some methods for detecting integrity problems and restoring integrity just to be sure. To check integrity use: [Model].check_ancestry_integrity!. An AncestryIntegrityException will be raised if there are any problems. You can also specify :report => :list to return an array of exceptions or :report => :echo to echo any error messages. To restore integrity use: [Model].restore_ancestry_integrity!.
For example, from IRB:
>> stinky = TreeNode.create :name => 'Stinky' $ #<TreeNode id: 1, name: "Stinky", ancestry: nil> >> squeeky = TreeNode.create :name => 'Squeeky', :parent => stinky $ #<TreeNode id: 2, name: "Squeeky", ancestry: "1"> >> stinky.update_attribute :parent, squeeky $ true >> TreeNode.all $ [#<TreeNode id: 1, name: "Stinky", ancestry: "1/2">, #<TreeNode id: 2, name: "Squeeky", ancestry: "1/2/1">] >> TreeNode.check_ancestry_integrity! !! Ancestry::AncestryIntegrityException: Conflicting parent id in node 1: 2 for node 1, expecting nil >> TreeNode.restore_ancestry_integrity! $ [#<TreeNode id: 1, name: "Stinky", ancestry: 2>, #<TreeNode id: 2, name: "Squeeky", ancestry: nil>]
Additionally, if you think something is wrong with your depth cache:
>> TreeNode.rebuild_depth_cache!
The Ancestry gem comes with a unit test suite consisting of about 1800 assertions in about 30 tests. It takes about 10 seconds to run on sqlite. To run it yourself check out the repository from GitHub, copy test/database.example.yml to test/database.yml and type ‘rake’. You can pass rake style options for ActiveRecord version to test against (e.g. ar=3.0.1) and database to test against (e.g. db=mysql). The test suite is located in test/has_ancestry_test.rb.
As can be seen in the previous section, Ancestry stores a path from the root to the parent for every node. This is a variation on the materialised path database pattern. It allows Ancestry to fetch any relation (siblings, descendants, etc.) in a single sql query without the complicated algorithms and incomprehensibility associated with left and right values. Additionally, any inserts, deletes and updates only affect nodes within the affected node’s own subtree.
In the example above, the ancestry column is created as a string. This puts a limitation on the depth of the tree of about 40 or 50 levels, which I think may be enough for most users. To increase the maximum depth of the tree, increase the size of the string that is being used or change it to a text to remove the limitation entirely. Changing it to a text will however decrease performance because an index cannot be put on the column in that case.
The materialised path pattern requires Ancestry to use a ‘like’ condition in order to fetch descendants. This should not be particularly slow however since the the condition never starts with a wildcard which allows the DBMS to use the column index. If you have any data on performance with a large number of records, please drop me line.
The latest version of ancestry is recommended. The three numbers of each version numbers are respectively the major, minor and patch versions. We started with major version 1 because it looks so much better and ancestry was already quite mature and complete when it was published. The major version is only bumped when backwards compatibility is broken. The minor version is bumped when new features are added. The patch version is bumped when bugs are fixed.
-
Version 1.2.4 (2011-4-22)
-
Prepended table names to column names in queries (thx raelik)
-
Better check to see if acts_as_tree can be overloaded (thx jims)
-
Performance inprovements (thx kueda)
-
-
Version 1.2.3 (2010-10-28)
-
Fixed error with determining ActiveRecord version
-
Added option to specify :primary_key_format (thanks goes to rolftimmermans)
-
-
Version 1.2.2 (2010-10-24)
-
Fixed all deprecation warnings for rails 3.0.X
-
Added :report option to check_ancestry_integrity!
-
Changed ActiveRecord dependency to 2.2.2
-
Tested and fixed for ruby 1.8.7 and 1.9.2
-
Changed usage of update_attributes to update_attribute to allow ancestry column protection
-
-
Version 1.2.0 (2009-11-07)
-
Removed some duplication in has_ancestry
-
Cleaned up plugin pattern according to yehudakatz.com/2009/11/12/better-ruby-idioms/
-
Moved parts of ancestry into seperate files
-
Made it possible to pass options into the arrange method
-
Renamed acts_as_tree to has_ancestry
-
Aliased has_ancestry as acts_as_tree if acts_as_tree is available
-
Added subtree_of scope
-
Updated ordered_by_ancestry scope to support Microsoft SQL Server
-
Added empty hash as parameter to exists? calls for older ActiveRecord versions
-
-
Version 1.1.4 (2009-11-07)
-
Thanks to a patch from tom taylor, Ancestry now works with different primary keys
-
-
Version 1.1.3 (2009-11-01)
-
Fixed a pretty bad bug where several operations took far too many queries
-
-
Version 1.1.2 (2009-10-29)
-
Added validation for depth cache column
-
Added STI support (reported broken)
-
-
Version 1.1.1 (2009-10-28)
-
Fixed some parentheses warnings that where reported
-
Fixed a reported issue with arrangement
-
Fixed issues with ancestors and path order on postgres
-
Added ordered_by_ancestry scope (needed to fix issues)
-
-
Version 1.1.0 (2009-10-22)
-
Depth caching (and cache rebuilding)
-
Depth method for nodes
-
Named scopes for selecting by depth
-
Relative depth options for tree navigation methods:
-
ancestors
-
path
-
descendants
-
descendant_ids
-
subtree
-
subtree_ids
-
-
Updated README
-
Easy migration from existing plugins/gems
-
acts_as_tree checks unknown options
-
acts_as_tree checks that options are hash
-
Added a bang (!) to the integrity functions
-
Since these functions should only be used from ./script/console and not from your application, this change is not considered as breaking backwards compatibility and the major version wasn’t bumped.
-
-
Updated install script to point to documentation
-
Removed rails specific init
-
Removed uninstall script
-
-
Version 1.0.0 (2009-10-16)
-
Initial version
-
Tree building
-
Tree navigation
-
Integrity checking / restoration
-
Arrangement
-
Orphan strategies
-
Subtree movement
-
Named scopes
-
Validations
-
I will try to keep Ancestry up to date with changing versions of Rails and Ruby and also with any bug reports I might receive. I will implement new features on request as I see fit. One thing I definitely want to do soon is some proper performance testing.
Bug report? Faulty/incomplete documentation? Feature request? Please post an issue on ‘github.com/stefankroes/ancestry/issues’. Please also contact me at s.a.kroesgmail.com if it’s urgent.
Question? Contact me at s.a.kroesgmail.com, make sure you read the documentation.
Copyright © 2009 Stefan Kroes, released under the MIT license