Working with trees

Iterating over a tree sequence returns instances of tskit::Tree. This type is immutable. No access is provided to the underlying pointers.

The API provides a set of iterators over the data in a tree.

For example, to collect the siblings of each node into a vector:

    // This is an enum defining supported
    // traversal orders through a Tree.
    use tskit::NodeTraversalOrder;
    while let Some(tree) = tree_iterator.next() {
        for node in tree.traverse_nodes(NodeTraversalOrder::Preorder) {
            if let Some(parent) = tree.parent(node) {
                // Collect the siblings of node into a Vec
                // The children function returns another iterator
                let _siblings = tree
                    .children(parent)
                    .filter(|child| child != &node)
                    .collect::<Vec<_>>();
            }
        }
    }

We may do the same calculation using API elements giving &[NodeId] (slices of node ids). This method more closely matches the tskit-c API.

    while let Some(tree) = tree_iterator.next() {
        let parents = tree.parent_array();
        let rsibs = tree.right_sib_array();
        let lchildren = tree.left_child_array();
        for node in tree.traverse_nodes(NodeTraversalOrder::Preorder) {
            let mut siblings = vec![];
            assert!(!node.is_null());
            if let Some(parent) = parents.get(usize::try_from(node).unwrap()) {
                if !parent.is_null() {
                    if let Some(child) = lchildren.get(usize::try_from(*parent).unwrap()) {
                        let mut u = *child;
                        while !u.is_null() {
                            if u != node {
                                siblings.push(u);
                            }
                            if let Some(sib) = rsibs.get(usize::try_from(u).unwrap()) {
                                u = *sib;
                            }
                        }
                    }
                }
            }
        }
    }

This approach is more complex:

  • Slice element access is via usize (size_t in C/C++).
  • Row ids are i32 (tsk_id_t in tskit-c) behind the newtypes.
  • tskit implements TryFrom to help you out, forcing you to reckon with the fact that conversion from i32 to usize is fallible.

These conversion semantics require us to manually handle all possible error paths at each step.

We can have an intermediate level of complexity using getters from the tree arrays:

    while let Some(tree) = tree_iterator.next() {
        for node in tree.traverse_nodes(NodeTraversalOrder::Preorder) {
            let mut siblings = vec![];
            if let Some(parent) = tree.parent(node) {
                if let Some(child) = tree.left_child(parent) {
                    let mut u = child;
                    while !u.is_null() {
                        if u != node {
                            siblings.push(u);
                        }
                        if let Some(sib) = tree.right_sib(u) {
                            u = sib;
                        }
                    }
                }
            }
        }
    }