Julien Moumné Software Engineer, Paris, France

Building Trees using a JavaScript DSL

An instructional sequence to a hack-free internal DSL for building trees in JavaScript

In vanilla JavaScript, this is how you build a tree12:

JS Bin on jsbin.com

The article describes a way to build trees using an internal DSL:

tree('A', () => {      
  tree('B', () => {    
    tree('C')    
    tree('D')    
  })  
  tree('E')
  // add nodes from a dynamically created array
  moreNodes.forEach(e => tree(e))
})

This syntax mirrors the recursive structure of trees, reduces the noise and provides the ability to mixin arbitrary code.

This makes it convenient to:

  • specify hierarchical configuration data as in Mocha or Hotshell,
  • provide test data for algorithms on tree-like structures

Concepts found in this article are neither new nor restrained to JavaScript3.

This article aims to provide an instructional sequence to a hack-free solution in JavaScript.

Readers can jump to the final solution or step through this sequence: 

Naive Solution

In this solution nodes are all children of the same parent node.

JS Bin on jsbin.com

This helps us understand the key challenge, keeping track of the node under construction so children are inserted at the right location.

Forwarding a bare variable

The most straightforward solution is to explicitly pass along the parent node.

JS Bin on jsbin.com

Coupling data and operations OOP style

An alternative solution in par with OOP promotes the global tree() function to an instance method.

JS Bin on jsbin.com

This approach is used by jbuilder. See jbuilder.js.

Because we have only one operation available on the tree we can remove the need to prefix calls to add().

This is done by programmatically binding this using bind().

JS Bin on jsbin.com

This is closer to the DSL we are looking for but we still have to pass around a parameter in each closure signature.

Implicit context using this

One approach to achieve parameterless closures is to use the implicit formal parameter this.

Forwarding a bare variable using this

Solution ‘Forwarding a bare variable’ can be adapted by programmatically binding this using apply() and forwarding it the same way ctx was being forwarded.

JS Bin on jsbin.com

In this case and the one that follows, setting this dynamically does not work with arrow functions. We must revert to standard function definitions function () { body }.

Coupling data and operations OOP style using this

Solution ‘Coupling data and operations OOP style’ can be rewritten the same way.

JS Bin on jsbin.com

Coming from other OOP languages one could hope to directly call tree() without prefixing it with this thus achieving our target DSL.

In JavaScript, without hacks, it is mandatory to explicitly reference instance members using this.

Hacking a way through the forest

Solution ‘Coupling data and operations OOP style using this can be hacked to remove the need to prefix calls to tree() with this.

The scope chain must be altered using with().

Usage of this method is usually not recommended. Whether it is to be considered bad practice or not, we will see that this solution presents serious limitations.

Ideally, we would apply with() this way:

function tree(value, closure = () => {}) {
  let newTree = {
    value: value,
    forest: [],
    tree: tree
  }      
  this.forest.push(newTree)   
           
  // in this block of code every property is first looked-up in 'newTree'
  with(newTree){
    // an example of a directly accessed property
    console.log(value)
    // call the closure wishing calls to function 'tree()' will resolve to 'newTree.tree()'
    closure()
  };
}

Calling the closure will not work as expected, properties will not be resolved on newTree. This is because JavaScript’s closures are lexically scoped and newTree was not lexically present when the closure was defined.

We must reset the lexical scope by re-interpreting the closure using eval().

JS Bin on jsbin.com

The DSL finally looks like what we had in mind.

However, there are two issues:

  • functions called in the closure do not benefit from this hack, directly calling tree() within these functions will not work
tree('A', () => {  
  twoNodesFail()
  twoNodes(newTree)
})

// does not work, nodes are added at the top of the tree as 'tree()' refers to the global function
function twoNodesFail() {
  tree('1')    
  tree('2')
}

// works, but obscure, 'newTree' comes from 'with(newTree){...}'
function twoNodes(newTree) {
  newTree.tree('1')    
  newTree.tree('2')
}
tree('A', () => {  
  var commonChild = 'D'
  tree('B', () => {
    tree(commonChild) // ReferenceError: commonChild is not defined
  })
  tree('C', () => {
    tree(commonChild) // ReferenceError: commonChild is not defined
  })
})

This is how Hotshell was initially coded.

Forgo traditional calling convention

The trick to a hack-free solution lies in understanding how parameters are handled during a function call.

In stack-oriented programming languages, parameters are stored in the call stack and their life cycle is regulated by a calling convention.

We use solution ‘Forwarding a bare variable’ to illustrate the standard three step protocol.

function tree(ctx, value, closure = () => {}) {
  let newTree = {value: value, forest: []}  
  ctx.forest.push(newTree) 
  
  // 1. Push step
  //    Before calling 'closure()', a new call frame is pushed onto the stack and
  //    a reference of the local variable 'newTree' is copied in the parameter section of the call frame
  closure(newTree) // 2. Peek step
                   //    Every reference to 'newTree' in the body of 'closure()' is resolved
                   //    by accessing the parameter section of the top of the call stack
  // 3. Pop step
  //    When 'closure()' returns, the parameter section of the call stack is discarded
  //    by removing the call frame created at step 1
}

We can achieve the same result by using an explicit parameter stack and our own calling convention.

let stack = [{forest: []}]

function tree(ctx, value, closure = () => {}) {
  let newTree = {value: value, forest: []}
  
  // peak of stack holds the parent node
  stack.peek().forest.push(newTree)
  
  // simulates pushing the value to the argument section of the stack
  stack.push(newTree)
  
  // 'tree()' calls performed in 'closure()' will access the parent node
  // at the peak of the stack
  closure()
  
  // simulates popping the value from the argument section of the stack
  // to restore the argument to its previous value
  stack.pop()
}

JS Bin on jsbin.com

This approach is used in Mocha BDD interface to allow nested describe() calls. See bdd.js.

Reuse the existing stack

In a final simplification step we avoid allocating an explicit stack by using the local variable section of the existing stack.

JS Bin on jsbin.com

This approach is used in Groovy NodeBuilder and in Hotshell. See BuilderSupport.java and dslrunner.js.

Package the solution in a JavaScript module

The same solution can be coded as a module to provide namespace control and isolation of state.

JS Bin on jsbin.com

Thoughts on internal DSLs

I find internal DSLs convenient to interact with tools I need on a daily basis.

They provide legibility as well as flexibility by being able to mixin arbitrary code.

Here are some examples to illustrate the versatility of this approach: 

With the availability of embeddable interpreters I believe this approach can be generalized further.

This is the approach used in Hotshell by using Otto, an embeddable JavaScript interpreter for Go.


  1. Property forest could have been named trees, subtrees or children. Forest is used in this article to echoe the mutually recursive definition found in Mutual recursion - Wikipedia.

  2. Tree visualization code found at https://bl.ocks.org/mbostock/4339184

  3. NodeBuilder is one implementation in Groovy. Mocha is one in JavaScript.