J@ArangoDB

{ "subject" : "ArangoDB", "tags": [ "multi-model", "nosql", "database" ] }

Using Dynamic Attribute Names in AQL

On our mailing list, a frequently asked question is whether attribute names in objects returned from AQL queries can be made dynamic.

Here’s a (non-working) example query for this:

example query that does not work
FOR doc IN collection
  RETURN { doc.type : doc.value }

The intention of the above query is obviously to use the dynamic value of doc.type as an attribute name in the result object, not to create an attribute named "doc.type". This feature is probably in the top 20 of the most frequently requested features.

However, the above query won’t even parse. The AQL grammar only allows string values to the left of the colon in an object definition. Unquoted strings are allowed there too, and are implicitly turned into quoted strings. This works similarly to how object literals are defined in JavaScript:

using unquoted and quoted string attribute names
RETURN { 
  foo : "bar",
  "baz" : "qux"
}
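
For comparison, here is a small JavaScript sketch of the same rule. It only illustrates the JavaScript side of the analogy and reuses the placeholder names from the query above: both the unquoted and the quoted key end up as plain string attribute names.

```javascript
// Unquoted and quoted keys in a JavaScript object literal are both
// treated as literal strings, much like attribute names in AQL.
const result = {
  foo: "bar",
  "baz": "qux"
};

// Both attribute names are ordinary strings.
const keys = Object.keys(result);
console.log(keys); // [ 'foo', 'baz' ]
```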

Why not allow arbitrary expressions to the left of the colon? The reason is simple: it would cause ambiguity and probably have side-effects. For example, have a look at the following query:

which attribute name to use here?
FOR doc IN collection
  LET type = doc.type
  RETURN { type : doc.value }

If the type attribute name inside the object definition is interpreted as a string literal, as it currently is in AQL (and always has been), then the resulting attribute name is just "type".

If the type attribute name were instead interpreted as an expression, it would get the value that was assigned to the variable type by the LET statement. Removing the LET from the query would change the attribute name in the result back to the string literal "type".
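
JavaScript resolves this case the same way AQL currently does. The following sketch uses a hypothetical sample object in place of a document from the collection and shows that the key stays the literal string "type", even though a variable of that name is in scope:

```javascript
// Hypothetical document, standing in for one element of the collection.
const sampleDoc = { type: "fruit", value: "apple" };

// A variable named `type` is in scope, like the LET in the query above.
const type = sampleDoc.type;

// The key is still the literal string "type", not the variable's value.
const literalKey = { type: sampleDoc.value };

console.log(literalKey); // { type: 'apple' }
```

Removing the `const type` line would not change the result, which is exactly the property that would be lost if the name were evaluated as an expression.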

The ambiguity could be resolved by telling the parser what to do in such cases. While this could technically work, I think it would have too many unintended side-effects. As mentioned, introducing a LET statement into the query would change the attribute name in the result. The same could happen if a collection named type was added to the query. It would also break compatibility with existing queries.

JavaScript has the same problem, and it has not been solved portably yet. However, there is a proposal for ES6 that suggests enclosing attribute name expressions in [ and ].

To me, this looks like a good solution for the problem. It’s two bytes more when keying in queries, but the syntax is easy and explicit. There are no ambiguities.
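
The bracket syntax has since been standardized in ECMAScript 2015 as computed property names. As a rough illustration of how it removes the ambiguity, the sketch below builds objects with and without the brackets; the sample object and variable names are made up for this example.

```javascript
// Hypothetical sample data for illustration only.
const sample = { type: "fruit", value: "apple" };

// Without brackets: the key is the literal string "type".
const literal = { type: sample.value };

// With brackets: the expression is evaluated and its result becomes the key.
const computed = { [sample.type]: sample.value };

// Any expression that yields a string can be used inside the brackets.
const derived = { ["test" + 1]: 1 };

console.log(literal);  // { type: 'apple' }
console.log(computed); // { fruit: 'apple' }
console.log(derived);  // { test1: 1 }
```

Because the brackets are required for the dynamic form, existing object literals keep their meaning.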

I prototyped this solution for AQL, so I could write:

query using dynamic attribute names
FOR i IN 1..5 
  RETURN { 
    [ CONCAT('test', i) ] : i, 
    [ SUBSTITUTE(CONCAT('i is ', (i <= 3 ? 'small' : 'not small')), { ' ' :  '_' } ) ] : i 
  }

[
  { 
    "test1" : 1, 
    "i_is_small" : 1 
  }, 
  { 
    "test2" : 2, 
    "i_is_small" : 2 
  }, 
  { 
    "test3" : 3, 
    "i_is_small" : 3 
  }, 
  { 
    "test4" : 4, 
    "i_is_not_small" : 4 
  }, 
  { 
    "test5" : 5, 
    "i_is_not_small" : 5 
  } 
]

I ran a few queries with this, and they seemed to work. However, I haven’t committed the feature yet. There might still be cases in which it doesn’t work. Tests for the feature are also still missing. I hope I can finalize the implementation soon so it becomes available in some release.

UPDATE: tests have been added, and the feature has been committed in devel. It has been included in ArangoDB since version 2.5.

Everyone is welcome to try it out already!