J@ArangoDB

{ "subject" : "ArangoDB", "tags": [ "multi-model", "nosql", "database" ] }

AQL Improvements for 2.7

With ArangoDB 2.6 being in beta already, it’s time to look at some features scheduled for 2.7. Today I’ll showcase a few AQL parser improvements that are present in the devel branch already, which will be the foundation for the 2.7 release.

Star operator

The already existing star operator ([*]) is much more flexible in 2.7 than in previous ArangoDB versions. It now allows filtering the values it iterates over, and optional projections.

These features will be demonstrated using the following example members data:

example data for queries
1
2
3
4
5
6
[
  { "name" : "sir alfred", "age" : 60, "likes" : [ "lettuce", "tortoises" ] },
  { "name" : "mozquito", "age" : 15, "likes" : [ "skateboards", "music" ] },
  { "name" : "murphy", "age" : 28, "likes" : [ "juice", "tarts", "cakes" ] },
  { "name" : "helga", "age" : 52, "likes" : [ "home", "garden", "tortoises", "cakes" ] }
]

To return all members with an age of 40 or higher and that also like tortoises, we can now combine the star operator with an inline FILTER expression:

star operator with inline FILTER
1
RETURN members[* FILTER CURRENT.age >= 40 && "tortoises" IN CURRENT.likes]

The inline FILTER expression has access to the current array element via the pseudo-variable CURRENT. The FILTER expression can thus access the element’s attributes and sub-attributes, and also use them in function expressions.

The above query will return only two array members as expected:

1
2
3
4
[
  { "name" : "sir alfred", "age" : 60, "likes" : [ "lettuce", "tortoises" ] },
  { "name" : "helga", "age" : 52, "likes" : [ "home", "garden", "tortoises", "cakes" ] }
]

It’s also possible to extract just a specific sub-attribute of each result value:

extracting just a sub-attribute
1
RETURN members[* FILTER CURRENT.age >= 40 && "tortoises" IN CURRENT.likes].name

This will return:

query result
1
2
3
4
[
  "sir alfred",
  "helga"
]

If we don’t want to return the whole match but a single attribute is not enough, the star operator can be used in combination with arbitrary projections, too.

The following query extracts the matching members a FILTER as above, but now only returns each member’s age attribute and the number of values in the member’s likes attribute:

extracting with a FILTER and a projection
1
2
3
4
RETURN members[* FILTER CURRENT.age >= 40 && "tortoises" IN CURRENT.likes RETURN { 
  name: CURRENT.name, 
  likes: LENGTH(CURRENT.likes) 
}]

This will produce the following result:

query result
1
2
3
4
[
  { "name" : "sir alfred", "likes" : 2 },
  { "name" : "helga", "likes" : 4 }
]

If only a certain number of values is required, the star operator can be combined with a LIMIT clause, too. This is useful for testing whether at least one of the array members satisfies a FILTER condition:

limiting the number of results
1
RETURN members[* FILTER "garden" IN CURRENT.likes LIMIT 1]

Overall, the star operator is now much more powerful than before, so in many queries it can replace full-blown FOR loops and subqueries when the goal simply is to iterate over sub-parts of a result.

Multi-star operator

In 2.7 there is now also a multi-star operator ([**]). This operator can be used to iterate over an array, too. In addition it will also flatten its input, so it can be used for collapsing nested array structures.

This is useful in cases where a flat result is required but the single star operator would return a nested array.

Consider this query with the single star operator:

extracting nested arrays
1
RETURN members[* FILTER CURRENT.age >= 40].likes[*]

This will produce:

result of single star query
1
2
3
4
[
  [ "lettuce", "tortoises" ],
  [ "home", "garden", "tortoises", "cakes" ]
]

To get a collapsed array, we can employ the double star operator:

extracting flattend arrays
1
RETURN members[* FILTER CURRENT.age >= 40].likes[**]

Then we’ll get:

result of double star query
1
2
3
4
5
6
7
8
[
  "lettuce",
  "tortoises",
  "home",
  "garden",
  "tortoises",
  "cakes"
]

Note: the result of this query can also be made unique using the standard AQL function UNIQUE().

The star operator in 2.7 can have any number of levels. Using it with a single star will simply iterate over the input array, using it with two stars will collapse one level of the input, using it with three stars will collapse two levels of the input etc.

Subquery result usage

While working on the AQL parser anyway, the grammar has been modified so it allows a few more types of expressions.

For example, the result of a subquery can now be used as an array and the subquery results can be accessed by position directly:

accessing subquery results by position
1
2
3
4
RETURN (
  FOR i IN [ 1, 2, 3 ] 
    RETURN i
)[0]

The trailing [0] led to a parse error in previous versions. To make this work in previous versions, the subquery result needs to be captured in an extra variable using LET before accessing the 0th member of that variable:

workaround for accessing subquery results by position
1
2
3
4
5
LET temp = (
  FOR i IN [ 1, 2, 3 ] 
    RETURN i
)
RETURN temp[0]

The parser generalizations now also allow to use the star operator directly on a subquery result and access its individual members:

expanding subquery results using star operator
1
2
3
4
5
6
7
RETURN (
  FOR member IN [ 
    { name: "sir alfred" }, 
    { name: "mozquito" }
  ]
   RETURN member
)[*].name