J@ArangoDB

{ "subject" : "ArangoDB", "tags": [ "multi-model", "nosql", "database" ] }

AQL Function Speedups

While working on the upcoming ArangoDB 2.8, we reimplemented a number of AQL functions in C++ to improve performance. AQL queries that call these functions may run faster with the new implementations.

The list below shows the AQL functions that gained a C++ implementation in 2.8. The C++-based AQL function implementations added since ArangoDB 2.5 remain available as well.

  • document-related functions: DOCUMENT, EDGES, PARSE_IDENTIFIER
  • numerical functions: ABS, FLOOR, RAND, ROUND, SQRT
  • statistical functions: MEDIAN, PERCENTILE, STDDEV_POPULATION, STDDEV_SAMPLE, VARIANCE_POPULATION, VARIANCE_SAMPLE
  • geo functions: NEAR, WITHIN
  • array functions: APPEND, FIRST, FLATTEN, LAST, MINUS, NTH, POP, POSITION, PUSH, REMOVE_NTH, REMOVE_VALUE, REMOVE_VALUES, SHIFT, UNSHIFT
  • informational functions: COLLECTIONS, CURRENT_DATABASE, FIRST_DOCUMENT, FIRST_LIST, NOT_NULL
  • object-related functions: MERGE_RECURSIVE, ZIP

Following are a few example queries that benefit from using the C++ variants of some of the above functions:

Fetching documents programmatically using the DOCUMENT function:

  • query: FOR i IN 1..10000 RETURN DOCUMENT(test, CONCAT('test', i))
  • 2.7: 0.3005 s
  • 2.8: 0.1050 s

Fetching edges programmatically using the EDGES function:

  • query: FOR i IN 1..100000 RETURN EDGES(edges, CONCAT('test/test', i), 'outbound')
  • 2.7: 4.3590 s
  • 2.8: 1.4469 s

Fetching many documents from a geo index, post-filtering most of them:

  • query: FOR doc IN WITHIN(locations, 0, 0, 100000) FILTER doc.value2 == 'test1001' LIMIT 1 RETURN doc
  • 2.7: 2.9876 s
  • 2.8: 0.4087 s

Generating random numbers:

  • query: FOR value IN 1..100000 RETURN RAND() * 50
  • 2.7: 0.1743 s
  • 2.8: 0.1364 s
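
To make the measurements above easier to compare, the following Python sketch turns each pair of timings into a speedup factor. The numbers are copied from the results above; the dictionary layout and the `speedup` helper are only illustrative and are not part of ArangoDB.

```python
# Timings in seconds as (2.7, 2.8), copied from the measurements above.
# The labels name the function each example query exercises.
timings = {
    "DOCUMENT": (0.3005, 0.1050),
    "EDGES": (4.3590, 1.4469),
    "WITHIN": (2.9876, 0.4087),
    "RAND": (0.1743, 0.1364),
}


def speedup(before, after):
    """Return how many times faster the 'after' run is than the 'before' run."""
    return before / after


# Speedup factor per example query.
speedups = {name: speedup(old, new) for name, (old, new) in timings.items()}

for name, factor in speedups.items():
    print(f"{name}: {factor:.2f}x faster in 2.8")
```

By this calculation the WITHIN query gains the most (roughly 7.3x), DOCUMENT and EDGES come out at about 3x, and the RAND query improves by roughly 1.3x.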

Please note that the speedup will not be dramatic in every case. As usual, it depends on how often a function is called inside a query and on what other constructs the query uses. Your mileage may vary.
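
One way to reason about this is Amdahl's law, a general rule of thumb that is not taken from our measurements: if only a fraction of a query's runtime is spent inside a reimplemented function, only that fraction gets faster. The Python sketch below assumes this simple model. The `overall_speedup` helper and the example fractions are hypothetical.

```python
def overall_speedup(fraction_in_function, function_speedup):
    """Estimate the whole-query speedup under Amdahl's law.

    fraction_in_function: share of the original runtime spent in the
        function (between 0.0 and 1.0); a made-up input for illustration.
    function_speedup: how many times faster the function itself became.
    """
    if not 0.0 <= fraction_in_function <= 1.0:
        raise ValueError("fraction_in_function must be between 0 and 1")
    if function_speedup <= 0:
        raise ValueError("function_speedup must be positive")
    # Runtime left untouched plus the shrunken share spent in the function.
    remaining = (1.0 - fraction_in_function) + fraction_in_function / function_speedup
    return 1.0 / remaining


# Hypothetical scenarios: the function itself is 3x faster in each one,
# but the query spends a different share of its time calling it.
for fraction in (0.1, 0.5, 0.9):
    print(f"{fraction:.0%} of runtime in function: "
          f"{overall_speedup(fraction, 3.0):.2f}x overall")
```

Under this model, a query that spends most of its time in filters, sorts, or index lookups sees little change even when the function it calls is several times faster.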