
Speeding Up Array/Object Literal Access

Last week, another optimization made it into 2.6. It can provide significant speedups for AQL queries that use huge array/object bind parameters and pass them into V8-based functions.

It started with an ArangoDB user reporting that a specific query ran unexpectedly slowly. The part of the query that caused the problem was simple and looked like this:

problematic query

FOR doc IN collection
  FILTER doc.attribute == @value
  RETURN TRANSLATE(doc.from, translations, 0)

In the original query, translations was a big, constant object literal. Think of something like the following, but with a lot more values:

example translations value

{
  "p1" : 1,
  "p2" : 2,
  "p3" : 40,
  "p4" : 9,
  "p5" : 12
}

The translations object served as a lookup table, computed outside the AQL query, for replacing an attribute value in existing documents.
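To make the lookup semantics concrete, here is a minimal JavaScript sketch of what `TRANSLATE(value, lookup, defaultValue)` does: it returns the lookup table's entry for the value, or the default when the value is not in the table. (This is an illustration of the behavior, not ArangoDB's actual implementation.)

```javascript
// Sketch of TRANSLATE(value, lookup, defaultValue) semantics:
// look the value up in the table, fall back to the default if absent.
function translate(value, lookup, defaultValue) {
  return Object.prototype.hasOwnProperty.call(lookup, value)
    ? lookup[value]
    : defaultValue;
}

const translations = { "p1": 1, "p2": 2, "p3": 40, "p4": 9, "p5": 12 };

console.log(translate("p3", translations, 0)); // 40
console.log(translate("p9", translations, 0)); // 0 (not in the table)
```

In the problematic query, this lookup runs once per matching document, with the same `translations` table each time.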

The number of values in the translations object varied from query to query, with no upper bound. A query could well run with 50,000 lookup values in the translations object.

When trying to reproduce the problem, we expected that the query would get at worst linearly slower with an increasing number of lookup values. But in reality, the following non-linear execution times were observed when increasing the number of lookup values:

execution times for varying input sizes, without optimization

# of values |  execution time
------------+----------------
          1 |        0.6111 s
          2 |        0.6078 s
          4 |        0.6021 s
          8 |        0.6160 s
         16 |        0.6925 s
         32 |        0.7107 s
         64 |        0.7677 s
        128 |        0.8576 s
        256 |        1.0544 s
        512 |        1.4579 s
       1024 |        8.8303 s
       2048 |       17.3674 s
       4096 |       35.3109 s
       8192 |       74.9161 s
      16384 |      145.0837 s
      32768 |      361.9870 s
      65536 |      880.4995 s

(note: all values stated above are wall-clock times for running the query with a FILTER condition matching 50,000 documents – i.e. the TRANSLATE() expression was executed 50,000 times per query)

With small objects passed into TRANSLATE(), execution times increased only slowly, even as object sizes were doubled: the TRANSLATE() expression's share of the overall query execution time was still low. However, things got pretty bad already at 1,024 object members, and from that point on, execution times more than doubled whenever object sizes doubled.

The TRANSLATE() function itself has O(1) complexity, so we could rule it out as the cause of the problem. However, TRANSLATE() is V8-based, and it turned out that something happened when the number of values in the translations object grew from 1022 to 1023: at that particular threshold, execution time quadrupled.

At 1023 object members, V8 seems to change the internal object format, which probably requires rearranging the object data internally. V8 has several internal types for representing JavaScript objects, and converting between them is not free.

The obvious optimization for this case was to create the translations value just once as a V8 object, and to reuse that object across the repeated TRANSLATE() calls. This avoids repeatedly creating and destroying the V8 objects used in function calls, and as a side effect it may also reduce the amount of garbage accumulated by repeated function calls.
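The idea can be sketched in JavaScript as a lazily built cache for the converted literal. This is a hypothetical illustration of the convert-once-reuse-everywhere pattern, not ArangoDB's actual (C++) implementation; the function names are made up:

```javascript
// Hypothetical sketch: convert a constant literal into a V8 object once,
// then hand out the same cached object on every subsequent call, instead
// of rebuilding it for each of the 50,000 invocations.
function makeCachedConverter(literalJson) {
  let cached = null;                      // built lazily, exactly once
  return function toV8Object() {
    if (cached === null) {
      cached = JSON.parse(literalJson);   // expensive conversion, done once
    }
    return cached;                        // later calls reuse the same object
  };
}

const getTranslations = makeCachedConverter('{"p1": 1, "p2": 2}');

console.log(getTranslations() === getTranslations()); // true: same object reused
console.log(getTranslations().p1);                    // 1
```

Because the cached object is shared, this only works if no caller ever mutates it.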

The optimization is possible here because the translations object is an object literal and thus constant. It will also work for array literals and bind parameters (which are also treated as literals once their values are known).

Here are the execution times for running TRANSLATE() on 50,000 documents with the optimization applied:

execution times, with optimization

# of values |  execution time
------------+----------------
          1 |        0.6251 s
          2 |        0.6302 s
          4 |        0.6138 s
          8 |        0.6141 s
         16 |        0.6685 s
         32 |        0.6232 s
         64 |        0.6204 s
        128 |        0.6326 s
        256 |        0.6460 s
        512 |        0.6275 s
       1024 |        0.6639 s
       2048 |        0.6345 s
       4096 |        0.6554 s
       8192 |        0.6789 s
      16384 |        0.7569 s
      32768 |        0.7636 s
      65536 |        1.0173 s

Looks like this is going to scale way better.

The optimization is disabled for big arrays/objects that are non-constant (e.g. a variable or the result of an expression), and for parameters passed into user-defined AQL functions. Enabling it for user-defined AQL functions would not be safe, because in theory they might modify their arguments. Function arguments are passed by reference, and passing them by value instead would defeat the purpose of the optimization.
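The hazard with pass-by-reference arguments can be shown in a few lines of JavaScript. This is a deliberately misbehaving, hypothetical user-defined function, not real ArangoDB code:

```javascript
// Why a shared, cached argument object is unsafe with user-defined
// functions: arguments are passed by reference, so a function that
// mutates its argument corrupts the cached value for every later call.
const cachedTranslations = { "p1": 1, "p2": 2 };

function misbehavingUdf(lookup) {
  lookup.p1 = 999;        // mutates the caller's object in place
  return lookup.p1;
}

misbehavingUdf(cachedTranslations);
console.log(cachedTranslations.p1); // 999: the "constant" literal has changed
```

Any subsequent call that received the same cached object would now see the modified value, which is why the optimization is restricted to built-in functions that are known not to touch their arguments.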