ArangoDB provides many options for finding and modifying data. Though there are several more specialized operation, data-modification AQL queries are the most general solution in ArangoDB. They allow to find documents using arbitrary filter criteria, and to modify or remove the documents once found.
Such find-and-modify operations can be executed with multiple queries (one for the find step, one for the modification step), or with a single query. Putting both steps into a single query will often save roundtrips between the application and the database and thus may be preferred over executing the steps separately. Putting both the find and the modify step into the same query also prevents other operations from interfering in between and tampering with the underlying data.
Now what if the application not only requires the data to be updated, but also needs to keep track of which documents were found and modified by a find-and-modify query? This is often required when an application needs to keep database data in sync with data in some other datastore (e.g. the filesystem or a remote service).
The pattern I would dub find-modify-return would be useful for this.
Unfortunately it hasn’t been supported in AQL until very recently. We have been asked for this so many times that I stopped counting.
I am glad that this got better with ArangoDB 2.4. Finding, modifying and returning documents is now possible from the same AQL query. The solution is not yet perfect, but at least it provides basic find-modify-return functionality for a lot of use cases, including multi-document, multi-collection queries.
Let’s start with an
INSERT operation as it is the most simple to explain:
The above query does not explicitly specify the
_key attribute for the inserted
documents, meaning the database will create the documents keys automatically.
Additionally, the query assigns a random value to an attribute
Getting to know the created keys or random values required an extra lookup query
before 2.4. Extra queries are something one wants to avoid for efficiency
With 2.4, adding two lines at the end of the query will solve the problem so that everything can be done in one query:
1 2 3 4
LET with the pseudo-value of
NEW after the
INSERT together with
RETURN statement will make the inserted documents appear in the query
result. Note that the full documents will be returned and not just the specified
attributes. This allows the application to track of all document attributes,
even the auto-generated ones.
The mechanism works for
REMOVE statements, too.
Time to use a different example for this. Let’s assume session metadata are stored in the database, and some bigger session files are stored somewhere in the filesystem. If the application needs to remove expired sessions, it will need to clean up in both places. It will first query the database to find the expired sessions, only to remove them from the filesystem (if the filesystem sessions were organized by session id from the database) and to finally remove the sessions from database itself.
The following query could be used to find removal candidates:
1 2 3 4
The results of this query can be used for cleaning up sessions in the filesystem,
and for finally removing the sessions from the database. However, this would require
REMOVE query. And didn’t we say we would like to avoid extra queries?
The good news is that in 2.4 we can by putting at least the database part into a single query:
1 2 3 4 5 6
The last two lines of the above query make sure the removed sessions are returned to the application, so the application can perform any filesystem cleanup using the session ids from the database.
Note that in case of
REMOVE one has to use the pseudo-value
REMOVE can only return documents before removal. In the case of
we can only refer to the pseudo-value
UPDATE and REPLACE
So far we saw
REMOVE, but there are also
I will handle
REPLACE in one go. All the following refers to
UPDATE, but does apply to
REPLACE as well.
The mechanism to return documents from a query is the same as
already demonstrated: simply append the
LET ... RETURN sequence to the end
of the original query.
UPDATE allows to return either the old document revisions
(before modification) or the new document revisions (after modification).
This can be expressed by using either
NEW in the final
Following is a more complex example for an
UPDATE query that aggregates data
from one collection (
phraseOccurrences) in order to find and modify matching
documents in another collection (
phrases). It will return the documents from
phrases before they got modified:
1 2 3 4 5 6 7 8 9
If we are interested in what the documents in
phrases look like with the
UPDATE applied, we can use
1 2 3 4 5 6 7 8 9
Note that the full documents will be returned here, and not just the attributes
specified in or modified by the