This week saw the completion of the AQL UPSERT
command.
This command will be very helpful in a lot of use cases, including the following:
- ensure that a document exists
- update a document if it exists, otherwise create it
- replace a document if it exists, otherwise create it
The UPSERT
command is executed on the server side and so delivers client
applications from issuing a fetch command followed by a separate, conditional UPDATE
or INSERT
command.
The general format of an UPSERT
statement is:
1 2 3 4 |
|
Following are a few example invocations of UPSERT
.
Ensure a document exists
A simple use case of UPSERT
is to ensure that a specific document exists.
For example, the following query will ensure that that there will be a document
with attribute ip
equal to 192.168.173.13
in collection hosts
:
1 2 3 4 |
|
If the document does not yet exist, the INSERT
part will be carried out. If the
document is already there, the empty UPDATE
part will be run, which will not
modify the document.
After the query has finished, there will be a document with the specified ip
value.
There is no need for client applications to check for document existence first, and then to conditionally insert or update, or to try insert first and catch errors.
Note: this is the same as ActiveRecord’s find_or_create
.
Update a document if it exists, otherwise create it
Another common use case is to check whether a specific document exists, and then update it. If it does not yet exist, the document shall be created.
Counters are a good example for this, so let’s demo this pattern with a counter, too:
1 2 3 4 |
|
The above query will again look for a document with the specified ip
attribute. If the
document exists, its counter
attribute will be increased by one. This is achieved by
referring to the pseudo-value OLD
, which in the UPDATE
case contains the previous revision
of the document.
If the search document does not yet exist, the INSERT
part will be carried out, inserting
the document and setting the initial value of counter
to 1.
Assuming the collection was empty before, running the above query once will make the collection contain this data:
1 2 3 4 5 6 7 |
|
When running the UPSERT
statement again, the collection will contain the updated document:
1 2 3 4 5 6 7 |
|
Now let’s run the query with adjusted ip
and name
values:
1 2 3 4 |
|
After that, the collection will contain two documents:
1 2 3 4 5 6 7 8 9 10 11 12 |
|
Replace a document if it exists, otherwise create it
We’ve seen UPSERT
with an INSERT
and an UPDATE
clause so far.
UPDATE
will partially update the previous revision of the document if present.
Only those attributes specified in the update-expression
will be updated, and
all non-specified attributes of the original document revision will remain
unchanged.
If instead a full replacement of the original document is required, the REPLACE
clause should be used instead of UPDATE
. REPLACE
will overwrite the previous
revision completely with what’s in update-expression
.
1 2 3 4 |
|
note: an older version of this blog post contained a wrong example here. Thanks Andy for pointing this out!
Returning documents
UPSERT
can be combined with a RETURN
statement to return either the previous
document revision (in case of UPDATE
or REPLACE
) or the new version of the
document.
Client applications can use this to check whether the UPSERT
statement has
inserted or updated the document. In case no previous revision was present, the
pseudo-value OLD
will be null
.
UPSERT
also provides a pseudo-value named NEW
containing the insert, updated or
replaced version of the document:
1 2 3 4 5 |
|
In the INSERT
case, we’ll get:
1 2 3 4 5 6 7 8 9 10 |
|
When running the query again, we’ll get into the UPDATE
case, and the same query
will now return:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 |
|
Complex updates
Updating and returning OLD
and NEW
will work with arbitrary calculations.
For example, the following query adds a value development
to the tag
attribute
only if not yet present in the search document:
1 2 3 4 5 |
|
Running the query multiple times will ensure that tags
will contain the value
development
only once.
Note: PUSH
is a regular AQL array function.
Restrictions
The search-value
needs to be an object literal, with attribute names being inspectable
at query compile time. This means that neither variables nor bind parameters can be used
for search-value
.
However, bind parameters and variables can be used inside search-value
.
Dynamic attribute names cannot be used for specifying attribute names in search-value`.
UPSERT
does not require an index to be present on the attributes of search-value
,
but in reality queries will benefit from indexes to find matching documents.
When more than one document in the collection matches search-value
, one arbitrary match
will be used for executing the UPDATE
clause. It is therefore recommended to use
UPSERT
commands together with a unique index or to make sure from the client application
that at most one document will match the search-value
. Ironically, one way to achieve this is
to use the UPSERT
command for inserts…
Availability
UPSERT
is currently available in the devel
branch of ArangoDB. This branch
will eventually become release 2.6. Until then, everyone is welcome to try it out and
provide feedback.