CRUD Document Operations Using the Python SDK with Couchbase Server
You can access documents in Couchbase using methods of the couchbase.bucket.Bucket object.
The methods for retrieving documents are get(), lookup_in(), and retrieve_in(), and the methods for mutating documents are upsert(), insert(), replace(), and mutate_in().
Examples are shown using the synchronous API. Semantics of the synchronous API are easily translatable to the Twisted and Gevent APIs.
Document input and output types
The Python SDK requires that document IDs be convertible to unicode objects.
By default, document values must be of a type that json.dumps can serialize.
See Formats and Non-JSON Documents below for more details.
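A quick local check can show whether a value meets the default requirement before it is sent to the cluster. The sketch below uses only the standard json module; is_json_serializable is a hypothetical helper name chosen for illustration and is not part of the SDK.

```python
import json


def is_json_serializable(value):
    """Return True if json.dumps accepts the value (hypothetical helper, not an SDK call)."""
    try:
        json.dumps(value)
        return True
    except (TypeError, ValueError):
        return False


# Dicts, lists, strings, numbers, booleans and None serialize cleanly.
print(is_json_serializable({'some': 'value', 'count': 3}))
# A set has no JSON representation, so it would need a different format.
print(is_json_serializable({1, 2, 3}))
```

Values that fail this check are candidates for one of the non-JSON formats rather than the default.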
Return value for CRUD operations
All Python SDK operations return a Result object (or a subclass thereof).
The result object contains general operation information and item metadata retrieved from the server.
Typically a subclass of Result will be returned which also contains operation-specific result information, such as the value field for a get() operation.
The most common fields in a Result object are:
Name | Description |
---|---|
value | For retrieval-type operations, this field contains the value of the requested key. |
rc | Contains the raw error code received from libcouchbase. If this number is zero then the operation was successful; otherwise it will be an error. The success property may be used to check whether the operation succeeded. |
success | This is a convenience property which is equivalent to rc == 0. |
cas | An opaque object representing the resulting CAS value of the key that was operated on. This value is not meant to be user facing, but should be passed directly to other operations for locking purposes. |
Note that for most of the asynchronous APIs, these objects are not returned per se, but rather passed back to callbacks. For example, with Twisted, a Deferred object is returned, and the appropriate Result or Failure object is passed into the callback or errback, respectively.
Additional options
Additional options may be specified using keyword arguments.
Update operations also accept a TTL (expiry) value (ttl), which instructs the server to delete the document after a given amount of time.
This option is useful for transient data (such as sessions).
By default, documents do not expire.
See Expiration Overview for more information on expiration.
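The effect of a TTL can be pictured with a small in-memory model. The sketch below is not the SDK and does not talk to a server: ExpiringStore is a hypothetical class, KeyError stands in for the SDK's not-found error, and treating ttl=0 as "never expires" is an assumption of this toy that mirrors the default described above. A fake clock is injected so the example runs instantly.

```python
class ExpiringStore(object):
    """Toy key-value store with per-document expiry (illustrative only)."""

    def __init__(self, clock):
        self._clock = clock   # callable returning the current time in seconds
        self._docs = {}       # key -> (value, expires_at or None)

    def upsert(self, key, value, ttl=0):
        # In this toy model ttl=0 means the document never expires.
        expires_at = self._clock() + ttl if ttl else None
        self._docs[key] = (value, expires_at)

    def get(self, key):
        value, expires_at = self._docs[key]
        if expires_at is not None and self._clock() >= expires_at:
            del self._docs[key]   # an expired document behaves as if deleted
            raise KeyError(key)
        return value


now = [0]                               # fake clock, advanced by hand
store = ExpiringStore(lambda: now[0])
store.upsert('session', 'data', ttl=5)
print(store.get('session'))
now[0] = 6                              # six seconds later
try:
    store.get('session')
except KeyError:
    print('session expired')
```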
Creating and updating full documents
Documents may be created and updated using the Bucket.upsert(), Bucket.insert(), and Bucket.replace() family of methods.
Read more about the difference between these methods at Primitive Key-Value Operations in the Couchbase developer guide.
These methods accept two mandatory arguments:
- key: The ID of the document to modify. This should be a Python string or unicode object.
- value: The desired new value of the document. This may be anything that can be serialized as JSON (other input types can also be specified, see Document input and output types).
Additional options can be specified to the operation:
- cas: The CAS value for the document. If the CAS on the server does not match the CAS supplied to the method, the operation will fail with a couchbase.exceptions.KeyExistsError error. See Concurrent Document Mutations for more information on the usage of CAS values.
- ttl: Specify the expiry time for the document. If specified, the document will expire and no longer exist after the given number of seconds. See Expiration Overview for more information.
- format: Specify the format of the new value. This indicates how the value should be serialized before being sent to the cluster. By default only JSON-serializable objects may be supplied as values. See Document input and output types.
- persist_to, replicate_to: Specify durability requirements for the operations. A value of -1 indicates that the specific requirement will be set to the maximum possible.
Upon success, the returned Result object will contain the new CAS value of the document.
If the document was not mutated successfully, an exception is raised.
See Handling Exceptions and Other Errors with the Python SDK in Couchbase for more information on exception types and how to handle them.
rv = bucket.insert('document_name', {'some': 'value'})
print rv
Output:
OperationResult<RC=0x0, Key=u'document_name', CAS=0x707339a4125aaa13>
If the document being inserted already exists, the client will raise a couchbase.exceptions.KeyExistsError.
If your application simply wants to set the value ignoring whether it exists or not, use the upsert() method.
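The difference between insert(), upsert(), and replace(), and the effect of a stale CAS, can be modelled with a small in-memory class. This is only a sketch of the semantics described above, not the SDK: ToyBucket and the two exception classes are stand-ins defined here, the CAS is a simple counter rather than an opaque server value, and a CAS of 0 is assumed to mean "none supplied".

```python
import itertools


class KeyExistsError(Exception):
    """Stand-in for couchbase.exceptions.KeyExistsError."""


class NotFoundError(Exception):
    """Stand-in for couchbase.exceptions.NotFoundError."""


class ToyBucket(object):
    """In-memory model of the full-document mutations (illustrative only)."""

    def __init__(self):
        self._docs = {}                      # key -> (value, cas)
        self._next_cas = itertools.count(1)  # every mutation gets a new CAS

    def _store(self, key, value):
        cas = next(self._next_cas)
        self._docs[key] = (value, cas)
        return cas

    def insert(self, key, value):
        # Create only: fails if the document already exists.
        if key in self._docs:
            raise KeyExistsError(key)
        return self._store(key, value)

    def upsert(self, key, value):
        # Create or overwrite, regardless of whether the document exists.
        return self._store(key, value)

    def replace(self, key, value, cas=0):
        # Overwrite only: fails if the document is missing or the CAS is stale.
        if key not in self._docs:
            raise NotFoundError(key)
        if cas and self._docs[key][1] != cas:
            raise KeyExistsError(key)
        return self._store(key, value)


bucket = ToyBucket()
first_cas = bucket.insert('doc', {'n': 1})
bucket.replace('doc', {'n': 2}, cas=first_cas)      # CAS matches, succeeds
try:
    bucket.replace('doc', {'n': 3}, cas=first_cas)  # CAS is now stale
except KeyExistsError:
    print('stale CAS rejected')
```

The stale-CAS branch is the case an application would typically handle by re-reading the document and retrying the mutation.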
Retrieving full documents
Documents may be retrieved using the Bucket.get() method.
The get() method has a single mandatory argument:
- key: The document ID to retrieve.
Other options include:
- ttl: Set the expiration for the document. This operation is known as a get-and-touch operation. See Expiration Overview for more information.
- replica: This may be passed as a boolean to issue a replica read. This may be used if the master/primary node is temporarily unavailable.
- quiet: Suppress throwing exceptions if the document does not exist. Rather than throwing an exception, status can be obtained from the Result.success property.
rv = bkt.get('maybe', quiet=True)
if rv.success:
    handle_value(rv)
else:
    print "Item not found"
Upon success, a ValueResult object is returned.
The actual document may be accessed by using the ValueResult.value property.
Additional properties may also be accessed from the returned object.
See Return value for CRUD operations.
The ValueResult.value will contain a native Python object, deserialized from JSON (or another format, per Document input and output types).
If the document does not exist (and quiet=True was not specified), a couchbase.exceptions.NotFoundError will be raised.
rv = bucket.get('document_name')
print "Result object is:", rv
print "Actual value is:", rv.value
Sample output:
Result object is: ValueResult<RC=0x0, Key=u'document_name', Value={u'some': u'value'}, CAS=0x20504a5e6a5aaa13, Flags=0x2000000>
Actual value is: {u'some': u'value'}
If the item does not exist, the client will raise a couchbase.exceptions.NotFoundError, which you can catch:
from couchbase.exceptions import NotFoundError
try:
    rv = bkt.get('NOTEXISTENT')
except NotFoundError as e:
    print "Item not found", e
Removing full documents
Documents may be removed using the Bucket.remove() method.
This method takes a single mandatory argument:
- key: The ID of the document to remove.
Some additional options:
- quiet: Do not raise an exception when attempting to remove a document which does not exist.
- cas: Only remove the document if the CAS has not changed.
Modifying expiration
Document expiration can be modified using the Bucket.touch() method.
cb.touch('document_id', ttl=5)
You can also set the ttl parameter for methods which support it:
import time
cb.upsert('expires', "i'm getting old...", ttl=5)
print cb.get('expires').value
time.sleep(6)
print cb.get('expires').value
Output:
i'm getting old...
Traceback (most recent call last):
  File "exp.py", line 10, in <module>
    print cb.get('expires').value
  File "/usr/local/lib/python2.7/site-packages/couchbase/bucket.py", line 489, in get
    replica=replica, no_format=no_format)
couchbase.exceptions._NotFoundError_0xD (generated, catch NotFoundError): <Key=u'expires', RC=0xD[The key does not exist on the server], Operational Error, Results=1, C Source=(src/multiresult.c,309)>
Atomic document modifications
Additional atomic document modifications can be performed using the Python SDK.
You can modify a counter document using the Bucket.counter() method.
You can also use the Bucket.append and Bucket.prepend methods to perform raw byte concatenation.
Batching Operations
Many operations can be batched in the Python SDK using their *_multi equivalent.
For example, to batch multiple Bucket.get() calls, you would use Bucket.get_multi().
The various *_multi operations all return a MultiResult object which acts like a dictionary: it maps each individual key to the result of the operation performed on it.
cb.upsert_multi({
    'foo': 'fooval',
    'bar': 'barval',
    'baz': 'bazval'})
for key, result in cb.get_multi(('foo', 'bar', 'baz')).items():
    print '{0}: {1.value}'.format(key, result)
Output:
baz: bazval
foo: fooval
bar: barval
You can use the Item API to pass additional per-operation options to multi methods.
Operating with sub-documents
The Sub-Document API is available starting with Couchbase Server version 4.5. See Sub-Document Operations for an overview.
Sub-document operations save network bandwidth by allowing you to specify paths of a document to be retrieved or updated.
The document is parsed on the server and only the relevant sections (indicated by paths) are transferred between client and server.
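To make the idea of a path concrete, the following sketch resolves a dotted path against a plain Python dictionary on the client. It only illustrates which part of a document a path such as path.to.get refers to: resolve_path is a hypothetical helper, not an SDK function, and it handles simple dotted keys only.

```python
def resolve_path(document, path):
    """Return (found, value) for a dotted path in a nested dict (hypothetical helper)."""
    current = document
    for part in path.split('.'):
        if not isinstance(current, dict) or part not in current:
            return False, None
        current = current[part]
    return True, current


doc = {'path': {'to': {'get': 42}}, 'other': 'large unrelated content'}
found, value = resolve_path(doc, 'path.to.get')
missing_found, _ = resolve_path(doc, 'check.path.exists')
print(found, value, missing_found)
```

With a real sub-document operation this traversal happens on the server, so the unrelated parts of the document never cross the network.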
You can execute sub-document operations in the Python SDK using the lookup_in, mutate_in, and retrieve_in methods.
Each of these methods accepts a key as its mandatory first argument, followed by one or more command specifications, each specifying an operation and a document field operand.
You may find all the operations in the couchbase.subdocument module.
import couchbase.subdocument as SD
res = cb.lookup_in('docid', SD.get('path.to.get'), SD.exists('check.path.exists'))
res = cb.mutate_in('docid', SD.upsert('path.to.upsert', value, create_parents=True), SD.remove('path.to.del'))
For simply retrieving a list of paths, you may use the retrieve_in convenience method:
res = cb.retrieve_in('docid', 'path1', 'path2', 'path3')
All sub-document operations return a special SubdocResult object which is a subclass of Result.
In contrast with a normal Result object, a SubdocResult object contains multiple results with multiple statuses, one result/status pair for every input operation.
You can access an individual result/status pair by addressing the SubdocResult object as a mapping, and then using either the index position or the path of the operation as the key:
res = cb.lookup_in('docid', SD.get('foo'), SD.exists('bar'), SD.exists('baz'))
# First result
res['foo']
# or
res[0]
Using the [] (__getitem__) functionality will raise an exception if the individual operation did not complete successfully.
You can also use SubdocResult.get() to return a tuple of (errcode, value).
Formats and Non-JSON Documents
See Non-JSON Documents for a general overview of using non-JSON documents with Couchbase.
All Python objects which can be represented as JSON may be passed unmodified to a storage function, and be received via the get method without any additional modifications.
You can modify the default JSON encoders used by the Python SDK using the couchbase.set_json_converters function.
This function accepts a pair of encode and decode functions which are expected to behave similarly to json.dumps and json.loads respectively.
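As a sketch of what such a pair might look like, the functions below wrap json.dumps and json.loads so that date and datetime values are written as ISO 8601 strings. The names encode_with_dates and decode_plain are hypothetical, and the registration call is shown only as a comment because it needs the SDK installed; the rest runs with the standard library alone.

```python
import datetime
import json


def encode_with_dates(obj):
    """json.dumps-like encoder that writes dates and datetimes as ISO 8601 strings."""
    def fallback(value):
        if isinstance(value, (datetime.datetime, datetime.date)):
            return value.isoformat()
        raise TypeError('Cannot serialize {0!r}'.format(value))
    return json.dumps(obj, default=fallback)


def decode_plain(text):
    """json.loads-like decoder; ISO date strings are left as plain strings."""
    return json.loads(text)


# With the SDK installed, the pair would be registered roughly like this:
#   import couchbase
#   couchbase.set_json_converters(encode_with_dates, decode_plain)

encoded = encode_with_dates({'created': datetime.date(2016, 1, 31)})
print(decode_plain(encoded))
```

Note that this particular pair is not symmetric: dates go in as date objects and come back as strings, so the application would need to parse them again if it wants date objects on read.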
Storage operations accept a format keyword argument which may be one of couchbase.FMT_JSON (to indicate the object should be serialized as JSON), couchbase.FMT_UTF8 (to serialize the object as a UTF-8 encoded string), couchbase.FMT_BYTES (to serialize an object as a raw set of bytes; note the Python object in question must be of type bytes), or couchbase.FMT_PICKLE (to serialize an object using Python's native pickle module).
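As a rough illustration of what each format means for the stored bytes, the sketch below produces comparable encodings with the standard library alone. It does not call the SDK, and the exact bytes the SDK stores (for example the pickle protocol used) may differ; serialize_like and its string format names are hypothetical.

```python
import json
import pickle


def serialize_like(fmt, value):
    """Approximate each storage format with the standard library (hypothetical helper)."""
    if fmt == 'json':      # counterpart of couchbase.FMT_JSON
        return json.dumps(value).encode('utf-8')
    if fmt == 'utf8':      # counterpart of couchbase.FMT_UTF8
        return value.encode('utf-8')
    if fmt == 'bytes':     # counterpart of couchbase.FMT_BYTES
        if not isinstance(value, bytes):
            raise TypeError('bytes format requires a value of type bytes')
        return value
    if fmt == 'pickle':    # counterpart of couchbase.FMT_PICKLE
        return pickle.dumps(value)
    raise ValueError('unknown format: {0}'.format(fmt))


print(serialize_like('json', {'some': 'value'}))
print(serialize_like('utf8', u'caf\u00e9'))
print(serialize_like('bytes', b'\x00\x01'))
# Pickle can round-trip objects that JSON cannot represent, such as a set.
print(pickle.loads(serialize_like('pickle', {1, 2, 3})))
```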
You may also define new formats and utilize them via a custom transcoder.
You can implement a custom transcoder if none of the pre-configured options are suitable for your application. A custom transcoder converts inputs to their serialized forms, and deserializes encoded data based on the item flags. The transcoder interface is described in the API documentation (http://pythonhosted.org/couchbase/api/transcoder.html), and an example (http://pythonhosted.org/couchbase/api/transcoder.html) is also provided in the source repository. When implementing a transcoder