MapReduce Views Using the Java SDK with Couchbase Server
You can use MapReduce views to create queryable indexes in Couchbase Server.
The normal CRUD methods allow you to look up a document by its ID. A MapReduce view query allows you to look up one or more documents based on various criteria. A MapReduce view consists of a map function that is executed once per document and an optional reduce function that performs aggregation on the results of the map function. The index is built incrementally, so the map function is not rerun each time you query the view. The map and reduce functions are stored on the server and written in JavaScript.
MapReduce queries can be further customized at query time to return only a subset (or range) of the data.
See the Incremental MapReduce Views and Querying Data with Views sections of the general documentation to learn more about views and their architecture.
The following example is the definition of a by_name view in a "beer" design document.
This view checks whether a document is a beer and has a name.
If it does, it emits the beer’s name into the index.
This view allows beers to be queried by name.
For example, it’s now possible to ask the question "What beers start with A?"
function (doc, meta) {
  if (doc.type && doc.type == "beer" && doc.name) {
    emit(doc.name, null);
  }
}
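To make the map step concrete, here is a minimal plain-Java sketch of what the by_name function does conceptually. It is not Couchbase server or SDK code: the class name, the doc() helper, and the sample documents are hypothetical, and a TreeMap with natural String ordering stands in for the server's sorted index, which is a simplification because Couchbase sorts keys using Unicode collation.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java illustration of the map step; not Couchbase server or SDK code.
class ByNameMapSketch {

    // Mirrors the by_name map function: a beer with a non-empty name emits
    // its name as the key. The index maps each key to the emitting document IDs.
    static TreeMap<String, List<String>> buildIndex(Map<String, Map<String, Object>> docsById) {
        TreeMap<String, List<String>> index = new TreeMap<>();
        for (Map.Entry<String, Map<String, Object>> entry : docsById.entrySet()) {
            Map<String, Object> doc = entry.getValue();
            Object name = doc.get("name");
            boolean isBeer = "beer".equals(doc.get("type"));
            boolean hasName = name instanceof String && !((String) name).isEmpty();
            if (isBeer && hasName) {
                index.computeIfAbsent((String) name, k -> new ArrayList<>()).add(entry.getKey());
            }
        }
        return index;
    }

    // Hypothetical helper that builds a document with an optional name.
    static Map<String, Object> doc(String type, String name) {
        Map<String, Object> doc = new HashMap<>();
        doc.put("type", type);
        if (name != null) {
            doc.put("name", name);
        }
        return doc;
    }

    // Hypothetical sample data, keyed by document ID.
    static Map<String, Map<String, Object>> sampleDocs() {
        Map<String, Map<String, Object>> docs = new LinkedHashMap<>();
        docs.put("beer::pale_ale", doc("beer", "Pale Ale"));
        docs.put("beer::abana_amber_ale", doc("beer", "Abana Amber Ale"));
        docs.put("brewery::some_brewery", doc("brewery", "Some Brewery"));
        docs.put("beer::unnamed", doc("beer", null));
        return docs;
    }

    public static void main(String[] args) {
        // Only the two named beers are indexed, sorted by key.
        System.out.println(buildIndex(sampleDocs()));
    }
}
```

The brewery and the unnamed beer emit nothing, so they never appear in the index. This is why a view query can only find documents that the map function chose to emit.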
A Spatial View can instead be queried with a range or bounding box.
For example, let’s imagine we have stored landmarks with coordinates for their home city (e.g. Paris, Vienna, Berlin and New York) under geo, and each city’s coordinates are represented as two attributes, lon and lat.
The following spatial view map function could be used to find landmarks within Europe, as a "by_location" view in a "spatial" design document:
function (doc, meta) {
  if (doc.type && doc.type == "landmark" && doc.geo) {
    emit([doc.geo.lon, doc.geo.lat], null);
  }
}
Querying Views through the Java SDK
Query a view from the Java client with the query(ViewQuery q) method on the Bucket class.
This method returns a ViewResult whose iterator yields the results of the query (in the form of ViewRow objects).
The ViewResult also exposes the list of rows (allRows()), the success() status and a potential error().
The ViewRow object contains the key and value properties (which are the first and second arguments to the view’s emit() function, respectively) as well as the id property, which may be passed to the get() method to return the actual document.
Alternatively, directly call the document() method on the view row.
Bucket bkt = CouchbaseCluster.create("192.168.33.101").openBucket("beer-sample");
ViewResult result = bkt.query(ViewQuery.from("beer", "by_name"));
for (ViewRow row : result) {
    System.out.println(row); // prints the row
    System.out.println(row.document().content()); // retrieves the doc and prints content
}
You can also set various properties on the query:
Bucket bkt = CouchbaseCluster.create("192.168.33.101").openBucket("beer-sample");
ViewQuery q = ViewQuery.from("beer", "by_name")
    .limit(5) // Limit to 5 results
    .startKey("A")
    .endKey("A\u0fff");
ViewResult result = bkt.query(q);
for (ViewRow row : result) {
    System.out.println(row);
}
Here’s some sample output for the previous query:
DefaultViewRow{id=harvey_son_lewes-a_lecoq_imperial_extra_double_stout_1999, key=A. LeCoq Imperial Extra Double Stout 1999, value=null}
DefaultViewRow{id=harvey_son_lewes-a_lecoq_imperial_extra_double_stout_2000, key=A. LeCoq Imperial Extra Double Stout 2000, value=null}
DefaultViewRow{id=mickey_finn_s_brewery-abana_amber_ale, key=Abana Amber Ale, value=null}
DefaultViewRow{id=brasserie_lefebvre-abbaye_de_floreffe_double, key=Abbaye de Floreffe Double, value=null}
DefaultViewRow{id=brasserie_de_brunehaut-abbaye_de_saint_martin_blonde, key=Abbaye de Saint-Martin Blonde, value=null}
Note that the includeDocs() option, which preloads the document for each row, is only beneficial in the synchronous API; in the async API you could just call document() on each row.
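The startKey/endKey/limit combination in the query above can be pictured as a range scan over a sorted index. The following plain-Java sketch illustrates that idea only: it is not SDK code, the class and method names are hypothetical, the end key is assumed to be inclusive, and natural String ordering stands in for the Unicode collation that Couchbase applies to keys. Three of the keys are taken from the sample output; the other two are made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

// Plain-Java illustration of a key range scan; not Couchbase SDK code.
class KeyRangeSketch {

    // Returns at most `limit` keys k with startKey <= k <= endKey, ascending.
    static List<String> query(TreeSet<String> index, String startKey, String endKey, int limit) {
        List<String> rows = new ArrayList<>();
        for (String key : index.subSet(startKey, true, endKey, true)) {
            if (rows.size() >= limit) {
                break;
            }
            rows.add(key);
        }
        return rows;
    }

    // Sample index keys; the last two are hypothetical.
    static TreeSet<String> sampleIndex() {
        return new TreeSet<>(Arrays.asList(
            "A. LeCoq Imperial Extra Double Stout 1999",
            "Abana Amber Ale",
            "Abbaye de Floreffe Double",
            "Bitter End",
            "Zest Ale"));
    }

    public static void main(String[] args) {
        // "A\u0fff" sorts after every ordinary key that starts with "A",
        // so the range covers all keys beginning with that letter.
        System.out.println(query(sampleIndex(), "A", "A\u0fff", 2));
    }
}
```

The end key works because U+0FFF compares higher than the letters, digits, and punctuation that normally follow the first character of a beer name, while any key starting with "B" already lies past the end of the range.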
Querying Geospatial Views
To query a geospatial view, you will need to construct a SpatialViewQuery object (com.couchbase.client.java.view.SpatialViewQuery).
Spatial queries accept a startRange and an endRange parameter, which allow you to limit the enclosing bounding boxes of the result.
The arguments to these parameters are JsonArray instances, with each element corresponding to a component emitted by the key (the first two components implicitly being the longitude and latitude of the result itself).
On output, spatial queries yield SpatialViewRow instances.
A SpatialViewRow is similar to a ViewRow, with an added geometry property.
SpatialViewQuery q = SpatialViewQuery.from("spatial", "by_location")
    .startRange(JsonArray.from(0, -90, null))
    .endRange(JsonArray.from(180, 90, null));
SpatialViewResult result = bkt.query(q);
for (SpatialViewRow row : result) {
    System.out.println("Key:" + row.key());
    System.out.println("Value:" + row.value());
    System.out.println("Geometry:" + row.geometry());
}
SpatialViewQuery also has the includeDocs() parameter to preload the document for the SpatialViewRow's document() method.
View results details
For all types of views, a ViewResult is always returned, which contains zero or more ViewRow objects.
In addition to iterative row access, more methods are available on the result:
Method | Description |
---|---|
allRows() | Accumulates all returned rows in a List and returns it. |
rows() | Provides iterative access to rows as they arrive. |
totalRows() | The total number of rows in the index, which can be greater than the number of rows returned. |
success() | True if the query was successful, false otherwise. Check error() for details. |
error() | Contains the error if the query was not successful or null otherwise. |
debug() | Contains debug information if debugging was enabled on the query. |
The only difference between regular and spatial view results is that spatial ones do not expose totalRows.
ViewQuery API details
All options shown here are available on the ViewQuery in a fluent API manner.
All of them are optional; they alter the behavior of the query only when explicitly provided.
As a general note, methods that accept JSON arguments provide additional overloads to accommodate all combinations in a type-safe manner.
Method | Accepted Types | Description |
---|---|---|
development | boolean | When true queries the development view, false by default. |
reduce | boolean | Explicitly enables/disables the reduce function on the query. If not provided and the view has a reduce function, it will be used. |
limit | int | Limits the number of the returned documents to the specified number. |
skip | int | Skips the given number of records before starting to return the results. |
group | boolean | Groups the results using the reduce function to a group or single row. |
groupLevel | int | Specifies the group level to be used. |
inclusiveEnd | boolean | Whether the specified end key should be included in the result. |
stale | Stale | Defines how stale the view results are allowed to be in the query. |
debug | boolean | Enables debugging on view queries. |
onError | OnError | Sets the response in the event of an error. |
descending | boolean | Returns the documents in descending order by key if true. |
key | String, int, long, double, boolean, JsonObject, JsonArray | The exact key to return from the query. |
keys | JsonArray | Only the given matching keys will be returned. |
startKeyDocId | String | Where to start searching for the key range. Can be used for efficient pagination. |
endKeyDocId | String | Where to stop searching for the key range. |
startKey | String, int, long, double, boolean, JsonObject, JsonArray | The key where the row return range should start. |
endKey | String, int, long, double, boolean, JsonObject, JsonArray | The key where the row return range should end. |
includeDocs | boolean, Class<? extends Document<?>> | Whether or not to automatically fetch the document corresponding to each row. The second parameter is the target class for the document. This method is needed only when using the blocking API, since on the async API there is no benefit over just calling document() on each row. |
Important when using grouping: group(boolean) and groupLevel(int) should not be used together in the same view query.
It is sufficient to set the grouping level only; use group(boolean) only in cases where you always want the highest group level implicitly.
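The effect of the group level is easiest to see with array keys. The sketch below is a plain-Java illustration under stated assumptions, not SDK or server code: it assumes a view whose keys are [year, month, day] arrays and whose reduce function counts rows, and it groups rows by the first groupLevel elements of each key. The class name and sample keys are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java illustration of grouping by key prefix; not Couchbase SDK code.
class GroupLevelSketch {

    // Counts rows per key prefix of length groupLevel, like a counting
    // reduce evaluated at that group level.
    static Map<List<Integer>, Integer> countByGroupLevel(List<List<Integer>> keys, int groupLevel) {
        Map<List<Integer>, Integer> counts = new LinkedHashMap<>();
        for (List<Integer> key : keys) {
            int prefixLength = Math.min(groupLevel, key.size());
            List<Integer> prefix = new ArrayList<>(key.subList(0, prefixLength));
            counts.merge(prefix, 1, Integer::sum);
        }
        return counts;
    }

    // Hypothetical emitted keys of the form [year, month, day].
    static List<List<Integer>> sampleKeys() {
        return Arrays.asList(
            Arrays.asList(2015, 1, 10),
            Arrays.asList(2015, 1, 22),
            Arrays.asList(2015, 3, 5),
            Arrays.asList(2016, 7, 1));
    }

    public static void main(String[] args) {
        System.out.println(countByGroupLevel(sampleKeys(), 1)); // one row per year
        System.out.println(countByGroupLevel(sampleKeys(), 2)); // one row per year and month
    }
}
```

In this model, a group level equal to the full key length yields one row per distinct key, which is the implicit highest level that group(true) asks for. That is why setting both options on the same query is redundant.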
SpatialViewQuery API details
All options shown here are available on the SpatialViewQuery in a fluent API manner.
All of them are optional; they alter the behavior of the query only when explicitly provided.
Method | Accepted Types | Description |
---|---|---|
development | boolean | When true queries the development view, false by default. |
limit | int | Limits the number of the returned documents to the specified number. |
skip | int | Skips the given number of records before starting to return the results. |
stale | Stale | Defines how stale the view results are allowed to be on query. |
debug | boolean | Enables debugging on view queries. |
onError | OnError | Sets the response in the event of an error. |
startRange | JsonArray | Where the spatial range should start. Can be multidimensional. |
endRange | JsonArray | Where the spatial range should end. Can be multidimensional. |
range | JsonArray, JsonArray | Convenience method to combine start and end range in one call. |
includeDocs | boolean, Class<? extends Document<?>> | Whether or not to automatically fetch the document corresponding to each row. The second parameter is the target class for the document. This method is needed only when using the blocking API, since on the async API there is no benefit over just calling document() on each row. |
Here is how to use the range parameter to find documents with a location within a bounding box.
We have stored the cities Paris, Vienna, Berlin and New York.
Each city’s coordinates are represented as two attributes, lon and lat.
The spatial view’s map function is:
function (doc) {
  if (doc.type == "city") {
    emit([doc.lon, doc.lat], null);
  }
}
To query the view and find cities within Europe, we use Europe’s bounding box. The startRange is the most south-western point of the bounding box, and the endRange is its most north-eastern point:
JsonArray EUROPE_SOUTH_WEST = JsonArray.from(-10.8, 36.59);
JsonArray EUROPE_NORTH_EAST = JsonArray.from(31.6, 70.67);
SpatialViewResult result = bucket.query(SpatialViewQuery.from("cities", "by_location")
    .stale(Stale.FALSE)
    .range(EUROPE_SOUTH_WEST, EUROPE_NORTH_EAST));
List<SpatialViewRow> allRows = result.allRows();
for (SpatialViewRow row : allRows) {
    System.out.println(row.id());
}
//prints:
//city::Vienna
//city::Berlin
//city::Paris
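The bounding box test behind this query can be sketched in plain Java. This is an illustration only and not how the server evaluates spatial views: the class and method names are hypothetical, and the city coordinates are approximate values assumed for the example. The Europe bounds are the ones used in the query above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java illustration of a bounding box filter; not Couchbase SDK code.
class BoundingBoxSketch {

    // Points and corners are [lon, lat] pairs, matching the emitted key order.
    static boolean inside(double[] point, double[] southWest, double[] northEast) {
        return point[0] >= southWest[0] && point[0] <= northEast[0]
            && point[1] >= southWest[1] && point[1] <= northEast[1];
    }

    // Returns the IDs of all cities whose point lies within the box.
    static List<String> citiesInside(Map<String, double[]> cities, double[] southWest, double[] northEast) {
        List<String> ids = new ArrayList<>();
        for (Map.Entry<String, double[]> city : cities.entrySet()) {
            if (inside(city.getValue(), southWest, northEast)) {
                ids.add(city.getKey());
            }
        }
        return ids;
    }

    // Approximate [lon, lat] coordinates, assumed for illustration.
    static Map<String, double[]> sampleCities() {
        Map<String, double[]> cities = new LinkedHashMap<>();
        cities.put("city::Paris", new double[] {2.35, 48.86});
        cities.put("city::Vienna", new double[] {16.37, 48.21});
        cities.put("city::Berlin", new double[] {13.40, 52.52});
        cities.put("city::New York", new double[] {-74.01, 40.71});
        return cities;
    }

    public static void main(String[] args) {
        double[] europeSouthWest = {-10.8, 36.59};
        double[] europeNorthEast = {31.6, 70.67};
        // New York falls outside the box because its longitude is too far west.
        System.out.println(citiesInside(sampleCities(), europeSouthWest, europeNorthEast));
    }
}
```

Note that the sketch returns cities in insertion order, which says nothing about the order in which a real spatial query returns its rows.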
Retry Conditions
The SDK retries view requests automatically under certain known conditions, which are listed in the following table:
HTTP status code | Behavior |
---|---|
200 | Do not retry request. |
300, 301, 302, 303, 307, 401, 408, 409, 412, 416, 417, 501, 502, 503, 504 | Retry request. |
404 | If the library detects a node that is not yet provisioned, it will retry. Otherwise, it will report ViewDoesNotExistException. |
500 | If the error payload reports a missing view document or a badly formed query, it will not retry. Otherwise, it will retry the request. |
Codes not listed in the table are not retried by default. Client code can still use the retry framework or write a custom handler. The example below retries up to 10 times if the view does not exist:
// retryWhen is an Observable operator, so the query goes through the async API
bucket.async()
    .query(SpatialViewQuery.from("spatial", "test"))
    .retryWhen(
        RetryBuilder.anyOf(ViewDoesNotExistException.class)
            .delay(Delay.exponential(TimeUnit.SECONDS, 1))
            .max(10)
            .build())
    .subscribe(new Action1<AsyncSpatialViewResult>() {
        @Override
        public void call(AsyncSpatialViewResult result) {
            // handle result
        }
    });
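The default retry rules from the table in this section can also be read as a small decision function. The following plain-Java sketch restates those rules only; it is not the SDK's implementation, and the class name, method name, and the two boolean parameters are hypothetical stand-ins for what the library derives from the cluster state and the error payload.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Plain-Java restatement of the retry table; not the SDK's implementation.
class ViewRetryPolicySketch {

    // Status codes that the table lists as always retried.
    private static final Set<Integer> ALWAYS_RETRY = new HashSet<>(Arrays.asList(
        300, 301, 302, 303, 307, 401, 408, 409, 412, 416, 417, 501, 502, 503, 504));

    // nodeNotProvisioned: hypothetical flag, the node is not yet provisioned.
    // missingViewOrBadQuery: hypothetical flag, the error payload reports a
    // missing view document or a badly formed query.
    static boolean shouldRetry(int status, boolean nodeNotProvisioned, boolean missingViewOrBadQuery) {
        if (ALWAYS_RETRY.contains(status)) {
            return true;
        }
        if (status == 404) {
            return nodeNotProvisioned;
        }
        if (status == 500) {
            return !missingViewOrBadQuery;
        }
        // 200 and every code not listed in the table: no retry by default.
        return false;
    }

    public static void main(String[] args) {
        System.out.println(shouldRetry(503, false, false));
        System.out.println(shouldRetry(500, false, true));
    }
}
```

A custom handler such as the retryWhen example above sits on top of these defaults: it lets the application retry on outcomes, such as a missing view, that the default rules report as errors.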