2 changes: 1 addition & 1 deletion README.adoc
@@ -1,7 +1,7 @@
= Couchbase Analytics Java SDK Documentation

This repository hosts the documentation source for the
https://docs.couchbase.com/analytics-java-sdk/1.0/hello-world/overview.html[Analytics Java SDK Documentation].
https://docs.couchbase.com/analytics-java-sdk/1.1/hello-world/overview.html[Analytics Java SDK Documentation].


This branch documents the 1.1 release of the SDK.
Empty file.
23 changes: 22 additions & 1 deletion modules/concept-docs/pages/querying-your-data.adoc
@@ -8,7 +8,8 @@
// This page pulls in content from -sdk-common-
// and code samples from -example-dir-
//
// It can be seen built at wwww.
// It can be seen built at https://docs.couchbase.com/java-analytics-sdk/1.1/concept-docs/querying-your-data.html


[abstract]
{description}
@@ -26,6 +27,26 @@ Enterprise Analytics eliminates these ETL complexities by unifying operational a
This enables Zero ETL, reducing costs, complexity, and improving time to insight.



== Long Running Queries

All versions of Enterprise Analytics support client-side async requests.

Review comment (Contributor):

All versions of Enterprise Analytics support client-side async requests.

I would not be comfortable claiming the EA JVM SDK's startQuery method is guaranteed to work with EA servers older than 2.2. We haven't tested it, as far as I know.

This process keeps a connection open between client and server for the whole duration of a long-running query.
For such queries, a better mechanism is for the client to trigger an async query without holding a connection to the server open while the query executes.
This xref:{version-ea}@enterprise-analytics:intro:intro.adoc[Server Asynchronous API] is introduced in Enterprise Analytics Server 2.2.


After submitting the query request, clients can monitor its status and, once the query has executed successfully, opt to stream the results.
This way, the connection between the SDK client and the Analytics Server does not stay open while the query is processed, and is only needed for streaming the result set.

`cluster.StartQueryAsync()` → `QueryHandle` → `QueryStatus` → `QueryResultsHandle`

Review comment (Contributor):

In the Java SDK, the name of the method that kicks it off is startQuery (not StartQueryAsync).

Also notable, the user can call startQuery on a cluster or a scope.

I would be in favor of removing this line and letting the flow in the example speak for itself.

Ditto in the other place this text occurs, in the connstr.adoc partial.


This information flow can be seen in the sample code in the
xref:howtos:sqlpp-queries-with-sdk.adoc#server-asynchronous-api[{sqlpp} queries section].


== Next Steps

Dive right in with our xref:howtos:sqlpp-queries-with-sdk.adoc[practical examples].


55 changes: 55 additions & 0 deletions modules/devguide/examples/java/ServerAsync.java
@@ -0,0 +1,55 @@
// tag::server-async[]
static void queryHandleExample(Queryable clusterOrScope) throws InterruptedException, TimeoutException {
    String slowStatement = """
        SELECT COUNT (1) AS c
        FROM
          ARRAY_RANGE(0,10000) AS d1,
          ARRAY_RANGE(0,10000) AS d2
        """;

    Duration timeout = Duration.ofMinutes(15);

    QueryHandle queryHandle = clusterOrScope.startQuery(
        slowStatement,
        opt -> opt.timeout(timeout)
    );

    QueryResultHandle resultHandle = waitForResult(queryHandle, timeout);
    try {
        // Process rows one by one as they arrive from the server.
        QueryMetadata metadata = resultHandle.streamRows(row -> System.out.println("Got row: " + row));
        System.out.println("Got metadata: " + metadata);

        // Alternatively, if the result is known to fit in memory:
        QueryResult buffered = resultHandle.bufferRows();
        System.out.println("Got result: " + buffered);

    } finally {
        // Tell the server it can forget the result.
        resultHandle.discard();
    }
}

private static QueryResultHandle waitForResult(
    QueryHandle queryHandle,
    Duration timeout
) throws InterruptedException, TimeoutException {
    final long timeoutNanos = timeout.toNanos();
    final long startNanos = System.nanoTime();

    while (true) {
        QueryStatus status = queryHandle.fetchStatus();
        if (status.resultReady()) return status.resultHandle();

        System.out.println("Waiting for query to finish; current status: " + status);

        long elapsedNanos = System.nanoTime() - startNanos;
        if (elapsedNanos > timeoutNanos) {
            throw new TimeoutException("Query result not ready after " + timeout);
        }

        SECONDS.sleep(1); // or use exponential backoff
    }
}
// end::server-async[]
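
The `waitForResult` loop above sleeps for a fixed second and notes that exponential backoff is an option. The following is a minimal sketch of that idea using only the JDK. The class `BackoffPolling` and its methods `nextDelay` and `pollUntilReady` are hypothetical names introduced for illustration and are not part of the SDK. The poll function is modelled as a `Supplier<Optional<T>>`, which is an assumption made so that the sketch compiles without the SDK on the classpath.

```java
import java.time.Duration;
import java.util.Optional;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

// Hypothetical helper, not part of the Analytics SDK.
final class BackoffPolling {

    /** Doubles the current delay, capped at the given maximum. */
    static Duration nextDelay(Duration current, Duration max) {
        Duration doubled = current.multipliedBy(2);
        return doubled.compareTo(max) > 0 ? max : doubled;
    }

    /**
     * Calls poll until it returns a value or the timeout elapses.
     * The sleep between attempts starts at initialDelay and doubles
     * after every unsuccessful attempt, up to maxDelay.
     */
    static <T> T pollUntilReady(
            Supplier<Optional<T>> poll,
            Duration initialDelay,
            Duration maxDelay,
            Duration timeout
    ) throws InterruptedException, TimeoutException {
        final long startNanos = System.nanoTime();
        final long timeoutNanos = timeout.toNanos();
        Duration delay = initialDelay;

        while (true) {
            Optional<T> ready = poll.get();
            if (ready.isPresent()) return ready.get();

            long remainingNanos = timeoutNanos - (System.nanoTime() - startNanos);
            if (remainingNanos <= 0) {
                throw new TimeoutException("Result not ready after " + timeout);
            }

            // Never sleep past the deadline.
            TimeUnit.NANOSECONDS.sleep(Math.min(delay.toNanos(), remainingNanos));
            delay = nextDelay(delay, maxDelay);
        }
    }
}
```

In the example above, the supplier would presumably wrap `queryHandle.fetchStatus()` and return `status.resultHandle()` once `status.resultReady()` is true. Capping the delay keeps the worst-case lag between query completion and result retrieval bounded.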

4 changes: 2 additions & 2 deletions modules/hello-world/pages/overview.adoc
@@ -1,10 +1,10 @@
= Java Analytics SDK
= Java Analytics SDK {sdk_dot_minor}
:page-layout: landing-page-top-level-sdk
:page-role: tiles
:!sectids:


= Java Analytics SDK
= Java Analytics SDK {sdk_dot_minor}

The Java Analytics SDK allows you to connect to an Enterprise Analytics cluster from Java.
For connecting to a Couchbase Server Cluster -- self-managed, or Capella Operational --
24 changes: 24 additions & 0 deletions modules/hello-world/pages/start-using-sdk.adoc
@@ -44,6 +44,15 @@ To see log messages from the SDK, xref:howtos:logging.adoc[include an SLF4J bind
[quickstart]
== Connecting and Executing a Query


The 1.1 {name-sdk} adds support for JWT and client certificate authentication, as well as a new "async" poll-based API that uses request handles to fetch results.

Review comment (Contributor):

"async" -> Should we refer to this API as the "startQuery" API, because that's the name of the method?

Ditto everywhere else we refer to the "async" server API.

Introduced in the 2.2 release of self-managed Enterprise Analytics Server, this API eliminates the need for long-running server connections.

The examples in this first section of the page use the standard API -- async on the client side -- which works with all 2.x releases of Enterprise Analytics. Server Async examples follow in the <<#server-asynchronous-api,Server Async section>>.
Note that you can still use the standard API with 2.2+ releases of Enterprise Analytics, alongside the new API.

=== Server Synchronous API

[source,java]
----
import com.couchbase.analytics.client.java.Cluster;
@@ -106,6 +115,21 @@ include::{version-common}@analytics-sdk:shared:partial$connstr.adoc[tag=connstr]



=== Server Asynchronous API

In the 2.2 release, the Enterprise Analytics Server introduces an asynchronous request API.
The SDK sends a request, polls for its status, and then fetches the results once they are available.
The SDK supports each stage of this information flow:

`clusterOrScope.startQuery()` → `QueryHandle` → `QueryStatus` → `QueryResultHandle`

.Server Asynchronous API Example
[source,java]
----
include::devguide:example$java/ServerAsync.java[indent=0,tag=server-async]
----


== Migration from Row-Based Analytics

include::{version-common}@analytics-sdk:shared:partial$migration.adoc[tag=migration]
188 changes: 51 additions & 137 deletions modules/howtos/pages/managing-connections.adoc
@@ -71,6 +71,56 @@ include::{version-common}@analytics-sdk:shared:partial$connstr.adoc[tag=more-con









== Authentication by Credential

Similarly to the `Authenticator` abstraction in Couchbase Operational SDKs, Analytics SDKs use a `Credential` abstraction covering regular password authentication (Basic Access Authentication), JSON Web Tokens (JWT), and Client Certificates through mTLS.

Basic Access Authentication is shown in the example <<#connecting-to-a-cluster,above>>.


=== JSON Web Tokens (JWT)

JWT authentication is supported from the 1.1 SDK (with Enterprise Analytics Server 2.2+).

[source,java]
----
var credential = Jwt.Create(Credential token);
public static Cluster Create(string connectionString, Credential credential);
Review comment (Contributor) on lines +93 to +94:

var credential = Credential.ofJwt("...");
var cluster = Cluster.newInstance(connectionString, credential);

----
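
A JWT may carry an `exp` (expiration time) claim, as defined in RFC 7519, so it can be useful to look at it before building a credential from the token. The following is a minimal sketch using only the JDK. The class `JwtExpiry` and its method `expiresAt` are hypothetical names and not part of the SDK. The sketch does not verify the signature, and it pulls `exp` out with a regular expression rather than a JSON parser, which is an assumption that only holds for simple payloads.

```java
import java.nio.charset.StandardCharsets;
import java.time.Instant;
import java.util.Base64;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the Analytics SDK.
final class JwtExpiry {

    // Matches a numeric "exp" claim such as "exp":1700000000 (simplified).
    private static final Pattern EXP = Pattern.compile("\"exp\"\\s*:\\s*(\\d+)");

    /**
     * Returns the expiry instant from the token's payload, if present.
     * The signature is NOT checked; this is only a local sanity check.
     */
    static Optional<Instant> expiresAt(String jwt) {
        // A compact JWT has three dot-separated Base64URL segments.
        String[] parts = jwt.split("\\.", -1);
        if (parts.length != 3) {
            throw new IllegalArgumentException("Expected three dot-separated segments");
        }
        String payload = new String(
                Base64.getUrlDecoder().decode(parts[1]),
                StandardCharsets.UTF_8);

        Matcher m = EXP.matcher(payload);
        return m.find()
                ? Optional.of(Instant.ofEpochSecond(Long.parseLong(m.group(1))))
                : Optional.empty();
    }
}
```

A caller could compare the returned instant with `Instant.now()` and request a fresh token before connecting if the old one has expired.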


=== Certificate Authentication

Certificate authentication is supported from the 1.1 Analytics SDK (with Enterprise Analytics Server 2.2+).
A conceptual and architectural overview of Enterprise Analytics' support for X.509 certificates is provided in the xref:{version-server}@server:learn:security/certificates.adoc[Server certificates docs].
Practical information on handling certificates can be found in the xref:{version-ea}@enterprise-analytics:manage:manage-security/manage-certificates.adoc[Enterprise Analytics certificates docs].

The Analytics SDK authenticates the client during the TLS handshake.
The SDK reads the certificate and private key from a `PKCS#12` file:

[source,java]
----
var credential = ClientCertificate.FromPem(certPath, keyPath);
public static Cluster Create(string connectionString, Credential credential);

Review comment (Contributor):

var credential = Credential.fromKeyStore(
    Paths.get("path/to/client-cert-and-key.p12"),
    password
);
var cluster = Cluster.newInstance(connectionString, credential);

----
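
Before handing a PKCS#12 file to the SDK, it can help to confirm that the file opens with the expected password and actually contains a private-key entry. The following is a minimal sketch using only the JDK's `java.security.KeyStore`. The class `Pkcs12Check` and its method `keyEntryAliases` are hypothetical names and not part of the SDK.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.GeneralSecurityException;
import java.security.KeyStore;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of the Analytics SDK.
final class Pkcs12Check {

    /** Loads a PKCS#12 stream and returns the aliases that hold a key entry. */
    static List<String> keyEntryAliases(InputStream in, char[] password)
            throws IOException, GeneralSecurityException {
        KeyStore keyStore = KeyStore.getInstance("PKCS12");
        // Throws IOException if the data is malformed or the password is wrong.
        keyStore.load(in, password);

        List<String> aliases = new ArrayList<>();
        for (String alias : Collections.list(keyStore.aliases())) {
            if (keyStore.isKeyEntry(alias)) {
                aliases.add(alias);
            }
        }
        return aliases;
    }

    /** Convenience overload that reads the PKCS#12 data from a file. */
    static List<String> keyEntryAliases(Path file, char[] password)
            throws IOException, GeneralSecurityException {
        try (InputStream in = Files.newInputStream(file)) {
            return keyEntryAliases(in, password);
        }
    }
}
```

An empty list suggests the file holds only certificates, which would not be enough for client certificate authentication because the TLS handshake also needs the private key.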


== Certificate Authority

To make a TLS connection to an Enterprise Analytics cluster with a root certificate issued by a trusted CA (Certificate Authority), you do not need to add this to your configuration --

Review comment (Contributor):

I would replace this section with something like:

By default, the Analytics SDK trusts the same well-known Certificate Authorities (CAs) as the JVM, plus the Couchbase Capella CA. This works for most deployments.

If you need to trust a different CA, or want to trust only a specific CA, you can override the default trust settings by telling the Analytics SDK which CA certificates to trust.

var cluster = Cluster.newInstance(
  connectionString,
  credential,
  options -> options
    .security(sec -> sec.trustOnlyPemFile(Paths.get("path/to/ca.pem")))
);

the platform's defaults are automatically trusted.

The cluster's root certificate just needs to be issued by a CA whose certificate is in your system trust store.
This includes well-known CAs (including GoDaddy and Verisign), plus any other CA certificates that you wish to add.



// For Columnar, as for all Capella products, connection must be made with Transport Layer Security (TLS) -- for full encryption of client-side traffic --
// for which the `couchbases://` schema is used as the root of the connection string (note the trailing *s*).

@@ -116,40 +166,13 @@ The client fetches the full address list from the first node it is able to conta
////


.Connection string with two parameters
----
http://localhost:8095?timeout.connect_timeout=30s&timeout.query_timeout=2m
----

The full list of recognized parameters is documented in the xref:ref:client-settings.adoc[client settings reference].


////
Any client setting with a system property name may also be specified as a connection string parameter (without the `com.couchbase.env.` prefix).

WARNING: When creating a `Cluster` using a custom `ClusterEnvironment`, *_connection string parameters are ignored_*, since client settings are frozen when the cluster environment is built.
////


////
=== Cluster Environment

A `ClusterEnvironment` manages shared resources like thread pools, timers, and schedulers.
It also holds the client settings.
One way to customize the client's behavior is to build your own `ClusterEnvironment` with custom settings:

[source,scala]
----
include::devguide:example$scala/ManagingConnections.scala[tag=env,indent=0]
----

This is a verbose example for simplicity, and the user may prefer to use `flatMap` or a for-comprehension to combine the multiple `Try`.

Note there are `com.couchbase.client.scala.env` and `com.couchbase.client.core.env` versions of all environment parameters: be sure to import the `.scala` versions.

TIP: If you create a `Cluster` without specifying a custom environment, the client creates a default environment used exclusively by that `Cluster`.
This default `ClusterEnvironment` is managed completely by the Scala SDK, and is automatically shut down when the associated `Cluster` is disconnected.
////


////
@@ -167,119 +190,10 @@ If you created any `ClusterEnvironment` instances, call their `shutdown()` metho



////
[#multiple-clusters]
=== Connecting to Multiple Clusters

If a single application needs to connect to multiple Couchbase Server clusters, we recommend creating a single `ClusterEnvironment` and sharing it between the `Clusters`.
We will use a for-comprehension here to avoid excessive `Try` juggling.

[source,scala]
----
include::devguide:example$scala/ManagingConnections.scala[tag=shared,indent=0]
----

Remember, whenever you manually create a `ClusterEnvironment` like this, the SDK will not shut it down when you call `Cluster.disconnect()`.
Instead you are responsible for shutting it down after disconnecting all clusters that share the environment.
////




////
=== Waiting for Bootstrap Completion

Opening resources is asynchronous.
That is, the call to `cluster.bucket` or `Cluster.connect` will complete instantly, and opening that resource will continue in the background.

You can force waiting for the resource to be opened with a call to `waitUntilReady`, which is available on both the `Cluster` and `Bucket`.
Here is an example of using it on the bucket:

[source,scala]
----
include::devguide:example$scala/ManagingConnections.scala[tag=wait-until-ready,indent=0]
----

If not present, then the first Key Value (KV) operation on the bucket will wait for it to be ready.
Any issues opening that bucket (for instance, if it does not exist), will result in an error being raised from that data operation.

Other timeout issues may occur when using the SDK located geographically separately from the Couchbase Server cluster --
this is xref:project-docs:compatibility.adoc#network-requirements[not recommended in production deployments], but often occurs during development.
See the <<working-in-the-cloud,Cloud section>> below for some suggestions of settings adjustments.
////


////
== Alternate Addresses and Custom Ports

If your Couchbase Server cluster is running in a containerized, port mapped, or otherwise NAT'd environment like Docker or Kubernetes, a client running outside that environment may need additional information in order to connect the cluster.
Both the client and server require special configuration in this case.

On the server side, each server node must be configured to advertise its external address as well as any custom port mapping.
This is done with the https://docs.couchbase.com/server/6.5/cli/cbcli/couchbase-cli-setting-alternate-address.html[`setting-alternate-address` CLI command] introduced in Couchbase Server 6.5.
A node configured in this way will advertise two addresses: one for connecting from the same network, and another for connecting from an external network.

On the client side, the externally visible ports must be used when connecting.
If the external ports are not the default, you can specify custom ports using the overloaded `Cluster.connect()` method that takes a set of `SeedNode` objects instead of a connection string.

[source,scala]
----
include::devguide:example$scala/ManagingConnections.scala[tag=seed-nodes,indent=0]
----

TIP: In a deployment that uses multi-dimensional scaling, a custom KV port is only applicable for nodes running the KV service.
A custom manager port may be specified regardless of which services are running on the node.

In many cases the client is able to automatically select the correct set of addresses to use when connecting to a cluster that advertises multiple addresses.
If the detection heuristic fails in your environment, you can override it by setting the `io.networkResolution` client setting to `default` if the client and server are on the same network, or `external` if they're on different networks.

NOTE: Any TLS certificates must be set up at the point where the connections are being made.
////





////
// DNS-SRV

include::{version-common}@sdk:shared:partial$dnssrv-pars.adoc[tag=dnssrv]

DNS SRV bootstrapping is enabled by default in the {name-sdk}.
In order to make the SDK use the SRV records, you need to pass in the hostname from your records (here `example.com`):

[source,scala]
----
include::devguide:example$scala/ManagingConnections.scala[tag=dnssrv,indent=0]
----

[source,scala]
----
ClusterEnvironment env = ClusterEnvironment.builder().ioConfig(IoConfig.enableDnsSrv(true)).build();
----

If the DNS SRV records could not be loaded properly you'll get the exception logged and the given host name will be used as a A record lookup.

----
WARNING: DNS SRV lookup failed, proceeding with normal bootstrap.
javax.naming.NameNotFoundException: DNS name not found [response code 3];
remaining name '_couchbase._tcp.example.com'
at com.sun.jndi.dns.DnsClient.checkResponseCode(DnsClient.java:651)
at com.sun.jndi.dns.DnsClient.isMatchResponse(DnsClient.java:569)
----

Also, if you pass in more than one node, DNS SRV bootstrap will not be initiated:

----
INFO: DNS SRV enabled, but less or more than one seed node given.
Proceeding with normal bootstrap.
----
////


== Local Development


We strongly recommend that the client and server xref:project-docs:compatibility.adoc#network-requirements[are in the same LAN-like environment] (e.g. AWS Region).
As this may not always be possible during development, read the guidance on working with xref:ref:client-settings.adoc#commonly-used-options[constrained network environments].
// As this may not always be possible during development, read the guidance on working with xref:ref:client-settings.adoc#commonly-used-options[constrained network environments].
// More details on connecting your client code to Couchbase Capella can be found xref:cloud:clouds:connect-an-sdk.adoc#connecting-your-sdk-to-capella[in the Capella Operational docs].