Connecting to JDBC datasources
Siren Federate provides the capability to query data from a remote JDBC databases.
Settings
In order to send queries to virtual indices the Elasticsearch cluster must contain at least one node enabled to issue queries over JDBC; it is advised to use a coordinating only node for this role, although this is not a requirement for testing purposes.
JDBC node settings
In order to enable JDBC on a node where the Siren Federate plugin is installed,
add the following setting to elasticsearch.yml
:
node.attr.connector.jdbc: true
Then, create a directory named jdbc-drivers
inside the configuration directory
of the node (e.g. elasticsearch/config
or /etc/elasticsearch
).
Finally, copy the JDBC driver for your remote datasource and its dependencies
to the jdbc-drivers
directory created above and restart the node; see the
JDBC driver installation and compatibility section for a list of compatible drivers and
dependencies.
Common configuration settings
Encryption
JDBC passwords are encrypted by default using a predefined 128 bit AES key;
before creating datasources, it is advised to generate a custom key by running
the keygen.sh
script included in the siren-federate
plugin directory as
follows:
bash plugins/siren-federate/tools/keygen.sh -s 128
The command will output a random base64 key; it is also possible to generate keys longer than 128 bit if your JVM supports it.
To use the custom key, the following parameters must be set in
elasticsearch.yml
on master nodes and on all the JDBC nodes:
-
siren.connector.encryption.enabled
:true
by default, can be set tofalse
to disable JDBC password encryption. -
siren.connector.encryption.secret_key
: a base64 encoded AES key used to encrypt JDBC passwords.
Example elasticsearch.yml
settings for a master node with a custom encryption
key:
siren.connector.encryption.secret_key: "1zxtIE6/EkAKap+5OsPWRw=="
Example elasticsearch.yml
settings for a JDBC node with a custom encryption
key:
siren.connector.encryption.secret_key: "1zxtIE6/EkAKap+5OsPWRw=="
node.attr.connector.jdbc: true
Restart the nodes after changing the configuration to apply the settings.
JDBC driver installation and compatibility
The JDBC driver for your remote datasource and its dependencies must be copied
to the jdbc-drivers
subdirectory inside the configuration directory of JDBC
nodes (e.g. elasticsearch/config/jdbc-drivers
).
It is not required nor recommended to copy these drivers to nodes which are not enabled to execute queries.
You may create a sub-directory within jdbc-drivers
to store a driver and to provide a custom security policy file for this driver. It is recommended to use this approach for drivers that come in multiple jars. A custom security policy file enables the definition of driver-specific permissions. The custom security policy file must be named security.policy
and must be located inside the driver sub-directory. The following variables can be used within the policy file:
-
codebase.federate.common
: Path to the directory storing thesecurity.policy
(defaults tojdbc-drivers
if the defaultdrivers-security.policy
file is used) -
codebase.federate.${jar_name)
: Path to a driver jar. Here,${jar_name}
refers to the filename of the jar stored in the directory where thesecurity.policy
file is located (defaults to jars found injdbc-drivers
if the defaultdrivers-security.policy
file is used). For Example:
grant codeBase "${codebase.federate.postgresql-42.2.5.jar}" {
// Permissions for postgresql-42.2.5.jar
}
If a security.policy
is placed in the main jdbc-drivers
directory, then it overrides the default drivers-security.policy
Restart the JDBC node after copying the drivers.
Name | JDBC class | Notes |
---|---|---|
PostgreSQL |
org.postgresql.Driver |
Download the latest JDBC 4.2 driver from
https://jdbc.postgresql.org/download.html and copy the
|
MySQL |
com.mysql.jdbc.Driver |
Download the latest GA release from
https://dev.mysql.com/downloads/connector/j/, extract it, then copy
When writing the JDBC connection string, set the |
Microsoft SQL Server 2014 or greater |
com.microsoft.sqlserver.jdbc.SQLServerDriver |
Download |
Sybase ASE 15.7+ |
com.sybase.jdbc4.jdbc.SybDriver OR net.sourceforge.jtds.jdbc.Driver |
To use the FreeTDS driver, download the latest version from
https://sourceforge.net/projects/jtds/files/, extract it, then copy
To use the jConnect driver, copy |
Oracle 12c+ |
oracle.jdbc.OracleDriver |
Download the latest |
Presto |
com.facebook.presto.jdbc.PrestoDriver |
Download the latest JDBC driver from https://prestodb.io/docs/current/installation/jdbc.html
and copy it to the |
Spark SQL 2.2+ |
com.simba.spark.jdbc41.Driver |
The Magnitude JDBC driver for Spark can be purchased at
https://www.simba.com/product/spark-drivers-with-sql-connector/; once
downloaded, extract the bundle, then extract the JDBC 4.1 archive and copy the
following jars to the In addition, copy your license file to the |
Dremio |
com.dremio.jdbc.Driver |
Download the jar at https://download.siren.io/dremio-jdbc-driver-1.4.4-201801230630490666-6d69d32.jar
and copy it to the |
Impala |
com.cloudera.impala.jdbc41.Driver |
Download the latest JDBC bundle from
https://www.cloudera.com/downloads/connectors/impala/jdbc/2-5-42.html, extract
the bundle, then extract the JDBC 4.1 archive and copy the following jars to the
|
Neo4j |
org.neo4j.jdbc.http.HttpDriver |
Download the driver from https://mvnrepository.com/artifact/org.neo4j/neo4j-jdbc-driver/3.4.0 and copy the jar to the |
Operations on virtual indices
The plugin supports the following operations on virtual indices:
-
get mapping
-
get field capabilities
-
search
-
msearch
-
get
-
mget
Search requests involving a mixture of virtual and normal Elasticsearch indices (e.g. when using a wildcard) are not supported and will be rejected; it is however possible to issue msearch requests containing requests on normal Elasticsearch indices and virtual indices.
When creating a virtual index, the plugin will create an empty Elasticsearch index for interoperability with {sg} and {xpack}; if an Elasticsearch index with the same name as the virtual index already exists and it is not empty, the virtual index creation will fail.
When deleting a virtual index, the corresponding Elasticsearch index will not be removed.
Type conversion
The plugin converts JDBC types to their closest Elasticsearch equivalent:
-
String types are handled as
keyword
fields. -
Boolean types are handled as
boolean
fields. -
Date and timestamp are handled as
date
fields. -
Integer types are handled as
long
fields. -
Floating point types are handled as
double
fields.
Complex JDBC types which are not recognized by the plugin are skipped during query processing and resultset fetching.
Supported search queries
The plugin supports the following queries:
-
match_all
-
term
-
terms
-
range
-
exists
-
prefix
-
wildcard
-
ids
-
bool
At this time the plugin provides no support for datasource specific full text
search functions, so all these queries will work as if they were issued against
keyword
fields.
Supported aggregations
Currently the plugin provides support for the following aggregations:
Metric:
-
Average
-
Cardinality
-
Max
-
Min
-
Sum
Bucket:
-
Date histogram
-
Histogram
-
Date range
-
Range
-
Terms
-
Filters
Only terms aggregations can be nested inside a parent bucket aggregation.
Troubleshooting
Cannot reconnect to datasource by hostname after DNS update
When the Java security manager is enabled, the JVM will cache name resolutions indefinitely; if the system you’re connecting to uses round-robin DNS or the IP address of the system changes frequently, you will need to modify the following Java Security Policy properties:
-
networkaddress.cache.ttl
: the number of seconds to cache a successful DNS lookup. Defaults to-1
(forever). -
networkaddress.cache.negative.ttl
: the number of seconds to cache an unsuccessful DNS lookup. Defaults to10
, set to0
to avoid caching.