FAQ¶
Java Version Issues¶
Make sure you are using JDK 1.8 or newer. JDK 7 has reached the end of public updates, and running the connectors on it will fail with a java.lang.UnsupportedClassVersionError exception.
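To confirm which Java version the Connect worker will pick up, check the version on the PATH of the worker host:
java -version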
Classpath Issues¶
The Connectors throw dependency errors such as a Guava class not being found. Update the plugin.path
option in the
connect-avro-distributed.properties
file to point to the directory containing the Connector JAR files. Kafka Connect 0.11
introduced isolated classpath loading for plugins.
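As a minimal sketch, if the Connector JARs have been unpacked under /opt/stream-reactor (the path here is only an example, adjust it to your installation), the worker properties would contain:
plugin.path=/opt/stream-reactor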
Connection Refused when posting Configs¶
If you see this
confluent-3.3.0|⇒ bin/connect-cli ps
java.net.ConnectException: Connection refused (Connection refused)
Kafka Connect is not up. Connect can be slow to start, especially if it is loading a lot of plugins such as the Stream Reactor. Confluent’s CLI will report Connect is up when running confluent start, but Connect’s API may not be ready to receive connectors yet. Wait until the Connect logs report that all plugins have loaded and try again.
Also, the connect-cli assumes you are on the same host as the Connect cluster worker you are targeting; if not, set the following environment variable:
export KAFKA_CONNECT_REST="http://myserver:myport"
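You can also verify that the Connect REST API is ready by querying it directly (assuming the default REST port of 8083):
curl http://localhost:8083/connector-plugins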
Kudu - No records in Impala table¶
If the KCQL statement is set to autocreate, the tables that are created are not visible in Impala. You need to map them in Impala first. Please refer to the Kudu documentation.
If you have created your table in Impala as a managed table, you need to fully qualify the table name in the KCQL statement with the impala namespace and database, i.e.
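As a sketch of the mapping step, an existing Kudu table can be exposed to Impala as an external table (the table name my_table is only an example, and your Impala/Kudu versions may require additional table properties):
CREATE EXTERNAL TABLE my_table
STORED AS KUDU
TBLPROPERTIES ('kudu.table_name' = 'my_table');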
INSERT INTO impala::default.my_table SELECT * FROM my_topic
I have JSON. Can I still use DataMountaineer Sink Connectors?¶
Kafka Connect provides converters for both the key and the payload of the records read from Kafka. These are JSON and Avro; the JSON converter is part of the Kafka distribution and the Avro converter comes from Confluent’s Schema Registry. These converters turn the records in Kafka into SinkRecords. Most of our Sinks rely on the data in Kafka being Avro and written with the Confluent Avro serializers, which is best practice. This allows the Connectors to receive SinkRecords with schemas for the payloads, so that mapping and filtering can take place based on the KCQL routing query provided.
However, it is possible to sink JSON messages from Kafka with some of our Sinks by using the JsonConverter from Kafka. If your JSON messages
have a schema field, the converter will deliver the records to the Sink with a schema. If no schema field is present, the
records will be delivered with a schema of type String.
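As a minimal sketch, the worker (or connector) configuration for JSON would use the JsonConverter; the schemas.enable flags control whether the schema/payload envelope is expected:
key.converter=org.apache.kafka.connect.json.JsonConverter
key.converter.schemas.enable=true
value.converter=org.apache.kafka.connect.json.JsonConverter
value.converter.schemas.enable=true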
Note
You must be using at least Cassandra 3.0.9 to have JSON support!
Can I run on multiple nodes?¶
Yes, Kafka Connect has two modes, standalone and distributed. Both allow for scaling by setting the tasks.max property.
In distributed mode, each worker joins a Connect cluster defined in the etc/schema-registry/connect-avro-distributed.properties file that is part of the Confluent distribution. Within this file, a property called group.id controls this.
# The group ID is a unique identifier for the set of workers that form a single Kafka Connect
# cluster
group.id=connect-cluster
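To run across multiple nodes, start a worker on each host with the same worker properties file (the command below assumes the Confluent distribution layout):
bin/connect-distributed etc/schema-registry/connect-avro-distributed.properties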
Redis authentication¶
If your Redis server requires the connection to be authenticated, you will need to provide an extra setting:
connect.redis.password=$REDIS_PASSWORD
If no password is required, omit this setting rather than leaving the value empty.
InfluxDB Port already in use¶
InfluxDB starts an admin web server listening on port 8083 by default. For this quickstart, this collides with Kafka Connect’s default REST port of 8083. Since we are running on a single node, we will need to edit the InfluxDB config.
# create the config dir
sudo mkdir -p /etc/influxdb
# dump the default config (pipe through sudo tee so the file can be written under /etc)
influxd config | sudo tee /etc/influxdb/influxdb.generated.conf
Now change the following section to use port 8087 or any other free port.
[admin]
enabled = true
bind-address = ":8087"
https-enabled = false
https-certificate = "/etc/ssl/influxdb.pem"
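After editing the file, restart InfluxDB pointing it at the edited configuration (flag as used by InfluxDB 1.x):
influxd -config /etc/influxdb/influxdb.generated.conf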
How to get multiple workers on different hosts to form a Connect Cluster¶
For workers to join a Connect cluster, set the same group.id in the etc/schema-registry/connect-avro-distributed.properties file on every worker.
# The group ID is a unique identifier for the set of workers that form a single Kafka Connect
# cluster
group.id=connect-cluster
HBase Sink isn’t connecting to Zookeeper Quorum¶
Ensure your HBase cluster’s hbase-site.xml
is on your classpath.
export CLASSPATH=hbase-site.xml
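If hbase-site.xml lives in a configuration directory, you can put that directory on the classpath instead (the path below is only an example; use your cluster’s configuration directory):
export CLASSPATH=$CLASSPATH:/etc/hbase/conf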