Lenses SQL Snapshot mode allows to control the resources involved when running a query. To instruct the engine what to do, a user would have to set a few parameters using the SET statement.
SET
Querying a table can take a long time if it contains a lot of records.The underlying Kafka topic has to be read, and the filter conditions applied, and the projections made.
Additionally, the SELECT statement could end up bringing a large amount of data to the client. To be able to constrain the resources involved, Lenses allows for context customization, which ends up driving the execution, thus giving control to the user. Here is the list of context parameters to overwrite:
SELECT
SET max.size = '1g';
SET max.query.time = '60000ms';
SET max.idle.time = '5s';
SELECT * FROM payments LIMIT 100;
SET show.bad.records=false;
SET format.timestamp=true;
SET format.decimal= 2;
SET live.aggs=true;
SET max.group.records=10000000
SELECT * FROM trips WHERE _key='customer_id_value';
SET optimize.kafka.partition=false;
SET query.parallel=2;
max.poll.records
SET query.buffer=50000;
SET kafka.offset.timeout=20000;
All the above values can be given a default value via the configuration file. Using lenses.sql.settings as prefix the format.timestamp can be set like this:
lenses.sql.settings
format.timestamp
lenses.sql.settings.format.timestamp=true
Lenses SQL uses Kafka Consumer to read the data. This means that an advanced user with knowledge of Kafka could tweak the consumer properties to achieve better throughput. This would occur on very rare occasions. The query context can receive Kafka consumer settings. For example, the max.poll.records consumer can be set as:
SET max.poll.records= 100000; SELECT * FROM payments LIMIT 1000000
The fact is that streaming SQL is operating on unbounded streams of events: a query would normally be a never-ending query. In order to bring query termination semantics into Apache Kafka we introduced 4 controls:
Thus, when retrieving data, you can set a limit of 1GB to the maximum number of bytes retrieved and maximum query time of one hour like this:
SET max.bytes = 1000000000; SET max.time = 60000000; SELECT * FROM topicA WHERE customer.id = "XXX";
On this page