The Lenses SQL Snapshot engine accesses the data at the point in time the query is executed. This means, for Apache Kafka, data added just after the query was initiated will not be processed.
Typical use cases are, but limited to:
A single entry in a Kafka topic is called a message.
The engine considers a message to have four distinct components key, value, headers and metadata.
key
value
headers
metadata
Currently, the Snapshot Engine supports four different facets _key, _value, _headers and _metadata; These strings can be used to reference properties of each of the aforementioned message components and build a query that way.
_key
_value
_headers
_metadata
By default, unqualified properties are assumed to belong to the _value facet:
SELECT property FROM source_topic;
In order to reference a different facet, a facet qualifier can be added:
SELECT _value.valueField, _key.keyField, _meta.metaField, _headers.headerField FROM source_topic;
When more than one sources/topics are specified in a query (like it happens when two topics are joined) a table reference can be added to the selection to fix the ambiguity:
SELECT users._value.field FROM users JOIN purchases
the same can be done for any of the other facets (_key,_meta,_headers).
_meta
Messages can contain nested elements and embedded arrays. The . operator is used to refer to children, and the [] operator is used for referring to an element in an array.
.
[]
You can use a combination of these two operators to access data of any depth.
SELECT dependencies[0].first_name AS childName FROM policy_holder WHERE policyId='100001'
You explicitly reference the key, value and metadata.
For the key use _key, for the value use _value, and for metadata use _meta. When there is no prefix, the engine will resolve the field(s) as being part of the message value. For example, the following two queries are identical:
SELECT amount FROM payments; SELECT _value.amount FROM payments;
When the key or a value content is a primitive data type use the prefix only to address them.
For example, if messages contain a device identifier as the key and the temperature as the value, SQL code would be:
SELECT _key AS deviceId , _value AS temperature FROM iot_data
Use the _meta keyword to address the metadata. For example:
SELECT _meta.timestamp AS timestamp , _meta.offset AS index FROM iot_data
When projecting a field into a target record, Lenses allows complex structures to be built. This can be done by using a nested alias like below:
SELECT amount as user.amount userId as user.id FROM payments;
The result would be a struct with the following shape:
{ "user": { "amount" : 10.19, "id": 10 } }
When two alias names clash, the snapshot engine does not “override” that field. Lenses will instead generate a new name by appending a unique integer. This means that a query like the following:
SELECT amount as result.amount, amount + 5 as result.amount FROM payments;
will generate a structure like the following:
{ "result": { "amount" : 10, "amount0": 15 } }
On this page