Supported Functions¶
Out of the box, the engine packs a large collection of functions the user can utilize. The engine has been written to be extended. A user can provide his own function(s). Please follow the User Defined Functions chapter for understanding how to enrich the existing functions set.
Name | Description | Usage |
---|---|---|
ANONYMIZE | Obfuscates the data, in order to protect it. If the anonymized
value is not provided it will use
* for String and 0 for Numbers |
anonymize(expression [,Anonymized value]) |
CAST | Enables conversion of values from one data type to another | cast(dt as int) |
COALESCE | Returns the first non-null expression in the expression list.
You must specify two expressions
|
coalesce(value, prevValue) |
CONCAT | Returns the string representation of concatenating each
expression in the list. Null fields are left out
|
concatenate(weather.main, 'fixed', dt,temp.eve) |
CONCATENATE | An alias for concat.
|
concatenate(weather.main, 'fixed', dt,temp.eve) |
LEN | Returns the length of a string. LEN calculates length
using characters as defined by UTF-16
|
len(name) |
POSITION | Return the position of the first occurrence of substring | position('math' in fieldA) |
REPLACE | Returns string with every occurrence of
search_string replaced with
replacement_string |
replace(fieldA, 'math', 'mathematics') |
SUBSTRING | Return a portion of the string, beginning at the given position | substring(field, 2) |
CONTAINS | Return true if an expression contains the given substring | contains(field1, field2) |
STARTSWITH | Return true if an expression starts with the given substring | startswith(field1, field2) |
ENDSWITH | Return true if an expression ends with the given substring | endswith(field1, field2) |
TRIM | Removes leading and trailing spaces from the input expression | trim(expression) |
CHOP | Returns the last character from an expression of type string | chop(expression) |
UNIX_TIMESTAMP | Converts the given date into the timestamp as a long | unix_timestamp(field) |
DELETEWHITESPACE | Removes all whitespace from an expression of type string | deletewhitespace(expression) |
LOWER | Returns the expression in lowercase | lower(expression) |
UPPER | Returns the expression in uppercase | upper(expression) |
LOWERCASE | An alias for lower | lowercase(expression) |
UPPERCASE | An alias for upper | uppercase(expression) |
TAKELEFT | Returns the left most ‘length’ characters from a string expression | takeleft(expression, length) |
TAKERIGHT | Returns the right most ‘length’ characters from a string expression | takeright(expression, length) |
DROPLEFT | Removes the left most ‘length’ characters from a string expression | dropleft(expression, length) |
DROPRIGHT | Removes the left most ‘length’ characters from a string expression | dropright(expression, length) |
DUMP | Show the internal representation of a value | dump(expression) |
SIZEOF | Returns the number of elements in an array of object | sizeof(expression) |
DIGITS | Retains only the digits from a string expression | digits(expression) |
TYPEOF | Returns the object type of a complex expression. This can only be used when
the format on the wire includes the details of the
objects full name.
|
typeof(expression) |
CAPITALIZE | Capitalizes the first letter of the expression | capitalize(expression) |
CAMELCASE | Uncapitalizes the first letter of the expression | camelcase(expression) |
TRUNCATE | Truncates the expression by removing whitespace from the beginning and end | truncate(expression) |
ABBREVIATE | Abbreviates the expression to the given length and appends ellipses | abbreviate(expression, length) |
DISTANCE | Calculate the distance between two points using the haversine method | distance(x1, y1, x2, y2) |
INDEXOF | Returns the index of a substring in an expression | abbreviate(expression, substring) |
RADIANS | Converts the input expression to radians | radians(expression) |
DEGREES | Converts the input expression to degrees | degrees(expression) |
SIN | Returns the trigonometric sine of an expression | sin(expression) |
COS | Returns the trigonometric cosine of an expression | cos(expression) |
TAN | Returns the trigonometric tangent of an expression | tan(expression) |
ASIN | Returns the trigonometric arc sine of an expression | asin(expression) |
ACOS | Returns the trigonometric arc cosine of an expression | acos(expression) |
ATAN | Returns the trigonometric arc tangent of an expression | atan(expression) |
SINH | Returns the hyperbolic sine of an expression | sinh(expression) |
COSH | Returns the hyperbolic cosine of an expression | cosh(expression) |
TANH | Returns the hyperbolic tangent of an expression | tanh(expression) |
STRIPACCENTS | Removes diacritics (approximately the same as accents)
from an expression. The case will not be altered.
|
stripaccents(field1) |
UUID | Returns an universally unique identifier | uuid() |
RANDINT | Returns a random integer | randint() |
SWAPCASE | Swaps the case of a string expression | swapcase() |
ABS | Returns the absolute value of an
expression it has to evaluate to a number type
|
abs(field1 + field2) |
NEG | Returns the negative value of an
expression it has to evaluate to a number type
|
neg(field1) |
SIGN | Returns +1 if a value is positive or -1 if a value is negative
it has to evaluate to a number type
|
sign(field1) |
ROUND | Returns the closest integer of an
expression , with tiesrounding towards positive infinity. it has to evaluate
to a number type
|
round(field1) |
CEIL | Returns the absolute value of an
expression it has to evaluate to a number type
|
ceil(field1) |
FLOOR | Return the largest value not greater than the argument | floor(field1) |
‘%’ | The remainder operator (%) computes the remainder after
dividing its first operand by its second. Each expression has
to evaluate to a number
|
a % b |
‘/’ | Divides one number by another (an arithmetic operator)
Each expression has to evaluate to a number
|
a / b |
‘-‘ | Subtracts one number from another (an arithmetic operator)
Each expression has to evaluate to a number
|
a - b |
‘*’ | Multiplies one number from another (an arithmetic operator)
Each expression has to evaluate to a number
|
a * b |
‘+’ | Adds one number to another (an arithmetic operator)
Each expression has to evaluate to a number
|
a + b |
‘-‘ (negative) | Returns the negative of the value of a numeric expression
(a unary operator). The expression has to evaluate to a number
|
a * b |
POW | Returns expression1 raised to the expression2 power.
Each expression has to evaluate to a number
|
pow(a, b) |
SQRT | Returns the square root of expression.
The expression has to evaluate to a number
|
sqrt(a) |
CBRT | Returns the cube root of expression.
The expression has to evaluate to a number
|
cbrt(a) |
COUNT | Returns the number of records read
ONLY FOR STREAMING when grouping is involved |
SELECT count(*) |
MAX | Returns maximum value of expression.
ONLY FOR STREAMING when grouping is involved |
max(field1) |
MIN | Returns minimum value of expression.
ONLY FOR STREAMING when grouping is involved |
min(field1) |
SUM | Returns the sum of expression for each record read.
ONLY FOR STREAMING when grouping is involved |
sum(field1) |
IF | Evaluates the boolean result for condition. If true it will
return the value of expr1; otherwise, it evaluates and
returns the value of expr2
|
if(condition, expr1,expr2) |
LPAD | Prepends the value of padExpr to the value of strExpr
until the total length is lengthExpr
|
LPAD(strExpr, lengthExpr, padExpr) |
RPAD | Appends the value of padExpr to the value of strExpr
until the total length is lengthExpr
|
LPAD(strExpr, lengthExpr, padExpr) |
EPOCH_TO_DATETIME | Converts an epoch into a datetime
yyyy-MM-dd'T'HH:mm:ss.SSSZ strExpr needs to be a LONG value containing the
milliseconds since 1 January 1970 00:00:00)
|
EPOCH_TO_DATETIME(strExpr) |
EPOCH_TO_DATE | Converts an epoch into a date
yyyy-MM-dd The strExpr needs to resolve to INT/LONG value. If the
value is an INT it is expected to be the day represented
as days since epoch
|
EPOCH_TO_DATE(strExpr) |
DATETIME_TO_EPOCH
FROM_DATETIME
FROMDATETIME
|
Converts a string representation of a datetime into epoch value.
The first parameter strExpr needs to be a STRING value.
The second parameter pattern must be a DateTime Format
|
FROM_DATETIME(strExpr, pattern) |
DATE_TO_EPOCH | Converts a string representation of a date into epoch value.
For AVRO records, this will emit a field with a logical type of
Date. This means the output value is an INT and it represents
the days since Epoch. For JSON records the resulted value is the
epoch expressed as LONG. The first parameter strExpr needs to
resolve to a STRING value. The second parameter pattern,
must be a Date Format
|
DATE_TO_EPOCH(strExpr, pattern) |
CONVERT_TIME | Converts the string format of a date [and time]. The first
parameter strExpr needs be a STRING value. The second
parameter fromPattern is the incoming date[and time] format.
The last parameter represents the target date[and time] format.
|
CONVERT_TIME(strExpr, fromPattern, toPattern) |
CURDATE
CURRENT_DATE
|
Provides the current ISO date value
|
CURDATE()/CURRENT_DATE() |
CURRENT_DATETIME
CURRENT_DATETIME
CURTIME
|
Provides the current ISO date and time.
|
CURTIME()/CURRENT_DATETIME() |
YESTERDAY | Returns the current date time minus 1 day | yesterday() |
TOMORROW | Returns the current date time plus 1 day | tomorrow() |
NEXTWEEK | Return the current date time plus 1 week | nextweek() |
LASTWEEK | Return the current date time minus 1 week | lastweek() |
NEXTMONTH | Return the current date time plus 1 month | nextmonth() |
LASTMONTH | Return the current date time minus 1 month | lastmonth() |
DAY | Extracts the day component of an expression that is of type timestamp | day(expression) |
HOUR | Extracts the hour component of an expression that is of type timestamp | hour(expression) |
MINUTE | Extracts the minute component of an expression that is of type timestamp | minute(expression) |
SECOND | Extracts the second component of an expression that is of type timestamp | second(expression) |
MONTH | Extracts the month component of an expression that is of type timestamp | month(expression) |
YEAR | Extracts the year component of an expression that is of type timestamp | year(expression) |
ARRAY_SIZE | Returns the count of elements contained by the parameter.
It is expected the parameter, evaluates to
an array structure.
|
ARRAY_SIZE(EXPR) |
REGEXP | Returns the matched groups otherwise null.
Consider this regular expression
(Mr|Mrs|Ms) ([A-Z][a-z]+) ([A-Z][a-z]+) and this input
Ms Jane Smith ; the return values are:["Ms Jane Smith", "Ms", "Jane", "Smith"] . |
REGEXP(SQL_EXPR, REGEX_EXPR) |
REGEXP_LIKE | Returns true if the input value matches the expression
otherwise false.
|
REGEXP_LIKE(SQL_EXPR, REGEX_EXPR) |
REGEXP_REPLACE | Replace the matched text with the provided value.
If there’s no match the input will
remain the same.
|
REGEXP_REPLACE(SQL_EXPR, SQL_EXPR, REGEX_EXPR) |
EXISTS | Returns true if the given field is present;
false otherwise.
|
EXISTS(FIELD_EXPR) |
HEADERASSTRING | Returns the value of the record header key as
a STRING value.
|
HEADERASSTRING('HEADER_KEY') |
HEADERASINT | Returns the value of the record header key as
an INT value.
|
HEADERASINT('HEADER_KEY') |
HEADERASLONG | Returns the value of the record header key as
a LONG value.
|
HEADERASLONG('HEADER_KEY') |
HEADERASDOUBLE | Returns the value of the record header key as
a DOUBLE value.
|
HEADERASDOUBLE('HEADER_KEY') |
HEADERASFLOAT | Returns the value of the record header key as a FLOAT value.
|
HEADERASFLOAT('HEADER_KEY') |
HEADERKEYS | Returns all the header keys for the current record.
|
HEADERKEYS() |