Data Processing using Transforms and Converters
Reminder
Stream vs. Message Processing
Stream Processing
has access to multiple data streams
the factory requires one or more inputs to perform its job, such as:
join two data streams to form a new one.
perform per-message processing.
aggregate all the values of a data stream into a single value.
establish windowed periods within which all operations are performed.
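To make the windowing and aggregation ideas concrete, here is a minimal sketch in plain Java. It does not use any Kafka library; the Event type, the sumPerWindow method and the 1000 ms window size are assumptions chosen for illustration only. It groups messages into fixed-size windows and reduces each window to a single value, which requires looking at many messages together.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class WindowedSum {
    // Hypothetical message type: an event timestamp (ms) and a numeric value.
    static final class Event {
        final long timestampMs;
        final int value;

        Event(long timestampMs, int value) {
            this.timestampMs = timestampMs;
            this.value = value;
        }
    }

    // Groups events into fixed-size windows and sums the values of each window.
    // The key of the result is the start time of the window.
    // Assumes non-negative timestamps.
    static Map<Long, Integer> sumPerWindow(List<Event> stream, long windowMs) {
        Map<Long, Integer> sums = new TreeMap<>();
        for (Event e : stream) {
            long windowStart = (e.timestampMs / windowMs) * windowMs;
            sums.merge(windowStart, e.value, Integer::sum);
        }
        return sums;
    }

    public static void main(String[] args) {
        List<Event> stream = List.of(
            new Event(0, 1), new Event(400, 2),
            new Event(1000, 5), new Event(1999, 1),
            new Event(2500, 7));
        System.out.println(sumPerWindow(stream, 1000)); // {0=3, 1000=6, 2000=7}
    }
}
```

The result for one window depends on every message that falls into it, so this kind of work cannot be expressed as an operation on a single message.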
Message Processing
has access to only one data stream
the factory can accept many messages, but each input message produces exactly one output.
SMTs (Single Message Transforms) were designed with this model in mind: a limited set of operations, each applied at the level of a single message.
SMTs cannot be used for Stream Processing.
SMT (Single Message Transform): Implementing Transformation (generic interface of the Kafka Connect library)
An SMT has to implement the methods of the “Transformation” interface: apply, config and close, plus configure, which it inherits from Configurable.
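The shape of such a transform can be sketched without the Kafka dependency. In the example below, MessageTransform is a hypothetical stand-in that models only the per-record apply method of Kafka Connect's Transformation; InsertStaticField, the field name "source" and the value "orders-db" are likewise invented for illustration. A real SMT would operate on Connect records rather than on a plain Map.

```java
import java.util.HashMap;
import java.util.Map;

public class SmtSketch {
    // Hypothetical stand-in for Kafka Connect's Transformation: only the
    // per-record apply method is modelled here.
    interface MessageTransform<R> {
        R apply(R record);
    }

    // Hypothetical transform: returns a copy of the message with one extra
    // static field. One message in, one message out, no state between calls.
    static class InsertStaticField implements MessageTransform<Map<String, Object>> {
        private final String field;
        private final Object value;

        InsertStaticField(String field, Object value) {
            this.field = field;
            this.value = value;
        }

        @Override
        public Map<String, Object> apply(Map<String, Object> record) {
            Map<String, Object> updated = new HashMap<>(record); // leave the input untouched
            updated.put(field, value);
            return updated;
        }
    }

    public static void main(String[] args) {
        MessageTransform<Map<String, Object>> smt = new InsertStaticField("source", "orders-db");
        Map<String, Object> in = new HashMap<>();
        in.put("id", 42);
        System.out.println(smt.apply(in).get("source")); // prints orders-db
    }
}
```

Because apply sees one message at a time and keeps no state, it cannot join, aggregate or window, which is why this model stays within message processing.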
Converters
All messages are stored as byte[] (binary), and the Kafka broker performs no serialization or deserialization. Producers and consumers therefore have to serialize and deserialize the data themselves, and in Kafka Connect this is done by CONVERTERS.
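The round trip between a value and its byte[] form can be sketched in plain Java as follows. This is not Kafka Connect's Converter API: the class StringBytesConverter and its two methods are hypothetical, and UTF-8 strings are assumed as the only data format.

```java
import java.nio.charset.StandardCharsets;

public class StringBytesConverter {
    // Client side, before sending: turn the value into the bytes the broker stores.
    static byte[] serialize(String value) {
        return value.getBytes(StandardCharsets.UTF_8);
    }

    // Client side, after reading: rebuild the value from the stored bytes.
    static String deserialize(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] stored = serialize("order-42");           // what the broker would hold
        System.out.println(deserialize(stored));         // prints order-42
    }
}
```

Both sides must agree on the format: bytes written with one encoding and read with another will not give back the original value.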