EXPERT KNOWLEDGE AT A GLANCE

Category: Data Protocols

Messaging Patterns – It is not enough to decide to use a message

Messaging Patterns- What are they? What are their strengths and why should they only be used with caution? We clarify these questions in this article.

What are Design Patterns?


Technology-independent designs can provide proven pattern solutions in software development, ensuring standardized and robust architecture.
If you’ve never heard of software design patterns, check out this article from us on the subject first.


Design patterns allow a developer to draw on the experience of others. They offer proven solutions for recurring tasks. A one-to-one implementation is not advisable. The patterns should rather be used as a guide.

What is a message?

A basic design pattern is the message. Actually a term that is used by everyone as a matter of course, but what is behind it?


Data is packaged in messages and then transmitted from the sender to the receiver via a message channel. The following figure shows such a messaging system.

Messaging Patterns - This scheme shows the basic concept of a message
Messaging Patterns – Basic Concept of a Message

The communication is asynchronous, which means that both applications are decoupled from each other and therefore do not have to run simultaneously. The sender must build and send the message, while the receiver must read and unpack it.

What are Messaging Patterns?

However, this form of message transmission is only one way of transferring information. The following figure shows the basic concepts of messaging design patterns.

This diagram shows all the basic components of the messaging design patterns
Basic Components of the Messaging Patterns

What is Message Construction?

It is not enough to decide to use a message. A message can be constructed according to different architectural patterns, depending on the functions to be performed.


The following figure shows some of these patterns.

Messaging Design Patterns - This diagram shows the different Patterns of message construction.
Messaging Design Patterns – Message Construction Patterns

Message Construction – When do I use it?

Massaging can be used not only to send data between a sender and receiver, but also to call a procedure or request a response in another application.


With the right message architecture a certain flexibility can be guaranteed. This makes the message much more robust against possible future changes.

What is Message Routing?

A message router connects the message channels in a messaging system. We will come back to this topic later. This router corresponds to a filter, which regulates the message forwarding, but does not change the message. A message is only forwarded to another channel if all predefined conditions are met.


The following figure lists some specific message router types.

Messaging Design Patterns - This diagram shows the different patterns of message routing.
Messaging Patterns – Message Routing Patterns

When do I use message routing and how?

For example, messages can be forwarded to dynamically defined recipients, or message parts can be processed or combined in a differentiated manner.

What are Messaging Channels?

In a messaging system, the exchange of information does not just happen unregulated. The sender transfers the message to a so-called messaging channel and the receiver requests a specific message channel.
In this way, the sender and receiver are decoupled. However, the sender can determine which application receives the data without knowing about it by selecting the specific messaging channel.


However, the right choice of message channel depends on your architecture. Which channel should be addressed and when?

The following figure lists some such channel types.

Messaging Design Patterns - This diagram shows the different patterns of message channels.
Messaging Patterns – Message Channel Patterns

What are the basic differences between the channel types ?

Basically, the channel types can be divided into two main types.

A distinction can be made between a point-to-point channel, i.e. one sender and exactly one receiver, and a publish-subscribe channel, one sender and several receivers.

What is a Messaging Endpoint?

In order for a sender or receiver application to connect to the messaging channel, an intermediary must be used. This client is called a messaging endpoint.


The following figure shows the principle of communication via messaging endpoints.

Messaging Design Patterns - Dieses Diagramm zeigt the Basic principle of a message endpoint
Basic principle of a Message Endpoint

On the receiver side, the end point accepts the data to be sent, builds a message from it and sends it via a specific message channel. On the receiver side, this message is also received via an end point and extracted again. An application can access several end points here. However, an endpoint can only implement one alternative.

The following figure lists some endpoint types.

Messaging Design Patterns - This diagram shows the different patterns of message endpoints.
Messaging Design Patterns – Message Endpoint Patterns

When do I choose which endpoint?

Receiving messages in particular can become difficult and lead to server overload. Therefore, control and possible throttling of the processing of client requests is crucial. A proven means is, for example, the formation of processing queues or a dynamic adjustment of consumers, depending on the volume of requests.

What is Message Transformation?

If the data format has to be changed when data is exchanged between two applications, a so-called message transformation ensures that the message channel is formally decoupled.


This translation process can be understood as two systems running in parallel. The actual message data is separated from the metadata.


The following figure shows some message transformation types.

Messaging Design Patterns - This diagram shows the different patterns of message transformation.
Messaging Design Patterns – Message Transformation Patterns

How do I monitor my messaging system and keep it running?

A flexible messaging architecture unfortunately leads to a certain degree of complexity on the other side. Especially when it comes to integrating many message producers and consumers decoupled from each other in a messaging system, with partly asynchronous messaging, monitoring during operation can become difficult.


For this purpose, system management patterns have been developed to provide the right monitoring tools. The main goal is to prevent bottlenecks and hardware overloads in order to guarantee the smooth flow of messages.


The following figure shows some test and monitoring patterns.

Messaging Design Patterns - This diagram shows the different patterns of message monitoring.
Messaging Design Patterns – Message Monitoring Patterns

What are the basic systems?

With a typical system management solution, for example, the data flow can be controlled by checking the number of data sent and received, or the processing time.


This is contrasted with the actual checking of the message information contained.

Array vs Object – The creation of a JSON structure follows some rules you should know

Array vs Object – JSON is one of the most popular data formats. However, the creation of such an object is done according to some rules. These rules depend on the original data type. In this article we will introduce you to the conversion of some JSON data types (Array vs Object).

What is JSON anyway?

With the JavaScript Object Notation, JSON for short, you can structure data compactly and independently of programming languages. The data format is therefore particularly well suited for exchange between your applications, for general data storage (file extension “.json”) and for configuration files. The data is also readable for you and coded in the standardized text format. The application notes of the data format are defined by the standards – RFC 8259 and the JSON syntax by the standards ECMA-404. Due to its easy integration with JavaScript, you can use it well for transferring data in web applications.

You can best compare the JSON data structure to XML and YAML, only it’s simpler and more compact.

What are the basic rules?

This code snippet shows a simple json object structure
Simple JSON Object

The JSON text structure is based on the JavaScript Object Syntax. Hierarchical data structures are thus possible. It contains only properties and no methods. The basis is formed by name-value pairs and ordered list of values. Basically, they are formatted with curly braces and as strings. This is especially advantageous if you want to transfer the data over the network. If you want to access the data you have to convert the text structure into a native JavaScript object.

Data Formats – JSON Array vs Object

Basically, you can have different data types included in JSON.

Value:

Your JSON value can take one of the following allowed types.

Schematic representation of the data types that a JSON value can assume
JSON value data types

Object:

A JSON object represents the basic form of a JSON text. With this you can accept any data type that is suitable for inclusion in JSON.

JSON Array vs Object - Schematic representation of the creation of a JSON object
Creation of a JSON object

Array:

JSON Array vs Object – It is possible to include an array. Arrays can contain objects, strings, numbers, arrays and boolean. You can include arrays as shown schematically below, enclosed with two square brackets.

JSON Array vs Object - Schematic representation of the creation of a JSON array
Creation of a JSON array

In this way, you can further and further nest the individual data types with each other and thus easily create any number of hierarchy levels. For example, object attributes can consist of arrays, or arrays can contain multiple objects.

Apache Avro – Effective Big Data Serialization Solution for Kafka

In this article we will explain everything you need to know about Apache Avro, an open source big data serialization solution and why you should not do without it.


You can serialize data objects, i.e. put them into a sequential representation, in order to store or send them independent of the programming language. The text structure reflects your data hierarchy. Known serialization formats are for example XML and JSON. If you want to know more about both formats, read our articles on the topics. To read, you have to deserialize the text, i.e. convert it back into an object.

In times of Big Data, every computing process must be optimized. Even small computing delays can lead to long delays with a correspondingly large data throughput, and large data formats can block too many resources. The decisive factors are therefore speed and the smallest possible data formats that are stored. Avro is developed by the Apache community and is optimized for Big Data use. It offers you a fast and space-saving open source solution. If you don’t know what Apache means, look here. Here we have summarized everything you need to know about it and introduce you to some other Apache open source projects you should know about.

Apache Avro – Open Source Big Data Serialization Solution

With Apache Avro, you get not only a remote procedure call framework, but also a data serialization framework. So on the one hand you can call functions in other address spaces and on the other hand you can convert data into a more compact binary or text format. This duality gives you some advantages when you have cross-network data pipelines and is justified by its development history.

Avro was released back in 2011 as a part of Apache Hadoop. Here, Avro was supposed to provide a serialization format for data persistence as well as a data transfer format for communication between Hadoop nodes. To provide functionality in a Hadoop cluster, Avro needed to be able to access other address spaces. Due to its ability to serialize large amounts of data, cost-efficiently, Avro can now be used Hadoop-independently. 

You can access Avro via special API’s with many common programming languages (Java, C#, C, C++, Python and Ruby). So you can implement it very flexible.

In the following figure we have summarized some reasons what makes the framework so ingenious. But what really makes Avro so fast?

The schema clearly shows all the features that Apache Avro offers the user and why he should use it
Features Apache Avro

What makes Avro so fast?

The trick is that a schema is used for serialization and deserialization. About that the data hierarchy, i.e. the metadata, is stored separately in a file. The data types and protocols are defined via a JSON format. These are to be assigned unambiguously by ID to the actual values and can be called for the further data processing constantly. This schema is sent along with the data exchange via RPC calls.

Creating a schema registry is especially useful when processing data streams with Apache Kafka.

Apache Avro and Apache Kafka

Here you can save a lot of performance if you store the metadata separately and call it only when you really need it. In the following figure we have shown you this process schematically.

avro kafka

When you let Avro manage your schema registration, it provides you with comprehensive, flexible and automatic schema development. This means that you can add additional fields and delete fields. Even renaming is allowed within certain limits. At the same time, Avro schema is backward and forward compatible. This means that the schema versions of the Reader and Writer can differ. Schema registration management solutions exist, with Google Protocol Buffers and Apache Thrift, among others. However, the JSON data structure makes Avro the most popular choice.

What is XML and What role will it play in industry 4.0?

What is XML – It is one of the most popular and widely used data formats. Its widespread use is also its most important advantage. XML is interpretable by both humans and machines and is therefore widely used to import and export application data. XML stands for Extensible Markup Language and is a markup language for representing hierarchically structured data in text file format. It was already published in 1998 and is primarily a meta language.

That means that on its basis application-specific languages are defined by structural and content restrictions. For example RSS, MathML, GraphML, but also the Scalable Vector Graphics (SVG). All web browsers are able to visualize XML documents using the built-in XML parser.

What is XML Document Structure

An XML document can always be described as the interaction of its main components. In addition to the data itself, these are the layout, i.e. the description of the relationships between individual containers, and the structure.

What is XML-  This figure shows how an XML format document structure is determined.
What is XML – Document Components

An XML structure can be interpreted as a tree. Thus, each XML document has a root element and texts or attributes as sub-elements.

An XML document can have an optional header in addition to the actual data. XML declarations, i.e. references to an external document type definition (DTD), or internal DTD, or document type declarations can be placed here. Examples for these declarations are the XML version or the encoding.

Classification of the XML format

The XML format can be further classified. Which class comes into question when is determined by the use case. Mainly we decide between document-centered and data-centered. The document-centric XML format is based on a text document and is difficult to process by machine due to its weak structure. In data-centric, the schema describes entities of a data model and their relationships. This format is optimized for efficient processing by machines. The Semistructured format represents a hybrid of both.

Processing

The XML format allows both sequential and optional accesses. This can be done either by a “push”, where the program flow is controlled by the parser, or by a “pull”, where the flow is implemented in the code that calls the parser.
Management of the tree structure can be hierarchical as well as nested.

XML-Schema vs. Database-Schema

Besides XML, JSON is also a very popular markup language. In this article we have recorded the most important information about this format.

Another large field of computer languages, i.e. formal languages developed for interaction between humans and computers, is occupied by database languages. They describe the structure of a database. Here, too, the data is organized as a plan.

If you want to know more about database language, read our article on SQL and NoSQL. Here we explain the most important differences.


But how does this schema differ from an XML schema?

This figure shows the differences between an XML schema and a database schema.
What is XML- Differences between an XML schema and a database schema.

XML contains nested elements with an unlimited nesting depth. To transfer this nesting to a database schema, the nested elements must be decomposed and linked by foreign key relationships.
In XML format, the elements within an element can be repeated as often as desired. Elements of a given type do not always have to contain the same child elements. However, the order of elements is an integral part of the document structure.
In a database schema, each column is always present only once and contains simple values. Therefore, if multiple elements are to be stored, another table must then be created. The order in which the values are stored is not important, unlike in XML.

Apache Flink

Overview

– Open source stream processor framework developed by the Apache Software Foundation (2016)
– Data streams with high data volume can be processed and analyzed with low delay and high speed

flink analytics
Flink provides various tools for efficient real-time processing of continuous data streams and batch data

Core functions

– diverse, specialized APIs:
→ DataStream API (Stream Processing)
→ ProcessFunctions (control of states and time; event states can be saved and timers can be added for future calculations)
→ Table API
→ SQL API
→ provides a rich set of connectors to various storage systems such as Kafka, Kinesis, Kubernetes, YARN, HDFS, Elasticsearch, and JDBC database systems
→ REST API

Stream Processing

pexels pixabay 2438
How to handle this flood of data?

== Data is processed continuously with a short delay
→ without intermediate storage of the data in separate databases
– several data streams can be processed in parallel
– Each stream can be used to derive own follow-up actions and analyses

Architecture

Data can be processed as unbounded or bounded streams:

  • Unbounded stream

    • have a start but no defined end

    • must be continuously processed

  • Bounded stream

    • have a defined start and end

    • can be processed by ingesting all data before performing any computations(== batch processing)

– Flink automatically identifies the required resources based on the application’s configured parallelism and requests them from the resource manager.

In case of a failure, Flink replaces the failed container by requesting new resources.

– Stateful Flink applications are optimized for local state access