XML format is one of the most popular and widely used data formats. Its widespread use is also its most important advantage. XML is interpretable by both humans and machines and is therefore widely used to import and export application data. XML stands for Extensible Markup Language and is a markup language for representing hierarchically structured data in text file format. It was already published in 1998 and is primarily a meta language.

That means that on its basis application-specific languages are defined by structural and content restrictions. For example RSS, MathML, GraphML, but also the Scalable Vector Graphics (SVG). All web browsers are able to visualize XML documents using the built-in XML parser.

XML Document Structure

An XML document can always be described as the interaction of its main components. In addition to the data itself, these are the layout, i.e. the description of the relationships between individual containers, and the structure.

This figure shows how an XML format document structure is determined.
XMl Document Components

An XML structure can be interpreted as a tree. Thus, each XML document has a root element and texts or attributes as sub-elements.

An XML document can have an optional header in addition to the actual data. XML declarations, i.e. references to an external document type definition (DTD), or internal DTD, or document type declarations can be placed here. Examples for these declarations are the XML version or the encoding.

Classification of the XML format

The XML format can be further classified. Which class comes into question when is determined by the use case. Mainly we decide between document-centered and data-centered. The document-centric XML format is based on a text document and is difficult to process by machine due to its weak structure. In data-centric, the schema describes entities of a data model and their relationships. This format is optimized for efficient processing by machines. The Semistructured format represents a hybrid of both.

Processing

The XML format allows both sequential and optional accesses. This can be done either by a “push”, where the program flow is controlled by the parser, or by a “pull”, where the flow is implemented in the code that calls the parser.
Management of the tree structure can be hierarchical as well as nested.

XML-Schema vs. Database-Schema

Besides XML, JSON is also a very popular markup language. In this article we have recorded the most important information about this format.

Another large field of computer languages, i.e. formal languages developed for interaction between humans and computers, is occupied by database languages. They describe the structure of a database. Here, too, the data is organized as a plan.

If you want to know more about database language, read our article on SQL and NoSQL. Here we explain the most important differences.


But how does this schema differ from an XML schema?

This figure shows the differences between an XML schema and a database schema.
Differences between an XML schema and a database schema.

XML contains nested elements with an unlimited nesting depth. To transfer this nesting to a database schema, the nested elements must be decomposed and linked by foreign key relationships.
In XML format, the elements within an element can be repeated as often as desired. Elements of a given type do not always have to contain the same child elements. However, the order of elements is an integral part of the document structure.
In a database schema, each column is always present only once and contains simple values. Therefore, if multiple elements are to be stored, another table must then be created. The order in which the values are stored is not important, unlike in XML.