XML Extract


This extract reads data from XML files and web services and returns it in tabular form. It requires an XmlFile, REST, or SOAP connection.

The extract definition uses the XPath language, a standard query language for selecting nodes from an XML document. More information on XPath is available at www.w3schools.com/xml/xpath_intro.asp.

Main settings

XML, REST, or SOAP connection. For XML extracts that use a REST connection, only the HTTP methods GET and POST are possible. To use other methods, an XML load must be used.

Note: during a data preview of the extract, a POST request is actually executed, which may cause undesired changes on the service endpoint.

XPath loop expression

Drop-down list of suggested XPath expressions that specify the root or anchor elements of the XML document to loop through. The number of elements matching this expression determines the number of output rows.

Expressions can also be entered manually.
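The relationship between the loop expression and the number of output rows can be illustrated with a minimal Python sketch. It uses the standard library module xml.etree.ElementTree, which supports only a subset of XPath and evaluates paths relative to the root element, so the expressions may look slightly different from those entered in the extract. The sample document and the helper count_rows are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, not taken from the product.
XML = """
<orders>
  <order id="1">
    <item sku="A"/>
    <item sku="B"/>
  </order>
  <order id="2">
    <item sku="C"/>
  </order>
</orders>
"""

root = ET.fromstring(XML)


def count_rows(loop_expression):
    """Return how many elements the loop expression selects.

    Each selected element would become one output row.
    """
    return len(root.findall(loop_expression))


rows_per_order = count_rows("./order")       # loops over <order> elements
rows_per_item = count_rows("./order/item")   # loops over <item> elements
print(rows_per_order, rows_per_item)
```

Choosing a deeper anchor element (here item instead of order) yields more, finer-grained rows from the same document.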

XPath expression

Drop-down list of XPath expressions, each of which defines a column of the extract output relative to the XPath loop expression. For each column, a name and a default value can be defined. Values that are null or consist only of whitespace are mapped to this default value.
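As a rough illustration of how column expressions and default values interact, the following Python sketch builds one row per loop element. It assumes that each column expression is evaluated relative to the current loop element; the sample document, the column list, and the function extract_rows are hypothetical and are not part of the product.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document: the first note is whitespace only,
# the second order has no note element at all.
XML = """
<orders>
  <order><customer>Alice</customer><note>   </note></order>
  <order><customer>Bob</customer></order>
</orders>
"""

# Hypothetical column definitions: (column name, expression, default value).
COLUMNS = [
    ("customer", "customer", "unknown"),
    ("note", "note", "n/a"),
]


def extract_rows(xml_text, loop_expression, columns):
    """Build one dict per element matched by the loop expression."""
    root = ET.fromstring(xml_text)
    rows = []
    for element in root.findall(loop_expression):
        row = {}
        for name, expression, default in columns:
            # findtext returns None when no element matches the expression.
            value = element.findtext(expression)
            # Missing and whitespace-only values fall back to the default.
            if value is None or value.strip() == "":
                value = default
            row[name] = value
        rows.append(row)
    return rows


rows = extract_rows(XML, "./order", COLUMNS)
for row in rows:
    print(row)
```

Both a missing note element and a note that contains only spaces end up with the default value in this sketch.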


XML documents may contain namespaces. Declaring these namespaces in the XML extract is optional; the declaration can be omitted if the element names in the XML structure do not collide when namespaces are ignored. To declare a namespace, a prefix has to be defined for the URI of the namespace. This prefix is then used in the XPath expressions to identify the correct XML nodes.

Prefix

Identifies the correct XML node in the XPath expressions.

URI

The URI of the namespace.
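The role of the prefix can be sketched in Python with xml.etree.ElementTree, which accepts a mapping from prefix to namespace URI. The namespace URIs, the prefixes bk and mu, and the sample document are hypothetical. The prefixes used in the expressions do not have to match the prefixes in the document; only the URIs must match.

```python
import xml.etree.ElementTree as ET

# Hypothetical document with two elements named "title" that differ
# only in their namespace.
XML = """
<catalog xmlns="http://example.com/books" xmlns:m="http://example.com/music">
  <title>Book A</title>
  <m:title>Album B</m:title>
</catalog>
"""

# Prefix -> URI declarations, analogous to the Prefix and URI settings.
NAMESPACES = {
    "bk": "http://example.com/books",
    "mu": "http://example.com/music",
}

root = ET.fromstring(XML)

# The prefix in the expression decides which "title" element is selected.
book_titles = [e.text for e in root.findall("./bk:title", NAMESPACES)]
music_titles = [e.text for e in root.findall("./mu:title", NAMESPACES)]
print(book_titles, music_titles)
```

Without the prefixes, the two title elements could not be told apart, which is the kind of naming collision that makes a namespace declaration necessary.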
Advanced settings

If caching is activated, the complete output of the extract is temporarily stored in an internal H2 database during the first call of the extract. Subsequent calls of the extract read directly from the cache without connecting to the underlying source system. If the extract or the underlying connection contains variables, a separate cache is built for each distinct combination of variable values.

See Caching in Extracts for more information.
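The idea of keeping a separate cache per set of variable values can be sketched in Python as follows. This is only a conceptual illustration: an in-memory dictionary stands in for the internal H2 database, and the names read_source, make_cached_extract, and the variable region are hypothetical.

```python
# Records every call that actually reaches the (simulated) source system.
source_calls = []


def read_source(**variables):
    """Stand-in for reading from the underlying source system."""
    source_calls.append(dict(variables))
    return [{"region": variables.get("region"), "value": 1}]


def make_cached_extract(reader):
    """Wrap a reader so that its results are cached per variable values."""
    cache = {}

    def run(**variables):
        # Each distinct combination of variable values gets its own entry.
        key = tuple(sorted(variables.items()))
        if key not in cache:
            cache[key] = reader(**variables)
        return cache[key]

    return run


extract = make_cached_extract(read_source)
extract(region="EU")   # first call: reads from the source
extract(region="EU")   # same variable values: served from the cache
extract(region="US")   # different variable value: reads from the source again
print(len(source_calls))
```

Only the first call for each variable combination reaches the source in this sketch; repeated calls with the same values are answered from the cache.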