insights by infactory in-factory GmbH Generic Outbound in the Data Warehouse Environment

This article deals with the topic of Outbound and in particular with the creation of generic XML exports with the database function XMLAGG in interaction with Informatica PowerCenter.

The Outbound challenges

In many companies, data or information from the central data warehouse must be made available in flat files. This affects everything, including tools and departments that do not have direct access to the database of the data warehouse.

This can be, for example, a CSV export for a technical department that needs certain information as for daily evaluations. Or also XML exports for applications that, for example, require the company data for the creation of reports and/or for messages to the regulatory authorities.

In a data warehouse, there may be a large number of tables that need to be derived in order to provide the data to subsequent processes.

In order to save time and effort and to minimize error-proneness, it makes sense to design a generic generation of the outbound. This eliminates the need for an additional Informatica mapping implementation for each table.

In the coming sections we will limit ourselves to the Outbound of XML files.

XMLAGG

XMLAGG is a database function that aggregates or assembles partial fragments into a complete XML document. In this case, partial fragments are the individual data records that have already been put into an XML structure.

In the following example, XML_ENTITY contains three data records. With the help of XMLAGG the content of XML_ENTITY is combined into an XML document.

This makes it relatively easy to generate complete XML files for Outbound use.

SQL command:

SELECT XMLELEMENT(NAME "document", XMLATTRIBUTES( 'Outbounds in-factory Technik-Blog' as "name"),
XMLAGG(XML_ENTITY))
FROM
(SELECT XMLELEMENT(NAME "KUNDEN", XMLATTRIBUTES(KUNDEN_NR, NAME),
CASE WHEN STRASSE IS NOT NULL THEN XMLELEMENT(NAME "FELD", XMLATTRIBUTES('STRASSE' AS FELDNAME, STRASSE AS value)) END,
CASE WHEN PLZ IS NOT NULL THEN XMLELEMENT(NAME "FELD", XMLATTRIBUTES('PLZ' AS FELDNAME, PLZ AS value)) END,
CASE WHEN ORT IS NOT NULL THEN XMLELEMENT(NAME "FELD", XMLATTRIBUTES('Ort' AS FELDNAME , ORT AS value)) END)
AS XML_ENTITY FROM KUNDEN);

XML document:

<document name="Outbound in-factory Technik-Blog">
<KUNDEN KUNDEN_NR="1" NAME="FIRMA1">
<FELD FELDNAME="STRASSE" VALUE="Klosterwall 87"/>
<FELD FELDNAME="PLZ" VALUE="20095"/>
<FELD FELDNAME="ORT" VALUE="Hamburg"/>
</KUNDEN>
<KUNDEN KUNDEN_NR="2" NAME="FIRMA2">
<FELD FELDNAME="STRASSE" VALUE="Pariser Platz 20"/>
<FELD FELDNAME="PLZ" VALUE="10117"/>
<FELD FELDNAME="ORT" VALUE="Berlin"/>
</KUNDEN>
<KUNDEN KUNDEN_NR="3" NAME="FIRMA3">
<FELD FELDNAME="STRASSE" VALUE="Maximilianstraße 43"/>
<FELD FELDNAME="PLZ" VALUE="80539"/>
<FELD FELDNAME="ORT" VALUE="München"/>
</KUNDEN>
</document>

Integration in Informatica PowerCenter

To generate the SQL command around XMLAGG in an automated and generic way, the integration in Informatica PowerCenter is a good choice. For example, one could use the table metadata of the database to determine the columns of a table to be exported.

By passing the parameters “$$DBSCHEMA” and “$$QUELL_TABELLE” the table to be exported including its columns is determined. If, for example, a table has technical fields which should not be exported because they are not interesting for the following process, this could be solved with another parameter (“$$Columns_NOT_EXPORT”).

IBM DB2:

SELECT col.TABSCHEMA, col.TABNAME, col.COLNAME, col.TYPENAME
FROM SYSCAT.COLUMNS col
WHERE col.TABSCHEMA = '$$DBSCHEMA'
AND col.TABNAME = '$$QUELL_TABELLE' AND col.COLNAME NOT IN ($$SPALTEN_NICHT_EXPORTIEREN)

Oracle:

SELECT col.OWNER, col.TABLE_NAME, col.COLUMN_NAME, col.DATA_TYPE
FROM DBA_TAB_COLUMNS col
WHERE col.OWNER = '$$DBSCHEMA'
AND col.TABLE_NAME = '$$QUELL_TABELLE' AND col.COLUMN_NAME NOT IN ($$SPALTEN_NICHT_EXPORTIEREN)

Thus, a single Informatica workflow can be used to create the appropriate SQL statement for XML generation and perform the export completely generically for each individual table. There is no need to build a single individual process for each table. This time saved can in turn be used for the implementation for further business logic.

Author: Anonymous

ANY QUESTIONS? WE HAVE THE ANSWERS!

Write us. We are looking forward to your message!

MAIL TO