Generic outbound in the data warehouse environment

Back to overview

Today we are delving into a classic area of the data warehouse. This article deals with the topic of outbound and in particular with the creation of generic XML exports using the XMLAGG database function in conjunction with Informatica PowerCenter.

The outbound challenges

In many companies, data or information from the central data warehouse must be made available in flat files. This affects everything, including tools and departments that do not have direct access to the database of the data warehouse.

For example, this could be a CSV export for a specialist department that requires certain information for daily evaluations. Or XML exports for applications that require the company data for the creation of reports and/or for notifications to the supervisory authorities, for example.

In a data warehouse, there may be a large number of tables that need to be extracted in order to make the data available for subsequent processes.

In order to save time and effort and minimize the susceptibility to errors, it makes sense to design a generic generation of the outbound. This eliminates the need for an additional Informatica mapping implementation for each table.

In the following sections, we will restrict ourselves to the outbound of XML files.

XMLAGG

XMLAGG is a database function that aggregates or compiles partial fragments into a complete XML document. In this case, partial fragments are the individual data records that have already been brought into an XML structure.

In the following example, XML_ENTITY contains three data records. XMLAGG is used to summarize the content of XML_ENTITY into an XML document.

This makes it relatively easy to generate complete XML files for outbound.

SQL command:

SELECT XMLELEMENT(NAME "document", XMLATTRIBUTES( 'Outbounds in-factory Technik-Blog' as "name"),
XMLAGG(XML_ENTITY))
FROM
(SELECT XMLELEMENT(NAME "KUNDEN", XMLATTRIBUTES(KUNDEN_NR, NAME),
CASE WHEN STRASSE IS NOT NULL THEN XMLELEMENT(NAME "FELD", XMLATTRIBUTES('STRASSE' AS FELDNAME, STRASSE AS value)) END,
CASE WHEN PLZ IS NOT NULL THEN XMLELEMENT(NAME "FELD", XMLATTRIBUTES('PLZ' AS FELDNAME, PLZ AS value)) END,
CASE WHEN ORT IS NOT NULL THEN XMLELEMENT(NAME "FELD", XMLATTRIBUTES('Ort' AS FELDNAME , ORT AS value)) END)
AS XML_ENTITY FROM KUNDEN);

XML document:

<document name="Outbound in-factory Technik-Blog">
<KUNDEN KUNDEN_NR="1" NAME="FIRMA1">
<FELD FELDNAME="STRASSE" VALUE="Klosterwall 87"/>
<FELD FELDNAME="PLZ" VALUE="20095"/>
<FELD FELDNAME="ORT" VALUE="Hamburg"/>
</KUNDEN>
<KUNDEN KUNDEN_NR="2" NAME="FIRMA2">
<FELD FELDNAME="STRASSE" VALUE="Pariser Platz 20"/>
<FELD FELDNAME="PLZ" VALUE="10117"/>
<FELD FELDNAME="ORT" VALUE="Berlin"/>
</KUNDEN>
<KUNDEN KUNDEN_NR="3" NAME="FIRMA3">
<FELD FELDNAME="STRASSE" VALUE="Maximilianstraße 43"/>
<FELD FELDNAME="PLZ" VALUE="80539"/>
<FELD FELDNAME="ORT" VALUE="München"/>
</KUNDEN>
</document>

 

Integration in Informatica PowerCenter

To generate the SQL command for XMLAGG automatically and generically, integration in Informatica PowerCenter is a good option. For example, you could use the table metadata of the database to determine the columns of a table to be exported.

By passing the parameters "$$DBSCHEMA" and "$$QUELL_TABELLE", the table to be exported including its columns is determined. If, for example, a table has technical fields that should not be exported as they are not of interest for the subsequent process, this could be solved with an additional parameter ("$$COLUMNS_NOT_EXPORT").

IBM DB2:

SELECT col.TABSCHEMA, col.TABNAME, col.COLNAME, col.TYPENAME
FROM SYSCAT.COLUMNS col
WHERE col.TABSCHEMA = '$$DBSCHEMA'
AND col.TABNAME = '$$QUELL_TABELLE' AND col.COLNAME NOT IN ($$SPALTEN_NICHT_EXPORTIEREN)

Oracle:

SELECT col.OWNER, col.TABLE_NAME, col.COLUMN_NAME, col.DATA_TYPE
FROM DBA_TAB_COLUMNS col
WHERE col.OWNER = '$$DBSCHEMA'
AND col.TABLE_NAME = '$$QUELL_TABELLE' AND col.COLUMN_NAME NOT IN ($$SPALTEN_NICHT_EXPORTIEREN)

This means that a single Informatica workflow can be used to create the corresponding SQL statement for XML generation and export completely generically for each individual table. There is no need to build an individual process for each table. The time saved can in turn be used for the implementation of further business logic.

Do you have any questions? We have the answers!

Please write to us. We look forward to hearing from you!