Load Data

  1. Purpose of the Task:
    • The Load Data task is responsible for reading data from the source and loading it into the loading table in the staging area. It ensures that the required data is transferred from the source to the temporary table for further processing.
  2. Execution of the Task:
    • This task is always executed as part of the loading phase of the data integration process. It ensures that the source data is captured and placed into the loading table in the target database.
  3. Auto Commit:
    • The loading phase always uses auto commit because ODI temporary tables are used to hold intermediate data. Since the data in these temporary tables is not permanent or critical, it is automatically committed without the need for manual intervention.
  4. Command on Source:
    • The select statement in the source database retrieves the data. The various ODI methods are used to dynamically generate parts of the SQL query:
      • getFilter: Retrieves any filters applied to the data.
      • getJoin: Retrieves join expressions that define relationships between tables.
      • getFrom: Specifies the source table or query.
      • getColList: Retrieves the columns and their transformations (expressions) to be selected from the source.
      • getJrnFilter: Retrieves journal-related filters.
      • getGrpBy and getHaving: Specify grouping and filtering for aggregates.

Example SQL query on the source:

select       <%=snpRef.getPop("DISTINCT_ROWS")%>      

             <%=snpRef.getColList("", "[EXPRESSION]\t[ALIAS_SEP] [CX_COL_NAME]", ",\n\t", "", "")%>

from        <%=snpRef.getFrom()%>

where       (1=1)

<%=snpRef.getFilter()%>

<%=snpRef.getJrnFilter()%>

<%=snpRef.getJoin()%>

<%=snpRef.getGrpBy()%>

<%=snpRef.getHaving()%>

    • This query dynamically generates the necessary SQL to extract the data based on the transformations, filters, and joins specified.
  1. Command on Target:
    • The insert statement on the target (staging area) places the retrieved data into the loading table. The columns from the source are dynamically bound to the corresponding columns in the loading table.

Example SQL insert query on the target:

insert into <%=snpRef.getTable("L", "COLL_NAME", "A")%>(

    <%=snpRef.getColList("", "[CX_COL_NAME]", ",\n\t", "","")%>)

values(

    <%=snpRef.getColList("", ":[CX_COL_NAME]", ",\n\t", "","")%>)

    • Here, :[CX_COL_NAME] is a bind variable used to dynamically match the source column attributes with their corresponding alias in the loading table.
    • The values for the insert statement are sourced from the selected columns and transformed using the ODI expressions.
  1. Dynamic Expressions and Aliases:
    • The task makes use of dynamic expressions and aliases (e.g., CX_COL_NAME), which represent the source data fields. These are dynamically created using the ODI methods such as getColList and are used in the SQL queries to correctly map and transfer the data.
    • Expression property of getColList ensures that the source attributes are correctly transformed and placed into the target loading table.

 

Summary of Steps in the "Load Data" Task:

  1. Select Data from Source:
    • The system generates a SELECT statement on the source using various ODI methods (getFilter, getJoin, getFrom, etc.) to extract the required data.
  2. Insert Data into Loading Table:
    • The data from the source is inserted into the loading table in the target system using the INSERT statement. The column names are dynamically bound, ensuring that the correct data is inserted.
  3. Use of Auto Commit:
    • Since the loading table contains temporary data, auto commit is enabled to automatically commit the transaction after the data is loaded into the table.
  4. Dynamic SQL Execution:
    • The SQL queries are dynamically generated using ODI functions and context expressions. This ensures that the correct transformations, filters, and mappings are applied during the data loading process.

 

No comments:

Post a Comment