Loading Using Unload/Load

  1. Unload Data from Remote Database Server:
    • When the source result set is on a remote database server, an alternative method to using the agent for transferring data is to unload the data to a file and then load that file into the staging area.
  2. When to Use Unload/Load:
    • This method is particularly effective when dealing with large volumes of data across heterogeneous technologies.
    • For example, unloading data from Microsoft SQL Server using bcp and then loading it into an Oracle staging area using SQL*Loader.
  3. Steps in the LKM Using Unload/Load Strategy:
    • Step 1: Drop and Create the Loading Table:
      • The loading table is dropped (if it already exists) and recreated in the staging area to prepare it for the new data.
    • Step 2: Unload Data to a Temporary Flat File:
      • The data is unloaded from the source to a temporary flat file. This can be done using:
        • A source database unload utility (e.g., bcp for Microsoft SQL Server, DB2 unload for DB2).
        • Alternatively, the built-in OdiSqlUnload tool can be used to unload data.
    • Step 3: Generate the Loading Utility Script:
      • A script is generated for the appropriate loading utility (e.g., SQL*Loader for Oracle). This script is used to load the temporary file into the loading table in the staging area.
    • Step 4: Execute the Load Command:
      • The corresponding operating system command is executed to start the data load process.
      • The return code of the operation is checked to ensure the data is loaded successfully.
    • Step 5: Error Handling:
      • If needed, log files generated by the loading utility are analyzed for error handling and troubleshooting.
    • Step 6: Clean-Up:
      • After the integration KM has completed, the loading table is dropped to clean up the staging area.
      • Additionally, the temporary file used for unloading is also removed to free up disk space.
  4. Challenges of Unload/Load Strategy:
    • Disk Space Usage:
      • Data is staged twice—once in the temporary file and again in the loading table, which can result in additional disk space usage.
    • Efficiency Issues:
      • The two-stage process may lead to potential efficiency issues, especially when handling very large datasets.
  5. Alternative: Use of Pipelines:
    • A more efficient alternative to unloading and loading via files would be to use pipelines between the unload and load utilities. However, not all operating systems support file-based pipelines (FIFOs), which can limit the applicability of this method.

 

No comments:

Post a Comment