Design Approach to Handle Late Arriving Dimensions and Late Arriving Facts
A systematic, high-level ETL design helps you build efficient and flexible ETL processes, so the design phase of a project deserves special care. This article covers the key points to keep in mind while designing an ETL process. The following recommendations can be integrated into your ETL design and development processes to simplify the effort and improve the overall quality of the finished product.
        1. Consistency
        2. Modularity
        3. Reusability
        4. Scalability
        5. Simplicity

1. Consistency

To ensure consistency and ease maintenance after go-live, define and agree on development standards before development work begins.

The standards define the ground rules for the development team. They can range from naming conventions to documentation standards to error-handling standards. Development work should adhere to these standards throughout the life cycle, and new team members can refer to them to understand the requirements placed on design and build activities.

Applying consistent standards for naming conventions, design patterns, error handling, and change data capture reduces long-term complications and makes maintenance easier.
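Standards are easier to follow when they can be checked automatically. The sketch below is only an illustration in Python, not a PowerCenter feature: the prefixes (`m_`, `s_m_`, `wf_`) and the `find_violations` helper are hypothetical examples of rules a team might agree on.

```python
import re

# Hypothetical team conventions: object type -> required name pattern.
# Replace these with whatever your own standards document specifies.
NAMING_RULES = {
    "mapping": re.compile(r"^m_[a-z0-9_]+$"),
    "session": re.compile(r"^s_m_[a-z0-9_]+$"),
    "workflow": re.compile(r"^wf_[a-z0-9_]+$"),
}


def find_violations(objects):
    """Return (type, name) pairs whose names break the agreed pattern.

    Object types without a rule are ignored rather than reported.
    """
    return [
        (obj_type, name)
        for obj_type, name in objects
        if obj_type in NAMING_RULES and not NAMING_RULES[obj_type].match(name)
    ]
```

Running a check like this against an exported object list before each release keeps the agreed conventions from drifting over time.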

2. Modularity

A modular design is important for an efficient ETL design. Divide the components of your ETL process, such as incremental data pull logic, error handling, change data capture, and operational metadata logging, into separate modules. This makes the ETL processes efficient, scalable, and maintainable.
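As a tool-neutral illustration of this separation, the following Python sketch keeps the incremental pull, error handling, and operational logging in separate functions that a small driver composes. All names, the `updated_at` and `customer_id` fields, and the validity rule are assumptions made for the example.

```python
from datetime import datetime


def pull_incremental(rows, last_run):
    """Incremental pull: keep only rows changed since the last successful run."""
    return [r for r in rows if r["updated_at"] > last_run]


def split_errors(rows):
    """Error handling: separate rows that fail a basic validity check."""
    good = [r for r in rows if r.get("customer_id") is not None]
    bad = [r for r in rows if r.get("customer_id") is None]
    return good, bad


def log_run(audit_log, step, row_count):
    """Operational metadata logging: record how many rows each step produced."""
    audit_log.append({"step": step, "rows": row_count})


def run_pipeline(rows, last_run):
    """Compose the independent modules into one load."""
    audit_log = []
    changed = pull_incremental(rows, last_run)
    log_run(audit_log, "pull_incremental", len(changed))
    good, bad = split_errors(changed)
    log_run(audit_log, "rejected", len(bad))
    return good, bad, audit_log
```

Because each concern lives in its own function, the error rule or the logging target can change without touching the pull logic.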

3. Reusability

Reusability is a valuable feature of Informatica PowerCenter. Its purpose is to reduce unnecessary coding, which in turn reduces development time and improves supportability. It also helps you react quickly to changes required in a program.

Give reuse particular attention during the design phase so that modifications can be made quickly and universally. Informatica PowerCenter provides a variety of ways to achieve reusability, such as Mapplets, Worklets, Reusable Transformations, Reusable Functions, Parameters, and Shared Folders.
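The idea behind a parameterized, reusable transformation can be shown outside PowerCenter as well. The Python sketch below is an analogy only, not PowerCenter code: `make_trimmer` is a hypothetical helper that is defined once and configured per use with a list of column names.

```python
def make_trimmer(columns):
    """Build a reusable row transformation that trims whitespace.

    `columns` plays the role of a parameter: the same logic is reused
    for different sources by passing a different column list.
    """
    columns = set(columns)

    def trim(row):
        return {
            key: value.strip() if key in columns and isinstance(value, str) else value
            for key, value in row.items()
        }

    return trim


# The same logic configured for two different sources.
trim_customer = make_trimmer(["name", "city"])
trim_product = make_trimmer(["sku"])
```

A fix to the trimming rule is then made in one place and picked up everywhere the transformation is used.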

4. Scalability

Keep data volumes in mind in order to create an efficient ETL process. Estimating the data volume requirements of a data integration project is critical. Based on the volume estimates, special consideration needs to be given to caching in different transformations, running complex queries, and applying performance tuning techniques such as Pushdown Optimization, Session Partitioning, Dynamic Session Partitioning, Concurrent Workflows, Grid Deployments, Workflow Load Balancing, and other available performance tips.
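A rough back-of-the-envelope calculation is often enough to decide whether caching or partitioning needs attention. The Python sketch below uses a deliberately simple approximation (rows times average row size times an overhead factor). It is not the cache-sizing formula PowerCenter uses, and the default overhead and the rows-per-partition threshold are assumptions to be replaced with your own measurements.

```python
import math


def estimate_cache_mb(row_count, avg_row_bytes, overhead=1.2):
    """Approximate memory needed to cache a lookup table, in MB.

    `overhead` is an assumed multiplier for index and bookkeeping space.
    """
    return row_count * avg_row_bytes * overhead / (1024 * 1024)


def partitions_needed(row_count, rows_per_partition):
    """Number of partitions so that none exceeds `rows_per_partition` rows."""
    return max(1, math.ceil(row_count / rows_per_partition))
```

For example, `partitions_needed(10_000_000, 3_000_000)` suggests four partitions for ten million rows when each partition should handle at most three million.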

5. Simplicity

It is recommended to create multiple simple ETL processes, Informatica mappings, and Informatica workflows instead of a few complex ones. Use a staging area and keep the processing logic as clear and simple as possible. Such a design is easier to develop, debug, and maintain than complex ETL logic.
