Use Informatica Persistent Cache and Reduce Fact Table Load Time
In a matured data warehouse environment, it is very common to see fact tables with dozens of dimension tables linked to it. If we are using informatica to build this ETL process, we would expect to see dozens of lookup transformations as well; unless any other design techniques are used. Since lookup is the predominant transformation, turning this will help us gain some performance. Lets see how we can use persistent lookup cache for this performance improvement.

In a typical matured data warehouse, we will see dozens of dimension tables which is used across multiple fact tables. All these dimension tables will be used in a lookup transformation with in the ETL process to load the fact tables. 

By nature lookup transformation caches the data from the database and keeps in system memory for faster data retrieval. This caching process takes some time depending on the size of the lookup table. Since the same lookup table can be repeatedly used in multiple fact table load ETLs, we don't have to waist time in building the cache again and again. 

So the idea behind this performance improvement is to build the lookup cache once and use multiple times. For that we shall use the persistent lookup caching technique.

What is Persistent LookUp Cache

If different lookup transformations have same properties like data source, return/output/loopup ports,  cache directory, then the cache created by the lookup transformation can be saved or persisted (Unnamed LookUp Cache) and reused by different lookups in the same mapping. If the cache files are saved with a name prefix (Named LookUp Cache), this can be reused with in the same mapping or in different mappings with compatible lookup properties.

Unnamed cache. When Lookup transformations in a mapping have compatible caching structures, the Integration Service shares the cache by default. You can only share static unnamed caches.

Named cache. Use a persistent named cache when you want to share a cache file across mappings or share a dynamic and a static cache. The caching structures must match or be compatible with a named cache. You can share static and dynamic named caches.

Creating Persistent LookUp Cache

Lets see how to create the Persistent LookUp Cache. For the demonstration lets consider one of the dimension PROD_DIM, which is required for multiple fact table loads. Once the lookup cache is persisted, it can be reused in different fact table loads.

For enabling Named Persistent LookUp Cache, lets give below properties.
  • Lookup cache persistent : Enable persistent lookup cache.
  • Cache File Name Prefix : Provide the cache file name prefix. (This Named Persistent cache file name that will be used in all the other mappings using the same lookup table). Enter the prefix name only.
  • Re-cache from lookup source : Check this option to rebuilt the current data of the lookup source.
Use Informatica Persistent Cache and Reduce Fact Table Load Time

Reusing Persistent LookUp Cache

Now the persisted lookup cache can be reused in different fact tables load mappings (different mappings) with bellow settings.
  • Lookup cache persistent : Enable persistent lookup cache.
  • Cache File Name Prefix : Provide the cache file name prefix. Use the same prefix that was used in creating the persistent cache. 
Use Informatica Persistent Cache and Reduce Fact Table Load Time
Note : 'Re-cache from lookup source' property need be unchecked, so that the cache is not recreated again.

Conclusion

Reusing the lookup cache can give considerable performance improvement. The improvement is directly proportionally to the amount of time taken to to cache the lookup data and the number of times the lookup cache is reused.

Hope you enjoyed this tutorial, Please let us know your question and feedback. We are happy to hear from you.

Share with your friends

Readers Comments