Informatica Performance Tuning Guide, Resolve Performance Bottlenecks - Part 3
In the previous article in this performance tuning series, we covered different approaches to identify performance bottlenecks. In this article we cover methods to resolve those bottlenecks. We will talk about session memory, cache memory, source, target, and mapping performance tuning techniques in detail.
I. Buffer Memory Optimization
When the Integration Service initializes a session, it allocates blocks of memory to hold source and target data. Sessions that use a large number of sources and targets might require additional memory blocks.
You can change the buffer block size in the session configuration, as shown in the image below.
Performance Tuning Tutorial Series
Part I : Performance Tuning Introduction.
Part II : Identify Performance Bottlenecks.
Part III : Remove Performance Bottlenecks.
Part IV : Performance Enhancements.
Not having enough buffer memory for the DTM process can slow down reading, transforming, or writing, and cause large fluctuations in performance. Adding extra memory blocks keeps the threads busy and improves session performance. You can do this by adjusting the buffer block size and the DTM buffer size.
Note : You can identify a DTM buffer bottleneck from the session log file, check here for details.
1. Optimizing the Buffer Block Size
Depending on the source and target data, you might need to increase or decrease the buffer block size.
To identify the optimal buffer block size, sum the precisions of the individual columns in each source and target. The largest row precision among all the sources and targets gives the space needed for one row. Ideally, a buffer block should accommodate at least 100 rows at a time.
- Buffer Block Size = Largest Row Precision * 100
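The sizing rule above can be sketched in a few lines of Python; the column precisions here are made up purely for illustration, so substitute the precisions of your own source and target columns.

```python
# Hypothetical column precisions for one source and one target;
# replace these with the precisions of your own definitions.
source_precisions = [10, 50, 255, 19]       # e.g. ID, NAME, ADDRESS, SALARY
target_precisions = [10, 50, 255, 19, 30]   # target has one extra column

# Row precision = sum of the column precisions for one source or target.
row_precisions = [sum(source_precisions), sum(target_precisions)]

# Size the block from the widest row, allowing ~100 rows per buffer block.
largest_row_precision = max(row_precisions)
buffer_block_size = largest_row_precision * 100

print(buffer_block_size)
```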
2. Increasing DTM Buffer Size
When you increase the DTM buffer memory, the Integration Service creates more buffer blocks, which can improve performance. You can estimate the required DTM buffer size with the calculation below.
- Session Buffer Blocks = (total number of sources + total number of targets) * 2
- DTM Buffer Size = Session Buffer Blocks * Buffer Block Size / 0.9
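Putting the two formulas above together, here is a minimal Python sketch of the DTM buffer sizing arithmetic; the source/target counts and block size are hypothetical values for the example.

```python
# Hypothetical session layout: 2 sources, 3 targets, and a buffer
# block size already determined (e.g. from the row-precision rule).
num_sources = 2
num_targets = 3
buffer_block_size = 64000  # bytes

# Session Buffer Blocks = (sources + targets) * 2
session_buffer_blocks = (num_sources + num_targets) * 2

# DTM Buffer Size = Session Buffer Blocks * Buffer Block Size / 0.9
dtm_buffer_size = session_buffer_blocks * buffer_block_size / 0.9

print(session_buffer_blocks, int(dtm_buffer_size))
```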
You can change the DTM buffer size in the session configuration, as shown in the image below.
II. Caches Memory Optimization
Transformations such as Aggregator, Rank, and Lookup use cache memory to store transformed data, split into an index cache and a data cache. If the allocated cache memory is not large enough to hold the data, the Integration Service stores the data in a temporary cache file. Session performance slows each time the Integration Service reads from the temporary cache file.
Note : You can examine the performance counters to determine which transformations require cache memory tuning, check here for details.
1. Increasing the Cache Sizes
You can increase the allocated cache sizes so that the transformation is processed entirely in cache memory and the Integration Service does not have to read from the cache file.
You can calculate the memory requirements for a transformation using the Cache Calculator. Shown below is the Cache Calculator for the Lookup transformation.
You can update the cache size in the session properties of the transformation, as shown below.
2. Limiting the Number of Connected Ports
For transformations that use a data cache, limit the number of connected input/output and output-only ports. Limiting the number of connected ports reduces the amount of data the transformation stores in the data cache.
III. Optimizing the Target
The most common performance bottleneck occurs when the Integration Service writes to a target database. Small database checkpoint intervals, small database network packet sizes, or problems during heavy loading operations can cause target bottlenecks.
Note : A target bottleneck can be identified with the help of the session log file, check here for details.
1. Using Bulk Loads
You can use bulk loading to improve the performance of a session that inserts a large amount of data into a DB2, Sybase ASE, Oracle, or Microsoft SQL Server database. When bulk loading, the Integration Service bypasses the database log, which speeds performance. Without writing to the database log, however, the target database cannot perform rollback. As a result, you may not be able to perform recovery.
2. Using External Loaders
To increase session performance, configure PowerCenter to use an external loader. External loaders are available for Oracle, IBM DB2, Sybase, and Teradata target databases.
3. Dropping Indexes and Key Constraints
When you define key constraints or indexes in target tables, you slow the loading of data to those tables. To improve performance, drop indexes and key constraints before you run the session. You can rebuild those indexes and key constraints after the session completes.
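The drop-and-rebuild pattern above is usually implemented as pre- and post-session SQL against the target database. As a self-contained illustration, this sketch uses SQLite (purely for demonstration; the table and index names are hypothetical) to show the same sequence: drop the index, bulk insert, then rebuild.

```python
# Self-contained demo of dropping an index before a bulk load and
# rebuilding it afterwards, using an in-memory SQLite database.
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE sales (id INTEGER, amount REAL)")
cur.execute("CREATE INDEX idx_sales_id ON sales (id)")

# "Pre-session SQL": drop the index so the load does not pay for
# index maintenance on every inserted row.
cur.execute("DROP INDEX idx_sales_id")

# The bulk load itself (1000 hypothetical rows).
cur.executemany("INSERT INTO sales VALUES (?, ?)",
                [(i, i * 1.5) for i in range(1000)])

# "Post-session SQL": rebuild the index after the load completes.
cur.execute("CREATE INDEX idx_sales_id ON sales (id)")
conn.commit()

cur.execute("SELECT COUNT(*) FROM sales")
print(cur.fetchone()[0])
```

In PowerCenter you would place the DROP INDEX statement in the session's pre-session SQL and the CREATE INDEX statement in its post-session SQL.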
4. Minimizing Deadlocks
Encountering deadlocks can slow session performance. You can increase the number of target connection groups in a session to avoid deadlocks. To use a different target connection group for each target in a session, use a different database connection name for each target instance.
5. Increasing Database Checkpoint Intervals
The Integration Service performance slows each time it waits for the database to perform a checkpoint. To decrease the number of checkpoints and increase performance, increase the checkpoint interval in the database.
6. Increasing Database Network Packet Size
If you write to Oracle, Sybase ASE, or Microsoft SQL Server targets, you can improve the performance by increasing the network packet size. Increase the network packet size to allow larger packets of data to cross the network at one time.
IV. Optimizing the Source
Performance bottlenecks can occur when the Integration Service reads from a source database. An inefficient query or a small database network packet size can cause source bottlenecks.
Note : Session log file details can be used to identify a source bottleneck, check here for details.
1. Optimizing the Query
If a session joins multiple source tables in one Source Qualifier, you might be able to improve performance by optimizing the query with optimizer hints. Usually, the database optimizer determines the most efficient way to process the source data. However, you might know properties about the source tables that the database optimizer does not. The database administrator can create optimizer hints to tell the database how to execute the query for a particular set of source tables.
2. Increasing Database Network Packet Size
If you read from Oracle, Sybase ASE, or Microsoft SQL Server sources, you can improve the performance by increasing the network packet size. Increase the network packet size to allow larger packets of data to cross the network at one time.
V. Optimizing the Mappings
Mapping-level optimization may take time to implement, but it can significantly boost session performance. Focus on mapping-level optimization after you optimize the targets and sources.
Generally, you reduce the number of transformations in the mapping and delete unnecessary links between transformations to optimize the mapping. Configure the mapping with the least number of transformations and expressions to do the most amount of work possible. Delete unnecessary links between transformations to minimize the amount of data moved.
Note : You can identify Mapping bottleneck from Session Log File, check here for details.
1. Optimizing Datatype Conversions
You can increase performance by eliminating unnecessary datatype conversions. For example, if a mapping moves data from an Integer column to a Decimal column, then back to an Integer column, the unnecessary datatype conversion slows performance. Where possible, eliminate unnecessary datatype conversions from mappings.
2. Optimizing Expressions
You can also optimize the expressions used in the transformations. When possible, isolate slow expressions and simplify them.
- Factoring Out Common Logic : If the mapping performs the same task in multiple places, reduce the number of times the mapping performs the task by moving the task earlier in the mapping.
- Minimizing Aggregate Function Calls : When writing expressions, factor out as many aggregate function calls as possible. Each time you use an aggregate function call, the Integration Service must search and group the data. For example, SUM(COL_A + COL_B) performs better than SUM(COL_A) + SUM(COL_B).
- Replacing Common Expressions with Local Variables : If you use the same expression multiple times in one transformation, you can make that expression a local variable.
- Choosing Numeric Versus String Operations : The Integration Service processes numeric operations faster than string operations. For example, if you look up large amounts of data on two columns, EMPLOYEE_NAME and EMPLOYEE_ID, configuring the lookup around EMPLOYEE_ID improves performance.
- Using Operators Instead of Functions : The Integration Service reads expressions written with operators faster than expressions with functions. Where possible, use operators to write expressions.
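The aggregate-call rule above can be illustrated in plain Python: summing the combined columns in one pass produces the same result as aggregating each column separately, so the combined form saves the Integration Service a second search-and-group pass. The column values here are made up for the example.

```python
# Hypothetical column values. SUM(COL_A + COL_B) visits the data once,
# while SUM(COL_A) + SUM(COL_B) requires two separate aggregations.
col_a = [1, 2, 3, 4]
col_b = [10, 20, 30, 40]

combined = sum(a + b for a, b in zip(col_a, col_b))  # one aggregation pass
separate = sum(col_a) + sum(col_b)                   # two aggregation passes

# Both yield the same total, so prefer the single-pass form.
print(combined, separate)
```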
3. Optimizing Transformations
Each transformation is different, and so is the tuning it requires. Generally, though, you reduce the number of transformations in the mapping and delete unnecessary links between transformations to optimize them.
Note : Tuning techniques for the different transformations will be covered in a separate article.
What is Next in the Series
The next article in this series will cover the additional features available in Informatica PowerCenter to improve session performance. We hope you enjoyed this article; please leave us a comment or feedback if you have any, we are happy to hear from you.