Processing UNICODE Characters in Informatica PowerCenter Workflow
A couple of days back, one of my friends mailed me to say that he was not able to process Arabic characters using an Informatica PowerCenter workflow. You might have faced the same issue when processing scripts such as Arabic, Hebrew, Chinese, etc. Let's discuss how we can process such non-English scripts in Informatica PowerCenter workflows.
Before we jump into the Informatica PowerCenter configuration, let's understand a couple of key concepts behind processing different character sets.
Character Set : A code that pairs a set of natural language characters, such as an alphabet or symbols, with a set of numbers. For example, the ASCII character set uses the numbers 0 through 127 to represent all English characters as well as special control characters. UNICODE is the most widely used character set; it can represent over 110,000 characters covering 100 scripts such as Arabic, Hebrew, Chinese, etc.
Character Encoding : An algorithm that translates a list of numbers (the numbers defined in the character set) to binary, so that a computer can store the characters and display them in a way that humans can understand. UTF-8 is the most popular encoding used for the UNICODE character set.
From the above description, it is evident that the character set and the character encoding are the keys to processing any foreign characters correctly. We need to have the Informatica PowerCenter Integration Service and Repository Service configured to process all the characters that might come in from your data sources.
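As a rough illustration of why the configured character set must cover every character in your data, the Python sketch below checks whether a piece of text can be represented in a few code pages. The helper name can_encode is hypothetical, and the Python codec names only stand in for code pages; this is not how PowerCenter performs the check internally.

```python
def can_encode(text, code_page):
    """Return True if every character in text exists in the given code page."""
    try:
        text.encode(code_page)
        return True
    except UnicodeEncodeError:
        return False


# Illustrative inputs: an Arabic word, and the same word plus a Chinese character.
arabic = "\u0633\u0644\u0627\u0645"
mixed = arabic + " \u4e2d"

for code_page in ("ascii", "latin-1", "cp1256", "utf-8"):
    print(code_page, can_encode(arabic, code_page), can_encode(mixed, code_page))
```

A single-language code page such as cp1256 (Windows Arabic) handles the Arabic word, but only a UNICODE encoding such as UTF-8 handles the mixed text.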
Integration Service Configuration
You can choose the character set supported by the Integration Service during the initial configuration, or you can change it later from the Administrator console.
During Informatica PowerCenter Installation
During PowerCenter installation, we can set the supported character set, or Data Movement Mode, as shown in the image below. Please check out the Informatica PowerCenter Installation Guide for step-by-step installation instructions.
After Informatica PowerCenter Installation
We can also change the character set after PowerCenter installation. You can do this from the Admin Console.
Log on to the Admin Console using the admin user ID and password, and choose the Integration Service from the Domain Navigator, as shown in the image below.
Click Edit to change the Data Movement Mode (Character Set).
Repository Service Configuration
The character set of the Repository Service can only be set during the service configuration. It cannot be changed later. See the highlighted image below, and check out the complete Informatica PowerCenter Installation Guide.
With this configuration, Informatica PowerCenter will have the capability to handle any character within the UNICODE character set.
Workflow Configuration
During the configuration of each workflow, you need to choose the code page, or character encoding, for the source data and the target data.
You can choose the code page, or character encoding, from the source and target properties, as shown in the image below.
You can choose the code page, or character encoding, of the target data as shown in the image below.
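To see why the source and target code pages matter, the Python sketch below mimics what happens when they do not match the data. It assumes a UTF-8 source file and uses Python codec names as stand-ins for code pages; it is an illustration of the general behaviour, not PowerCenter code.

```python
# Bytes as they would sit in a UTF-8 encoded source file (illustrative data).
original = "\u0633\u0644\u0627\u0645"
file_bytes = original.encode("utf-8")

# Source side: reading the bytes with the wrong code page gives garbled text,
# while reading them with the matching code page gives the original text back.
wrong = file_bytes.decode("latin-1")
right = file_bytes.decode("utf-8")
print(repr(wrong))
print(right == original)

# Target side: writing to a code page that lacks the characters loses them.
# errors="replace" substitutes "?" for every character that cannot be encoded.
lost = original.encode("latin-1", errors="replace")
print(lost)
```

In both failure cases no error is raised, so the damage shows up only in the data itself. It is worth checking the code page on every source and target that carries non-English text.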
Hope you enjoyed this post and found it informative. Please leave your questions and comments.