Processing Multibyte Characters Like UNICODE Characters in Fixed Width Files

Johnson Cyriac Jun 30, 2013

Working with Multibyte Data in Fixed-Width Files
Processing a fixed-width file in Informatica PowerCenter is usually straightforward. It gets tricky when the file contains both multibyte and single-byte characters, as Unicode data often does. The reason is that fixed-width flat files are byte-oriented, which means the field width is measured in bytes rather than characters. So if a single character takes two bytes to represent, it occupies a width of two. In this article, let's see how such files are handled in Informatica PowerCenter.
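To make the byte-versus-character distinction concrete, here is a small Python sketch. It is only an illustration outside PowerCenter; the helper name `byte_width` and the sample values are made up for this example.

```python
def byte_width(text: str, encoding: str = "utf-8") -> int:
    """Return the number of bytes `text` occupies in the given encoding."""
    return len(text.encode(encoding))


name_ascii = "Tokyo"  # single-byte characters only
name_kanji = "東京"    # multibyte characters

# Character count and byte count agree for single-byte data.
print(len(name_ascii), byte_width(name_ascii))               # 5 5
# For multibyte data the byte count depends on the encoding.
print(len(name_kanji), byte_width(name_kanji))               # 2 6
print(len(name_kanji), byte_width(name_kanji, "shift_jis"))  # 2 4
```

A field holding the two-character value above therefore needs a width of 4 or 6, depending on the code page, not 2.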

The key to processing a fixed-width file that contains both multibyte and single-byte characters is the shift key, which identifies the character boundaries.

What Is a Shift Key

When a file combines multibyte and single-byte characters, we need a mechanism to identify the boundaries of each character so that the fixed-width file can be read correctly. Shift keys come in handy for that.

Shift-in and shift-out keys separate multibyte characters so the Integration Service can read each character correctly. A shift-out key signals the beginning of a sequence of multibyte characters. A shift-in key signals the end of this sequence. If the file source does not contain shift keys, you need to define shift states for each column in the file so the Integration Service can read each character correctly.
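The idea can be sketched in Python as a small state machine. This is not how the Integration Service is implemented; it is only an illustration. It assumes the shift keys are the standard SO (0x0E) and SI (0x0F) control bytes and that those byte values never occur inside the character data itself. The function name and the sample record are hypothetical.

```python
from typing import List, Tuple

SHIFT_OUT = 0x0E  # assumed SO byte: a multibyte sequence begins
SHIFT_IN = 0x0F   # assumed SI byte: the multibyte sequence ends


def split_by_shift_state(record: bytes) -> List[Tuple[str, bytes]]:
    """Split a record into (state, bytes) segments using the shift keys."""
    segments: List[Tuple[str, bytes]] = []
    state = "single"  # records start in the single-byte state
    current = bytearray()
    for b in record:
        if b in (SHIFT_OUT, SHIFT_IN):
            # A shift key closes the current segment and switches state.
            if current:
                segments.append((state, bytes(current)))
                current = bytearray()
            state = "multi" if b == SHIFT_OUT else "single"
        else:
            current.append(b)
    if current:
        segments.append((state, bytes(current)))
    return segments


# Placeholder bytes stand in for two double-byte characters.
record = b"AB" + bytes([SHIFT_OUT]) + b"\x88\xa1\x88\xa2" + bytes([SHIFT_IN]) + b"CD"
print(split_by_shift_state(record))
```

Each segment can then be decoded with the single-byte or multibyte rules that match its state.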

Importing Flat Files with Shift Keys

We can use the Flat File Wizard to import a file source that contains shift keys, just like any other flat file. The Flat File Wizard and the Integration Service use the shift keys in the file to determine the shift state of each column in the source.

Importing Flat Files Without Shift Keys

We can use the Flat File Wizard to import a file source that does not contain shift keys, just like any other flat file. Once the import is complete, we can set the shift state of each column by following the steps below.

As the first step, select the "User Defined Shift State" option.
Now you will see the Shift Key column in the Columns tab. Choose Shift-In if the column contains single-byte characters. Choose Shift-Out if the column contains multibyte characters.
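Conceptually, a user-defined shift state tells the reader ahead of time how to interpret the bytes of each column. The Python sketch below mimics that idea outside PowerCenter. The layout, the column names, and the choice of ASCII and Shift JIS as the single-byte and multibyte encodings are all assumptions made for illustration; in a real session the code page determines the encoding.

```python
from typing import Dict, List, Tuple

# Hypothetical layout: (column name, width in bytes, shift state).
LAYOUT: List[Tuple[str, int, str]] = [
    ("ID", 4, "shift-in"),     # single-byte characters
    ("NAME", 4, "shift-out"),  # multibyte characters
    ("CODE", 2, "shift-in"),   # single-byte characters
]

# Assumed stand-in encodings, one per shift state.
ENCODINGS = {"shift-in": "ascii", "shift-out": "shift_jis"}


def read_record(record: bytes) -> Dict[str, str]:
    """Slice a record by byte width and decode each column by its shift state."""
    fields: Dict[str, str] = {}
    offset = 0
    for name, width, shift_state in LAYOUT:
        raw = record[offset:offset + width]
        fields[name] = raw.decode(ENCODINGS[shift_state])
        offset += width
    return fields


record = b"0001" + "東京".encode("shift_jis") + b"JP"
print(read_record(record))  # {'ID': '0001', 'NAME': '東京', 'CODE': 'JP'}
```

Note that the NAME column is four bytes wide but holds only two characters, which is why the widths are declared in bytes.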

Configuring Session and Workflow

No special configuration is required in the session and workflow to process these files, except choosing the correct code page setting.

Working with Multibyte Data in Fixed Width Targets

To load multibyte data into a fixed-width flat file target, configure the precision to accommodate the multibyte data. When you configure the precision for a fixed width target, you need to consider the number of bytes you load into the target, rather than the number of characters. The Integration Service writes the row to the reject file if the precision is not large enough to accommodate the multibyte data.
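The precision check can be illustrated with a short Python sketch. It is not PowerCenter code; the function name is hypothetical, returning `None` is a stand-in for sending the row to the reject file, and padding with spaces is an assumption.

```python
from typing import Optional


def format_fixed_width(value: str, precision: int, encoding: str = "utf-8") -> Optional[bytes]:
    """Encode `value` and pad it to `precision` bytes, or return None if it does not fit."""
    encoded = value.encode(encoding)
    if len(encoded) > precision:
        # The byte length exceeds the precision: stand-in for rejecting the row.
        return None
    # Pad with single-byte spaces up to the declared byte width (assumed padding).
    return encoded + b" " * (precision - len(encoded))


print(format_fixed_width("Tokyo", 6))  # b'Tokyo '
print(format_fixed_width("東京", 2))    # None: 2 characters, but 6 bytes in UTF-8
print(format_fixed_width("東京", 6) == "東京".encode("utf-8"))  # True
```

A precision of 2 looks sufficient if you count characters, but the value needs 6 bytes in UTF-8, so the precision has to be sized in bytes.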

Hope this tutorial is informative and helpful. Please leave your comments and questions below.





