Notes of my Lotus Notes


Tuesday, September 8, 2015

Leading and Trailing Zeroes for Decimal fields when writing from Table to File in DataStage

There was an issue in one of my jobs where we are provided with a base file that comes with Decimal values of various lengths, precisions and scales.

We also have a job where we are reading data from a table and writing it to a file.

The requirement is to match the file we create from the table with the base file.

Problem 1: A decimal column comes in the base file as 000001234.12, but in the file we create it appears as 0000001234.12.

Problem 2: A decimal column comes in the base file as 00.123, but in the file we create it appears as 000000.000.

NOTE: the decimal fields are type cast to the char data type while writing to the file, since the data in the base file comes as characters only.

e.g. 

SELECT
    SNPSHT_DT,
    POLICY_NUM,
    trim(cast(FUND_VAL AS char(13))) as FUND_VAL,
    FUND_SRC_CD,
    trim(cast(INT_RT_CUR as char(7))) as INT_RT_CUR,
    trim(cast(EIA_FND_GMIC_PCT as char(10))) as EIA_FND_GMIC_PCT,
    trim(cast(STRT_NDX_VAL as char(13))) as STRT_NDX_VAL
FROM
    EIM.TO_ANTY_FUND
ORDER BY POLICY_NUM, CASE_NUM, FUND_CD

Explanation:
When such issues occur, one must check the metadata of the table first, instead of just experimenting with the length of the character data type.

Understand how the data type is stored. That's what I did, and I noticed that the length and scale defined for the columns were causing the problem.

1st field: The data coming from the base file is 9 digits + a dot (.) + 2 digits = 12 characters.

Now, the metadata of the table had this column as (12,2), i.e. 12 is the Length, of which 10 is the Precision (integer digits) and 2 is the Scale.

So, when the data is stored, it is held internally as 10 digits + (.) + 2 digits.

Now, the dot (.) counts toward the length of a Character data type, but it does not count toward the length of a Decimal data type, because it is part of the data type itself.

Now, expecting data 12 characters long, when you type cast the decimal to char(12) you get the data as 0000001234.1, i.e. the first 12 characters.

To get the correct scale, when you change char(12) to char(13) you get 0000001234.12: the correct scale, but with an extra leading zero.
So either way, your data does not match.
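The arithmetic behind this can be sketched in Python. This is only an illustration: the zero-padded f-string below stands in for the database's internal decimal image, which is an assumption about how the value is rendered, not DataStage or DB2 code.

```python
# Illustrative sketch of problem 1: a (12,2) decimal column holds
# 10 integer digits, so 1234.12 is rendered internally as
# 10 digits + '.' + 2 digits = 13 characters.
stored = f"{1234.12:013.2f}"   # zero-pad to a total width of 13
print(stored)                  # 0000001234.12

as_char_12 = stored[:12]       # cast to char(12): the first 12 characters
print(as_char_12)              # 0000001234.1  -- the scale is lost

as_char_13 = stored            # char(13) keeps everything
print(as_char_13)              # 0000001234.12 -- extra leading zero vs 000001234.12
```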

2nd field: In this case, the metadata of this decimal field was defined as (12,6), i.e. from a Length of 12, the Precision is 6 and the Scale is 6.

Thus, the data is actually stored internally as 000000.123000.

Now, expecting data 6 characters long, when you type cast the decimal to char(6) it gives you the first 6 characters, i.e. 6 zeroes.

If you change the length to char(10) you match the correct scale, but again with extra leading zeroes.
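The same Python sketch shows the second field's behaviour (again, the padded string is only a stand-in for the internal decimal image, which is an assumption):

```python
# Illustrative sketch of problem 2: a (12,6) decimal column holds
# 6 integer digits and 6 fraction digits, so 0.123 is held as
# 000000.123000 (13 characters including the dot).
stored = f"{0.123:013.6f}"
print(stored)                  # 000000.123000

print(stored[:6])              # cast to char(6)  -> 000000, all zeroes
print(stored[:10])             # cast to char(10) -> 000000.123, extra zeroes vs 00.123
```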

Solution: Since we want the output file to match the base file, the plain typecasting above isn't right.

A simple solution is to redefine the length and scale of the decimal columns.

Now, my work restricts me from changing the DDL of the table; whatever is defined as per my project standards has to stay.

So, I did the redefining in a different way:
I type cast the decimal column to the desired precision and scale first, and then type cast the redefined decimal to char.

e.g.

SELECT
    SNPSHT_DT,
    POLICY_NUM,
    trim(cast(cast(FUND_VAL as DECIMAL(11,2)) AS char(12))) as FUND_VAL,
    FUND_SRC_CD,
    trim(cast(INT_RT_CUR as char(7))) as INT_RT_CUR,
    trim(cast(cast(EIA_FND_GMIC_PCT as DECIMAL(5,3)) as char(6))) as EIA_FND_GMIC_PCT,
    trim(cast(STRT_NDX_VAL as char(13))) as STRT_NDX_VAL
FROM
    EIM.TO_ANTY_FUND
ORDER BY POLICY_NUM, CASE_NUM, FUND_CD

If you notice, in the above example I am first type casting the first field to DECIMAL(11,2) and then type casting it to char(12). In the second field, I'm type casting it to DECIMAL(5,3) and then to char(6).

Thus, I have the exact expected data, i.e. 000001234.12 and 00.123.
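The effect of the double cast can be checked with a short Python sketch; the widths mirror DECIMAL(11,2) and DECIMAL(5,3), and the f-string padding is only an assumed stand-in for the database's character rendering:

```python
# DECIMAL(11,2): 9 integer digits + '.' + 2 fraction digits = 12 characters
fund_val = f"{1234.12:012.2f}"
print(fund_val)                # 000001234.12 -- matches the base file

# DECIMAL(5,3): 2 integer digits + '.' + 3 fraction digits = 6 characters
pct = f"{0.123:06.3f}"
print(pct)                     # 00.123
```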

Hope it helps you in some way.
:)

Unicode character instead of Space in DataStage

Recently I was debugging a job where we load data from a .csv file into a table. The problem was that the Character fields showed spaces as blocks instead of spaces.
Upon checking, 2 observations were made:

1) The Character field had the Extended property set to Unicode.
2) A Trim function was performed on the column in the Transformer stage and the result was assigned back to a Character field.

These 2 were causing the issue.
Once the Unicode option was changed to blank and the trims were removed from the Transformer stage, the Unicode characters, i.e. the blocks, disappeared.

Reverting either change brought the blocks back.
One more observation: the blocks did not appear when a Trim was performed on a Character field but the result was assigned to a Varchar field.

:)