Talend: use tSchemaComplianceCheck to validate data

Moving data from one system to another can be troublesome, especially when the two systems allow different field lengths. In Talend, the tSchemaComplianceCheck component makes this easy to handle. It validates each row against the schema that you have defined and lets you catch the rows that do not comply.



Here is a simple demo of how to use the component. I have the following data in a CSV file:



In my database, the UserID field is 8 characters long, and this is the schema that I defined in Talend. Note that the Nullable, Date Pattern and Length attributes matter, because tSchemaComplianceCheck uses them to validate the data. In this example I want to validate the length of every field, so I have set the allowed length of each column in the Database component schema accordingly.



After that, I connect the output row of the tMap component to the Database component through tSchemaComplianceCheck. In the tSchemaComplianceCheck settings, I set the mode to “Check all columns from schema”. Once everything is set, the second row of my CSV file should be rejected by tSchemaComplianceCheck because its UserID exceeds the maximum allowed length, as shown in the screenshot below.
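
For readers who want to see the idea outside the Talend Studio UI, here is a minimal Java sketch of the kind of check described above: each column is tested against its Nullable and Length attributes, and a row with any violation is treated as a reject. This is not the code Talend generates. The class name, the `ColumnRule` helper, the `Name` column and the sample values are all hypothetical; only the 8-character limit on UserID comes from the example above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Illustrative sketch only; not Talend's generated code.
class SchemaComplianceSketch {

    // Hypothetical helper mirroring the Nullable and Length schema attributes.
    static final class ColumnRule {
        final String name;
        final boolean nullable;
        final int maxLength;

        ColumnRule(String name, boolean nullable, int maxLength) {
            this.name = name;
            this.nullable = nullable;
            this.maxLength = maxLength;
        }
    }

    // Checks every column in the schema against the row.
    // Returns the list of violations; an empty list means the row complies.
    static List<String> check(Map<String, String> row, List<ColumnRule> schema) {
        List<String> errors = new ArrayList<>();
        for (ColumnRule rule : schema) {
            String value = row.get(rule.name);
            if (value == null) {
                if (!rule.nullable) {
                    errors.add(rule.name + ": null is not allowed");
                }
            } else if (value.length() > rule.maxLength) {
                errors.add(rule.name + ": length " + value.length()
                        + " exceeds maximum " + rule.maxLength);
            }
        }
        return errors;
    }

    public static void main(String[] args) {
        List<ColumnRule> schema = List.of(
                new ColumnRule("UserID", false, 8),  // 8 characters, as in the post
                new ColumnRule("Name", true, 50));   // assumed column, for illustration

        // Made-up sample rows: the second UserID is longer than 8 characters.
        List<Map<String, String>> rows = List.of(
                Map.of("UserID", "AB123456", "Name", "Alice"),
                Map.of("UserID", "AB123456789", "Name", "Bob"));

        for (Map<String, String> row : rows) {
            List<String> errors = check(row, schema);
            // Compliant rows would continue to the database; the rest are rejects.
            System.out.println((errors.isEmpty() ? "OK     " : "REJECT ")
                    + row.get("UserID") + " " + errors);
        }
    }
}
```

In the sketch, a row that fails any column check is kept out of the main output together with the reason, which is roughly the role the reject output of tSchemaComplianceCheck plays in the job.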