Data Profiling in Centerprise
What is Centerprise
Centerprise Integration Studio and Centerprise Integration Server are parts of Astera’s suite of data transformation and integration products. Centerprise provides a number of features for data transformation, migration, synchronization, and profiling.
This document provides an overview of the data profiling functionality in Centerprise.
Data Profiling in Centerprise
Centerprise data profiling supports the data integration functionality by providing extensive, accurate information about data. This information can be generated for incoming data or the data already in the database.
You can use this feature to generate summary and detailed data profiles, which help you understand the nature of incoming data and pinpoint data quality issues before committing the data to the database. Depending on the level of detail specified, the profiler generates field level aggregate information as well as individual record level error information. Field level aggregate information includes:
- Minimum value
- Maximum value
- Number of null values
- Number of records with errors in that field
- Number of records with warnings in that field
Record level information shows source and destination records side-by-side along with validation error or warning messages for field and record level validations.
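The field level aggregates listed above can be illustrated with a short sketch. This is not Centerprise code; it is a minimal Python illustration that assumes records are plain dictionaries, that a missing or None value counts as null, and that the non-null values of a field are mutually comparable. The function name profile_field is hypothetical. Error and warning counts are left out here because they depend on validation rules.

```python
def profile_field(records, field):
    """Return minimum, maximum, and null count for one field.

    Assumption: each record is a dict, and a missing key or None is a null.
    """
    values = [record.get(field) for record in records]
    present = [value for value in values if value is not None]
    return {
        "min": min(present) if present else None,
        "max": max(present) if present else None,
        "nulls": len(values) - len(present),
    }
```

Calling profile_field on a list of records with a field name returns one summary per field, so a full profile in this sketch would be one such call per field.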
Data Validation Rules
Using the Centerprise business rules language, you can define validation rules on the incoming or mapped data and view field and record level statistics in the profiler. Here are some sample validation rules. While the examples are intentionally kept simple, the rules language is powerful enough to handle complex data validations. Moreover, you can quickly add custom functions, which lets you develop validation function libraries and integrate with other libraries, components, and services.
Validation rules can be defined as either errors or warnings. When a rule is defined as a warning and fails, a warning message is added to the data profile but the record proceeds to the next step of the transformation. When a rule defined as an error fails, an error message is added and the record is not processed by the next step.
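The difference between errors and warnings can be sketched as follows. This is a Python illustration of the behavior described above, not the Centerprise business rules language; the rule tuple layout, the sample rules, and the function name apply_rules are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical rules: (field, severity, message, check). A check returns
# True when the record is valid for that rule.
SAMPLE_RULES = [
    ("age", "error", "age must be a non-negative number",
     lambda r: r.get("age") is not None and r["age"] >= 0),
    ("email", "warning", "email is missing",
     lambda r: r.get("email") is not None),
]

def apply_rules(records, rules):
    """Validate records and split them by outcome.

    Returns the records allowed to proceed, the record level messages,
    and per-field error and warning counts.
    """
    proceed, messages = [], []
    errors, warnings = Counter(), Counter()
    for index, record in enumerate(records):
        has_error = False
        for field, severity, message, check in rules:
            if check(record):
                continue
            messages.append((index, field, severity, message))
            if severity == "error":
                errors[field] += 1
                has_error = True
            else:
                warnings[field] += 1
        # A failed warning rule does not stop the record; a failed error rule does.
        if not has_error:
            proceed.append(record)
    return proceed, messages, errors, warnings
```

In this sketch, a record with a missing email still reaches the next step and adds one to the warning count for that field, while a record with a negative age is held back and adds one to the error count.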
Generating Data Profile
Centerprise automatically generates a data profile when you transfer data from any source to a destination. You can also generate a data profile by using the ‘Validate’ feature of the product. This feature simulates the data transfer steps but does not write any data to the destination.
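The idea behind a validate-only run can be sketched in a few lines. This is an assumption-laden Python illustration, not how Centerprise implements ‘Validate’: the function run_transfer, its validate_only flag, and the use of a list as the destination are hypothetical.

```python
def run_transfer(records, transform, destination, validate_only=False):
    """Run every transfer step and collect counts for a simple profile.

    When validate_only is True, each record is still read and transformed,
    but nothing is appended to the destination.
    """
    profile = {"read": 0, "failed": 0, "written": 0}
    for record in records:
        profile["read"] += 1
        try:
            row = transform(record)
        except ValueError:
            # The record failed a transfer step; count it and move on.
            profile["failed"] += 1
            continue
        if not validate_only:
            destination.append(row)
            profile["written"] += 1
    return profile
```

Both modes produce the same read and failed counts in this sketch, so the validate-only run reports the same problems as a real transfer without changing the destination.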