Overview
Centerprise Integration Studio is designed to provide a hyper-parallel, multithreaded platform for handling large-scale data integration tasks. The engine takes advantage of the multiprocessor architecture of today's computers by running several steps of a data integration task in parallel. The product also provides a number of features that enable major performance gains in data integration tasks.
Data Synchronization is a key feature of Centerprise and is designed to optimize large-scale data transformation processes. This feature is especially valuable in situations where you periodically replicate data between different systems or update large data sets with data refreshes from other systems. By eliminating database writes that are not necessary and optimizing the ones that are, this feature can bring about major performance gains.
This document provides an overview of the Centerprise Data Synchronization feature. It discusses circumstances where Data Synchronization can be used to optimize transfer tasks. It also discusses techniques and options for controlling the data synchronization process.
How Synchronization Works
Data Synchronization is designed for situations where an existing database is updated periodically with data from another source. The source can be a file, database, query, or any other source type supported by Centerprise. Typically, the source contains a large number of records, but only a subset of these records has actually changed since the previous update.
Centerprise synchronization compares data in the destination with the source data and, based on a set of user-controlled options, updates the destination database. Synchronization is performed in batches. Centerprise reads a batch of records from the source and maps them to the destination. For all records in the batch, it retrieves the corresponding destination records and reconciles the source records against them. If the source and destination records are identical, no update is performed. If there is no corresponding destination record for a source record, the record is inserted. Multiple batches are run in parallel to increase throughput.
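The reconciliation of a single batch can be illustrated with a minimal Python sketch. It is not the Centerprise implementation or API. The function name `reconcile_batch`, the `id` key field, and the in-memory dictionary standing in for the destination database are all assumptions made for illustration, as is the choice to treat a differing record as an update.

```python
from typing import Dict, Iterable, List, Tuple

Record = Dict[str, object]


def reconcile_batch(
    source_batch: Iterable[Record],
    destination: Dict[object, Record],
    key: str = "id",  # hypothetical key field used to match records
) -> Tuple[List[Record], List[Record], int]:
    """Compare one batch of source records against the destination.

    Returns (inserts, updates, unchanged_count). Nothing is written here;
    the caller decides how to apply the resulting writes.
    """
    inserts: List[Record] = []
    updates: List[Record] = []
    unchanged = 0
    for record in source_batch:
        existing = destination.get(record[key])
        if existing is None:
            inserts.append(record)   # no destination record: insert
        elif existing != record:
            updates.append(record)   # records differ: candidate for update
        else:
            unchanged += 1           # identical: no database write needed
    return inserts, updates, unchanged


# In-memory stand-in for the destination table, keyed by "id".
destination = {
    1: {"id": 1, "status": "open"},
    2: {"id": 2, "status": "shipped"},
}
source_batch = [
    {"id": 1, "status": "open"},      # identical to destination
    {"id": 2, "status": "returned"},  # changed since the last run
    {"id": 3, "status": "open"},      # not yet in destination
]
inserts, updates, unchanged = reconcile_batch(source_batch, destination)
```

In this example only two of the three source records lead to a write, which is the kind of saving the feature aims for when most source records are unchanged.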
Synchronizing Data
Centerprise provides a number of options to influence data synchronization behavior. This section briefly describes these options.
If Record exists in source and destination
You can choose Update, Skip, or Delete and Insert actions on records that exist in both source and destination. You can use business rules to specify update conditions. For instance, if you want to skip updating orders that are already shipped, you can specify a condition that filters out these orders.
If Record exists only in source
You can choose Skip or Insert actions on records that are found in source but do not exist in destination.
If Record exists only in destination
For records that exist in destination but do not exist in source, you can choose to leave them untouched or delete them.
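The three option groups above can be modeled as a small decision function. The following Python sketch is illustrative only and does not reflect Centerprise's actual option names or API. `SyncOptions`, `plan_actions`, the option strings, and the `status` field are hypothetical, and the sketch assumes that an action on a record present on both sides is only considered when the two records differ.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

Record = Dict[str, object]


@dataclass
class SyncOptions:
    """Hypothetical option set mirroring the three cases described above."""
    on_both: str = "update"            # "update", "skip", or "delete_insert"
    on_source_only: str = "insert"     # "insert" or "skip"
    on_destination_only: str = "keep"  # "keep" or "delete"
    # Optional business rule: return False to leave an existing record alone.
    update_condition: Optional[Callable[[Record, Record], bool]] = None


def plan_actions(
    source: Dict[object, Record],
    destination: Dict[object, Record],
    options: SyncOptions,
) -> List[Tuple[str, object]]:
    """Return (action, key) pairs; no data is modified."""
    plan: List[Tuple[str, object]] = []
    for key, src in source.items():
        dst = destination.get(key)
        if dst is None:
            # Record exists only in source.
            if options.on_source_only == "insert":
                plan.append(("insert", key))
        elif src != dst and options.on_both != "skip":
            # Record exists on both sides and differs.
            rule = options.update_condition
            if rule is None or rule(src, dst):
                plan.append((options.on_both, key))
    if options.on_destination_only == "delete":
        # Records that exist only in destination.
        for key in destination:
            if key not in source:
                plan.append(("delete", key))
    return plan


source = {
    1: {"status": "cancelled"},  # already shipped in destination
    2: {"status": "shipped"},    # changed, still open in destination
    3: {"status": "open"},       # only in source
}
destination = {
    1: {"status": "shipped"},
    2: {"status": "open"},
    4: {"status": "open"},       # only in destination
}
options = SyncOptions(
    on_destination_only="delete",
    # Business rule from the text: do not update orders already shipped.
    update_condition=lambda src, dst: dst["status"] != "shipped",
)
plan = plan_actions(source, destination, options)
# plan == [("update", 2), ("insert", 3), ("delete", 4)]
```

Order 1 is left untouched because the business rule filters out shipped orders, while the remaining records follow the selected action for their case.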
Using Data Synchronization for Data Replication
Data synchronization can be used to perform data replication across databases. Using the Centerprise Scheduler, you can trigger data synchronization at specific intervals. Because Centerprise supports popular databases such as Microsoft SQL Server, Oracle, Sybase, and DB2, you can perform these updates directly between two databases without using intermediate flat files.
Speeding up Synchronization with Database Writer Options
Centerprise provides a number of features that enable you to control the database writing process, including several options that speed up database writes. These options can be combined to give a significant performance boost to your data synchronization tasks. Please refer to 'Optimizing Large Jobs in Centerprise' for a detailed discussion of this topic.
Controlling Synchronization Process
There are other options that enable you to control the synchronization process. These include the synchronization batch size and the number of synchronization batches that can be executed concurrently.
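How batch size and concurrency interact can be sketched with Python's standard thread pool. This is a generic illustration, not Centerprise code. The names `batches`, `run_in_batches`, `batch_size`, and `max_concurrent` are assumptions chosen to mirror the two options mentioned above.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def batches(records: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive lists of at most batch_size records."""
    iterator = iter(records)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def run_in_batches(
    records: Iterable[T],
    process_batch: Callable[[List[T]], R],
    batch_size: int,      # hypothetical stand-in for the batch size option
    max_concurrent: int,  # hypothetical stand-in for the concurrency option
) -> List[R]:
    """Process batches on a thread pool; results keep batch order."""
    with ThreadPoolExecutor(max_workers=max_concurrent) as pool:
        # Executor.map submits every batch up front, but at most
        # max_concurrent of them are processed at the same time.
        return list(pool.map(process_batch, batches(records, batch_size)))


# Ten records, four per batch, two batches in flight at once.
sizes = run_in_batches(range(10), len, batch_size=4, max_concurrent=2)
# sizes == [4, 4, 2]
```

Larger batches mean fewer round trips per record, while more concurrent batches keep more processors and database connections busy. The best combination depends on the workload and on the destination database.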
Creating Custom Synchronizers
While Centerprise Synchronization provides a high degree of configurability, there may be situations where you want to create customized synchronizers. Centerprise exposes all of its synchronization functionality through extensive APIs that can be used to create customized variants. Please refer to the 'Creating Custom Integrators using Centerprise APIs' series of articles.
See Also
Optimizing Large Jobs in Centerprise ET