BryteFlow TruData is the data reconciliation application for data replicated to a Cloud Data Lake or Cloud Data Warehouse. It performs near real-time reconciliation of the configured objects that BryteFlow’s Ingest application has replicated to a Cloud destination (AWS Redshift, S3 or Snowflake).
Though the BryteFlow Ingest application is reliable, operational issues can sometimes affect data completeness on the target or destination. TruData is typically used to verify target data against the original source data, ensuring that BryteFlow Ingest has transferred the data and that no change has been missed. TruData can also be used to identify missing data in situations where the database transaction logs are corrupted or not available. BryteFlow XL-Ingest can then be used to bring back the missing data.
BryteFlow TruData uses process information and mathematical methods to do a complete count and checksum validation of source and target data, at a particular point in time.
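As a rough illustration of count and checksum validation, the Python sketch below compares two in-memory row sets. It is not TruData's actual algorithm, which this page does not describe. The names `row_digest`, `table_fingerprint` and `reconcile`, the result labels, and the choice of SHA-256 row digests summed modulo 2^64 are all assumptions made for this example.

```python
import hashlib
from typing import Iterable, Sequence, Tuple

MASK_64 = (1 << 64) - 1  # keep the running checksum within 64 bits


def row_digest(row: Sequence[object]) -> int:
    """Hash one row to a 64-bit integer (hypothetical scheme, not TruData's)."""
    # Join the repr() of each field with a separator that is unlikely to occur in data.
    encoded = "\x1f".join(repr(value) for value in row).encode("utf-8")
    return int.from_bytes(hashlib.sha256(encoded).digest()[:8], "big")


def table_fingerprint(rows: Iterable[Sequence[object]]) -> Tuple[int, int]:
    """Return (row count, checksum). Summing digests makes the result order-independent."""
    count = 0
    checksum = 0
    for row in rows:
        count += 1
        checksum = (checksum + row_digest(row)) & MASK_64
    return count, checksum


def reconcile(source_rows: Iterable[Sequence[object]],
              target_rows: Iterable[Sequence[object]]) -> str:
    """Compare counts first, then checksums. The labels are illustrative only."""
    src_count, src_sum = table_fingerprint(source_rows)
    dst_count, dst_sum = table_fingerprint(target_rows)
    if src_count != dst_count:
        return "COUNT_MISMATCH"
    if src_sum != dst_sum:
        return "CHECKSUM_MISMATCH"
    return "MATCH"
```

Comparing the count before the checksum is a design choice in this sketch: a count difference is cheaper to explain (rows missing or duplicated), while a checksum difference with equal counts points to changed values.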
Because TruData is companion software to BryteFlow Ingest, its setup is automatic. It performs a thorough and accurate verification of the correctness of the migrated data, and provides near real-time statistics for the reconciled data and the ongoing reconciliation process. It can break very large tables down to a granular level and compare them at that level.
Table size doesn’t matter anymore. Every bit can be reconciled.
The TruData user interface is informative and user friendly. Its graphical representation of the completeness results summarizes everything in one place.
Setting up new tables takes just a few clicks. Monitor progress and results instantly.
The BryteFlow TruData dashboard provides a complete summary of the reconciliations that are currently set up within the BryteFlow TruData application.
It shows a summary for :
On this page you will see a brief summary of all the tables configured for reconciliation. It lists the tables selected for reconciliation along with
Clicking a table name displays a detailed summary, as shown below :
Table Slicing Status, shows detailed statistics for the most recent reconciliation of each processed slice of the table
Slice Name, value range (min to max) for each slice of the selected table
Src. Records, number of records in the source database for the slice
Dest. Records, number of records in the destination database for the slice
Status, reconciliation status of each slice of the selected table
Src. Checksum, checksum value for the slice at the source
Dest. Checksum, checksum value for the slice at the destination
Last Checked, date and time of the most recent reconciliation run for the slice
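To make the per-slice columns above concrete, here is a minimal Python sketch of one row of the slice table and of how a status could be derived from it. The class `SliceResult`, the helper `slices_to_recheck` and the status labels are hypothetical and are not taken from the product; the actual status values shown by TruData may differ.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List


@dataclass(frozen=True)
class SliceResult:
    """One row of the slice table; field names mirror the columns listed above."""
    slice_name: str        # value range (min to max) of the slice
    src_records: int       # Src. Records
    dest_records: int      # Dest. Records
    src_checksum: str      # Src. Checksum
    dest_checksum: str     # Dest. Checksum
    last_checked: datetime  # Last Checked

    @property
    def status(self) -> str:
        # Hypothetical labels: the documentation does not list the real ones.
        if self.src_records != self.dest_records:
            return "COUNT MISMATCH"
        if self.src_checksum != self.dest_checksum:
            return "CHECKSUM MISMATCH"
        return "RECONCILED"


def slices_to_recheck(results: List[SliceResult]) -> List[str]:
    """Names of the slices whose source and destination figures disagree."""
    return [r.slice_name for r in results if r.status != "RECONCILED"]
```

Keeping the result per slice means that only the slices returned by `slices_to_recheck` need attention, rather than the whole table.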
This section is used for basic configuration of the TruData application. It covers :
Location of CDC installation : The path of the BryteFlow Ingest software
Web port : Port number on which the TruData web application is hosted
No. of Source threads: Configured parallelism at the source, that is, the number of threads performing reconciliation checks on the source database
No. of Destination threads: Configured parallelism at the destination, that is, the number of threads performing reconciliation checks on the destination database
Estimated Latency: Estimated latency of the Ingest application, that is, the schedule on which the Ingest instance performs its load
Default Schedule : The default schedule of the reconciliation process for all selected tables. When no schedule is configured, the default schedule is 24 hours.
Product ID : Product ID of the TruData installation
Licence Key : Licence key for the product id of the installation
As the companion software to Ingest, TruData pulls the table list from the Ingest setup and lists all tables in this section so that they can be configured for reconciliation.
In this section, users configure the settings for each table individually.
They get to,
Steps for Table Setup:
BryteFlow TruData has been custom built to handle tables of all sizes. Large tables (over 20GB) are heavy and can take some time to replicate. In such scenarios, we advise slicing the tables into comparatively smaller chunks, which makes the process faster, more streamlined and more manageable. Slicing also helps after a network failure or any other interruption between checks, because only the affected slices need to be redone.
Users can either enter the slice values for a table manually or use the software’s ‘Auto Slice’ functionality. Below are some recommendations for manual slicing:
Example 1: Serial Nos.
*Separate values with one space.
Example 2: Character values
Slice: A G M S Z
*Separate values with one space.
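The manual slice values above can be read as boundaries between consecutive ranges. The Python sketch below shows one possible interpretation, assuming that each pair of neighbouring values forms a slice that includes its lower bound and excludes its upper bound, except for the last slice, which includes both. This interpretation and the helper names `parse_slice_values`, `slice_ranges` and `slice_for` are assumptions for illustration; the page does not state how TruData treats boundary values.

```python
from bisect import bisect_right
from typing import Callable, List, Tuple


def parse_slice_values(spec: str, cast: Callable = str) -> List:
    """Split a space-separated slice specification such as 'A G M S Z'.

    Pass cast=int for numeric keys so they are compared as numbers, not text.
    """
    values = [cast(token) for token in spec.split()]
    if any(low >= high for low, high in zip(values, values[1:])):
        raise ValueError("slice values must be strictly ascending")
    return values


def slice_ranges(values: List) -> List[Tuple]:
    """Pair consecutive boundaries: [A, G, M] -> [(A, G), (G, M)]."""
    return list(zip(values, values[1:]))


def slice_for(key, values: List) -> Tuple:
    """Return the (low, high) slice that contains key under the assumed rules."""
    if len(values) < 2 or not (values[0] <= key <= values[-1]):
        raise ValueError(f"{key!r} is outside the configured slice values")
    # bisect_right counts the boundaries <= key; clamping keeps a key equal
    # to the last boundary inside the final slice.
    index = min(bisect_right(values, key), len(values) - 1)
    return values[index - 1], values[index]
```

With the values from Example 2, `slice_ranges(parse_slice_values("A G M S Z"))` yields four slices, and `slice_for("Harris", ...)` falls in the slice from G to M.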
‘Auto Slice’ is an important feature of ‘Table Setup’. Users can use it to have the software slice tables automatically by following the easy steps below.
**Please note : Partitioned tables replicated to AWS S3 as a destination cannot be sliced.
This section shows the list of actions/jobs performed in the last 24 hours. It lists all actions, such as ‘Slice’, ‘Count’ and ‘Compare’, with their respective statuses, run date and time, number of records, slice information, etc.
The list of actions/jobs can be filtered based on the different status options selected against ‘Filter by’. All statuses are selected by default to show all actions.
Release details (by date descending, latest version first)
Release Notes BryteFlow Trudata – v2.2
Released May 2020
Release Notes BryteFlow Trudata – v2.1.1
Released November 2019