Reload data from a specific time windows in the past
DMS isn't designed for time-windowed selective reloads while maintaining ongoing CDC. The challenge is that CDC works sequentially from transaction logs, and introducing historical data without creating duplicates or inconsistencies requires careful coordination.
The ReloadTables API in AWS DMS doesn't natively support reloading data for a specific time window in the past while CDC is ongoing. It's designed to perform a complete reload of the specified tables, not a targeted time-based refresh.
If this is a critical requirement, we need to implement a custom solution using a combination of DMS tasks and application-specific data reconciliation logic. To achieve this we have a few alternative approaches:
Custom Solution with Staging Tables:
Create staging tables in your target database
Use a separate DMS task to load historical data (from the time window) into these staging tables
Write custom merge logic to incorporate this historical data maintaining the current CDC stream
This keeps your ongoing CDC undisturbed
Adjust CDC Start Position:
If we want to "replay" changes from a specific point in time, we could:
Create a new task that starts CDC from your desired historical point
Direct this to a separate target schema/database
Merge the results after completion
Backup/Snapshot Restoration:
Restore a backup/snapshot from your desired time point to a separate environment
Migrate just the data you need from this point-in-time copy
Merge with your current state
AWS Database Migration Assessment:
Run data validation to identify specific records that differ between source and target
Selectively reload just those records rather than entire tables
Custom Solution with Staging Tables
Key Requirements:
No interruption to ongoing CDC replication
Transactional consistency during merges
Selective time-based data refresh
Full audit trail of changes
For migrating from Oracle to PostgreSQL with a time-window data reload:
Oracle to PostgreSQL Migration with Time-Window Data Reload
Oracle to PostgreSQL Migration with Time-Window Data ReloadEdited:
Added transformation rules in DMS task to handle Oracle-specific data types
Included a verification view to identify data type mismatches
2. Oracle-Specific DMS Settings
Added Oracle-specific settings in the task configuration:
LogMiner reader for CDC (more efficient with Oracle)
Character length semantics settings
Timestamp with timezone conversion options
Supplemental logging options
3. PostgreSQL Syntax Adjustments
Used PostgreSQL's CTE (WITH clause) syntax for efficient updates and inserts
Adjusted the merge procedure to use PostgreSQL's transaction handling
Created PostgreSQL-compatible monitoring views
4. Performance Optimizations
Added additional polling time in the orchestration script (Oracle migrations often take longer)
Added progress percentage reporting to monitor large table migrations
Improved transaction handling for PostgreSQL's MVCC architecture
5. Oracle-PostgreSQL Schema Compatibility
Added data type verification view to identify potential data format issues
Use of PostgreSQL-specific syntax for stored procedures
Handling of Oracle's case-sensitivity vs PostgreSQL's case-insensitivity
This approach allows you to selectively migrate and reload data from a specific time window in your Oracle database to PostgreSQL while maintaining your ongoing CDC replication. The solution is specifically optimized for the Oracle to PostgreSQL migration path.