Sign in or 

|
mansurshamsi |
Find missing and duplicate records in over 500 million records.
Aug 7 2009, 3:35 PM EDT
I have two databases on same server. In DB2 I have already 400 million records. I tried to load more 100 million records from DB1 to DB2 on top of existing 400 million records but after an hour I got the unique constraint error from DB2. The load has been failed after inserting 70% records.Now I have three tasks. First I have to find the duplicate record between DB1 and DB2. Second I have to find out the remaining 30% records. Third I have to re-load the remaining 30% records. I tried to find out the duplicates between DB1 and DB2 but my query is not ending. The unique constraint is based on composite key (charge_id, start_date). Can you give me the better and fast solution? Do you find this valuable?
Keyword tags:
None
|
|
tolko |
1. RE: Find missing and duplicate records in over 500 million records.
Aug 10 2009, 4:33 AM EDT
to find duplicates:SELECT COUNT(*), A, B FROM DBX GROUP BY A, B HAVING COUNT(*) > 2 re-loading MERGE INTO DBX t USING DBY a ON (t.A= a.A) WHEN NOT MATCHED THEN INSERT (t.A, t.B) VALUES (a.A, a.B) Do you find this valuable? |
|
gokhan.kucuk |
2. RE: Find missing and duplicate records in over 500 million records.
Aug 13 2009, 3:02 AM EDT
Trying "Insert and if it fails Update" strategy could be a solution, but sure it will not be fast choice.. Maybe you should use an ETL tool for parallel processing...Do you find this valuable? |