Discussion: Data Pump.....and Compression?

psmith@epx.com
Data Pump.....and Compression?
Dec 17 2007, 3:58 PM EST | Post edited: Dec 17 2007, 3:58 PM EST
A common technique with exp/imp was compressing through a pipe. Apparently this cannot be done with Data Pump, because "the ... dumpfile ... is no longer processed in a sequential manner" (Note:463336.1).

Certainly understandable up to the point of output to the named DUMPFILE, but beyond that point isn't this a fundamental violation of the accepted handling of files in a UNIX environment? The specified DUMPFILE doesn't comply with what I've always understood to be the basic rules of files in UNIX. Perhaps it would make sense to give the DUMPFILE one additional level of abstraction: use a BLOB or external table for the non-sequential assembly, then attach this to a standard file output that adheres to the traditional conventions, once again allowing the use of pipes! Just an idea.....
Keyword tags: data exp expdp export imp impdp import

Moles
1. RE: Data Pump.....and Compression?
Jan 21 2008, 3:21 PM EST | Post edited: Jan 21 2008, 3:21 PM EST
On the fly compression using gzip or similar can easily reduce dump space requirements by a factor of 5 or more. For large databases that is very significant indeed. Furthermore, if the processing capacity is available, the resultant reduction in I/O required to process a dump will substantially speed up the whole process.
Since the expdp utility effectively prevents the use of pipes (I quote from the manual: "If there are preexisting files that match the resulting filenames, an error is generated. The existing dump files will not be overwritten."), one would appear to be screwed even before considering how to handle multiple output files.
Trying to circumvent this by using the DBMS_DATAPUMP API doesn't work either.
I understand that the compression issue has been addressed in 11g. Anyone want to confirm or deny that based on experience?
rucknrun
2. RE: Data Pump.....and Compression?
Jan 25 2008, 10:16 AM EST | Post edited: Jan 25 2008, 10:16 AM EST
We gzip our dump files after they are created; our nightly exports are set up to do this automatically. If you script your backups with a shell script you can easily make the filenames unique. We append the time to the end of the filename.
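A minimal sketch of that approach, for illustration only. The naming scheme (`nightly_<timestamp>.dmp`) and the scratch directory are assumptions, and a placeholder file stands in for the real export so the script runs without a database.

```shell
#!/bin/sh
# Sketch: timestamped dump file name, compressed after the export finishes.
set -e

dumpdir=$(mktemp -d)                      # stands in for the real export directory
stamp=$(date +%Y%m%d_%H%M%S)              # time suffix keeps each night's name unique
dumpfile="$dumpdir/nightly_$stamp.dmp"    # hypothetical naming scheme

# Stand-in for the export step. A real script would run expdp here with
# DUMPFILE set to the timestamped name, for example:
#   expdp <user>/<password> DIRECTORY=<directory object> DUMPFILE="nightly_$stamp.dmp"
printf 'placeholder dump contents\n' > "$dumpfile"

# Compress after the fact: gzip replaces the .dmp file with a .dmp.gz file.
gzip "$dumpfile"

ls "$dumpdir"
```

Because the name changes on every run, expdp never finds a preexisting file with the same name.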

Moles
3. RE: Data Pump.....and Compression?
Jan 25 2008, 11:21 AM EST | Post edited: Jan 25 2008, 11:21 AM EST
Having to compress after the fact is very wasteful. Even if you don't care about the disk space (and with very large databases that is unlikely to be the case), it wastes a lot of resources. Somewhere between two and three times as much non-cached sequential I/O will be required to process the dump files. Since I/O is the slowest part of the system, that means both imports and exports take far longer than they would if Oracle Corp. had built Data Pump better.

In a Standard Edition environment (which does not provide for parallel dump file activity), it's quite likely that the overall performance of the standard/old export utility when used with compression and named pipes will be comparable to data pump. Thus the only compelling reason for using data pump is for those databases using data types that export does not support. A sad state of affairs.
rucknrun
4. RE: Data Pump.....and Compression?
Jan 25 2008, 1:38 PM EST | Post edited: Jan 25 2008, 1:38 PM EST
Well, gzip is probably more efficient at compression than the Oracle process would be. I don't think compressing the dumps after the fact is that big a deal.

davjohnson
5. RE: Data Pump.....and Compression?
Jan 9 2009, 9:37 AM EST | Post edited: Jan 9 2009, 9:37 AM EST
"Well, gzip is probably more efficient at compression than the Oracle process would be. I don't think compressing the dumps after the fact is that big a deal."
The big deal is that if you can compress using a pipe, you never even create the larger uncompressed file. It's only a stream of data in memory that is being compressed, so there is less overall file I/O, less volume space used, and less time required to complete the process. This can add up quickly if you are on a big production box with 30+ Oracle instances running. The usual inline compression was the UNIX compress command, but I believe you could just as easily run gzip in the pipe with the old imp/exp utilities. Does anyone know if this issue is fixed in 11g?
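For anyone who has not seen the pipe technique, here is a rough sketch of how it is usually put together. The FIFO name and output file name are made up for the example, and a small shell function stands in for exp so the script runs without a database; the exp command in the comment is illustrative only.

```shell
#!/bin/sh
# Sketch: compress an export stream on the fly through a named pipe (FIFO).
set -e

workdir=$(mktemp -d)                  # scratch directory for the demo
pipe="$workdir/exp_pipe"              # hypothetical FIFO name
out="$workdir/full_export.dmp.gz"     # only compressed data ever reaches disk

mkfifo "$pipe"

# Reader: gzip drains the pipe in the background.
gzip -c < "$pipe" > "$out" &
gzip_pid=$!

# Writer: stand-in for the export utility. With the original exp one would
# point the dump file at the pipe instead, along the lines of:
#   exp <user>/<password> file="$pipe" full=y
produce_dump() {
    i=0
    while [ "$i" -lt 1000 ]; do
        echo "row $i: sample export data"
        i=$((i + 1))
    done
}
produce_dump > "$pipe"

wait "$gzip_pid"                      # let gzip flush and close its output
rm -f "$pipe"

# Check that the stream survived the round trip.
lines=$(gzip -dc "$out" | wc -l | tr -d ' ')
echo "decompressed lines: $lines"
```

This only works because the writer produces a strictly sequential stream. A writer that needs to seek within its output file cannot use a FIFO, which is the limitation described for Data Pump at the top of the thread.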

Moles
6. RE: Data Pump.....and Compression?
Jan 9 2009, 3:45 PM EST | Post edited: Jan 9 2009, 3:45 PM EST
All Oracle Corp. has to do is allow an existing file to be overwritten and allow output to multiple files to be turned off. Standard UNIX techniques can then take over.

mathewjoy@yahoo.com
7. RE: Data Pump.....and Compression?
Apr 30 2009, 9:05 PM EDT | Post edited: Apr 30 2009, 9:05 PM EDT
"Well, gzip is probably more efficient at compression than the Oracle process would be. I don't think compressing the dumps after the fact is that big a deal."
Currently I have a piped, gzipped file that is 110GB on a 130GB file system using exp. If I have to expdp this, I need a filesystem of close to 700GB. In fact, compressing dumps after the fact is not a big deal. The BIG DEAL is to compress as the export happens... at least in my case. Any comments?

atulrd
8. RE: Data Pump.....and Compression?
May 17 2009, 6:43 PM EDT | Post edited: May 17 2009, 6:43 PM EDT
Two things
(1) REUSE_DUMPFILES will resolve the issue of NOT being able to overwrite existing dump files.
(2) One can use double compression: first the compression offered by 11g, then gzip on top of it. This approach dramatically reduces the file sizes.
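As a rough sketch, the two points above might be combined in an 11g parameter file like the one written out below. The directory object, schema, and file names are placeholders, the expdp call is shown only as a comment, and as far as I know COMPRESSION=ALL needs the Advanced Compression option, so check licensing before relying on it.

```shell
#!/bin/sh
# Sketch: write an expdp parameter file that reuses dump files and enables
# Data Pump's own compression. Nothing here connects to a database.
set -e

workdir=$(mktemp -d)
parfile="$workdir/nightly.par"    # hypothetical parameter file name

# dump_dir, nightly.dmp, nightly.log and app_owner are placeholder values.
cat > "$parfile" <<'EOF'
DIRECTORY=dump_dir
DUMPFILE=nightly.dmp
LOGFILE=nightly.log
SCHEMAS=app_owner
REUSE_DUMPFILES=Y
COMPRESSION=ALL
EOF

# With an 11g client one would then run something like:
#   expdp <user>/<password> PARFILE="$parfile"
#   gzip -f <path behind dump_dir>/nightly.dmp    # optional second pass
cat "$parfile"
```

How much a second gzip pass gains on top of COMPRESSION=ALL will depend on the data, so it is worth measuring on a real dump.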