ON THE FLY COMPRESSION/UNCOMPRESSION IS EASY ON UNIX, BUT ALSO ON WINDOWS  

Posted by ReelTym

Link to full article

ON THE FLY COMPRESSION IS EASY ON UNIX, BUT IT IS ALSO EASY ON WINDOWS

There are a number of situations where the output results of a program become the input data for other program (for example you may want to compress your backup file with gzip tool and the compressed file be encrypted with Rijndael algorithm using GNU aes tool)

When both programs support stdin and stdout as a mechanism for input output, you can easily pipe the output of the first program to the input of the second program at the command line. For example

gzip –c mybackupfile.bkp | aes –e -p mypass -o mybackupfile.bkp.gz.enc

Unfortunately, this is not always the case and some programs don’t accept stdin and stdout for data input output (this is the case of Oracle import/export tools or Microsoft bcp tool).

On Unix environments these cases have been typically solved using mknod tool to create an OS file pipe. Once the pipe is created, the first program is able to write its output results to the pipe as if it was a normal file and the second program is able to read data from that pipe as if it was a normal file.

As an example, here you can find a largely used script by Oracle DBA to perform on the fly compression of an export operation

# Make a pipe
mknod expdat.dmp p
# Start compress the pipe in background
gzip -c < expdat.dmp > expdat.dmp.gz &
# Wait start the export
sleep 5
# Start the export
exp scott/tiger file=expdat.dmp

As far as I know, there is no similar native way to perform this operation on Microsoft Windows operating system.

I started thinking on it and finally I got a simple solution using Microsoft Windows pipes, zlib library (http://www.zlib.net/zlib123-dll.zip ) and a couple of small tools (less than 100 lines of code each of them)  I wrote: ZipPipe.exe and UnZipPipe.exe.

In point 1, I will show several uses of these tools, basically how to perform on the fly compression using Oracle imp and exp tools.
To get the necessary bin files (ZipPipe.exe, UnZipPipe.exe and zlib1.dll I would suggest to read point 2 and 3 of this document, but if you have any problem to obtain these files, just drop me an e-mail at jcarlossaez1@hotmail.com
Note: You can obtain a compiled version of these tools from http://cid-b3378f057444b65c.skydrive.live.com/self.aspx/P%c3%bablico/ZipPipe/zippipe.zip You don´t need anything more than these to run the tools.

1 HOW TO USE ZipPipe AND UnZipPipe TOOLS


1.1 On the Fly compression with Oracle Exp and Imp tools


On Unix environments, it has been largely used scripts allowing on the fly compression of the dump file generated by exp utility.
In the same way, on the fly decompression can be achieved to perform import operations reading directly from a compressed file.

A typical script to perform on the fly compression for the data generated by exp utility is

# Make a pipe
mknod expdat.dmp p
# Start compress the pipe in background
gzip -c < expdat.dmp > expdat.dmp.gz &
# Wait start the export
sleep 5
# Start the export
exp scott/tiger file=expdat.dmp

A typical script to execute an import operation reading directly from a compressed file is 

# Make a pipe
mknod expdat.dmp p
# Start decompress to the pipe in background
gzip -c < expdat.dmp.gz > expdat.dmp > &
# Wait start the import
sleep 5
# Start the import
imp scott/tiger file=expdat.dmp

There is no way to accomplish this work in the same way on Windows platforms. When exporting you first export to a normal file and then, you can compress it (except using NTFS built-in compression capabilities, but this is not what we are looking for)
When importing, you need first decompress the file and then you can import the file.

However, whit the ZipPipe and UnZipPipe tools, you can achieve the same behaviour as you have on Unix.

How to perform on the fly compression while exporting on Windows platforms?

Until now, your bat scripts looks something similar to this

exp scott/tiger file=expdat.dmp
gzip  expdat.dmp expdat.dmp.gz

Only when exp tool finishes its job, you can start compressing the file. This way needs more disk space and in most of the cases more time.

Look how you can export and compress without any intermediate file

start /MIN ZipPipe EXPPIPE expdat.dmp.gz 9
exp scott/tiger file= \\.\pipe\EXPPIPE

The first line starts our “compressor engine” that listens on named pipe \\.\pipe\EXPPIPE and writes the compressed information to the file expdat.dmp.gz  with a compression level of 9 (compression level can be in the range 1 to 9)
When export tool completes the export operation, ZipPipe process detects it and ends

How to perform on the fly decompression while importing on Windows platforms?

Until now, your bat scripts looks something similar to this

gzip –d expdat.dmp.gz expdat.dmp
imp scott/tiger file=expdat.dmp

Only when decompressor tool finishes its job, you can start importing the file. This way needs more disk space and in most of the cases more time.

Look how you can import and decompress without any intermediate file

start /MIN UnZipPipe IMPPIPE expdat.dmp.gz
imp scott/tiger file= \\.\pipe\IMPIPE

The first line starts our “decompressor engine” that listens that reads from the compressed file expdat.dmp.gz and writes the decompressed information to named pipe \\.\pipe\IMPPIPE
When import tool completes the import operation, UnZipPipe process detects it and ends.

You can think on ZipPipe and UnZipPipe as the equivalent tool to mknod plus gzip in the Unix environment.
Of course, you can make many remarks to this solution, but it allows you to achieve the same functionality you have on Unix, saving lot of space in Disks and most of the times reducing import/export duration.

One more thing: it is a pity these tools don’t work with new expdp and impdp tools available in Oracle10g. But don’t blame to Microsoft Windows or to these tools themselves. You won’t be able to perform on the fly compression with these new tools on Unix environments too. It is due to a change in the design of these tools. (And don’t get wrong with the COMPRESS parameter of these new tools. This parameter only compresses metadata).

1.2 On the Fly compression with Microsoft bcp tool

What a terrible pity! I have been able to use these tools only to on the fly compress the output of the bcp in native format.
I can not use them for on the fly decompression when using bcp to import or even when downloading data in no native format.
Perhaps someone can make them work.

How do you use bcp to export pubs..authors table to an uncompressed file and then compress?

Typically, at the command prompt in the source SQL Server you only need to type:

            bcp  pubs..authors out  authors.txt -T –n
            gzip authors.txt authors.txt.gz

The first command exports the data and the second one compress the generated file using gzip tool

Note that during the process you need enough space to store authors.txt and authors.txt.gz simultaneously provided that at the end you can delete the uncompressed file.

How can you use bcp and ZipPipe to export pubs..authors table directly to a compressed file?

At the command prompt in the source SQL Server you only need to type:

            start /MIN ZipPipe authors_pipe authors.txt.gz 9
            bcp  pubs..authors out  \\.\pipe\authors_pipe -T -n

The first command starts our compressor tool (you can think this step is similar to create a pipe and start the background compressor in the Unix environment all in one step).
Second, you only need to start bcp tool, but giving the pipe \\.pipe\ authors_pipe  as the file name where bcp has to write.

Another  process is launched. This background process is our compressor tool that creates and listens on named pipe \\.pipe\ authors_pipe and saves the data once compressed in the file authors.txt.gz. This process automatically ends when bcp completes export operation.

And you can see how file authors.txt.gz is the only file generated in one step.

Rest of the article where you can find how to build these tools
 http://spaces.msn.com/members/jcarlossaez/Blog/cns!1phQKLZOcIUsN9Tj5QObzgdw!112.entry

This entry was posted on Friday, October 21, 2011 and is filed under , , , , . You can leave a response and follow any responses to this entry through the Subscribe to: Post Comments (Atom) .

0 comments