by H. Paul Robertson

10 October 2011

Prerequisite knowledge

General experience of building applications with Flex Builder is suggested. For more details on getting started with this Quick Start, refer to Building the Quick Start sample applications with Flex.

The File Compression Tool example application is intentionally simple (see Figure 1). It demonstrates the following Adobe AIR features:
  • Compressing a file using the DEFLATE compression algorithm
  • Decompressing a file that was compressed with the DEFLATE compression algorithm
Figure 1. This sample application enables you to compress and decompress files in the GZIP file format.
Note: This is a sample application provided as is for instructional purposes.
This sample application includes the following files, located in the folder FileCompressionTool:
  • FileCompressionTool.mxml: The main application file in MXML for Flex; includes the code discussed in this article
  • CompressScreen.mxml: An MXML file that defines the layout and functionality of the application's "compress a file" screen
  • DecompressScreen.mxml: An MXML file that defines the layout and functionality of the "decompress a file" screen
  • FileCompressionTool-app.xml: The AIR application descriptor file
  • Sample AIR icon files
The application also uses the following classes from the open-source ActionScript GZIP compression library written by H. Paul Robertson:
  • GZIPEncoder: An instance can compress a file into a GZIP-format file, or decompress a GZIP-format file into a ByteArray or File object.
  • GZIPFile: An instance represents a single GZIP-format file, with properties for the various metadata as well as the actual source data in the compressed file.
  • CRC32Generator: An instance generates a CRC-32 value. The GZIP format uses the CRC-32 check to verify that the uncompressed data extracted from a GZIP file has not been modified or damaged compared with its original, pre-compression content.
Important: The application descriptor file, FileCompressionTool-app.xml, uses the AIR 3.0 namespace. To modify and repackage this application, you need to download and use the AIR 3 SDK along with Flash Builder 4.5.

Testing the application

Download and run the application installer (FileCompressionTool.air) to install the application. Select a file from your system's disk and compress it using the GZIP file format. Then select a GZIP-compressed file from the disk, decompress it, and save the contents locally.
Note: This application uses the gzip compressed file format, described in IETF RFC 1952.

Understanding the code

Note: This article does not describe all of the Flex components used in the MXML code for the file. For more information, see the ActionScript 3 Reference for the Flash Platform.
Understanding the gzip compression format
When you're working with compressed-data files such as ZIP or similar files (including GZIP files used in this application), the technique of compressing file data and storing it in a compressed file involves two parts.
First, the data itself must be compressed using a compression algorithm—a set of rules that define a way to decrease the number of bytes needed to represent a file or item of data. (This is most commonly done by searching for and removing redundant patterns in the data.) The DEFLATE algorithm is one such algorithm for compressing computer data. In AIR, you can compress data in a ByteArray instance using the DEFLATE algorithm by calling the ByteArray instance's compress() method, passing the constant CompressionAlgorithm.DEFLATE for the method's algorithm parameter, as shown here:
    import flash.utils.ByteArray;
    import flash.utils.CompressionAlgorithm;

    var dataToCompress:ByteArray = new ByteArray();
    // ... add data to the ByteArray using writeBytes() or other methods ...

    // Compress the data
    dataToCompress.compress(CompressionAlgorithm.DEFLATE);
    // The ByteArray now contains a compressed version of its previous data
Although the data is now compressed, only half the process of creating a compressed-format file is complete. In addition to containing raw compressed data, each compression format (such as ZIP, GZIP, and so on) stores extra information about the compressed data within the compressed data file. What makes each compressed file format unique is the definition of the specific way that a file is structured, including what extra information is stored along with the compressed data and how the bundle of data and extra information is organized. For example, the GZIP-compressed format specifies that GZIP-formatted data (for example, a GZIP file) contains certain identifiers indicating the compression format and compression algorithm used, information about the date and time that the compression took place, the operating system on which the compression took place, and so forth, in addition to containing the actual compressed data.
If you are writing code to create or parse files that are structured using a particular compressed file format, you need to understand the distinction between the actual compressed data (which the runtime can create or extract for you using ByteArray.compress() and ByteArray.uncompress()) and the structure of the compressed data, which your code will need to create or read. For examples of how this is done in the File Compression Tool sample application, see the GZIPEncoder.compressToFile() method for creating a file in gzip compression format, and the GZIPEncoder.parseGZIPFile() method for parsing a gzip compression format file into its parts.
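To make the distinction between compressed data and file structure concrete, the following is a minimal sketch of wrapping DEFLATE output in the gzip layout defined by RFC 1952. It is not the code from GZIPEncoder: the function names buildGzip() and crc32() are hypothetical, the header omits all optional fields (such as the original file name), and the timestamp and operating system are written as "unknown".

```actionscript
import flash.utils.ByteArray;
import flash.utils.CompressionAlgorithm;
import flash.utils.Endian;

// Hypothetical helper: bit-by-bit CRC-32 (reflected polynomial 0xEDB88320),
// the checksum that RFC 1952 specifies for the gzip trailer.
function crc32(bytes:ByteArray):uint
{
    var crc:uint = 0xFFFFFFFF;
    for (var i:int = 0; i < bytes.length; i++)
    {
        crc ^= bytes[i];
        for (var bit:int = 0; bit < 8; bit++)
        {
            crc = ((crc & 1) != 0) ? ((crc >>> 1) ^ 0xEDB88320) : (crc >>> 1);
        }
    }
    return uint(crc ^ 0xFFFFFFFF);
}

// Hypothetical helper: returns gzip-formatted bytes for the given source data.
function buildGzip(source:ByteArray):ByteArray
{
    // The trailer values describe the uncompressed data, so compute them first
    var checksum:uint = crc32(source);
    var originalSize:uint = source.length;

    // Work on a copy, because compress() replaces the contents in place
    var deflated:ByteArray = new ByteArray();
    deflated.writeBytes(source, 0, source.length);
    deflated.compress(CompressionAlgorithm.DEFLATE);

    var gzip:ByteArray = new ByteArray();
    gzip.endian = Endian.LITTLE_ENDIAN; // gzip stores multi-byte values low byte first

    // Fixed 10-byte header
    gzip.writeByte(0x1f);     // ID1: first identification byte
    gzip.writeByte(0x8b);     // ID2: second identification byte
    gzip.writeByte(8);        // CM: compression method 8 = DEFLATE
    gzip.writeByte(0);        // FLG: no optional fields
    gzip.writeUnsignedInt(0); // MTIME: 0 = no timestamp available
    gzip.writeByte(0);        // XFL: no extra flags
    gzip.writeByte(255);      // OS: 255 = unknown

    // Raw DEFLATE data
    gzip.writeBytes(deflated, 0, deflated.length);

    // 8-byte trailer
    gzip.writeUnsignedInt(checksum);     // CRC32 of the uncompressed data
    gzip.writeUnsignedInt(originalSize); // ISIZE: uncompressed size modulo 2^32
    return gzip;
}
```

Only the compress() call is handled by the runtime. Every other byte in the result is structure that the application code has to write itself.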
Compressing a file using the DEFLATE algorithm
You use the ByteArray.compress() method to compress the data contained in a ByteArray instance. By default, the compress() method compresses data using the zlib compression format. Like zip and gzip, zlib is a compression format that includes compressed data as well as extra information about the data. If you want to use a different compression format, such as ZIP or GZIP, you can specify that the ByteArray should only be compressed using the DEFLATE algorithm without the addition of any extra information by passing the CompressionAlgorithm.DEFLATE constant as an argument to the compress() method. This is what the GZIPEncoder class does to compress a file in gzip compression format, in its compressToFile() method:
    public function compressToFile(src:Object, output:File):void
    {
        // ... perform error checking ...

        // This ByteArray will contain the data to compress
        var srcBytes:ByteArray;
        var target:File = new File(output.nativePath);

        // ... populate srcBytes with data, depending on whether src is
        // a File instance or a ByteArray instance ...

        // Open the FileStream instance for creating the resulting file
        var outStream:FileStream = new FileStream();
        outStream.open(target, FileMode.WRITE);

        // ... calculate the extra header and footer information for
        // the gzip format and append it to the output FileStream ...

        // Add the actual compressed data
        srcBytes.compress(CompressionAlgorithm.DEFLATE);
        outStream.writeBytes(srcBytes, 0, srcBytes.length);

        // ... append the footer information to the output FileStream ...

        outStream.close();
    }
Although the compressToFile() method includes some additional complexity for dealing with creating the output file and writing the extra information that's part of the GZIP compression format, the core function of compressing the data and adding it to the GZIP file is performed by these two lines:
    srcBytes.compress(CompressionAlgorithm.DEFLATE);
    outStream.writeBytes(srcBytes, 0, srcBytes.length);
The ByteArray instance named srcBytes contains the data to be compressed (loaded from the user-selected source file). The compress() method is called with the CompressionAlgorithm.DEFLATE argument, so all the bytes in srcBytes are compressed using the DEFLATE compression algorithm. At that point srcBytes contains the compressed version of the data (the original content is replaced). That data is then written to the destination file (the gzip file that's being created) in the appropriate part of the file as specified in the gzip compression format. The entire srcBytes ByteArray is written to the output file stream using the outStream variable's writeBytes() method. The three arguments indicate that the data to write comes from srcBytes, starting at offset 0 in srcBytes, and that srcBytes.length bytes (that is, the whole array) are written.
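To see the difference between the default zlib format and raw DEFLATE output described earlier in this section, you can compress the same bytes both ways and compare the results. This is a small illustrative sketch, not code from the sample application, and the sample string is arbitrary. According to the zlib specification (RFC 1950), the zlib wrapper adds a 2-byte header and a 4-byte Adler-32 checksum around the DEFLATE data, so the zlib result should be slightly longer.

```actionscript
import flash.utils.ByteArray;
import flash.utils.CompressionAlgorithm;

// Arbitrary sample data; any ByteArray content works
var original:ByteArray = new ByteArray();
original.writeUTFBytes("aaaaaaaaaabbbbbbbbbbaaaaaaaaaabbbbbbbbbb");

// Default behavior: zlib format (DEFLATE data plus zlib header and checksum)
var zlibBytes:ByteArray = new ByteArray();
zlibBytes.writeBytes(original);
zlibBytes.compress();

// Raw DEFLATE data only, suitable for embedding in a ZIP or GZIP structure
var deflateBytes:ByteArray = new ByteArray();
deflateBytes.writeBytes(original);
deflateBytes.compress(CompressionAlgorithm.DEFLATE);

trace("zlib length:", zlibBytes.length);
trace("deflate length:", deflateBytes.length);
```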
Decompressing a file using the DEFLATE algorithm
When a user selects a gzip format file to decompress, that file is passed to the GZIPEncoder.uncompressToFile() method, which actually decompresses the file, extracts the original source data, and writes it to a file in the specified location. Like the compressToFile() method, the uncompressToFile() method performs several steps, most of which involve error checking, separating the extra information in the gzip file from the actual compressed data, and determining the file name of the file to which the newly uncompressed data is saved:
    public function uncompressToFile(src:File, output:File):void
    {
        // ... error checking ...

        // call the parseGZIPFile method, which extracts the gzip file into a
        // GZIPFile instance with properties representing the parts of the gzip file
        var gzipData:GZIPFile = parseGZIPFile(src);

        var outFile:File = new File(output.nativePath);

        // ... determine the destination file path, depending on whether
        // output is a directory, and whether the source gzip file contained
        // file name information ...

        // get the actual compressed bytes from the gzip file
        var data:ByteArray = gzipData.getCompressedData();

        // Perform the uncompress operation
        try
        {
            data.uncompress(CompressionAlgorithm.DEFLATE);
        }
        catch (error:Error)
        {
            throw new IllegalOperationError("The specified file is not a GZIP file.");
        }

        // ... write the uncompressed data to the destination file ...
    }
The step of converting the compressed data to uncompressed data takes place in this line:
    data.uncompress(CompressionAlgorithm.DEFLATE);
When the uncompress() method is called, the compressed content of the data ByteArray instance is uncompressed and the data variable then contains the uncompressed data, ready to be written to the destination file.
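As an illustration of what separating the extra information from the compressed data involves, here is a minimal sketch that reads a gzip-formatted ByteArray according to the layout in RFC 1952. It is not the sample's parseGZIPFile() implementation. The function name extractGzipData() is hypothetical, the optional header fields are skipped rather than stored, and the CRC-32 value is read but not verified to keep the example short.

```actionscript
import flash.errors.IllegalOperationError;
import flash.utils.ByteArray;
import flash.utils.CompressionAlgorithm;
import flash.utils.Endian;

// Hypothetical helper: returns the uncompressed content of gzip-formatted bytes.
function extractGzipData(gzip:ByteArray):ByteArray
{
    gzip.endian = Endian.LITTLE_ENDIAN; // gzip stores multi-byte values low byte first
    gzip.position = 0;

    // Smallest possible file: 10-byte header plus 8-byte trailer
    if (gzip.length < 18
        || gzip.readUnsignedByte() != 0x1f
        || gzip.readUnsignedByte() != 0x8b)
    {
        throw new IllegalOperationError("The data is not in GZIP format.");
    }
    if (gzip.readUnsignedByte() != 8) // CM: 8 = DEFLATE
    {
        throw new IllegalOperationError("Unsupported compression method.");
    }
    var flags:uint = gzip.readUnsignedByte(); // FLG
    gzip.position += 6;                       // skip MTIME (4 bytes), XFL, and OS

    // Optional fields, in the order the format defines them
    if ((flags & 0x04) != 0) // FEXTRA: 2-byte length, then that many bytes
    {
        var extraLength:uint = gzip.readUnsignedShort();
        gzip.position += extraLength;
    }
    if ((flags & 0x08) != 0) // FNAME: zero-terminated original file name
    {
        while (gzip.readUnsignedByte() != 0) {}
    }
    if ((flags & 0x10) != 0) // FCOMMENT: zero-terminated comment
    {
        while (gzip.readUnsignedByte() != 0) {}
    }
    if ((flags & 0x02) != 0) // FHCRC: 2-byte header checksum
    {
        gzip.position += 2;
    }

    // Everything between the header and the 8-byte trailer is DEFLATE data
    var compressedLength:int = gzip.length - gzip.position - 8;
    if (compressedLength <= 0)
    {
        throw new IllegalOperationError("The GZIP data is truncated.");
    }
    var data:ByteArray = new ByteArray();
    gzip.readBytes(data, 0, compressedLength);

    // Trailer: CRC-32 and size of the original, uncompressed data
    var expectedCrc:uint = gzip.readUnsignedInt();  // not verified in this sketch
    var expectedSize:uint = gzip.readUnsignedInt(); // ISIZE, modulo 2^32

    data.uncompress(CompressionAlgorithm.DEFLATE);
    if (data.length != expectedSize)
    {
        throw new IllegalOperationError("The uncompressed size does not match the GZIP trailer.");
    }
    return data;
}
```

A complete implementation would also compute the CRC-32 of the uncompressed bytes and compare it with the stored value, which is the role the CRC32Generator class plays in the sample application.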