1. Files that use proprietary formats, such as those created by word-processing programs, spreadsheets, database programs, etc. These files contain formatting (like italics, underlines, etc.) and perhaps graphics, and other goodies beyond the simple text.
2. "Application Programs". These are programs written in (the binary) machine language that your computer understands. They are "compiled" from text files of "source code" written in a programming language. Vendors almost never make their source code available--except for free software, which you may have to compile yourself.
3. Text files that have been compressed to about half their size with one of the popular compression programs. Compression makes text files binary. Compression doesn't do much for files that are already binary unless the data they contain is very repetitive.
4. Files containing graphics like GIF, TIFF, PICT, or JPEG files.
More on this below.
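Item 3 above can be checked directly. The following is a minimal sketch, assuming Python's standard zlib module as a stand-in for the popular compression programs; the sample text is made up for the demonstration. It shows that compressed text is smaller, is no longer plain ASCII, and can be restored exactly.

```python
import zlib

# Made-up sample text; it is far more repetitive than ordinary prose,
# so it shrinks much more than the "about half" typical of real text.
text = "Transferring binary files is as easy as transferring text files.\n" * 40
raw = text.encode("ascii")

packed = zlib.compress(raw)


def is_plain_ascii(data: bytes) -> bool:
    """Return True if every byte is a 7-bit ASCII character."""
    return all(byte < 128 for byte in data)


print("original size:  ", len(raw), "bytes, plain ASCII:", is_plain_ascii(raw))
print("compressed size:", len(packed), "bytes, plain ASCII:", is_plain_ascii(packed))

# Decompressing restores the text exactly.
restored = zlib.decompress(packed)
print("restored matches original:", restored == raw)
```

Because the compressed bytes are no longer ordinary characters, the result has to be treated like any other binary file.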
Transferring binary files is as easy as transferring text files once you understand the potential problems:
1. Most FTP programs start you out in TEXT mode. This means that text files are *translated* when they go from computer to computer on their way to you. This is fatal to binary files because their bit pattern has nothing to do with the groups of eight bits that make up text.
2. Even text files have slight compatibility problems because the three "worlds"--IBM, Macintosh, and UNIX--each use a different control character (or pair of characters) to represent "return", "enter", or "newline." Translation between the different dialects is handled automatically in TEXT mode transfers. It is also the main reason why text files should not be transferred in BINARY mode.
The two control characters involved are called "linefeed" (LF) and "carriage return" (CR):
IBM PC and compatibles : CR followed by LF
Macintosh and VAX : CR
UNIX : LF
3. As mentioned above, text files are often compressed to save space. This means that you need a program to uncompress them before you read them--and that you have to transfer them in BINARY mode.
The most common compression programs and common file extensions are:
IBM PC and compatibles : PKZIP and PKUNZIP (.ZIP)
Macintosh : Stuffit and UnStuffit archives (.sit)
UNIX : compress and uncompress (.Z) and tape archive (.tar) with both together being most common (.tar.Z or .taz). Note capital "Z".
UNIX also has the gzip/gunzip command pair. gzip files usually have the extension ".gz" (older versions used ".z", with a *small* z) or ".tgz" if they are also tape archive files.
Fortunately you can usually find free software for your computer that will uncompress formats from other computer models. For current information on compression software, see the FAQ for the newsgroup comp.compression (ftp://rtfm.mit.edu/x.x.x).
4. Conversely, sometimes binary files are converted to a sort of ASCII that looks like gibberish so that they can be mailed or transferred in TEXT mode--but again you need a program that translates them back to binary. Sometimes we encounter the ultimate absurdity, a text file that is compressed then re-encoded as ASCII for mailing.
Actually this makes sense if a large number of related text files are stored in a compressed "archive".
The most common programs for this are:
uuencode/uudecode for UNIX (used for Usenet news postings of binary files and for mailing programs). The file extension is ".uue"; it is rarely encountered because there is little reason to store files in this format.
BinHex for the Macintosh (.hqx). Often combined with Stuffit (.sit.hqx). This is a common method for distributing all the files that come with a program as a single file.
uuencoded files can be recognized by the fact that nearly every line begins with a capital "M" and is exactly the same length; only the last line or two of data is shorter. The file starts with the word "begin" and ends with "end". The translating program needs these words, but nothing above or below them. Often a uuencoded file is split into several parts for transmission and must be reassembled (and stripped of mail headers, etc.) in a word processing program before it is decoded. If you do this, be sure to save the resulting file as a text file and not in the proprietary format of the word processing program!
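The look of a uuencoded file can be reproduced in a few lines. The following is a rough sketch, assuming Python's standard binascii module rather than the UNIX uuencode program itself; the helper names, the sample data, and the file name "demo.bin" are made up for illustration. Each full line carries 45 bytes of input, which is why it starts with "M" and is 61 characters long.

```python
import binascii


def uuencode(data: bytes, name: str) -> list[str]:
    """Encode data as uuencode lines, 45 bytes of input per full line."""
    lines = [f"begin 644 {name}"]
    for i in range(0, len(data), 45):
        chunk = data[i:i + 45]
        # backtick=True writes zero bits as "`" instead of a space,
        # so no line ends in blanks that a mail system might strip.
        encoded = binascii.b2a_uu(chunk, backtick=True)
        lines.append(encoded.decode("ascii").rstrip("\n"))
    lines += ["`", "end"]  # a zero-length line, then the closing word
    return lines


def looks_uuencoded(lines: list[str]) -> bool:
    """Check the visual signature: 'begin', body lines starting with 'M', 'end'."""
    if not lines or not lines[0].startswith("begin ") or lines[-1] != "end":
        return False
    full = [line for line in lines[1:-1] if line.startswith("M")]
    # A full line is 'M' plus 60 characters of encoded data.
    return bool(full) and all(len(line) == 61 for line in full)


sample = bytes(range(256))  # arbitrary binary data, made up for the demo
encoded = uuencode(sample, "demo.bin")
print("\n".join(encoded[:3]))
print("looks uuencoded:", looks_uuencoded(encoded))
```

The check is only a heuristic for recognizing the format by eye; it does not prove that the file will decode cleanly.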
What To Do With Graphics
The second topic of this chapter is graphic images. Graphics are very important for Desktop Publishers--writers of newsletters, businesses that prepare their own brochures, and small printshops. Pictures can be stored in separate files or, in some cases, embedded in other formats such as the proprietary format of Microsoft Word files. Picture files take up a large amount of space--especially big pictures at high resolution. 1 Megabyte is a typical size for a smallish picture at moderate resolution. Thus, one picture is worth about 500 pages of text!
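The arithmetic behind that comparison is easy to work out. In the sketch below, the 1 Megabyte and 500 page figures come from the text; the scan dimensions (4 x 3 inches, 300 dots per inch, one byte per dot) are assumptions chosen only to show how a picture reaches that size.

```python
MEGABYTE = 1024 * 1024  # bytes

# From the text: one 1-megabyte picture is worth about 500 pages.
pages = 500
bytes_per_page = MEGABYTE / pages
print(f"implied page size: about {bytes_per_page:.0f} characters")

# Hypothetical scan: 4 x 3 inches at 300 dots per inch, 8-bit grayscale.
width_px = 4 * 300
height_px = 3 * 300
bytes_per_pixel = 1
image_bytes = width_px * height_px * bytes_per_pixel
print(f"uncompressed scan: {image_bytes} bytes ({image_bytes / MEGABYTE:.2f} MB)")
```

In other words, the comparison assumes roughly two thousand characters on a page of text.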
The lifecycle of a typical graphic goes something like this:
STEP 1. Capturing (scanning) a photograph with an optical scanner or with a special "video" camera
The better sort of optical scanner looks like a small xerox machine.
There are also cheaper hand-held models. Flatbed scanners cost in the $1000+ range so you are not likely to have one unless you are in the business. Most likely, the casual user will get a graphic from someone else, from a collection of "clip art", or create the graphic from scratch in a drawing program.
STEP 2. Storage in a file using an interchange format
However the image is obtained, it has to be stored on disk before it can be used. There are perhaps twenty or so common formats, but those found most often on the Internet and in the Usenet newsgroups are:
GIF (Graphics Interchange Format) a rather old-fashioned but very commonly found type of graphics file. Almost any software can read this format. This is the most common format on Anonymous FTP archives.
TIFF (Tagged Image File Format) Technically more versatile than GIF and just about as common. A very good choice for exchanging files between different programs.
JPEG (Joint Photographic Experts Group) A special compressed image format that is becoming common in newer software.
EPS (Encapsulated PostScript) Not really a graphics file per se, but a set of instructions for drawing an image. The success of the PostScript page description language for laser printers has led to a new strategy for including graphics in word processing files. Many high-end word processing programs like Microsoft Word allow you to include a reference to an external PostScript file containing the figure.
Desktop publishing and high-end word processing programs can often save and import graphics in any of these formats, especially TIFF and EPS.
In addition, you may find files in proprietary formats like Macintosh PICT files. These formats serve as standards for their line of computers but not across different brands. Fortunately you can find free software that will convert TIFF to PICT or _vice versa_.
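When a file arrives without a helpful name, the first few bytes usually reveal which of these formats it is. The following is a minimal sketch; the function name is made up, and only the well-known opening signatures of GIF, TIFF, JPEG, and EPS are checked. PICT and the less common formats are not covered.

```python
def sniff_graphics_format(header: bytes) -> str:
    """Guess a graphics format from the first bytes of a file."""
    if header[:6] in (b"GIF87a", b"GIF89a"):
        return "GIF"
    # TIFF begins with a byte-order mark ("II" or "MM") and the number 42.
    if header[:4] in (b"II*\x00", b"MM\x00*"):
        return "TIFF"
    if header[:3] == b"\xff\xd8\xff":
        return "JPEG"
    # EPS is PostScript text whose first line also mentions "EPSF".
    if header.startswith(b"%!PS-Adobe") and b"EPSF" in header[:40]:
        return "EPS"
    return "unknown"


def sniff_file(path: str) -> str:
    """Read the start of a file in binary mode and guess its format."""
    with open(path, "rb") as handle:
        return sniff_graphics_format(handle.read(64))


# Made-up sample headers, just long enough to carry each signature.
for sample in (b"GIF89a\x01\x00", b"MM\x00*\x00\x00", b"\xff\xd8\xff\xe0",
               b"%!PS-Adobe-3.0 EPSF-3.0\n", b"Dear Sir,"):
    print(sample[:10], "->", sniff_graphics_format(sample))
```

Note that the file is opened in binary mode; reading it as text could translate or reject bytes before they are examined.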
STEP 3. Transmission to point of use
Suppose you have a graphics file or a word processing file containing your brochure. How do you send that file to someone?
If you work in an academic environment, it is quite possible that one or the other institution is an Anonymous FTP site. You may be able to use the Anon. FTP site as a "mailbox" to transfer the file in binary mode-- or you could exchange passwords and transfer the file directly, if both have a direct connection to the Internet.
More commonly, you will have to send the file by E-mail. Say you've just finished a brochure and you want to send it cross-country. Let's suppose that your business has two branches--one in New York and one in Los Angeles, and that both offices have Macintoshes with Microsoft Word and that you both have one of the free "Usenet software kits" for the Macintosh (not necessarily the same one). Then, you proceed as follows:
A. Using UUENCODE (or BINHEX, if you like) you convert the Microsoft Word file to a coded text file.
B. If your mail has a size limit, you may have to break up the file and send it in parts.
C. At the receiving end, reassemble the file and strip any headers and trailers added by the mail system. The file should look like
begin
M
M
M
M
end
and be saved as a TEXT file.
D. Run UUDECODE (or BINHEX) and recover the binary file.
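Steps A through D can be sketched end to end. This is an illustration under stated assumptions, not a replacement for the real UUENCODE and UUDECODE programs: it uses Python's standard binascii module, the helper names are made up, and the mail headers and trailers are invented, with a blank line assumed to separate them from the encoded lines.

```python
import binascii


def uuencode(data: bytes, name: str) -> list[str]:
    """Step A: turn binary data into lines of coded text."""
    lines = [f"begin 644 {name}"]
    for i in range(0, len(data), 45):
        coded = binascii.b2a_uu(data[i:i + 45], backtick=True)
        lines.append(coded.decode("ascii").rstrip("\n"))
    return lines + ["`", "end"]


def split_for_mail(lines: list[str], lines_per_part: int) -> list[list[str]]:
    """Step B: break the file into parts and wrap each in invented mail text."""
    parts = []
    for n, i in enumerate(range(0, len(lines), lines_per_part), start=1):
        header = [f"Subject: brochure part {n}", ""]
        trailer = ["", "Sent from the New York office"]
        parts.append(header + lines[i:i + lines_per_part] + trailer)
    return parts


def reassemble(parts: list[list[str]]) -> list[str]:
    """Step C: keep only what lies between the two blank lines of each part."""
    body = []
    for part in parts:
        after_header = part[part.index("") + 1:]
        body.extend(after_header[:after_header.index("")])
    return body


def uudecode(lines: list[str]) -> bytes:
    """Step D: decode everything between 'begin' and 'end'."""
    start = next(i for i, line in enumerate(lines) if line.startswith("begin "))
    end = lines.index("end", start)
    return b"".join(binascii.a2b_uu(line.encode("ascii"))
                    for line in lines[start + 1:end])


original = bytes(range(256)) * 4  # stand-in for the word processing file
parts = split_for_mail(uuencode(original, "brochure.doc"), lines_per_part=10)
recovered = uudecode(reassemble(parts))
print("parts sent:", len(parts))
print("file recovered intact:", recovered == original)
```

In practice step C is usually done by hand in an editor, since real mail headers and signatures do not follow one fixed layout.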