File Format Encyclopedia: 2008

Icon Library .ICL File Format

An .ICL file -- ICon Library, as used by icon editors like Microangelo -- is a renamed 16-bit Windows .DLL (an NE format executable) which typically contains nothing but a resource section.
The ICL extension seems to be used by convention.

.WMF Metafile Format

A metafile for the Microsoft Windows operating system consists of a collection of graphics device interface (GDI) functions that describe an image. Because metafiles take up less space and are more device-independent than bitmaps, they provide convenient storage for images that appear repeatedly in an application or need to be moved from one application to another. To generate a metafile, a Windows application creates a special device context that sends GDI commands to a file or memory for storage. The application can later play back the metafile and display the image. During playback, Windows breaks the metafile down into records and identifies each object with an index to a handle table. When a META_DELETEOBJECT record is encountered during playback, the associated object is deleted from the handle table. The entry is then reused by the next object that the metafile creates. To ensure compatibility, an application that explicitly manipulates records or builds its own metafile should manage the handle table in the same way. For more information on the format of the handle table, see the HANDLETABLE structure.
In some cases, there are two variants of a metafile record, one representing the record created by Windows versions before 3.0 and the second representing the record created by Windows versions 3.0 and later. Windows versions 3.0 and later play all metafile versions but store only 3.0 and later versions. Windows versions earlier than 3.0 do not play metafiles recorded by Windows versions 3.0 and later. A metafile consists of two parts: a header and a list of records. The header and records are described in the remainder of this topic. For a list of function-specific records, see Metafile Records.
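
The handle-table behaviour described above can be sketched as follows. This is a minimal illustration, not the Windows implementation: the names HandleTable, table_add and table_delete are hypothetical, the fixed MAX_OBJECTS cap stands in for the mtNoObjects value of a real metafile, and the sketch assumes that a new object takes the lowest free entry.

```c
#include <stddef.h>

#define MAX_OBJECTS 64   /* hypothetical cap; a real table has mtNoObjects entries */

typedef struct {
    void  *slot[MAX_OBJECTS];  /* NULL marks a free entry */
    size_t count;              /* number of usable entries (<= MAX_OBJECTS) */
} HandleTable;

/* Stores obj in the first free entry and returns its index, or -1 if full. */
int table_add(HandleTable *t, void *obj)
{
    for (size_t i = 0; i < t->count; i++) {
        if (t->slot[i] == NULL) {
            t->slot[i] = obj;
            return (int)i;
        }
    }
    return -1;
}

/* Frees entry `index`, as a META_DELETEOBJECT record would, so that the
   next created object reuses it. */
void table_delete(HandleTable *t, size_t index)
{
    if (index < t->count)
        t->slot[index] = NULL;
}
```

An application that builds its own metafile would call table_add for every object-creating record and use the returned index as the parameter of the matching SelectObject record.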

Metafile Header
The metafile header contains a description of the size of the metafile and the number of drawing objects it uses. The drawing objects can be pens, brushes, bitmaps, or fonts.
The metafile header has the following form:


typedef struct tagMETAHEADER {
    WORD  mtType;
    WORD  mtHeaderSize;
    WORD  mtVersion;
    DWORD mtSize;
    WORD  mtNoObjects;
    DWORD mtMaxRecord;
    WORD  mtNoParameters;
} METAHEADER;

Following are the members in the metafile header:

mtType	Specifies whether the metafile is stored in memory or recorded in a file. This member has one of the following values: 0 : Metafile is in memory. 1 : Metafile is in a file.
mtHeaderSize	Specifies the size, in words, of the metafile header.
mtVersion	Specifies the Windows version number. The version number for Windows version 3.0 and later is 0x300.
mtSize	Specifies the size, in words, of the file.
mtNoObjects	Specifies the maximum number of objects that can exist in the metafile at the same time.
mtMaxRecord	Specifies the size, in words, of the largest record in the metafile.
mtNoParameters	Not used.
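
As a rough sketch, the 18-byte header can be decoded from a raw byte buffer as shown below. The MetaHeader type and the parse_metaheader, rd16 and rd32 helpers are illustrative names, not part of the Windows API, and the code assumes the little-endian byte order used on x86 Windows.

```c
#include <stdint.h>
#include <stddef.h>

/* The METAHEADER fields, decoded into native integers. */
typedef struct {
    uint16_t type, header_size, version;
    uint32_t size;          /* total size of the metafile, in 16-bit words */
    uint16_t no_objects;
    uint32_t max_record;    /* size of the largest record, in words */
    uint16_t no_parameters;
} MetaHeader;

static uint16_t rd16(const uint8_t *p) { return (uint16_t)(p[0] | (p[1] << 8)); }
static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)rd16(p) | ((uint32_t)rd16(p + 2) << 16);
}

/* Returns 0 on success, -1 if the buffer is shorter than 18 bytes or the
   header size field is not 9 words. */
int parse_metaheader(const uint8_t *buf, size_t len, MetaHeader *out)
{
    if (len < 18)
        return -1;
    out->type          = rd16(buf);
    out->header_size   = rd16(buf + 2);
    out->version       = rd16(buf + 4);
    out->size          = rd32(buf + 6);
    out->no_objects    = rd16(buf + 10);
    out->max_record    = rd32(buf + 12);
    out->no_parameters = rd16(buf + 16);
    return out->header_size == 9 ? 0 : -1;
}
```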

Typical Metafile Record
The graphics device interface stores most of the GDI functions that an application can use to create metafiles in typical records.
A typical metafile record has the following form:


struct {
    DWORD rdSize;
    WORD  rdFunction;
    WORD  rdParm[];
}

Following are the members in a typical metafile record:

rdSize	Specifies the size, in words, of the record.
rdFunction	Specifies the function number. This value may be the number of any function in the table at the end of this section.
rdParm	Identifies an array of words containing the function parameters (listed in the reverse order in which they are passed to the function).

Following are the GDI functions found in typical records, along with their hexadecimal values:

GDI function	Value
Arc	0x0817
Chord	0x0830
Ellipse	0x0418
ExcludeClipRect	0x0415
FloodFill	0x0419
IntersectClipRect	0x0416
LineTo	0x0213
MoveTo	0x0214
OffsetClipRgn	0x0220
OffsetViewportOrg	0x0211
OffsetWindowOrg	0x020F
PatBlt	0x061D
Pie	0x081A
RealizePalette (3.0 and later)	0x0035
Rectangle	0x041B
ResizePalette (3.0 and later)	0x0139
RestoreDC	0x0127
RoundRect	0x061C
SaveDC	0x001E
ScaleViewportExt	0x0412
ScaleWindowExt	0x0410
SetBkColor	0x0201
SetBkMode	0x0102
SetMapMode	0x0103
SetMapperFlags	0x0231
SetPixel	0x041F
SetPolyFillMode	0x0106
SetROP2	0x0104
SetStretchBltMode	0x0107
SetTextAlign	0x012E
SetTextCharacterExtra	0x0108
SetTextColor	0x0209
SetTextJustification	0x020A
SetViewportExt	0x020E
SetViewportOrg	0x020D
SetWindowExt	0x020C
SetWindowOrg	0x020B
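
Because every record starts with its own size in words, a reader can step through the record list without understanding each function. The sketch below counts the records that carry a given function number. It assumes the caller has already decoded the file into native 16-bit words and skipped the header; count_function is a hypothetical helper name.

```c
#include <stdint.h>
#include <stddef.h>

/* Walks the record list in `w` (nwords 16-bit words, starting at the first
   record) and counts the records whose rdFunction equals `fn`.
   Returns the count, or -1 if a record size is invalid or the list does not
   end exactly on a record boundary. */
long count_function(const uint16_t *w, size_t nwords, uint16_t fn)
{
    size_t pos = 0;
    long hits = 0;

    while (pos + 3 <= nwords) {
        /* rdSize is a DWORD stored low word first; it includes the three
           words occupied by rdSize and rdFunction themselves. */
        uint32_t rd_size = (uint32_t)w[pos] | ((uint32_t)w[pos + 1] << 16);
        if (rd_size < 3 || rd_size > nwords - pos)
            return -1;
        if (w[pos + 2] == fn)
            hits++;
        pos += rd_size;
    }
    return pos == nwords ? hits : -1;
}
```

For example, passing 0x041B counts the Rectangle records in a metafile.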

Placeable Windows Metafiles
A placeable Windows metafile is a standard Windows metafile that has an additional 22-byte header. The header contains information about the aspect ratio and original size of the metafile, permitting applications to display the metafile in its intended form.
The header for a placeable Windows metafile has the following form:


typedef struct {
    DWORD   key;
    HANDLE  hmf;
    RECT    bbox;
    WORD    inch;
    DWORD   reserved;
    WORD    checksum;
} METAFILEHEADER;

Following are the members of a placeable metafile header:

key	Specifies the binary key that uniquely identifies this file type. This member must be set to 0x9AC6CDD7L.
hmf	Unused; must be zero.
bbox	Specifies the coordinates of the smallest rectangle that encloses the picture. The coordinates are in metafile units as defined by the inch member.
inch	Specifies the number of metafile units to the inch. To avoid numeric overflow, this value should be less than 1440. Most applications use 576 or 1000.
reserved	Unused; must be zero.
checksum	Specifies the checksum. It is the sum (using the XOR operator) of the first 10 words of the header.

The actual content of the Windows metafile immediately follows the header. The format for this content is identical to that for standard Windows metafiles. For some applications, a placeable Windows metafile must not exceed 64K.

Note: Placeable Windows metafiles are not compatible with the GetMetaFile function. Applications that intend to use the metafile functions to read and play placeable Windows metafiles must read the file by using an input function (such as _lread), strip the 22-byte header, and create a standard Windows metafile by using the remaining bytes and the SetMetaFileBits function.
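
The key test, the XOR checksum and the header-stripping step can be sketched as below. The function names are hypothetical, the buffer is assumed to hold the whole file in little-endian byte order, and the code only locates the standard metafile; handing the remaining bytes to SetMetaFileBits is left to the application.

```c
#include <stdint.h>
#include <stddef.h>

#define PLACEABLE_KEY          0x9AC6CDD7u
#define PLACEABLE_HEADER_BYTES 22

/* XOR of the first 10 words (bytes 0..19) of the placeable header. */
uint16_t placeable_checksum(const uint8_t *hdr)
{
    uint16_t sum = 0;
    for (int i = 0; i < 10; i++)
        sum ^= (uint16_t)(hdr[2 * i] | (hdr[2 * i + 1] << 8));
    return sum;
}

/* Returns a pointer to the standard metafile that follows the placeable
   header, or NULL if the buffer is too short or the key or checksum does
   not match. */
const uint8_t *skip_placeable_header(const uint8_t *buf, size_t len)
{
    if (len < PLACEABLE_HEADER_BYTES)
        return NULL;

    uint32_t key = (uint32_t)buf[0] | (uint32_t)buf[1] << 8 |
                   (uint32_t)buf[2] << 16 | (uint32_t)buf[3] << 24;
    uint16_t stored = (uint16_t)(buf[20] | (buf[21] << 8));

    if (key != PLACEABLE_KEY || stored != placeable_checksum(buf))
        return NULL;
    return buf + PLACEABLE_HEADER_BYTES;
}
```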

Guidelines for Windows Metafiles
To ensure that metafiles can be transported between different computers and applications, any application that creates a metafile should make sure the metafile is device-independent and sizable.
The following guidelines ensure that every metafile can be accepted and manipulated by other applications:

Set a mapping mode as one of the first records. Many applications, including OLE applications, only accept metafiles that are in MM_ANISOTROPIC mode.
Call the SetWindowOrg and SetWindowExt functions. Do not call the SetViewportExt or SetViewportOrg functions if the user will be able to resize or change the dimensions of the object.
Use the MFCOMMENT printer escape to add comments to the metafile.
Rely primarily on the functions listed in Typical Metafile Record. Observe the following limitations on the functions you use:
- Do not use functions that retrieve data (for example, GetActiveWindow or EnumFontFamilies).
- Do not use any of the region functions (because they are device dependent).
- Use StretchBlt or StretchDIBits instead of BitBlt.

Sample of Metafile Program Output
This section describes a sample program and the metafile that it creates. The sample program creates a small metafile that draws a purple rectangle with a green border and writes the words "Hello People" in the rectangle.


void MakeAMetaFile(HDC hDC)
{
    HPEN     hMetaGreenPen;
    HBRUSH   hMetaVioletBrush;
    HDC      hDCMeta;
    HANDLE   hMeta;

    /* Create the metafile with output going to the disk. */

    hDCMeta = CreateMetaFile( (LPSTR) "sample.met");

    hMetaGreenPen = CreatePen(0, 0, (DWORD) 0x0000FF00);
    SelectObject(hDCMeta, hMetaGreenPen);

    hMetaVioletBrush = CreateSolidBrush((DWORD) 0x00FF00FF);
    SelectObject(hDCMeta, hMetaVioletBrush);

    Rectangle(hDCMeta, 0, 0, 150, 70);

    TextOut(hDCMeta, 10, 10, (LPSTR) "Hello People", 12);


    /* We are done with the metafile. */

    hMeta = CloseMetaFile(hDCMeta);

    /* Play the metafile that we just created. */

    PlayMetaFile(hDC, hMeta);
}

The resulting metafile, SAMPLE.MET, consists of a metafile header and six records. It has the following binary form:


0001         mtType... disk metafile
0009         mtHeaderSize
0300         mtVersion
0000 0036    mtSize
0002         mtNoObjects
0000 000C    mtMaxRecord
0000         mtNoParameters

0000 0008    rdSize
02FA         rdFunction (CreatePenIndirect function)
0000 0000 0000 0000 FF00  rdParm (LOGPEN structure defining pen)

0000 0004    rdSize
012D         rdFunction (SelectObject)
0000         rdParm (index to object #0... the above pen)

0000 0007    rdSize
02FC         rdFunction (CreateBrushIndirect)
0000 00FF 00FF 0000 rdParm (LOGBRUSH structure defining the brush)

0000 0004    rdSize
012D         rdFunction (SelectObject)
0001         rdParm (index to object #1... the brush)

0000 0007    rdSize
041B         rdFunction (Rectangle)
0046 0096 0000 0000 rdParm (parameters sent to Rectangle...
                    in reverse order)

0000 000C    rdSize
0521         rdFunction (TextOut)
             rdParm
000C             count
                 string
48 65 6C 6C 6F 20 50 65 6F 70 6C 65   "Hello People"
000A             y-value
000A             x-value

ZSoft PCX File Format

Image files used by PC Paintbrush product family and FRIEZE (those with a
.PCX extension) begin with a 128 byte header. Usually you can ignore this
header, since your images will probably all have the same resolution. If
you want to process different resolutions or colors, you will need to
interpret the header correctly. The remainder of the image file consists
of encoded graphic data. The encoding method is a simple byte oriented
run-length technique. We reserve the right to change this method to
improve space efficiency. When more than one color plane is stored in
the file, each line of the image is stored by color plane (generally ordered
red, green, blue, intensity), as shown below.


Scan line 0:         RRR...        (Plane 0)
                     GGG...        (Plane 1)
                     BBB...        (Plane 2)
                     III...        (Plane 3)
Scan line 1:         RRR...
                     GGG...
                     BBB...
                     III...        (etc.)

The encoding method is:


    FOR  each  byte,  X,  read from the file
        IF the top two bits of X are  1's then
            count = 6 lowest bits of X
            data = next byte following X
        ELSE
            count = 1
            data = X

Since the overhead this technique requires is, on average, 25% of
the non-repeating data and is at least offset whenever bytes are repeated,
the file storage savings are usually considerable.

ZSoft .PCX FILE HEADER FORMAT

Byte	Item	Size	Description/Comments
0	Manufacturer	1	Constant Flag, 10 = ZSoft .pcx
1	Version	1	Version information: 0 = Version 2.5 of PC Paintbrush; 2 = Version 2.8 w/palette information; 3 = Version 2.8 w/o palette information; 4 = PC Paintbrush for Windows (Plus for Windows uses Ver 5); 5 = Version 3.0 and > of PC Paintbrush and PC Paintbrush +, includes Publisher's Paintbrush. Includes 24-bit .PCX files
2	Encoding	1	1 = .PCX run length encoding
3	BitsPerPixel	1	Number of bits to represent a pixel (per Plane) : 1, 2, 4, or 8
4	Window	8	Image Dimensions: Xmin,Ymin,Xmax,Ymax
12	HDpi	2	Horizontal Resolution of image in DPI*
14	VDpi	2	Vertical Resolution of image in DPI*
16	Colormap	48	Color palette setting, see text
64	Reserved	1	Should be set to 0.
65	NPlanes	1	Number of color planes
66	BytesPerLine	2	Number of bytes to allocate for a scanline plane. MUST be an EVEN number. Do NOT calculate from Xmax-Xmin.
68	PaletteInfo	2	How to interpret palette- 1 = Color/BW, 2 = Grayscale (ignored in PB IV/ IV +)
70	HscreenSize	2	Horizontal screen size in pixels. New field found only in PB IV/IV Plus
72	VscreenSize	2	Vertical screen size in pixels. New field found only in PB IV/IV Plus
74	Filler	54	Blank to fill out 128 byte header. Set all bytes to 0

NOTES:

All sizes are measured in BYTES.
All variables of SIZE 2 are integers.
*) HDpi and VDpi represent the horizontal and vertical resolutions at which the image was created (either printer or scanner); i.e. an image which was scanned might have 300 and 300 in each of these fields.

Decoding .PCX Files

First, find the pixel dimensions of the image by calculating
[XSIZE = Xmax - Xmin + 1] and [YSIZE = Ymax - Ymin + 1]. Then calculate how many bytes are required to hold one complete uncompressed scan line: TotalBytes = NPlanes * BytesPerLine
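
A minimal sketch of these calculations, reading the fields at the header offsets listed in the table above. PcxGeometry and pcx_geometry are illustrative names, and the 2-byte integers are assumed to be stored low byte first.

```c
#include <stdint.h>

typedef struct {
    int  xsize, ysize;   /* pixel dimensions of the image */
    long total_bytes;    /* bytes in one uncompressed scan line, all planes */
} PcxGeometry;

static uint16_t rd16(const uint8_t *p) { return (uint16_t)(p[0] | (p[1] << 8)); }

/* hdr must point at the 128-byte PCX header. */
PcxGeometry pcx_geometry(const uint8_t hdr[128])
{
    PcxGeometry g;
    int xmin = rd16(hdr + 4), ymin = rd16(hdr + 6);
    int xmax = rd16(hdr + 8), ymax = rd16(hdr + 10);

    g.xsize = xmax - xmin + 1;
    g.ysize = ymax - ymin + 1;
    g.total_bytes = (long)hdr[65] * rd16(hdr + 66);  /* NPlanes * BytesPerLine */
    return g;
}
```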

Note that since there are always an even number of bytes per scan line, there will probably be unused data at the end of each scan line. TotalBytes shows how much storage must be available to decode each scan line, including any blank area on the right side of the image. You can now begin decoding the first scan line - read the first byte of data from the file. If the top two bits are set, the remaining six bits in the byte show how many times to duplicate the next byte in the file. If the top two bits are not set, the first byte is the data itself, with a count of one.

Continue decoding the rest of the line. Keep a running subtotal of how many bytes are moved and duplicated into the output buffer. When the subtotal equals TotalBytes, the scan line is complete. There should always be a decoding break at the end of each scan line. But there will not be a decoding break at the end of each plane within each scan line. When the scan line is completed, there may be extra blank data at the end of each plane within the scan line. Use the XSIZE and YSIZE values to find where the valid image data is. If the data is multi-plane, BytesPerLine shows where each plane ends within the scan line.

Continue decoding the remainder of the scan lines (do not just read to end-of-file). There may be additional data after the end of the image (palette, etc.)
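
The decoding loop described in this section might look like the sketch below, which decodes a single scan line. The name pcx_decode_line and its error convention are assumptions made for the example; a run that would cross the end of the scan line is treated as an error, in line with the statement that there is always a decoding break at that point.

```c
#include <stdint.h>
#include <stddef.h>

/* Decodes one scan line of `total_bytes` bytes from src into dst.
   Returns the number of source bytes consumed, or 0 if the input is
   truncated or a run would cross the end of the scan line. */
size_t pcx_decode_line(const uint8_t *src, size_t src_len,
                       uint8_t *dst, size_t total_bytes)
{
    size_t in = 0, out = 0;

    while (out < total_bytes) {
        if (in >= src_len)
            return 0;
        uint8_t x = src[in++];
        size_t  count = 1;
        uint8_t data = x;

        if ((x & 0xC0) == 0xC0) {       /* top two bits set: run follows */
            count = x & 0x3F;           /* six lowest bits are the count */
            if (in >= src_len)
                return 0;
            data = src[in++];
        }
        if (count > total_bytes - out)
            return 0;
        for (size_t i = 0; i < count; i++)
            dst[out++] = data;
    }
    return in;
}
```

Calling the function YSIZE times, advancing the source pointer by the returned count each time, decodes the image without reading into any palette data that follows it.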

Palette Information Description

EGA/VGA 16 Color Palette Information
In standard RGB format (IBM EGA, IBM VGA) the data is stored as 16 triples.
Each triple is a 3 byte quantity of Red, Green, Blue values. The values can
range from 0-255, so some interpretation may be necessary. On an IBM EGA,
for example, there are 4 possible levels of RGB for each color. Since
256/4 = 64, the following is a list of the settings and levels:


Setting                Level
   0-63                0
 64-127                1
128-191                2
192-255                3
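
Since each level covers 256/4 = 64 settings, the mapping reduces to an integer division by 64. A one-line sketch (ega_level is a hypothetical name):

```c
#include <stdint.h>

/* Maps a 0-255 palette setting to one of the four EGA levels (0-3);
   shifting right by 6 is the same as dividing by 64. */
unsigned ega_level(uint8_t setting)
{
    return setting >> 6;
}
```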

VGA 256 Color Palette Information
ZSoft has recently added the capability to store palettes containing more than 16 colors in the .PCX image file. The 256 color palette is formatted and treated the same as the 16 color palette, except that it is substantially longer. The palette (number of colors x 3 bytes in length) is appended to the end of the .PCX file, and is preceded by a 12 decimal. Since the VGA device expects a palette value to be 0-63 instead of 0-255, you need to divide the values read in the palette by 4.
To access a 256 color palette:

First, check the version number in the header; if it contains a 5 there is
a palette.
Second, read to the end of the file and count back 769 bytes. The value you find should be a 12 decimal, showing the presence of a 256 color palette.
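
The two checks above, plus the divide-by-4 scaling for the VGA, can be sketched as follows. The code assumes the whole file is already in memory; pcx_find_palette256 and vga_dac_value are illustrative names.

```c
#include <stdint.h>
#include <stddef.h>

/* Returns a pointer to the 768 palette bytes (256 RGB triples) inside
   `file`, or NULL if the file has no 256 color palette. */
const uint8_t *pcx_find_palette256(const uint8_t *file, size_t len)
{
    if (len < 128 + 769)          /* header plus marker and palette */
        return NULL;
    if (file[1] != 5)             /* version byte must be 5 */
        return NULL;
    if (file[len - 769] != 12)    /* marker byte, 12 decimal */
        return NULL;
    return file + (len - 768);
}

/* Scales one 0-255 palette component to the 0-63 range the VGA expects. */
uint8_t vga_dac_value(uint8_t component)
{
    return component >> 2;        /* same as dividing by 4 */
}
```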

24-Bit .PCX Files
24 bit images are stored as version 5 or above as 8 bit, 3 plane images.
24 bit images do not contain a palette. Bit planes are ordered as lines of red, green, blue in that order.

CGA Color Palette Information

NOTE: This is no longer supported for PC Paintbrush IV/IV Plus.

For a standard IBM CGA board, the palette settings are a bit more complex.
Only the first byte of the triple is used. The first triple has a valid
first byte which represents the background color. To find the background,
take the (unsigned) byte value and divide by 16. This will give a result
between 0-15, hence the background color. The second triple has a valid
first byte, which represents the foreground palette. PC Paintbrush supports
8 possible CGA palettes, so when the foreground setting is encoded between
0 and 255, there are 8 ranges of numbers and the divisor is 32.

CGA Color Map
Header Byte #16: Background color is determined in the upper four bits.
Header Byte #19: Only upper 3 bits are used, lower 5 bits are ignored. The first three bits that are used are ordered C, P, I. These bits are interpreted as follows:
c: color burst enable - 0 = color; 1 = monochrome
p: palette - 0 = yellow; 1 = white
i: intensity - 0 = dim; 1 = bright
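
A sketch of extracting these settings from the header. It assumes that "ordered C, P, I" means C is the highest bit of byte 19 (bit 7), followed by P (bit 6) and I (bit 5); CgaColors and cga_decode are hypothetical names.

```c
#include <stdint.h>

typedef struct {
    unsigned background;  /* 0-15, from the upper four bits of byte 16 */
    unsigned mono;        /* C: 0 = color, 1 = monochrome */
    unsigned palette;     /* P: 0 = yellow, 1 = white */
    unsigned intensity;   /* I: 0 = dim, 1 = bright */
} CgaColors;

/* hdr must point at the 128-byte PCX header. */
CgaColors cga_decode(const uint8_t hdr[128])
{
    CgaColors c;
    uint8_t b = hdr[19];

    c.background = hdr[16] >> 4;   /* same as dividing by 16 */
    c.mono       = (b >> 7) & 1;   /* assumed bit position, see above */
    c.palette    = (b >> 6) & 1;
    c.intensity  = (b >> 5) & 1;
    return c;
}
```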

QuickBasic BSAVE Format

We'll assume the picture dimensions are width x height x bpp (where bpp means bits per pixel).

File Header
The total length of the header is 7 bytes

Byte 00	Must be &HFD (253) to be a valid BSAVE file
Bytes 01 to 02	Segment where the data was stored in memory before using BSAVE
Bytes 03 to 04	Offset where the data was stored in memory before using BSAVE
Bytes 05 to 06	width*height*(bpp/8)+5 : Size of the array stored in the file + 5

Memory Dump of the buffer obtained with the GET command

Bytes 07 to 08	width*bpp
Bytes 09 to 10	height
Bytes 11 to 11+width*height*(bpp/8)	Color indexes map obtained with the GET command

Checksum
Last byte : This is a kind of checksum. I have no more information about this byte but it appears to depend on the picture dimensions.

Note: All numbers in this document are written in decimal
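
As an illustration, the fields laid out above can be read from a byte buffer as shown below. BsaveInfo and bsave_parse are hypothetical names, and the 2-byte values are assumed to be stored low byte first, as is usual on the PC.

```c
#include <stdint.h>
#include <stddef.h>

typedef struct {
    uint16_t segment, offset, length;  /* bytes 01 to 06 of the file header */
    uint16_t width_bits;               /* bytes 07 to 08: width*bpp */
    uint16_t height;                   /* bytes 09 to 10 */
} BsaveInfo;

static uint16_t rd16(const uint8_t *p) { return (uint16_t)(p[0] | (p[1] << 8)); }

/* Returns 0 on success, -1 if the buffer is too short or byte 00 is not
   253 (&HFD). */
int bsave_parse(const uint8_t *buf, size_t len, BsaveInfo *out)
{
    if (len < 11 || buf[0] != 0xFD)
        return -1;
    out->segment    = rd16(buf + 1);
    out->offset     = rd16(buf + 3);
    out->length     = rd16(buf + 5);
    out->width_bits = rd16(buf + 7);
    out->height     = rd16(buf + 9);
    return 0;
}
```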

.FNT Font-File Format

Formats for Microsoft Windows font files are defined for both raster and vector fonts. These formats can be used by smart text generators in some GDI support modules. The vector formats, in particular, are more frequently used by GDI itself than by support modules.

Both raster and vector font files begin with information that is common to both, and then continue with information that differs for each type of file.

For Windows 3.00, the font-file header includes six new fields: dfFlags, dfAspace, dfBspace, dfCspace, dfColorPointer, and dfReserved1. These fields are not used in Windows 3.00. To ensure compatibility with future versions of Windows, these fields should be set to zero.

All device drivers support the Windows 2.x fonts. However, not all device drivers support the Windows 3.00 version.

Windows 3.00 font files include the glyph table in dfCharTable, which consists of structures that describe the bits for characters in the font file. This version enables fonts to exceed 64K in size, the size limit of Windows 2.x fonts. This is made possible by the use of 32-bit offsets to the character glyphs in dfCharTable.

Because of the 32-bit offsets and their potentially large size, these fonts are designed for use on systems that are running Windows version 3.00 in protected (standard or 386 enhanced) mode with an 80386 (or higher) processor where the processor's 32-bit registers can access the character glyphs. Typically, device drivers use the Windows 3.00 version of a font only when both of these conditions are true.

Font files are stored with an .FNT extension of the form NAME.FNT. The information at the beginning of both raster and vector versions of Windows 3.00 font files is shown in the following list:

Field	Description
dfVersion	2 bytes specifying the version (0200H or 0300H) of the file.
dfSize	4 bytes specifying the total size of the file in bytes.
dfCopyright	60 bytes specifying copyright information.
dfType	2 bytes specifying the type of font file. The low-order byte is exclusively for GDI use. If the low-order bit of the WORD is zero, it is a bitmap (raster) font file. If the low-order bit is 1, it is a vector font file. The second bit is reserved and must be zero. If no bits follow in the file and the bits are located in memory at a fixed address specified in dfBitsOffset, the third bit is set to 1; otherwise, the bit is set to 0 (zero). The high-order bit of the low byte is set if the font was realized by a device. The remaining bits in the low byte are reserved and set to zero. The high byte is reserved for device use and will always be set to zero for GDI-realized standard fonts. Physical fonts with the high-order bit of the low byte set may use this byte to describe themselves. GDI will never inspect the high byte.
dfPoints	2 bytes specifying the nominal point size at which this character set looks best.
dfVertRes	2 bytes specifying the nominal vertical resolution (dots-per-inch) at which this character set was digitized.
dfHorizRes	2 bytes specifying the nominal horizontal resolution (dots-per-inch) at which this character set was digitized.
dfAscent	2 bytes specifying the distance from the top of a character definition cell to the baseline of the typographical font. It is useful for aligning the baselines of fonts of different heights.
dfInternalLeading	2 bytes specifying the amount of leading inside the bounds set by dfPixHeight. Accent marks may occur in this area. This may be zero at the designer's option.
dfExternalLeading	2 bytes specifying the amount of extra leading that the designer requests the application add between rows. Since this area is outside of the font proper, it contains no marks and will not be altered by text output calls in either the OPAQUE or TRANSPARENT mode. This may be zero at the designer's option.
dfItalic	1 (one) byte specifying whether or not the character definition data represent an italic font. The low-order bit is 1 if the flag is set. All the other bits are zero.
dfUnderline	1 byte specifying whether or not the character definition data represent an underlined font. The low-order bit is 1 if the flag is set. All the other bits are 0 (zero).
dfStrikeOut	1 byte specifying whether or not the character definition data represent a struckout font. The low-order bit is 1 if the flag is set. All the other bits are zero.
dfWeight	2 bytes specifying the weight of the characters in the character definition data, on a scale of 1 to 1000. A dfWeight of 400 specifies a regular weight.
dfCharSet	1 byte specifying the character set defined by this font.
dfPixWidth	2 bytes. For vector fonts, specifies the width of the grid on which the font was digitized. For raster fonts, if dfPixWidth is nonzero, it represents the width for all the characters in the bitmap; if it is zero, the font has variable width characters whose widths are specified in the dfCharTable array.
dfPixHeight	2 bytes specifying the height of the character bitmap (raster fonts), or the height of the grid on which a vector font was digitized.
dfPitchAndFamily	1 byte specifying the pitch and font family. The low bit is set if the font is variable pitch. The high four bits give the family name of the font. Font families describe in a general way the look of a font. They are intended for specifying fonts when the exact face name desired is not available. The families are as follows: FF_DONTCARE (0<<4): don't care or don't know; FF_ROMAN (1<<4): proportionally spaced fonts with serifs; FF_SWISS (2<<4): proportionally spaced fonts without serifs; FF_MODERN (3<<4): fixed-pitch fonts; FF_SCRIPT (4<<4); FF_DECORATIVE (5<<4).
dfAvgWidth	2 bytes specifying the width of characters in the font. For fixed-pitch fonts, this is the same as dfPixWidth. For variable-pitch fonts, this is the width of the character "X."
dfMaxWidth	2 bytes specifying the maximum pixel width of any character in the font. For fixed-pitch fonts, this is simply dfPixWidth.
dfFirstChar	1 byte specifying the first character code defined by this font. Character definitions are stored only for the characters actually present in a font. Therefore, use this field when calculating indexes into either dfBits or dfCharOffset.
dfLastChar	1 byte specifying the last character code defined by this font. Note that all the characters with codes between dfFirstChar and dfLastChar must be present in the font character definitions.
dfDefaultChar	1 byte specifying the character to substitute whenever a string contains a character out of the range. The character is given relative to dfFirstChar so that dfDefaultChar is the actual value of the character, less dfFirstChar. The dfDefaultChar should indicate a special character that is not a space.
dfBreakChar	1 byte specifying the character that will define word breaks. This character defines word breaks for word wrapping and word spacing justification. The character is given relative to dfFirstChar so that dfBreakChar is the actual value of the character, less that of dfFirstChar. The dfBreakChar is normally (32 - dfFirstChar), which is an ASCII space.
dfWidthBytes	2 bytes specifying the number of bytes in each row of the bitmap. This is always even, so that the rows start on WORD boundaries. For vector fonts, this field has no meaning.
dfDevice	4 bytes specifying the offset in the file to the string giving the device name. For a generic font, this value is zero.
dfFace	4 bytes specifying the offset in the file to the null-terminated string that names the face.
dfBitsPointer	4 bytes specifying the absolute machine address of the bitmap. This is set by GDI at load time. The dfBitsPointer is guaranteed to be even.
dfBitsOffset	4 bytes specifying the offset in the file to the beginning of the bitmap information. If the 04H bit in the dfType is set, then dfBitsOffset is an absolute address of the bitmap (probably in ROM). For raster fonts, dfBitsOffset points to a sequence of bytes that make up the bitmap of the font, whose height is the height of the font, and whose width is the sum of the widths of the characters in the font rounded up to the next WORD boundary. For vector fonts, it points to a string of bytes or words (depending on the size of the grid on which the font was digitized) that specify the strokes for each character of the font. The dfBitsOffset field must be even.
dfReserved	1 byte, not used.
dfFlags	4 bytes specifying the bit flags, which are additional flags that define the format of the glyph bitmap, as follows: DFF_FIXED (0001h): font is fixed pitch; DFF_PROPORTIONAL (0002h): font is proportional pitch; DFF_ABCFIXED (0004h): font is an ABC fixed font; DFF_ABCPROPORTIONAL (0008h): font is an ABC proportional font; DFF_1COLOR (0010h): font is one color; DFF_16COLOR (0020h): font is 16 color; DFF_256COLOR (0040h): font is 256 color; DFF_RGBCOLOR (0080h): font is RGB color.
dfAspace	2 bytes specifying the global A space, if any. The dfAspace is the distance from the current position to the left edge of the bitmap.
dfBspace	2 bytes specifying the global B space, if any. The dfBspace is the width of the character.
dfCspace	2 bytes specifying the global C space, if any. The dfCspace is the distance from the right edge of the bitmap to the new current position. The increment of a character is the sum of the three spaces. These values apply to all glyphs, which is the case for DFF_ABCFIXED.
dfColorPointer	4 bytes specifying the offset to the color table for color fonts, if any. The format of the bits is similar to a DIB, but without the header. That is, the characters are not split up into disjoint bytes. Instead, they are left intact. If no color table is needed, this entry is NULL. [NOTE: This information is different from that in the hard-copy Developer's Notes and reflects a correction.]
dfReserved1	16 bytes, not used. [NOTE: This information is different from that in the hard-copy Developer's Notes and reflects a correction.]
dfCharTable	For raster fonts, the CharTable is an array of entries each consisting of two 2-byte WORDs for Windows 2.x and three 2-byte WORDs for Windows 3.00. The first WORD of each entry is the character width. The second WORD of each entry is the byte offset from the beginning of the FONTINFO structure to the character bitmap. For Windows 3.00, the second and third WORDs are used for the offset. There is one extra entry at the end of this table that describes an absolute-space character. This entry corresponds to a character that is guaranteed to be blank; this character is not part of the normal character set. The number of entries in the table is calculated as `((dfLastChar - dfFirstChar) + 2)`. This includes a spare, the sentinel offset mentioned in the following paragraph. For fixed-pitch vector fonts, each 2-byte entry in this array specifies the offset from the start of the bitmap to the beginning of the string of stroke specification units for the character. The number of bytes or WORDs to be used for a particular character is calculated by subtracting its entry from the next one, so that there is a sentinel at the end of the array of values. For proportionally spaced vector fonts, each 4-byte entry is divided into two 2-byte fields. The first field gives the starting offset from the start of the bitmap of the character strokes. The second field gives the pixel width of the character.
<facename>	An ASCII character string specifying the name of the font face. The size of this field is the length of the string plus a NULL terminator.
<devicename>	An ASCII character string specifying the name of the device if this font file is for a specific device. The size of this field is the length of the string plus a NULL terminator.
<bitmaps>	This field contains the character bitmap definitions. Each character is stored as a contiguous set of bytes. (In the old font format, this was not the case.) The first byte contains the first 8 bits of the first scanline (that is, the top line of the character). The second byte contains the first 8 bits of the second scanline. This continues until a first "column" is completely defined. The following byte contains the next 8 bits of the first scanline, padded with zeros on the right if necessary (and so on, down through the second "column"). If the glyph is quite narrow, each scanline is covered by 1 byte, with bits set to zero as necessary for padding. If the glyph is very wide, a third or even fourth set of bytes can be present. Note: The character bitmaps must be stored contiguously and arranged in ascending order.

The following is a single-character example, in which are given the bytes for a 12 x 14 pixel character, as shown here schematically.

............
.....**.....
....*..*....
...*....*...
..*......*..
..*......*..
..*......*..
..********..
..*......*..
..*......*..
..*......*..
............
............
............

The bytes are given here in two sets, because the character is less than 17 pixels wide.

00 06 09 10 20 20 20 3F 20 20 20 00 00 00
00 00 00 80 40 40 40 C0 40 40 40 00 00 00

Note that in the second set of bytes, the second digit of each is always zero. It would correspond to the 13th through 16th pixels on the right side of the character, if they were present.
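
The column-by-column layout can be made concrete with a short sketch that turns a glyph's bytes back into rows of text. The function name fnt_glyph_to_text is hypothetical, and the code assumes that the most significant bit of each byte is the leftmost pixel, which is what the example bytes above imply.

```c
#include <stdint.h>
#include <stddef.h>

/* Renders one glyph stored column by column (each column is `height` bytes,
   one byte per scan line) into rows of '*' and '.' characters.
   `rows` must hold height * (width + 1) chars; each row is NUL-terminated. */
void fnt_glyph_to_text(const uint8_t *bits, int width, int height, char *rows)
{
    for (int y = 0; y < height; y++) {
        char *row = rows + (size_t)y * (size_t)(width + 1);
        for (int x = 0; x < width; x++) {
            /* Byte column x / 8 starts at offset (x / 8) * height. */
            uint8_t b = bits[(x / 8) * height + y];
            row[x] = (b & (0x80 >> (x % 8))) ? '*' : '.';
        }
        row[width] = '\0';
    }
}
```

Fed the 28 example bytes with a width of 12 and a height of 14, the function fills a 14-row buffer with the outline of the character.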

The Windows 2.x version of dfCharTable has a GlyphEntry structure with the following format:


GlyphEntry    struc
  geWidth     dw ? ;width of character bitmap in pixels
  geOffset    dw ? ;pointer to the bits
GlyphEntry    ends

The Windows 3.00 version of the dfCharTable is dependent on the format of the Glyph bitmap.
Note: The only formats supported in Windows 3.00 will be DFF_FIXED and DFF_PROPORTIONAL.


DFF_FIXED
DFF_PROPORTIONAL

GlyphEntry    struc
  geWidth     dw ? ;width of character bitmap in pixels
  geOffset    dd ? ;pointer to the bits
GlyphEntry    ends

DFF_ABCFIXED
DFF_ABCPROPORTIONAL

GlyphEntry    struc
  geWidth     dw ? ;width of character bitmap in pixels
  geOffset    dd ? ;pointer to the bits
  geAspace    dd ? ;A space in fractional pixels (16.16)
  geBspace    dd ? ;B space in fractional pixels (16.16)
  geCspace    dd ? ;C space in fractional pixels (16.16)
GlyphEntry    ends

The fractional pixels are expressed as a 32-bit signed number with an implicit binary point between bits 15 and 16. This is referred to as a 16.16 ("sixteen dot sixteen") fixed-point number.
The ABC spacing here is the same as that defined above. However, here there are specific sets for each character.
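As a rough illustration of the 16.16 representation, the conversion to and from floating point might look like the following. The helper names are hypothetical and are not part of the font format.

```c
#include <stdint.h>

/* Convert a 16.16 fixed-point value to a double. */
double fixed16_16_to_double(int32_t v)
{
    return (double)v / 65536.0;
}

/* Convert a double to 16.16, truncating toward zero. The caller must keep
 * the value within the representable range (roughly -32768 to 32767). */
int32_t double_to_fixed16_16(double d)
{
    return (int32_t)(d * 65536.0);
}
```

For instance, an A space of 0x00018000 would correspond to 1.5 pixels.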


DFF_1COLOR
DFF_16COLOR
DFF_256COLOR
DFF_RGBCOLOR

GlyphEntry    struc
  geWidth     dw ? ;width of character bitmap in pixels
  geOffset    dd ? ;pointer to the bits
  geHeight    dw ? ;height of character bitmap in pixels
  geAspace    dd ? ;A space in fractional pixels (16.16)
  geBspace    dd ? ;B space in fractional pixels (16.16)
  geCspace    dd ? ;C space in fractional pixels (16.16)
GlyphEntry    ends

DFF_1COLOR means 8 pixels per byte
DFF_16COLOR means 2 pixels per byte
DFF_256COLOR means 1 pixel per byte
DFF_RGBCOLOR means RGBquads

.FON File Format

.FON files are used by the Windows operating system and are an extension to the Windows 3.1 .FNT files. Each FON file is an NE format DLL file with only a resource section, containing .FNT files as resources.

EXE File Format

All multi-byte values are stored LSB first. One block is 512 bytes, one paragraph is 16 bytes.

Offset (hex)	Meaning
`00-01`	`0x4d`, `0x5a`. This is the "magic number" of an EXE file. The first byte of the file is `0x4d` and the second is `0x5a`.
`02-03`	The number of bytes in the last block of the program that are actually used. If this value is zero, that means the entire last block is used (i.e. the effective value is 512).
`04-05`	Number of blocks in the file that are part of the EXE file. If `[02-03]` is non-zero, only that much of the last block is used.
`06-07`	Number of relocation entries stored after the header. May be zero.
`08-09`	Number of paragraphs in the header. The program's data begins just after the header, and this field can be used to calculate the appropriate file offset. The header includes the relocation entries. Note that some OSs and/or programs may fail if the header is not a multiple of 512 bytes.
`0A-0B`	Number of paragraphs of additional memory that the program will need. This is the equivalent of the BSS size in a Unix program. The program can't be loaded if there isn't at least this much memory available to it.
`0C-0D`	Maximum number of paragraphs of additional memory. Normally, the OS reserves all the remaining conventional memory for your program, but you can limit it with this field.
`0E-0F`	Relative value of the stack segment. This value is added to the segment the program was loaded at, and the result is used to initialize the `SS` register.
`10-11`	Initial value of the `SP` register.
`12-13`	Word checksum. If set properly, the 16-bit sum of all words in the file should be zero. Usually, this isn't filled in.
`14-15`	Initial value of the `IP` register.
`16-17`	Initial value of the `CS` register, relative to the segment the program was loaded at.
`18-19`	Offset of the first relocation item in the file.
`1A-1B`	Overlay number. Normally zero, meaning that it's the main program.

Here is a structure that can be used to represent the EXE header and relocation entries, assuming a 16-bit LSB machine:


struct EXE {
  unsigned short signature; /* == 0x5a4D */
  unsigned short bytes_in_last_block;
  unsigned short blocks_in_file;
  unsigned short num_relocs;
  unsigned short header_paragraphs;
  unsigned short min_extra_paragraphs;
  unsigned short max_extra_paragraphs;
  unsigned short ss;
  unsigned short sp;
  unsigned short checksum;
  unsigned short ip;
  unsigned short cs;
  unsigned short reloc_table_offset;
  unsigned short overlay_number;
};

struct EXE_RELOC {
  unsigned short offset;
  unsigned short segment;
};

The offset of the beginning of the EXE data is computed like this:


exe_data_start = exe.header_paragraphs * 16L;

The offset of the byte just after the EXE data (in DJGPP, the size of the stub and the start of the COFF image) is computed like this:


extra_data_start = exe.blocks_in_file * 512L;
if (exe.bytes_in_last_block)
  extra_data_start -= (512 - exe.bytes_in_last_block);
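Because the struct above assumes a 16-bit LSB machine, a more portable approach is to decode the header fields from raw bytes. The following is a minimal sketch of the same two calculations under that approach; `rd16` and `exe_extents` are hypothetical names, not part of any existing library.

```c
#include <stddef.h>

/* Read a 16-bit little-endian value from a byte buffer. */
static unsigned rd16(const unsigned char *p)
{
    return (unsigned)p[0] | ((unsigned)p[1] << 8);
}

/* Compute the file offset of the EXE data and of the first byte after the
 * EXE image from a raw header. Returns 0 on success, -1 if the buffer is
 * shorter than the 1Ch-byte header or the 4Dh 5Ah signature is missing. */
int exe_extents(const unsigned char *hdr, size_t len,
                unsigned long *data_start, unsigned long *extra_start)
{
    unsigned bytes_in_last_block, blocks_in_file, header_paragraphs;

    if (len < 0x1C || hdr[0] != 0x4D || hdr[1] != 0x5A)
        return -1;

    bytes_in_last_block = rd16(hdr + 0x02);
    blocks_in_file      = rd16(hdr + 0x04);
    header_paragraphs   = rd16(hdr + 0x08);

    *data_start  = header_paragraphs * 16UL;   /* one paragraph is 16 bytes */
    *extra_start = blocks_in_file * 512UL;     /* one block is 512 bytes */
    if (bytes_in_last_block)
        *extra_start -= 512UL - bytes_in_last_block;
    return 0;
}
```

With 3 blocks, 144 bytes used in the last block, and a 4-paragraph header, this gives a data start of 64 and an image end of 1168.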

Intel's COM (Command Executable) File Format

The COM files are raw binary executables and are a leftover from the old CP/M machines with 64K RAM. A COM program can only have a size of less than one segment (64K), including code and static data since no fixups for segment relocation or anything else is included. One method to check for a COM file is to check if the first byte in the file could be a valid jump or call opcode, but this is a very weak test since a COM file is not required to start with a jump or a call. In principle, a COM file is just loaded at offset 100h in the segment and then executed.

OFFSET	Count	TYPE	Description
0000h	1	byte	ID=0E9h or ID=0EBh (a near or short jump opcode). Neither is a safe way to determine whether a file is a COM file, but most COM files start with a jump.
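The weak test described above could be sketched as follows. The function name and the `COM_MAX_SIZE` limit are assumptions made for illustration: the limit is simply one 64K segment minus the 100h bytes that precede the load address, and real loaders may need slightly more headroom.

```c
#include <stddef.h>

/* Assumed upper bound on a COM image: 64K minus the 100h load offset. */
#define COM_MAX_SIZE (0x10000UL - 0x100UL)

/* Weak heuristic: nonzero if the buffer fits in one segment and starts
 * with a jump. A zero result does NOT prove the file is not a COM file. */
int com_starts_with_jump(const unsigned char *buf, size_t len)
{
    if (len == 0 || len > COM_MAX_SIZE)
        return 0;
    /* 0E9h is a near JMP, 0EBh is a short JMP. */
    return buf[0] == 0xE9 || buf[0] == 0xEB;
}
```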

WAVE File Format

WAVE File Format is a file format for storing digital audio (waveform) data. It supports a variety of bit resolutions, sample rates, and channels of audio. This format is very popular on IBM PC (clone) platforms, and is widely used in professional programs that process digital audio waveforms. It takes into account some peculiarities of the Intel CPU such as little endian byte order.

This format uses Microsoft's version of the Electronic Arts Interchange File Format method for storing data in "chunks".

Data Types

A C-like language will be used to describe the data structures in the file. A few extra data types that are not part of standard C, but which will be used in this document, are:

pstring : Pascal-style string, a one-byte count followed by that many text bytes. The total number of bytes in this data type should be even. A pad byte can be added to the end of the text to accomplish this. This pad byte is not reflected in the count.
ID : A chunk ID (ie, 4 ASCII bytes).

Also note that when you see an array with no size specification (e.g., char ckData[];), this indicates a variable-sized array in our C-like language. This differs from standard C arrays.
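To make the pstring padding rule concrete, here is a small sketch that computes how many bytes a pstring occupies in the file. The function name is hypothetical.

```c
#include <stddef.h>

/* Total bytes a pstring occupies in the file: the count byte, the text,
 * and one pad byte when that sum would otherwise be odd. The pad byte is
 * not reflected in the count stored in p[0]. */
size_t pstring_stored_size(const unsigned char *p)
{
    size_t n = 1 + (size_t)p[0];
    return n + (n & 1);
}
```

A 3-character string therefore occupies 4 bytes, and a 4-character string occupies 6.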

Constants

Decimal values are referred to as a string of digits, for example 123, 0, 100 are all decimal numbers. Hexadecimal values are preceded by a 0x - e.g., 0x0A, 0x1, 0x64.

Data Organization

All data is stored in 8-bit bytes, arranged in Intel 80x86 (ie, little endian) format. The bytes of multiple-byte values are stored with the low-order (ie, least significant) bytes first. Data bits are as follows (ie, shown with bit numbers on top):

File Structure

A WAVE file is a collection of a number of different types of chunks. There is a required Format ("fmt ") chunk which contains important parameters describing the waveform, such as its sample rate. The Data chunk, which contains the actual waveform data, is also required. All other chunks are optional. Among the other optional chunks are ones which define cue points, list instrument parameters, store application-specific information, etc. All of these chunks are described in detail in the following sections of this document.

All applications that use WAVE must be able to read the 2 required chunks and can choose to selectively ignore the optional chunks. A program that copies a WAVE should copy all of the chunks in the WAVE, even those it chooses not to interpret.

There are no restrictions upon the order of the chunks within a WAVE file, with the exception that the Format chunk must precede the Data chunk. Some inflexibly written programs expect the Format chunk as the first chunk (after the RIFF header) although they shouldn't because the specification doesn't require this.
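Since chunks may appear in almost any order, a reader should search for the chunks it needs rather than assume fixed offsets. The following is a minimal sketch of such a search over a file already loaded into memory. It assumes the usual RIFF container layout (the 4 bytes "RIFF", a 4-byte size, then the 4 bytes "WAVE") ahead of the first chunk, and it skips the pad byte that follows any odd-sized chunk. The names `rd32` and `wave_find_chunk` are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Read a 32-bit little-endian value from a byte buffer. */
static unsigned long rd32(const unsigned char *p)
{
    return (unsigned long)p[0] | ((unsigned long)p[1] << 8) |
           ((unsigned long)p[2] << 16) | ((unsigned long)p[3] << 24);
}

/* Search the top-level chunks of an in-memory WAVE file for `id`.
 * On success returns the offset of the chunk's data and stores its
 * chunkSize in *size. Returns -1 if the chunk is absent or the file
 * is malformed. */
long wave_find_chunk(const unsigned char *buf, size_t len,
                     const char id[4], unsigned long *size)
{
    size_t pos = 12;  /* skip "RIFF", the RIFF size, and "WAVE" */

    if (len < 12 || memcmp(buf, "RIFF", 4) != 0 ||
        memcmp(buf + 8, "WAVE", 4) != 0)
        return -1;

    while (pos + 8 <= len) {
        unsigned long sz = rd32(buf + pos + 4);
        if (sz > len - pos - 8)
            return -1;              /* chunk runs past the end of the file */
        if (memcmp(buf + pos, id, 4) == 0) {
            *size = sz;
            return (long)(pos + 8);
        }
        pos += 8 + sz + (sz & 1);   /* skip the data plus any pad byte */
    }
    return -1;
}
```

A caller would typically look up "fmt " first and "data" second, and ignore any chunk it does not recognize.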

Chart on the right side is a graphical overview of an example, minimal WAVE file. It consists of a single WAVE containing the 2 required chunks, a Format and a Data Chunk.

A Bastardized Standard

The WAVE format is sort of a bastardized standard that was concocted by too many "cooks" who didn't properly coordinate the addition of "ingredients" to the "soup". Unlike with the AIFF standard which was mostly designed by a small, coordinated group, the WAVE format has had all manner of much-too-independent, uncoordinated aberrations inflicted upon it. The net result is that there are far too many chunks that may be found in a WAVE file -- many of them duplicating the same information found in other chunks (but in an unnecessarily different way) simply because there have been too many programmers who took too many liberties with unilaterally adding their own additions to the WAVE format without properly coming to a consensus of what everyone else needed (and therefore it encouraged an "every man for himself" attitude toward adding things to this "standard").

One example is the Instrument chunk versus the Sampler chunk. Another example is the Note versus Label chunks in an Associated Data List. I don't even want to get into the totally irresponsible proliferation of compressed formats. (ie, It seems like everyone and his pet Dachshund has come up with some compressed version of storing wave data -- like we need 100 different ways to do that). Furthermore, there are lots of inconsistencies, for example how 8-bit data is unsigned, but 16-bit data is signed.

I've attempted to document only those aspects that you're very likely to encounter in a WAVE file. I suggest that you concentrate upon these and refuse to support the work of programmers who feel the need to deviate from a standard with inconsistent, proprietary, self-serving, unnecessary extensions. Please do your part to rein in half-ass programming.

Sample Points and Sample Frames

A large part of interpreting WAVE files revolves around the two concepts of sample points and sample frames.

A sample point is a value representing a sample of a sound at a given moment in time. For waveforms with greater than 8-bit resolution, each sample point is stored as a linear, 2's-complement value which may be from 9 to 32 bits wide (as determined by the wBitsPerSample field in the Format Chunk, assuming PCM format -- an uncompressed format).

For example, each sample point of a 16-bit waveform would be a 16-bit word (ie, two 8-bit bytes) where 32767 (0x7FFF) is the highest value and -32768 (0x8000) is the lowest value. For 8-bit (or less) waveforms, each sample point is a linear, unsigned byte where 255 is the highest value and 0 is the lowest value. Obviously, this signed/unsigned sample point discrepancy between 8-bit and larger resolution waveforms was one of those "oops" scenarios where some Microsoft employee decided to change the sign sometime after 8-bit wave files were common but 16-bit wave files hadn't yet appeared.
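Code that handles both resolutions usually has to bridge this signed/unsigned discrepancy. One common way to do so, shown here only as a sketch with a hypothetical function name, is to re-center the unsigned 8-bit value around zero and scale it up:

```c
#include <stdint.h>

/* Convert an unsigned 8-bit WAVE sample point (0..255, midpoint 128)
 * to a signed 16-bit one. Multiplication is used instead of a left
 * shift because shifting a negative value is undefined in C. */
int16_t sample_u8_to_s16(uint8_t s)
{
    return (int16_t)(((int)s - 128) * 256);
}
```

With this mapping, 0 becomes -32768, 128 becomes 0, and 255 becomes 32512.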

Because most CPUs' read and write operations deal with 8-bit bytes, it was decided that a sample point should be rounded up to a size which is a multiple of 8 when stored in a WAVE. This makes the WAVE easier to read into memory. If your ADC produces a sample point from 1 to 8 bits wide, a sample point should be stored in a WAVE as an 8-bit byte (ie, unsigned char). If your ADC produces a sample point from 9 to 16 bits wide, a sample point should be stored in a WAVE as a 16-bit word (ie, signed short). If your ADC produces a sample point from 17 to 24 bits wide, a sample point should be stored in a WAVE as three bytes. If your ADC produces a sample point from 25 to 32 bits wide, a sample point should be stored in a WAVE as a 32-bit doubleword (ie, signed long), and so on.

Furthermore, the data bits should be left-justified, with any remaining (ie, pad) bits zeroed. For example, consider the case of a 12-bit sample point. It has 12 bits, so the sample point must be saved as a 16-bit word. Those 12 bits should be left-justified so that they become bits 4 to 15 inclusive, and bits 0 to 3 should be set to zero. Shown below is how a 12-bit sample point with a value of binary 101000010111 is formatted left-justified as a 16-bit word.

  bit 15                    bit 0
  1 0 1 0 0 0 0 1 0 1 1 1 0 0 0 0
  <---- sample point ---> <-pad->

But note that, because the WAVE format uses Intel little endian byte order, the LSB is stored first in the wave file as so:

  0 1 1 1 0 0 0 0    1 0 1 0 0 0 0 1
  bits 7 to 0        bits 15 to 8
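The rounding and left-justification rules can be sketched in C for the 12-bit case. The function names are hypothetical, and the second function handles only sample points of 9 to 16 bits, specialized here to 12.

```c
#include <stdint.h>

/* Number of bytes used to store one sample point of the given bit width
 * (the width rounded up to a multiple of 8). */
unsigned sample_point_bytes(unsigned bits)
{
    return (bits + 7) / 8;
}

/* Left-justify a 12-bit sample point in a 16-bit word and write it to
 * out[0..1] in little-endian order. Only the low 12 bits of `sample`
 * are used; bits 0 to 3 of the stored word are zero. */
void store_sample12(uint16_t sample, unsigned char out[2])
{
    uint16_t word = (uint16_t)((sample & 0x0FFFu) << 4);

    out[0] = (unsigned char)(word & 0xFF);  /* low byte is stored first */
    out[1] = (unsigned char)(word >> 8);
}
```

For the value binary 101000010111 (0xA17), the stored word is 0xA170, written to the file as the bytes 0x70 then 0xA1.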

For multichannel sounds (for example, a stereo waveform), single sample points from each channel are interleaved. For example, assume a stereo (ie, 2 channel) waveform. Instead of storing all of the sample points for the left channel first, and then storing all of the sample points for the right channel next, you "mix" the two channels' sample points together. You would store the first sample point of the left channel. Next, you would store the first sample point of the right channel. Next, you would store the second sample point of the left channel. Next, you would store the second sample point of the right channel, and so on, alternating between storing the next sample point of each channel. This is what is meant by interleaved data; you store the next sample point of each of the channels in turn, so that the sample points that are meant to be "played" (ie, sent to a DAC) simultaneously are stored contiguously.

The sample points that are meant to be "played" (ie, sent to a DAC) simultaneously are collectively called a sample frame. In the example of our stereo waveform, every two sample points make up another sample frame. This is illustrated below for that stereo example.

  sample frame 0     sample frame 1           sample frame N
 | left | right |   | left | right |  . . .  | left | right |

For a monophonic waveform, a sample frame is merely a single sample point (ie, there's nothing to interleave). For multichannel waveforms, you should follow the conventions shown below for which order to store channels within the sample frame. (ie, Below, a single sample frame is displayed for each example of a multichannel waveform).

The sample points within a sample frame are packed together; there are no unused bytes between them. Likewise, the sample frames are packed together with no pad bytes.
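Because sample frames are packed with no padding, pulling one channel out of interleaved data is a simple strided copy. The sketch below assumes 16-bit sample points that have already been converted to the host's byte order; the function name is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Copy one channel out of interleaved 16-bit sample frames.
 * `frames` holds nframes * channels sample points, `ch` is the zero-based
 * channel index within a frame, and `out` must have room for nframes
 * sample points. */
void wave_extract_channel(const int16_t *frames, size_t nframes,
                          unsigned channels, unsigned ch, int16_t *out)
{
    size_t i;

    for (i = 0; i < nframes; i++)
        out[i] = frames[i * channels + ch];
}
```

For a stereo waveform, channel 0 would be the left channel and channel 1 the right, following the storage order described above.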

Note that the above discussion outlines the format of data within an uncompressed data chunk. There are some techniques of storing compressed data in a data chunk. Obviously, that data would need to be uncompressed, and then it will adhere to the above layout.

The Format Chunk

The Format (fmt) chunk describes fundamental parameters of the waveform data such as sample rate, bit resolution, and how many channels of digital audio are stored in the WAVE.

#define FormatID 'fmt '   /* chunkID for Format Chunk. NOTE: There is a space at the end of this ID. */

typedef struct {
 ID             chunkID;
 long           chunkSize;

 short          wFormatTag;
 unsigned short wChannels;
 unsigned long  dwSamplesPerSec;
 unsigned long  dwAvgBytesPerSec;
 unsigned short wBlockAlign;
 unsigned short wBitsPerSample;

/* Note: there may be additional fields here, depending upon wFormatTag. */

} FormatChunk;

The ID is always "fmt ". The chunkSize field is the number of bytes in the chunk. This does not include the 8 bytes used by ID and Size fields. For the Format Chunk, chunkSize may vary according to what "format" of WAVE file is specified (ie, depends upon the value of wFormatTag).

WAVE data may be stored without compression, in which case the sample points are stored as described in Sample Points and Sample Frames. Alternately, different forms of compression may be used when storing the sound data in the Data chunk. With compression, each sample point may take a differing number of bytes to store. The wFormatTag indicates whether compression is used when storing the data.

If compression is used (ie, wFormatTag is some value other than 1), then there will be additional fields appended to the Format chunk which give needed information for a program wishing to retrieve and decompress that stored data. The first such additional field will be an unsigned short that indicates how many more bytes have been appended (after this unsigned short). Furthermore, compressed formats must have a Fact chunk which contains an unsigned long indicating the size (in sample points) of the waveform after it has been decompressed. There are (too) many compressed formats. Details about them can be gotten from Microsoft's web site.

If no compression is used (ie, wFormatTag = 1), then there are no further fields.
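As an illustration of the layout above, the Format chunk body can be decoded from raw bytes without relying on the host's struct packing or byte order. This is a minimal sketch: `WaveFormat`, `cbExtra`, and the function names are hypothetical, and the code follows the text in expecting the extra byte count only when wFormatTag is not 1.

```c
/* Hypothetical in-memory form of the Format chunk fields. */
typedef struct {
    unsigned      wFormatTag;
    unsigned      wChannels;
    unsigned long dwSamplesPerSec;
    unsigned long dwAvgBytesPerSec;
    unsigned      wBlockAlign;
    unsigned      wBitsPerSample;
    unsigned      cbExtra;  /* bytes of format-specific data, 0 if none */
} WaveFormat;

static unsigned fmt_rd16(const unsigned char *p)
{
    return (unsigned)p[0] | ((unsigned)p[1] << 8);
}

static unsigned long fmt_rd32(const unsigned char *p)
{
    return (unsigned long)fmt_rd16(p) | ((unsigned long)fmt_rd16(p + 2) << 16);
}

/* Decode the body of a Format chunk (the bytes after chunkSize).
 * Returns 0 on success, -1 if the body is too short for its contents. */
int wave_parse_fmt(const unsigned char *body, unsigned long chunkSize,
                   WaveFormat *f)
{
    if (chunkSize < 16)
        return -1;
    f->wFormatTag       = fmt_rd16(body);
    f->wChannels        = fmt_rd16(body + 2);
    f->dwSamplesPerSec  = fmt_rd32(body + 4);
    f->dwAvgBytesPerSec = fmt_rd32(body + 8);
    f->wBlockAlign      = fmt_rd16(body + 12);
    f->wBitsPerSample   = fmt_rd16(body + 14);
    f->cbExtra          = 0;
    if (f->wFormatTag != 1) {
        /* Compressed: an unsigned short count of appended bytes follows. */
        if (chunkSize < 18)
            return -1;
        f->cbExtra = fmt_rd16(body + 16);
        if (f->cbExtra > chunkSize - 18)
            return -1;
    }
    return 0;
}
```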

The wChannels field contains the number of audio channels for the sound. A value of 1 means monophonic sound, 2 means stereo, 4 means four channel sound, etc. Any number of audio channels may be represented. For multichannel sounds, single sample points from each channel are interleaved. A set of interleaved sample points is called a sample frame.

The actual waveform data is stored in another chunk, the Data Chunk, which will be described later.

The dwSamplesPerSec field is the sample rate at which the sound is to be played back in sample frames per second (ie, Hertz). The 3 standard MPC rates are 11025, 22050, and 44100 Hz, although other rates may be used.

The dwAvgBytesPerSec field indicates how many bytes play every second. dwAvgBytesPerSec may be used by an application to estimate what size RAM buffer is needed to properly play back the WAVE without latency problems. Its value should be equal to the following formula rounded up to the next whole number:

dwSamplesPerSec * wBlockAlign

The wBlockAlign field should be equal to the following formula, rounded up to the next whole number:

wChannels * (wBitsPerSample / 8)

Essentially, wBlockAlign is the size of a sample frame, in terms of bytes. (eg, A sample frame for a 16-bit mono wave is 2 bytes. A sample frame for a 16-bit stereo wave is 4 bytes. Etc).
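For uncompressed data, both derived fields can be computed from the other Format chunk values. The sketch below uses hypothetical function names and applies the rounding per sample point, so that, for example, a 12-bit sample point counts as 2 bytes.

```c
/* wBlockAlign: bytes per sample frame. Each sample point is rounded up
 * to a whole number of bytes before multiplying by the channel count. */
unsigned wave_block_align(unsigned wChannels, unsigned wBitsPerSample)
{
    return wChannels * ((wBitsPerSample + 7) / 8);
}

/* dwAvgBytesPerSec for uncompressed data. */
unsigned long wave_avg_bytes_per_sec(unsigned long dwSamplesPerSec,
                                     unsigned wBlockAlign)
{
    return dwSamplesPerSec * wBlockAlign;
}
```

A 16-bit stereo wave at 44100 Hz therefore has a wBlockAlign of 4 and a dwAvgBytesPerSec of 176400.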

The wBitsPerSample field indicates the bit resolution of a sample point (ie, a 16-bit waveform would have wBitsPerSample = 16).

One, and only one, Format Chunk is required in every WAVE.

Data Chunk

The Data (data) chunk contains the actual sample frames (ie, all channels of waveform data).

#define DataID 'data'  /* chunk ID for data Chunk */

typedef struct {
 ID             chunkID;
 long           chunkSize;

 unsigned char  waveformData[];
} DataChunk;

The ID is always data. chunkSize is the number of bytes in the chunk, not counting the 8 bytes used by ID and Size fields nor any possible pad byte needed to make the chunk an even size (ie, chunkSize is the number of remaining bytes in the chunk after the chunkSize field, not counting any trailing pad byte).

Remember that the bit resolution, and other information is gotten from the Format chunk.

The following discussion assumes uncompressed data.

The waveformData array contains the actual waveform data. The data is arranged into what are called sample frames. For more information on the arrangement of data, see "Sample Points and Sample Frames".

You can determine how many bytes of actual waveform data there are from the Data chunk's chunkSize field. The number of sample frames in waveformData is determined by dividing this chunkSize by the Format chunk's wBlockAlign.
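The division described above is trivial, but a reader should guard against a zero wBlockAlign in a damaged file. A minimal sketch, with a hypothetical function name:

```c
/* Number of sample frames in an uncompressed Data chunk.
 * Returns 0 if wBlockAlign is 0, which indicates an invalid Format chunk. */
unsigned long wave_frame_count(unsigned long dataChunkSize,
                               unsigned wBlockAlign)
{
    return wBlockAlign ? dataChunkSize / wBlockAlign : 0;
}
```

Dividing the result by dwSamplesPerSec gives the playing time in seconds.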

The Data Chunk is required. One, and only one, Data Chunk may appear in a WAVE.

Another Way of Storing Waveform Data

So, you're thinking "This WAVE format isn't that bad. It seems to make sense and there aren't all that many inconsistencies, duplications, and inefficiencies". You fool! We're just getting started with our first excursion into unnecessary inconsistencies, duplications, and inefficiency.

Sure, countless brain-damaged programmers have inflicted literally dozens of compressed data formats upon the Data chunk, but apparently someone felt that even this wasn't enough to make your life difficult in trying to support WAVE files. No, some half-wit decided that it would be a good idea to screw around with storing waveform data in something other than one Data chunk. NOOOOOOOOOOOOOO!!!!!!

For some god-forsaken reason, someone came up with the idea of using an imbedded IFF List inside of the WAVE file. NOOOOOOOOOOOOOOOOO!!!!!!!! And this "Wave List" would contain multiple 'data' and 'slnt' chunks. NOOOOOOOOOOOOOOOO!!!! The Type ID for this List is 'wavl'.

I strongly suggest that you refuse to support any WAVE file that exhibits this Wave List nonsense. There's no need for it, and hopefully, the misguided programmer who conjured it up will be embarrassed into hanging his head in shame when nobody agrees to support his foolishness. Just say "NOOOOOOOOOOOOOO!!!!"

Cue Chunk

The Cue chunk contains one or more "cue points" or "markers". Each cue point references a specific offset within the waveformData array, and has its own CuePoint structure within this chunk.

In conjunction with the Playlist chunk, the Cue chunk can be used to store looping information.

CuePoint Structure

typedef struct {
 long    dwIdentifier;
 long    dwPosition;
 ID      fccChunk;
 long    dwChunkStart;
 long    dwBlockStart;
 long    dwSampleOffset;
} CuePoint;

The dwIdentifier field contains a unique number (ie, different than the ID number of any other CuePoint structure). This is used to associate a CuePoint structure with other structures used in other chunks which will be described later.

The dwPosition field specifies the position of the cue point within the "play order" (as determined by the Playlist chunk. See that chunk for a discussion of the play order).

The fccChunk field specifies the chunk ID of the Data or Wave List chunk which actually contains the waveform data to which this CuePoint refers. If there is only one Data chunk in the file, then this field is set to the ID 'data'. On the other hand, if the file contains a Wave List (which can contain both 'data' and 'slnt' chunks), then fccChunk will specify 'data' or 'slnt' depending upon in which type of chunk the referenced waveform data is found.

The dwChunkStart and dwBlockStart fields are set to 0 for an uncompressed WAVE file that contains one 'data' chunk. These fields are used only for WAVE files that contain a Wave List (with multiple 'data' and 'slnt' chunks), or for a compressed file containing a 'data' chunk. (Actually, in the latter case, dwChunkStart is also set to 0, and only dwBlockStart is used). Again, I want to emphasize that you can avoid all of this unnecessary crap if you avoid hassling with compressed files, or Wave Lists, and instead stick to the sensible basics.

The dwChunkStart field specifies the byte offset of the start of the 'data' or 'slnt' chunk which actually contains the waveform data to which this CuePoint refers. This offset is relative to the start of the first chunk within the Wave List. (ie, It's the byte offset, within the Wave List, of where the 'data' or 'slnt' chunk of interest appears. The first chunk within the List would be at an offset of 0).

The dwBlockStart field specifies the byte offset of the start of the block containing the position. This offset is relative to the start of the waveform data within the 'data' or 'slnt' chunk.

The dwSampleOffset field specifies the sample offset of the cue point relative to the start of the block. In an uncompressed file, this equates to simply being the offset within the waveformData array. Unfortunately, the WAVE documentation is much too ambiguous, and doesn't define what it means by the term "sample offset". This could mean a byte offset, or it could mean counting the sample points (for example, in a 16-bit wave, every 2 bytes would be 1 sample point), or it could even mean sample frames (as the loop offsets in AIFF are specified). Who knows? The guy who conjured up the Cue chunk certainly isn't saying. I'm assuming that it's a byte offset, like the above 2 fields.

Cue Chunk

#define CueID 'cue '  /* chunk ID for Cue Chunk */

typedef struct {
 ID        chunkID;
 long      chunkSize;

 long      dwCuePoints;
 CuePoint  points[];
} CueChunk;

The ID is always 'cue ' (note the space at the end of this ID). chunkSize is the number of bytes in the chunk, not counting the 8 bytes used by ID and Size fields.

The dwCuePoints field is the number of CuePoint structures in the Cue Chunk. If dwCuePoints is not 0, it is followed by that many CuePoint structures, one after the other. Because all fields in a CuePoint structure are an even number of bytes, the length of any CuePoint will always be even. Thus, CuePoints are packed together with no unused bytes between them. The CuePoints need not be placed in any particular order.
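Since every CuePoint is six 4-byte fields (24 bytes) packed end to end, the n-th entry can be located by simple arithmetic. The following sketch decodes one entry from the chunk body; `CuePointInfo` and the function names are hypothetical.

```c
#include <string.h>

#define CUE_POINT_SIZE 24UL  /* six 4-byte fields */

/* Hypothetical in-memory form of a CuePoint. */
typedef struct {
    unsigned long dwIdentifier;
    unsigned long dwPosition;
    char          fccChunk[4];
    unsigned long dwChunkStart;
    unsigned long dwBlockStart;
    unsigned long dwSampleOffset;
} CuePointInfo;

static unsigned long cue_rd32(const unsigned char *p)
{
    return (unsigned long)p[0] | ((unsigned long)p[1] << 8) |
           ((unsigned long)p[2] << 16) | ((unsigned long)p[3] << 24);
}

/* Decode cue point `index` from the body of a Cue chunk (the bytes after
 * chunkSize). Returns 0 on success, -1 if the index is out of range or
 * the chunk is too small to hold that entry. */
int wave_get_cue_point(const unsigned char *body, unsigned long chunkSize,
                       unsigned long index, CuePointInfo *cp)
{
    const unsigned char *p;

    if (chunkSize < 4 || index >= cue_rd32(body))
        return -1;
    if ((chunkSize - 4) / CUE_POINT_SIZE <= index)
        return -1;

    p = body + 4 + index * CUE_POINT_SIZE;
    cp->dwIdentifier   = cue_rd32(p);
    cp->dwPosition     = cue_rd32(p + 4);
    memcpy(cp->fccChunk, p + 8, 4);
    cp->dwChunkStart   = cue_rd32(p + 12);
    cp->dwBlockStart   = cue_rd32(p + 16);
    cp->dwSampleOffset = cue_rd32(p + 20);
    return 0;
}
```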

The Cue chunk is optional. No more than one Cue chunk can appear in a WAVE.

Playlist chunk

The Playlist (plst) chunk specifies a play order for a series of cue points. The Cue chunk contains all of the cue points, but the Playlist chunk determines how those cue points are used when playing back the waveform (ie, which cue points represent looped sections, and in what order those loops are "played"). The Playlist chunk contains one or more Segment structures, each of which identifies a looped section of the waveform (in conjunction with the CuePoint structure with which it is associated).

Segment Structure

typedef struct {
 long    dwIdentifier;
 long    dwLength;
 long    dwRepeats;
} Segment;

The dwIdentifier field contains a unique number (ie, different than the ID number of any other Segment structure). This field should correspond with the dwIdentifier field of some CuePoint stored in the Cue chunk. In other words, this Segment structure contains the looping information associated with that CuePoint structure with the same ID number.

The dwLength field specifies the length of the section in samples (ie, the length of the looped section). Note that the start position of the loop would be the dwSampleOffset of the referenced CuePoint structure in the Cue chunk. (Or, you may need to hassle with the dwChunkStart and dwBlockStart fields as well if dealing with a Wave List or compressed data).

The dwRepeats field specifies the number of times to play the loop. I assume that a value of 1 means to repeat this loop once only, but the WAVE documentation is very incomplete and omits this important information. I have no idea how you would specify an infinitely repeating loop. Certainly, the person who conjured up the Playlist chunk appears to have no idea whatsoever. Due to the ambiguities, inconsistencies, inefficiencies, and omissions of the Cue and Playlist chunks, I very much recommend that you use the Sampler chunk (described later) to replace them.

Playlist chunk

#define PlaylistID 'plst'  /* chunk ID for Playlist Chunk */

typedef struct {
 ID        chunkID;
 long      chunkSize;

 long      dwSegments;
 Segment   Segments[];
} PlaylistChunk;

The ID is always plst. chunkSize is the number of bytes in the chunk, not counting the 8 bytes used by ID and Size fields.

The dwSegments field is the number of Segment structures in the Playlist Chunk. If dwSegments is not 0, it is followed by that many Segment structures, one after the other. Because all fields in a Segment structure are an even number of bytes, the length of any Segment will always be even. Thus, Segments are packed together with no unused bytes between them. The Segments need not be placed in any particular order.

Associated Data List

The Associated Data List contains text "labels" or "names" that are associated with the CuePoint structures in the Cue chunk. In other words, this list contains the text labels for those CuePoints.

Again, we're talking about another imbedded IFF List within the WAVE file. NOOOOOOOOOOOOOO!!!! What's a List? A List is simply a "master chunk" that contains several "sub-chunks". Just like with any other chunk, the "master chunk" has an ID and chunkSize, but inside of this chunk are sub-chunks, each with its own ID and chunkSize. Of course, the chunkSize for the master chunk (ie, List) includes the size of all of these sub-chunks (including their ID and chunkSize fields).

The "Type ID" for the Associated Data List is "adtl". Remember that an IFF list header has 3 fields:

typedef struct {
 ID      listID;      /* 'LIST' */
 long    chunkSize;   /* includes the Type ID below */
 ID      typeID;      /* 'adtl' */
} ListHeader;

There are several sub-chunks that may be found inside of the Associated Data List. The ones that are important to WAVE format have IDs of "labl", "note", or "ltxt". Ignore the rest. Here are those 3 sub-chunks and their fields:

The Associated Data List is optional. The WAVE documentation doesn't specify if more than one can be contained in a WAVE file.

Label Chunk

#define LabelID 'labl'  /* chunk ID for Label Chunk */

typedef struct {
 ID      chunkID;
 long    chunkSize;

 long    dwIdentifier;
 char    dwText[];
} LabelChunk;

The ID is always labl. chunkSize is the number of bytes in the chunk, not counting the 8 bytes used by ID and Size fields nor any possible pad byte needed to make the chunk an even size (ie, chunkSize is the number of remaining bytes in the chunk after the chunkSize field, not counting any trailing pad byte).

The dwIdentifier field contains a unique number (ie, different than the ID number of any other Label chunk). This field should correspond with the dwIdentifier field of some CuePoint stored in the Cue chunk. In other words, this Label chunk contains the text label associated with that CuePoint structure with the same ID number.

The dwText array contains the text label. It should be a null-terminated string. (The null byte is included in the chunkSize, therefore the length of the string, including the null byte, is chunkSize - 4).
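The chunkSize - 4 relationship can be used to validate a label before treating it as a C string. A minimal sketch with a hypothetical function name:

```c
#include <stddef.h>

/* Given the body of a Label chunk (the bytes after chunkSize), return a
 * pointer to its text and store the cue point ID in *id. Returns NULL if
 * the chunk is too small or the text is not null-terminated. */
const char *wave_label_text(const unsigned char *body,
                            unsigned long chunkSize, unsigned long *id)
{
    unsigned long textLen;

    if (chunkSize < 5)
        return NULL;                   /* need the ID plus at least a null */
    textLen = chunkSize - 4;           /* includes the null byte */
    if (body[4 + textLen - 1] != '\0')
        return NULL;

    *id = (unsigned long)body[0] | ((unsigned long)body[1] << 8) |
          ((unsigned long)body[2] << 16) | ((unsigned long)body[3] << 24);
    return (const char *)(body + 4);
}
```

The same routine would work for a Note chunk, since it has the same fields.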

Note Chunk

#define NoteID 'note'  /* chunk ID for Note Chunk */

typedef struct {
 ID      chunkID;
 long    chunkSize;

 long    dwIdentifier;
 char    dwText[];
} NoteChunk;

The Note chunk, whose ID is note, is otherwise exactly the same as the Label chunk (ie, same fields). See what I mean about pointless duplication? But, in theory, a Note chunk contains a "comment" about a CuePoint, whereas the Label chunk is supposed to contain the actual CuePoint label. So, it's possible that you'll find both a Note and Label for a specific CuePoint, each containing different text.

Labeled Text Chunk

#define LabelTextID 'ltxt'  /* chunk ID for Labeled Text Chunk */

typedef struct {
 ID      chunkID;
 long    chunkSize;

 long    dwIdentifier;
 long    dwSampleLength;
 long    dwPurpose;
 short   wCountry;
 short   wLanguage;
 short   wDialect;
 short   wCodePage;
 char    dwText[];
} LabelTextChunk;

The ID is always ltxt. chunkSize is the number of bytes in the chunk, not counting the 8 bytes used by ID and Size fields nor any possible pad byte needed to make the chunk an even size (ie, chunkSize is the number of remaining bytes in the chunk after the chunkSize field, not counting any trailing pad byte).

The dwIdentifier field is the same as the Label chunk.

The dwSampleLength field specifies the number of sample points in the segment of waveform data. In other words, a Labeled Text chunk contains a label for a section of the waveform data, not just a specific point, for example the looped section of a waveform.

The dwPurpose field specifies the type or purpose of the text. For example, dwPurpose can contain an ID like "scrp" for script text or "capt" for closed-caption text. How is this related to waveform data? Well, it isn't really. It's just that Associated Data Lists are used in other file formats, so they contain generic fields that sometimes don't have much relevance to waveform data.

The wCountry, wLanguage, wDialect, and wCodePage fields specify the country code, language, dialect, and code page for the text. An application typically queries these values from the operating system.

Sampler Chunk

The Sampler (smpl) Chunk defines basic parameters that an instrument, such as a MIDI sampler, could use to play the waveform data. Most importantly, it includes information about looping the waveform (ie, during playback, to "sustain" the waveform). Of course, as you've come to expect from the WAVE file format, it duplicates some of the information that can be found in the Cue and Playlist chunks, but fortunately, in a more sensible, consistent, better-documented way.

#define SamplerID 'smpl'  /* chunk ID for Sampler Chunk */

typedef struct {
 ID             chunkID;
 long           chunkSize;

 long           dwManufacturer;
 long           dwProduct;
 long           dwSamplePeriod;
 long           dwMIDIUnityNote;
 long           dwMIDIPitchFraction;
 long           dwSMPTEFormat;
 long           dwSMPTEOffset;
 long           cSampleLoops;
 long           cbSamplerData;
 struct SampleLoop Loops[];
} SamplerChunk;

The ID is always smpl. chunkSize is the number of bytes in the chunk, not counting the 8 bytes used by ID and Size fields nor any possible pad byte needed to make the chunk an even size (ie, chunkSize is the number of remaining bytes in the chunk after the chunkSize field, not counting any trailing pad byte).

The dwManufacturer field contains the MMA Manufacturer code for the intended sampler. Each manufacturer of MIDI products has its own ID, assigned by the MIDI Manufacturers Association. See the MIDI Specification (under System Exclusive) for a listing of current Manufacturer IDs. The high byte of dwManufacturer indicates the number of low order bytes (1 or 3) that are valid for the manufacturer code. For example, this value will be 0x01000013 for Digidesign (the MMA Manufacturer code is one byte, 0x13), whereas 0x03000041 identifies Microsoft (the MMA Manufacturer code is three bytes, 0x00 0x00 0x41). If the WAVE is not intended for a specific manufacturer, then this field should be set to 0.
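The split between the count in the high byte and the code in the low bytes can be sketched as follows. This is only an illustration: the MmaCode type and decode_manufacturer function are hypothetical names, not part of any WAVE header file.

```c
#include <stdint.h>

/* Hypothetical result type: how many code bytes are valid, and the bytes. */
typedef struct {
    int     length;   /* 0 (none), 1 or 3 valid code bytes */
    uint8_t code[3];  /* code bytes, most significant first */
} MmaCode;

/* Split dwManufacturer into the byte count (high byte) and the code bytes. */
MmaCode decode_manufacturer(uint32_t dwManufacturer)
{
    MmaCode m = { 0, { 0, 0, 0 } };
    int count = (int)(dwManufacturer >> 24);

    if (count == 1) {
        m.length  = 1;
        m.code[0] = (uint8_t)(dwManufacturer & 0xFF);
    } else if (count == 3) {
        m.length  = 3;
        m.code[0] = (uint8_t)((dwManufacturer >> 16) & 0xFF);
        m.code[1] = (uint8_t)((dwManufacturer >> 8) & 0xFF);
        m.code[2] = (uint8_t)(dwManufacturer & 0xFF);
    }
    /* length stays 0 for "no specific manufacturer" or an unexpected count */
    return m;
}
```

With the two values quoted above, 0x01000013 decodes to the single byte 0x13 and 0x03000041 decodes to the three bytes 0x00 0x00 0x41.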

The dwProduct field contains the Product code (ie, model ID) of the intended sampler for the dwManufacturer. Contact the manufacturer of the sampler to ascertain the sampler's model ID. If the WAVE is not intended for a specific manufacturer's product, then this field should be set to 0.

The dwSamplePeriod field specifies the period of one sample in nanoseconds. This is normally 1/nSamplesPerSec (from the Format chunk) expressed in nanoseconds, but note that this field allows finer tuning than nSamplesPerSec. For example, 44.1 kHz would be specified as 22675 (0x00005893).
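As a rough sketch, the usual value can be derived from the sample rate with one integer division. The function name sample_period_ns is made up for this example, and truncating (rather than rounding) is an assumption chosen because it reproduces the 22675 figure given above.

```c
#include <stdint.h>

/* Period of one sample in nanoseconds, truncated to a whole number. */
uint32_t sample_period_ns(uint32_t samples_per_sec)
{
    if (samples_per_sec == 0)
        return 0;                          /* avoid dividing by zero */
    return 1000000000u / samples_per_sec;  /* 1 second = 1,000,000,000 ns */
}
```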

The dwMIDIUnityNote field is the MIDI note number at which the instrument plays back the waveform data without pitch modification (ie, at the same sample rate that was used when the waveform was created). This value ranges 0 through 127, inclusive. Middle C is 60.

The dwMIDIPitchFraction field specifies the fraction of a semitone up from the specified dwMIDIUnityNote. A value of 0x80000000 is 1/2 semitone (50 cents); a value of 0x00000000 represents no fine tuning between semitones.
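Since the full 32-bit range covers one semitone (100 cents), the field can be converted to cents by scaling. A minimal sketch, with pitch_fraction_to_cents as a hypothetical helper name:

```c
#include <stdint.h>

/* Convert dwMIDIPitchFraction to cents above dwMIDIUnityNote.
   The 32-bit value is treated as a fraction of 2^32 of one semitone. */
double pitch_fraction_to_cents(uint32_t dwMIDIPitchFraction)
{
    return (double)dwMIDIPitchFraction / 4294967296.0 * 100.0;
}
```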

The dwSMPTEFormat field specifies the SMPTE time format used in the dwSMPTEOffset field. Possible values are:

0  = no SMPTE offset (dwSMPTEOffset should also be 0)
24 = 24 frames per second
25 = 25 frames per second
29 = 30 frames per second with frame dropping ('30 drop')
30 = 30 frames per second

The dwSMPTEOffset field specifies a time offset for the sample if it is to be synchronized or calibrated according to a start time other than 0. The format of this value is 0xhhmmssff. hh is a signed Hours value [-23..23]. mm is an unsigned Minutes value [0..59]. ss is an unsigned Seconds value [0..59]. ff is an unsigned Frames value [0..(dwSMPTEFormat - 1)].
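The 0xhhmmssff packing can be taken apart with shifts and masks. The sketch below uses hypothetical names (SmpteOffset, decode_smpte_offset) and assumes the signed hours byte is stored in two's complement, which the text above does not actually state.

```c
#include <stdint.h>

typedef struct {
    int hours;    /* signed */
    int minutes;
    int seconds;
    int frames;
} SmpteOffset;

/* Unpack a dwSMPTEOffset value of the form 0xhhmmssff. */
SmpteOffset decode_smpte_offset(uint32_t dwSMPTEOffset)
{
    SmpteOffset t;
    int hh = (int)((dwSMPTEOffset >> 24) & 0xFF);

    t.hours   = (hh < 128) ? hh : hh - 256;  /* assumption: two's complement byte */
    t.minutes = (int)((dwSMPTEOffset >> 16) & 0xFF);
    t.seconds = (int)((dwSMPTEOffset >> 8) & 0xFF);
    t.frames  = (int)(dwSMPTEOffset & 0xFF);
    return t;
}
```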

The cSampleLoops field is the number (count) of SampleLoop structures that are appended to this chunk. These structures immediately follow the cbSamplerData field. This field will be 0 if there are no SampleLoop structures.

The cbSamplerData field specifies the size (in bytes) of any optional fields that an application wishes to append to this chunk. An application which needs to save additional information (ie, beyond the above fields) may append additional fields to the end of this chunk, after all of the SampleLoop structures. These additional fields are also reflected in the chunkSize, and remember that the chunk should be padded out to an even number of bytes. The cbSamplerData field will be 0 if no additional information is appended to the chunk.

Any SampleLoop structures follow the above fields. Each SampleLoop structure defines one loop (ie, the start and end points of the loop, and how many times it plays). After the SampleLoop structures comes any additional, proprietary sampler information that an application chooses to store.

SampleLoop Structure

typedef struct {
  long  dwIdentifier;
  long  dwType;
  long  dwStart;
  long  dwEnd;
  long  dwFraction;
  long  dwPlayCount;
} SampleLoop;

The dwIdentifier field contains a unique number (ie, different from the ID number of any other SampleLoop structure). This field may correspond with the dwIdentifier field of some CuePoint stored in the Cue chunk. In other words, the CuePoint structure which has the same ID number would be considered to be describing the same loop as this SampleLoop structure. Furthermore, this field corresponds to the dwIdentifier field of any label stored in the Associated Data List. In other words, the text string (within some chunk in the Associated Data List) which has the same ID number would be considered to be this loop's "name" or "label".

The dwType field is the loop type (ie, how the loop plays back), as follows:

0 - Loop forward (normal)
1 - Alternating loop (forward/backward)
2 - Loop backward
3-31 - reserved for future standard types
32-? - sampler specific types (manufacturer defined)

The dwStart field specifies the startpoint of the loop. In other words, it's the byte offset from the start of waveformData[], where an offset of 0 would be at the start of the waveformData[] array (ie, the loop start is at the very first sample point).

The dwEnd field specifies the endpoint of the loop (ie, a byte offset).

The dwFraction field allows fine-tuning for loop fractional areas between samples. Values range from 0x00000000 to 0xFFFFFFFF. A value of 0x80000000 represents 1/2 of a sample length.

The dwPlayCount field is the number of times to play the loop. A value of 0 specifies an infinite sustain loop (ie, the wave keeps looping until some external force interrupts playback, such as the musician releasing the key that triggered that wave's playback).
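Putting the pieces together, the chunkSize of a Sampler chunk follows directly from cSampleLoops and cbSamplerData. The sketch below assumes every "long" in the structures above is 4 bytes on disk, so the nine fixed fields take 36 bytes and each SampleLoop takes 24 bytes. The function names are made up for illustration.

```c
#include <stdint.h>

/* chunkSize as stored in the file: fixed fields + loops + extra data,
   not counting the 8-byte ID/size header or any pad byte. */
uint32_t smpl_chunk_size(uint32_t cSampleLoops, uint32_t cbSamplerData)
{
    return 36u + cSampleLoops * 24u + cbSamplerData;
}

/* Total bytes the chunk occupies in the file, including the 8-byte
   header and the pad byte that keeps the chunk an even size. */
uint32_t smpl_chunk_bytes_in_file(uint32_t cSampleLoops, uint32_t cbSamplerData)
{
    uint32_t size = smpl_chunk_size(cSampleLoops, cbSamplerData);
    return 8u + size + (size & 1u);
}
```

A chunk with one loop and no extra data therefore has a chunkSize of 60.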

The Sampler Chunk is optional. I don't know whether there is a limit of one per WAVE file. I don't see why there should be such a limit, since after all, an application may need to deal with several MIDI samplers.

The Instrument Chunk Format

The Instrument Chunk contains some of the same type of information as the Sampler chunk. So what else is new?

#define InstrumentID 'inst'  /* chunkID for Instruments Chunk */

typedef struct {
  ID     chunkID;
  long   chunkSize;

  unsigned char UnshiftedNote;
  char          FineTune;
  char          Gain;
  unsigned char LowNote;
  unsigned char HighNote;
  unsigned char LowVelocity;
  unsigned char HighVelocity;
} InstrumentChunk;

The ID is always inst. chunkSize should always be 7 since there are no fields of variable length.

The UnshiftedNote field is the same as the Sampler chunk's dwMIDIUnityNote field.

The FineTune field determines how much the instrument should alter the pitch of the sound when it is played back. Units are in cents (1/100 of a semitone) and range from -50 to +50. Negative numbers mean that the pitch of the sound should be lowered, while positive numbers mean that it should be raised. Although the unit of measurement differs, this field serves the same purpose as the Sampler chunk's dwMIDIPitchFraction field.

The Gain field is the amount by which to change the gain of the sound when it is played. Units are decibels. For example, 0 dB means no change, 6 dB means double the value of each sample point (ie, every additional 6 dB doubles the gain), while -6 dB means halve the value of each sample point.

The LowNote and HighNote fields specify the suggested MIDI note range on a keyboard for playback of the waveform data. The waveform data should be played if the instrument is requested to play a note between the low and high note numbers, inclusive. The UnshiftedNote does not have to be within this range.

The LowVelocity and HighVelocity fields specify the suggested range of MIDI velocities for playback of the waveform data. The waveform data should be played if the note-on velocity is between low and high velocity, inclusive. The range is 1 (lowest velocity) through 127 (highest velocity), inclusive.
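The two inclusive range checks can be combined into one test. This is a minimal sketch: InstrumentFields simply mirrors the seven data bytes listed above, and instrument_should_play is a hypothetical helper name.

```c
/* The seven data bytes of the Instrument chunk, in file order. */
typedef struct {
    unsigned char UnshiftedNote;
    signed char   FineTune;
    signed char   Gain;
    unsigned char LowNote;
    unsigned char HighNote;
    unsigned char LowVelocity;
    unsigned char HighVelocity;
} InstrumentFields;

/* Nonzero if a note-on with this note and velocity falls inside both
   suggested ranges (both ends inclusive). */
int instrument_should_play(const InstrumentFields *inst, int note, int velocity)
{
    return note >= inst->LowNote && note <= inst->HighNote &&
           velocity >= inst->LowVelocity && velocity <= inst->HighVelocity;
}
```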

The Instrument Chunk is optional. No more than 1 Instrument Chunk can appear in one WAVE.

Windows Resource (.RES) Files

A Windows resource file (.RES) contains a series of packed resource entries with no other structure, that is, no headers, footers, padding, etc. The format of a resource entry is different for Windows 3.x (16-bit) and Win32, that is, Windows 95 and Windows NT/XP/Vista (32-bit).

32-bit

A 32-bit .RES file starts with an empty resource entry of 32 bytes:
00000000 20000000 FFFF0000 FFFF0000 00000000 00000000 00000000 00000000

After that come the real resource entries, packed into the file with no padding or other structure -- just a series of resource entries.

Each resource entry has a header followed immediately by the resource data. Immediately after the data for one entry comes the header for the next entry. Each header has the following format:

Field	Size (bytes)	Description
Data Size	4	Size of the resource data that follow the header
Header Size	4	Size of the resource header (always at least 16)
Type	variable	Resource type
Name	variable	Resource name or identifier
Data Version	4	Version number for resource data format, usually 0
Flags	2	Most flags are for backward compatibility with Win16. Discardable (1000₁₆) is the only Win32 flag.
Language	2	Primary and secondary language identifiers. Zero for language-neutral, or look up your Windows documentation for a full list of identifiers. Form a language identifier from a primary and sublanguage as follows: (sublanguage << 10 | primary).
Version	4	Version number for the resource entry
Characteristics	4	Anything you want
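The language identifier formula from the table can be written out as below. The function names are hypothetical, and the split into a 10-bit primary part and a 6-bit sublanguage part is inferred from the shift by 10 in a 2-byte field.

```c
#include <stdint.h>

/* Build a language identifier: sublanguage in the top 6 bits,
   primary language in the low 10 bits. */
uint16_t make_language_id(uint16_t primary, uint16_t sublanguage)
{
    return (uint16_t)((sublanguage << 10) | primary);
}

/* The reverse operations. */
uint16_t primary_language(uint16_t lang_id) { return (uint16_t)(lang_id & 0x03FF); }
uint16_t sub_language(uint16_t lang_id)     { return (uint16_t)(lang_id >> 10); }
```

For example, make_language_id(0x09, 0x01) yields 0x0409, the identifier Windows uses for US English.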

The type and name can be numeric or textual. If the first two bytes are FFFF₁₆, the subsequent two bytes are the numeric value. Otherwise, the first two bytes are the first Unicode character in a zero-terminated string.
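A reader for one Type or Name field might look like the sketch below. It is not taken from any Windows header: read_res_id32 is a hypothetical name, and the code assumes little-endian 16-bit values, as on the platforms that produce these files.

```c
#include <stddef.h>
#include <stdint.h>

/* Read one Type or Name field from a 32-bit resource header.
   Returns the number of bytes consumed, or 0 if the buffer is too short.
   For numeric fields *is_numeric is 1 and *id holds the value; for
   strings *is_numeric is 0 and the string starts at p. */
size_t read_res_id32(const uint8_t *p, size_t avail, int *is_numeric, uint16_t *id)
{
    size_t i;

    if (avail < 2)
        return 0;

    if (p[0] == 0xFF && p[1] == 0xFF) {
        if (avail < 4)
            return 0;
        *is_numeric = 1;
        *id = (uint16_t)(p[2] | (p[3] << 8));
        return 4;
    }

    /* Textual: scan 16-bit characters up to and including the terminator. */
    *is_numeric = 0;
    *id = 0;
    for (i = 0; i + 1 < avail; i += 2) {
        if (p[i] == 0 && p[i + 1] == 0)
            return i + 2;
    }
    return 0;   /* no terminator found */
}
```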

16-bit

Each resource entry has a header followed immediately by the resource data. Immediately after the data for one entry comes the header for the next entry. Each header has the following format:

Field	Size	Description
Type	variable	Resource type
Name	variable	Resource name or identifier
Flags	2	Discardable=1000₁₆, Moveable=0010₁₆, Pure=0020₁₆, Preload=0040₁₆
Size	4	Size of the resource data that immediately follow the header

The type and name can be numeric or textual. If the first byte is FF₁₆, the subsequent two bytes are the numeric value. Otherwise, the first byte is the first character of the ANSI string.

Resource types

Windows reserves numeric resource types under 256 for its own use. In this range are several predefined resource types:

Type	Value	Description
RT_CURSOR	1	Cursor image (one entry in a cursor group)
RT_BITMAP	2	Bitmap (Windows or OS/2 BMP format)
RT_ICON	3	Icon image (one entry in an icon group)
RT_MENU	4	Menu
RT_DIALOG	5	Dialog box
RT_STRING	6	String table (must have numeric identifier, not textual)
RT_FONTDIR	7	Font directory
RT_FONT	8	Font entry
RT_ACCELERATOR	9	Keyboard accelerator table
RT_RCDATA	10	Application-defined data
RT_GROUP_CURSOR	12	Group header for a cursor
RT_GROUP_ICON	14	Group header for an icon

Win32 defines additional resource types:

Type	Value	Description
RT_MESSAGETABLE	11	Message table
RT_VERSION	16	Version information
RT_DLGINCLUDE	17	Dialog include
RT_PLUGPLAY	19	Plug and play
RT_VXD	20	VxD
RT_ANICURSOR	21	Animated cursor

Windows .SCR Screen Savers

SCR files are nothing more complex than .EXE files with the extension SCR. Windows calls the .SCR file with two command-line options:

    /s  to launch the screensaver
    /c  to configure the screensaver

For the Windows Control Panel to recognise the screensaver, the program's module description string must begin with SCRNSAVE: (in uppercase). So, if writing a Visual Basic screensaver, simply set the application title to something like "SCRNSAVE:Test Screensaver".

To create a new screen saver simply write a program that checks the command-line option when starting and performs the appropriate action. The display should use a full-screen window (usually with a black background) and should end when any key is pressed or when the mouse is moved.
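The command-line check might be sketched like this. The enum and function names are invented for the example. Accepting "-" as well as "/" and ignoring letter case are assumptions made for robustness, not something stated above.

```c
#include <ctype.h>
#include <stddef.h>

typedef enum { SCR_MODE_UNKNOWN, SCR_MODE_RUN, SCR_MODE_CONFIGURE } ScrMode;

/* Map the first command-line option to the action the program should take. */
ScrMode parse_scr_option(const char *arg)
{
    if (arg == NULL || (arg[0] != '/' && arg[0] != '-'))
        return SCR_MODE_UNKNOWN;

    switch (tolower((unsigned char)arg[1])) {
    case 's': return SCR_MODE_RUN;        /* /s: launch the screensaver */
    case 'c': return SCR_MODE_CONFIGURE;  /* /c: show the configuration */
    default:  return SCR_MODE_UNKNOWN;
    }
}
```

A program's main function would call parse_scr_option(argc > 1 ? argv[1] : NULL) and then either open its full-screen window or show its settings dialog.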

When the program is compiled, rename the .EXE to .SCR and put it into the Windows directory so it can be found by the screensaver selection dialog in Windows.

CDA Music Tracks File Format

CDA files are generally RIFF files. The RIFF ID of a .CDA file is "CDDA" (43h, 44h, 44h, 41h). They contain only one data block, called "fmt " (66h, 6Dh, 74h, 20h). In the current version of the .CDA file, this block is 24 bytes long. Here is its structure:

Offset	Length	Description
00h	02h	CDA file version. Currently equals 1. If it has another value, the following data may be out of date.
02h	02h	Track number.
04h	04h	CD disc serial number (the one stored in CDPLAYER.INI)
08h	04h	Beginning of the track in HSG format.
0Ch	04h	Length of the track in HSG format.
10h	04h	Beginning of the track in Red-Book format.
14h	04h	Length of the track in Red-Book format.

As you can see, time is represented in two formats: HSG and Red-Book. HSG can be calculated as follows:

time = minute * 4500 + second * 75 + frame

Red-Book is much easier to use, because it contains minutes, seconds and frames in unmodified form, byte-packed:

Offset	Length	Description
00h	01h	Frame
01h	01h	Second
02h	01h	Minute
03h	01h	not used
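Both representations can be handled with a few lines of code. In this sketch the Msf type and the two function names are hypothetical, and the Red-Book field is assumed to have been read as a little-endian 32-bit value, so the byte at offset 00h ends up in the low 8 bits.

```c
#include <stdint.h>

typedef struct { uint8_t minute, second, frame; } Msf;

/* Unpack a Red-Book field: frame, second, minute in ascending byte order. */
Msf unpack_red_book(uint32_t value)
{
    Msf t;
    t.frame  = (uint8_t)(value & 0xFF);          /* offset 00h */
    t.second = (uint8_t)((value >> 8) & 0xFF);   /* offset 01h */
    t.minute = (uint8_t)((value >> 16) & 0xFF);  /* offset 02h */
    return t;
}

/* HSG value from minutes, seconds and frames (75 frames per second). */
uint32_t msf_to_hsg(Msf t)
{
    return t.minute * 4500u + t.second * 75u + t.frame;
}
```

For a length of 5 minutes, 16 seconds and 32 frames, msf_to_hsg gives 23732 (5CB4h).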

Now, I'll show you an example file. The first part is a hex dump of the file, and the second is an explanation of the fields.


52 49 46 46 24 00 00 00  43 44 44 41 66 6D 74 20  RIFF$...CDDAfmt 
18 00 00 00 01 00 04 00  B8 24 F6 00 F7 11 01 00  .........$......
B4 5C 00 00 0A 25 0F 00  20 10 05 00              .\...%.. ...

01 00       - first version of CDA file :)
04 00       - fourth track
B8 24 F6 00 - serial number of CD in CDPLAYER.INI [F623B8]

F7 11 01 00 - beginning of track in HSG format
B4 5C 00 00 - length of track in HSG format

0A 25 0F 00 - beginning of track in Red-Book format (15:37)
20 10 05 00 - length of track in Red-Book format (05:16)
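Reading these fields from a 44-byte .CDA file could be sketched as below. CdaInfo and parse_cda are hypothetical names. The code assumes all multi-byte values are little-endian, as in the dump, and does only minimal validation.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct {
    uint16_t version, track;
    uint32_t serial, hsg_start, hsg_length, rb_start, rb_length;
} CdaInfo;

static uint16_t le16(const uint8_t *p) { return (uint16_t)(p[0] | (p[1] << 8)); }

static uint32_t le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 1 and fills *out on success, 0 if the data is not a CDA file. */
int parse_cda(const uint8_t *buf, size_t len, CdaInfo *out)
{
    const uint8_t *f;

    if (len < 44 ||
        memcmp(buf, "RIFF", 4) != 0 ||
        memcmp(buf + 8, "CDDA", 4) != 0 ||
        memcmp(buf + 12, "fmt ", 4) != 0 ||
        le32(buf + 16) < 24)
        return 0;

    f = buf + 20;                    /* start of the 24-byte "fmt " data */
    out->version    = le16(f + 0x00);
    out->track      = le16(f + 0x02);
    out->serial     = le32(f + 0x04);
    out->hsg_start  = le32(f + 0x08);
    out->hsg_length = le32(f + 0x0C);
    out->rb_start   = le32(f + 0x10);
    out->rb_length  = le32(f + 0x14);
    return 1;
}
```

Applied to the dump above, this gives an HSG length of 23732, which matches 05:16 plus 32 frames under the formula. The HSG start is 70135, whereas 15:37 plus 10 frames works out to 70285. The 150-frame (2-second) difference is not explained here, so it is safest to treat the two start fields as independently stored values.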
