<pre>
DRAFT TIFF Technical Note #2                            17-Mar-95
============================

This Technical Note describes serious problems that have been found in
TIFF 6.0's design for embedding JPEG-compressed data in TIFF (Section 22
of the TIFF 6.0 spec of 3 June 1992). A replacement TIFF/JPEG
specification is given. Some corrections to Section 21 are also given.

To permit TIFF implementations to continue to read existing files, the 6.0
JPEG fields and tag values will remain reserved indefinitely. However,
TIFF writers are strongly discouraged from using the 6.0 JPEG design. It
is expected that the next full release of the TIFF specification will not
describe the old design at all, except to note that certain tag numbers
are reserved. The existing Section 22 will be replaced by the
specification text given in the second part of this Tech Note.


Problems in TIFF 6.0 JPEG
=========================

Abandoning a published spec is not a step to be taken lightly. This
section summarizes the reasons that have forced this decision.
TIFF 6.0's JPEG design suffers from design errors and limitations,
ambiguities, and unnecessary complexity.


Design errors and limitations
-----------------------------

The fundamental design error in the existing Section 22 is that JPEG's
various tables and parameters are broken out as separate fields which the
TIFF control logic must manage. This is bad software engineering: that
information should be treated as private to the JPEG codec
(compressor/decompressor). Worse, the fields themselves are specified
without sufficient thought for future extension and without regard to
well-established TIFF conventions. Here are some of the significant
problems:

* The JPEGxxTables fields do not store the table data directly in the
IFD/field structure; rather, the fields hold pointers to information
elsewhere in the file. This requires special-purpose code to be added to
*every* TIFF-manipulating application, whether it needs to decode JPEG
image data or not. Even a trivial TIFF editor, for example a program to
add an ImageDescription field to a TIFF file, must be explicitly aware of
the internal structure of the JPEG-related tables, or else it will probably
break the file. Every other auxiliary field in the TIFF spec contains
data, not pointers, and can be copied or relocated by standard code that
doesn't know anything about the particular field. This is a crucial
property of the TIFF format that must not be given up.

* To manipulate these fields, the TIFF control logic is required to know a
great deal about JPEG details, for example such arcana as how to compute
the length of a Huffman code table --- the length is not supplied in the
field structure and can only be found by inspecting the table contents.
This is again a violation of good software practice. Moreover, it will
prevent easy adoption of future JPEG extensions that might change these
low-level details.

* The design neglects the fact that baseline JPEG codecs support only two
sets of Huffman tables: it specifies a separate table for each color
component. This implies that encoders must waste space (by storing
duplicate Huffman tables) or else violate the well-founded TIFF convention
that prohibits duplicate pointers. Furthermore, baseline decoders must
test to find out which tables are identical, a waste of time and code
space.

* The JPEGInterchangeFormat field also violates TIFF's proscription against
duplicate pointers: the normal strip/tile pointers are expected to point
into the larger data area pointed to by JPEGInterchangeFormat. All TIFF
editing applications must be specifically aware of this relationship, since
they must maintain it or else delete the JPEGInterchangeFormat field. The
JPEGxxTables fields are also likely to point into the JPEGInterchangeFormat
area, creating additional pointer relationships that must be maintained.

* The JPEGQTables field is fixed at a byte per table entry; there is no
way to support 16-bit quantization values. This is a serious impediment
to extending TIFF to use 12-bit JPEG.

* The 6.0 design cannot support using different quantization tables in
different strips/tiles of an image (so as to encode some areas at higher
quality than others). Furthermore, since quantization tables are tied
one-for-one to color components, the design cannot support table switching
options that are likely to be added in future JPEG revisions.

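To make the Huffman-table point above concrete, here is a minimal Python
sketch of the length computation that the 6.0 design forces on the TIFF
layer. It assumes the table is stored as 16 code-length counts followed
by one byte per code value; the function name huffman_table_length is
hypothetical and is not taken from any TIFF library.

```python
def huffman_table_length(data, offset=0):
    """Return the byte length of one Huffman table starting at `offset`.

    Assumed layout: 16 bytes giving the number of codes of each length
    (1..16 bits), followed by one value byte per code.  The length is
    therefore 16 plus the sum of those counts.
    """
    counts = data[offset:offset + 16]
    if len(counts) != 16:
        raise ValueError("truncated Huffman table: fewer than 16 count bytes")
    return 16 + sum(counts)


# Illustrative table: the counts describe 12 codes, so 12 value bytes follow.
counts = bytes([0, 1, 5, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0])
table = counts + bytes(range(12))
print(huffman_table_length(table))
```

Nothing in the field structure records this length, so a program that
merely relocates the table still has to perform this JPEG-specific
inspection.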

Ambiguities
-----------

Several incompatible interpretations are possible for 6.0's treatment of
JPEG restart markers:

  * It is unclear whether restart markers must be omitted at TIFF segment
    (strip/tile) boundaries, or whether they are optional.

  * It is unclear whether the segment size is required to be chosen as
    a multiple of the specified restart interval (if any); perhaps the
    JPEG codec is supposed to be reset at each segment boundary as if
    there were a restart marker there, even if the boundary does not fall
    at a multiple of the nominal restart interval.

  * The spec fails to address the question of restart marker numbering:
    do the numbers begin again within each segment, or not?

That last point is particularly nasty. If we make numbering begin again
within each segment, we give up the ability to impose a TIFF strip/tile
structure on an existing JPEG datastream with restarts (which was clearly a
goal of Section 22's authors). But the other choice interferes with random
access to the image segments: a reader must compute the first restart
number to be expected within a segment, and must have a way to reset its
JPEG decoder to expect a nonzero restart number first. This may not even
be possible with some JPEG chips.

The tile height restriction found on page 104 contradicts Section 15's
general description of tiles. For an image that is not vertically
downsampled, page 104 specifies a tile height of one MCU or 8 pixels; but
Section 15 requires tiles to be a multiple of 16 pixels high.

This Tech Note does not attempt to resolve these ambiguities, so
implementations that follow the 6.0 design should be aware that
inter-application compatibility problems are likely to arise.


Unnecessary complexity
----------------------

The 6.0 design creates problems for implementations that need to keep the
JPEG codec separate from the TIFF control logic --- for example, consider
using a JPEG chip that was not designed specifically for TIFF. JPEG codecs
generally want to produce or consume a standard ISO JPEG datastream, not
just raw compressed data. (If they were to handle raw data, a separate
out-of-band mechanism would be needed to load tables into the codec.)
With such a codec, the TIFF control logic must parse JPEG markers emitted
by the codec to create the TIFF table fields (when writing) or synthesize
JPEG markers from the TIFF fields to feed the codec (when reading). This
means that the control logic must know a great deal more about JPEG details
than we would like. The parsing and reconstruction of the markers also
represents a fair amount of unnecessary work.

Quite a few implementors have proposed writing "TIFF/JPEG" files in which
a standard JPEG datastream is simply dumped into the file and pointed to
by JPEGInterchangeFormat. To avoid parsing the JPEG datastream, they
suggest not writing the JPEG auxiliary fields (JPEGxxTables etc.) nor even
the basic TIFF strip/tile data pointers. This approach is incompatible
with implementations that handle the full TIFF 6.0 JPEG design, since they
will expect to find strip/tile pointers and auxiliary fields. Indeed, this
is arguably not TIFF at all, since *all* TIFF-reading applications expect
to find strip or tile pointers. A subset implementation that is not
upward-compatible with the full spec is clearly unacceptable. However,
the frequency with which this idea has come up makes it clear that
implementors find the existing Section 22 too complex.


Overview of the solution
========================

To solve these problems, we adopt a new design for embedding
JPEG-compressed data in TIFF files. The new design uses only complete,
uninterpreted ISO JPEG datastreams, so it should be much more forgiving of
extensions to the ISO standard. It should also be far easier to implement
using unmodified JPEG codecs.

To reduce overhead in multi-segment TIFF files, we allow JPEG overhead
tables to be stored just once in a JPEGTables auxiliary field. This
feature does not violate the integrity of the JPEG datastreams, because it
uses the notions of "tables-only datastreams" and "abbreviated image
datastreams" as defined by the ISO standard.

To prevent confusion with the old design, the new design is given a new
Compression tag value, Compression=7. Readers that need to handle
existing 6.0 JPEG files may read both old and new files, using whatever
interpretation of the 6.0 spec they did before. Compression tag value 6
and the field tag numbers defined by 6.0 Section 22 will remain reserved
indefinitely, even though detailed descriptions of them will be dropped
from future editions of the TIFF specification.


Replacement TIFF/JPEG specification
===================================

[This section of the Tech Note is expected to replace Section 22 in the
next release of the TIFF specification.]

This section describes TIFF compression scheme 7, a high-performance
compression method for continuous-tone images.

Introduction
------------

This TIFF compression method uses the international standard for image
compression ISO/IEC 10918-1, usually known as "JPEG" (after the original
name of the standards committee, Joint Photographic Experts Group). JPEG
is a joint ISO/CCITT standard for compression of continuous-tone images.

The JPEG committee decided that because of the broad scope of the standard,
no one algorithmic procedure was able to satisfy the requirements of all
applications. Instead, the JPEG standard became a "toolkit" of multiple
algorithms and optional capabilities. Individual applications may select
a subset of the JPEG standard that meets their requirements.

The most important distinction among the JPEG processes is between lossy
and lossless compression. Lossy compression methods provide high
compression but allow only approximate reconstruction of the original
image. JPEG's lossy processes allow the encoder to trade off compressed
file size against reconstruction fidelity over a wide range. Typically,
10:1 or more compression of full-color data can be obtained while keeping
the reconstructed image visually indistinguishable from the original. Much
higher compression ratios are possible if a low-quality reconstructed image
is acceptable. Lossless compression provides exact reconstruction of the
source data, but the achievable compression ratio is much lower than for
the lossy processes; JPEG's rather simple lossless process typically
achieves around 2:1 compression of full-color data.

The most widely implemented JPEG subset is the "baseline" JPEG process.
This provides lossy compression of 8-bit-per-channel data. Optional
extensions include 12-bit-per-channel data, arithmetic entropy coding for
better compression, and progressive/hierarchical representations. The
lossless process is an independent algorithm that has little in
common with the lossy processes.

It should be noted that the optional arithmetic-coding extension is subject
to several US and Japanese patents. To avoid patent problems, use of
arithmetic coding processes in TIFF files intended for inter-application
interchange is discouraged.

All of the JPEG processes are useful only for "continuous tone" data,
in which the difference between adjacent pixel values is usually small.
Low-bit-depth source data is not appropriate for JPEG compression, nor
are palette-color images good candidates. The JPEG processes work well
on grayscale and full-color data.

Describing the JPEG compression algorithms in sufficient detail to permit
implementation would require more space than we have here. Instead, we
refer the reader to the References section.


What data is being compressed?
------------------------------

In lossy JPEG compression, it is customary to convert color source data
to YCbCr and then downsample it before JPEG compression. This gives
2:1 data compression with hardly any visible image degradation, and it
permits additional space savings within the JPEG compression step proper.
However, these steps are not considered part of the ISO JPEG standard.
The ISO standard is "color blind": it accepts data in any color space.

For TIFF purposes, the JPEG compression tag is considered to represent the
ISO JPEG compression standard only. The ISO standard is applied to the
same data that would be stored in the TIFF file if no compression were
used. Therefore, if color conversion or downsampling are used, they must
be reflected in the regular TIFF fields; these steps are not considered to
be implicit in the JPEG compression tag value. PhotometricInterpretation
and related fields shall describe the color space actually stored in the
file. With the TIFF 6.0 field definitions, downsampling is permissible
only for YCbCr data, and it must correspond to the YCbCrSubSampling field.
(Note that the default value for this field is not 1,1; so the default for
YCbCr is to apply downsampling!) It is likely that future versions of TIFF
will provide additional PhotometricInterpretation values and a more general
way of defining subsampling, so as to allow more flexibility in
JPEG-compressed files. But that issue is not addressed in this Tech Note.

Implementors should note that many popular JPEG codecs
(compressor/decompressors) provide automatic color conversion and
downsampling, so that the application may supply full-size RGB data which
is nonetheless converted to downsampled YCbCr. This is an implementation
convenience which does not excuse the TIFF control layer from its
responsibility to know what is really going on. The
PhotometricInterpretation and subsampling fields written to the file must
describe what is actually in the file.

A JPEG-compressed TIFF file will typically have PhotometricInterpretation =
YCbCr and YCbCrSubSampling = [2,1] or [2,2], unless the source data was
grayscale or CMYK.


Basic representation of JPEG-compressed images
----------------------------------------------

JPEG compression works in either strip-based or tile-based TIFF files.
Rather than repeating "strip or tile" constantly, we will use the term
"segment" to mean either a strip or a tile.

When the Compression field has the value 7, each image segment contains
a complete JPEG datastream which is valid according to the ISO JPEG
standard (ISO/IEC 10918-1). Any sequential JPEG process can be used,
including lossless JPEG, but progressive and hierarchical processes are not
supported. Since JPEG is useful only for continuous-tone images, the
PhotometricInterpretation of the image shall not be 3 (palette color) nor
4 (transparency mask). The bit depth of the data is also restricted as
specified below.

Each image segment in a JPEG-compressed TIFF file shall contain a valid
JPEG datastream according to the ISO JPEG standard's rules for
interchange-format or abbreviated-image-format data. The datastream shall
contain a single JPEG frame storing that segment of the image. The
required JPEG markers within a segment are:
        SOI     (must appear at very beginning of segment)
        SOFn
        SOS     (one for each scan, if there is more than one scan)
        EOI     (must appear at very end of segment)
The actual compressed data follows SOS; it may contain RSTn markers if DRI
is used.

Additional JPEG "tables and miscellaneous" markers may appear between SOI
and SOFn, between SOFn and SOS, and before each subsequent SOS if there is
more than one scan. These markers include:
        DQT
        DHT
        DAC     (not to appear unless arithmetic coding is used)
        DRI
        APPn    (shall be ignored by TIFF readers)
        COM     (shall be ignored by TIFF readers)
DNL markers shall not be used in TIFF files. Readers should abort if any
other marker type is found, especially the JPEG reserved markers;
occurrence of such a marker is likely to indicate a JPEG extension.

The tables/miscellaneous markers may appear in any order. Readers are
cautioned that although the SOFn marker refers to DQT tables, JPEG does not
require those tables to precede the SOFn, only the SOS. Missing-table
checks should be made when SOS is reached.

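The marker rules above can be sketched as a small header scanner. The
following Python is only an illustration under stated assumptions: it
walks the markers from SOI up to the first SOS, it does not skip optional
0xFF fill bytes, it does not examine later scans or the entropy-coded
data, and the names header_markers and fake are hypothetical.

```python
import struct

SOI, EOI, SOS = 0xD8, 0xD9, 0xDA
SOF_CODES = {0xC0, 0xC1, 0xC3, 0xC9, 0xCB}   # SOF0, SOF1, SOF3, SOF9, SOF11
# DQT, DHT, DAC, DRI, COM, and APP0..APP15
TABLES_MISC = {0xDB, 0xC4, 0xCC, 0xDD, 0xFE} | set(range(0xE0, 0xF0))


def header_markers(segment):
    """Return the marker codes from SOI up to and including the first SOS.

    Raises ValueError on a missing SOI or EOI, on any marker outside the
    lists given in the text (DNL included), or if SOS precedes SOFn.
    """
    if segment[:2] != bytes([0xFF, SOI]):
        raise ValueError("segment does not begin with SOI")
    if segment[-2:] != bytes([0xFF, EOI]):
        raise ValueError("segment does not end with EOI")
    markers = [SOI]
    pos = 2
    while True:
        if pos + 4 > len(segment) or segment[pos] != 0xFF:
            raise ValueError("expected a marker at offset %d" % pos)
        code = segment[pos + 1]
        if code not in SOF_CODES | TABLES_MISC | {SOS}:
            raise ValueError("unsupported marker 0xFF%02X" % code)
        markers.append(code)
        if code == SOS:
            if not SOF_CODES.intersection(markers):
                raise ValueError("SOS found before any SOFn")
            return markers
        # Segment lengths are MSB-first and include the two length bytes.
        (length,) = struct.unpack(">H", segment[pos + 2:pos + 4])
        pos += 2 + length


def fake(code, payload_len):
    """Build a marker segment with an all-zero payload (illustration only)."""
    return (bytes([0xFF, code]) + struct.pack(">H", payload_len + 2)
            + bytes(payload_len))


segment = (b"\xff\xd8" + fake(0xDB, 65) + fake(0xC0, 9) + fake(SOS, 6)
           + b"\x12\x34" + b"\xff\xd9")
print([hex(m) for m in header_markers(segment)])
```

A real reader would perform its missing-table check at the point where
this sketch returns, that is, when SOS is reached.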
If no JPEGTables field is used, then each image segment shall be a complete
JPEG interchange datastream. Each segment must define all the tables it
references. To allow readers to decode segments in any order, no segment
may rely on tables being carried over from a previous segment.

When a JPEGTables field is used, image segments may omit tables that have
been specified in the JPEGTables field. Further details appear below.

The SOFn marker shall be of type SOF0 for strict baseline JPEG data, of
type SOF1 for non-baseline lossy JPEG data, or of type SOF3 for lossless
JPEG data. (SOF9 or SOF11 would be used for arithmetic coding.) All
segments of a JPEG-compressed TIFF image shall use the same JPEG
compression process, in particular the same SOFn type.

The data precision field of the SOFn marker shall agree with the TIFF
BitsPerSample field. (Note that when PlanarConfiguration=1, this implies
that all components must have the same BitsPerSample value; when
PlanarConfiguration=2, different components could have different bit
depths.) For SOF0 only precision 8 is permitted; for SOF1, precision 8 or
12 is permitted; for SOF3, precisions 2 to 16 are permitted.

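The precision rules above can be checked with a few lines of code. The
Python sketch below is an assumption-laden illustration: it covers only
the three Huffman-coded SOFn types whose precisions are listed in the
text, and check_sof_precision is a hypothetical name. The caller is
assumed to pass the BitsPerSample values of the components stored in the
segment (all of them for PlanarConfiguration=1, one for
PlanarConfiguration=2).

```python
ALLOWED_PRECISION = {
    0xC0: {8},                  # SOF0, strict baseline
    0xC1: {8, 12},              # SOF1, non-baseline lossy
    0xC3: set(range(2, 17)),    # SOF3, lossless: 2 to 16 bits
}


def check_sof_precision(sof_code, precision, bits_per_sample):
    """Raise ValueError if the SOFn precision breaks the rules in the text."""
    if sof_code not in ALLOWED_PRECISION:
        raise ValueError("SOFn type 0xFF%02X not covered by this sketch"
                         % sof_code)
    if precision not in ALLOWED_PRECISION[sof_code]:
        raise ValueError("precision %d not permitted for 0xFF%02X"
                         % (precision, sof_code))
    if any(bits != precision for bits in bits_per_sample):
        raise ValueError("SOFn precision disagrees with BitsPerSample")


check_sof_precision(0xC0, 8, [8, 8, 8])     # baseline YCbCr or RGB: accepted
check_sof_precision(0xC3, 16, [16])         # lossless grayscale: accepted
```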
The image dimensions given in the SOFn marker shall agree with the logical
dimensions of that particular strip or tile. For strip images, the SOFn
image width shall equal ImageWidth and the height shall equal RowsPerStrip,
except in the last strip; its SOFn height shall equal the number of rows
remaining in the ImageLength. (In other words, no padding data is counted
in the SOFn dimensions.) For tile images, each SOFn shall have width
TileWidth and height TileHeight; adding and removing any padding needed in
the edge tiles is the concern of some higher level of the TIFF software.
(The dimensional rules are slightly different when PlanarConfiguration=2,
as described below.)

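For strip images, the SOFn dimension rule amounts to the arithmetic
below. This Python sketch assumes PlanarConfiguration=1 and treats a
RowsPerStrip value larger than ImageLength as a single strip; the
function name expected_sof_size is hypothetical.

```python
def expected_sof_size(image_width, image_length, rows_per_strip, strip_index):
    """Return the (width, height) that the strip's SOFn should declare."""
    rows_per_strip = min(rows_per_strip, image_length)
    n_strips = -(-image_length // rows_per_strip)      # ceiling division
    if not 0 <= strip_index < n_strips:
        raise IndexError("strip index out of range")
    first_row = strip_index * rows_per_strip
    # Only the last strip can be short; no padding rows are counted.
    height = min(rows_per_strip, image_length - first_row)
    return image_width, height


# A 100x50 image with RowsPerStrip=16 has four strips; the last holds 2 rows.
for i in range(4):
    print(i, expected_sof_size(100, 50, 16, i))
```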
The ISO JPEG standard only permits images up to 65535 pixels in width or
height, due to 2-byte fields in the SOFn markers. In TIFF, this limits
the size of an individual JPEG-compressed strip or tile, but the total
image size can be greater.

The number of components in the JPEG datastream shall equal SamplesPerPixel
for PlanarConfiguration=1, and shall be 1 for PlanarConfiguration=2. The
components shall be stored in the same order as they are described at the
TIFF field level. (This applies both to their order in the SOFn marker,
and to the order in which they are scanned if multiple JPEG scans are
used.) The component ID bytes are arbitrary so long as each component
within an image segment is given a distinct ID. To avoid any possible
confusion, we require that all segments of a TIFF image use the same ID
code for a given component.

In PlanarConfiguration 1, the sampling factors given in SOFn markers shall
agree with the sampling factors defined by the related TIFF fields (or with
the default values that are specified in the absence of those fields).

When DCT-based JPEG is used in a strip TIFF file, RowsPerStrip is required
to be a multiple of 8 times the largest vertical sampling factor, i.e., a
multiple of the height of an interleaved MCU. (For simplicity of
specification, we require this even if the data is not actually
interleaved.) For example, if YCbCrSubSampling = [2,2] then RowsPerStrip
must be a multiple of 16. An exception to this rule is made for
single-strip images (RowsPerStrip >= ImageLength): the exact value of
RowsPerStrip is unimportant in that case. This rule ensures that no data
padding is needed at the bottom of a strip, except perhaps the last strip.
Any padding required at the right edge of the image, or at the bottom of
the last strip, is expected to occur internally to the JPEG codec.

When DCT-based JPEG is used in a tiled TIFF file, TileLength is required
to be a multiple of 8 times the largest vertical sampling factor, i.e.,
a multiple of the height of an interleaved MCU; and TileWidth is required
to be a multiple of 8 times the largest horizontal sampling factor, i.e.,
a multiple of the width of an interleaved MCU. (For simplicity of
specification, we require this even if the data is not actually
interleaved.) All edge padding required will therefore occur in the course
of normal TIFF tile padding; it is not special to JPEG.

Lossless JPEG does not impose these constraints on strip and tile sizes,
since it is not DCT-based.

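The strip and tile size constraints for DCT-based JPEG can be expressed
as follows. This is a Python sketch only; the helper names mcu_size,
strip_rows_ok and tile_size_ok are hypothetical, and the sampling
factors are assumed to be supplied as one (horizontal, vertical) pair
per component. Lossless JPEG would skip these checks.

```python
def mcu_size(sampling):
    """Return (width, height) in pixels of an interleaved MCU."""
    return (8 * max(h for h, _ in sampling),
            8 * max(v for _, v in sampling))


def strip_rows_ok(rows_per_strip, image_length, sampling):
    """True if RowsPerStrip satisfies the rule for DCT-based JPEG."""
    if rows_per_strip >= image_length:      # single-strip exception
        return True
    _, mcu_height = mcu_size(sampling)
    return rows_per_strip % mcu_height == 0


def tile_size_ok(tile_width, tile_length, sampling):
    """True if the tile dimensions are whole multiples of the MCU size."""
    mcu_width, mcu_height = mcu_size(sampling)
    return tile_width % mcu_width == 0 and tile_length % mcu_height == 0


# YCbCrSubSampling = [2,2]: luminance is 2h2v, chrominance is 1h1v.
ycc_22 = [(2, 2), (1, 1), (1, 1)]
print(mcu_size(ycc_22), strip_rows_ok(16, 100, ycc_22),
      strip_rows_ok(8, 100, ycc_22))
```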
Note that within JPEG datastreams, multibyte values appear in the MSB-first
order specified by the JPEG standard, regardless of the byte ordering of
the surrounding TIFF file.


JPEGTables field
----------------

The only auxiliary TIFF field added for Compression=7 is the optional
JPEGTables field. The purpose of JPEGTables is to predefine JPEG
quantization and/or Huffman tables for subsequent use by JPEG image
segments. When this is done, these rather bulky tables need not be
duplicated in each segment, thus saving space and processing time.
JPEGTables may be used even in a single-segment file, although there is no
space savings in that case.

JPEGTables:
        Tag = 347 (15B.H)
        Type = UNDEFINED
        N = number of bytes in tables datastream, typically a few hundred
JPEGTables provides default JPEG quantization and/or Huffman tables which
are used whenever a segment datastream does not contain its own tables, as
specified below.

Notice that the JPEGTables field is required to have type code UNDEFINED,
not type code BYTE. This is to cue readers that expanding individual bytes
to short or long integers is not appropriate. A TIFF reader will generally
need to store the field value as an uninterpreted byte sequence until it is
fed to the JPEG decoder.

Multibyte quantities within the tables follow the ISO JPEG convention of
MSB-first storage, regardless of the byte ordering of the surrounding TIFF
file.

When the JPEGTables field is present, it shall contain a valid JPEG
"abbreviated table specification" datastream. This datastream shall begin
with SOI and end with EOI. It may contain zero or more JPEG "tables and
miscellaneous" markers, namely:
        DQT
        DHT
        DAC     (not to appear unless arithmetic coding is used)
        DRI
        APPn    (shall be ignored by TIFF readers)
        COM     (shall be ignored by TIFF readers)
Since JPEG defines the SOI marker to reset the DAC and DRI state, these two
markers' values cannot be carried over into any image datastream, and thus
they are effectively no-ops in the JPEGTables field. To avoid confusion,
it is recommended that writers not place DAC or DRI markers in JPEGTables.
However, readers must properly skip over them if they appear.

When JPEGTables is present, readers shall load the table specifications
contained in JPEGTables before processing image segment datastreams.
Image segments may simply refer to these preloaded tables without defining
them. An image segment can still define and use its own tables, subject to
the restrictions below.

An image segment may not redefine any table defined in JPEGTables. (This
restriction is imposed to allow readers to process image segments in random
order without having to reload JPEGTables between segments.) Therefore, use
of JPEGTables divides the available table slots into two groups: "global"
slots are defined in JPEGTables and may be used but not redefined by
segments; "local" slots are available for local definition and use in each
segment. To permit random access, a segment may not reference any local
tables that it does not itself define.

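The "global" versus "local" slot rule can be checked by listing the table
slots each datastream defines. The Python sketch below is illustrative
only: it reads DQT and DHT segments up to the first SOS (tables defined
before later scans are not examined), it ignores DAC, and the names
defined_slots, redefined_slots and dqt are hypothetical.

```python
import struct


def defined_slots(stream):
    """Return the set of (kind, slot) pairs defined before the first SOS."""
    if stream[:2] != b"\xff\xd8":
        raise ValueError("datastream does not begin with SOI")
    slots = set()
    pos = 2
    while pos + 4 <= len(stream) and stream[pos] == 0xFF:
        code = stream[pos + 1]
        if code in (0xD9, 0xDA):                 # EOI or SOS: stop scanning
            break
        (length,) = struct.unpack(">H", stream[pos + 2:pos + 4])
        body = stream[pos + 4:pos + 2 + length]
        i = 0
        if code == 0xDB:                         # DQT: one or more tables
            while i < len(body):
                precision, slot = body[i] >> 4, body[i] & 0x0F
                slots.add(("DQT", slot))
                i += 1 + 64 * (precision + 1)    # 8-bit or 16-bit entries
        elif code == 0xC4:                       # DHT: one or more tables
            while i < len(body):
                table_class, slot = body[i] >> 4, body[i] & 0x0F
                slots.add(("DHT-AC" if table_class else "DHT-DC", slot))
                i += 17 + sum(body[i + 1:i + 17])
        pos += 2 + length
    return slots


def redefined_slots(jpeg_tables, segment):
    """Return the global slots that the segment illegally redefines."""
    return defined_slots(jpeg_tables) & defined_slots(segment)


def dqt(slot):
    """Build an all-zero 8-bit quantization table (illustration only)."""
    return b"\xff\xdb" + struct.pack(">H", 67) + bytes([slot]) + bytes(64)


tables = b"\xff\xd8" + dqt(0) + b"\xff\xd9"
sos = b"\xff\xda" + struct.pack(">H", 8) + bytes(6)
good = b"\xff\xd8" + dqt(1) + sos + b"\xff\xd9"   # defines a local slot only
bad = b"\xff\xd8" + dqt(0) + sos + b"\xff\xd9"    # redefines global slot 0
print(redefined_slots(tables, good), redefined_slots(tables, bad))
```

A segment is acceptable under the rule above when redefined_slots
returns an empty set.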

Special considerations for PlanarConfiguration 2
------------------------------------------------

In PlanarConfiguration 2, each image segment contains data for only one
color component. To avoid confusing the JPEG codec, we wish the segments
to look like valid single-channel (i.e., grayscale) JPEG datastreams. This
means that different rules must be used for the SOFn parameters.

In PlanarConfiguration 2, the dimensions given in the SOFn of a subsampled
component shall be scaled down by the sampling factors compared to the SOFn
dimensions that would be used in PlanarConfiguration 1. This is necessary
to match the actual number of samples stored in that segment, so that the
JPEG codec doesn't complain about too much or too little data. In strip
TIFF files the computed dimensions may need to be rounded up to the next
integer; in tiled files, the restrictions on tile size make this case
impossible.

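The scaling rule for PlanarConfiguration 2 can be written as a short
calculation. In this Python sketch the function name planar2_sof_size is
hypothetical, and the caller is assumed to know whether the component is
one of the subsampled (chrominance) components and to pass the
YCbCrSubSampling values as h_sub and v_sub.

```python
def ceil_div(a, b):
    """Integer division rounded up to the next integer."""
    return -(-a // b)


def planar2_sof_size(width, height, h_sub, v_sub, subsampled):
    """Return the SOFn (width, height) for one component's segment.

    `width` and `height` are the SOFn dimensions the segment would have in
    PlanarConfiguration 1.
    """
    if not subsampled:
        return width, height
    return ceil_div(width, h_sub), ceil_div(height, v_sub)


# A 101-pixel-wide strip of 16 rows with YCbCrSubSampling = [2,2].
print(planar2_sof_size(101, 16, 2, 2, subsampled=False))   # luminance
print(planar2_sof_size(101, 16, 2, 2, subsampled=True))    # chrominance
```

The odd width shows the rounding-up case that the text says can arise in
strip files.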
Furthermore, all SOFn sampling factors shall be given as 1. (This is
merely to avoid confusion, since the sampling factors in a single-channel
JPEG datastream have no real effect.)

Any downsampling will need to happen externally to the JPEG codec, since
JPEG sampling factors are defined with reference to the full-precision
component. In PlanarConfiguration 2, the JPEG codec will be working on
only one component at a time and thus will have no reference component to
downsample against.


Minimum requirements for TIFF/JPEG
----------------------------------

ISO JPEG is a large and complex standard; most implementations support only
a subset of it. Here we define a "core" subset of TIFF/JPEG which readers
must support to claim TIFF/JPEG compatibility. For maximum
cross-application compatibility, we recommend that writers confine
themselves to this subset unless there is very good reason to do otherwise.

Use the ISO baseline JPEG process: 8-bit data precision, Huffman coding,
with no more than 2 DC and 2 AC Huffman tables. Note that this implies
BitsPerSample = 8 for each component. We recommend deviating from baseline
JPEG only if 12-bit data precision or lossless coding is required.

Use no subsampling (all JPEG sampling factors = 1) for color spaces other
than YCbCr. (This is, in fact, required with the TIFF 6.0 field
definitions, but may not be so in future revisions.) For YCbCr, use one of
the following choices:
        YCbCrSubSampling field          JPEG sampling factors
        1,1                             1h1v, 1h1v, 1h1v
        2,1                             2h1v, 1h1v, 1h1v
        2,2  (default value)            2h2v, 1h1v, 1h1v
We recommend that RGB source data be converted to YCbCr for best compression
results. Other source data colorspaces should probably be left alone.
Minimal readers need not support JPEG images with colorspaces other than
YCbCr and grayscale (PhotometricInterpretation = 6 or 1).

A minimal reader also need not support JPEG YCbCr images with nondefault
values of YCbCrCoefficients or YCbCrPositioning, nor with values of
ReferenceBlackWhite other than [0,255,128,255,128,255]. (These values
correspond to the RGB<=>YCbCr conversion specified by JFIF, which is widely
implemented in JPEG codecs.)

Writers are reminded that a ReferenceBlackWhite field *must* be included
when PhotometricInterpretation is YCbCr, because the default
ReferenceBlackWhite values are inappropriate for YCbCr.

If any subsampling is used, PlanarConfiguration=1 is preferred to avoid the
possibly-confusing requirements of PlanarConfiguration=2. In any case,
readers are not required to support PlanarConfiguration=2.

If possible, use a single interleaved scan in each image segment. This is
not legal JPEG if there are more than 4 SamplesPerPixel or if the sampling
factors are such that more than 10 blocks would be needed per MCU; in that
case, use a separate scan for each component. (The recommended color
spaces and sampling factors will not run into that restriction, so a
minimal reader need not support more than one scan per segment.)

539 |
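The single-scan restriction above can be checked mechanically. The sketch
below is illustrative only: the function name and the list-of-(h, v)
representation are assumptions. It counts the blocks each component
contributes to an MCU (h times v) and applies the two limits.

```python
def can_use_single_interleaved_scan(sampling_factors):
    """Decide whether all components may share one scan.

    sampling_factors is a list of (h, v) pairs, one per component.
    """
    if len(sampling_factors) == 1:
        # A one-component scan is non-interleaved; its MCU is one block.
        return True
    blocks_per_mcu = sum(h * v for h, v in sampling_factors)
    return len(sampling_factors) <= 4 and blocks_per_mcu <= 10
```

For the recommended 2,2 subsampling the MCU holds 4 + 1 + 1 = 6 blocks, well
inside the limit of 10.
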
To claim TIFF/JPEG compatibility, readers shall support multiple-strip TIFF
files and the optional JPEGTables field; it is not acceptable to read only
single-datastream files. Support for tiled TIFF files is strongly
recommended but not required.


Other recommendations for implementors
--------------------------------------

The TIFF tag Compression=7 guarantees only that the compressed data is
represented as ISO JPEG datastreams. Since JPEG is a large and evolving
standard, readers should apply careful error checking to the JPEG markers
to ensure that the compression process is within their capabilities. In
particular, to avoid being confused by future extensions to the JPEG
standard, it is important to abort if unknown marker codes are seen.

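One way to apply this advice is to walk the marker segments of an image
segment up to the first SOS and reject anything unexpected. The Python sketch
below assumes a reader that handles only baseline JPEG; the set of accepted
markers and the function name are assumptions for illustration, not a
normative list. COM and APPn markers are accepted here only so that they can
be skipped.

```python
import struct

SOF0, DHT, SOS, DQT, DRI, COM = 0xC0, 0xC4, 0xDA, 0xDB, 0xDD, 0xFE
# Markers this hypothetical baseline-only reader is prepared to see.
SUPPORTED = {SOF0, DHT, SOS, DQT, DRI, COM, *range(0xE0, 0xF0)}


def check_header_markers(data):
    """Return the marker codes found between SOI and the first SOS.

    Raises ValueError on any marker outside SUPPORTED, so the caller can
    refuse the image instead of misdecoding it.
    """
    if data[:2] != b"\xff\xd8":
        raise ValueError("datastream does not start with SOI")
    pos, seen = 2, []
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError("expected a marker at offset %d" % pos)
        code = data[pos + 1]
        if code == 0xFF:  # fill byte preceding a marker; skip it
            pos += 1
            continue
        if code not in SUPPORTED:
            raise ValueError("unsupported marker 0xFF%02X" % code)
        seen.append(code)
        if code == SOS:
            return seen
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        if length < 2:
            raise ValueError("bad segment length at offset %d" % pos)
        pos += 2 + length  # marker (2 bytes) plus segment, length included
    raise ValueError("no SOS marker found")
```

A reader with broader capabilities would extend SUPPORTED (for example with
other SOFn codes) rather than relax the rejection of unknown markers.
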
The point of requiring that all image segments use the same JPEG process is
to ensure that a reader need check only one segment to determine whether it
can handle the image. For example, consider a TIFF reader that has access
to fast but restricted JPEG hardware, as well as a slower, more general
software implementation. It is desirable to check only one image segment
to find out whether the fast hardware can be used. Thus, writers should
try to ensure that all segments of an image look as much "alike" as
possible: there should be no variation in scan layout, use of options such
as DRI, etc. Ideally, segments will be processed identically except
perhaps for using different local quantization or entropy-coding tables.

Writers should avoid including "noise" JPEG markers (COM and APPn markers).
Standard TIFF fields provide a better way to transport any non-image data.
Some JPEG codecs may change behavior if they see an APPn marker they
think they understand; since the TIFF spec requires these markers to be
ignored, this behavior is undesirable.

It is possible to convert an interchange-JPEG file (e.g., a JFIF file) to
TIFF simply by dropping the interchange datastream into a single strip.
(However, designers are reminded that the TIFF spec discourages huge
strips; splitting the image is somewhat more work but may give better
results.) Conversion from TIFF to interchange JPEG is more complex. A
strip-based TIFF/JPEG file can be converted fairly easily if all strips use
identical JPEG tables and no RSTn markers: just delete the overhead markers
and insert RSTn markers between strips. Converting tiled images is harder,
since the data will usually not be in the right order (unless the tiles are
only one MCU high). This can still be done losslessly, but it will require
undoing and redoing the entropy coding so that the DC coefficient
differences can be updated.

There is no default value for JPEGTables: standard TIFF files must define all
tables that they reference. For some closed systems in which many files will
have identical tables, it might make sense to define a default JPEGTables
value to avoid actually storing the tables. Or even better, invent a
private field selecting one of N default JPEGTables settings, so as to allow
for future expansion. Either of these must be regarded as a private
extension that will render the files unreadable by other applications.


References
----------

[1] Wallace, Gregory K. "The JPEG Still Picture Compression Standard",
Communications of the ACM, April 1991 (vol. 34 no. 4), pp. 30-44.

This is the best short technical introduction to the JPEG algorithms.
It is a good overview but does not provide sufficiently detailed
information to write an implementation.

[2] Pennebaker, William B. and Mitchell, Joan L. "JPEG Still Image Data
Compression Standard", Van Nostrand Reinhold, 1993, ISBN 0-442-01272-1.
638 pp.

This textbook is by far the most complete exposition of JPEG in existence.
It includes the full text of the ISO JPEG standards (DIS 10918-1 and draft
DIS 10918-2). No would-be JPEG implementor should be without it.

[3] ISO/IEC IS 10918-1, "Digital Compression and Coding of Continuous-tone
Still Images, Part 1: Requirements and guidelines", February 1994.
ISO/IEC DIS 10918-2, "Digital Compression and Coding of Continuous-tone
Still Images, Part 2: Compliance testing", final approval expected 1994.

These are the official standards documents. Note that the Pennebaker and
Mitchell textbook is likely to be cheaper and more useful than the official
standards.


Changes to Section 21: YCbCr Images
===================================

[This section of the Tech Note clarifies section 21 to make clear the
interpretation of image dimensions in a subsampled image. Furthermore,
the section is changed to allow the original image dimensions not to be
multiples of the sampling factors. This change is necessary to support use
of JPEG compression on odd-size images.]

Add the following paragraphs to the Section 21 introduction (p. 89),
just after the paragraph beginning "When a Class Y image is subsampled":

    In a subsampled image, it is understood that all TIFF image
    dimensions are measured in terms of the highest-resolution
    (luminance) component. In particular, ImageWidth, ImageLength,
    RowsPerStrip, TileWidth, TileLength, XResolution, and YResolution
    are measured in luminance samples.

    RowsPerStrip, TileWidth, and TileLength are constrained so that
    there are an integral number of samples of each component in a
    complete strip or tile. However, ImageWidth/ImageLength are not
    constrained. If an odd-size image is to be converted to subsampled
    format, the writer should pad the source data to a multiple of the
    sampling factors by replication of the last column and/or row, then
    downsample. The number of luminance samples actually stored in the
    file will be a multiple of the sampling factors. Conversely,
    readers must ignore any extra data (outside the specified image
    dimensions) after upsampling.

    When PlanarConfiguration=2, each strip or tile covers the same
    image area despite subsampling; that is, the total number of strips
    or tiles in the image is the same for each component. Therefore
    strips or tiles of the subsampled components contain fewer samples
    than strips or tiles of the luminance component.

    If there are extra samples per pixel (see field ExtraSamples),
    these data channels have the same number of samples as the
    luminance component.

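The padding rule in the paragraphs above can be sketched in a few lines of
Python. This is an illustration under simple assumptions: a component plane
is held as a list of rows, and the helper names are invented for the example.

```python
def padded_size(size, factor):
    """Round size up to the next multiple of the sampling factor."""
    return -(-size // factor) * factor


def pad_by_replication(rows, horiz, vert):
    """Pad a plane (a list of rows) by repeating its last column and row."""
    width, length = len(rows[0]), len(rows)
    extra_cols = padded_size(width, horiz) - width
    extra_rows = padded_size(length, vert) - length
    padded = [row + [row[-1]] * extra_cols for row in rows]
    padded += [list(padded[-1]) for _ in range(extra_rows)]
    return padded


def chroma_size(image_width, image_length, horiz, vert):
    """Stored (width, length) of each chroma component after downsampling."""
    return (padded_size(image_width, horiz) // horiz,
            padded_size(image_length, vert) // vert)
```

For example, a 5 x 3 image with subsampling 2,2 is padded to 6 x 4 luminance
samples, giving 3 x 2 samples in each chroma component. ImageWidth and
ImageLength still record 5 and 3, and a reader discards the padding after
upsampling.
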
Rewrite the YCbCrSubSampling field description (pp. 91-92) as follows
(largely to eliminate possibly-misleading references to
ImageWidth/ImageLength of the subsampled components):

    (first paragraph unchanged)

    The two elements of this field are defined as follows:

    Short 0: ChromaSubsampleHoriz:

    1 = there are equal numbers of luma and chroma samples horizontally.

    2 = there are twice as many luma samples as chroma samples
    horizontally.

    4 = there are four times as many luma samples as chroma samples
    horizontally.

    Short 1: ChromaSubsampleVert:

    1 = there are equal numbers of luma and chroma samples vertically.

    2 = there are twice as many luma samples as chroma samples
    vertically.

    4 = there are four times as many luma samples as chroma samples
    vertically.

    ChromaSubsampleVert shall always be less than or equal to
    ChromaSubsampleHoriz. Note that Cb and Cr have the same sampling
    ratios.

    In a strip TIFF file, RowsPerStrip is required to be an integer
    multiple of ChromaSubsampleVert (unless RowsPerStrip >=
    ImageLength, in which case its exact value is unimportant).
    If ImageWidth and ImageLength are not multiples of
    ChromaSubsampleHoriz and ChromaSubsampleVert respectively, then the
    source data shall be padded to the next integer multiple of these
    values before downsampling.

    In a tiled TIFF file, TileWidth must be an integer multiple of
    ChromaSubsampleHoriz and TileLength must be an integer multiple of
    ChromaSubsampleVert. Padding will occur to tile boundaries.

    The default values of this field are [ 2,2 ]. Thus, YCbCr data is
    downsampled by default!
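
The constraints in the field description above lend themselves to a simple
consistency check. The Python sketch below is only an illustration; the
function name, its parameters, and the error strings are assumptions and do
not come from any TIFF library.

```python
def check_subsampling(subsampling, image_length, rows_per_strip=None,
                      tile_width=None, tile_length=None):
    """Return a list of constraint violations (empty if none).

    Pass tile_width and tile_length for a tiled file, or rows_per_strip
    for a strip file.
    """
    horiz, vert = subsampling
    errors = []
    if horiz not in (1, 2, 4) or vert not in (1, 2, 4):
        errors.append("sampling factors must each be 1, 2, or 4")
    if vert > horiz:
        errors.append("ChromaSubsampleVert exceeds ChromaSubsampleHoriz")
    if tile_width is not None and tile_length is not None:
        if tile_width % horiz != 0:
            errors.append("TileWidth not a multiple of ChromaSubsampleHoriz")
        if tile_length % vert != 0:
            errors.append("TileLength not a multiple of ChromaSubsampleVert")
    elif rows_per_strip is not None:
        # The exact value is unimportant when one strip holds the image.
        if rows_per_strip < image_length and rows_per_strip % vert != 0:
            errors.append("RowsPerStrip not a multiple of ChromaSubsampleVert")
    return errors
```

For instance, subsampling 2,2 with RowsPerStrip = 15 on a 100-row image
yields one violation, while the same RowsPerStrip on a 10-row image is
acceptable because the single strip covers the whole image.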
</pre>