|
1 <pre> |
|
2 DRAFT TIFF Technical Note #2 17-Mar-95 |
|
3 ============================ |
|
4 |
|
5 This Technical Note describes serious problems that have been found in |
|
6 TIFF 6.0's design for embedding JPEG-compressed data in TIFF (Section 22 |
|
7 of the TIFF 6.0 spec of 3 June 1992). A replacement TIFF/JPEG |
|
8 specification is given. Some corrections to Section 21 are also given. |
|
9 |
|
10 To permit TIFF implementations to continue to read existing files, the 6.0 |
|
11 JPEG fields and tag values will remain reserved indefinitely. However, |
|
12 TIFF writers are strongly discouraged from using the 6.0 JPEG design. It |
|
13 is expected that the next full release of the TIFF specification will not |
|
14 describe the old design at all, except to note that certain tag numbers |
|
15 are reserved. The existing Section 22 will be replaced by the |
|
16 specification text given in the second part of this Tech Note. |
|
17 |
|
18 |
|
19 Problems in TIFF 6.0 JPEG |
|
20 ========================= |
|
21 |
|
22 Abandoning a published spec is not a step to be taken lightly. This |
|
23 section summarizes the reasons that have forced this decision. |
|
24 TIFF 6.0's JPEG design suffers from design errors and limitations, |
|
25 ambiguities, and unnecessary complexity. |
|
26 |
|
27 |
|
28 Design errors and limitations |
|
29 ----------------------------- |
|
30 |
|
31 The fundamental design error in the existing Section 22 is that JPEG's |
|
32 various tables and parameters are broken out as separate fields which the |
|
33 TIFF control logic must manage. This is bad software engineering: that |
|
34 information should be treated as private to the JPEG codec |
|
35 (compressor/decompressor). Worse, the fields themselves are specified |
|
36 without sufficient thought for future extension and without regard to |
|
37 well-established TIFF conventions. Here are some of the significant |
|
38 problems: |
|
39 |
|
40 * The JPEGxxTables fields do not store the table data directly in the |
|
41 IFD/field structure; rather, the fields hold pointers to information |
|
42 elsewhere in the file. This requires special-purpose code to be added to |
|
43 *every* TIFF-manipulating application, whether it needs to decode JPEG |
|
44 image data or not. Even a trivial TIFF editor, for example a program to |
|
45 add an ImageDescription field to a TIFF file, must be explicitly aware of |
|
46 the internal structure of the JPEG-related tables, or else it will probably |
|
47 break the file. Every other auxiliary field in the TIFF spec contains |
|
48 data, not pointers, and can be copied or relocated by standard code that |
|
49 doesn't know anything about the particular field. This is a crucial |
|
50 property of the TIFF format that must not be given up. |
|
51 |
|
52 * To manipulate these fields, the TIFF control logic is required to know a |
|
53 great deal about JPEG details, for example such arcana as how to compute |
|
54 the length of a Huffman code table --- the length is not supplied in the |
|
55 field structure and can only be found by inspecting the table contents. |
|
56 This is again a violation of good software practice. Moreover, it will |
|
57 prevent easy adoption of future JPEG extensions that might change these |
|
58 low-level details. |
|
59 |
|
60 * The design neglects the fact that baseline JPEG codecs support only two |
|
61 sets of Huffman tables: it specifies a separate table for each color |
|
62 component. This implies that encoders must waste space (by storing |
|
63 duplicate Huffman tables) or else violate the well-founded TIFF convention |
|
64 that prohibits duplicate pointers. Furthermore, baseline decoders must |
|
65 test to find out which tables are identical, a waste of time and code |
|
66 space. |
|
67 |
|
68 * The JPEGInterchangeFormat field also violates TIFF's proscription against |
|
69 duplicate pointers: the normal strip/tile pointers are expected to point |
|
70 into the larger data area pointed to by JPEGInterchangeFormat. All TIFF |
|
71 editing applications must be specifically aware of this relationship, since |
|
72 they must maintain it or else delete the JPEGInterchangeFormat field. The |
|
73 JPEGxxTables fields are also likely to point into the JPEGInterchangeFormat |
|
74 area, creating additional pointer relationships that must be maintained. |
|
75 |
|
76 * The JPEGQTables field is fixed at a byte per table entry; there is no |
|
77 way to support 16-bit quantization values. This is a serious impediment |
|
78 to extending TIFF to use 12-bit JPEG. |
|
79 |
|
80 * The 6.0 design cannot support using different quantization tables in |
|
81 different strips/tiles of an image (so as to encode some areas at higher |
|
82 quality than others). Furthermore, since quantization tables are tied |
|
83 one-for-one to color components, the design cannot support table switching |
|
84 options that are likely to be added in future JPEG revisions. |
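
As an illustration of the Huffman-table point in the list above, the
following Python sketch shows why a table's length can only be found by
inspecting its contents. It assumes the table is laid out as 16 count
bytes followed by one byte per code; the function name is hypothetical.

```python
def huffman_table_length(data: bytes, offset: int = 0) -> int:
    """Return the total byte length of a Huffman table starting at `offset`.

    Assumed layout: 16 bytes giving the number of codes of each length
    1..16, followed by one byte per code (the symbol values).
    """
    counts = data[offset:offset + 16]
    if len(counts) != 16:
        raise ValueError("truncated Huffman table")
    # The length is not stored anywhere; it must be derived from the counts.
    return 16 + sum(counts)


# Example table: two 2-bit codes and three 3-bit codes, then 5 symbol bytes.
example_counts = bytes([0, 2, 3] + [0] * 13)
example_table = example_counts + bytes([0, 1, 2, 3, 4])
```

Any program that relocates such a table must run a computation of this
kind first, even if it never decodes image data.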
|
85 |
|
86 |
|
87 Ambiguities |
|
88 ----------- |
|
89 |
|
90 Several incompatible interpretations are possible for 6.0's treatment of |
|
91 JPEG restart markers: |
|
92 |
|
93 * It is unclear whether restart markers must be omitted at TIFF segment |
|
94 (strip/tile) boundaries, or whether they are optional. |
|
95 |
|
96 * It is unclear whether the segment size is required to be chosen as |
|
97 a multiple of the specified restart interval (if any); perhaps the |
|
98 JPEG codec is supposed to be reset at each segment boundary as if |
|
99 there were a restart marker there, even if the boundary does not fall |
|
100 at a multiple of the nominal restart interval. |
|
101 |
|
102 * The spec fails to address the question of restart marker numbering: |
|
103 do the numbers begin again within each segment, or not? |
|
104 |
|
105 That last point is particularly nasty. If we make numbering begin again |
|
106 within each segment, we give up the ability to impose a TIFF strip/tile |
|
107 structure on an existing JPEG datastream with restarts (which was clearly a |
|
108 goal of Section 22's authors). But the other choice interferes with random |
|
109 access to the image segments: a reader must compute the first restart |
|
110 number to be expected within a segment, and must have a way to reset its |
|
111 JPEG decoder to expect a nonzero restart number first. This may not even |
|
112 be possible with some JPEG chips. |
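
To make the random-access cost concrete, here is a Python sketch of the
computation a reader would need if numbering continued across segments.
It is only one possible reading of the 6.0 text: it assumes every segment
holds the same whole number of restart intervals and that the marker
position at each segment boundary counts in the numbering. The function
name is hypothetical.

```python
def first_restart_number(segment_index: int, intervals_per_segment: int) -> int:
    """Index n of the first RSTn marker expected inside a segment.

    Assumes restart numbering continues across segments, that every
    segment holds exactly `intervals_per_segment` restart intervals, and
    that the marker position at each segment boundary is counted in the
    numbering (whether or not the marker is physically present).
    """
    if segment_index < 0 or intervals_per_segment < 1:
        raise ValueError("invalid segment layout")
    intervals_before = segment_index * intervals_per_segment
    return intervals_before % 8  # RSTn markers cycle through RST0..RST7
```

With three restart intervals per segment, segment 1 would begin by
expecting RST3 rather than RST0, so the decoder must be told about the
nonzero starting number.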
|
113 |
|
114 The tile height restriction found on page 104 contradicts Section 15's |
|
115 general description of tiles. For an image that is not vertically |
|
116 downsampled, page 104 specifies a tile height of one MCU or 8 pixels; but |
|
117 Section 15 requires tiles to be a multiple of 16 pixels high. |
|
118 |
|
119 This Tech Note does not attempt to resolve these ambiguities, so |
|
120 implementations that follow the 6.0 design should be aware that |
|
121 inter-application compatibility problems are likely to arise. |
|
122 |
|
123 |
|
124 Unnecessary complexity |
|
125 ---------------------- |
|
126 |
|
127 The 6.0 design creates problems for implementations that need to keep the |
|
128 JPEG codec separate from the TIFF control logic --- for example, consider |
|
129 using a JPEG chip that was not designed specifically for TIFF. JPEG codecs |
|
130 generally want to produce or consume a standard ISO JPEG datastream, not |
|
131 just raw compressed data. (If they were to handle raw data, a separate |
|
132 out-of-band mechanism would be needed to load tables into the codec.) |
|
133 With such a codec, the TIFF control logic must parse JPEG markers emitted |
|
134 by the codec to create the TIFF table fields (when writing) or synthesize |
|
135 JPEG markers from the TIFF fields to feed the codec (when reading). This |
|
136 means that the control logic must know a great deal more about JPEG details |
|
137 than we would like. The parsing and reconstruction of the markers also |
|
138 represents a fair amount of unnecessary work. |
|
139 |
|
140 Quite a few implementors have proposed writing "TIFF/JPEG" files in which |
|
141 a standard JPEG datastream is simply dumped into the file and pointed to |
|
142 by JPEGInterchangeFormat. To avoid parsing the JPEG datastream, they |
|
143 suggest not writing the JPEG auxiliary fields (JPEGxxTables etc.) nor even |
|
144 the basic TIFF strip/tile data pointers. This approach is incompatible |
|
145 with implementations that handle the full TIFF 6.0 JPEG design, since they |
|
146 will expect to find strip/tile pointers and auxiliary fields. Indeed, this |
|
147 is arguably not TIFF at all, since *all* TIFF-reading applications expect |
|
148 to find strip or tile pointers. A subset implementation that is not |
|
149 upward-compatible with the full spec is clearly unacceptable. However, |
|
150 the frequency with which this idea has come up makes it clear that |
|
151 implementors find the existing Section 22 too complex. |
|
152 |
|
153 |
|
154 Overview of the solution |
|
155 ======================== |
|
156 |
|
157 To solve these problems, we adopt a new design for embedding |
|
158 JPEG-compressed data in TIFF files. The new design uses only complete, |
|
159 uninterpreted ISO JPEG datastreams, so it should be much more forgiving of |
|
160 extensions to the ISO standard. It should also be far easier to implement |
|
161 using unmodified JPEG codecs. |
|
162 |
|
163 To reduce overhead in multi-segment TIFF files, we allow JPEG overhead |
|
164 tables to be stored just once in a JPEGTables auxiliary field. This |
|
165 feature does not violate the integrity of the JPEG datastreams, because it |
|
166 uses the notions of "tables-only datastreams" and "abbreviated image |
|
167 datastreams" as defined by the ISO standard. |
|
168 |
|
169 To prevent confusion with the old design, the new design is given a new |
|
170 Compression tag value, Compression=7. Readers that need to handle |
|
171 existing 6.0 JPEG files may read both old and new files, using whatever |
|
172 interpretation of the 6.0 spec they did before. Compression tag value 6 |
|
173 and the field tag numbers defined by 6.0 section 22 will remain reserved |
|
174 indefinitely, even though detailed descriptions of them will be dropped |
|
175 from future editions of the TIFF specification. |
|
176 |
|
177 |
|
178 Replacement TIFF/JPEG specification |
|
179 =================================== |
|
180 |
|
181 [This section of the Tech Note is expected to replace Section 22 in the |
|
182 next release of the TIFF specification.] |
|
183 |
|
184 This section describes TIFF compression scheme 7, a high-performance |
|
185 compression method for continuous-tone images. |
|
186 |
|
187 Introduction |
|
188 ------------ |
|
189 |
|
190 This TIFF compression method uses the international standard for image |
|
191 compression ISO/IEC 10918-1, usually known as "JPEG" (after the original |
|
192 name of the standards committee, Joint Photographic Experts Group). JPEG |
|
193 is a joint ISO/CCITT standard for compression of continuous-tone images. |
|
194 |
|
195 The JPEG committee decided that because of the broad scope of the standard, |
|
196 no one algorithmic procedure was able to satisfy the requirements of all |
|
197 applications. Instead, the JPEG standard became a "toolkit" of multiple |
|
198 algorithms and optional capabilities. Individual applications may select |
|
199 a subset of the JPEG standard that meets their requirements. |
|
200 |
|
201 The most important distinction among the JPEG processes is between lossy |
|
202 and lossless compression. Lossy compression methods provide high |
|
203 compression but allow only approximate reconstruction of the original |
|
204 image. JPEG's lossy processes allow the encoder to trade off compressed |
|
205 file size against reconstruction fidelity over a wide range. Typically, |
|
206 10:1 or more compression of full-color data can be obtained while keeping |
|
207 the reconstructed image visually indistinguishable from the original. Much |
|
208 higher compression ratios are possible if a low-quality reconstructed image |
|
209 is acceptable. Lossless compression provides exact reconstruction of the |
|
210 source data, but the achievable compression ratio is much lower than for |
|
211 the lossy processes; JPEG's rather simple lossless process typically |
|
212 achieves around 2:1 compression of full-color data. |
|
213 |
|
214 The most widely implemented JPEG subset is the "baseline" JPEG process. |
|
215 This provides lossy compression of 8-bit-per-channel data. Optional |
|
216 extensions include 12-bit-per-channel data, arithmetic entropy coding for |
|
217 better compression, and progressive/hierarchical representations. The |
|
218 lossless process is an independent algorithm that has little in |
|
219 common with the lossy processes. |
|
220 |
|
221 It should be noted that the optional arithmetic-coding extension is subject |
|
222 to several US and Japanese patents. To avoid patent problems, use of |
|
223 arithmetic coding processes in TIFF files intended for inter-application |
|
224 interchange is discouraged. |
|
225 |
|
226 All of the JPEG processes are useful only for "continuous tone" data, |
|
227 in which the difference between adjacent pixel values is usually small. |
|
228 Low-bit-depth source data is not appropriate for JPEG compression, nor |
|
229 are palette-color images good candidates. The JPEG processes work well |
|
230 on grayscale and full-color data. |
|
231 |
|
232 Describing the JPEG compression algorithms in sufficient detail to permit |
|
233 implementation would require more space than we have here. Instead, we |
|
234 refer the reader to the References section. |
|
235 |
|
236 |
|
237 What data is being compressed? |
|
238 ------------------------------ |
|
239 |
|
240 In lossy JPEG compression, it is customary to convert color source data |
|
241 to YCbCr and then downsample it before JPEG compression. This gives |
|
242 2:1 data compression with hardly any visible image degradation, and it |
|
243 permits additional space savings within the JPEG compression step proper. |
|
244 However, these steps are not considered part of the ISO JPEG standard. |
|
245 The ISO standard is "color blind": it accepts data in any color space. |
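
The 2:1 figure can be checked with a little arithmetic. The Python sketch
below counts samples per pixel before and after chroma downsampling. It
assumes three components of which only Cb and Cr are reduced; the
function name is hypothetical.

```python
from fractions import Fraction


def downsampling_ratio(h: int, v: int) -> Fraction:
    """Ratio of sample counts before and after chroma downsampling.

    Assumes three components (Y, Cb, Cr) where only Cb and Cr are
    reduced, by a factor h horizontally and v vertically.
    """
    full = Fraction(3)                  # Y + Cb + Cr at full resolution
    reduced = 1 + Fraction(2, h * v)    # full Y plus two reduced planes
    return full / reduced
```

Under these assumptions, downsampling chroma by 2 in both directions
gives exactly 2:1, while downsampling horizontally only gives 3:2.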
|
246 |
|
247 For TIFF purposes, the JPEG compression tag is considered to represent the |
|
248 ISO JPEG compression standard only. The ISO standard is applied to the |
|
249 same data that would be stored in the TIFF file if no compression were |
|
250 used. Therefore, if color conversion or downsampling are used, they must |
|
251 be reflected in the regular TIFF fields; these steps are not considered to |
|
252 be implicit in the JPEG compression tag value. PhotometricInterpretation |
|
253 and related fields shall describe the color space actually stored in the |
|
254 file. With the TIFF 6.0 field definitions, downsampling is permissible |
|
255 only for YCbCr data, and it must correspond to the YCbCrSubSampling field. |
|
256 (Note that the default value for this field is not 1,1; so the default for |
|
257 YCbCr is to apply downsampling!) It is likely that future versions of TIFF |
|
258 will provide additional PhotometricInterpretation values and a more general |
|
259 way of defining subsampling, so as to allow more flexibility in |
|
260 JPEG-compressed files. But that issue is not addressed in this Tech Note. |
|
261 |
|
262 Implementors should note that many popular JPEG codecs |
|
263 (compressor/decompressors) provide automatic color conversion and |
|
264 downsampling, so that the application may supply full-size RGB data which |
|
265 is nonetheless converted to downsampled YCbCr. This is an implementation |
|
266 convenience which does not excuse the TIFF control layer from its |
|
267 responsibility to know what is really going on. The |
|
268 PhotometricInterpretation and subsampling fields written to the file must |
|
269 describe what is actually in the file. |
|
270 |
|
271 A JPEG-compressed TIFF file will typically have PhotometricInterpretation = |
|
272 YCbCr and YCbCrSubSampling = [2,1] or [2,2], unless the source data was |
|
273 grayscale or CMYK. |
|
274 |
|
275 |
|
276 Basic representation of JPEG-compressed images |
|
277 ---------------------------------------------- |
|
278 |
|
279 JPEG compression works in either strip-based or tile-based TIFF files. |
|
280 Rather than repeating "strip or tile" constantly, we will use the term |
|
281 "segment" to mean either a strip or a tile. |
|
282 |
|
283 When the Compression field has the value 7, each image segment contains |
|
284 a complete JPEG datastream which is valid according to the ISO JPEG |
|
285 standard (ISO/IEC 10918-1). Any sequential JPEG process can be used, |
|
286 including lossless JPEG, but progressive and hierarchical processes are not |
|
287 supported. Since JPEG is useful only for continuous-tone images, the |
|
288 PhotometricInterpretation of the image shall not be 3 (palette color) nor |
|
289 4 (transparency mask). The bit depth of the data is also restricted as |
|
290 specified below. |
|
291 |
|
292 Each image segment in a JPEG-compressed TIFF file shall contain a valid |
|
293 JPEG datastream according to the ISO JPEG standard's rules for |
|
294 interchange-format or abbreviated-image-format data. The datastream shall |
|
295 contain a single JPEG frame storing that segment of the image. The |
|
296 required JPEG markers within a segment are: |
|
297 SOI (must appear at very beginning of segment) |
|
298 SOFn |
|
299 SOS (one for each scan, if there is more than one scan) |
|
300 EOI (must appear at very end of segment) |
|
301 The actual compressed data follows SOS; it may contain RSTn markers if DRI |
|
302 is used. |
|
303 |
|
304 Additional JPEG "tables and miscellaneous" markers may appear between SOI |
|
305 and SOFn, between SOFn and SOS, and before each subsequent SOS if there is |
|
306 more than one scan. These markers include: |
|
307 DQT |
|
308 DHT |
|
309 DAC (not to appear unless arithmetic coding is used) |
|
310 DRI |
|
311 APPn (shall be ignored by TIFF readers) |
|
312 COM (shall be ignored by TIFF readers) |
|
313 DNL markers shall not be used in TIFF files. Readers should abort if any |
|
314 other marker type is found, especially the JPEG reserved markers; |
|
315 occurrence of such a marker is likely to indicate a JPEG extension. |
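
The marker rules above can be sketched as a small header scanner. This
Python example is a minimal illustration, not a complete reader: it
checks only the markers from SOI up to the first SOS, does not handle
optional 0xFF fill bytes, and does not look inside the entropy-coded
data or at later scans. All names are hypothetical.

```python
import struct

SOI, SOS, DNL = 0xD8, 0xDA, 0xDC
SOF_TYPES = {0xC0, 0xC1, 0xC3, 0xC9, 0xCB}      # SOF0, SOF1, SOF3, SOF9, SOF11
TABLES_MISC = {0xDB, 0xC4, 0xCC, 0xDD, 0xFE}    # DQT, DHT, DAC, DRI, COM
APPN = set(range(0xE0, 0xF0))                   # APP0..APP15
ALLOWED = SOF_TYPES | TABLES_MISC | APPN | {SOS}


def scan_header_markers(segment: bytes) -> list:
    """Return the marker codes found from SOI up to the first SOS.

    Raises ValueError on DNL or on any marker outside the allowed set.
    """
    if segment[:2] != b"\xff\xd8":
        raise ValueError("segment does not begin with SOI")
    if segment[-2:] != b"\xff\xd9":
        raise ValueError("segment does not end with EOI")
    found, pos = [SOI], 2
    while True:
        if pos + 4 > len(segment) or segment[pos] != 0xFF:
            raise ValueError("expected a marker at offset %d" % pos)
        code = segment[pos + 1]
        if code == DNL:
            raise ValueError("DNL is not permitted in TIFF")
        if code not in ALLOWED:
            raise ValueError("unsupported marker 0xFF%02X" % code)
        # Segment lengths are MSB-first and include the two length bytes.
        (length,) = struct.unpack(">H", segment[pos + 2:pos + 4])
        found.append(code)
        if code == SOS:
            return found
        pos += 2 + length
```

A reader built this way fails early on, for example, a progressive SOF2
marker instead of misinterpreting the data that follows it.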
|
316 |
|
317 The tables/miscellaneous markers may appear in any order. Readers are |
|
318 cautioned that although the SOFn marker refers to DQT tables, JPEG does not |
|
319 require those tables to precede the SOFn, only the SOS. Missing-table |
|
320 checks should be made when SOS is reached. |
|
321 |
|
322 If no JPEGTables field is used, then each image segment shall be a complete |
|
323 JPEG interchange datastream. Each segment must define all the tables it |
|
324 references. To allow readers to decode segments in any order, no segment |
|
325 may rely on tables being carried over from a previous segment. |
|
326 |
|
327 When a JPEGTables field is used, image segments may omit tables that have |
|
328 been specified in the JPEGTables field. Further details appear below. |
|
329 |
|
330 The SOFn marker shall be of type SOF0 for strict baseline JPEG data, of |
|
331 type SOF1 for non-baseline lossy JPEG data, or of type SOF3 for lossless |
|
332 JPEG data. (SOF9 or SOF11 would be used for arithmetic coding.) All |
|
333 segments of a JPEG-compressed TIFF image shall use the same JPEG |
|
334 compression process, in particular the same SOFn type. |
|
335 |
|
336 The data precision field of the SOFn marker shall agree with the TIFF |
|
337 BitsPerSample field. (Note that when PlanarConfiguration=1, this implies |
|
338 that all components must have the same BitsPerSample value; when |
|
339 PlanarConfiguration=2, different components could have different bit |
|
340 depths.) For SOF0 only precision 8 is permitted; for SOF1, precision 8 or |
|
341 12 is permitted; for SOF3, precisions 2 to 16 are permitted. |
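
The precision rules above can be written as a small lookup. This Python
sketch covers only SOF0, SOF1 and SOF3, because the text above states no
precision rule for the arithmetic-coding types SOF9 and SOF11. The names
are hypothetical.

```python
ALLOWED_PRECISION = {
    0: {8},                   # SOF0: strict baseline
    1: {8, 12},               # SOF1: non-baseline lossy
    3: set(range(2, 17)),     # SOF3: lossless, precisions 2 to 16
}


def check_precision(sof_type: int, precision: int, bits_per_sample: int) -> None:
    """Raise ValueError if the SOFn precision breaks the rules above."""
    allowed = ALLOWED_PRECISION.get(sof_type)
    if allowed is None:
        raise ValueError("no precision rule for SOF%d in this sketch" % sof_type)
    if precision not in allowed:
        raise ValueError("precision %d not permitted for SOF%d"
                         % (precision, sof_type))
    if precision != bits_per_sample:
        raise ValueError("SOFn precision disagrees with BitsPerSample")
```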
|
342 |
|
343 The image dimensions given in the SOFn marker shall agree with the logical |
|
344 dimensions of that particular strip or tile. For strip images, the SOFn |
|
345 image width shall equal ImageWidth and the height shall equal RowsPerStrip, |
|
346 except in the last strip; its SOFn height shall equal the number of rows |
|
347 remaining in the ImageLength. (In other words, no padding data is counted |
|
348 in the SOFn dimensions.) For tile images, each SOFn shall have width |
|
349 TileWidth and height TileLength; adding and removing any padding needed in |
|
350 the edge tiles is the concern of some higher level of the TIFF software. |
|
351 (The dimensional rules are slightly different when PlanarConfiguration=2, |
|
352 as described below.) |
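
For strip images, the expected SOFn dimensions follow directly from the
TIFF fields. The Python sketch below applies the rule above for
PlanarConfiguration=1; the function name is hypothetical.

```python
def strip_sof_dimensions(image_width: int, image_length: int,
                         rows_per_strip: int, strip_index: int) -> tuple:
    """Return the (width, height) a strip's SOFn marker should carry.

    Every strip is RowsPerStrip rows high except the last, which holds
    only the rows remaining in ImageLength (no padding is counted).
    """
    strips = -(-image_length // rows_per_strip)   # ceiling division
    if not 0 <= strip_index < strips:
        raise IndexError("strip index out of range")
    rows_left = image_length - strip_index * rows_per_strip
    return image_width, min(rows_per_strip, rows_left)
```

For a 100 x 50 image with RowsPerStrip = 16, the first three strips carry
a height of 16 and the fourth a height of 2.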
|
353 |
|
354 The ISO JPEG standard only permits images up to 65535 pixels in width or |
|
355 height, due to 2-byte fields in the SOFn markers. In TIFF, this limits |
|
356 the size of an individual JPEG-compressed strip or tile, but the total |
|
357 image size can be greater. |
|
358 |
|
359 The number of components in the JPEG datastream shall equal SamplesPerPixel |
|
360 for PlanarConfiguration=1, and shall be 1 for PlanarConfiguration=2. The |
|
361 components shall be stored in the same order as they are described at the |
|
362 TIFF field level. (This applies both to their order in the SOFn marker, |
|
363 and to the order in which they are scanned if multiple JPEG scans are |
|
364 used.) The component ID bytes are arbitrary so long as each component |
|
365 within an image segment is given a distinct ID. To avoid any possible |
|
366 confusion, we require that all segments of a TIFF image use the same ID |
|
367 code for a given component. |
|
368 |
|
369 In PlanarConfiguration 1, the sampling factors given in SOFn markers shall |
|
370 agree with the sampling factors defined by the related TIFF fields (or with |
|
371 the default values that are specified in the absence of those fields). |
|
372 |
|
373 When DCT-based JPEG is used in a strip TIFF file, RowsPerStrip is required |
|
374 to be a multiple of 8 times the largest vertical sampling factor, i.e., a |
|
375 multiple of the height of an interleaved MCU. (For simplicity of |
|
376 specification, we require this even if the data is not actually |
|
377 interleaved.) For example, if YCbCrSubSampling = [2,2] then RowsPerStrip |
|
378 must be a multiple of 16. An exception to this rule is made for |
|
379 single-strip images (RowsPerStrip >= ImageLength): the exact value of |
|
380 RowsPerStrip is unimportant in that case. This rule ensures that no data |
|
381 padding is needed at the bottom of a strip, except perhaps the last strip. |
|
382 Any padding required at the right edge of the image, or at the bottom of |
|
383 the last strip, is expected to occur internally to the JPEG codec. |
|
384 |
|
385 When DCT-based JPEG is used in a tiled TIFF file, TileLength is required |
|
386 to be a multiple of 8 times the largest vertical sampling factor, i.e., |
|
387 a multiple of the height of an interleaved MCU; and TileWidth is required |
|
388 to be a multiple of 8 times the largest horizontal sampling factor, i.e., |
|
389 a multiple of the width of an interleaved MCU. (For simplicity of |
|
390 specification, we require this even if the data is not actually |
|
391 interleaved.) All edge padding required will therefore occur in the course |
|
392 of normal TIFF tile padding; it is not special to JPEG. |
|
393 |
|
394 Lossless JPEG does not impose these constraints on strip and tile sizes, |
|
395 since it is not DCT-based. |
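
The size constraints for DCT-based JPEG can be checked as follows. This
Python sketch takes the sampling factors as a list of (horizontal,
vertical) pairs, one per component; the function names are hypothetical,
and lossless JPEG would simply skip these checks.

```python
def mcu_size(sampling_factors) -> tuple:
    """(width, height) in pixels of an interleaved MCU for DCT-based JPEG."""
    max_h = max(h for h, _ in sampling_factors)
    max_v = max(v for _, v in sampling_factors)
    return 8 * max_h, 8 * max_v


def strip_rows_ok(rows_per_strip: int, image_length: int,
                  sampling_factors) -> bool:
    """True if RowsPerStrip satisfies the strip rule."""
    if rows_per_strip >= image_length:    # single-strip exception
        return True
    return rows_per_strip % mcu_size(sampling_factors)[1] == 0


def tile_size_ok(tile_width: int, tile_length: int, sampling_factors) -> bool:
    """True if TileWidth and TileLength are whole multiples of the MCU."""
    mcu_w, mcu_h = mcu_size(sampling_factors)
    return tile_width % mcu_w == 0 and tile_length % mcu_h == 0
```

With YCbCrSubSampling = [2,2] the factors are [(2, 2), (1, 1), (1, 1)],
giving a 16 x 16 MCU, so RowsPerStrip = 8 is rejected unless the image
fits in a single strip.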
|
396 |
|
397 Note that within JPEG datastreams, multibyte values appear in the MSB-first |
|
398 order specified by the JPEG standard, regardless of the byte ordering of |
|
399 the surrounding TIFF file. |
|
400 |
|
401 |
|
402 JPEGTables field |
|
403 ---------------- |
|
404 |
|
405 The only auxiliary TIFF field added for Compression=7 is the optional |
|
406 JPEGTables field. The purpose of JPEGTables is to predefine JPEG |
|
407 quantization and/or Huffman tables for subsequent use by JPEG image |
|
408 segments. When this is done, these rather bulky tables need not be |
|
409 duplicated in each segment, thus saving space and processing time. |
|
410 JPEGTables may be used even in a single-segment file, although there are no |
|
411 space savings in that case. |
|
412 |
|
413 JPEGTables: |
|
414 Tag = 347 (15B.H) |
|
415 Type = UNDEFINED |
|
416 N = number of bytes in tables datastream, typically a few hundred |
|
417 JPEGTables provides default JPEG quantization and/or Huffman tables which |
|
418 are used whenever a segment datastream does not contain its own tables, as |
|
419 specified below. |
|
420 |
|
421 Notice that the JPEGTables field is required to have type code UNDEFINED, |
|
422 not type code BYTE. This is to cue readers that expanding individual bytes |
|
423 to short or long integers is not appropriate. A TIFF reader will generally |
|
424 need to store the field value as an uninterpreted byte sequence until it is |
|
425 fed to the JPEG decoder. |
|
426 |
|
427 Multibyte quantities within the tables follow the ISO JPEG convention of |
|
428 MSB-first storage, regardless of the byte ordering of the surrounding TIFF |
|
429 file. |
|
430 |
|
431 When the JPEGTables field is present, it shall contain a valid JPEG |
|
432 "abbreviated table specification" datastream. This datastream shall begin |
|
433 with SOI and end with EOI. It may contain zero or more JPEG "tables and |
|
434 miscellaneous" markers, namely: |
|
435 DQT |
|
436 DHT |
|
437 DAC (not to appear unless arithmetic coding is used) |
|
438 DRI |
|
439 APPn (shall be ignored by TIFF readers) |
|
440 COM (shall be ignored by TIFF readers) |
|
441 Since JPEG defines the SOI marker to reset the DAC and DRI state, these two |
|
442 markers' values cannot be carried over into any image datastream, and thus |
|
443 they are effectively no-ops in the JPEGTables field. To avoid confusion, |
|
444 it is recommended that writers not place DAC or DRI markers in JPEGTables. |
|
445 However, readers must properly skip over them if they appear. |
|
446 |
|
447 When JPEGTables is present, readers shall load the table specifications |
|
448 contained in JPEGTables before processing image segment datastreams. |
|
449 Image segments may simply refer to these preloaded tables without defining |
|
450 them. An image segment can still define and use its own tables, subject to |
|
451 the restrictions below. |
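
One way a reader might satisfy this with a decoder that accepts only
complete interchange datastreams is to splice the two streams together.
The Python sketch below is an assumption about implementation strategy,
not something this specification requires. It also assumes the writer
placed no DAC or DRI markers in JPEGTables, since after splicing such
markers would take effect instead of being no-ops. The names are
hypothetical.

```python
SOI = b"\xff\xd8"
EOI = b"\xff\xd9"


def merge_tables(jpeg_tables: bytes, segment: bytes) -> bytes:
    """Combine a tables-only datastream with an abbreviated image stream.

    Drops the EOI of JPEGTables and the SOI of the segment so that the
    tables appear ahead of the segment's own markers in one datastream.
    """
    for name, stream in (("JPEGTables", jpeg_tables), ("segment", segment)):
        if not (stream.startswith(SOI) and stream.endswith(EOI)):
            raise ValueError("%s is not delimited by SOI and EOI" % name)
    return jpeg_tables[:-2] + segment[2:]
```

A decoder with a native notion of preloaded tables can instead be fed
the JPEGTables datastream once, followed by each segment unchanged.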
|
452 |
|
453 An image segment may not redefine any table defined in JPEGTables. (This |
|
454 restriction is imposed to allow readers to process image segments in random |
|
455 order without having to reload JPEGTables between segments.) Therefore, use |
|
456 of JPEGTables divides the available table slots into two groups: "global" |
|
457 slots are defined in JPEGTables and may be used but not redefined by |
|
458 segments; "local" slots are available for local definition and use in each |
|
459 segment. To permit random access, a segment may not reference any local |
|
460 tables that it does not itself define. |
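
The "global" versus "local" slot rule can be checked by collecting the
table slots each datastream defines. The following Python sketch is a
simplified illustration: it reads only the markers ahead of the first
SOS, so tables defined before later scans are not examined. The function
names and slot labels are hypothetical.

```python
import struct


def table_slots(stream: bytes) -> set:
    """Collect the DQT and DHT slots defined before the first SOS or EOI."""
    slots, pos = set(), 2                        # skip SOI
    while pos + 4 <= len(stream):
        if stream[pos] != 0xFF:
            raise ValueError("expected a marker at offset %d" % pos)
        code = stream[pos + 1]
        if code in (0xD9, 0xDA):                 # EOI or SOS: stop scanning
            break
        (length,) = struct.unpack(">H", stream[pos + 2:pos + 4])
        body = stream[pos + 4:pos + 2 + length]
        i = 0
        if code == 0xDB:                         # DQT: one or more tables
            while i < len(body):
                precision, table_id = body[i] >> 4, body[i] & 0x0F
                slots.add(("DQT", table_id))
                i += 1 + (128 if precision else 64)
        elif code == 0xC4:                       # DHT: one or more tables
            while i < len(body):
                table_class, table_id = body[i] >> 4, body[i] & 0x0F
                slots.add(("DHT-AC" if table_class else "DHT-DC", table_id))
                i += 17 + sum(body[i + 1:i + 17])
        pos += 2 + length
    return slots


def redefined_slots(jpeg_tables: bytes, segment: bytes) -> set:
    """Slots that a segment defines although JPEGTables already did."""
    return table_slots(jpeg_tables) & table_slots(segment)
```

A non-empty result from redefined_slots indicates a segment that breaks
the restriction above.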
|
461 |
|
462 |
|
463 Special considerations for PlanarConfiguration 2 |
|
464 ------------------------------------------------ |
|
465 |
|
466 In PlanarConfiguration 2, each image segment contains data for only one |
|
467 color component. To avoid confusing the JPEG codec, we wish the segments |
|
468 to look like valid single-channel (i.e., grayscale) JPEG datastreams. This |
|
469 means that different rules must be used for the SOFn parameters. |
|
470 |
|
471 In PlanarConfiguration 2, the dimensions given in the SOFn of a subsampled |
|
472 component shall be scaled down by the sampling factors compared to the SOFn |
|
473 dimensions that would be used in PlanarConfiguration 1. This is necessary |
|
474 to match the actual number of samples stored in that segment, so that the |
|
475 JPEG codec doesn't complain about too much or too little data. In strip |
|
476 TIFF files the computed dimensions may need to be rounded up to the next |
|
477 integer; in tiled files, the restrictions on tile size make this case |
|
478 impossible. |
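
The scaling rule can be sketched in Python as a ceiling division. Here
`width` and `height` stand for the SOFn dimensions the segment would
have in PlanarConfiguration 1, and the factors are the subsampling
factors of the component in question (1 and 1 for a component that is
not subsampled). The function name is hypothetical.

```python
def planar2_sof_dimensions(width: int, height: int,
                           h_factor: int, v_factor: int) -> tuple:
    """SOFn (width, height) for one component in PlanarConfiguration 2.

    Dimensions are divided by the sampling factors and rounded up to the
    next integer when the division is not exact.
    """
    scaled_width = -(-width // h_factor)      # ceiling division
    scaled_height = -(-height // v_factor)
    return scaled_width, scaled_height
```

For a 101-pixel-wide strip of 16 rows with chroma subsampled by 2 in both
directions, the chroma segments would declare 51 x 8 in their SOFn.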
|
479 |
|
480 Furthermore, all SOFn sampling factors shall be given as 1. (This is |
|
481 merely to avoid confusion, since the sampling factors in a single-channel |
|
482 JPEG datastream have no real effect.) |
|
483 |
|
484 Any downsampling will need to happen externally to the JPEG codec, since |
|
485 JPEG sampling factors are defined with reference to the full-precision |
|
486 component. In PlanarConfiguration 2, the JPEG codec will be working on |
|
487 only one component at a time and thus will have no reference component to |
|
488 downsample against. |
|
489 |
|
490 |
|
491 Minimum requirements for TIFF/JPEG |
|
492 ---------------------------------- |
|
493 |
|
494 ISO JPEG is a large and complex standard; most implementations support only |
|
495 a subset of it. Here we define a "core" subset of TIFF/JPEG which readers |
|
496 must support to claim TIFF/JPEG compatibility. For maximum |
|
497 cross-application compatibility, we recommend that writers confine |
|
498 themselves to this subset unless there is very good reason to do otherwise. |
|
499 |
|
500 Use the ISO baseline JPEG process: 8-bit data precision, Huffman coding, |
|
501 with no more than 2 DC and 2 AC Huffman tables. Note that this implies |
|
502 BitsPerSample = 8 for each component. We recommend deviating from baseline |
|
503 JPEG only if 12-bit data precision or lossless coding is required. |
|
504 |
|
505 Use no subsampling (all JPEG sampling factors = 1) for color spaces other |
|
506 than YCbCr. (This is, in fact, required with the TIFF 6.0 field |
|
507 definitions, but may not be so in future revisions.) For YCbCr, use one of |
|
508 the following choices: |
|
509 YCbCrSubSampling field JPEG sampling factors |
|
510 1,1 1h1v, 1h1v, 1h1v |
|
511 2,1 2h1v, 1h1v, 1h1v |
|
512 2,2 (default value) 2h2v, 1h1v, 1h1v |
|
513 We recommend that RGB source data be converted to YCbCr for best compression |
|
514 results. Other source data colorspaces should probably be left alone. |
|
515 Minimal readers need not support JPEG images with colorspaces other than |
|
516 YCbCr and grayscale (PhotometricInterpretation = 6 or 1). |
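
The table of recommended YCbCr choices above maps directly onto code.
This Python sketch returns the per-component JPEG sampling factors as
(horizontal, vertical) pairs for Y, Cb and Cr; the names are
hypothetical, and the default argument mirrors the field's default
value of 2,2.

```python
RECOMMENDED_SUBSAMPLING = {(1, 1), (2, 1), (2, 2)}


def jpeg_sampling_factors(ycbcr_subsampling=(2, 2)) -> list:
    """Map a YCbCrSubSampling value to JPEG sampling factors for Y, Cb, Cr."""
    h, v = ycbcr_subsampling
    if (h, v) not in RECOMMENDED_SUBSAMPLING:
        raise ValueError("YCbCrSubSampling %d,%d is outside the core subset"
                         % (h, v))
    # Luma carries the subsampling factors; both chroma components are 1h1v.
    return [(h, v), (1, 1), (1, 1)]
```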
|
517 |
|
518 A minimal reader also need not support JPEG YCbCr images with nondefault |
|
519 values of YCbCrCoefficients or YCbCrPositioning, nor with values of |
|
520 ReferenceBlackWhite other than [0,255,128,255,128,255]. (These values |
|
521 correspond to the RGB<=>YCbCr conversion specified by JFIF, which is widely |
|
522 implemented in JPEG codecs.) |
|
523 |
|
524 Writers are reminded that a ReferenceBlackWhite field *must* be included |
|
525 when PhotometricInterpretation is YCbCr, because the default |
|
526 ReferenceBlackWhite values are inappropriate for YCbCr. |
|
527 |
|
528 If any subsampling is used, PlanarConfiguration=1 is preferred to avoid the |
|
529 possibly-confusing requirements of PlanarConfiguration=2. In any case, |
|
530 readers are not required to support PlanarConfiguration=2. |
|
531 |
|
532 If possible, use a single interleaved scan in each image segment. This is |
|
533 not legal JPEG if there are more than 4 SamplesPerPixel or if the sampling |
|
534 factors are such that more than 10 blocks would be needed per MCU; in that |
|
535 case, use a separate scan for each component. (The recommended color |
|
536 spaces and sampling factors will not run into that restriction, so a |
|
537 minimal reader need not support more than one scan per segment.) |
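
A writer can test the interleaving limits before choosing a scan layout.
The Python sketch below applies the two conditions just stated to a list
of (horizontal, vertical) sampling factors, one pair per component. It
assumes a DCT-based process, where each component contributes h times v
blocks to an MCU; the function name is hypothetical.

```python
def single_interleaved_scan_allowed(sampling_factors) -> bool:
    """True if all components may share one interleaved scan.

    A scan may hold at most 4 components and at most 10 blocks per MCU.
    """
    blocks_per_mcu = sum(h * v for h, v in sampling_factors)
    return len(sampling_factors) <= 4 and blocks_per_mcu <= 10
```

The recommended YCbCr layout with 2h2v luma needs 4 + 1 + 1 = 6 blocks
per MCU, well inside the limit.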
|
538 |
|
539 To claim TIFF/JPEG compatibility, readers shall support multiple-strip TIFF |
|
540 files and the optional JPEGTables field; it is not acceptable to read only |
|
541 single-datastream files. Support for tiled TIFF files is strongly |
|
542 recommended but not required. |
|
543 |
|
544 |
|
545 Other recommendations for implementors |
|
546 -------------------------------------- |
|
547 |
|
548 The TIFF tag Compression=7 guarantees only that the compressed data is |
|
549 represented as ISO JPEG datastreams. Since JPEG is a large and evolving |
|
550 standard, readers should apply careful error checking to the JPEG markers |
|
551 to ensure that the compression process is within their capabilities. In |
|
552 particular, to avoid being confused by future extensions to the JPEG |
|
553 standard, it is important to abort if unknown marker codes are seen. |
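
As an illustration of this kind of checking, the sketch below walks the
marker segments of one datastream up to the first SOS (or up to EOI, for a
tables-only datastream) and aborts on any marker it does not recognize. The
set of accepted markers is an assumption describing a baseline-only reader;
a real reader would list whatever its codec actually supports. All names
are hypothetical.

```python
import struct

# Markers a hypothetical baseline-only reader accepts before the scan data:
# SOF0, DHT, DQT, DRI, SOS, COM, and APP0..APP15.
KNOWN_SEGMENTS = {0xC0, 0xC4, 0xDB, 0xDD, 0xDA, 0xFE} | set(range(0xE0, 0xF0))


class UnsupportedJPEG(Exception):
    pass


def check_header_markers(data):
    """Return the marker codes seen up to the first SOS or EOI."""
    if data[:2] != b"\xff\xd8":
        raise UnsupportedJPEG("datastream does not start with SOI")
    seen = [0xD8]
    pos = 2
    while pos < len(data):
        if data[pos] != 0xFF:
            raise UnsupportedJPEG("expected a marker at offset %d" % pos)
        while pos < len(data) and data[pos] == 0xFF:
            pos += 1                      # skip the 0xFF prefix and any fill
        if pos >= len(data):
            break
        code = data[pos]
        pos += 1
        if code == 0xD9:                  # EOI: tables-only datastream
            seen.append(code)
            return seen
        if code not in KNOWN_SEGMENTS:
            raise UnsupportedJPEG("unknown marker 0xFF%02X" % code)
        if pos + 2 > len(data):
            raise UnsupportedJPEG("truncated marker segment")
        (length,) = struct.unpack(">H", data[pos:pos + 2])
        if length < 2 or pos + length > len(data):
            raise UnsupportedJPEG("bad segment length")
        seen.append(code)
        if code == 0xDA:                  # SOS: entropy-coded data follows
            return seen
        pos += length
    raise UnsupportedJPEG("no SOS or EOI marker found")
```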
|
554 |
|
555 The point of requiring that all image segments use the same JPEG process is |
|
556 to ensure that a reader need check only one segment to determine whether it |
|
557 can handle the image. For example, consider a TIFF reader that has access |
|
558 to fast but restricted JPEG hardware, as well as a slower, more general |
|
559 software implementation. It is desirable to check only one image segment |
|
560 to find out whether the fast hardware can be used. Thus, writers should |
|
561 try to ensure that all segments of an image look as much "alike" as |
|
562 possible: there should be no variation in scan layout, use of options such |
|
563 as DRI, etc. Ideally, segments will be processed identically except |
|
564 perhaps for using different local quantization or entropy-coding tables. |
|
565 |
|
566 Writers should avoid including "noise" JPEG markers (COM and APPn markers). |
|
567 Standard TIFF fields provide a better way to transport any non-image data. |
|
568 Some JPEG codecs may change behavior if they see an APPn marker they |
|
569 think they understand; since the TIFF spec requires these markers to be |
|
570 ignored, this behavior is undesirable. |
|
571 |
|
572 It is possible to convert an interchange-JPEG file (e.g., a JFIF file) to |
|
573 TIFF simply by dropping the interchange datastream into a single strip. |
|
574 (However, designers are reminded that the TIFF spec discourages huge |
|
575 strips; splitting the image is somewhat more work but may give better |
|
576 results.) Conversion from TIFF to interchange JPEG is more complex. A |
|
577 strip-based TIFF/JPEG file can be converted fairly easily if all strips use |
|
578 identical JPEG tables and no RSTn markers: just delete the overhead markers |
|
579 and insert RSTn markers between strips. Converting tiled images is harder, |
|
580 since the data will usually not be in the right order (unless the tiles are |
|
581 only one MCU high). This can still be done losslessly, but it will require |
|
582 undoing and redoing the entropy coding so that the DC coefficient |
|
583 differences can be updated. |
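
For the easy strip-based case, the bookkeeping involved can be sketched as
below. This covers only the restart-marker arithmetic, not the complete
conversion. It assumes an interleaved scan and that RowsPerStrip is a
multiple of the MCU height, so that every strip holds the same number of
MCUs. The function names are hypothetical.

```python
def restart_marker(strip_index):
    """RSTn marker to place after strip `strip_index` (0-based).

    Restart markers cycle RST0..RST7, i.e. 0xFFD0..0xFFD7.
    """
    return bytes([0xFF, 0xD0 + strip_index % 8])


def restart_interval(image_width, rows_per_strip, max_h, max_v):
    """Number of MCUs in one strip, for use as the DRI restart interval.

    max_h and max_v are the largest horizontal and vertical sampling
    factors, so one MCU covers (8 * max_h) x (8 * max_v) luminance samples.
    """
    mcu_width = 8 * max_h
    mcu_height = 8 * max_v
    if rows_per_strip % mcu_height:
        raise ValueError("RowsPerStrip is not a multiple of the MCU height")
    mcus_per_row = -(-image_width // mcu_width)   # ceiling division
    interval = mcus_per_row * (rows_per_strip // mcu_height)
    if interval > 0xFFFF:
        raise ValueError("restart interval does not fit in 16 bits")
    return interval
```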
|
584 |
|
585 There is no default value for JPEGTables: standard TIFF files must define all |
|
586 tables that they reference. For some closed systems in which many files will |
|
587 have identical tables, it might make sense to define a default JPEGTables |
|
588 value to avoid actually storing the tables. Or even better, invent a |
|
589 private field selecting one of N default JPEGTables settings, so as to allow |
|
590 for future expansion. Either of these must be regarded as a private |
|
591 extension that will render the files unreadable by other applications. |
|
592 |
|
593 |
|
594 References |
|
595 ---------- |
|
596 |
|
597 [1] Wallace, Gregory K. "The JPEG Still Picture Compression Standard", |
|
598 Communications of the ACM, April 1991 (vol. 34 no. 4), pp. 30-44. |
|
599 |
|
600 This is the best short technical introduction to the JPEG algorithms. |
|
601 It is a good overview but does not provide sufficiently detailed |
|
602 information to write an implementation. |
|
603 |
|
604 [2] Pennebaker, William B. and Mitchell, Joan L. "JPEG Still Image Data |
|
605 Compression Standard", Van Nostrand Reinhold, 1993, ISBN 0-442-01272-1. |
|
606 638pp. |
|
607 |
|
608 This textbook is by far the most complete exposition of JPEG in existence. |
|
609 It includes the full text of the ISO JPEG standards (DIS 10918-1 and draft |
|
610 DIS 10918-2). No would-be JPEG implementor should be without it. |
|
611 |
|
612 [3] ISO/IEC IS 10918-1, "Digital Compression and Coding of Continuous-tone |
|
613 Still Images, Part 1: Requirements and guidelines", February 1994. |
|
614 ISO/IEC DIS 10918-2, "Digital Compression and Coding of Continuous-tone |
|
615 Still Images, Part 2: Compliance testing", final approval expected 1994. |
|
616 |
|
617 These are the official standards documents. Note that the Pennebaker and |
|
618 Mitchell textbook is likely to be cheaper and more useful than the official |
|
619 standards. |
|
620 |
|
621 |
|
622 Changes to Section 21: YCbCr Images |
|
623 =================================== |
|
624 |
|
625 [This section of the Tech Note clarifies section 21 to make clear the |
|
626 interpretation of image dimensions in a subsampled image. Furthermore, |
|
627 the section is changed to allow the original image dimensions not to be |
|
628 multiples of the sampling factors. This change is necessary to support use |
|
629 of JPEG compression on odd-size images.] |
|
630 |
|
631 Add the following paragraphs to the Section 21 introduction (p. 89), |
|
632 just after the paragraph beginning "When a Class Y image is subsampled": |
|
633 |
|
634 In a subsampled image, it is understood that all TIFF image |
|
635 dimensions are measured in terms of the highest-resolution |
|
636 (luminance) component. In particular, ImageWidth, ImageLength, |
|
637 RowsPerStrip, TileWidth, TileLength, XResolution, and YResolution |
|
638 are measured in luminance samples. |
|
639 |
|
640 RowsPerStrip, TileWidth, and TileLength are constrained so that |
|
641 there are an integral number of samples of each component in a |
|
642 complete strip or tile. However, ImageWidth/ImageLength are not |
|
643 constrained. If an odd-size image is to be converted to subsampled |
|
644 format, the writer should pad the source data to a multiple of the |
|
645 sampling factors by replication of the last column and/or row, then |
|
646 downsample. The number of luminance samples actually stored in the |
|
647 file will be a multiple of the sampling factors. Conversely, |
|
648 readers must ignore any extra data (outside the specified image |
|
649 dimensions) after upsampling. |
|
650 |
|
651 When PlanarConfiguration=2, each strip or tile covers the same |
|
652 image area despite subsampling; that is, the total number of strips |
|
653 or tiles in the image is the same for each component. Therefore |
|
654 strips or tiles of the subsampled components contain fewer samples |
|
655 than strips or tiles of the luminance component. |
|
656 |
|
657 If there are extra samples per pixel (see field ExtraSamples), |
|
658 these data channels have the same number of samples as the |
|
659 luminance component. |
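
[Illustration only, not part of the proposed Section 21 text.] The padding
rule and the resulting chroma dimensions can be sketched as follows. The
helper names are hypothetical, and a component is represented simply as a
list of rows of samples.

```python
def round_up(value, multiple):
    """Smallest multiple of `multiple` that is >= value."""
    return -(-value // multiple) * multiple


def pad_by_replication(rows, horiz, vert):
    """Pad a component to multiples of the sampling factors.

    The last column and/or row is replicated, as the writer is advised
    to do before downsampling.
    """
    width = len(rows[0])
    padded_width = round_up(width, horiz)
    padded = [row + [row[-1]] * (padded_width - width) for row in rows]
    while len(padded) % vert:
        padded.append(list(padded[-1]))
    return padded


def chroma_dimensions(image_width, image_length, horiz, vert):
    """Width and length of each chroma component after downsampling."""
    return (round_up(image_width, horiz) // horiz,
            round_up(image_length, vert) // vert)
```

For a 5x3 image with 2,2 subsampling, 6x4 luminance samples are stored and
each chroma component is 3x2; a reader discards the extra column and row
after upsampling.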
|
660 |
|
661 Rewrite the YCbCrSubSampling field description (pp. 91-92) as follows |
|
662 (largely to eliminate possibly-misleading references to |
|
663 ImageWidth/ImageLength of the subsampled components): |
|
664 |
|
665 (first paragraph unchanged) |
|
666 |
|
667 The two elements of this field are defined as follows: |
|
668 |
|
669 Short 0: ChromaSubsampleHoriz: |
|
670 |
|
671 1 = there are equal numbers of luma and chroma samples horizontally. |
|
672 |
|
673 2 = there are twice as many luma samples as chroma samples |
|
674 horizontally. |
|
675 |
|
676 4 = there are four times as many luma samples as chroma samples |
|
677 horizontally. |
|
678 |
|
679 Short 1: ChromaSubsampleVert: |
|
680 |
|
681 1 = there are equal numbers of luma and chroma samples vertically. |
|
682 |
|
683 2 = there are twice as many luma samples as chroma samples |
|
684 vertically. |
|
685 |
|
686 4 = there are four times as many luma samples as chroma samples |
|
687 vertically. |
|
688 |
|
689 ChromaSubsampleVert shall always be less than or equal to |
|
690 ChromaSubsampleHoriz. Note that Cb and Cr have the same sampling |
|
691 ratios. |
|
692 |
|
693 In a strip TIFF file, RowsPerStrip is required to be an integer |
|
694 multiple of ChromaSubsampleVert (unless RowsPerStrip >= |
|
695 ImageLength, in which case its exact value is unimportant). |
|
696 If ImageWidth and ImageLength are not multiples of |
|
697 ChromaSubsampleHoriz and ChromaSubsampleVert respectively, then the |
|
698 source data shall be padded to the next integer multiple of these |
|
699 values before downsampling. |
|
700 |
|
701 In a tiled TIFF file, TileWidth must be an integer multiple of |
|
702 ChromaSubsampleHoriz and TileLength must be an integer multiple of |
|
703 ChromaSubsampleVert. Padding will occur to tile boundaries. |
|
704 |
|
705 The default values of this field are [ 2,2 ]. Thus, YCbCr data is |
|
706 downsampled by default! |
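
[Illustration only, not part of the proposed field description.] The
constraints stated above could be checked along these lines. The function
name and its argument layout are hypothetical; tile dimensions, when given,
take precedence over RowsPerStrip.

```python
def validate_subsampling(horiz, vert, image_length,
                         rows_per_strip=None,
                         tile_width=None, tile_length=None):
    """Return a list of constraint violations (empty if none)."""
    errors = []
    if horiz not in (1, 2, 4):
        errors.append("ChromaSubsampleHoriz must be 1, 2, or 4")
    if vert not in (1, 2, 4):
        errors.append("ChromaSubsampleVert must be 1, 2, or 4")
    if vert > horiz:
        errors.append("ChromaSubsampleVert exceeds ChromaSubsampleHoriz")
    if tile_width is not None and tile_length is not None:
        if tile_width % horiz:
            errors.append("TileWidth not a multiple of ChromaSubsampleHoriz")
        if tile_length % vert:
            errors.append("TileLength not a multiple of ChromaSubsampleVert")
    elif rows_per_strip is not None:
        # The exact value is unimportant once a strip covers the image.
        if rows_per_strip < image_length and rows_per_strip % vert:
            errors.append(
                "RowsPerStrip not a multiple of ChromaSubsampleVert")
    return errors
```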
|
707 </pre> |