EVALID
19/10/2005

EVALID compares two trees of files, ignoring non-significant differences in
some types of files.

EVALID can perform the comparison using two different methodologies.

1) Direct comparison of two trees of files, where each tree must be accessible
to EVALID at run time.

2) Indirect comparison, which generates a set of MD5 signatures for the files
in each tree in multiple runs of EVALID and then compares the sets of MD5 signatures
in an additional run of EVALID. This method does not require each file tree to
be accessible at the same time.
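
The indirect method can be pictured with a minimal Python sketch. It is an
illustration only, not EVALID's implementation: the function names are
hypothetical, and it hashes raw file bytes, whereas EVALID would presumably
need to hash the filtered form of each file so that non-significant
differences do not change the signature.

```python
import hashlib
import os


def md5_signatures(root):
    """Map each file path (relative to root) to the MD5 hex digest of its bytes."""
    signatures = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root)
            with open(full, "rb") as handle:
                signatures[rel] = hashlib.md5(handle.read()).hexdigest()
    return signatures


def compare_signatures(left, right):
    """Return the sorted relative paths that differ or exist on one side only."""
    return sorted(p for p in set(left) | set(right) if left.get(p) != right.get(p))
```

Each call to md5_signatures corresponds to one signature-generating run, so the
two trees never have to be available together; compare_signatures plays the
role of the additional comparison run.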

A typical EVALID report is something like:

OK: tree1\file1 and tree2\file1 (identical)
OK: tree1\file2 and tree2\file2 (some type)

FAILED: tree1\file3 and tree2\file3 (some type)

This indicates that file1 was completely byte-for-byte identical in the two
trees, that file2 was the same ignoring non-significant differences, and that
file3 was significantly different. Both file2 and file3 were different in the
strict byte-for-byte sense, so EVALID examined the file in the first tree to
identify the file type, and applied an associated comparison function to ignore
the non-significant differences.

The -v option to EVALID will report more detail of the comparison failures: for
types where the comparison is done on filtered text (e.g. Intel object), the
comparison failure will show the lines after filtering has taken place.

These are the types of file recognised by EVALID, with the comparison
functions and rules applied in each case. File type recognition is based on an
examination of the first few bytes of the file, not the file name.
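
Recognition by leading bytes might look like the following Python sketch. The
function name is hypothetical and the magic values are the commonly documented
ones for each format rather than values taken from EVALID itself; only a
handful of the types listed below are covered.

```python
def sniff_type(header):
    """Guess a file type label from the first few bytes of a file."""
    if header.startswith(b"\xca\xfe\xba\xbe"):
        return "Java class"
    if header.startswith(b"PK\x03\x04"):
        return "ZIP file"
    if header.startswith(b"\x7fELF"):
        return "ELF file"
    if header.startswith(b"!<arch>\n"):
        # Archive: a fuller version would inspect the first object inside.
        return "library"
    if header.startswith(b"MZ"):
        return "MSDOS executable"
    return "unknown format"
```

Because only the bytes are inspected, a ZIP file renamed to "foo.dat" would
still be reported as a ZIP file.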

--------
(identical)

EVALID did not determine the type: the files are the same size and have
the same sequence of bytes.

--------
(E32 DLL), (E32 EXE), (compressed E32 DLL), (compressed E32 EXE)

E32 Image format files as produced by PETRAN.

EVALID compares the files using "pediff -e32", which ignores the timestamp and
the version number in the file header, but requires the rest of the files to
be identical.

--------
(Intel DLL), (Intel EXE), (MSDOS executable)

PE-COFF format executable for machine type IMAGE_FILE_MACHINE_I386.

EVALID applies "pe_dump" to each file and compares the output.

The "pe_dump" utility ignores timestamps in the COFF header, the debug data,
the export directory and the .rsrc section. The timestamps are set to zero;
the headers and the contents of each remaining section are then required to
be identical.
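
Setting the COFF header timestamp to zero can be sketched in Python as below.
This is not the pe_dump source: the function name is hypothetical, the offsets
come from the published PE-COFF layout, and the debug data, export directory
and .rsrc handling are not attempted.

```python
import struct


def zero_coff_timestamp(image):
    """Return a copy of a PE image with the COFF TimeDateStamp field zeroed."""
    data = bytearray(image)
    if len(data) < 0x40 or data[:2] != b"MZ":
        raise ValueError("not an MZ image")
    # The 32-bit offset of the PE signature is stored at 0x3C in the DOS header.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    # COFF header after the signature: Machine (2), NumberOfSections (2),
    # TimeDateStamp (4), so the timestamp sits 8 bytes past the signature start.
    struct.pack_into("<I", data, pe_offset + 8, 0)
    return bytes(data)
```

Two builds that differ only in this field compare equal once both have been
passed through the function.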

--------
(Intel object)

PE-COFF object file for machine type IMAGE_FILE_MACHINE_I386.

EVALID applies "dumpbin /symbols /exports" to each file and compares the output
subject to the following filtering:

1) Lines beginning "Dump of file" have the filename removed
2) "line #xxxx" references to source line numbers are removed
3) The filenames associated with ".file" information have the drive and path
   information removed
4) The absolute symbol for the LIB.EXE version number is ignored
5) COFF section offsets are ignored
6) The size of each section is ignored
7) Summary information about the debug section is ignored

The dumpbin output is otherwise expected to be identical.
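
Rules 1 and 2 above could be approximated with a small text filter such as this
Python sketch. The function name is hypothetical, and the regular expression
assumes the line number after "line #" is written in decimal or hexadecimal
digits; the other rules depend on details of the dumpbin output not described
here and are left out.

```python
import re


def filter_dumpbin_line(line):
    """Apply rules 1 and 2 to a single line of dumpbin text output."""
    if line.startswith("Dump of file"):
        # Rule 1: keep the marker, drop the filename that follows it.
        return "Dump of file"
    # Rule 2: drop "line #xxxx" references (digit format is an assumption).
    return re.sub(r"line #\s*[0-9A-Fa-f]+", "", line)
```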

--------
(ARM object)

PE-COFF object file for machine type 0x0A00.

EVALID applies "nm --no-sort" to each file and compares the output subject to
the following filtering:

1) All filenames are ignored
2) Pathnames of object files are ignored
3) The unique symbol generated by dlltool is based on the -o argument
   specified on the command line, so it is "cleaned" to remove path information

The output of nm is otherwise expected to be identical.

--------
(Intel library), (ARM library)

Archive file in which the first identifiable object file is (Intel object) or
(ARM object) respectively.

EVALID compares libraries using the same approach as used for the corresponding
object files: the tools used accept libraries of object files as well as
individual files.

--------
(Java class)

Files which begin with the 0xCAFEBABE magic number.

Java class files ought to be (identical) as there are no non-significant
differences. EVALID therefore only reports this type for failures.

--------
(ZIP file)

Files which begin with the signature for PK ZIP format 3.4

EVALID applies "unzip -l -v" to each file and compares the output subject to
the following filtering:

1) The name of the archive file is ignored
2) The time and date stamps on the files are ignored

The output of unzip is otherwise expected to be identical, i.e. the zip file
contains the same filenames with the same sizes and checksums, in the same order.
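
The same check can be approximated without running unzip. The Python sketch
below uses the standard zipfile module in place of "unzip -l -v" and compares
only the member names, uncompressed sizes and CRC-32 checksums, in archive
order; the function names are hypothetical.

```python
import zipfile


def zip_listing(source):
    """List (name, uncompressed size, CRC-32) per member, in archive order.

    source may be a path or a binary file-like object.
    """
    with zipfile.ZipFile(source) as archive:
        return [(info.filename, info.file_size, info.CRC) for info in archive.infolist()]


def zips_match(left, right):
    """True when both archives list the same members with the same sizes and CRCs."""
    return zip_listing(left) == zip_listing(right)
```

Archive names and member timestamps never enter the comparison, which mirrors
the two filtering rules above.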

--------
(EPOC Permanent File Store)

Files which appear to have UID 1 equal to 0x10000050, without actually
computing the Symbian platform checksum to confirm that the UIDs are valid.

EVALID applies "pfsdump -c -v" and compares the output, ignoring the name
of the file being dumped. Pfsdump is a utility which prints out the
contents of the file store in stream ID order, thereby ignoring unreclaimed
free space in the filestore or the current location of each stream within
the store.

--------
(SIS file)

Files which appear to have the UIDs for narrow or Unicode SIS files.

EVALID does not know how to compare SIS files, so this type is only
reported for comparison failures.

--------
(MSVC database)

Microsoft database files, usually Debug databases (.PDB files).

EVALID does not know how to compare these files, but assumes that there are no
significant differences in these files which won't also be reflected in the
associated executables: this type is always reported as an "OK" comparison.

--------
(MAP file)

MAP file generated by the GNU linker.

EVALID filters this text format as follows:

1) Names such as ds999.o are ignored (as per ARM object files)
2) Pathnames to object files are ignored
3) The .stab and .stabstr lines are ignored because they relate to debug
   information
4) Lines which say "size before relaxing" are ignored as they also relate to
   debug information
5) The unique "_head" and "_iname" symbols in import libraries are ignored
   (as per ARM object files)

The files are otherwise expected to be identical.
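
A minimal Python sketch of rules 1, 3 and 4 is given below. The function name
and the regular expression are assumptions: "ds" followed by digits and ".o"
is taken as the pattern for names such as ds999.o, and a line is treated as a
.stab or .stabstr line when that is its first token.

```python
import re


def filter_map_lines(lines):
    """Drop or clean MAP file lines according to rules 1, 3 and 4."""
    kept = []
    for line in lines:
        if line.lstrip().startswith(".stab"):
            continue  # rule 3: matches both .stab and .stabstr
        if "size before relaxing" in line:
            continue  # rule 4
        # Rule 1: blank out generated names such as ds999.o.
        kept.append(re.sub(r"\bds\d+\.o\b", "", line))
    return kept
```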

--------
(SGML file)

Files which contain "<!DOCTYPE" and therefore follow the SGML standard - this
includes XML and HTML files.

EVALID filters these text files to remove the text inside single-line
comments, e.g. <!-- comment -->, and expects the files to be otherwise identical.
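
The comment filtering can be sketched in Python as follows. The function name
is hypothetical, and the sketch reads "remove the text inside" literally: the
comment delimiters are kept and only their contents are emptied. Comments that
span more than one line are left untouched because "." does not match a
newline.

```python
import re

# Non-greedy match, so two comments on one line are handled separately.
SINGLE_LINE_COMMENT = re.compile(r"<!--.*?-->")


def strip_single_line_comments(text):
    """Empty every comment that opens and closes on the same line."""
    return SINGLE_LINE_COMMENT.sub("<!---->", text)
```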

--------
(Preprocessed text)

Files which begin with "# 1 "filename"" are assumed to be the output of CPP.EXE.

EVALID filters these text files to remove the lines which record the #include
structure leading to the final contents of the file, and expects the files to
be otherwise identical.
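
As an illustration, the Python sketch below drops lines of the form
'# <number> "filename"', optionally followed by numeric flags, which is the
shape of the markers emitted by GNU cpp. The function name and the exact
pattern are assumptions, not EVALID's own rule.

```python
import re

# A marker line: '# 12 "path/to/file.h"' with optional trailing flag numbers.
LINEMARKER = re.compile(r'^# \d+ "[^"]*"(?: \d+)*\s*$')


def strip_linemarkers(text):
    """Remove preprocessor line markers, keeping all other lines in order."""
    return "\n".join(line for line in text.splitlines() if not LINEMARKER.match(line))
```

Two preprocessed files built from differently located source trees then
compare equal as long as the remaining text is the same.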

--------
(unknown format)

Files which weren't recognised as any of the above types, and which weren't
identical.

EVALID only reports this type for comparison failures.

--------
(unknown library)

Archive files which didn't appear to contain Intel or ARM object files.

EVALID only reports this type for comparison failures.

--------
(Unknown COFF object)

COFF format object file with a machine_type which doesn't imply either
(ARM object) or (Intel object).

EVALID only reports this type for comparison failures.

--------
(ELF file)

ELF format file.

EVALID applies "elfdump" and compares the output, ignoring the program
header and section header offsets. Elfdump is a utility which prints out
the significant parts of an ELF executable, ignoring non-significant
differences such as the offsets of symbols in the string table.

--------
(CHM file)

Microsoft's Compiled HTML Help files.

EVALID applies "hh -decompile" to each file, expanding it to a temporary
directory, and then compares the contents of the temporary directories using
the following process:

When directly comparing two CHM files, EVALID first compares the file listings
of the temporary directories and reports a failure if the listings are not
identical. If the file listings are identical, it then compares the contents of
the two temporary directories using the normal EVALID process.

When using the MD5 comparison functionality, EVALID processes the contents of
the temporary directory using the standard EVALID process but amalgamates the
results into one MD5 signature.
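
How the per-file results are amalgamated is not specified here. One plausible
scheme, shown purely as an assumption, is to hash the sorted list of relative
paths and their individual MD5 signatures; the function name is hypothetical.

```python
import hashlib


def combined_signature(per_file_md5):
    """Fold a mapping of relative path -> MD5 hex digest into one MD5 hex digest."""
    combined = hashlib.md5()
    # Sorting makes the result independent of directory traversal order.
    for path in sorted(per_file_md5):
        combined.update(path.encode("utf-8"))
        combined.update(b"\0")
        combined.update(per_file_md5[path].encode("ascii"))
        combined.update(b"\n")
    return combined.hexdigest()
```

Including the paths in the hash means a changed file listing alters the
combined signature, in keeping with the listing check used for direct
comparison.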