EVALID19/10/2005EVALID compares two trees of files, ignoring non-significant differences insome types of files.EVALID can perform the comparision using two different methodlogies.1) Direct comparision of two trees of files, where each tree must be accessibleat run to time to EVALID.2) Indirect comparision which generates a set of MD5 signitures for the filesin each tree in multiple runs of EVALID and then compares sets of MD5 signituresin an additional run of EVALID. This method does not require each file tree tobe accessible at the same time.A typical EVALID report is something like: OK: tree1\file1 and tree2\file1 (identical) OK: tree1\file2 and tree2\file2 (some type) FAILED: tree1\file3 and tree2\file3 (some type)This indicates that file1 was completely byte-for-byte identical in the twotrees, that file2 was the same ignoring non-significant differences, and thatfile2 was significantly different. Both file2 and file3 were different in thestrict byte-for-byte sense, so EVALID examined the file in the first tree toidentify the file type, and applied an associated comparison function to ignorethe non-significant differences.The -v option to EVALID will report more detail of the comparison failures: for types where the comparison is done on filtered text (e.g. Intel object), the comparison failure will show the lines after filtering has taken place.These are the types of file recognised by EVALID, with the comparisonfunctions and rules applied in each case. File type recognition is based on anexamination of the first few bytes of the file, not the file name.--------(identical)EVALID did not determine the type: the files are the same size and havethe same sequence of bytes.--------(E32 DLL), (E32 EXE), (compressed E32 DLL), (compressed E32 EXE)E32 Image format files as produced by PETRAN. EVALID compares the files using "pediff -e32", which ignores the timestamp and the version number in the file header, but requires the rest of the files to be identical.--------(Intel DLL), (Intel EXE), (MSDOS executable)PE-COFF format executable for machine type IMAGE_FILE_MACHINE_I386.EVALID applies "pe_dump" to each file and compares the output.The "pe_dump" utility ignores timestamps in the COFF header, the debug data, the export directory and the .rsrc section. The timestamps are set to zero, but then the headers and the contents of each section are required to be identical.--------(Intel object)PE-COFF object file for machine type IMAGE_FILE_MACHINE_I386.EVALID applies "dumpbin /symbols /exports" to each file and compares the outputsubject to the following filtering:1) Lines beginning "Dump of file" have the filename removed2) "line #xxxx" references to source line numbers are removed3) The filenames associated with ".file" information have the drive and path information removed.4) The absolute symbol for the LIB.EXE version number is ignored.5) COFF section offsets are ignored6) The size of each section is ignored.7) Summary information about the debug section is ignoredThe dumpbin output is otherwise expected to be identical.--------(ARM object)PE-COFF object file for machine type 0x0A00.EVALID applies "nm --no-sort" to each file and compares the output subject tothe following filtering:1) All filenames are ignored2) Pathnames of object files are ignored3) The unique symbol generated by dlltool is based on the -o argument specified on the command line, so it is "cleaned" to remove path informationThe output of nm is otherwise expected to be identical.--------(Intel library), (ARM library)Archive file in which the first identifable object file is (Intel object) or(ARM object) respectively. EVALID compares libraries using the same approach as used for the correspondingobject files: the tools used accept libraries of object files as well asindividual files.--------(Java class)Files which begin with the 0xCAFEBABE magic number.Java class files ought to be (identical) as there are no non-significant differences. EVALID therefore only reports this type for failures.--------(ZIP file)Files which begin with the signature for PK ZIP format 3.4EVALID applies "unzip -l -v" to each file and compares the output subject to the following filtering:1) The name of the archive file is ignored2) The time and datestamps on the files are ignoredThe output of zip is otherwise expected to be identical, i.e. the zip filecontains the same filenames with the same sizes and checksums, in the same order.--------(EPOC Permanent File Store)Files which appear to have UID 1 equal to 0x10000050, without actuallycomputing the Symbian platform checksum to confirm that the UIDs are valid.EVALID applies "pfsdump -c -v" and compares the output, ignoring the nameof the file being dumped. Pfsdump is a utility which prints out thecontents of the file store in stream ID order, thereby ignoring unreclaimedfree space in the filestore or the current location of each stream withinthe store.--------(SIS file)Files which appear to have the UIDs for narrow or unicode SIS files.EVALID does not know how to compare SIS files, so this type is onlyreported for comparison failures.--------(MSVC database)Microsoft database files, usually Debug databases (.PDB files).EVALID does not know how to compare these files, but assumes that there are nosignificant differences in these files which won't also be reflected in theassociated executables: this type is always reported as an "OK" comparison.--------(MAP file)MAP file generated by the GNU linker.EVALID filters this text format as follows:1) Names such as ds999.o are ignored (as per ARM object files)2) Pathnames to object files are ignored3) The .stab and .stabstr lines are ignored because they relate to debug information4) Lines which say "size before relaxing" are ignored as they also relate to debug information5) The unique "_head" and "_iname" symbols in import libraries are ignored (as per ARM object files)The files are otherwise expected to be identical.--------(SGML file)Files which contain "<!DOCTYPE" and therefore follow the SGML standard - thisincludes XML and HTML files.EVALID filters these text files to remove the text inside single line scriptcomments, e.g. <!-- comment -->, and expects the files to be otherwise identical.--------(Preprocessed text)Files which begin with "# 1 "filename"" are assumed to be the output of CPP.EXE.EVALID filters these text files to remove the lines which record the #include structure leading tothe final contents of the file, and expects the files to be otherwise identical.--------(unknown format)Files which weren't recognised as any of the above types, and which weren'tidentical.EVALID only reports this type for comparison failures.--------(unknown library)Archive files which didn't appear to contain Intel or ARM object filesEVALID only reports this type for comparison failures.--------(Unknown COFF object)COFF format object file with a machine_type which doesn't imply either(ARM object) or (Intel object).EVALID only reports this type for comparison failures.--------(ELF file)ELF format file.EVALID applies "elfdump" and compares the output, ignoring the programheader and section header offsets. Elfdump is a utility which prints out the significant parts of an ELF executable, ignoring non-significantdifferences such as offsets of symbols in string table.--------(CHM file)Microsoft's Compiled HTML Help filesEVALID applies "hh -decompile" to each file expanding it to a temporary directory and then comparesthe contents of the temporary directory using the following process:When directly comparing two chm files EVALID will first compare the file listing of the temporarydirectories and return failed if the file listing is not identical.If the file listing are identical it will then compare the contents of the two temporary directoriesusing the normal EVALID process.When using the MD5 comparing functionality EVALID processes the contents of the temporary directoryusing the standard EVALID process but amalgamates the results in to one MD5 signiture.