The eMag Link Monthly Articles
March Edition
How Can You Read Scanned Images Archived on an AS/400?
Hashing Added to Forensic Scanning Option of MM/PC
How Can You Read Scanned Images Archived on an AS/400?
Many companies today hold vast amounts of data on optical storage systems. Scanned images archived on an AS/400 platform in MO:DCA format sometimes need to be read outside that platform as a standard image type such as TIFF or JPEG (e.g. MO:DCA to TIFF conversion). A scanned document is purely a picture: it contains no searchable text, so it cannot be located by its own content. Instead, documents usually have to be retrieved through indexing information held in a separate file or files, for example in DB2 format. So how do you do this?
Generally, in the first stage of the conversion you need to read the optical disks and create the required images. Many optical disks use proprietary recording methods, so it is not possible simply to attach an optical drive to a PC or Unix system. In the second stage you need to convert the index files. A typical job might involve several index files holding many millions of records. Each index record has an associated scanned document, often multi-page, and the documents can belong to different logical types. For a bank, for instance, the documents could be account details, statements or loan agreements.
Moving and converting files between platforms can also raise filename issues. Scanned images often use sequential numbers or similar naming systems, and because these names are referenced within the index, it is essential that any renaming required for platform compatibility is tracked through a revised index. At the same time, it may be necessary to process the index files into a format suitable for importing into a new system.
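As a rough illustration of tracking renames through a revised index, the Python sketch below renames each referenced image and rewrites the index records to match. The field names, the sample records and the 8-digit ".tif" naming rule are all hypothetical; a real index (for example one exported from DB2) will have its own layout.

```python
import csv
import io

def rename_and_reindex(index_rows, name_field, make_new_name):
    """Return (revised_rows, mapping) with every referenced filename renamed."""
    mapping = {}   # old filename -> new filename, kept as an audit trail
    revised = []
    for row in index_rows:
        old = row[name_field]
        # Reuse the same new name if several records point at one image.
        new = mapping.setdefault(old, make_new_name(old))
        revised.append({**row, name_field: new})
    return revised, mapping

def to_tiff_name(old_name):
    # Hypothetical rule: pad the sequence number to 8 digits and add .tif
    return f"{int(old_name):08d}.tif"

def to_csv(rows):
    """Serialise the revised index as CSV text for import into a new system."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()

# Hypothetical index records for illustration only.
index_rows = [
    {"account": "1001", "doc_type": "statement", "image": "42"},
    {"account": "1002", "doc_type": "loan agreement", "image": "43"},
]
revised_rows, mapping = rename_and_reindex(index_rows, "image", to_tiff_name)
print(to_csv(revised_rows))
```

Keeping the old-to-new mapping alongside the revised index means any converted image can still be traced back to its original name on the source platform.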
The final stage in any conversion exercise is to ensure that the receiving system can handle the converted files. There may be limits on file size or on the number of files within a subdirectory. It is also essential to choose compatible media, such as DVD or tape. Once each of the above stages has been resolved, and there can be many options at each stage, a small or large-scale conversion can take place.
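Where the receiving system limits the number of files in a subdirectory, one simple approach is to spread the converted files across numbered subdirectories. The Python sketch below shows the idea; the "dir0000" naming scheme and the limit of two files are assumptions chosen only to keep the example small.

```python
def assign_subdirectories(filenames, max_per_dir):
    """Map each filename to a numbered subdirectory of at most max_per_dir files."""
    if max_per_dir < 1:
        raise ValueError("max_per_dir must be at least 1")
    layout = {}
    # Sorting makes the layout repeatable from one run to the next.
    for position, name in enumerate(sorted(filenames)):
        layout[name] = f"dir{position // max_per_dir:04d}"
    return layout

# Hypothetical converted files and a deliberately tiny limit.
layout = assign_subdirectories(
    ["a.tif", "b.tif", "c.tif", "d.tif", "e.tif"], max_per_dir=2
)
for name, subdirectory in layout.items():
    print(subdirectory, name)
```

If files are moved into subdirectories like this, the new locations would need to be recorded in the revised index in the same way as any renamed files.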
eMag can help you with this type of optical conversion as well as many others. Contact us today to learn more.
Hashing Added to Forensic Scanning Option of MM/PC
A hash is an algorithm for computing a condensed representation of a message or a data file. The condensed representation is of fixed length and is known as a 'message digest' or 'fingerprint'. As with a human fingerprint, it is in theory computationally infeasible to produce two messages having the same message digest. This uniqueness enables the message digest to act as a 'fingerprint' of the message, opening up the possibility of using this technology for validating data integrity and for comparison checking.
For instance, when you download or receive a file, you can use MD5 or SHA-1 to check that you have the correct, unaltered file by comparing its hash value with the one published for the original. You are essentially verifying the file's integrity. A typical hash value looks something like F18CB6E5CE925864BD872CD953C2A2755EAD3368 (40 hexadecimal digits, the length of a 160-bit SHA-1 digest)... exciting stuff! Now that you know what to look for, you will see hash values quoted on the download pages of numerous web sites.
On a day-to-day basis, we use hash algorithms all the time. Whenever we log into a computer, a hash value of our password is compared against the hash value on record, and if the two match, we are let into the system. A hash is a one-way computation: you cannot work backwards from the hash value to recover the original file or string contents.
Today there are two hash algorithms in mainstream use: SHA-1 and the older MD5.
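The download check described above can be sketched in Python with the standard hashlib module. The function names here are illustrative, not part of any product; the data is read in chunks so that a large file does not have to fit in memory.

```python
import hashlib
import io

def stream_digest(stream, algorithm="sha1", chunk_size=1 << 20):
    """Hash a binary stream chunk by chunk and return the hex digest."""
    hasher = hashlib.new(algorithm)
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        hasher.update(chunk)
    return hasher.hexdigest()

def file_digest(path, algorithm="sha1"):
    """Hash the file at the given path."""
    with open(path, "rb") as stream:
        return stream_digest(stream, algorithm)

def matches_published(actual_hex, published_hex):
    # Published values are often shown in upper case, so compare
    # case-insensitively and ignore surrounding whitespace.
    return actual_hex.lower() == published_hex.strip().lower()

# An in-memory stream stands in for a downloaded file.
sample = io.BytesIO(b"abc")
print(stream_digest(sample, "sha1"))
```

In practice you would call file_digest on the downloaded file and pass the result, together with the value quoted on the download page, to matches_published.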
SHA-1: The Secure Hash Algorithm (SHA) was developed by NIST and is specified in the Secure Hash Standard (SHS, FIPS 180). SHA-1 is a revision of that original algorithm, announced in 1994 and issued as FIPS 180-1 in 1995. It is also described in the ANSI X9.30 (part 2) standard. SHA-1 produces a 160-bit (20-byte) message digest. It is slower than MD5, but its larger digest makes it stronger against brute-force attacks.
MD5: MD5 was developed by Professor Ronald L. Rivest in 1991. It produces a 128-bit (16-byte) message digest and is faster to compute than SHA-1.
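The digest sizes quoted for the two algorithms can be confirmed with a few lines of Python. This is only a quick check using the standard hashlib module; the helper name and the sample message are arbitrary.

```python
import hashlib

def digest_profile(algorithm, data=b"sample message"):
    """Report the digest size of an algorithm in bytes, bits and hex digits."""
    hasher = hashlib.new(algorithm, data)
    return {
        "bytes": hasher.digest_size,
        "bits": hasher.digest_size * 8,
        "hex_digits": len(hasher.hexdigest()),
    }

for name in ("md5", "sha1"):
    print(name, digest_profile(name))

# The sample value quoted earlier in the article: each hex digit
# carries 4 bits, so its length gives the digest size in bits.
sample_value = "F18CB6E5CE925864BD872CD953C2A2755EAD3368"
print(len(sample_value) * 4, "bits")
```

The digest length is fixed by the algorithm and does not depend on the size of the input.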
So how have we incorporated hashing within MediaMerge/PC software? MM/PC is used to restore files from tape. Today's tapes, with their ever-increasing capacities, can easily store 500,000 files, and they are typically backups. In the forensic field, the contents of the tape have to be examined. If we compute the hash value of each file at restoration time and check whether that value has been computed before, we can determine whether we have already restored a file with the same contents (regardless of name or date). If so, we can optionally skip the file. In some cases this greatly reduces the number of files to be restored and forensically examined, making the process much quicker. The hash value can also be checked against a list of known hash values, say for system files or known illegal images, and used to decide whether or not to restore a file.
We have added MD5 and SHA-1 hashes to the Forensic Scan option within MM/PC at the request of some of our forensics clients. The update is available for download now; contact us today to learn more. Note: We will soon be adding a de-duplicating option to this facility and will let you know more about this shortly.
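The general idea can be sketched in a few lines of Python. This is not how MM/PC is implemented; it is a minimal illustration that assumes file contents are already available as bytes, and that files matching the known-hash list are to be skipped (a real workflow might flag them instead). All names and sample files are hypothetical.

```python
import hashlib

def plan_restore(files, known_hashes=frozenset(), algorithm="sha1"):
    """Decide which files to restore, skipping known and duplicate contents.

    files is an iterable of (name, content_bytes) pairs.
    Returns (to_restore, skipped), where skipped holds (name, reason) pairs.
    """
    seen = {}                      # digest -> first file name with that content
    to_restore, skipped = [], []
    for name, content in files:
        digest = hashlib.new(algorithm, content).hexdigest()
        if digest in known_hashes:
            # Assumption: known files (e.g. system files) are not restored.
            skipped.append((name, "known"))
        elif digest in seen:
            skipped.append((name, f"duplicate of {seen[digest]}"))
        else:
            seen[digest] = name
            to_restore.append(name)
    return to_restore, skipped

# Hypothetical tape contents for illustration only.
tape_files = [
    ("a.doc", b"report"),
    ("copy of a.doc", b"report"),   # same contents, different name
    ("sys.dll", b"system"),
    ("b.doc", b"other"),
]
known = {hashlib.sha1(b"system").hexdigest()}
to_restore, skipped = plan_restore(tape_files, known)
print(to_restore)
print(skipped)
```

Because the comparison is made on content hashes, a file is treated as a duplicate even when its name or date differs from the copy already restored.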
This article may be re-published as long as the following resource box is included at the end of the article and as long as you link to the email address and the URL mentioned in the resource box:
Article by eMag Solutions. For more articles on eDiscovery and Data Restoration, subscribe to our e-mail Newsletter by sending a blank email to newsletter@emaglink.com or by going to http://www.emaglink.com.