Monday, March 28, 2016

MD5Deep Installation

The project I chose to optimize is md5deep, a program that computes hashes for any number of files. It is not limited to MD5: it can also use SHA variants, Tiger and Whirlpool. It is similar to the md5sum program but has some additional features, which I quote from http://md5deep.sourceforge.net/:
  • Recursive operation - md5deep is able to recursively examine an entire directory tree. That is, compute the MD5 for every file in a directory and for every file in every subdirectory.
  • Comparison mode - md5deep can accept a list of known hashes and compare them to a set of input files. The program can display either those input files that match the list of known hashes or those that do not match. Hash sets can be drawn from Encase, the National Software Reference Library, iLook Investigator, Hashkeeper, md5sum, BSD md5, and other generic hash generating programs. Users are welcome to add functionality to read other formats too!
  • Time estimation - md5deep can produce a time estimate when it's processing very large files.
  • Piecewise hashing - Hash input files in arbitrary sized blocks
  • File type mode - md5deep can process only files of a certain type, such as regular files, block devices, etc.
I'll be taking a look at the source code to find opportunities for optimization. The possibilities are:
  • Improving a hashing algorithm
  • Adding code that targets AArch64 specifically, without affecting other systems
  • Editing the makefile so that the build produces a faster program
I downloaded the source code from https://github.com/jessek/hashdeep/releases and ran the following commands to install it correctly:
sh bootstrap.sh
./configure
make
sudo make install
To test that it worked, I hashed a few files and saved the hashes in a file. With the hashes saved, you can find matching files using md5deep's comparison mode. The next step will be benchmarking.
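The test described above might look roughly like the sketch below. The scratch directory and file names are made up for illustration and are not the files I actually used; the only md5deep options involved are -r (recursive operation) and -m (comparison mode, show matching files).

```shell
# Hypothetical scratch directory and file names, used only for this sketch.
workdir=$(mktemp -d)
mkdir -p "$workdir/samples/sub"
printf 'hello\n' > "$workdir/samples/a.txt"
printf 'world\n' > "$workdir/samples/sub/b.txt"

if command -v md5deep >/dev/null 2>&1; then
    # -r recurses into subdirectories; the hash list is saved to a file,
    # one line per hashed file.
    md5deep -r "$workdir/samples" > "$workdir/known.md5"

    # A copy with identical content but a different name and location.
    cp "$workdir/samples/a.txt" "$workdir/copy-of-a.txt"

    # -m reads the saved list and prints the input files whose hash is in it.
    md5deep -m "$workdir/known.md5" "$workdir/copy-of-a.txt"
else
    echo "md5deep is not installed; nothing to test" >&2
fi
```

Using -x in place of -m should do the opposite and list only the input files whose hashes are not in the saved list.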
