Day 20 : 21 July 2022 : Test Program Update : Message Digest Algorithms

My 100 Daze of Code

https://github.com/davidjwalling/100-days-of-code

#20 : Test Program Update: Message Digest Algorithms

A couple of posts ago we added a test program "testava" to the project. At that time, the test program simply logged startup and completion messages. Here, we add some reusable functions to the test harness and demonstrate them by adding the first code under test, message digest algorithms.

A message digest is usually the fixed-length binary output of an algorithm applied to variable-length input data, where the computed output, the message "digest", is highly likely to differ if any bit of the input changes. Over the years, numerous message digest algorithms have come and gone. Some are preferred for their speed and others for their suitability to particular problem domains such as cryptology.
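To make the idea concrete, here is a minimal sketch using FNV-1a, a tiny non-cryptographic hash. It is only a stand-in chosen for brevity, not one of the algorithms implemented in the Ava library, and the function name fnv1a32 is my own.

```cpp
#include <cstdint>
#include <string>

// FNV-1a, 32-bit: a small NON-cryptographic hash, used here only to show
// "fixed-length output from variable-length input". It is not part of Ava.
std::uint32_t fnv1a32(const std::string& data)
{
    std::uint32_t hash = 2166136261u;   // FNV offset basis
    for (unsigned char byte : data) {
        hash ^= byte;                   // mix in the next input byte
        hash *= 16777619u;              // multiply by the FNV prime (wraps mod 2^32)
    }
    return hash;                        // always 32 bits, whatever the input length
}
```

Whatever the length of the input string, the result is always 32 bits wide, and changing a single input byte changes the result. A cryptographic digest such as SHA-256 makes the same promise with far stronger guarantees and a much longer output.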

Here, we'll introduce message digest classes to our Ava library that implement the MD5, SHA-1 and SHA-2 message digest algorithms. MD5 is no longer recommended in cryptology but still produces fast and effective message digests for determining whether a source file, for example, has been altered from its original. SHA-1 and SHA-2 are more widely used today. SHA-1 produces slightly larger message digests than MD5, 160 bits versus 128 bits, but it too is no longer recommended in cryptology. The SHA-2 algorithms include SHA-224, SHA-256, SHA-384 and SHA-512. In a future post, we'll look at more modern message digest algorithms, including SHA-3, which is part of the Keccak family of cryptographic primitives (based on RadioGatún).

The source code for my implementation of the message digest algorithms is not reproduced here, but can be found in the source code repository on GitHub at the link above.

Updates to the main Routine

The testava main routine is updated to handle program arguments. There are three keyword variables that can be entered as program arguments. The "test" keyword, if entered first, will instruct the program that algorithm validation tests are to be run. The "all" keyword instructs the program not to prompt for each test category. The "dump" keyword instructs the program to display test variable values before and after tests.

If an invalid number of program arguments is entered, or if "help" is provided as the first program argument, a "usage" string is displayed. If no program arguments are provided at start, the program will accept arguments until a carriage return is entered. If "test" was provided as the first argument, the main routine will call runTests once the program arguments have been processed.
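The keyword handling described above can be sketched roughly as follows. The Options struct, the parseArgs name, the limit of three arguments and the fallback to "usage" on an unrecognized keyword are all assumptions of mine for illustration, not code from testava.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical option flags; the names are illustrative, not taken from testava.
struct Options {
    bool help = false;   // show the usage string
    bool test = false;   // run algorithm validation tests
    bool all  = false;   // do not prompt for each test category
    bool dump = false;   // display test values before and after tests
};

// Interpret the arguments that follow the program name.
Options parseArgs(const std::vector<std::string>& args)
{
    Options opts;
    // Assumed rule: more than three arguments, or "help" first, shows usage.
    if (args.size() > 3 || (!args.empty() && args[0] == "help")) {
        opts.help = true;
        return opts;
    }
    for (std::size_t i = 0; i < args.size(); ++i) {
        if (i == 0 && args[i] == "test") {
            opts.test = true;            // "test" only counts as the first argument
        } else if (args[i] == "all") {
            opts.all = true;
        } else if (args[i] == "dump") {
            opts.dump = true;
        } else {
            opts.help = true;            // assumption: anything else falls back to usage
        }
    }
    return opts;
}
```

A main routine could build the vector from argv + 1 through argv + argc, call parseArgs, print the usage string when help is set, and call runTests when test is set.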

The runTests Routine

The runTests routine starts by displaying the sizes of several data types, which can vary depending on the operating system and compiler in use. Then the queryRun routine is called to prompt the user whether to run the testDigestAlgs routine. In later posts, we'll add calls to queryRun for other test categories. Finally, a message is output indicating that all tests have been run.

The queryRun Routine

The queryRun routine is a reusable routine that prompts the user whether to run a test routine unless the "all" program argument was provided.

The testDigestAlgs Routine

The testDigestAlgs routine prompts the user whether to run tests for each supported message digest algorithm. If the "all" program argument was entered, then all tests are run without prompting the user. If the "dump" program argument was entered, then test vectors and results are displayed. 

For each message digest test set, a structure of type DIGESTTEST is defined that includes the test label, the input vector, a count of iterations and the desired output. As an example, SHA-256 has six tests in total. Each test has its own input vector, input length and expected result, and one of the tests is iterated 10,000 times.
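As a rough illustration, a test definition along these lines could be laid out as below. The DigestTest name, its field names and types are guesses of mine, not the DIGESTTEST declared in the repository. The one entry shown uses the widely published SHA-256 test vector for the three-byte message "abc".

```cpp
#include <cstddef>
#include <cstdint>

// A guess at the shape of a digest test definition (hypothetical names).
struct DigestTest {
    const char*         label;       // test label shown in the output
    const std::uint8_t* input;       // input vector
    std::size_t         inputLen;    // input length in bytes
    std::size_t         iterations;  // number of times the digest is iterated
    const std::uint8_t* expected;    // expected digest bytes
};

// Published SHA-256 test vector for the message "abc".
static const std::uint8_t abcInput[] = { 'a', 'b', 'c' };
static const std::uint8_t abcSha256[32] = {
    0xba, 0x78, 0x16, 0xbf, 0x8f, 0x01, 0xcf, 0xea,
    0x41, 0x41, 0x40, 0xde, 0x5d, 0xae, 0x22, 0x23,
    0xb0, 0x03, 0x61, 0xa3, 0x96, 0x17, 0x7a, 0x9c,
    0xb4, 0x10, 0xff, 0x61, 0xf2, 0x00, 0x15, 0xad
};

// One table per algorithm; further tests are added as extra rows.
static const DigestTest sha256Tests[] = {
    { "SHA-256 \"abc\"", abcInput, sizeof abcInput, 1, abcSha256 },
};
```

Keeping the tests in a table means a new vector is added by appending a row, without touching the code that runs them.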

The testSHA256 Routine

The algorithm-specific test routines are all similar. Each defines an output field and a digest object, which is an instance of one of the message digest classes, and then calls the testDigest routine, passing the output field, the digest object and a pointer to the specific test definitions in a DIGESTTEST structure.

The testDigest Routine

The testDigest routine is reusable and services all message digest tests based on the contents of the DIGESTTEST structure passed and the message digest class instance. Since all message digest classes are derived from the Digest class, the behavior is polymorphic. The digest is initialized, run through each iteration over the input vector or current state, and then the final hash is computed. If the "dump" program argument was provided, then the dumpMem routine is called to output data.
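The polymorphic flow can be sketched as follows. The method names begin, update and end are placeholders rather than the actual interface of Ava's Digest class, the parameter list is simplified, and XorDigest is a toy one-byte "digest" included only so that the sketch runs without a real algorithm. I also assume here that each iteration simply feeds the input vector again, which may not match how the repository iterates.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical base class; the method names are assumptions, not Ava's API.
class Digest {
public:
    virtual ~Digest() = default;
    virtual void begin() = 0;                                        // reset state
    virtual void update(const std::uint8_t* data, std::size_t len) = 0;
    virtual std::vector<std::uint8_t> end() = 0;                     // final digest
};

// Toy one-byte XOR "digest", a stand-in for MD5, SHA-1 or SHA-2 classes.
class XorDigest : public Digest {
    std::uint8_t acc = 0;
public:
    void begin() override { acc = 0; }
    void update(const std::uint8_t* data, std::size_t len) override {
        for (std::size_t i = 0; i < len; ++i) acc ^= data[i];
    }
    std::vector<std::uint8_t> end() override { return { acc }; }
};

// Run one test through any Digest-derived object and compare the result
// with the expected bytes. Assumption: every iteration re-feeds the input.
bool testDigest(Digest& digest, const std::uint8_t* input, std::size_t inputLen,
                std::size_t iterations, const std::vector<std::uint8_t>& expected)
{
    digest.begin();
    for (std::size_t i = 0; i < iterations; ++i) {
        digest.update(input, inputLen);
    }
    return digest.end() == expected;
}
```

Because testDigest only sees a Digest reference, the same routine serves every algorithm. An algorithm-specific routine just constructs its own digest object and passes it in.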

The dumpMem Routine

The dumpMem routine displays a sequence of bytes in hexadecimal notation with relative address labels. Lookup tables are used to quickly translate binary values to their ASCII or hexadecimal equivalents.
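A hex dump of this kind can be sketched as below. The output layout (four-digit offset, 16 bytes per line, ASCII column between bars) is my own assumption, and this version returns a string rather than printing so that it is easy to check. It uses a lookup table for the hexadecimal digits and, for brevity, a simple range check rather than a second table for the ASCII column.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Format bytes as hex-dump lines: "offset  hex bytes  |ascii|", 16 per line.
// The layout is an assumption; the real dumpMem output may differ.
std::string dumpMem(const std::uint8_t* data, std::size_t len)
{
    static const char hexDigits[] = "0123456789abcdef";  // nibble -> hex digit
    const std::size_t perLine = 16;
    std::string out;
    for (std::size_t offset = 0; offset < len; offset += perLine) {
        // Relative address label: low 16 bits of the offset as four hex digits.
        for (int shift = 12; shift >= 0; shift -= 4) {
            out += hexDigits[(offset >> shift) & 0x0f];
        }
        out += "  ";
        std::string ascii;
        for (std::size_t i = 0; i < perLine; ++i) {
            if (offset + i < len) {
                const std::uint8_t b = data[offset + i];
                out += hexDigits[b >> 4];        // high nibble
                out += hexDigits[b & 0x0f];      // low nibble
                out += ' ';
                // Printable ASCII is shown as-is, everything else as '.'.
                ascii += (b >= 0x20 && b < 0x7f) ? static_cast<char>(b) : '.';
            } else {
                out += "   ";                    // pad a short final line
            }
        }
        out += " |" + ascii + "|\n";
    }
    return out;
}
```

For the three bytes 'a', 'b', 'c', the single output line starts with "0000  61 62 63" and ends with "|abc|".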

Test Output

Finally, the test program output shows the program title messages, the program accepting user input as an alternative to program arguments, the prompts asking the user whether to run various tests, and the test information, including input vectors, output results and whether the test output matched the expected results.

We'll continue to use the same test framework as the project proceeds, simply adding additional test vectors, test configuration structures and hierarchical calls to test routines.




