How Are Hashing Algorithms Used in Cryptography?

A hash function (H) takes a variable-length block of data and returns a hash value of a fixed size. A good hash function has the property that, when it is applied to a large number of inputs, the outputs are evenly distributed and appear random. Generally, the primary purpose of a hash function is to maintain data integrity: a change to any bit or bits in the input will, with high probability, change the hash code.

The type of hash function that is needed for security purposes is called a cryptographic hash function.

A cryptographic hash function (or cryptographic hash algorithm) is an algorithm for which it is computationally infeasible (no attack is significantly more efficient than brute force) to find either:

A data object that maps to a pre-specified hash result (the one-way property)
Two data objects that map to the same hash result (the collision-free property)

Because of these properties, a hash function is often used to check whether data has changed.
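As a minimal sketch of this idea, the snippet below hashes two messages that differ in a single character and counts how many output bits change. It assumes Python's standard hashlib module and SHA-256; the messages and the helper names sha256_hex and differing_bits are made up for illustration.

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()

def differing_bits(a: bytes, b: bytes) -> int:
    """Count the bit positions in which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

original = b"Pay Alice 100 dollars"
tampered = b"Pay Alice 900 dollars"  # a single character changed

d1 = hashlib.sha256(original).digest()
d2 = hashlib.sha256(tampered).digest()

print(sha256_hex(original))
print(sha256_hex(tampered))
print(len(d1) * 8, "bits per digest;", differing_bits(d1, d2), "bits differ")
```

Both digests have the same fixed length regardless of the input, and for a well-behaved hash function roughly half of the output bits are expected to differ.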

Figure: Block diagram of a cryptographic hash function, h = H(M)

How Hashing Algorithms Work in Cryptography

Now that we have a basic idea of what a hash function is in cryptography, let's break down the internal mechanics.

The first step of a hashing algorithm is to divide the input data into blocks of equal size, padding the last block if necessary. The algorithm then processes the data blocks one by one.

Although each block is processed separately, the blocks are chained together. The output for the first data block is combined with the second data block and hashed again. Similarly, the output for the second block is combined with the third block, and that combined input is hashed in turn. This continues until the last block is processed; the final output is the hash value, which depends on every block involved.

Therefore, tampering with the data in any block changes that block's output. Because that output feeds into the blocks that follow, every later value changes as well, including the final hash. This is how even the smallest change in the input data is detectable.
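The chaining described above can be sketched in a few lines. This is a toy illustration, not how any real algorithm is implemented: it assumes a 16-byte block size, a simple padding rule, and uses SHA-256 as a stand-in for the internal step that mixes the previous output with the next block. The names BLOCK_SIZE, split_into_blocks, and chained_hash are hypothetical.

```python
import hashlib

BLOCK_SIZE = 16  # bytes per block; a toy value chosen for illustration

def split_into_blocks(data: bytes, size: int = BLOCK_SIZE) -> list[bytes]:
    """Pad data (marker byte, zeros, 8-byte length) and cut it into equal blocks."""
    padded = data + b"\x80"
    padded += b"\x00" * (-(len(padded) + 8) % size)
    padded += len(data).to_bytes(8, "big")
    return [padded[i:i + size] for i in range(0, len(padded), size)]

def chained_hash(data: bytes) -> list[bytes]:
    """Return the chaining value after each block; the last one is the final hash."""
    state = b"\x00" * 32  # fixed initial value (an assumption of this sketch)
    states = []
    for block in split_into_blocks(data):
        # Stand-in mixing step: previous output combined with the current block.
        state = hashlib.sha256(state + block).digest()
        states.append(state)
    return states

message = b"A" * 16 + b"B" * 16 + b"C" * 16
original_states = chained_hash(message)
tampered_first = chained_hash(b"X" + message[1:])   # change inside block 1
tampered_last = chained_hash(message[:-1] + b"X")   # change inside block 3

for i, (a, b) in enumerate(zip(original_states, tampered_first), start=1):
    print(f"block {i}: changed = {a != b}")
```

A change in the first block alters every chaining value after it, while a change in the third block leaves the first two values untouched but still alters the final hash.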

Alice is a vendor whose business supplies stationery to Bob's office on credit. A month later, she sends Bob an invoice with an inventory list, the billing amount, and her bank account details. She hashes the document and applies her digital signature before sending it to Bob. However, Todd, a hacker, intercepts the document while it's in transit and replaces Alice's bank account details with his own.

When Bob receives the document, his computer calculates its hash value and finds that it's different from the original hash value. Bob's computer immediately raises a flag, warning him that something is fishy with the document and he shouldn't trust it.

Without the hash, Bob would easily have trusted the content of the document, because he knows Alice and the transaction details in the document were genuine. However, since the hash values did not match, Bob was aware of the change. He contacts Alice by phone and shares with her the information in the document he received. Alice confirms that her bank account is different from the one written in the document.

That's how a hash function saves Alice and Bob from financial fraud. Now imagine the same scenario playing out in your own business.

Primary Terminologies

Preimage: For a hash value h = H(x), we say that x is the preimage of h. That is, x is a data block whose hash value, using the function H, is h. Because H is a many-to-one mapping, any given hash value h will generally have multiple preimages.

Collision: A collision occurs if x ≠ y and H(x) = H(y). Because hash functions are used for data integrity, collisions are clearly undesirable.
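To make the idea of a collision concrete, the sketch below deliberately weakens a hash by keeping only the first 16 bits of SHA-256 and then searches for two different inputs with the same value. The function names tiny_hash and find_collision and the "message-N" inputs are hypothetical; with only 65,536 possible outputs, a collision must appear within 65,537 distinct inputs and usually appears much sooner.

```python
import hashlib
from itertools import count

def tiny_hash(data: bytes, n_bytes: int = 2) -> bytes:
    """A deliberately weak hash: the first n_bytes of SHA-256 (16 bits by default)."""
    return hashlib.sha256(data).digest()[:n_bytes]

def find_collision(n_bytes: int = 2) -> tuple[bytes, bytes]:
    """Hash successive inputs until two different ones share a tiny_hash value."""
    seen: dict[bytes, bytes] = {}
    for i in count():
        x = f"message-{i}".encode()
        h = tiny_hash(x, n_bytes)
        if h in seen:
            return seen[h], x  # two distinct inputs, same hash value
        seen[h] = x

x, y = find_collision()
print(x, y, tiny_hash(x).hex())
```

The same search against the full 256-bit output is not practical, which is what collision resistance means in practice.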

Popular Hash Functions

Hash functions play an important role in computing, providing versatile capabilities such as quick retrieval of data, secure protection of information (cryptography), and verification that data remains unaltered (integrity verification). Some commonly used hash functions are described below.

Message Digest 5 (MD5)

MD5 is a specific message digest algorithm, a type of cryptographic hash function. It takes an input of any length (a message) and produces a fixed-length (128-bit) hash value, which acts like a fingerprint for the message.

MD5 was widely used from the early 1990s onwards for various purposes, including:

File Check: Making sure a file downloaded from the web was not changed in transit. An MD5 hash of the original file was published and compared with the MD5 hash of the received file.
Password Storage: MD5 was sometimes used to store passwords on servers. Rather than storing the password in plain text, the server hashed it with MD5 and stored only the hash value, so that a breach would not directly expose the actual passwords. MD5 is no longer considered suitable for this purpose, because it is fast to compute and therefore easy to attack by guessing.

Secure Hash Algorithm (SHA)

SHA stands for Secure Hash Algorithm. It is a family of cryptographic hash functions published by NIST. These functions convert an input of any size into a fixed-size value, called a hash value or message digest.

There are several SHA variants, each with a different output length and security level:

SHA-1: An early member of the family that produces a 160-bit hash. Practical collision attacks have been demonstrated against it, so it is considered unsafe and has been deprecated.
SHA-2: A family of improved SHA algorithms with various output lengths:
- SHA-224 (224 bits)
- SHA-256 (256 bits - most common)
- SHA-384 (384 bits)
- SHA-512 (512 bits)
SHA-3: A hash function with a completely different internal design (based on the Keccak algorithm), standardized as an alternative to SHA-2 rather than because SHA-2 had been broken. It isn't as widely used yet.
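The output lengths listed above can be checked directly. The sketch below assumes Python's standard hashlib module, which exposes these algorithms under the names shown; the sample message and the sizes dictionary are only for illustration.

```python
import hashlib

message = b"hello"

# Constructors provided by hashlib for each variant discussed above.
algorithms = {
    "SHA-1": hashlib.sha1,
    "SHA-224": hashlib.sha224,
    "SHA-256": hashlib.sha256,
    "SHA-384": hashlib.sha384,
    "SHA-512": hashlib.sha512,
    "SHA3-256": hashlib.sha3_256,
}

# Digest length in bits for each algorithm.
sizes = {name: len(make(message).digest()) * 8 for name, make in algorithms.items()}

for name, bits in sizes.items():
    print(f"{name:9s} {bits:4d} bits")
```

The digest length depends only on the algorithm, never on the size of the input.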

SHAs have a number of applications in digital security:

Data Integrity: Checking whether data has changed. Even a small change produces a different hash value.
E-Signatures: Verifying documents. The sender signs the hash of the data with a private key; the receiver recomputes the hash and checks the signature using the sender's public key.
Password Protection: Passwords are hashed before being saved. If there's a breach, only the hashes are exposed, not the passwords themselves.
Software Check: Verifying that a downloaded file is unchanged. Often, the distributor publishes the hash so that users can check the file's authenticity.

Applications

Message Authentication

Message authentication is a mechanism or service used to verify the integrity of a message. It assures that the data received is exactly as sent, with no modification, insertion, deletion, or replay. In many cases, authentication also needs to assure that the purported identity of the sender is valid.

When a hash function is used to provide message authentication, the hash value is often referred to as a message digest.
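One common way to build message authentication on top of a hash function is a keyed hash such as HMAC, which this article does not otherwise cover. The sketch below uses Python's standard hmac module with SHA-256; the shared key, the messages, and the helper names make_tag and verify are assumptions made for illustration.

```python
import hashlib
import hmac

def make_tag(key: bytes, message: bytes) -> bytes:
    """Compute an HMAC-SHA256 tag that only holders of key can produce."""
    return hmac.new(key, message, hashlib.sha256).digest()

def verify(key: bytes, message: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(make_tag(key, message), tag)

shared_key = b"key known only to sender and receiver"  # illustrative value
message = b"Transfer 100 to account 12345"
tag = make_tag(shared_key, message)

print(verify(shared_key, message, tag))                           # genuine message
print(verify(shared_key, b"Transfer 100 to account 99999", tag))  # altered message
```

Because the tag depends on a secret key, an attacker who alters the message cannot simply recompute a matching tag, which a plain unkeyed hash would allow.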

Digital Signatures

In the case of a digital signature, the hash value of a message is encrypted with the sender's private key. Anyone who knows the sender's public key can then verify the integrity of the message associated with the signature. An attacker who wants to alter the message without being detected would need to know the sender's private key.

A hash code is used to provide a digital signature as follows:

a. The hash code is encrypted, using public-key encryption, with the sender's private key. This provides authentication, and it also provides a digital signature, because only the sender could have produced the encrypted hash code. In fact, this is the essence of the digital signature technique.

b. If both confidentiality and a digital signature are desired, the message plus the private-key-encrypted hash code can be encrypted using a symmetric secret key. This is a common technique.
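Step (a) can be illustrated with a toy version of RSA. This sketch is for intuition only: the key is built from two small, publicly known primes, there is no padding, and the hash is simply reduced modulo the key, so it must never be used for real signatures. The names P, Q, N, E, D, digest_as_int, sign, and verify are all hypothetical; real systems rely on a vetted cryptographic library.

```python
import hashlib

# Toy key built from two known Mersenne primes; far too small for real use.
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q                          # public modulus
E = 65537                          # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent (requires Python 3.8+)

def digest_as_int(message: bytes) -> int:
    """Hash the message and reduce it into the range of the toy modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N

def sign(message: bytes) -> int:
    """Encrypt the hash code with the private key (D, N)."""
    return pow(digest_as_int(message), D, N)

def verify(message: bytes, signature: int) -> bool:
    """Decrypt the signature with the public key (E, N) and compare hash codes."""
    return pow(signature, E, N) == digest_as_int(message)

invoice = b"Invoice 42: pay Alice, account 12345"
signature = sign(invoice)

print(verify(invoice, signature))                                  # genuine
print(verify(b"Invoice 42: pay Todd, account 99999", signature))   # altered
```

Only the holder of the private exponent can produce a value that decrypts to the right hash code, while anyone with the public key can check it.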

One-Way Password File

Hash functions are commonly used to create a one-way password file. In this scheme, the operating system stores the hash value of each user's password rather than the password itself, so an attacker who gains access to the password file cannot read the actual passwords from it. When a user enters a password, the system hashes the input and compares the result with the stored hash value. Many operating systems use this method to protect passwords.
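A minimal sketch of such a password file follows. It assumes a per-user random salt and the PBKDF2 function from Python's standard hashlib module, which are common choices rather than what any particular operating system does; the iteration count, the user name, and the helper names are illustrative.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # illustrative work factor, an assumption for this sketch

def hash_password(password: str, salt: bytes) -> bytes:
    """Derive a slow, salted hash of the password with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)

def new_record(password: str) -> tuple[bytes, bytes]:
    """Create the (salt, hash) pair that is stored instead of the password."""
    salt = os.urandom(16)
    return salt, hash_password(password, salt)

def check_password(password: str, record: tuple[bytes, bytes]) -> bool:
    """Hash the typed password with the stored salt and compare the results."""
    salt, stored = record
    return hmac.compare_digest(hash_password(password, salt), stored)

password_file = {"alice": new_record("correct horse battery")}

print(check_password("correct horse battery", password_file["alice"]))
print(check_password("wrong guess", password_file["alice"]))
```

The file holds only salts and hash values, so the login check never needs the original password to be stored anywhere.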

Intrusion Detection and Virus Detection

Hash functions can also be used for intrusion detection and virus detection. For each file on a system, compute H(F) and store the hash values securely (for example, on a write-protected CD-R). You can later determine whether a file has been modified by recomputing H(F) and comparing it with the stored value. An intruder would need to change F without changing H(F).
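The procedure above can be sketched as a baseline of file hashes that is compared against a later scan. The example works in a temporary directory so that it is self-contained; the file names and the helpers file_hash, snapshot, and changed_files are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path

def file_hash(path: Path) -> str:
    """Compute H(F) for one file, reading it in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

def snapshot(root: Path) -> dict[str, str]:
    """Map every file under root to its hash value."""
    return {str(p.relative_to(root)): file_hash(p)
            for p in sorted(root.rglob("*")) if p.is_file()}

def changed_files(baseline: dict[str, str], current: dict[str, str]) -> list[str]:
    """List files that were added, removed, or modified since the baseline."""
    names = baseline.keys() | current.keys()
    return sorted(n for n in names if baseline.get(n) != current.get(n))

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "app.cfg").write_text("mode=safe\n")
    (root / "tool.bin").write_bytes(b"\x00\x01\x02")
    baseline = snapshot(root)                         # store this somewhere safe
    (root / "tool.bin").write_bytes(b"\x00\x01\x03")  # simulated tampering
    modified = changed_files(baseline, snapshot(root))

print(modified)  # ['tool.bin']
```

The check is only as trustworthy as the stored baseline, which is why the hash values themselves need to be kept somewhere the intruder cannot modify.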

Pseudorandom Number Generator (PRNG)

A cryptographic hash function can be used to construct a pseudorandom function (PRF) or a pseudorandom number generator (PRNG). A common application of a hash-based PRF is the generation of symmetric keys.
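One simple way to picture a hash-based generator is to hash a secret seed together with an increasing counter and concatenate the outputs. The sketch below assumes this counter construction with SHA-256; the function name hash_stream and the seed values are hypothetical, and production key derivation should use a standardized, vetted construction instead.

```python
import hashlib

def hash_stream(seed: bytes, n_bytes: int) -> bytes:
    """Produce n_bytes of output by hashing seed || counter for counter = 0, 1, 2, ..."""
    out = bytearray()
    counter = 0
    while len(out) < n_bytes:
        out += hashlib.sha256(seed + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:n_bytes])

# Derive a 128-bit symmetric key from a shared secret (illustrative value).
key = hash_stream(b"shared secret", 16)
print(key.hex())
```

The same seed always yields the same bytes, so two parties that share the seed can derive the same key independently, while a different seed gives unrelated output.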

Security Requirements for Cryptographic Hash Functions

Requirement	Description
Variable input size	H can be applied to a block of data of any size.
Fixed output size	H produces a fixed-length output.
Efficiency	H(x) is relatively easy to compute for any given x, making both hardware and software implementations practical.
Preimage resistant (one-way property)	For any given hash value h, it is computationally infeasible to find y such that H(y) = h.
Second preimage resistant (weak collision resistant)	For any given block x, it is computationally infeasible to find y ≠ x with H(y) = H(x).
Collision resistant (strong collision resistant)	It is computationally infeasible to find any pair (x, y) with x ≠ y such that H(x) = H(y).
Pseudorandomness	Output of H meets standard tests for pseudorandomness.

The first three properties are requirements for the practical application of a hash function.
The fourth property, preimage resistance, is the one-way property: it is easy to generate a code given a message, but virtually impossible to generate a message given a code. This property is important when the authentication technique uses a secret value that is not itself sent. If the hash function is not one-way, an attacker can easily discover the secret value. For example, an attacker who can observe or intercept a transmission obtains the message M and the hash code h = H(S || M), where S is the secret value. The attacker then inverts the hash function to obtain S || M = H^-1(h). Because the attacker now has both M and S || M, it is trivial to recover S.

The fifth property, second preimage resistance, guarantees that it is infeasible to find an alternative message with the same hash value as a given message. This prevents forgery when an encrypted hash code is used. If this property did not hold, an attacker could do the following: first, observe or intercept a message plus its encrypted hash code; second, generate an unencrypted hash code from the message; third, generate an alternate message with the same hash code.

A hash function that satisfies the first five properties is referred to as a weak hash function.
If the sixth property (collision resistance) is also satisfied, it is referred to as a strong hash function. A strong hash function protects against an attack in which one party generates a message for another party to sign.

The last requirement, pseudorandomness, has not traditionally been listed as a requirement for cryptographic hash functions, but it is more or less implied. Cryptographic hash functions are commonly used for key derivation and pseudorandom number generation, and in message integrity applications the three resistance properties depend on the output of the hash function appearing random. It therefore makes sense to verify that a given hash function actually produces pseudorandom output.
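As a rough illustration of what such a check might look like, the sketch below runs a very simple frequency test: it hashes a series of inputs with SHA-256 and measures the fraction of output bits that are 1. This is only one basic check and passing it does not establish pseudorandomness; the function name ones_fraction and the choice of inputs are assumptions.

```python
import hashlib

def ones_fraction(n_inputs: int = 2000) -> float:
    """Hash the strings '0', '1', ... and return the fraction of 1 bits in the digests."""
    ones = 0
    for i in range(n_inputs):
        digest = hashlib.sha256(str(i).encode()).digest()
        ones += sum(bin(byte).count("1") for byte in digest)
    return ones / (n_inputs * 256)  # 256 output bits per digest

frac = ones_fraction()
print(f"fraction of 1 bits: {frac:.4f}")
```

For output that appears random, the fraction is expected to be very close to one half even though the inputs are highly structured.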

Drawbacks

Just like other technologies and processes, the hash functions in cryptography aren't perfect either. There are a few key issues that are worth mentioning.

In the past, researchers have found different inputs that produce the same hash value under popular algorithms such as MD5 and SHA-1. The collision resistance of these algorithms is therefore compromised.

Attackers use precomputed tables known as "rainbow tables" to crack unsalted hash values. That's why salting before hashing is so important for secure password storage.
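The effect of salting can be seen with a small experiment. The sketch below uses a plain lookup table of precomputed hashes as a simplified stand-in for a rainbow table (a real rainbow table is a more compact structure built for the same purpose); the password list and helper names are made up for illustration.

```python
import hashlib
import os

COMMON_PASSWORDS = ["123456", "password", "letmein", "qwerty"]

def unsalted(password: str) -> str:
    """Hash the password on its own."""
    return hashlib.sha256(password.encode()).hexdigest()

def salted(password: str, salt: bytes) -> str:
    """Hash the password together with a random per-user salt."""
    return hashlib.sha256(salt + password.encode()).hexdigest()

# Precomputed once by the attacker and reusable against any leaked database.
lookup = {unsalted(p): p for p in COMMON_PASSWORDS}

leaked_unsalted = unsalted("letmein")
recovered = lookup.get(leaked_unsalted)  # found immediately in the table

salt = os.urandom(16)
leaked_salted = salted("letmein", salt)
miss = lookup.get(leaked_salted)         # the precomputed table no longer applies

print(recovered, miss)
```

A salt forces the attacker to redo the guessing work for each user, although a weak password can still be guessed directly, which is why slow password-hashing functions are normally used as well.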

Some software services and hardware tools (known as "hash cracking rigs") are used by hackers, security researchers, and sometimes even government entities to crack hashed passwords.

Brute-force and dictionary attacks can recover short or predictable inputs, such as weak passwords, by hashing guesses until one matches the stored value.

Conclusion

Hashing is a very handy cryptographic tool for verification tasks: checking digital signatures, file and data integrity, passwords, and more. Cryptographic hash functions are not perfect, but they are a good checksum and authentication mechanism. Combined with salting, hashing is also one of the ways to store passwords securely, making it impractical for cybercriminals to invert the stored values into something they can use.