Modern Hash Functions

Hash functions are fundamental building blocks in modern cryptography: they underpin data integrity checks, message authentication, and digital signatures. This section covers secure hash functions, their properties, and best practices for implementation.

Properties of Cryptographic Hash Functions

A secure cryptographic hash function must satisfy these properties:

  1. Deterministic: The same input always produces the same hash
  2. Fast Computation: The hash can be computed quickly for inputs of any size
  3. Pre-image Resistance: Given a hash value, it is infeasible to find any input that produces it
  4. Second Pre-image Resistance: Given an input, it is infeasible to find a different input with the same hash
  5. Collision Resistance: It is infeasible to find any two different inputs that produce the same hash
  6. Avalanche Effect: A small change in the input (even a single bit) changes about half of the output bits
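
The determinism and avalanche properties can be observed directly. The following is a minimal sketch using SHA-256 from Python's hashlib; the helper name bit_difference and the two sample messages are illustrative choices, not part of any library.

```python
import hashlib

def bit_difference(a: bytes, b: bytes) -> int:
    """Count the bits that differ between two equal-length byte strings."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

msg1 = b"hello world"
msg2 = b"hello worle"  # 'd' (0x64) -> 'e' (0x65): exactly one input bit differs

d1 = hashlib.sha256(msg1).digest()
d2 = hashlib.sha256(msg2).digest()

# Deterministic: hashing the same input again gives the same digest.
assert hashlib.sha256(msg1).digest() == d1

# Avalanche: roughly half of the 256 output bits are expected to differ.
changed = bit_difference(d1, d2)
print(f"{changed} of {len(d1) * 8} output bits differ")
```

The exact count varies from input to input, but for a well-designed hash it should cluster around half the output length.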

Common Hash Functions

SHA-2 Family

SHA-3 (Keccak)

BLAKE3

Implementation Examples

Python: Using hashlib

import hashlib

def hash_sha256(data: bytes) -> str:
    """Generate SHA-256 hash of input data."""
    return hashlib.sha256(data).hexdigest()

def hash_sha3_256(data: bytes) -> str:
    """Generate SHA3-256 hash of input data."""
    return hashlib.sha3_256(data).hexdigest()

def hash_blake2s(data: bytes) -> str:
    """Generate BLAKE2s hash of input data."""
    return hashlib.blake2s(data).hexdigest()
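
The helpers above take the whole input as one bytes object. For large files it is usually preferable to feed the hash object incrementally. The sketch below assumes a hypothetical helper name (hash_stream_sha256) and an arbitrary chunk size of 64 KiB; an in-memory stream stands in for a real file.

```python
import hashlib
import io
from typing import BinaryIO

def hash_stream_sha256(stream: BinaryIO, chunk_size: int = 64 * 1024) -> str:
    """Hash a binary stream incrementally, reading chunk_size bytes at a time."""
    h = hashlib.sha256()
    while chunk := stream.read(chunk_size):
        h.update(chunk)  # update() may be called any number of times
    return h.hexdigest()

# Usage: io.BytesIO stands in for a file opened with open(path, "rb").
data = b"a" * 200_000
streamed = hash_stream_sha256(io.BytesIO(data))
one_shot = hashlib.sha256(data).hexdigest()
print(streamed == one_shot)
```

Feeding the data in chunks produces the same digest as hashing it in one call, so memory use stays bounded by the chunk size.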

Go: Using the Standard Library and golang.org/x/crypto

package main

import (
    "crypto/sha256"
    "encoding/hex"

    "golang.org/x/crypto/blake2b"
    "golang.org/x/crypto/sha3"
)

func SHA256Hash(data []byte) string {
    hash := sha256.Sum256(data)
    return hex.EncodeToString(hash[:])
}

func SHA3512Hash(data []byte) string {
    hash := sha3.New512()
    hash.Write(data)
    return hex.EncodeToString(hash.Sum(nil))
}

func BLAKE2b512Hash(data []byte) (string, error) {
    hash, err := blake2b.New512(nil)
    if err != nil {
        return "", err
    }
    hash.Write(data)
    return hex.EncodeToString(hash.Sum(nil)), nil
}

C: Using OpenSSL

#include <openssl/sha.h>
#include <openssl/evp.h>
#include <string.h>
#include <stdio.h>

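/* Note: the low-level SHA256_* functions are deprecated since OpenSSL 3.0;
 * the EVP interface is preferred for new code. output must hold 32 bytes. */
void sha256_hash(const unsigned char *data, size_t len, unsigned char *output) {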
    SHA256_CTX sha256;
    SHA256_Init(&sha256);
    SHA256_Update(&sha256, data, len);
    SHA256_Final(output, &sha256);
}

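/* Returns 1 on success, 0 on failure. output must hold at least 32 bytes. */
int sha3_256_hash(const unsigned char *data, size_t len, unsigned char *output) {
    unsigned int md_len = 0;
    int ok = 0;

    EVP_MD_CTX *mdctx = EVP_MD_CTX_new();
    if (mdctx == NULL) {
        return 0;
    }
    /* Each EVP_Digest* call returns 1 on success. */
    if (EVP_DigestInit_ex(mdctx, EVP_sha3_256(), NULL) == 1 &&
        EVP_DigestUpdate(mdctx, data, len) == 1 &&
        EVP_DigestFinal_ex(mdctx, output, &md_len) == 1) {
        ok = 1;
    }
    EVP_MD_CTX_free(mdctx);
    return ok;
}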

Security Considerations

Common Vulnerabilities

  1. Length Extension Attacks: Affect Merkle-Damgård constructions that output their full internal state (MD5, SHA-1, SHA-256, SHA-512)

    • Mitigation: Use HMAC, SHA-3, or a truncated variant such as SHA-512/256
  2. Collision Attacks: Finding two inputs with the same hash

    • MD5 and SHA-1 are considered broken for security-critical applications
  3. Timing Attacks: Side-channel attacks based on execution time

    • Use constant-time comparison functions
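
Two of the mitigations above, HMAC and constant-time comparison, are available in Python's standard library. The following is a minimal sketch; the function names make_tag and verify_tag and the sample message are illustrative assumptions.

```python
import hashlib
import hmac
import secrets

def make_tag(key: bytes, message: bytes) -> bytes:
    """Compute an HMAC-SHA256 tag; HMAC is not vulnerable to length extension."""
    return hmac.new(key, message, hashlib.sha256).digest()

def verify_tag(key: bytes, message: bytes, tag: bytes) -> bool:
    """Check a tag using a comparison whose timing does not depend on where bytes differ."""
    expected = make_tag(key, message)
    return hmac.compare_digest(expected, tag)

key = secrets.token_bytes(32)  # random 256-bit key
tag = make_tag(key, b"amount=100")

print(verify_tag(key, b"amount=100", tag))   # genuine message
print(verify_tag(key, b"amount=999", tag))   # tampered message
```

Comparing tags with == can leak how many leading bytes matched; hmac.compare_digest is designed to avoid that.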

Best Practices

  1. Use Modern Hash Functions: Prefer SHA-3 or BLAKE3 for new applications; SHA-2 remains secure, while MD5 and SHA-1 should be avoided
  2. Use Appropriate Output Size: At least 256 bits of output, which gives 128 bits of collision resistance
  3. Always Use Salt: For password hashing, always use a unique, random salt per password
  4. Use Keyed Hashes: HMAC or KMAC for message authentication
  5. Consider Memory Hardness: For password hashing, use Argon2 or scrypt; bcrypt is not memory-hard but is still a reasonable choice

Performance Comparison

Algorithm     Speed (MB/s)   Security Bits   Best For
SHA-256       300-500        128             General purpose, blockchain
SHA3-256      200-400        128             Security-critical applications
BLAKE3        1000-2000      128             High-performance applications
BLAKE2b-512   500-1000       256             Cryptocurrency, security
MD5           1000+          < 64 (broken)   Non-cryptographic use only
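
Throughput depends heavily on hardware and library build, so figures like those above are best treated as rough guides. The sketch below shows one way to measure it locally and to derive the generic collision security level (half the digest length). The helper names and the 8 MB buffer size are assumptions for illustration, and BLAKE3 is omitted because it is not part of Python's hashlib.

```python
import hashlib
import time

def throughput_mb_s(name: str, size_mb: int = 8) -> float:
    """Roughly measure hashing throughput in MB/s for a hashlib algorithm name."""
    data = b"\x00" * (size_mb * 1024 * 1024)
    start = time.perf_counter()
    hashlib.new(name, data).digest()
    elapsed = time.perf_counter() - start
    return size_mb / elapsed

def collision_bits(name: str) -> int:
    """Generic (birthday-bound) collision security: half the digest length in bits."""
    return hashlib.new(name).digest_size * 8 // 2

for name in ("sha256", "sha3_256", "blake2b"):
    print(f"{name:10s} {throughput_mb_s(name):8.0f} MB/s  "
          f"{collision_bits(name)} bits")
```

A single timed run like this is noisy; repeating the measurement and taking the best result gives a steadier number.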

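Password Hashing

Using Argon2

from argon2 import PasswordHasher
from argon2.exceptions import VerifyMismatchError

# One hasher instance can be reused; it carries the Argon2 parameters.
_hasher = PasswordHasher()

def hash_password(password: str) -> str:
    """Hash a password using Argon2 (a random salt is generated and stored in the result)."""
    return _hasher.hash(password)

def verify_password(hashed: str, password: str) -> bool:
    """Verify a password against a hash."""
    try:
        return _hasher.verify(hashed, password)
    except VerifyMismatchError:
        # Wrong password. Other errors (e.g. a malformed hash) propagate to the caller.
        return False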

Using bcrypt

import bcrypt

def hash_password_bcrypt(password: str) -> bytes:
    """Hash a password using bcrypt."""
    salt = bcrypt.gensalt()
    return bcrypt.hashpw(password.encode(), salt)

def verify_password_bcrypt(hashed: bytes, password: str) -> bool:
    """Verify a password against a bcrypt hash."""
    return bcrypt.checkpw(password.encode(), hashed)

Key Derivation Functions (KDFs)

PBKDF2

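import hashlib
import os
from typing import Optional

def derive_key(password: str, salt: Optional[bytes] = None, iterations: int = 600_000) -> tuple[bytes, bytes]:
    """Derive a 32-byte key using PBKDF2-HMAC-SHA256; returns (key, salt)."""
    # 600,000 iterations matches OWASP's 2023 recommendation for PBKDF2-HMAC-SHA256;
    # tune the count for your hardware.
    if salt is None:
        salt = os.urandom(16)  # 16 bytes = 128 bits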
    
    key = hashlib.pbkdf2_hmac(
        'sha256',
        password.encode('utf-8'),
        salt,
        iterations
    )
    
    return key, salt
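
A derived key is only useful if it can be reproduced later, which means storing the salt and iteration count next to it. The following sketch shows one possible store-and-verify flow; the names make_record and check_password, the dictionary layout, and the iteration constant are illustrative assumptions rather than a fixed format.

```python
import hashlib
import hmac
import os

ITERATIONS = 600_000  # assumed work factor; tune for your hardware

def make_record(password: str, iterations: int = ITERATIONS) -> dict:
    """Derive a key with a fresh random salt and return everything needed to verify later."""
    salt = os.urandom(16)
    key = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return {"salt": salt, "iterations": iterations, "key": key}

def check_password(password: str, record: dict) -> bool:
    """Re-derive the key with the stored salt and iteration count, then compare in constant time."""
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), record["salt"], record["iterations"]
    )
    return hmac.compare_digest(candidate, record["key"])

record = make_record("correct horse battery staple")
print(check_password("correct horse battery staple", record))
print(check_password("wrong guess", record))
```

Storing the iteration count with each record makes it possible to raise the work factor later without invalidating existing entries.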

Scrypt

import hashlib
import os
from typing import Optional

def derive_key_scrypt(password: str, salt: Optional[bytes] = None, N: int = 2**14, r: int = 8, p: int = 1) -> tuple[bytes, bytes]:
    """Derive a 64-byte key using scrypt; returns (key, salt)."""
    if salt is None:
        salt = os.urandom(16)
    
    key = hashlib.scrypt(
        password.encode('utf-8'),
        salt=salt,
        n=N,  # CPU/memory cost
        r=r,  # block size
        p=p   # parallelization factor
    )
    
    return key, salt

Real-world Applications

  1. Digital Signatures: Hash the message before signing
  2. Password Storage: Store only hashed passwords
  3. Data Integrity: Verify file integrity with checksums
  4. Blockchain: Transaction hashing and proof-of-work
  5. Merkle Trees: Efficient data verification in distributed systems
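
As an illustration of the Merkle tree application, the sketch below computes a root hash over a list of byte strings. It is a simplified construction, not the scheme of any particular system: the 0x00/0x01 prefixes for leaves and interior nodes follow the domain-separation idea used in RFC 6962, and promoting an unpaired node unchanged is an assumption made here for brevity.

```python
import hashlib

def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(leaves: list[bytes]) -> bytes:
    """Compute a Merkle root over the given leaves using SHA-256."""
    if not leaves:
        raise ValueError("at least one leaf is required")
    # Prefix leaves with 0x00 and interior nodes with 0x01 so that a leaf
    # can never be confused with an interior node.
    level = [_h(b"\x00" + leaf) for leaf in leaves]
    while len(level) > 1:
        nxt = []
        for i in range(0, len(level), 2):
            if i + 1 < len(level):
                nxt.append(_h(b"\x01" + level[i] + level[i + 1]))
            else:
                nxt.append(level[i])  # unpaired node is promoted unchanged
        level = nxt
    return level[0]

blocks = [b"tx1", b"tx2", b"tx3", b"tx4"]
root = merkle_root(blocks)
tampered = merkle_root([b"tx1", b"tx2", b"tx3", b"tx5"])
print(root.hex())
print(root != tampered)
```

Any change to a leaf changes the root, so two parties can compare a single 32-byte value to check that they hold the same data set.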

Further Reading