Introduction

Input Validation Deep Dive

Input validation is the first line of defense against injection attacks. Every piece of data entering an application — form fields, HTTP headers, URL parameters, file uploads, API payloads — must be validated before processing. The principle is simple: never trust user input.
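As a rough sketch of what validating every entry point can look like, the example below routes each incoming field through a per-field check and rejects any field it does not recognise. The field names, the `VALIDATORS` table, and `validate_payload` are hypothetical and chosen only for illustration. Checks for missing required fields are left out to keep the sketch short.

```python
import re
from typing import Callable, Dict

# Hypothetical per-field checks; each returns True when the value is acceptable.
VALIDATORS: Dict[str, Callable[[str], bool]] = {
    "username": lambda v: re.fullmatch(r"[a-zA-Z0-9_-]{3,32}", v) is not None,
    "age": lambda v: v.isascii() and v.isdigit() and 0 < int(v) < 150,
    "country": lambda v: v in {"US", "CA", "GB", "DE", "FR", "JP"},
}


def validate_payload(payload: Dict[str, str]) -> Dict[str, str]:
    """Return the payload unchanged if every field passes; raise ValueError otherwise."""
    for field, value in payload.items():
        check = VALIDATORS.get(field)
        if check is None:
            # Fields without a registered validator are rejected, not ignored.
            raise ValueError(f"Unexpected field: {field!r}")
        if not isinstance(value, str) or not check(value):
            raise ValueError(f"Invalid value for field: {field!r}")
    return payload
```

Rejecting unknown fields outright, rather than silently dropping them, keeps the set of accepted inputs explicit.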

Whitelist vs Blacklist

Whitelist (Allowlist) Validation

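Whitelist validation defines what is allowed and rejects everything else. Because unknown input is rejected by default, it does not depend on anticipating every attack pattern, which makes it far more robust than blacklisting.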

    import re

    # Whitelist: only allow specific characters
    def validate_username_whitelist(username):
        """Allow only alphanumeric, underscore, and hyphen (3 to 32 characters)."""
        pattern = r'[a-zA-Z0-9_-]{3,32}'
        # fullmatch requires the whole string to match; re.match with a
        # trailing $ would also accept a value ending in a newline.
        if not re.fullmatch(pattern, username):
            raise ValueError(
                f"Username {username!r} is invalid. "
                "Use 3 to 32 letters, numbers, underscores, or hyphens."
            )
        return username

    # Whitelist for country codes
    ALLOWED_COUNTRIES = {'US', 'CA', 'GB', 'DE', 'FR', 'JP'}

    def validate_country_code(code):
        """Return the upper-cased code if it is in the allowed set."""
        normalized = code.upper()
        if normalized not in ALLOWED_COUNTRIES:
            raise ValueError(f"Country '{code}' is not in the allowed list")
        return normalized
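A subtlety worth checking when writing anchored allowlist patterns in Python: `$` matches at the end of the string and also just before a trailing newline, so `re.match` with a `^...$` pattern accepts a value such as `"admin\n"`. The sketch below compares that behaviour with `re.fullmatch`, which requires the entire string to match. The names `ANCHORED`, `UNANCHORED`, `is_valid_username_match`, and `is_valid_username_fullmatch` are illustrative and not part of the original code.

```python
import re

# Illustrative names; the character class and length bounds mirror the
# username rule used in this section.
ANCHORED = re.compile(r'^[a-zA-Z0-9_-]{3,32}$')
UNANCHORED = re.compile(r'[a-zA-Z0-9_-]{3,32}')


def is_valid_username_match(username: str) -> bool:
    """Check with re.match and ^...$ anchors ($ tolerates a trailing newline)."""
    return ANCHORED.match(username) is not None


def is_valid_username_fullmatch(username: str) -> bool:
    """Check with re.fullmatch, which must consume the entire string."""
    return UNANCHORED.fullmatch(username) is not None


if __name__ == "__main__":
    # Print both results side by side for a few candidate values.
    for candidate in ["admin", "admin\n", "ad", "bob; DROP TABLE users"]:
        print(repr(candidate),
              is_valid_username_match(candidate),
              is_valid_username_fullmatch(candidate))
```

Only the `"admin\n"` case differs between the two checks. Using `\Z` in place of `$` is another way to close the gap while keeping `re.match`.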

Blacklist (Blocklist) Validation

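Blacklist validation attempts to block known malicious patterns. It is inherently fragile because the list can only cover patterns that are already known, and attackers keep finding variants that slip past it, such as alternative encodings or changes in letter case.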

    # WEAK: Blacklist approach (easily bypassed)
    def validate_input_blacklist(input_string):
        # Easily bypassed: attacker uses alternative syntax
        blocklist = ['