Incident response is the structured process of handling security breaches and cyber attacks. Every development team needs a plan, because it is not a matter of if an incident will happen, but when. This article presents a practical incident response playbook based on the NIST SP 800-61 framework.

Incident Response Playbook for Developers

The NIST Incident Response Framework

The NIST framework defines four phases: Preparation; Detection and Analysis; Containment, Eradication, and Recovery; and Post-Incident Activity. This playbook follows those four phases and treats triage as an explicit step within Detection and Analysis, before containment begins.

Phase 1: Preparation

Preparation is the most important phase. Without preparation, every incident becomes a chaotic scramble.

Build a response team: Identify who handles security incidents. The team should include an incident commander, a security analyst, a system owner, a communications lead, and a legal representative.

Create runbooks: Document step-by-step procedures for common incident types: phishing, malware outbreak, data breach, ransomware, denial of service, and insider threat.

Set up tooling: Ensure the team has access to the following.

  • Centralized logging (SIEM like Splunk, ELK, or Sentinel)

  • Endpoint detection and response (EDR like CrowdStrike or Defender)

  • Network monitoring and packet capture

  • Secure communication channels (Slack, Teams, or Signal)

  • Evidence collection and analysis tools (FTK Imager, Volatility, tcpdump)

Practice regularly: Run tabletop exercises every quarter. Simulate a ransomware attack, a data exposure, or a compromised credential. Practice builds muscle memory.

Phase 2: Detection and Analysis

Detection relies on monitoring and alerting. Treat every alert as a candidate incident until triage rules it out.

Alert sources:

  • SIEM correlation rules detecting anomalous patterns

  • EDR alerts for malware execution or suspicious process behavior

  • Cloud provider alerts (GuardDuty, Security Command Center, Defender)

  • Application logs showing unusual error rates or access patterns

  • User reports of suspicious activity
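As one illustration of the application-log source above, the sketch below counts HTTP 5xx responses per minute and prints the minutes that exceed a threshold. The log format (ISO-8601 timestamp in the first field, status code in the second), the function name `flag_error_spikes`, and the threshold are assumptions made for this example; real logs will need a different parser.

```shell
# Hypothetical helper: flag minutes with too many 5xx responses.
# Assumed input on stdin, one request per line: "<ISO-8601 timestamp> <status>"
#   e.g. "2026-04-15T09:23:41Z 500"
#   $1 = maximum number of 5xx responses tolerated per minute
flag_error_spikes() {
  awk -v limit="$1" '
    $2 ~ /^5[0-9][0-9]$/ {
      minute = substr($1, 1, 16)   # keep YYYY-MM-DDTHH:MM
      errors[minute]++
    }
    END {
      for (m in errors)
        if (errors[m] > limit) print m, errors[m]
    }
  ' | sort
}
```

A call such as `flag_error_spikes 50 < app.log` prints one line per noisy minute, which could feed an alert rule or a quick manual check during triage.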

Triage questions:

  • What happened? What systems are affected?

  • When did it start? Is it ongoing?

  • What is the impact? Data loss? Service disruption?

  • Is this a true positive or a false alarm?

  • What severity level applies?

Severity classification:

  • SEV-1: Critical. Active data exfiltration, ransomware, or service-wide compromise. Immediate response required.

  • SEV-2: High. Confirmed intrusion but contained. Credential compromise affecting multiple users.

  • SEV-3: Medium. Potential compromise under investigation. Phishing campaign targeting employees.

  • SEV-4: Low. Minor policy violations. Automated scans with no evidence of exploitation.
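The levels above can be encoded so that on-call tooling assigns them consistently. This is a minimal sketch, assuming triage can be reduced to two answers (intrusion status and whether the attacker is still active); the function name and argument values are hypothetical, and a real team would likely weigh more factors.

```shell
# Hypothetical triage helper mapping two answers to the severity levels above.
#   $1 = intrusion status: confirmed | suspected | none
#   $2 = attacker still active (exfiltration, ransomware, spreading): yes | no
classify_severity() {
  case "$1:$2" in
    confirmed:yes) echo "SEV-1" ;;   # active, confirmed compromise
    confirmed:no)  echo "SEV-2" ;;   # confirmed but contained
    suspected:*)   echo "SEV-3" ;;   # still under investigation
    none:*)        echo "SEV-4" ;;   # no evidence of exploitation
    *)
      echo "usage: classify_severity confirmed|suspected|none yes|no" >&2
      return 2
      ;;
  esac
}
```

For example, `classify_severity confirmed no` prints `SEV-2`, matching a confirmed but contained intrusion.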

Phase 3: Containment, Eradication, and Recovery

Containment stops the attack from spreading. Eradication removes the attacker's presence. Recovery returns systems to normal operation.

Short-term containment:

  • Disconnect affected systems from the network.

  • Disable compromised user accounts.

  • Block attacker IP addresses at the firewall.

  • Rotate credentials for affected services.

Example: Block an IP at the firewall

# Insert at the top of the chain so an earlier ACCEPT rule cannot match first
iptables -I INPUT 1 -s 203.0.113.50 -j DROP

Example: Deactivate a compromised AWS IAM access key

aws iam update-access-key \
  --access-key-id AKIAIOSFODNN7EXAMPLE \
  --status Inactive \
  --user-name compromised-user
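A compromised user may own more than one access key, so the single-key command above often needs to be repeated. The sketch below loops over a list of key IDs; the function name `deactivate_keys` and the `DRY_RUN` switch are assumptions for this example, and by default it only prints the commands it would run.

```shell
# Hypothetical helper: deactivate every access key passed in for one IAM user.
#   $1 = IAM user name, remaining arguments = access key IDs
# DRY_RUN defaults to 1, which prints the commands instead of running them.
deactivate_keys() {
  iam_user="$1"
  shift
  for key_id in "$@"; do
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "aws iam update-access-key --user-name $iam_user --access-key-id $key_id --status Inactive"
    else
      aws iam update-access-key --user-name "$iam_user" \
        --access-key-id "$key_id" --status Inactive
    fi
  done
}
```

The key IDs for a user can be listed with `aws iam list-access-keys --user-name compromised-user`. Run the helper once in dry-run mode to review the commands, then again with `DRY_RUN=0`.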

Long-term containment:

  • Apply security patches.

  • Implement additional monitoring for affected systems.

  • Deploy WAF rules to block attack patterns.

Eradication:

  • Remove malware using EDR tools.

  • Rebuild compromised servers from known-good images.

  • Revoke all session tokens and API keys.

  • Reset root passwords and privileged credentials.

Recovery:

  • Restore systems from clean backups.

  • Verify system integrity before returning to production.

  • Gradually reintroduce traffic while monitoring for recurrence.

  • Communicate recovery status to stakeholders.
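One way to approach the integrity check in the recovery steps above is to compare restored files against a hash manifest taken from a known-good state. This is a sketch under assumptions: it presumes such a manifest exists in `sha256sum` format, that the `sha256sum` utility is installed, and the function name `verify_integrity` is hypothetical.

```shell
# Hypothetical helper: compare files against a known-good SHA-256 manifest.
#   $1 = manifest in sha256sum format ("<hash>  <path>")
# Run it from the directory that the manifest paths are relative to.
verify_integrity() {
  if [ ! -r "$1" ]; then
    echo "manifest not readable: $1" >&2
    return 2
  fi
  # Keep only the entries that did not verify; "OK" lines are dropped.
  failures="$(sha256sum -c "$1" 2>/dev/null | grep -v ': OK$')"
  if [ -z "$failures" ]; then
    echo "integrity OK"
  else
    printf '%s\n' "$failures" >&2
    return 1
  fi
}
```

A non-zero exit status means at least one file differs from the manifest or is missing, which is a reason to hold the system back from production.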

Phase 4: Post-Incident Activity

The post-mortem is where the team learns from the incident and improves processes.

Post-mortem meeting: Within one week of containment, gather everyone involved. A blameless culture is essential: the goal is to improve systems, not to assign blame.

Post-mortem document:

  • Timeline of the incident

  • Root cause analysis

  • What went well and what went wrong

  • Detection gaps and containment delays

  • Remediation items with owners and deadlines

  • Changes to runbooks, tooling, or architecture

Post-Mortem: Service Credential Leak

Date: 2026-04-15

Severity: SEV-2

Timeline

- 2026-04-15 09:23 UTC - GuardDuty alert for anomalous API calls
- 09:25 - Triage begins
- 09:45 - Compromised key identified and revoked
- 10:30 - Containment confirmed
- 14:00 - All affected resources rotated

Root Cause

GitHub Actions workflow accidentally logged AWS_SECRET_ACCESS_KEY to debug output. Logs were publicly accessible.

Action Items

- [ ] Remove debug logging from CI/CD workflows (owner: DevOps, due: 04-22)
- [ ] Enable secret scanning on GitHub repository (owner: Security, due: 04-18)
- [ ] Add alert for API keys used outside expected regions (owner: Platform, due: 04-30)
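While action items like these are pending, a quick pattern search over saved CI logs can show whether other credentials leaked the same way. The sketch below looks for strings shaped like long-term AWS access key IDs (`AKIA` followed by 16 uppercase letters or digits). It is a pattern match only, so it cannot tell live keys from documentation examples, and it does not detect secret access keys, which have no fixed prefix. The function name is hypothetical, and this complements rather than replaces a dedicated secret scanner.

```shell
# Hypothetical check: print lines (with line numbers) that contain something
# shaped like a long-term AWS access key ID.
# Reads the files given as arguments, or stdin when none are given.
find_aws_key_ids() {
  grep -nE 'AKIA[0-9A-Z]{16}' "$@"
}
```

For example, `find_aws_key_ids build.log` exits with status 1 when nothing matches, which makes it easy to use as a gate in a script.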

Forensic Evidence Collection

Proper evidence collection preserves data for legal action and root cause analysis.

  • Capture memory dumps with a tool like LiME before powering off systems, then analyze them offline with Volatility.

  • Collect disk images using dd or FTK Imager rather than copying files live.

  • Record command output with timestamps using the script command.

  • Maintain chain of custody documentation for all evidence.

Capture memory dump with LiME

insmod lime.ko "path=/evidence/memory.dump format=lime"

Capture disk image

dd if=/dev/sda of=/evidence/disk.img bs=4M conv=noerror,sync
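Chain of custody is easier to defend when each evidence file is hashed at collection time and the hash is logged. The following sketch appends one tab-separated record per file; the log layout, the function name `record_evidence`, and the reliance on `sha256sum` are assumptions for this example, not a forensic standard.

```shell
# Hypothetical helper: hash an evidence file and append a custody record.
#   $1 = evidence file, $2 = custody log, $3 = name of the person collecting
# Record layout (tab-separated): UTC timestamp, collector, path, SHA-256
record_evidence() {
  if [ ! -r "$1" ]; then
    echo "evidence file not readable: $1" >&2
    return 1
  fi
  digest="$(sha256sum "$1" | awk '{print $1}')"
  printf '%s\t%s\t%s\t%s\n' \
    "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$3" "$1" "$digest" >> "$2"
}
```

Calling `record_evidence /evidence/disk.img /evidence/custody.tsv "J. Analyst"` right after imaging gives a reference hash that later copies of the image can be checked against.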

Conclusion

A well-practiced incident response process turns a potential disaster into a manageable event. Preparation separates professional teams from those that panic. Detection without response is just noise. And every incident, no matter how small, is an opportunity to improve.