Roadmap

Vulnerability Researcher

The specialist who discovers new vulnerabilities in software, hardware, and protocols through source code analysis, binary reverse engineering, fuzzing, and manual testing. Produces CVEs, bug bounty findings, and security improvements in products before attackers can find and weaponize them.

OPTIMISTIC 4–5 years | REALISTIC 5–6 years

Stage 00

Programming and Computer Science Fundamentals

Vulnerability research requires genuine engineering depth. You are finding flaws in software systems, which requires understanding those systems at the implementation level.

C and C++ — Primary Research Languages

C fundamentals — all data types, pointers, arrays, structs, memory allocation, file I/O
Memory model — stack vs heap; static allocation; memory layout of a process
Pointers — pointer arithmetic; pointer to pointer; function pointers; void pointers
Memory management: dynamic heap allocation via malloc/calloc/realloc/free; stack allocation with scope-tied lifetime; memory errors that create vulnerabilities including buffer overflow, heap overflow, use-after-free, double-free, integer overflow, off-by-one, format string, and null pointer dereference
C++ additions relevant to security: virtual functions and vtable for control flow hijack; new/delete heap allocation with same UAF/double-free issues; smart pointers (shared_ptr, unique_ptr) safer but not immune; STL containers bounds checking (operator[] vs .at())
Compilation and linking: preprocessor, compiler, assembler, linker stages; compiler flags relevant to security (-fstack-protector, -D_FORTIFY_SOURCE, -pie, -fPIE); debug vs release optimization differences; optimization can eliminate security checks
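The integer overflow class listed above is easiest to see with concrete numbers. The sketch below emulates 32-bit unsigned C arithmetic in Python to show how a count * size expression can wrap before it reaches malloc; the function names and the example count are illustrative assumptions, not taken from any real codebase.

```python
# Emulates C's 32-bit unsigned arithmetic to show how count * size can wrap
# before the result is passed to malloc(). All names are illustrative.

UINT32_MASK = 0xFFFFFFFF


def alloc_size_u32(count: int, elem_size: int) -> int:
    """Return count * elem_size as a 32-bit unsigned C expression would."""
    return (count * elem_size) & UINT32_MASK


def is_overflowing(count: int, elem_size: int) -> bool:
    """True when the mathematically correct size does not fit in 32 bits."""
    return count * elem_size > UINT32_MASK


# Attacker-chosen element count with 4-byte elements (hypothetical values).
count, elem_size = 0x40000001, 4
print(hex(alloc_size_u32(count, elem_size)))  # wrapped size the allocator sees
print(is_overflowing(count, elem_size))
```

In this example the allocator receives a 4-byte request while the copy loop still believes it has room for 0x40000001 elements, which is the pattern that turns an integer overflow into a heap overflow.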

Python — Automation and Tooling

All fundamentals — data structures, OOP, modules, file I/O
ctypes — calling C functions from Python; interfacing with native code
struct module — packing and unpacking binary data
socket — network programming for protocol fuzzing
subprocess — executing programs and capturing output
Fuzzing automation — writing Python harnesses around native targets
Exploit helper scripts — payload construction, IPC, exploitation automation
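As a small illustration of the struct module item above, the following sketch packs and unpacks a binary header. The header layout (4-byte magic, 16-bit version, 32-bit length, little-endian) is a made-up format chosen for the example.

```python
import struct

# Hypothetical file header: 4-byte magic, u16 version, u32 payload length.
# "<" selects little-endian with no alignment padding.
HEADER = struct.Struct("<4sHI")


def build_header(magic: bytes, version: int, length: int) -> bytes:
    """Serialize the three header fields into 10 bytes."""
    return HEADER.pack(magic, version, length)


def parse_header(blob: bytes):
    """Return (magic, version, length) from the start of blob."""
    return HEADER.unpack_from(blob, 0)


# A deliberately oversized length field, the kind of value a fuzzer would try.
raw = build_header(b"DEMO", 1, 0xFFFFFFFF)
print(raw.hex())
print(parse_header(raw))
```

Being able to build headers with hostile field values by hand is often the first step before handing the format to a fuzzer.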

Assembly Language

Full x86/x64 assembly knowledge required (see Malware Analyst Stage 0)
ARM assembly — growing importance for embedded, mobile, and IoT research; ARM registers R0–R12 general purpose, R13 (SP), R14 (LR), R15 (PC); ARM calling convention R0–R3 for first four arguments; Thumb/Thumb-2 mixed 16/32-bit instruction set; AArch64 (ARM64) with X0–X7 arguments and X0 return

Operating System Internals

Linux kernel architecture: kernel space vs user space; system calls interface; virtual memory pages and page tables; process management fork/exec/wait/signals; file descriptors; kernel modules as attack surface; /proc, /sys, /dev virtual filesystems
Windows kernel architecture: Windows NT Executive, HAL, Win32 subsystem; kernel objects (handles, named objects, security descriptors); Windows API (Win32, Native ntdll, kernel-mode); driver model (KMDF, UMDF); IRP processing; Windows memory (PFN database, working set); syscall mechanism (sysenter/syscall, SSDT); kernel exploitation targets (win32k.sys, driver bugs, kernel objects)

Resources

"The C Programming Language" (K&R, book)
"Computer Systems: A Programmer's Perspective" (Bryant & O'Hallaron, book)
Linux kernel documentation (free)
"Windows Internals" by Russinovich (book)

Stage 01

Binary Exploitation Fundamentals

Exploit development requires understanding both what vulnerabilities exist and how to turn them into controlled execution. This is the bridge from finding a bug to proving it is exploitable.

Memory Safety — Deep Understanding

Stack buffer overflow: stack layout (local variables, saved EBP/RBP, return address, arguments); overwriting return address to redirect execution to attacker code; NOP sled padding; classic vulnerable functions gets/strcpy/sprintf
Heap exploitation: heap metadata (chunk headers, doubly-linked free lists); ptmalloc/glibc heap bins (fast, small, large, unsorted, tcache); heap overflow; use-after-free; double-free with tcache poisoning; heap spraying
Format string exploitation: %x to read stack values; %s for arbitrary memory read; %n for arbitrary write; GOT reads/overwrites
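The stack layout described above can be made concrete with a payload builder. This is a minimal sketch for x86-64 that assumes the vulnerable buffer sits directly below the saved RBP with no canary and no other locals in between; the buffer size and target address are hypothetical.

```python
import struct


def build_overflow_payload(buf_size: int, new_ret: int,
                           saved_rbp: int = 0x4242424242424242) -> bytes:
    """Fill the buffer, overwrite saved RBP, then overwrite the return address."""
    padding = b"A" * buf_size                 # fills the local buffer
    fake_rbp = struct.pack("<Q", saved_rbp)   # lands on the saved RBP slot
    ret_addr = struct.pack("<Q", new_ret)     # lands on the return address
    return padding + fake_rbp + ret_addr


# Hypothetical 64-byte buffer and target address 0x401136.
payload = build_overflow_payload(64, 0x401136)
print(len(payload), payload[-8:].hex())
```

On a real target the distance from the buffer to the return address has to be measured in a debugger, since compilers add padding and reorder locals.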

Modern Exploitation Mitigations

Stack Canary (Stack Protector): random value between locals and return address; detected on function return; bypass via leak, canary-avoidance write, or format string read
ASLR (Address Space Layout Randomization): randomizes base addresses per execution; bypass via information leak, partial overwrite, brute force on 32-bit, or format string; Linux /proc/sys/kernel/randomize_va_space settings 0/1/2
NX / DEP (No-eXecute / Data Execution Prevention): stack and heap are non-executable; bypass via ROP or ret2libc
PIE (Position Independent Executable): randomizes executable base address; combined with ASLR makes all addresses random; bypass via info leak or partial overwrite
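RELRO (Relocation Read-Only): Full RELRO resolves all symbols at load time and makes the entire GOT read-only; Partial RELRO makes .got read-only but leaves .got.plt (lazily bound function pointers) writable; bypass Full RELRO via alternative code pointers (vtable, function pointers)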
CFI (Control Flow Integrity): restricts indirect jumps/calls to valid targets; shadow stack (CET), forward-edge CFI, LLVM CFI implementations
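SafeSEH / SEHOP (Structured Exception Handling protections): SafeSEH checks exception handlers against a table of registered handlers; SEHOP validates the SEH chain; both aim to prevent SEH overwrite exploitation on Windows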
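The partial overwrite bypass mentioned for ASLR and PIE works because randomization is page-granular: the low 12 bits of an address are not randomized. The sketch below simulates this with random page-aligned bases; the base generator and the function offset are assumptions made for illustration.

```python
import random

PAGE_SIZE = 0x1000


def random_base(rng: random.Random) -> int:
    """Pick a hypothetical page-aligned load base below 2**47."""
    return rng.randrange(0, 1 << 35) * PAGE_SIZE


def runtime_address(base: int, offset: int) -> int:
    """Address of a symbol at a fixed offset inside a randomized image."""
    return base + offset


rng = random.Random(1234)
offset = 0x11A9  # hypothetical offset of a function inside the binary
addrs = [runtime_address(random_base(rng), offset) for _ in range(5)]

# The page offset (low 12 bits) is identical across every simulated run.
low12 = {a & (PAGE_SIZE - 1) for a in addrs}
print([hex(a) for a in addrs])
print({hex(v) for v in low12})
```

Overwriting only the least significant byte of a saved pointer therefore redirects it within the same page without needing a leak; overwriting two bytes leaves four random bits to guess.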

Return-Oriented Programming (ROP)

Concept: chain together existing code fragments ("gadgets") ending in ret instruction
Gadget finding: ROPgadget, ropper, pwndbg's rop command
Common gadget types: pop rdi; ret to load first syscall arg; pop rax; ret for syscall number; syscall; ret to execute; mov [rdi], rsi; ret to write memory; add rsp, N; ret for stack pivot
ret2libc — calling libc functions via ROP (system, execve, etc.)
SROP — sigreturn-oriented programming; abusing the rt_sigreturn syscall to load attacker-controlled register state
Stack pivoting — moving RSP to attacker-controlled memory (e.g., heap buffer)
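The gadget types listed above combine into a chain by laying out gadget addresses and their operands as consecutive 8-byte words on the stack. The sketch below builds an execve("/bin/sh", 0, 0) chain; every address in it is a placeholder assumption, since real values come from a gadget finder run against the actual binary.

```python
import struct

# Hypothetical gadget addresses; real ones come from ROPgadget or ropper.
GADGETS = {
    "pop rdi; ret": 0x401203,
    "pop rsi; ret": 0x401205,
    "pop rdx; ret": 0x401207,
    "pop rax; ret": 0x401209,
    "syscall": 0x40120B,
}
BINSH_ADDR = 0x404050  # hypothetical address of a "/bin/sh\0" string
SYS_EXECVE = 59        # Linux x86-64 syscall number for execve


def p64(value: int) -> bytes:
    """Pack an integer as a little-endian 64-bit word."""
    return struct.pack("<Q", value)


def build_execve_chain() -> bytes:
    """Lay out gadgets and operands in the order the CPU will consume them."""
    words = [
        GADGETS["pop rdi; ret"], BINSH_ADDR,  # rdi = pointer to "/bin/sh"
        GADGETS["pop rsi; ret"], 0,           # rsi = argv = NULL
        GADGETS["pop rdx; ret"], 0,           # rdx = envp = NULL
        GADGETS["pop rax; ret"], SYS_EXECVE,  # rax = syscall number
        GADGETS["syscall"],                   # enter the kernel
    ]
    return b"".join(p64(w) for w in words)


chain = build_execve_chain()
print(len(chain), chain.hex())
```

The chain is placed where the overwritten return address begins, so the first ret pops the first gadget address and each gadget's ret advances to the next.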

Tools

pwntools (Python) — CTF exploitation framework: process()/remote() for connecting; p32()/p64()/u64() for packing/unpacking; flat() for ROP chains; ELF() for binary analysis; ROP() for automated chains; gdb.attach() for debugging
pwndbg / GEF / PEDA — GDB enhancement plugins: checksec for protections; vmmap for memory layout; heap visualization; rop gadget search; telescope for pointer inspection
GDB — foundation debugger; commands: break, run, continue, next, step, info registers, x/Nx format
ltrace / strace — tracing library and system calls respectively
checksec — displaying binary security mitigations

Resources

"Hacking: The Art of Exploitation" by Jon Erickson (book, essential)
pwn.college (free, best binary exploitation learning)
exploit.education (free VMs)
LiveOverflow YouTube (free)
CTFtime.org

Stage 02

Security Fundamentals

Vulnerability research is security-domain-specific. Understanding the threat landscape, CVE ecosystem, and disclosure processes is required.

CVE Ecosystem

CVE (Common Vulnerabilities and Exposures) — unique identifier per vulnerability
NVD (National Vulnerability Database) — CVSS scores, descriptions, references
CWE (Common Weakness Enumeration) — vulnerability type taxonomy
CVSS (Common Vulnerability Scoring System) — standardized severity scoring
CVE lifecycle: discovery → report to vendor → patch development → CVE assignment → coordinated disclosure → public release

Responsible Disclosure

Principles: notify vendor before public disclosure; give reasonable time to patch
Standard timelines: Google Project Zero uses 90 days, and many independent researchers follow the same window; some policies use 45 days (CERT/CC) or 120 days (Zero Day Initiative)
CERT Coordination Center — intermediary for complex disclosures
Bug bounty programs — managed disclosure with defined rewards
Full disclosure — immediate public release; controversial; used when vendors are unresponsive
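CVE Numbering Authorities (CNAs) — organizations authorized to assign CVE IDs within their own scope, including many vendors; MITRE operates the CVE Program and acts as a CNA of last resort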
Writing good vulnerability reports: clear title and summary; affected versions; vulnerability type (CWE); step-by-step reproduction; impact assessment; proof-of-concept code or crash file; suggested fix

Attack Surface Analysis

Threat modeling for research targeting — where are the most interesting attack surfaces?
Input attack surface: parsing code, file format handlers, network protocol implementations
Interface attack surface: APIs, IPC mechanisms, kernel interfaces
Trust boundary attack surface: privilege transitions, sandboxes, hypervisors
Target selection strategy: high-impact targets (browsers, OS kernels, hypervisors) harder but CVEs matter more; under-researched targets more likely to find low-hanging fruit; targets you deeply understand where domain expertise accelerates research

Resources

MITRE CVE (free)
NVD (free)
Google Project Zero blog (free, excellent research examples)
Pwn2Own competition archives (free)

Stage 03

Vulnerability Discovery Techniques

Source Code Auditing

Manual source code review for vulnerability classes: identify external input points; trace data flows to security-sensitive operations; focus on parsing code, integer arithmetic, memory allocation, format strings, command construction; look for missing bounds checks, unchecked return values, type confusion, integer overflows
Semgrep and CodeQL for automated pattern matching on large codebases
Differential analysis / patch diffing: comparing patched vs unpatched versions; tools BinDiff (free, open source), diaphora (free IDA plugin), radiff2 (radare2); process: download old and new versions, diff in disassembler, understand the patch, reverse-engineer the vulnerability; CVE North Stars methodology
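A crude first pass over a codebase can be scripted before reaching for Semgrep or CodeQL. The sketch below flags calls to a small, assumed list of risky C functions with a regular expression; it has no understanding of data flow, so every hit is only a starting point for manual review. The sample source is invented for the example.

```python
import re

# Assumed shortlist of risky calls; extend it for the codebase under review.
RISKY_CALLS = ("gets", "strcpy", "strcat", "sprintf", "system", "popen")
CALL_RE = re.compile(r"\b(" + "|".join(RISKY_CALLS) + r")\s*\(")


def find_risky_calls(source: str):
    """Return (line_number, function_name) for each risky call found."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in CALL_RE.finditer(line):
            hits.append((lineno, match.group(1)))
    return hits


SAMPLE = '''\
void handle(char *input) {
    char name[32];
    strcpy(name, input);
    snprintf(name, sizeof name, "%s", input);
    system(input);
}
'''

for lineno, func in find_risky_calls(SAMPLE):
    print(f"line {lineno}: {func}")
```

The bounded snprintf call is not flagged, while the unbounded strcpy and the command execution through system are. Comments and string literals can produce false positives, which is one reason dedicated tools parse the code instead.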

Binary Analysis Without Source

Static binary analysis (see Malware Analyst path Stage 3)
Binary-level sink identification: calls to dangerous functions (strcat, strcpy, sprintf, system, popen, memcpy without bounds check); integer operations feeding allocation sizes; pointer arithmetic without validation
Recovering data structures: identifying struct layouts from field access patterns; applying RTTI for C++ classes; symbol servers for partially stripped binaries

Fuzzing — Core Research Technique

What fuzzing is: automated input generation to find crashes in programs
Types: black-box (random input, minimal feedback); coverage-guided (AFL/libFuzzer using code coverage); grammar-based (valid-but-adversarial inputs); snapshot fuzzing (memory snapshot reset); differential fuzzing (two implementations compared)
AFL++ compiling targets with instrumentation: AFL_USE_ASAN=1 CC=afl-clang-fast ./configure && make for ASAN + AFL; afl-clang-fast / afl-clang-lto compiler wrappers; LTO mode (afl-clang-lto) for faster fuzzing with collision-free coverage
Running AFL++: afl-fuzz -i input_seeds/ -o output/ -t 1000 -- ./target @@; @@ replaced with path to fuzz input file; input seed quality matters; -m none disables memory limit
Interpreting output: corpus count shows unique paths found; crashes are examined with GDB/ASAN; hangs are timeout-triggering inputs that may indicate DoS
AFL++ modes: QEMU mode for closed-source binaries; unicorn mode for snippet fuzzing; persistent mode calls target function in loop without re-execution
Parallel fuzzing — multiple AFL++ instances on multi-core systems
In-process fuzzing with libFuzzer — target function called in a loop by the fuzzer; no per-execution process overhead
Writing a fuzz target: implement the LLVMFuzzerTestOneInput entry function taking const uint8_t *data and size_t size, call the target function with the fuzz data, and return 0; libFuzzer also accepts -1 to keep an input out of the corpus, and other return values are reserved
Compiling: clang -fsanitize=fuzzer,address target.c -o fuzz_target
Running: ./fuzz_target corpus/ -max_total_time=3600
Integration with AFL++ corpus sharing
Harness writing is the critical skill: writing the code that connects the fuzzer to the target
Harness components: initialize the target (libraries, state); feed fuzz input to target API; handle errors gracefully (don't crash the fuzzer on expected errors); clean up state between iterations (for persistent mode)
Challenges: stateful protocols maintaining state across fuzz calls; non-determinism from random seeds, timers, external dependencies; parsing validation skipping to reach deeper code
Network protocol fuzzing: Boofuzz (Python) framework with stateful session support; custom socket-level fuzzing scripts; TLS targets: handling the TLS layer wrapped around the protocol
AddressSanitizer (ASAN) — detects memory errors: heap/stack buffer overflow, UAF, double-free, use-after-return; compile -fsanitize=address; overhead ~2x memory, ~2x CPU
UndefinedBehaviorSanitizer (UBSan) — detects undefined behavior: integer overflow, null dereference, alignment violations; compile -fsanitize=undefined
MemorySanitizer (MSan) — detects use of uninitialized memory
ThreadSanitizer (TSan) — detects data races between threads
Combining sanitizers: ASAN + UBSan is standard fuzzing configuration
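The fuzzing loop and the harness rule about handling expected errors gracefully can be shown with a toy example. The sketch below is a minimal black-box mutational fuzzer run against an invented in-process parser with a planted bug; it has no coverage feedback and is not how AFL++ or libFuzzer work internally. The parser, its format, and the list of interesting byte values are all assumptions.

```python
INTERESTING = (0x00, 0x7F, 0x80, 0xFF)  # boundary byte values to try


class ParseError(Exception):
    """Expected rejection of malformed input; not a bug."""


def toy_parse(data: bytes) -> bytes:
    """Hypothetical target: magic 'VR', one length byte, then payload."""
    if len(data) < 3 or data[:2] != b"VR":
        raise ParseError("bad header")
    length = data[2]
    buf = bytearray(8)
    for i in range(length):       # BUG: length is never checked against
        buf[i] = data[3 + i]      # the 8-byte buffer or the input size
    return bytes(buf[:length])


def mutations(seed: bytes):
    """Yield every single-byte replacement of seed with an interesting value."""
    for pos in range(len(seed)):
        for value in INTERESTING:
            if seed[pos] != value:
                yield seed[:pos] + bytes([value]) + seed[pos + 1:]


def fuzz(seed: bytes):
    """Run all mutations; return (input, exception name) for each crash."""
    crashes = []
    for candidate in mutations(seed):
        try:
            toy_parse(candidate)
        except ParseError:
            continue                  # graceful, expected error
        except Exception as exc:      # anything else counts as a finding
            crashes.append((candidate, type(exc).__name__))
    return crashes


for crash_input, exc_name in fuzz(b"VR\x03abc"):
    print(crash_input, exc_name)
```

Python raises an exception where C would silently corrupt memory, which is the role sanitizers play for native targets: they turn silent corruption into a visible crash.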

Manual Vulnerability Discovery

Code paths rarely reached by fuzzers: complex authentication flows; multi-step transaction logic; error handling paths; edge cases in parser state machines
Hypothesis-driven research: "This parser allocates a buffer based on a size field; if the size field can overflow..."; "This function assumes its caller validates the input; what if called directly?"; "This code reuses a buffer; is it fully cleared between uses?"
CVE reproduction and variant research: read a disclosed CVE; reproduce the vulnerability; understand the root cause pattern; look for the same pattern elsewhere in the codebase or similar products

Resources

AFL++ documentation (free)
"The Fuzzing Book" (free online)
pwn.college fuzzing module (free)
FuzzCon presentations (free YouTube)
Google OSS-Fuzz (free, good examples of production fuzz harnesses)

Stage 04

Web and API Vulnerability Research

Web targets are accessible, high-impact, and available through bug bounty programs. Web vulnerability research is the most accessible entry point to the field.

Advanced Web Vulnerability Classes

Server-side template injection (SSTI): identifying template engines from behavior and error messages; template-specific payloads for Jinja2, Twig, FreeMarker, Velocity, Smarty; escalation from information disclosure to RCE
Deserialization attacks: Java gadget chains via ysoserial (CommonsCollections, Spring, JDK); PHP object injection via magic methods; Python pickle deserialization of untrusted data; .NET deserialization via ysoserial.net
OAuth/OIDC attacks: account takeover via state parameter CSRF; token theft via open redirect in redirect_uri; bypassing incorrectly implemented proof-of-possession (DPoP)
Prototype pollution in JavaScript: object merge patterns; escalation to XSS or RCE via property injection
HTTP request smuggling: CL.TE (Content-Length interpreted by front-end, Transfer-Encoding by back-end); TE.CL (reverse); HTTP/2 downgrade smuggling
Cache poisoning — injecting malicious content into shared caches via unkeyed headers
Host header attacks — password reset poisoning, SSRF via Host header
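The CL.TE case above is easier to follow with the raw bytes in front of you. The sketch below builds such a request and models, in a deliberately simplified way, how a front-end honoring Content-Length and a back-end honoring Transfer-Encoding would each delimit the body. The host name and the smuggled bytes are placeholders, nothing is sent over the network, and the technique should only be tested against systems you are authorized to assess.

```python
def build_cl_te_request(host: str, smuggled: bytes) -> bytes:
    """Build a request carrying both Content-Length and Transfer-Encoding."""
    body = b"0\r\n\r\n" + smuggled  # terminating chunk, then extra bytes
    lines = [
        b"POST / HTTP/1.1",
        b"Host: " + host.encode("ascii"),
        b"Content-Length: " + str(len(body)).encode("ascii"),
        b"Transfer-Encoding: chunked",
        b"",
        b"",
    ]
    return b"\r\n".join(lines) + body


def split_body(request: bytes) -> bytes:
    """Return everything after the blank line that ends the headers."""
    return request.split(b"\r\n\r\n", 1)[1]


def front_end_view(body: bytes, content_length: int) -> bytes:
    """A Content-Length parser forwards exactly content_length bytes."""
    return body[:content_length]


def back_end_leftover(body: bytes) -> bytes:
    """A chunked parser stops after the zero-length chunk; the rest remains."""
    end = body.index(b"0\r\n\r\n") + len(b"0\r\n\r\n")
    return body[end:]


request = build_cl_te_request("target.example", b"SMUGGLED")
body = split_body(request)
print(front_end_view(body, len(body)))
print(back_end_leftover(body))
```

The front-end forwards the whole 13-byte body as one request, while the back-end treats the bytes after the terminating chunk as the start of the next request on the same connection.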

Bug Bounty Research Methodology

Target selection: programs with larger attack surface (main product, not just marketing site); programs with known technology stack (prefer familiar tech); programs with good payouts relative to scope
Reconnaissance: asset discovery via subfinder, amass, crt.sh; technology fingerprinting with Wappalyzer, WhatWeb, HTTP headers; JavaScript analysis with linkfinder, secretfinder, JSParser; historical data via Wayback Machine
Systematic coverage: map all endpoints and parameters; test authentication and authorization on every endpoint; focus on unique functionality with less prior testing
High-value targets within programs: authentication flows (password reset, login, account recovery); payment and billing; admin functionality; file upload; import/export features

Resources

PortSwigger Web Security Academy (free)
HackerOne Hacktivity (free, real reports)
Jason Haddix's methodology posts (free)
STÖK's YouTube (free)

Stage 05

Reverse Engineering for Vulnerability Research

Binary targets require reverse engineering before fuzzing or manual analysis. Deep RE skills from the Malware Analyst path apply here with a research focus.

From Malware Analyst Path

Complete Ghidra and IDA Pro proficiency (Stage 3 from Malware Analyst path)
x86/x64 assembly fluency (Stage 0 from Malware Analyst path)
PE/ELF format knowledge (Stage 1 from Malware Analyst path)
Dynamic analysis and debugging (Stage 4 from Malware Analyst path)

RE for Vulnerability Research Specifics

Identifying vulnerability-prone code patterns in disassembly: size calculations with multiplication before malloc; memcpy/strcpy without explicit bounds check; loop conditions with off-by-one potential; pointer arithmetic without validation; format string calls printf(user_data) pattern
Protocol reverse engineering for fuzzer development: identifying message framing (length fields, delimiters); mapping message types and handlers; finding state machine transitions
Vulnerability variant hunting: understanding a patched vulnerability and finding incomplete fixes or unpatched variants; finding sibling functions with the same pattern; finding related products with ported vulnerable code
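Once message framing has been recovered from a binary, it helps to write it down as a parser that a fuzzer can later reuse. The sketch below assumes an invented framing of a 16-bit big-endian payload length, a one-byte message type, then the payload; a real protocol will differ.

```python
import struct

FRAME_HEADER = struct.Struct(">HB")  # assumed: u16 length, u8 type, big-endian


def split_frames(stream: bytes):
    """Split a byte stream into (msg_type, payload) tuples."""
    frames, offset = [], 0
    while offset < len(stream):
        if len(stream) - offset < FRAME_HEADER.size:
            raise ValueError("truncated frame header")
        length, msg_type = FRAME_HEADER.unpack_from(stream, offset)
        offset += FRAME_HEADER.size
        if length > len(stream) - offset:
            raise ValueError("length field exceeds remaining data")
        frames.append((msg_type, stream[offset:offset + length]))
        offset += length
    return frames


stream = FRAME_HEADER.pack(5, 1) + b"hello" + FRAME_HEADER.pack(0, 2)
print(split_frames(stream))
```

The two explicit length checks are exactly the validations worth looking for in the target's own parser; when they are missing there, the length field becomes a fuzzing priority.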

Automated Binary Analysis

Binary Ninja automation: scripting vulnerability pattern detection; building analysis plugins
angr (symbolic execution): exploring execution paths symbolically, within the limits of path explosion; proving or disproving reachability of dangerous conditions; constraint solving for input generation
KLEE — symbolic execution on LLVM bitcode

Stage 06

Vulnerability Reporting and Disclosure

A found vulnerability has zero impact without a clear report that enables the vendor to understand and fix it.

Vulnerability Report Structure

Executive summary — severity, affected product/version, one-sentence description
Technical description — precise characterization of the vulnerability class and mechanism
Root cause analysis — the exact code path or design flaw
Reproduction steps — step-by-step from clean state to triggering the vulnerability
Proof-of-concept — crash file, exploit script, or reproduction case
Impact assessment — what an attacker can achieve; confidentiality/integrity/availability
Affected versions — specific version numbers; when was the issue introduced?
Suggested remediation — what the vendor should fix and how
CVSS score — base score calculation with justification
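For the base score calculation, the following sketch implements the CVSS v3.1 base score equations as published by FIRST, covering base metrics only (no temporal or environmental scores). It is an illustration for checking your own arithmetic rather than a replacement for the official calculator; the function and table names are my own.

```python
# Metric weights from the CVSS v3.1 specification (base metrics only).
AV = {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2}
AC = {"L": 0.77, "H": 0.44}
UI = {"N": 0.85, "R": 0.62}
CIA = {"H": 0.56, "L": 0.22, "N": 0.0}
PR_UNCHANGED = {"N": 0.85, "L": 0.62, "H": 0.27}
PR_CHANGED = {"N": 0.85, "L": 0.68, "H": 0.5}


def roundup(value: float) -> float:
    """CVSS v3.1 Roundup: smallest one-decimal number >= value."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (scaled // 10000 + 1) / 10.0


def base_score(vector: str) -> float:
    """Compute the base score from a vector such as AV:N/AC:L/PR:N/..."""
    m = dict(part.split(":", 1) for part in vector.split("/"))
    changed = m["S"] == "C"
    pr = (PR_CHANGED if changed else PR_UNCHANGED)[m["PR"]]
    exploitability = 8.22 * AV[m["AV"]] * AC[m["AC"]] * pr * UI[m["UI"]]
    iss = 1 - (1 - CIA[m["C"]]) * (1 - CIA[m["I"]]) * (1 - CIA[m["A"]])
    if changed:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    else:
        impact = 6.42 * iss
    if impact <= 0:
        return 0.0
    total = impact + exploitability
    if changed:
        total *= 1.08
    return roundup(min(total, 10))


print(base_score("CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H"))
```

In a report, state the vector string alongside the number and justify each metric choice, since vendors most often dispute the individual metrics rather than the arithmetic.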

Disclosure Coordination

Initial contact — use vendor security email or bug bounty platform; encrypted if possible
Include enough to demonstrate validity; do not include full exploit
Timeline negotiation — agree on disclosure date; typical 90 days
Tracking — keep records of all communications with dates
CVE assignment — vendor or CNA typically assigns; researcher can request via MITRE
Publication — security advisory, blog post, conference presentation
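Keeping the agreed timeline in a script avoids date arithmetic mistakes during a long disclosure. The sketch below computes a deadline from the report date using the 90-day window mentioned above as its default; the function names and dates are illustrative.

```python
from datetime import date, timedelta


def disclosure_deadline(reported: date, window_days: int = 90) -> date:
    """Date on which details may be published under the agreed window."""
    return reported + timedelta(days=window_days)


def days_remaining(reported: date, today: date, window_days: int = 90) -> int:
    """Days left before the deadline; negative once it has passed."""
    return (disclosure_deadline(reported, window_days) - today).days


reported = date(2024, 1, 1)  # hypothetical report date
print(disclosure_deadline(reported))
print(days_remaining(reported, date(2024, 3, 1)))
```

Logging each vendor contact with its date next to this deadline gives you the timeline section of the final write-up with little extra effort.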

Conference Presentation

DEF CON, Black Hat, CCC, OffensiveCon — primary vulnerability research venues
Format: technical walkthrough of discovery process + vulnerability detail + demo
Paper writing for academic venues: IEEE S&P, USENIX Security, CCS, NDSS

Resources

Coordinated Vulnerability Disclosure guidelines (CISA, free)
Google Project Zero disclosure policy (free)
CVE assignment process (MITRE, free)

Stage 07

Hands-On Practice & Portfolio

Practice Platforms and Targets

pwn.college (free) — structured binary exploitation curriculum; kernel, heap, stack, ROP
exploit.education (free) — VMs for exploitation practice
CTFtime.org — PWN and RE categories in CTF competitions
HackTheBox (free/paid) — machines and challenges
picoCTF — annual CTF with binary exploitation challenges (free)
VulnHub — vulnerable VMs for exploitation practice (free)
Bug bounty: HackerOne public programs with defined bug bounties; Google Vulnerability Reward Program (VRP) covers Chrome, Android, GCP; Microsoft Bug Bounty covers Windows, Edge, Azure

Building a Portfolio

CVE disclosures — finding and responsibly disclosing a real vulnerability is the highest-signal portfolio item; even a low-severity CVE in a minor project demonstrates the full research workflow
CTF writeups — detailed technical explanations of how exploitation challenges were solved; published on a blog
Original research blog posts — documenting a research area, a fuzzing campaign, or a class of vulnerabilities
Public fuzzing harnesses — contributing fuzz targets to OSS-Fuzz or publishing standalone harnesses on GitHub
Conference talks — presenting at DEF CON, BSides, or local security meetups
Bug bounty hall of fame appearances — publicly credited findings on programs

OSS-Fuzz Contributions

Contributing fuzz targets to Google's OSS-Fuzz is a high-signal portfolio action: pick an open-source project in C/C++ that handles untrusted input; write a libFuzzer target; submit to OSS-Fuzz via GitHub PR; a security-relevant bug found this way can lead to a CVE once it is triaged and disclosed

What to Document on LabList

CVE research write-ups — full disclosure timeline and technical details
CTF exploitation writeups — binary exploitation, RE challenges with methodology explained
Fuzzing campaigns — documented harness development, corpus management, crash triage
Bug bounty reports — sanitized versions of submitted findings
Research blog — explaining a vulnerability class, a fuzzer technique, or a target analysis

FAQ

Common questions

How long does it take to become a Vulnerability Researcher?

4–5 years optimistic at 20–25 hours/week, 5–6 years realistic. VR is one of the longest paths in cybersecurity because it demands deep operating systems internals, assembly fluency, fuzzing methodology, and CVE-quality writeups. The fastest paths come from reverse engineering, AppSec, or systems programming backgrounds. Pure self-taught paths exist but typically take longer than security-engineer-to-VR transitions.

Which certifications matter for VR roles?

OSEE (Offensive Security Exploitation Expert) for advanced exploitation work. OSED for exploit development depth. GREM for malware analysis overlap. SANS courses (SEC660, SEC760) are gold standard but expensive. CVE discovery is the primary portfolio signal — certs matter less than published vulnerabilities.

Do I need a CS degree?

Helpful but not strictly required. Federal and clearance-required roles often require a bachelor's plus security clearance. Security clearance adds $65K+ in many government-adjacent roles. Self-taught paths through CTF reverse engineering, public CVE research, and bug bounty progression produce competitive candidates. The technical bar is genuinely high — assembly, fuzzing, exploit development — favoring formal CS exposure but not requiring it.

What separates a hired Vulnerability Researcher?

Published CVEs with documented research methodology. Sample fuzz harnesses on GitHub, conference talks (BSides, DEF CON villages), and bug bounty disclosure history demonstrate capability. Fuzzing expertise (AFL++, libFuzzer, custom harness development) is the fastest-growing technical skill. Other differentiators: exploit primitive knowledge, mitigation bypass familiarity (ASLR, DEP, CFI), and at least one significant publicly disclosed vulnerability in your name.
