VARIANT DISCOVERY

UNCOVER EVERY MALWARE VARIANT

Variant Discovery turns a single hash into visibility of the entire malware family. In seconds, understand all the infrastructure this malware family used across history. No YARA required.

WHY VARIANT DISCOVERY MATTERS

EXPOSES EVASIVE VARIANTS

Analyzes file structure, behavior, and content to uncover the whole malware family.

REVEALS BLIND SPOTS

Reanalyzes your entire history, turning yesterday’s blind spot into today’s insight.

MANUAL HUNTS DON'T SCALE

Turns one bad sample into a sweep for lookalikes, giving a campaign view beyond a single-file verdict.

“For hunting malware, Stairwell is the best way to do it.”

Michael Francess
Cybersecurity Advanced Threat and Response | Wyndham Hotels & Resorts

WHAT IS VARIANT DISCOVERY?

Attackers don’t ship one binary. They ship dozens: repacked, re-signed, and slightly modified to evade detection. Variant Discovery finds the entire family from a single sample, so you stop playing whack-a-mole with hashes.

YOUR PRIVATE VAULT

Variant Discovery starts with Stairwell’s file-centric view of your environment. Every executable, script, and artifact is stored in your private, encrypted vault, not in a public crowdsourced pool.

FINDS STRUCTURAL SIMILARITY

Attackers change what’s easy: packers, signatures, minor code tweaks. Hash-based detection dies instantly. Variant Discovery examines the underlying structure of files to identify lookalikes that share real DNA.

MAP THE ENTIRE MALWARE FAMILY

Catching a single sample is useful. Understanding the entire operation is better. Variant Discovery doesn’t stop at the file; it maps the malware family tree.
In seconds, one hash becomes complete breach visibility.

THREAT REPORTS MEET VARIANT DISCOVERY

As new threat intel, YARA rules, and IOCs from threat reports arrive, Stairwell’s Variant Discovery gains insight. At the click of a button, Stairwell reanalyzes your entire file corpus in your private vault, lighting up variants that were invisible on the first pass.

FROM ALERT TO “RUN TO GROUND”

Variant Discovery plugs into Stairwell’s broader investigation workflow. When an alert fires, Variant Discovery automatically asks: What else looks like this? And where might it have been?

THE HIDDEN MALWARE REPORT

Threat reports are a starting point. Stairwell goes further and finds the look-alikes. On average, we uncover 157% more variants, or 20+ additional malware variants per published threat report.

ENGINEERED FOR PLANET-SCALE

Built by veterans of Google and the intelligence community. Web-scale indexing, YARA at ludicrous speed, and structured AI reasoning turn raw artifacts into instant understanding across everything you’ve ever seen.

FREQUENTLY ASKED QUESTIONS

Malware variants are modified versions of an existing malware sample that share the same underlying code, capabilities, or infrastructure but differ in ways that defeat hash-based detection. Attackers create variants by repacking the binary with a different packer, re-signing the file with a different certificate, changing minor strings or code sections, adding junk code, or simply recompiling the code with no other modification. Each of these alters the hash without changing the malware’s core functionality.

Traditional detection relies heavily on exact hash matching: a file is flagged if its hash appears in a known-bad list. Variants sidestep this entirely because a single changed byte produces a completely different hash. This is why organizations frequently manage to contain one instance of a malware infection only to find related samples on other systems that the detection never recognized. A more reliable way to catch variants is to compare a file’s structure and code against large data sets of known malware. Running that comparison at scale is what makes similarity matching such a powerful analytical tool.
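The hash sensitivity described above is easy to see for yourself. The following is a minimal sketch using Python's standard hashlib module. The byte strings and the blocklist are placeholders invented for illustration, not real samples or a real product feature.

```python
import hashlib

# Placeholder bytes standing in for a file's contents (not a real sample).
original = b"MZ" + b"\x00" * 62 + b"example file contents"

# Make a copy that differs by exactly one bit in the final byte.
modified = bytearray(original)
modified[-1] ^= 0x01
modified = bytes(modified)

hash_original = hashlib.sha256(original).hexdigest()
hash_modified = hashlib.sha256(modified).hexdigest()

# A hypothetical blocklist that only knows the original hash.
blocklist = {hash_original}

print("original:", hash_original)
print("modified:", hash_modified)
print("original flagged:", hash_original in blocklist)
print("modified flagged:", hash_modified in blocklist)
```

The two inputs are the same length and differ in a single byte, yet the digests are unrelated and the blocklist lookup misses the modified copy.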

Attackers use several techniques to create malware variants that evade signature detection while preserving the same capabilities. The most common are repacking (wrapping the original code in a different packer that produces a new binary structure), re-signing (using a different code-signing certificate to change the file’s cryptographic properties), and minor code modification (inserting dead code, reordering functions, or renaming embedded strings and symbols, all of which alter the compiled binary without affecting behavior).

More sophisticated techniques include metamorphic coding, where the malware rewrites its own code structure between infections, and polymorphic packing, where the decryption stub changes each time while the payload remains functionally identical. For defenders, the implication is that detection coverage based on a single hash or a small number of known samples provides a false sense of security. Adversaries who understand your detection capabilities will produce new variants specifically designed to fall outside those signatures, making structural analysis that looks at what the code actually does a far more durable detection approach.
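To make the idea of "different wrapper, identical payload" concrete, here is a deliberately simple toy. It assumes a made-up container format (one length byte, a key, then an XOR-encoded body) and a harmless placeholder string as the payload. Real packers are far more elaborate, and the function names below are hypothetical.

```python
import hashlib

def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR every byte of data with a repeating key."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

def toy_pack(payload: bytes, key: bytes) -> bytes:
    """Toy container layout: 1-byte key length | key | XOR-encoded payload."""
    return bytes([len(key)]) + key + xor_bytes(payload, key)

def toy_unpack(blob: bytes) -> bytes:
    """Reverse toy_pack by reading the key from the blob and decoding the body."""
    key_len = blob[0]
    key = blob[1:1 + key_len]
    return xor_bytes(blob[1 + key_len:], key)

# Harmless placeholder bytes standing in for the unchanged inner content.
payload = b"placeholder bytes standing in for identical inner content"

# Two "builds" of the same payload, each wrapped with a different key.
build_a = toy_pack(payload, b"\x11\x22\x33\x44")
build_b = toy_pack(payload, b"\xa5\x5a\xc3\x3c")

print("hashes match:   ", hashlib.sha256(build_a).digest() == hashlib.sha256(build_b).digest())
print("payloads match: ", toy_unpack(build_a) == toy_unpack(build_b))
```

A hash computed over either build says nothing about the other, even though both decode to the same bytes. That is the gap structural analysis is meant to close.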

A malware family is a group of malware samples that share a common codebase, design, or operational lineage. Samples within a family typically use the same communication protocols, employ the same core capabilities, share code modules, or show structural patterns that indicate common authorship. Family membership is a more useful classification than individual hash verdicts because it connects the dots between samples that appear superficially different.

Analysts identify malware family membership by examining code structure, shared functions, imported libraries, network communication patterns, file system artifacts, and the infrastructure the malware contacts. YARA rules are a common tool for codifying family membership: a researcher who identifies core patterns shared across a family writes rules that match any sample exhibiting those patterns, regardless of surface-level modifications. Automated variant analysis tools accelerate this process by computing structural similarity at scale, grouping files that are related by code rather than requiring a researcher to manually inspect each new sample.
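The rule-writing step can be illustrated with a plain-Python stand-in. This is not YARA syntax and not the yara-python API. It only mimics the common "at least N of these patterns" style of condition, and the family name, patterns, and sample bytes are all invented for the example.

```python
from typing import Sequence

def matches_family(data: bytes, patterns: Sequence[bytes], min_matches: int) -> bool:
    """Return True when at least min_matches of the patterns occur in data."""
    hits = sum(1 for pattern in patterns if pattern in data)
    return hits >= min_matches

# Hypothetical patterns a researcher might settle on for a made-up family.
example_family_patterns = (
    b"/gate.php?id=",           # invented URL path fragment
    b"ExampleMutex_v",          # invented mutex name prefix
    b"\x8b\x45\xfc\x33\xc1",    # invented instruction byte sequence
)

# Invented sample bytes: sample_b changes the mutex name but keeps the rest.
sample_a = b"\x00\x01/gate.php?id=\x00ExampleMutex_v1\x8b\x45\xfc\x33\xc1\xff"
sample_b = b"PACK/gate.php?id=\x00RenamedMutex_v2\x8b\x45\xfc\x33\xc1"
unrelated = b"just some ordinary file contents"

for name, data in [("sample_a", sample_a), ("sample_b", sample_b), ("unrelated", unrelated)]:
    print(name, matches_family(data, example_family_patterns, min_matches=2))
```

Requiring only two of the three patterns lets the rule survive one surface-level change, which is the trade-off a rule author tunes by hand for every family.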

Variant discovery without pre-written YARA rules works by computing mathematical representations of a file’s structural characteristics and comparing those representations against a corpus of known files. When two files score as highly similar across multiple structural dimensions, even when their hashes are completely different, the system groups them as likely variants. This approach finds relationships that no existing YARA rule has ever described. One could think of YARA rules as an attempt at finding variants manually. Variant Discovery automates this process at scale.

This is particularly valuable for zero-day or novel malware families where no YARA rules exist yet. When an analyst confirms that one file is malicious and submits it for analysis, the system immediately searches for structurally similar files across the malware corpus and your enterprise file history, surfacing potential related samples that would otherwise require a researcher to first reverse-engineer the malware, write a rule, and then run that rule against the corpus. The entire research step is compressed into a near-instant similarity query.
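As a rough sketch of what a similarity query can look like, the example below represents each file as its set of byte 4-grams and scores pairs with Jaccard similarity. This is a textbook technique chosen for illustration and is an assumption, not a description of how Variant Discovery computes similarity. The corpus, file names, and threshold are likewise invented, and the "files" are generated pseudo-random bytes.

```python
import random

def byte_ngrams(data: bytes, n: int = 4) -> set[bytes]:
    """Collect the set of overlapping n-byte windows in data."""
    return {data[i:i + n] for i in range(len(data) - n + 1)}

def jaccard(a: set[bytes], b: set[bytes]) -> float:
    """Jaccard similarity: size of the intersection over size of the union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def find_similar(seed: bytes, corpus: dict[str, bytes], threshold: float = 0.5) -> list[tuple[str, float]]:
    """Score every corpus entry against the seed and keep those at or above threshold."""
    seed_grams = byte_ngrams(seed)
    scored = [(name, jaccard(seed_grams, byte_ngrams(data))) for name, data in corpus.items()]
    hits = [(name, score) for name, score in scored if score >= threshold]
    return sorted(hits, key=lambda item: item[1], reverse=True)

def pseudo_random_bytes(seed_value: int, length: int) -> bytes:
    """Deterministic filler bytes so the example is reproducible."""
    rng = random.Random(seed_value)
    return bytes(rng.getrandbits(8) for _ in range(length))

# Invented data: the variant reuses the same core bytes with a new header
# and some padding inserted in the middle; the unrelated file shares nothing.
core = pseudo_random_bytes(1, 2048)
confirmed_sample = b"HDR1" + core
corpus = {
    "variant": b"HDR2-resigned" + core[:1000] + b"\x90" * 16 + core[1000:],
    "unrelated": pseudo_random_bytes(2, 2048),
}

for name, score in find_similar(confirmed_sample, corpus):
    print(f"{name}: {score:.3f}")
```

The variant and the confirmed sample are different byte strings, so their hashes differ, yet nearly all of their 4-grams overlap. Production systems would use more robust features than raw byte windows, but the shape of the query is the same: one confirmed file in, a ranked list of lookalikes out.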

Hash-based detection fails against polymorphic and repacked malware because these techniques are specifically designed to change the file’s fingerprint while preserving its functional behavior. A repacked binary runs through a packing layer that produces a completely different hash from the original, and a polymorphic sample regenerates its own code structure between executions. Neither will match a hash-based blocklist even though the underlying malicious capability is identical.

The practical consequence for security operations is significant. Blocklists and reputation services built on hash matching cover only the exact samples that have been previously observed and catalogued. Attackers who repack before each campaign, or who sell malware-as-a-service with unique builds for each customer, can reliably defeat hash-based controls at each step. Detection approaches that analyze what a file is built to do, rather than what its hash value is, provide much more durable coverage because they focus on the parts of the malware that are harder to change: the core logic, communication patterns, and functional capabilities that define the tool regardless of its surface-level packaging.

Detecting polymorphic malware in an enterprise environment requires moving beyond hash-based detection to methods that analyze file behavior, structure, and characteristics that remain consistent across mutations. YARA rules targeting stable code patterns, structural similarity analysis comparing files against known malware families, and behavioral detection looking for TTPs rather than exact signatures are all more effective than hash matching against polymorphic samples.

In enterprise environments, the most practical approach combines multiple detection layers. EDR behavioral detection identifies suspicious runtime activity regardless of the binary’s hash. File structure analysis applied during triage identifies structural similarity to known malware families even when the binary has been repacked. YARA rules targeting code-level patterns catch samples that share functional DNA with known threats. And variant discovery applied retroactively against your file history surfaces samples that were collected before any of these detections existed, ensuring that historical exposure is identified even when real-time detection failed at the time of initial collection.

A malware family tree is a visual or data representation of the relationships between samples in a malware family, showing how variants evolved from each other over time, which samples share the most structural similarity, and where the family spread across an environment or across campaigns. It gives investigators a map of the threat rather than a disconnected list of individual hashes.

During a threat investigation, a family tree accelerates both scope determination and attribution. Scope determination becomes more accurate because investigators see not just the confirmed samples but the full set of structurally related files that belong to the same family. Attribution becomes easier because family trees reveal patterns of code reuse, tooling evolution, and infrastructure overlap that connect current activity to historical campaigns. Investigators who can start from one confirmed malicious file and expand to the complete family in seconds, rather than hours or days of manual research, close investigations faster and with greater confidence in the completeness of their findings.
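One simple way to picture a family tree as data is a parent map. In the sketch below, each sample is linked to the most similar sample that was seen before it. The sample labels, the first-seen order, the pairwise scores, and the threshold are all hypothetical, and treating similarity as a proxy for lineage is a simplifying assumption.

```python
from typing import Dict, FrozenSet, List, Optional

def build_family_tree(
    first_seen_order: List[str],
    scores: Dict[FrozenSet[str], float],
    threshold: float = 0.5,
) -> Dict[str, Optional[str]]:
    """Map each sample to its most similar earlier sample, or None for a root."""
    parent: Dict[str, Optional[str]] = {}
    for index, sample in enumerate(first_seen_order):
        best_parent, best_score = None, threshold
        for earlier in first_seen_order[:index]:
            # Pairs with no recorded score are treated as dissimilar.
            score = scores.get(frozenset({earlier, sample}), 0.0)
            if score >= best_score:
                best_parent, best_score = earlier, score
        parent[sample] = best_parent
    return parent

# Hypothetical samples, listed oldest first, with invented similarity scores.
order = ["sample_a", "sample_b", "sample_c", "sample_d", "sample_e"]
pairwise_scores = {
    frozenset({"sample_a", "sample_b"}): 0.91,
    frozenset({"sample_a", "sample_c"}): 0.62,
    frozenset({"sample_b", "sample_c"}): 0.88,
    frozenset({"sample_a", "sample_d"}): 0.55,
    frozenset({"sample_b", "sample_d"}): 0.58,
    frozenset({"sample_c", "sample_d"}): 0.93,
    frozenset({"sample_a", "sample_e"}): 0.05,
    frozenset({"sample_d", "sample_e"}): 0.08,
}

tree = build_family_tree(order, pairwise_scores)
for sample in order:
    print(sample, "<-", tree[sample])
```

With these invented scores the result is a single chain from sample_a through sample_d, while sample_e stays a separate root because nothing earlier resembles it. An investigator reading the map sees both the lineage and the file that does not belong.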