VARIANT DISCOVERY
UNCOVER EVERY MALWARE VARIANT
Variant Discovery turns a single hash into visibility of the entire malware family. In seconds, understand all the infrastructure this malware family used across history. No YARA required.
WHY VARIANT DISCOVERY MATTERS
EXPOSES EVASIVE VARIANTS
Analyzes file structure, behavior, and content to uncover the whole malware family.
REVEALS BLIND SPOTS
Reanalyzes your entire history, turning yesterday’s blind spot into today’s insight.
MANUAL HUNTS DON'T SCALE
Turns one bad sample into a sweep for lookalikes, giving a campaign view beyond a single-file verdict.
Michael Francess
Cybersecurity Advanced Threat and Response
| Wyndham Hotels & Resorts
WHAT IS VARIANT DISCOVERY?
YOUR PRIVATE VAULT
- Every file you’ve ever seen is searchable.
- Nothing ages out because a log pipeline rolled over.
- Your files, analyses, and verdicts are never exposed to adversaries, which keeps the advantage on your side.
FINDS STRUCTURAL SIMILARITY
- Goes beyond hashes to examine code structure, sections, imports, and relationships.
- Groups files that “feel” related even when their surface details differ.
- Surfaces clusters of variants tied to a single campaign or tooling set.
MAP THE ENTIRE MALWARE FAMILY
- See how variants evolved over time in your environment.
- Trace propagation: which hosts, which users, which time windows.
- Pivot from one IOC to the full spread of related artifacts and infrastructure.
THREAT REPORTS MEET VARIANT DISCOVERY
- New intel re-scores old files.
- Hidden variants from months ago suddenly become visible.
FROM ALERT TO “RUN TO GROUND”
- One click from sample to family, from IOC to campaign.
- Prove containment by showing where variants did, and didn’t, land.
- Replace “we think we got it” with hard evidence from your own files.
THE HIDDEN MALWARE REPORT
ENGINEERED FOR PLANET-SCALE
Built by Google and intelligence veterans. Web-scale indexing, YARA at ludicrous speed, and structured AI reasoning turn raw artifacts into instant understanding across everything you’ve ever seen.
LEARN MORE ABOUT STAIRWELL
FREQUENTLY ASKED QUESTIONS
What are malware variants and why do they evade traditional detection?
Malware variants are modified versions of an existing malware sample that share the same underlying code, capabilities, or infrastructure but differ in ways that defeat hash-based detection. Attackers create variants by repacking the binary with a different packer, re-signing the file with a different certificate, changing minor strings or code sections, adding junk code, or simply recompiling the code without any other modification. Each of these changes alters the hash without changing the malware’s core functionality.
Traditional detection relies heavily on exact hash matching: a file is flagged if its hash appears in a known-bad list. Variants sidestep this entirely because a single changed byte produces a completely different cryptographic hash. This is why organizations frequently contain one instance of a malware infection only to find related samples on other systems that the detection never recognized. A more reliable way to catch variants is to compare a file’s structure and code against large data sets of known malware. Run at scale, this similarity matching surfaces related samples that an exact hash lookup cannot.
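The single-byte point can be shown in a few lines. This is a minimal sketch that assumes a SHA-256 blocklist; the file contents and the `is_blocked` helper are made up for illustration and do not come from any real product.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


# Stand-in bytes for a known-bad file (illustrative content only).
original = b"MZ" + b"\x90" * 64 + b"example payload bytes"

# A "variant" that differs from the original in exactly one byte.
mutated = bytearray(original)
mutated[10] ^= 0x01
variant = bytes(mutated)

# Hypothetical blocklist containing only the hash of the original.
blocklist = {sha256_hex(original)}


def is_blocked(data: bytes) -> bool:
    """Exact-match lookup: flag a file only if its hash is on the list."""
    return sha256_hex(data) in blocklist


differing_bytes = sum(a != b for a, b in zip(original, variant))
print(differing_bytes)       # 1
print(is_blocked(original))  # True
print(is_blocked(variant))   # False
```

The two inputs are the same length and differ in one byte, yet the exact-match lookup treats the second as an unknown file.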
How do attackers modify malware to create new variants without changing functionality?
Attackers use several techniques to create malware variants that evade signature detection while preserving the same capabilities. The most common are repacking (wrapping the original code in a different packer that produces a new binary structure), re-signing (using a different code-signing certificate, which changes the file’s embedded signature and therefore its hash), and minor code modification (inserting dead code, reordering functions, or changing strings and constants, all of which alter the compiled binary without affecting behavior).
More sophisticated techniques include metamorphic coding, where the malware rewrites its own code structure between infections, and polymorphic packing, where the decryption stub changes each time while the payload remains functionally identical. For defenders, the implication is that detection coverage based on a single hash or a small number of known samples provides a false sense of security. Adversaries who understand your detection capabilities will produce new variants specifically designed to fall outside those signatures, making structural analysis that looks at what the code actually does a far more durable detection approach.
What is a malware family and how do analysts identify family membership?
A malware family is a group of malware samples that share a common codebase, design, or operational lineage. Samples within a family typically use the same communication protocols, employ the same core capabilities, share code modules, or show structural patterns that indicate common authorship. Family membership is a more useful classification than individual hash verdicts because it connects the dots between samples that appear superficially different.
Analysts identify malware family membership by examining code structure, shared functions, imported libraries, network communication patterns, file system artifacts, and the infrastructure the malware contacts. YARA rules are a common tool for codifying family membership: a researcher who identifies core patterns shared across a family writes rules that match any sample exhibiting those patterns, regardless of surface-level modifications. Automated variant analysis tools accelerate this process by computing structural similarity at scale, grouping files that are related by code rather than requiring a researcher to manually inspect each new sample.
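The idea of codifying family membership as shared patterns can be sketched in plain Python. This is not YARA syntax or the YARA engine; it is a simplified stand-in that assumes a rule is a set of byte patterns plus a minimum number that must appear. The `FamilyRule` class, the patterns, and the sample bytes are all hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class FamilyRule:
    """A toy family rule: match if enough of the patterns occur in a file."""
    name: str
    patterns: Tuple[bytes, ...]
    min_matches: int

    def matches(self, data: bytes) -> bool:
        hits = sum(1 for pattern in self.patterns if pattern in data)
        return hits >= self.min_matches


# Hypothetical patterns an analyst might extract from a family.
rule = FamilyRule(
    name="ExampleFamily",
    patterns=(b"/gate.php?id=", b"ExampleMutex_v", b"\x8b\x45\xfc\x33\xc1"),
    min_matches=2,
)

# Two made-up samples with different surface bytes but shared core patterns.
sample_a = b"junk" + b"/gate.php?id=" + b"\x00" * 8 + b"ExampleMutex_v1"
sample_b = b"padding" * 3 + b"ExampleMutex_v2" + b"\x8b\x45\xfc\x33\xc1" + b"more"
# An unrelated file that happens to contain one of the patterns.
benign = b"an unrelated file that mentions /gate.php?id= once"

print(rule.matches(sample_a))  # True
print(rule.matches(sample_b))  # True
print(rule.matches(benign))    # False
```

Requiring at least two patterns is a design choice in this sketch: it tolerates a variant that drops one pattern while avoiding a match on a file that shares only a single string.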
How does variant discovery work without requiring YARA rules?
Variant discovery without pre-written YARA rules works by computing mathematical representations of a file’s structural characteristics and comparing those representations against a corpus of known files. When two files score as highly similar across multiple structural dimensions, even when their hashes are completely different, the system groups them as likely variants. This approach finds relationships that no existing YARA rule has ever described. One could think of YARA rules as an attempt at finding variants manually. Variant Discovery automates this process at scale.
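One way to picture similarity across multiple structural dimensions is a per-dimension set comparison that is then averaged. The sketch below assumes Jaccard similarity over hand-written feature sets; the dimension names, feature values, and the 0.7 threshold are illustrative assumptions and do not describe how any particular product computes similarity.

```python
from typing import Dict, Set

# A file is represented as named dimensions, each holding a set of features.
Features = Dict[str, Set[str]]


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Size of the intersection divided by size of the union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def similarity(x: Features, y: Features) -> float:
    """Average the Jaccard score over every dimension present in either file."""
    dims = sorted(set(x) | set(y))
    if not dims:
        return 0.0
    scores = [jaccard(x.get(d, set()), y.get(d, set())) for d in dims]
    return sum(scores) / len(scores)


# Hypothetical features for a confirmed-malicious file.
known = {
    "imports": {"CreateRemoteThread", "VirtualAllocEx", "WriteProcessMemory", "InternetOpenA"},
    "sections": {".text", ".rdata", ".data"},
    "strings": {"/gate.php", "ExampleMutex"},
}
# A recompiled variant: different hash, mostly the same structure.
variant = {
    "imports": {"CreateRemoteThread", "VirtualAllocEx", "WriteProcessMemory", "InternetOpenA", "Sleep"},
    "sections": {".text", ".rdata", ".data", ".rsrc"},
    "strings": {"/gate.php", "ExampleMutex"},
}
# An unrelated utility.
unrelated = {
    "imports": {"CreateFileW", "ReadFile"},
    "sections": {".text", ".rdata"},
    "strings": {"Usage: tool [options]"},
}

THRESHOLD = 0.7  # assumed cut-off for "likely variant"

print(round(similarity(known, variant), 2))    # 0.85
print(round(similarity(known, unrelated), 2))  # 0.22
print(similarity(known, variant) >= THRESHOLD)  # True
```

No rule about the family was written in advance; the grouping comes only from how much structure the two files share.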
This is particularly valuable for zero-day or novel malware families where no YARA rules exist yet. When an analyst confirms that one file is malicious and submits it for analysis, the system immediately searches for structurally similar files across the malware corpus and your enterprise file history, surfacing potential related samples that would otherwise require a researcher to first reverse-engineer the malware, write a rule, and then run that rule against the corpus. The entire research step is compressed into a near-instant similarity query.
Why does hash-based malware detection fail against polymorphic and repacked samples?
Hash-based detection fails against polymorphic and repacked malware because these techniques are specifically designed to change the file’s fingerprint while preserving its functional behavior. A repacked binary runs through a packing layer that produces a completely different hash from the original, and a polymorphic sample re-encrypts its payload behind a decryption stub that changes with each new copy. Neither will match a hash-based blocklist even though the underlying malicious capability is identical.
The practical consequence for security operations is significant. Blocklists and reputation services built on hash matching cover only the exact samples that have been previously observed and catalogued. Attackers who repack before each campaign, or who sell malware-as-a-service with unique builds for each customer, can reliably defeat hash-based controls at each step. Detection approaches that analyze what a file is built to do, rather than what its hash value is, provide much more durable coverage because they focus on the parts of the malware that are harder to change: the core logic, communication patterns, and functional capabilities that define the tool regardless of its surface-level packaging.
How do you detect polymorphic malware in an enterprise environment?
Detecting polymorphic malware in an enterprise environment requires moving beyond hash-based detection to methods that analyze file behavior, structure, and characteristics that remain consistent across mutations. YARA rules targeting stable code patterns, structural similarity analysis comparing files against known malware families, and behavioral detection looking for tactics, techniques, and procedures (TTPs) rather than exact signatures are all more effective than hash matching against polymorphic samples.
In enterprise environments, the most practical approach combines multiple detection layers. EDR behavioral detection identifies suspicious runtime activity regardless of the binary’s hash. File structure analysis applied during triage identifies structural similarity to known malware families even when the binary has been repacked. YARA rules targeting code-level patterns catch samples that share functional DNA with known threats. And variant discovery applied retroactively against your file history surfaces samples that were collected before any of these detections existed, ensuring that historical exposure is identified even when real-time detection failed at the time of initial collection.
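The retroactive step can be sketched as a query over stored file records. This is a minimal illustration that assumes each historical file was saved with a feature set, a first-seen date, and a host; the `StoredFile` record, the `retro_hunt` function, the feature strings, and the 0.6 threshold are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import FrozenSet, List


@dataclass(frozen=True)
class StoredFile:
    """One record in a hypothetical file history."""
    file_id: str
    first_seen: date
    host: str
    features: FrozenSet[str]


def jaccard(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Size of the intersection divided by size of the union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def retro_hunt(seed: FrozenSet[str], history: List[StoredFile],
               threshold: float = 0.6) -> List[StoredFile]:
    """Return historical files similar to the seed, oldest first."""
    hits = [f for f in history if jaccard(seed, f.features) >= threshold]
    return sorted(hits, key=lambda f: f.first_seen)


# Features of a sample confirmed malicious today (made-up values).
seed = frozenset({"imp:VirtualAllocEx", "imp:WriteProcessMemory",
                  "str:/gate.php", "str:ExampleMutex", "sec:.text"})

history = [
    StoredFile("file-001", date(2024, 1, 10), "host-a",
               frozenset({"imp:VirtualAllocEx", "imp:WriteProcessMemory",
                          "str:/gate.php", "str:ExampleMutex", "sec:.text", "sec:.rsrc"})),
    StoredFile("file-002", date(2023, 11, 2), "host-b",
               frozenset({"imp:VirtualAllocEx", "imp:WriteProcessMemory",
                          "str:/gate.php", "sec:.text"})),
    StoredFile("file-003", date(2024, 2, 1), "host-c",
               frozenset({"imp:CreateFileW", "sec:.text"})),
]

hits = retro_hunt(seed, history)
for f in hits:
    print(f.file_id, f.first_seen.isoformat(), f.host)
# file-002 2023-11-02 host-b
# file-001 2024-01-10 host-a
```

Sorting by first-seen date puts the earliest exposure at the top, which is usually the first question in scoping an incident.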
What is a malware family tree and how is it used during a threat investigation?
A malware family tree is a visual or data representation of the relationships between samples in a malware family, showing how variants evolved from each other over time, which samples share the most structural similarity, and where the family spread across an environment or across campaigns. It gives investigators a map of the threat rather than a disconnected list of individual hashes.
During a threat investigation, a family tree accelerates both scope determination and attribution. Scope determination becomes more accurate because investigators see not just the confirmed samples but the full set of structurally related files that belong to the same family. Attribution becomes easier because family trees reveal patterns of code reuse, tooling evolution, and infrastructure overlap that connect current activity to historical campaigns. Investigators who can start from one confirmed malicious file and expand to the complete family in seconds, rather than hours or days of manual research, close investigations faster and with greater confidence in the completeness of their findings.
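A family tree can be approximated from similarity scores and first-seen order. The sketch below assumes a simple heuristic: each sample is linked to the most similar sample seen before it. The heuristic, the `build_family_tree` function, and the sample data are illustrative assumptions, not a description of how any specific tool derives lineage.

```python
from typing import Dict, FrozenSet, List, Optional, Tuple

# (name, first-seen order, feature set); all values are made up.
Sample = Tuple[str, int, FrozenSet[str]]


def jaccard(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Size of the intersection divided by size of the union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def build_family_tree(samples: List[Sample]) -> Dict[str, Optional[str]]:
    """Map each sample to its most similar earlier sample (None for the root)."""
    ordered = sorted(samples, key=lambda s: s[1])
    parent: Dict[str, Optional[str]] = {}
    for i, (name, _, feats) in enumerate(ordered):
        earlier = ordered[:i]
        if not earlier:
            parent[name] = None  # earliest sample becomes the root
            continue
        best = max(earlier, key=lambda e: jaccard(feats, e[2]))
        parent[name] = best[0]
    return parent


samples: List[Sample] = [
    ("v1", 1, frozenset({"a", "b", "c"})),
    ("v2", 2, frozenset({"a", "b", "c", "d"})),
    ("v3", 3, frozenset({"a", "b", "c", "d", "e"})),
    ("v1b", 4, frozenset({"a", "b", "c", "x"})),
]

tree = build_family_tree(samples)
for child, par in tree.items():
    print(child, "<-", par)
# v1 <- None
# v2 <- v1
# v3 <- v2
# v1b <- v1
```

In this toy data, v3 attaches to v2 because they share more features, while v1b branches directly off v1, giving two lines of descent from one root.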