The hidden treasure in the latest MITRE ATT&CK® Enterprise Evaluation lies in the summaries and data deep-dives, not in the headlines of vendors under pressure to “sell” detection and response. As a result, even astute readers may miss the forest for the trees, treating individual performances and yet-to-be-unpacked scores as the sole measure of a product’s prowess. Yet the evaluations and MITRE’s summaries were made precisely to be unpacked – that is where their value lies.
This year, MITRE pitted vendors against two adversary emulations. The first, Demeter (ostensibly Scattered Spider), emulated a high-tempo adversary known for its social engineering techniques. The second, Hermes (likely Mustang Panda – a usual suspect in ESET APT Reports), emulated a state-sponsored cyberespionage group with rapidly evolving capabilities. These two scenarios tested participating vendors’ expertise and made for a great showcase of how each evaluated product supports analyst tasks.
ESET did well this year, but the evaluations are concerned with furthering the understanding of EDR and XDR users, not handing out medals. After five years invested in the evaluations, we know that the output is a massive volume of data and insight, challenging to navigate even for seasoned insiders. We have some suggestions on how to cut through the fluff and get to the truly valuable information you can use if you’re running or buying an EDR or XDR.
Key points of this article:
- ESET performed well in both tested emulation scenarios, and our 100% protection score highlights the relevance of our prevention-first approach. In the protection test, we tied for first place among the nine participating vendors by rapidly blocking attacks at their earliest steps.
- Prioritizing low noise and a right-sized alert volume for the critical analyst pivot, ESET PROTECT delivered the fastest detection times in both attack emulations, with a 66.67% detection score.
- The value of MITRE’s evaluation lies in its results serving as unbiased guidance for security analysts, challenging their understanding of a vendor and its detection and response platform.
- For vendors, the evaluations professionally verify their solutions against contemporary threats in emulated attacks.
- The collective result is a better understanding of the toolsets analysts use to interpret the threat landscape, particularly TTPs.
- Above all, the evaluations show how different vendor approaches – EDR/XDR, sandboxing, network visibility and the like – can help an organization improve its cyber resilience.
What is MITRE ATT&CK® Evaluations Enterprise 2025?
There’s no shortage of information on the Tactics, Techniques and Procedures (TTPs) of threat actors. The role of analysts is to combine their own knowledge and experience with this insight and use the tools at their disposal. Those tools include a central intelligence platform, an EDR/XDR, SIEM/SOAR and the like.
But there is more than one way to go about protection, and each endpoint security solution does its job differently. Each provider has its own way of presenting telemetry and detections, so it takes analysts a while to find their match, get engaged, and then marry their knowledge of their unique environment with the advantages of that toolset.
This is the problem the MITRE ATT&CK Evaluations set out to solve: mapping, with as much transparency as possible, how EDR/XDR tools handle malicious scenarios, and representing the capabilities and limitations of each participating vendor. There is no “right” way to analyze a threat – it’s all a matter of a tool’s viability for an analyst.
2025 is a year of change for MITRE
This year’s format built upon the significant changes introduced in 2024, again including a separate protection test.
This is relevant because while any EDR tool can make detections, it’s the content of those detections and alerts that is pertinent. Detection visibility was tested thoroughly by allowing adversaries to execute behaviors without interruption, challenging the participants to collate their detections with the MITRE ATT&CK knowledge base – an approach meant to reflect the contemporary threat landscape and the baseline functions all EDR/XDRs deliver.
The open question is whether the test falls short of capturing the unique real-world telemetry employed by individual solutions.
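To picture the collation step described above, here is a minimal, hypothetical sketch in Python: raw endpoint telemetry events tagged with ATT&CK technique IDs and grouped so coverage can be read at a glance. The event fields, hosts and mappings are illustrative assumptions, not any vendor’s actual schema.

```python
# Hypothetical sketch: collating raw telemetry with ATT&CK technique IDs.
# Field names, hosts and mappings are illustrative, not a real product schema.
from collections import defaultdict

# Simplified telemetry events, as an EDR might record them
telemetry = [
    {"host": "ws01", "event": "encoded PowerShell launched", "technique": "T1059.001"},
    {"host": "ws01", "event": "Run key added to registry", "technique": "T1547.001"},
    {"host": "srv02", "event": "LSASS memory read", "technique": "T1003.001"},
]

# Group events by ATT&CK technique so an analyst sees coverage per technique
coverage = defaultdict(list)
for evt in telemetry:
    coverage[evt["technique"]].append(f'{evt["host"]}: {evt["event"]}')

for technique, events in sorted(coverage.items()):
    print(technique, "->", events)
```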
The MITRE Evaluation plays a major role within the cybersecurity market, serving as a quality gate and industry standard. Many companies and contractors use the evaluations to scrutinize vendors, so unwilling participants can lose out on major potential deals for lack of this perceived “certification”.
The protection evaluation challenged the effectiveness of blocking and – much akin to ESET’s prevention-first approach – of stopping attacks before they could cause major harm. For this, all protective mechanisms were enabled, calling on the participants to demonstrate detection and real-time prevention of advanced malicious attacks before they reached critical attack points.
The detection challenge also differed in its inclusion of a configuration change, where participants could tinker with their product after the first run. SOCs continuously reconfigure their environments to adapt to the ever-changing threat landscape, and testing whether vendors make configuration changes easy is meant to reflect that.
Interpreting ESET’s performance: Guidance, not a race
Back to the unpacking mentioned at the outset. On the one hand, ESET can say that ESET PROTECT’s protection score was 100%, the best possible result, or that its detection rate offered substantial visibility. But does that math give a working analyst the insight they need? Conversely, another vendor could claim an even better detection score… except that doesn’t tell the whole story either.
This is one reason why ESET focuses so much on prevention (itself heavily dependent on detections), and why it’s peculiar that, since protection measurement was introduced in 2024, only two-thirds of MITRE’s participants generally opt in to this part of the evaluation. By automatically blocking attacks like those employed in the protection scenario, a product frees security teams to focus on strategic tasks that further strengthen cyber resilience.
On the other hand, some analysts are locked onto detection performance. You can have a 100% detection score (ESET’s is 66.67%), but is all that telemetry relevant or equal?
No. Within the tests, certain TTPs carry a higher severity and thus have a heavier weight on detection scores. Over five years of participation in the evaluations, our ESET PROTECT engineers have resisted pursuing total coverage of the ATT&CK knowledge base. The reason? Achieving 100% labeling of an adversary’s modus operandi against the ATT&CK knowledge base does not improve defenses or automatically assist security analysts in their daily work.
What matters more is a low, but still substantive, volume. Detection and response requires only enough coverage of highly prevalent or severe techniques (or sub-steps) to do the job; anything more risks overwhelming analysts. Missing detections for low-prevalence or low-severity techniques does not necessarily translate to lower protection. Quite the opposite: it can mean streamlined work and faster remediation, because the major steps needed to identify the attack are immediately highlighted, allowing a timely and adequate response – in some cases, even automatic blocking of the detected threat.
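As a back-of-the-envelope illustration of why a raw detection rate can understate real coverage, consider a hypothetical scenario where every missed sub-step happens to be low severity. The sub-step counts and weights below are invented for the sketch; MITRE’s actual scoring is more nuanced.

```python
# Hypothetical illustration: raw detection rate vs. severity-weighted coverage.
# Sub-step counts, severities and weights are invented; MITRE's scoring differs.

substeps = (
    [{"severity": "high", "detected": True}] * 6
    + [{"severity": "medium", "detected": True}] * 2
    + [{"severity": "low", "detected": False}] * 4
)

weights = {"high": 3.0, "medium": 2.0, "low": 1.0}

# Raw rate treats every sub-step equally
raw_rate = sum(s["detected"] for s in substeps) / len(substeps)

# Weighted rate credits coverage of the techniques that matter most
weighted_hit = sum(weights[s["severity"]] for s in substeps if s["detected"])
weighted_total = sum(weights[s["severity"]] for s in substeps)
weighted_rate = weighted_hit / weighted_total

print(f"Raw detection rate:     {raw_rate:.2%}")       # 66.67%
print(f"Severity-weighted rate: {weighted_rate:.2%}")  # 84.62%
```

In this made-up example, a 66.67% raw score still covers every high- and medium-severity sub-step, which is exactly the distinction a headline percentage hides.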
During the 2025 testing, ESET PROTECT automatically populated its detected incidents with relevant detections, with alerts capturing the most substantial adversary activity within the imagined network. The result was well-correlated, low-volume performance per scenario (just one incident in the Hermes scenario!) – a boost for any analyst looking to zero in only on what counts and cut through the noise.
Yes, MITRE tests for these. One of the reasons vendors take part is that they can use the gathered know-how to sharpen their detection engines. But a cybersecurity analyst under pressure to stop an ongoing attack won’t focus on minutiae. Instead, they need clear, unequivocal alerts – often the most severe and telling detections, organized (ideally automatically) by their XDR – to inform their next actions.
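A minimal sketch of that idea, assuming a made-up alert format: related detections are rolled up into a single incident, and the most severe one is surfaced first, so the analyst’s entry point is the signal rather than the noise.

```python
# Minimal sketch: rolling correlated detections into one incident and
# surfacing the most severe alert first. The alert format is a made-up example.
from dataclasses import dataclass

@dataclass
class Alert:
    host: str
    technique: str
    severity: int  # higher = more severe
    summary: str

alerts = [
    Alert("ws01", "T1566.002", 4, "Spearphishing link clicked"),
    Alert("ws01", "T1059.001", 7, "Encoded PowerShell launched"),
    Alert("ws01", "T1003.001", 9, "Credential dump attempt on LSASS"),
]

# Correlate by host (a stand-in for richer correlation logic) and order
# the incident's alerts so the most telling detection leads.
incident = sorted(alerts, key=lambda a: a.severity, reverse=True)

print(f"Incident on {incident[0].host} – lead alert: {incident[0].summary}")
for a in incident:
    print(f"  [{a.severity}] {a.technique}: {a.summary}")
```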
Winning the battle, losing the war
There will be no shortage of participants claiming to have ‘won’ the latest MITRE Evaluation.
“MITRE has never been about winners and losers,” says Juraj Malcho, Chief Technology Officer at ESET. “Marketing spiel might claim a win, but you may in fact have lost in the real world, the one faced by actual security analysts. They don’t need to log every event happening on their systems. Whether matched against the ATT&CK knowledge base or not, they already have enough noise to cope with. What matters are accurate detection correlations and flagging suspicious activities for further investigation.”
Another consideration when evaluating any vendor is a good result in detection logic creation, i.e., (re)configuration. ESET PROTECT’s flexibility in this respect helped cover relevant blind spots after the initial run of each scenario: reconfiguring the solution’s detection logic took it beyond the initial, vendor-assumed artificial baseline. In the real world, analysts can likewise tailor ESET PROTECT’s logic to their environment’s needs, shaping it to tackle sector-specific threats.
Use MITRE’s tools to make your own evaluation
Luckily, MITRE’s website offers a comprehensive view into each scenario, its steps (like user execution) and sub-steps (such as a malicious link). This is supported by screenshots of the vendors’ solution interfaces showing how they present information to an analyst – for example, how they correlate detections to the ATT&CK knowledge base – plus various modifiers and a means to directly compare vendors in a single view.
ESET recommends readers head to the MITRE ATT&CK® Evaluations Enterprise 2025 page and interpret every participating vendor’s testing as guidance – not only to test their assumptions, but also to confirm the value provided by each.
Be proactive, not reactive
Question: Should ESET claim that this year’s MITRE results were so good that we eclipsed the other participants? No. That’s not the aim of the Evaluations.
MITRE ATT&CK Enterprise Evaluations isn’t just another test; it’s a series of tests designed to uncover granular insights that guide the understanding of engineers, product R&D, CISOs and others. MITRE spurs vendors to refine their tools, much like a writer seeking fresh perspectives on a familiar topic, or a monitoring specialist integrating another feed from a different provider into their intelligence platform. The evaluation is for the cybersecurity analysts who use EDR, XDR and the like to protect their organizations. It’s particularly helpful for those seeking to shape or reshape the toolset at their disposal for detection and response activities.
If you’re minded to challenge your understanding of cybersecurity vendors and MITRE’s test, feel free to check the results page. But remember: it’s about insight, not competition.