Why AI Projects Fail Without AI-Ready Security Data and Context

Written by Dr. Steve Meckl | June 4, 2026 11:45:00 AM Z

Despite the hype, most AI projects fail, and that trend is expected to continue for the foreseeable future. Across every industry, one of the most common reasons is a lack of AI-ready data. Cybersecurity is no different, and producing AI-ready security data is more difficult than teams expect.

What is AI-ready security data? AI-ready security data is security telemetry that's been semantically normalized. Not just standardized field names, but resolved meaning across identities, assets, severities, and outcomes, so an LLM can reason against it without a human in the loop. Most security data doesn't clear that bar, and that's precisely why so many security AI projects stall before they deliver.

We live in a world of syslog and JSON data designed to create alerts and dashboards, which is great for measuring KPIs, but misses the mark when fed to agentic models designed to protect your enterprise. To be at its most effective, AI needs contextually dense and semantically pure data with an unreasonably high signal-to-noise ratio.

Security data needs a surprising amount of work to be made AI-ready.

What Makes Security Data AI-Ready and Why SIEM Models Fall Short

The industry has largely attempted to solve the normalization problem with standardized SIEM data models. Splunk’s Common Information Model, Google’s UDM, Microsoft’s ASIM, and the Open Cybersecurity Schema Framework have all created schemas aimed at normalizing the data models for security data.

While these are all well-designed models, they are insufficient for processing cybersecurity data with AI. They focus on standardizing field names and data formats, but fail to semantically normalize the meaning of field values. This wasn’t a problem pre-AI because human operators resolved semantic ambiguity intuitively. LLMs do not.

Real-world cybersecurity problems require a high degree of semantic awareness, which is why human analysts have been so much more successful at analytic problems to date. In the process of building an agentic pipeline that performs security analysis tasks as well as humans, we have identified the key problems hiding in your data right now that could cause your next AI project to become another statistic.

4 Semantic Problems Hiding in Your Data

Semantic purity is one of the key challenges to overcome with cybersecurity data. As I mentioned above, SIEM data models tried to solve this problem and got some of it right. Regardless of your technology of choice, you write queries against one consistent set of field names. You don’t need to memorize what two different security products call a “source IP.”

Good data models even go a step further and define the field data types: integer, string, floating point, binary blob, and so on. But that’s as far as they go. They do not assign meaning to the data, especially in multi-source alerts containing data from more than one vendor.

Here are the four most common semantic problems we have identified by processing millions of alerts across every industry vertical via our Managed Detection & Response (MDR) service.

1. Identity Resolution

Products report on identities in a variety of ways:

Your email system will report (spoiler alert) an email address and, if it’s a modern one, the name of the user.
An HR system might just have the person’s name.
An IDP solution will have a username and an associated email or directory service account name.
Windows logs might just have the SID of the user.

These are all legitimate ways of identifying a user from the myopic view of a single product. They are wholly insufficient for a SIEM trying to correlate alerts across technologies, and completely unusable by AI.

The simple problem of identifying that jsmith@acme.com, DOMAIN\john.smith, John Smith, and the user of the MacBook Pro with IP address 10.50.22.43 are the same person is not so simple.

An LLM needs this resolved identity when reasoning against security alerts. The user’s name, location, job function, access, and assets determine whether a “user added to an admin group” alert represents business as usual or a sophisticated cyber attacker.
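
As a rough illustration of the idea, the sketch below maps every known alias of a person onto one canonical record. The class names, the `emp-0001` identifier, and the job function are hypothetical, and a real system would sit on a directory or graph store rather than an in-memory dictionary.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class Identity:
    """One canonical person record (all field names are illustrative)."""
    canonical_id: str
    display_name: str
    job_function: str
    aliases: Set[str] = field(default_factory=set)


class IdentityIndex:
    """Maps any known alias (email, DOMAIN\\user, name, IP) to one Identity."""

    def __init__(self) -> None:
        self._by_alias: Dict[str, Identity] = {}

    @staticmethod
    def _key(alias: str) -> str:
        # Case and surrounding whitespace differ between products.
        return alias.strip().lower()

    def register(self, identity: Identity) -> None:
        for alias in identity.aliases:
            self._by_alias[self._key(alias)] = identity

    def resolve(self, alias: str) -> Optional[Identity]:
        return self._by_alias.get(self._key(alias))


index = IdentityIndex()
index.register(
    Identity(
        canonical_id="emp-0001",          # hypothetical identifier
        display_name="John Smith",
        job_function="Finance Analyst",   # hypothetical job function
        aliases={
            "jsmith@acme.com",
            r"DOMAIN\john.smith",
            "John Smith",
            # IP-to-user mappings are time-bound in practice;
            # treated as static here for brevity.
            "10.50.22.43",
        },
    )
)

resolved = index.resolve("JSMITH@acme.com")
```

Each alias in the example above resolves to the same record, so the job function and other attributes can be placed in the prompt no matter which product produced the alert.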

2. Hostname/IP Correlation

A similar issue arises with asset information. Workstation names, FQDNs, IPv4 and IPv6 addresses, and URLs can all represent the same physical asset. It gets even more complicated with multi-homed servers and DHCP leases. Tying these various ways of naming a compute asset together is a surprisingly difficult problem.

The data often exists in enterprises, but is not easy to correlate. DHCP tables are transient, and changes to which machine has an IP address are rarely logged anywhere due to their high volume. DNS records and logs have the same problem. Resolution of true hostnames behind a NAT router is so complex that an entire book could be written on the topic.

These are not new problems in IT, but they have new consequences for AI analysis of security data.

An LLM must be able to determine the identity of a compute asset regardless of how it was identified in a particular log entry. If it can’t, it will fail to correlate alerts together into cases, missing entire attacker kill chains because every security alert will look like an unrelated event.

This is a problem that human analysts are also bad at. AI can do better, but only if systems are in place to tie these disparate data sources together and make them available, via the context window, to LLMs when prompted for analysis.
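
One way to picture the DHCP part of the problem is a lookup that takes the event timestamp into account. This is a minimal sketch that assumes lease history has already been collected somewhere. The `Lease` type, the hostnames, and the lease times are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass(frozen=True)
class Lease:
    """A historical DHCP lease: which host held an IP, and when."""
    ip: str
    hostname: str
    start: datetime
    end: datetime


def host_for_ip(leases: List[Lease], ip: str, at: datetime) -> Optional[str]:
    """Return the hostname that held `ip` at time `at`, if any lease covers it."""
    for lease in leases:
        if lease.ip == ip and lease.start <= at < lease.end:
            return lease.hostname
    return None


# Hypothetical lease history: the same address moves between two machines.
leases = [
    Lease("10.50.22.43", "mbp-jsmith",
          datetime(2026, 6, 1, 8, 0), datetime(2026, 6, 1, 20, 0)),
    Lease("10.50.22.43", "win-kiosk-07",
          datetime(2026, 6, 1, 20, 0), datetime(2026, 6, 2, 8, 0)),
]

morning_host = host_for_ip(leases, "10.50.22.43", datetime(2026, 6, 1, 9, 30))
night_host = host_for_ip(leases, "10.50.22.43", datetime(2026, 6, 1, 23, 15))
```

The same IP address resolves to two different machines depending on when the alert fired, which is why a static IP-to-host table is not enough for correlation.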

3. Severity Semantics

This is a problem that shouldn’t exist, but unfortunately does. In my decade-plus of running MSSP and MDR services, I have encountered the following severity models used in production security technologies:

Emergency, Critical, Warning, Informational
High, Medium, Low
1-5, where 5 is the most severe
1-5, where 1 is the most severe
Arbitrary numerical scales (e.g., 1-7, 0-25)
0-100 confidence or severity score
No severity or criticality scoring at all

I’ve seen engineers build entire software systems to resolve and rationalize these severity models with Jira bug queues so big that they would make your head spin. They eventually even worked…until some enterprising Product Manager invented a new model.

Without explicit context provided for each and every data source, AI has no hope of resolving these naturally. When building our agentic SOC pipeline, it was one of the first problems we identified and solved.
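
To make "explicit context for each data source" concrete, here is a minimal sketch that maps several of the severity models listed above onto a single 0.0 to 1.0 scale. The source names (`edr_a`, `fw_b`, and so on) and the numeric weights assigned to labels are assumptions for illustration, not values from any product.

```python
from typing import Callable, Dict, Optional

# Assumed weights for label-based models; tune these per environment.
LABEL_WEIGHTS: Dict[str, float] = {
    "emergency": 1.0, "critical": 0.9, "high": 0.8,
    "warning": 0.5, "medium": 0.5,
    "low": 0.2, "informational": 0.1,
}


def from_label(raw: object) -> float:
    return LABEL_WEIGHTS[str(raw).strip().lower()]


def from_scale(lo: float, hi: float,
               higher_is_worse: bool = True) -> Callable[[object], float]:
    """Build a converter for a numeric scale running from lo to hi."""
    def convert(raw: object) -> float:
        position = (float(raw) - lo) / (hi - lo)
        return position if higher_is_worse else 1.0 - position
    return convert


# One entry per data source (all source names are hypothetical).
SOURCE_SEVERITY: Dict[str, Callable[[object], float]] = {
    "edr_a": from_label,                               # High / Medium / Low
    "fw_b": from_scale(1, 5, higher_is_worse=True),    # 5 is most severe
    "ids_c": from_scale(1, 5, higher_is_worse=False),  # 1 is most severe
    "casb_d": from_scale(0, 100),                      # 0-100 score
}


def normalize_severity(source: str, raw: object) -> Optional[float]:
    """Return severity in [0.0, 1.0], or None when the source has no model."""
    if raw is None or source not in SOURCE_SEVERITY:
        return None
    return SOURCE_SEVERITY[source](raw)
```

With this mapping, a `5` from `fw_b` and a `1` from `ids_c` both normalize to the top of the scale, and a source with no severity model returns `None` instead of a guess.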

4. Contextual Field Interpretation

A similar problem occurs in other contextual fields in security data. For example, terms like “blocked,” “denied,” “prevented,” and “quarantined” show up in JSON field names for EDR products. If you’ve read this far, it should come as no surprise that they don’t mean the same things across products. It should also come as no surprise that they can be incorrect.

For example, at least one EDR product logs an “isolated” event when it attempts to isolate a machine from the network. If the underlying API call fails for some reason, the failure is never reported.

Human analysts, who have specialized skills with these products, learn the ins and outs of these fields over time and are rarely surprised. LLMs, on the other hand, never learn this information naturally and must be told explicitly what the field names mean for each data source, in every prompt.
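
A simple way to tell the model what the fields mean is a per-source glossary that gets attached to each alert before prompting. The sketch below assumes a hypothetical source called `edr_a`. The glossary wording and the function name are illustrative only.

```python
import json
from typing import Dict

# Per-source notes on what a field really means (contents are illustrative).
FIELD_NOTES: Dict[str, Dict[str, str]] = {
    "edr_a": {
        "isolated": ("Isolation was requested. The product does not log "
                     "whether the request succeeded."),
        "quarantined": "The file was moved to quarantine on the endpoint.",
    },
}


def build_alert_context(source: str, alert: Dict[str, object]) -> str:
    """Render an alert plus only the field notes that apply to it."""
    notes = FIELD_NOTES.get(source, {})
    relevant = {name: text for name, text in notes.items() if name in alert}

    lines = [f"Alert from {source}:", json.dumps(alert, sort_keys=True)]
    if relevant:
        lines.append("Field semantics for this source:")
        for name, text in sorted(relevant.items()):
            lines.append(f"- {name}: {text}")
    return "\n".join(lines)


context = build_alert_context(
    "edr_a", {"host": "mbp-jsmith", "isolated": True}
)
```

Only notes for fields present in the alert are included, which keeps the added context small.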

How "Panic Words" Trigger False Positives Without Context

Even when the data are semantically clear, you have another problem. LLMs have a tendency to treat some terms with higher importance than others. In cybersecurity log data, every word sounds scary, and LLMs treat them as such.

Here are a couple of examples:

“securityIncident” as a JSON field name significantly increases the likelihood of “true positive” determination, even with other available context such as “confidence: low.”
“incidentType: ransomware” leads to aggressive response action recommendations, even if the same alert reports the attempt was “blocked” or that the malware was “quarantined.”
“severity: high” leads to over-classification as true positive, even if the reporting technology is known to mis-classify too many alerts as High severity.

During early testing of AI in our SOC pipeline, this problem caused the agent to overreact on every alert. The team dedicated an entire release cycle to “Operation Chill Pill” to get the AI to take a measured response to its analysis. The solutions were surprisingly simple.

In some cases, we could remove the panic words altogether as they didn’t provide meaningful context to the analysis. In other situations, we simply treated the AI as an executive, expressing the panic words in terms of business risk instead of using them in their raw form. Contextualizing the panic words resulted in measured analysis from our system of agents.
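
The two tactics above could be sketched roughly as follows. This is not our production code, and the field lists and risk phrasing are assumptions made up for illustration.

```python
from typing import Dict, List, Tuple

# Fields assumed to add alarm but no analytic value (illustrative list).
DROP_FIELDS = {"securityIncident"}

# Raw (field, value) pairs restated as business risk (illustrative wording).
RISK_PHRASING: Dict[Tuple[str, str], str] = {
    ("incidentType", "ransomware"):
        "File-encrypting malware was detected; data availability is at risk "
        "only if the attempt was not stopped.",
    ("severity", "high"):
        "The reporting product rated this event at the top of its own scale.",
}


def calm_alert(alert: Dict[str, object]) -> Dict[str, object]:
    """Drop panic fields and restate panic values before prompting."""
    calmed: Dict[str, object] = {}
    risk_notes: List[str] = []
    for key, value in alert.items():
        if key in DROP_FIELDS:
            continue
        phrasing = RISK_PHRASING.get((key, str(value).lower()))
        if phrasing is not None:
            risk_notes.append(phrasing)
        else:
            calmed[key] = value
    if risk_notes:
        calmed["riskNotes"] = risk_notes
    return calmed


calmed = calm_alert({
    "securityIncident": True,
    "incidentType": "ransomware",
    "action": "blocked",
    "confidence": "low",
})
```

Fields such as `action` and `confidence` pass through untouched, so the model still sees that the attempt was blocked.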

Human analysts are much better at avoiding these traps because they carry the context of knowledge and experience in their brains. For agentic AI systems, this context needs to be included with every prompt to avoid panic responses. If it is not present, users of the AI system can fall victim to the “Boy Who Cried Wolf” problem, potentially leading them to ignore real attacks when they occur.

What It Takes to Make Security Data AI-Ready

Unfortunately, solving the semantic problems and panic word issues described above isn't a one-time configuration. Making security data AI-ready is an emerging engineering discipline spanning multiple skill sets.

Infrastructure

Asset databases, identity resolution systems, and product-specific knowledge bases are just the start. This gives you the data. You will need software systems developed on top of those databases to tie things together semantically.

Cybersecurity data fabric and data mesh architectures address this problem by using a graph database to correlate identity and asset data. Fortunately, production-ready commercial systems already exist to handle this, including our entity fabric, Meridian.

Skill Sets

New skill sets are also required. Context Engineering is probably the most critical skill for building and maintaining successful AI systems. Context is a Goldilocks problem. Too much context fills the LLM’s context window with noise, leading to more hallucinations. Too little context, and the model can’t determine if that 3 a.m. security alert happened during business hours or not. You need just the right amount of context for the problem at hand.

If you’re using an LLM with a small context window, you will almost certainly need to re-prompt the model repeatedly, introducing context in smaller chunks. How much context is provided, and in what order, are key pieces of the puzzle. While the discipline is still in its infancy, Context Engineer will be one of the most critical engineering roles of the future.
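
As a minimal sketch of the chunking idea, the function below orders context items by priority and splits them into batches that each fit a token budget. The four-characters-per-token estimate is a rough assumption; a real pipeline would use the tokenizer of the model in question.

```python
from typing import List, Tuple


def estimate_tokens(text: str) -> int:
    # Rough assumption: about four characters per token.
    return max(1, len(text) // 4)


def chunk_context(items: List[Tuple[int, str]], budget: int) -> List[List[str]]:
    """Split (priority, text) items into ordered batches within `budget` tokens.

    Lower priority numbers are sent first. An item larger than the budget
    still gets a batch of its own rather than being dropped.
    """
    ordered = sorted(items, key=lambda item: item[0])
    batches: List[List[str]] = []
    current: List[str] = []
    used = 0
    for _, text in ordered:
        cost = estimate_tokens(text)
        if current and used + cost > budget:
            batches.append(current)
            current, used = [], 0
        current.append(text)
        used += cost
    if current:
        batches.append(current)
    return batches


batches = chunk_context(
    [(2, "b" * 40), (1, "a" * 40), (3, "c" * 40)],
    budget=20,
)
```

Here the two highest-priority items share the first prompt and the third is deferred to a follow-up prompt.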

Updated Data Processing Systems

AI-ready data are also product-specific, environment-specific, and ongoing. Similar to the parser problem with SIEMs, your data processing systems will need to be constantly updated as product log formats and content change. The environmental context will also need to be updated as your enterprise changes. It is an ongoing process that many organizations fail to budget for when planning an AI project.

Budget

Lastly, getting context engineering wrong can be costly. As AI models become more powerful, they will become more expensive. Every token you spend must provide value. The context contained within a prompt must be dense, or you will waste money on tokens just as a poorly optimized cloud application wastes money on compute.
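
The arithmetic is simple enough to sketch. Every number below (alert volume, tokens per alert, and price per million tokens) is a hypothetical placeholder, not a measurement or a vendor price.

```python
def daily_prompt_cost(alerts_per_day: int,
                      tokens_per_alert: int,
                      usd_per_million_tokens: float) -> float:
    """Input-token spend per day for a given prompt size."""
    return alerts_per_day * tokens_per_alert / 1_000_000 * usd_per_million_tokens


# Hypothetical placeholders for illustration only.
ALERTS_PER_DAY = 10_000
USD_PER_MILLION_TOKENS = 3.0

padded = daily_prompt_cost(ALERTS_PER_DAY, 6_000, USD_PER_MILLION_TOKENS)
dense = daily_prompt_cost(ALERTS_PER_DAY, 2_000, USD_PER_MILLION_TOKENS)
daily_savings = padded - dense
```

Under these assumed inputs, trimming each prompt from 6,000 to 2,000 tokens cuts daily input spend to a third, and the saving scales linearly with alert volume.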

Data Preparation Is The Foundation For Success

None of this shows up in a product demo. Demos use clean, curated data designed to showcase AI capabilities under ideal conditions. Production systems face the reality of messy, inconsistent data from dozens of sources, each with its own quirks and edge cases.

The data preparation problem is the least glamorous part of building an AI-powered SOC and the most consequential. Get it wrong, and your analysts will learn to ignore the AI, which is the one failure mode you can't fix with a software update. Get it right, and you've built the foundation for everything that follows.

Speaking of which: once you have clean, semantically rich data flowing through your pipeline, how do you know the AI's conclusions are actually correct? That's the quality assurance problem, and it's where we're headed next.
