
Edge AI for Defense

Defense AI
January 10, 2025
6 min read

Why the tactical edge breaks every assumption about AI deployment, and what that teaches us about building systems that actually work.


Grayhaven

Author

There's a pattern I keep seeing with defense AI projects. Someone builds an impressive demo using cloud GPUs. Everyone loves it. Then they try to deploy it to a ship or forward operating base and it doesn't work at all.

The demo assumed three things: reliable network access, unlimited compute, and tolerance for latency. None of them hold where the system actually needs to operate.

The Demo Problem

A contractor shows up with something genuinely impressive. Real-time satellite imagery analysis. Instant insights pulled from intelligence reports. The model is huge, the inference is fast, the results are accurate.

Then someone asks: "Can we deploy this to the field?"

The answer is always: "Well, it needs continuous cloud connectivity and a GPU cluster."

Which means no.

What's interesting is how long it takes people to realize this means no. I've watched organizations spend months trying to make cloud-dependent systems work in environments with no reliable network access. It never works. But the demos were so impressive that people keep trying.

What Makes The Edge Different

The tactical edge has three hard constraints.

No network. Or intermittent access you can't depend on. Systems that assume continuous connectivity fail immediately.

No GPU. Often just standard CPUs with modest specs. Whatever you're running needs to work on hardware you didn't choose and can't upgrade.

Latency matters. Not "it would be nice if this was faster." Decisions happen in milliseconds. If your model can't keep up, it's useless regardless of accuracy.

These aren't artificial limitations. They're operational reality.
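To make the latency constraint concrete: what matters is the worst-case tail, not the average. Here's a minimal sketch of the kind of check that decides whether a model meets a millisecond budget — `toy_infer` is a hypothetical stand-in for a real model's forward pass:

```python
import time

def p99_latency_ms(infer, inputs, warmup=10):
    """Measure per-call latency and return the 99th percentile in ms.
    Tail latency, not the average, decides whether a model meets a
    real-time budget."""
    for x in inputs[:warmup]:          # warm caches before measuring
        infer(x)
    samples = []
    for x in inputs:
        start = time.perf_counter()
        infer(x)
        samples.append((time.perf_counter() - start) * 1000.0)
    samples.sort()
    return samples[int(len(samples) * 0.99) - 1]

# Hypothetical stand-in for a model's forward pass.
def toy_infer(x):
    return sum(i * i for i in range(200))

budget_met = p99_latency_ms(toy_infer, list(range(500))) < 50.0
```

Run a check like this on the actual target CPU, not your development machine; the number that matters is the one the fielded hardware produces.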

Why Smart People Get This Wrong

Technically sophisticated people consistently underestimate these constraints.

I think it's because modern AI development happens entirely in environments where these constraints don't exist. You develop on a machine with a GPU. You test in the cloud. Everything has network access. Everything runs fast.

So you build for these conditions. Then you try to squeeze it down to work at the edge and discover that's not how it works. You can't just shrink a cloud-based system. You need to design for edge constraints from the beginning.

Most organizations approach this backwards: build for ideal conditions, then try to adapt for real ones. This fails.

What Actually Works

I've seen about fifty edge AI deployments. The ones that work share a pattern.

They start by accepting the constraints are real. No cloud connectivity means offline-first operation, not "we'll sync when we can." No GPU means CPU-only from day one, not "we'll optimize later." Real-time means real-time, not "usually pretty fast."

This changes everything about how you build.

The question isn't "What's the best model we can build?" It's "What's the smallest model that solves the problem well enough to deploy?" That's a different question. It produces different answers.

The Size/Accuracy Trade-off

Here's something that surprises people: a model that's 85% accurate and runs in 50ms is more valuable than one that's 95% accurate but won't fit on the hardware.

The second one doesn't deploy. The first one does.

I constantly see projects optimizing for accuracy at the expense of everything else. They build models that are impressively accurate and completely undeployable.

What you need is minimum accuracy that solves the problem, in the smallest possible package. Then you can deploy it.
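One way to operationalize that question is to enumerate the candidates and pick the smallest model that clears the accuracy floor and fits the budget. The model names and numbers below are hypothetical; the sketch just shows the selection logic:

```python
# Hypothetical candidates: (name, size_mb, accuracy, p99_latency_ms).
CANDIDATES = [
    ("large",  10_000, 0.95, 400),
    ("medium",  1_200, 0.91, 120),
    ("small",     100, 0.85,  45),
]

def smallest_deployable(candidates, min_accuracy, max_latency_ms, max_size_mb):
    """Pick the smallest model that clears the accuracy floor AND fits
    the hardware budget. Accuracy above the floor buys nothing if the
    model cannot deploy."""
    fits = [c for c in candidates
            if c[2] >= min_accuracy
            and c[3] <= max_latency_ms
            and c[1] <= max_size_mb]
    return min(fits, key=lambda c: c[1]) if fits else None

choice = smallest_deployable(CANDIDATES, min_accuracy=0.80,
                             max_latency_ms=50, max_size_mb=500)
# With these numbers, only "small" qualifies: 85% accurate, 45 ms, 100 MB.
```

Note that the 95%-accurate model never even enters the comparison — it fails the hardware constraints, so its accuracy is irrelevant.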

Hardware You Don't Choose

You don't get to specify the hardware. The system has what it has. Maybe x86, maybe ARM. Maybe 16GB of RAM, maybe 2GB. Maybe 50GB of storage available, maybe 10GB.

You need to run on all of it.

This means building for the worst case. You can't say "well, most systems will have enough RAM." If some systems don't, your deployment fails on those systems.

The right approach is to target the lowest common denominator, then opportunistically use better hardware when it's available. Not the other way around.
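That tiering can be as simple as a lookup keyed on what the host actually has. A sketch, with hypothetical model file names and RAM thresholds — detecting available RAM portably is its own platform-specific problem, so it's a parameter here:

```python
# Hypothetical model variants keyed by the minimum RAM they need (GB).
# The 2 GB variant is the lowest common denominator and the default.
MODEL_TIERS = [
    (16, "detector-large.onnx"),
    (8,  "detector-medium.onnx"),
    (2,  "detector-small.onnx"),   # worst-case baseline: must always fit
]

def pick_model(available_ram_gb):
    """Default to the worst-case model, then opportunistically upgrade
    when the host turns out to have more headroom."""
    for min_ram, path in MODEL_TIERS:          # largest tier first
        if available_ram_gb >= min_ram:
            return path
    return MODEL_TIERS[-1][1]                  # never fail: ship the baseline

selected = pick_model(4)
```

The design choice is in the fallthrough: an unknown or under-spec host still gets the baseline model rather than a crash, which is the "worst case first" rule in code form.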

The Update Problem

Here's something that doesn't get enough attention: updating models in disconnected environments is really hard.

You can't just push updates over the network. There is no network. You need a process for getting new model versions onto devices that you can't easily access.

This means updates happen slowly. Maybe once a quarter, maybe less. Each update requires physical access or scheduled connectivity windows.

So you need to build systems that work reliably with infrequent updates. You can't count on fixing bugs quickly or retraining often.

Organizations that skip planning for this discover their deployed model has a bug and they can't fix it for months.
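A minimal safeguard is to treat every update package as untrusted until verified, and to keep the old model around for rollback. Here's a sketch of that flow, assuming updates arrive as files on physical media; the paths and the SHA-256 handoff are illustrative:

```python
import hashlib
import os
import shutil

def apply_model_update(package_path, expected_sha256, live_path):
    """Verify an update delivered offline, then swap it in atomically,
    keeping the old model for rollback. With quarterly update windows,
    a corrupted package must never replace a working model."""
    h = hashlib.sha256()
    with open(package_path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    if h.hexdigest() != expected_sha256:
        raise ValueError("checksum mismatch: refusing to install update")
    backup = live_path + ".prev"               # keep a rollback copy
    if os.path.exists(live_path):
        shutil.copy2(live_path, backup)
    staged = live_path + ".staged"
    shutil.copy2(package_path, staged)
    os.replace(staged, live_path)              # atomic swap
    return backup
```

The staged-copy-then-`os.replace` step matters: if power drops mid-update, the device boots with either the old model or the new one, never a half-written file.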

What We Learned

After deploying to fifty-plus edge locations, here's what works.

Small, focused models beat large general ones. A 100MB model optimized for your specific use case outperforms a 10GB general model that won't fit.

Hybrid systems provide reliability. Rules plus ML gives you fallbacks. Pure ML systems fail unpredictably on edge cases.

Offline operation isn't negotiable. Everything must work without network access. Not "mostly works" or "works when connected." Everything.

On-device training doesn't work. I've seen organizations try this. It creates more problems than it solves. Update models offline, validate them, then deploy.
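The hybrid pattern above can be sketched in a few lines: rules give a deterministic answer, and ML only overrides them when it's both present and confident. Everything here is illustrative — `ml_model` is a hypothetical callable returning a label and a confidence score, and the rule threshold is made up:

```python
def classify(signal, ml_model=None, threshold=0.8):
    """Hybrid pipeline: deterministic rules first, ML only when it is
    available and confident. If the model is missing or unsure, the rule
    answer stands, so failures degrade predictably instead of silently."""
    # Rule layer: conservative, auditable default.
    rule_label = "alert" if signal["power_db"] > -40 else "ignore"
    if ml_model is None:
        return rule_label, "rules"
    label, confidence = ml_model(signal)
    if confidence >= threshold:
        return label, "ml"
    return rule_label, "rules-fallback"        # low confidence: fall back

# Usage: with no model loaded at all, you still get a defensible answer.
print(classify({"power_db": -35}))             # → ('alert', 'rules')
```

The second return value tells you which layer answered, which is exactly the audit trail you want when reconstructing a decision months later without network access to logs.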

Why This Matters

The gap between what AI can do in ideal conditions and what it can do in real conditions is enormous.

Most defense AI projects optimize for ideal conditions because that's where development happens. Then they fail at deployment because deployment doesn't happen in ideal conditions.

The organizations that get this right start with real constraints. They accept that edge deployments will be less accurate, less sophisticated, and less impressive than cloud deployments. But they'll actually work.

That's the trade-off. You can have an impressive demo that doesn't deploy, or a modest system that does.

Most organizations are choosing the first without realizing they're making a choice.

The Real Challenge

The tactical edge isn't a technical problem. It's a product problem.

The question isn't "Can we build AI that works at the edge?" Yes, you can. The question is "Will we accept the constraints required to deploy it?"

That means accepting less accuracy than cloud systems. It means accepting limited functionality. It means accepting that updates are slow and capability improvements are incremental.

Organizations that can't accept these constraints keep funding demos that never deploy.

The alternative is to start with constraints, build for offline operation from the beginning, and accept that you're building a different kind of system. One that works where networks don't exist. Which is where you actually need it to work.


We build AI systems for environments where traditional approaches fail. If you're deploying AI to places with no network access, no GPUs, and no tolerance for latency, we should talk.
