
LLMs: From Poker Nights to Mission-Critical Systems

January 30, 2026 · Evam Labs

When ChatGPT Loses at Poker

Featuring: Shobhit, Madhu, Sumanth & Prathamesh

We’ve all had our “magic moment” with LLMs.

But we’ve also had our “face-palm” moments.

A colleague of mine recently tried to use ChatGPT as a wingman during a poker night.
The result?

He lost badly.

Because the model, desperate to please, likely hallucinated a winning hand or miscalculated the odds based on incomplete information.

Another colleague used it for recipe substitutes, and the LLM confidently turned a vegetarian dish into a meat-based one.

These aren’t just funny anecdotes.

They are symptoms of the fundamental nature of Large Language Models.

  • They are probabilistic, not deterministic.
  • They are sycophantic — they want to agree with you, even if you are wrong.
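The first bullet can be made concrete with a toy decoder loop. This is not a real model, just a sketch of why two identical prompts can yield different answers: the decoder samples from a probability distribution over tokens, and the distribution below is invented purely for illustration.

```python
import random

# Invented next-token distribution for "The best poker move is ..."
# (illustrative numbers, not from any real model)
NEXT_TOKEN_PROBS = {"fold": 0.40, "call": 0.35, "raise": 0.25}

def sample_token(probs: dict, rng: random.Random) -> str:
    """Sample one token from a probability distribution, as an LLM decoder does."""
    r = rng.random()
    cumulative = 0.0
    for token, p in probs.items():
        cumulative += p
        if r < cumulative:
            return token
    return token  # guard against floating-point edge cases

# Two runs of the "same" query, with different sampling randomness:
run_a = [sample_token(NEXT_TOKEN_PROBS, random.Random(1)) for _ in range(5)]
run_b = [sample_token(NEXT_TOKEN_PROBS, random.Random(2)) for _ in range(5)]
print(run_a)
print(run_b)
```

Same prompt, same distribution, different answers. That is the whole point: you cannot unit-test an LLM the way you test a pure function.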

If you are an engineer building on top of these models, this is your starting line:

The model is a liar.

Now the question becomes:

How do we build a mission-critical system with it?

1. Tactical or Strategic

To stop treating LLMs like toys, we need to categorize how we use them.

The Tactical Layer (Low-level)

This is where the LLM acts like a transactional worker.

It’s excellent at tasks that are tedious for humans but easy to verify.

Examples:

Data cleaning
Frontier models can take millions of jumbled, unstructured addresses and standardize them with 95%+ accuracy.

Formatting
Turning unstructured text into structured JSON for supervised learning features.

Coding
Writing small, checkable blocks of code using tools like Cursor.
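The formatting task above is a good example of "tedious but easy to verify": whatever the model returns, we can check it mechanically before trusting it. A minimal sketch, where `call_llm` is a hypothetical stand-in for any real model API and the canned reply exists only so the example runs:

```python
import json

def call_llm(prompt: str) -> str:
    # Hypothetical stand-in for a real model call; returns a canned reply here.
    return '{"street": "221B Baker St", "city": "London", "postcode": "NW1 6XE"}'

REQUIRED_FIELDS = {"street", "city", "postcode"}

def extract_address(raw_text: str) -> dict:
    """Ask the model for structured JSON, then verify it -- never trust it blindly."""
    reply = call_llm(f"Extract the address from: {raw_text!r}. Reply with JSON only.")
    try:
        data = json.loads(reply)
    except json.JSONDecodeError as exc:
        raise ValueError("model did not return valid JSON") from exc
    missing = REQUIRED_FIELDS - data.keys()
    if missing:
        raise ValueError(f"model omitted required fields: {missing}")
    return data

print(extract_address("sherlock holmes 221b baker st london nw1 6xe"))
```

The verification layer is the part you own. The model proposes; your code disposes.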

The Strategic Layer (High-level)

Here, the LLM becomes a “sounding board.”

It can:

  • Iterate through deep research
  • Summarize literature for IP projects
  • Help you reason through problems outside your domain

Think, for example, of an information theorist using it to get up to speed on spatial imaging.

2. The “Fringe Stack”: Where the Real Value Lies

Here is a hard truth for engineering graduates and startups:

Don’t compete on the core stack.

The core LLM is rapidly becoming a commodity.

These models are incredible when there is enough data.

But the real world is messy.

The real world operates on the edges — the uncertainties.

That is where the fringe lives.

The Fringe

Problems where:

  • Data is scarce
  • Data is unique
  • Data is highly enterprise-specific

The Architecture

The future isn’t just “asking the LLM.”

Instead, we use the LLM as an interface that interacts with fringe models.

These are specialized traditional machine learning models designed for specific low-data tasks.
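One way to picture that architecture, with every model name invented for illustration: the LLM only classifies the request, and a small registry of specialist "fringe" models does the actual prediction.

```python
# Sketch of the fringe stack: the LLM is the interface; domain-specific
# models trained on scarce, enterprise-specific data do the real work.

def llm_route(user_query: str) -> str:
    """Stand-in for an LLM deciding which specialist model should answer."""
    if "demand" in user_query:
        return "demand_forecaster"
    if "defect" in user_query:
        return "defect_classifier"
    return "fallback"

FRINGE_MODELS = {
    # Each entry would be a small, specialized model in a real system.
    "demand_forecaster": lambda q: "forecast: 1,200 units next week",
    "defect_classifier": lambda q: "defect class: hairline crack",
    "fallback": lambda q: "no specialist model available",
}

def answer(user_query: str) -> str:
    model = FRINGE_MODELS[llm_route(user_query)]
    return model(user_query)

print(answer("What is next week's demand for fresh milk?"))
```

Note where the value sits: the routing is commodity; the specialist models behind the dictionary are the fringe stack.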

If you want to be valuable:

Don’t just learn prompt engineering.
Learn how to build the Fringe Stack.

3. Multimodality: Seeing the World, Not Just Reading It

We are moving beyond text-in, text-out systems.

Modern models are becoming natively multimodal.

They can perceive, not just process language.

Example: Retail Decision Making

Imagine you want to find the best location to sell fresh milk.

Input

  • An image from a map provider
  • A neighborhood graph
  • Store data

Process

The model analyzes spatial structure, such as:

  • Neighborhood appearance
  • Indicators of income level
  • Urban density patterns

Output

It predicts market opportunities based on visual cues that a purely text-based model would miss.
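A deliberately simplified sketch of the fusion step, with invented features and weights: visual density (from the map image), centrality (from the neighborhood graph), and competitor count (from store data) are collapsed into one opportunity score.

```python
# Toy multimodal fusion: all feature names and weights are invented
# for illustration, not derived from any real model.

def site_score(image_density: float, graph_centrality: float, nearby_stores: int) -> float:
    """Combine visual, graph, and tabular signals into one opportunity score."""
    competition_penalty = 0.1 * nearby_stores
    return 0.6 * image_density + 0.4 * graph_centrality - competition_penalty

candidates = {
    "riverside": site_score(image_density=0.8, graph_centrality=0.7, nearby_stores=2),
    "old_town": site_score(image_density=0.5, graph_centrality=0.9, nearby_stores=5),
}
best = max(candidates, key=candidates.get)
print(best, round(candidates[best], 2))
```

The interesting engineering is upstream of this function: extracting `image_density` from pixels is exactly the kind of perception a text-only model cannot do.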

4. The Engineering Crisis

Validation & The “Yes-Man” Problem

Traditional software testing is deterministic.