Over the last several days, we’ve been trying to solve what initially felt like a very reasonable problem:
Can an AI assistant act as a reliable, step-by-step installation guide for a human setting up a moderately complex toolchain?
On the surface, this seemed like a perfect use case for AI. The setup process is linear. The steps are well-defined. The goal is clear. Why not let an AI guide a user through it?
What followed was a long series of experiments, false starts, partial successes, and unexpected constraints — not because the tools were complex, but because AI assistants behave very differently from traditional software systems.
This post is an overview of what we tried, what failed, what we learned along the way, and where we are now.
The Original Goal
We wanted an AI assistant to guide users through installing and configuring a “Personal Toolbar” system built from two pieces:
- A browser extension (Subscribed Toolbar)
- A web-based configuration tool (Jsonmaker)
The ideal experience looked like this:
- The user opens a chat with an AI
- The AI walks them through setup step by step
- The AI doesn’t skip ahead
- The AI doesn’t overwhelm them with explanations
- The AI waits for confirmation before moving on
- The AI acts like a calm, deterministic installer
In other words, we wanted the AI to behave like a wizard or state machine, not a lecturer.
Attempt #1: A Single Instruction File
The first approach was straightforward:
- Write a single, authoritative installation document
- Host it publicly
- Tell the AI to load it and “follow it exactly”
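In spirit, the setup was no more complicated than fetching one document and wrapping it in a prompt. Here is a minimal sketch of that idea in Python; the URL and the prompt wording are illustrative placeholders, not our actual instructions:

```python
# Sketch only: the URL and prompt wording below are illustrative placeholders.
import requests

INSTRUCTIONS_URL = "https://example.com/personal-toolbar-install.md"  # hypothetical host

def build_system_prompt() -> str:
    """Fetch the hosted installation doc and wrap it in a 'follow this exactly' framing."""
    response = requests.get(INSTRUCTIONS_URL, timeout=10)
    response.raise_for_status()
    return (
        "You are an installation assistant. The document below is authoritative.\n"
        "Follow it exactly, one step at a time, and wait for confirmation before moving on.\n\n"
        + response.text
    )
```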
This mostly worked — but with a major flaw.
The AI would read the whole file, then immediately start referencing later steps:
- Explaining Jsonmaker before the extension was installed
- Talking about “what’s coming next”
- Collapsing multiple phases into one response
From a human perspective, this is helpful.
From an installation perspective, it’s a disaster.
We discovered an important truth early:
You can’t stop an AI from seeing ahead — only from acting ahead.
Attempt #2: Splitting Instructions Into Multiple Files
To prevent lookahead, we tried splitting each phase into its own file:
- Phase 1 → link to Phase 2
- Phase 2 → link to Phase 3
- And so on
The idea was simple: if the AI can’t see the future, it can’t jump ahead.
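For a conventional program, following that chain is trivial; the sketch below shows the loop we were effectively asking the AI to perform by hand. The "NEXT:" link convention and the URLs are hypothetical stand-ins for our real files.

```python
# Sketch: a deterministic client walking a chain of phase files.
# The "NEXT:" convention and the URLs are hypothetical stand-ins.
import requests

def walk_phases(start_url: str):
    """Yield each phase document in order, following its pointer to the next phase."""
    url = start_url
    while url:
        text = requests.get(url, timeout=10).text
        yield text
        # Assume each phase file ends with a line like "NEXT: https://example.com/phase-2.md"
        links = [line for line in text.splitlines() if line.startswith("NEXT:")]
        url = links[0].removeprefix("NEXT:").strip() if links else None
```

A script runs this loop reliably. The AI environments we tested did not, for the reasons below.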
This introduced a new class of problems.
Many AI environments are:
- Unreliable at loading external URLs
- Resistant to being told “load this and obey it”
- Cautious about chained external instructions (for good security reasons)
We ran into:
- Failed fetches
- Refusals
- Requests for the user to paste content manually
- Inconsistent behavior across sessions
In trying to make the system more deterministic, we had accidentally made it more fragile.
Attempt #3: Hosting Instructions Everywhere
We tried moving the instructions around:
- GitHub raw URLs
- GitHub blob pages
- Our own domain
- Variants with proxies and mirrors
Each helped a little; none solved the core issue.
At this point, it became clear that the problem wasn’t where the instructions lived.
The problem was how AI assistants relate to external authority.
A Key Realization: AI Resists Being a Script Runner
One of the most important insights came from observing repeated failure modes:
- The AI didn’t want to blindly defer to an external document
- It resisted being told “this is authoritative”
- It preferred to reason, summarize, and help — even when explicitly told not to
This isn’t a bug. It’s a design choice.
AI assistants are built to:
- Avoid prompt injection
- Avoid remote control via external content
- Avoid acting like a programmable agent unless explicitly designed as one
In other words:
We were trying to use a conversational system as a deterministic execution engine.
That mismatch explains a lot.
Attempt #4: Phase Gates and Unlock Phrases
Next, we went back to a single instruction file, but added structure:
- Explicit phases
- Hard “STOP” markers
- Unlock phrases like “UNLOCK PHASE 3”
- Repeated reminders not to move ahead
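The rule we wanted enforced fits in a few lines. This sketch is our own illustration of the gate logic, not anything the model actually runs:

```python
# Our own illustration of the gate logic; the model has no such enforcement mechanism.
UNLOCK_TEMPLATE = "UNLOCK PHASE {n}"

def may_advance(user_message: str, current_phase: int) -> bool:
    """Allow moving past a STOP marker only when the exact unlock phrase is present."""
    return UNLOCK_TEMPLATE.format(n=current_phase + 1) in user_message.strip()
```

A program either passes this check or it doesn't. A language model, as we found, treats it more like a suggestion.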
This worked better, but still wasn’t foolproof. The AI would sometimes comply, sometimes drift, sometimes “helpfully” explain anyway.
The model could see the rules — but wasn’t always motivated to obey them.
The Most Promising Shift: Command-Gated Execution
The best results so far came from a conceptual shift:
Instead of asking the AI to push the next step, we made the user pull it.
We introduced a command-style protocol:
RUN PHASE 1
RUN PHASE 2
RUN PHASE 3
The AI:
- Refuses to execute unless it sees an exact command
- Treats natural language as clarification, not execution
- Behaves more like a CLI than a tutor
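Written down as actual code, the protocol would look something like the sketch below: an exact-match command advances the installation by one phase, and everything else is treated as a question. The function name and return values are ours, purely for illustration.

```python
# Illustration of the command-gated protocol; the names and return values are ours.
import re

RUN_COMMAND = re.compile(r"^RUN PHASE (\d+)$")

def handle_message(message: str, current_phase: int) -> tuple[str, int]:
    """Advance only on an exact RUN PHASE command; treat everything else as clarification."""
    match = RUN_COMMAND.match(message.strip())
    if not match:
        return ("clarify", current_phase)            # answer the question, never advance
    requested = int(match.group(1))
    if requested != current_phase + 1:
        return ("refuse", current_phase)             # no skipping ahead, no re-running
    return (f"execute phase {requested}", requested)  # advance exactly one phase
```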
This aligns much better with how language models behave.
It’s not perfect — but it dramatically reduces lookahead, premature explanation, and accidental skipping.
Where We Are Now
We don’t have a “perfect” solution yet.
What we do have is a much clearer understanding of the problem space:
- AI assistants are not state machines
- They don’t like being told to obey external scripts
- They default to being helpful, explanatory, and anticipatory
- Determinism has to be designed around, not assumed
The current direction — command-gated, user-pulled execution with minimal external dependencies — feels the most promising so far.
But we’re still experimenting.
Why We’re Still Optimistic
This process hasn’t been wasted effort.
It’s revealed something deeper and more interesting:
Designing workflows for AI is not the same as designing workflows with AI.
The patterns that work look less like documentation and more like protocols. Less like instructions and more like interfaces.
We don’t yet know what the final shape of this will be — but we’re confident that an answer exists, and that it will emerge from continued experimentation rather than a single clever trick.
If nothing else, this journey has reinforced a valuable lesson:
AI is powerful — but only when you design with its nature, not against it.
What we do know: if we ask the AI to coach a user through each step manually (take the instructions and paste Phase 1 into ChatGPT), it helps the user complete Phase 1 perfectly. The same holds for each subsequent phase. But asking the user to paste 12 separate chunks of instructions seems ridiculous.
And if we set up a scenario where ChatGPT has access to all of the steps at once, it starts to hallucinate.
It’s a programming problem. We’ll figure it out.