Over the last several days, we’ve been trying to solve what initially felt like a very reasonable problem:
Can an AI assistant act as a reliable, step-by-step installation guide for a human setting up a moderately complex toolchain?
On the surface, this seemed like a perfect use case for AI. The setup process is linear. The steps are well-defined. The goal is clear. Why not let an AI guide a user through it?
What followed was a long series of experiments, false starts, partial successes, and unexpected constraints — not because the tools were complex, but because AI assistants behave very differently from traditional software systems.
This post is an overview of what we tried, what failed, what we learned along the way, and where we are now.
The Original Goal
We wanted an AI assistant to guide users through installing and configuring a “Personal Toolbar” system built from two pieces:
- A browser extension (Subscribed Toolbar)
- A web-based configuration tool (Jsonmaker)
The ideal experience looked like this:
- The user opens a chat with an AI
- The AI walks them through setup step by step
- The AI doesn’t skip ahead
- The AI doesn’t overwhelm them with explanations
- The AI waits for confirmation before moving on
- The AI acts like a calm, deterministic installer
In other words, we wanted the AI to behave like a wizard or state machine, not a lecturer.
Attempt #1: A Single Instruction File
The first approach was straightforward:
- Write a single, authoritative installation document
- Host it publicly
- Tell the AI to load it and “follow it exactly”
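In spirit, the setup was no more complicated than fetching one document and wrapping it in a prompt. Here is a minimal sketch of that idea in Python; the URL and the prompt wording are illustrative placeholders, not our actual instructions:

```python
# Sketch only: the URL and prompt wording below are illustrative placeholders.
import requests

INSTRUCTIONS_URL = "https://example.com/personal-toolbar-install.md"  # hypothetical host

def build_system_prompt() -> str:
    """Fetch the hosted installation doc and wrap it in a 'follow this exactly' framing."""
    response = requests.get(INSTRUCTIONS_URL, timeout=10)
    response.raise_for_status()
    return (
        "You are an installation assistant. The document below is authoritative.\n"
        "Follow it exactly, one step at a time, and wait for confirmation before moving on.\n\n"
        + response.text
    )
```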
This mostly worked — but with a major flaw.
The AI would read the whole file, then immediately start referencing later steps:
- Explaining Jsonmaker before the extension was installed
- Talking about “what’s coming next”
- Collapsing multiple phases into one response
From a human perspective, this is helpful.
From an installation perspective, it’s a disaster.
We discovered an important truth early:
You can’t stop an AI from seeing ahead — only from acting ahead.
Attempt #2: Splitting Instructions Into Multiple Files
To prevent lookahead, we tried splitting each phase into its own file:
- Phase 1 → link to Phase 2
- Phase 2 → link to Phase 3
- And so on
The idea was simple: if the AI can’t see the future, it can’t jump ahead.
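For a conventional program, following that chain is trivial; the sketch below shows the loop we were effectively asking the AI to perform by hand. The "NEXT:" link convention and the URLs are hypothetical stand-ins for our real files.

```python
# Sketch: a deterministic client walking a chain of phase files.
# The "NEXT:" convention and the URLs are hypothetical stand-ins.
import requests

def walk_phases(start_url: str):
    """Yield each phase document in order, following its pointer to the next phase."""
    url = start_url
    while url:
        text = requests.get(url, timeout=10).text
        yield text
        # Assume each phase file ends with a line like "NEXT: https://example.com/phase-2.md"
        links = [line for line in text.splitlines() if line.startswith("NEXT:")]
        url = links[0].removeprefix("NEXT:").strip() if links else None
```

A script runs this loop reliably. The AI environments we tested did not, for the reasons below.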
This introduced a new class of problems.
Many AI environments are:
- Unreliable at loading external URLs
- Resistant to being told “load this and obey it”
- Cautious about chained external instructions (for good security reasons)
We ran into:
- Failed fetches
- Refusals
- Requests for the user to paste content manually
- Inconsistent behavior across sessions
In trying to make the system more deterministic, we had accidentally made it more fragile.
Attempt #3: Hosting Instructions Everywhere
We tried moving the instructions around:
- GitHub raw URLs
- GitHub blob pages
- Our own domain
- Variants with proxies and mirrors
Each helped a little; none solved the core issue.
At this point, it became clear that the problem wasn’t where the instructions lived.
The problem was how AI assistants relate to external authority.
A Key Realization: AI Resists Being a Script Runner
One of the most important insights came from observing repeated failure modes:
- The AI didn’t want to blindly defer to an external document
- It resisted being told “this is authoritative”
- It preferred to reason, summarize, and help — even when explicitly told not to
This isn’t a bug. It’s a design choice.
AI assistants are built to:
- Avoid prompt injection
- Avoid remote control via external content
- Avoid acting like a programmable agent unless explicitly designed as one
In other words:
We were trying to use a conversational system as a deterministic execution engine.
That mismatch explains a lot.
Attempt #4: Phase Gates and Unlock Phrases
Next, we went back to a single instruction file, but added structure:
- Explicit phases
- Hard “STOP” markers
- Unlock phrases like “UNLOCK PHASE 3”
- Repeated reminders not to move ahead
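The rule we wanted enforced fits in a few lines. This sketch is our own illustration of the gate logic, not anything the model actually runs:

```python
# Our own illustration of the gate logic; the model has no such enforcement mechanism.
UNLOCK_TEMPLATE = "UNLOCK PHASE {n}"

def may_advance(user_message: str, current_phase: int) -> bool:
    """Allow moving past a STOP marker only when the exact unlock phrase is present."""
    return UNLOCK_TEMPLATE.format(n=current_phase + 1) in user_message.strip()
```

A program either passes this check or it doesn't. A language model, as we found, treats it more like a suggestion.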
This worked better, but still wasn’t foolproof. The AI would sometimes comply, sometimes drift, sometimes “helpfully” explain anyway.
The model could see the rules — but wasn’t always motivated to obey them.
The Most Promising Shift: Command-Gated Execution
The best results so far came from a conceptual shift:
Instead of asking the AI to push the next step, we made the user pull it.
We introduced a command-style protocol:
RUN PHASE 1
RUN PHASE 2
RUN PHASE 3
The AI:
- Refuses to execute unless it sees an exact command
- Treats natural language as clarification, not execution
- Behaves more like a CLI than a tutor
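Written down as actual code, the protocol would look something like the sketch below: an exact-match command advances the installation by one phase, and everything else is treated as a question. The function name and return values are ours, purely for illustration.

```python
# Illustration of the command-gated protocol; the names and return values are ours.
import re

RUN_COMMAND = re.compile(r"^RUN PHASE (\d+)$")

def handle_message(message: str, current_phase: int) -> tuple[str, int]:
    """Advance only on an exact RUN PHASE command; treat everything else as clarification."""
    match = RUN_COMMAND.match(message.strip())
    if not match:
        return ("clarify", current_phase)            # answer the question, never advance
    requested = int(match.group(1))
    if requested != current_phase + 1:
        return ("refuse", current_phase)             # no skipping ahead, no re-running
    return (f"execute phase {requested}", requested)  # advance exactly one phase
```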
This aligns much better with how language models behave.
It’s not perfect — but it dramatically reduces lookahead, premature explanation, and accidental skipping.
Where We Are Now
We don’t have a “perfect” solution yet.
What we do have is a much clearer understanding of the problem space:
- AI assistants are not state machines
- They don’t like being told to obey external scripts
- They default to being helpful, explanatory, and anticipatory
- Determinism has to be designed around, not assumed
The current direction — command-gated, user-pulled execution with minimal external dependencies — feels the most promising so far.
But we’re still experimenting.
Why We’re Still Optimistic
This process hasn’t been wasted effort.
It’s revealed something deeper and more interesting:
Designing workflows for AI is not the same as designing workflows with AI.
The patterns that work look less like documentation and more like protocols. Less like instructions and more like interfaces.
We don’t yet know what the final shape of this will be — but we’re confident that an answer exists, and that it will emerge from continued experimentation rather than a single clever trick.
If nothing else, this journey has reinforced a valuable lesson:
AI is powerful — but only when you design with its nature, not against it.
What we do know: if we ask the AI to coach a user through each step manually (take the instructions and paste Phase 1 into ChatGPT), it helps the user complete Phase 1 perfectly. The same holds for each subsequent phase. But asking the user to paste 12 separate chunks of instructions seems ridiculous.
And if we set up a scenario where ChatGPT has access to all of the steps at once, it starts to hallucinate.
It’s a programming problem. We’ll figure it out.