My Claude Code vibe-coding workflow as a non-developer

I have a crazy idea: I want to code WITH ENGLISH! Yeah, how about that! Ever since the advent of AI, I think coding, and learning to code, no longer exists as a barrier, and that paves the way for a new class of builders like you and me - the non-developers. As has been seen at Google, with AI writing code for you, everyone is a builder. But that does not mean you can just sit down, tell Claude “Build me Netflix, make no mistakes” and walk away. Building is fun, but it’s still hard work. This post is my journey building with AI so far.

The basics to start working

Everyone says that AI will help us create more, but to actually create things with AI, we first need to tell it about our context. We prompt it, it understands, and then it just works. That’s amazing. Then you open another chat session, and you re-tell your context over and over again. Now it is annoying. Too much talking means less doing, which is why you need to automate your prompts in Claude before reading about this.

To really start coding, you need a tool built for it. Cowork is great because it can touch files, use project instructions, and be set up almost like a personal assistant with persistent memory. But to go all in on code, and integrate with other products from the developer community, you need to learn how to use Claude Code.

The AI workflow

Let’s start with the most basic flow of working with AI: you request, the AI executes.

          ┌─────────┐
input ──► │ EXECUTE │
          └─────────┘

This is how most people use AI in chat. They ask for something and the AI gives them that thing. When you apply the same thing to Claude Code, you ask it to build a feature, and the AI goes and writes the code for that feature. Great! What could go wrong?

It turns out that AI models, because of the way they are trained, are very eager to jump into doing without thinking things through carefully. Many in the community have reported that when the AI has to think and do at the same time, the output is not satisfying at all.

So then, let’s help the AI break it down, by instructing it to make a plan, THEN do the thing.

          ┌──────────┐     ┌─────────┐
input ──► │  PLAN    │ ──► │ EXECUTE │
          └──────────┘     └─────────┘
              (new)

Now we have the AI do the job in two phases: 1) first it thinks of a plan and writes that plan down, 2) then it looks at the plan to do the implementation.

That’s better. The output is much higher quality.

Then you are building software, and let’s say it is a bit complex. Features are added, and they start to conflict with each other, or put constraints on each other.

For example, you have a feature that automatically saves the user’s edits after 15s, then later you want to build an Undo feature that allows the user to undo their edits as long as the edit is not saved yet. That means when planning how to build that feature, you or the AI must remember the Auto-Save feature you guys built, and either make the Undo feature adapt to the 15s window, or adjust Auto-Save so that it no longer saves by time, but instead by N-3 from the last edit (with N being the last edit).
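To make the constraint concrete, here is a minimal Python sketch of the first option, where Undo adapts to the Auto-Save window. The `Document` class and its method names are made up for illustration, and the 15s timer itself is left out. The only thing it tries to show is that every save wipes out what Undo can reach.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Document:
    """Hypothetical editor state: Undo only reaches edits not saved yet."""
    text: str = ""
    saved_text: str = ""
    # Snapshots of the text taken before each unsaved edit.
    unsaved_snapshots: List[str] = field(default_factory=list)

    def edit(self, new_text: str) -> None:
        # Remember the old text so this edit can be undone later.
        self.unsaved_snapshots.append(self.text)
        self.text = new_text

    def auto_save(self) -> None:
        # Assumed to be called by a 15s timer (not shown here).
        self.saved_text = self.text
        # The constraint: once saved, earlier edits can no longer be undone.
        self.unsaved_snapshots.clear()

    def undo(self) -> bool:
        # Returns False when there is nothing left that Undo may touch.
        if not self.unsaved_snapshots:
            return False
        self.text = self.unsaved_snapshots.pop()
        return True
```

If Auto-Save fires between two edits, the second `undo()` call quietly does nothing, which is exactly the kind of interaction that is easy to forget a week later.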

Now imagine you built Auto-Save on Monday, then spent the rest of the week building 12 more features, and did not start on Undo until Sunday. Would you be able to remember the constraints between them, and also keep track of all the complications that the other 12 features introduced? If you can, good for you, but I can’t, so let’s tell the AI to do that for us:

          ┌─────────────────┐     ┌──────────┐     ┌─────────┐
input ──► │  READ & GATHER  │ ──► │  PLAN    │ ──► │ EXECUTE │
          └─────────────────┘     └──────────┘     └─────────┘
                 (new)

Now, before planning, we instruct the AI to read the whole codebase and gather all the relevant information in one file. Then that file is used as input when the AI starts planning. The AI is aware of the current build, so the plan produces fewer errors. Everyone is happy.

After a while, you start to notice that your feature is behaving mostly okay…, but there are behaviors that you do not remember implementing or did not expect from it.

Let’s say you build the Undo feature, and it allows the user to undo at most three times from their current edit. But then, at the fourth undo, there is a pop-up message that says “You could not undo further!” - That was unexpected, you did not plan that pop-up. It did not break your software, but it’s just not what you wanted.

Then you start to investigate, and you realize that the AI decided to build that behavior on its own because you did not tell it what to do. You just prompted “make me the undo feature” without specifying how the feature should behave when no more undos are allowed. So the AI silently assumed your decision and implemented it.
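Here is a small Python sketch of what that hidden decision looks like once it is written down. Everything in it is hypothetical (the `UndoHistory` class, the `AtLimit` options); the point is only that “what happens at the fourth undo” becomes a setting someone has to choose, instead of something the AI picks silently.

```python
from enum import Enum
from typing import List, Optional, Tuple


class AtLimit(Enum):
    """The decision the prompt never made: what to do when Undo runs out."""
    DO_NOTHING = "do nothing"
    SHOW_MESSAGE = "show message"


class UndoHistory:
    def __init__(self, max_undos: int, at_limit: AtLimit) -> None:
        if max_undos < 1:
            raise ValueError("max_undos must be at least 1")
        self.max_undos = max_undos
        self.at_limit = at_limit
        self._snapshots: List[str] = []

    def record(self, previous_text: str) -> None:
        # Keep only the most recent max_undos snapshots.
        self._snapshots.append(previous_text)
        self._snapshots = self._snapshots[-self.max_undos:]

    def undo(self) -> Tuple[Optional[str], Optional[str]]:
        """Return (restored_text, message); either one may be None."""
        if self._snapshots:
            return self._snapshots.pop(), None
        if self.at_limit is AtLimit.SHOW_MESSAGE:
            return None, "You could not undo further!"
        return None, None
```

With `UndoHistory(3, AtLimit.DO_NOTHING)` the fourth undo returns nothing at all, while `AtLimit.SHOW_MESSAGE` gives the pop-up text. Neither is wrong, but it should be the human who picks.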

That is not very nice. So let’s add one more:

          ┌───────────┐     ┌─────────────────┐     ┌──────────┐     ┌─────────┐
input ──► │ INTERVIEW │ ──► │  READ & GATHER  │ ──► │  PLAN    │ ──► │ EXECUTE │
          └───────────┘     └─────────────────┘     └──────────┘     └─────────┘
              (new)

With the “Interview” phase, we require the AI to question us relentlessly until nothing is fuzzy between us and the AI. The answers then inform the “Read & Gather” phase, and so on. At this point, your feature request is much clearer, new features produce fewer errors, and the output is much better aligned with your expectations.

As you keep building, your features increase, and all is going to plan. But somehow, your app feels…slow? It still does all the things you want… but it is slow. Often, a new implementation requires editing code outside its scope, potentially causing errors in multiple places.

That is a sign of bad code design or an unoptimized solution. The AI produced a working solution, but the quality is not there. AI is usually biased toward the first solution it comes up with, even though that first solution might not be the best one.

To counteract this, we add in the Design phase:

          ┌───────────┐     ┌─────────────────┐     ┌────────┐     ┌──────────┐
input ──► │ INTERVIEW │ ──► │  READ & GATHER  │ ──► │ DESIGN │ ──► │  PLAN    │
          └───────────┘     └─────────────────┘     └────────┘     └──────────┘
                                                       (new)             │
                                                                         ▼
                                                                    ┌─────────┐
                                                                    │ EXECUTE │
                                                                    └─────────┘

Now, after interviewing the user, the AI reads & gathers the context relevant to the design, then sketches 2-3 solutions and presents them to us. We read these, choose the one we think is best, and let the AI do the rest of the workflow as usual.

This is quite a stable workflow at this point.

However, the biggest risk in the workflow now is the human - the non-coders like you and me who will not write the code or even take a look at it at all. Therefore we need to make some adjustments and instruct the AI to cover for our weakness:

                              (read)     (assumptions)     (verify)
          ┌───────────┐     ┌────────┐     ┌───────┐     ┌──────────┐     ┌──────┐
input ──► │ INTERVIEW │ ──► │ DESIGN │ ──► │ SPECS │ ──► │ RESEARCH │ ──► │ PLAN │
          └───────────┘     └────────┘     └───────┘     └──────────┘     └──────┘
                                             (new)           (new)           │
                             agent_1        agent_2         agent_3          │
                                                                             ▼
                                                                      ┌─────────────┐
                                                                      │   EXECUTE   │
                                                                      └─────────────┘

We break Read & Gather up, fold it into Design and two other smaller phases, Specs and Research, and make sure each phase is handled by a different agent:

agent_1 takes the interview output, then reads the code and designs the solution
agent_2 writes the chosen design down as a detailed specification and explicitly lists the underlying assumptions
agent_3 then independently takes the specs and does a deep research of the codebase to verify all the assumptions

With independent agents reviewing each other’s work against the codebase, we can have better confidence that the solution they present to us is a good and well-fitted one.

The documentation

While explaining the workflow, I mentioned telling the AI to dump its thinking into files, so the AI slows down, thinks first, then acts later, rather than thinking and doing at the same time.

The practice itself is super simple: tell the AI to write a file, then tell it to read it. But, as I will show you in this section, with intentional workflow design you can achieve much more than just a good problem-solving assistant.

First, let’s talk about the benefits of a file-based workflow with AI:

(1) Set up an almost persistent memory of the codebase for the AI: AI models are very capable of ingesting context from files, and in environments like VS Code or Claude Code, they can even proactively look for files that are relevant to the chat or task they are handed. This is nothing new; your CLAUDE.md setup is a direct application of this: by putting project context, instructions, and progress inside CLAUDE.md, the AI always catches up on what you have been doing. This saves you from repeating yourself every new chat session, or having to remember important things. This section will show you how you can scale a CLAUDE.md up into a robust context system for your AI.
(2) Work around the context window of the AI, so the output quality stays high: All models have a limited amount of temporary memory they can hold, which is often called the ‘context window.’ When you open a new chat session, the context window is empty. Then the AI loads in Auto-memory about you, your Global Instructions, your CLAUDE.md, the prompt you just sent, and finally the files it might need to read to solve your task. All of this does not occupy that much of the window - what is actually significant and affects the output quality is the AI’s thinking. Advanced AI models are instructed to think with various frameworks, like step by step (chain of thought), or zooming out and re-assessing, etc. The result of this thinking is much higher intelligence, at the cost of context window, because the AI thinks out loud and consumes its own thoughts as context during the task. This is why the AI’s output in the first few prompts is amazing, and keeps declining as the session drags on. To keep the thinking quality high, we manage this context window for the AI: we tell it to think one phase at a time, write its thoughts down, and then we open a new chat session to pick the task up and do the next phase. This means every phase in the workflow needs to be well-scoped and to produce artifacts ready to be consumed by the next phase.
(3) Have independent agents verify each other’s work: Each fresh chat session is like a new AI, with no context from the last session. Therefore, having the later AI verify the work of the previous AI is a good tactic. Yes, I am aware that if I use the same model, the biases it inherited during training will still be the same, and it might end up making the same mistake as the previous chat session. That is partly true, and it causes, I think, about 10% of the problems. The other 90% of mistakes are made because the AI missed some code during its autonomous file discovery, or it did not read the code at all and wrote based on its assumptions (a dangerously frequent behavior of AI models). That is where having another fresh AI, specifically instructed to catch errors in the previous AI’s work, brings so much value. (btw the ‘10%’ and ‘90%’ are just made up by me - they are not exact numbers and roughly mean ‘a little’ and ‘a lot’, so don’t quote me on those numbers!)

Now, let’s look at what our workflow will produce at each step:

In this paper-trail design, each phase emits a file to:

Capture the current progress, so the next AI will be able to pick up and continue
Create break-points where we could safely refresh the context-window without losing any important context
Let the AIs cross-check each other

WORKFLOW             PAPER-TRAIL
    input  ◄──────── idea_draft
      │             (your input)
      ▼
┌───────────┐
│ INTERVIEW │──────► requirements
└───────────┘        (clarified)
  (read codes)             │
      │   ◄────────────────┘
      ▼
┌───────────┐
│  DESIGN   │──────► design/feature
└───────────┘       (done solution)
  (assumptions)            │
      │   ◄────────────────┘
      ▼
┌───────────┐
│   SPECS   │──────► specs/feature
└───────────┘   (list out assumptions)
    (verify)               │
      │   ◄────────────────┘
      ▼
┌───────────┐
│ RESEARCH  │──────► research/feature
└───────────┘    (verify specs/designs)
      │   ◄────────────────┘
      ▼
┌───────────┐
│   PLAN    │──────► plan/feature
└───────────┘   (write detailed plan)
      │   ◄────────────────┘
      ▼
┌───────────┐
│  EXECUTE  │
└───────────┘

idea_draft: you write down your vision of the feature and the behavior you expect it to have
requirements: because you have probably written some fuzzy things, with unclear adjectives, the AI clarifies them to make sure it does not misunderstand or silently assume your intention.
design/feature: the Design phase requires the AI to 1) take your clarified requirements and 2) read the actual code related to those requirements. Then the AI presents 2-3 options that it confidently believes fit the current code and satisfy your requirements.
specs/feature: the next AI reads the design, then writes down more details on how the design should be implemented. During that process, it also documents ALL the assumptions that the design made. This is to surface the AI’s unconfirmed biases and blind spots.
research/feature: this stage is also done by a fresh agent. The agent takes the specs, with all the assumptions pointed out, and checks them vigorously against the real code to prove or disprove the assumptions (now that I write it, I think requiring the AI to try disproving would be a better approach). If there is an unvalidated assumption, the AI flags it and raises it to the user for a re-design if needed.
plan/feature: after the specs and all the assumptions are validated, the next agent writes down the detailed plan to prepare for implementation.
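Because each phase leaves a file behind, a fresh session can work out where to resume just by looking at which files exist. Below is a minimal Python sketch of that idea. The folder layout, the `.md` extension, and the `next_phase` helper are assumptions for illustration, not a fixed convention.

```python
from pathlib import Path

# (phase, artifact it must produce). The .md extension and the exact
# folder layout are assumptions made for this sketch.
PAPER_TRAIL = [
    ("INTERVIEW", "requirements/{feature}.md"),
    ("DESIGN", "design/{feature}.md"),
    ("SPECS", "specs/{feature}.md"),
    ("RESEARCH", "research/{feature}.md"),
    ("PLAN", "plan/{feature}.md"),
]


def next_phase(root: Path, feature: str) -> str:
    """Return the first phase whose artifact is missing, else EXECUTE."""
    for phase, template in PAPER_TRAIL:
        artifact = root / template.format(feature=feature)
        if not artifact.exists():
            return phase
    return "EXECUTE"
```

For a feature called `undo` with only the requirements and design files written, `next_phase` answers `SPECS`, so the next fresh agent knows which file to read and which one it owes.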

By now, I hope the paper-trail has illustrated how we can achieve benefits (2) and (3) that I outlined above. Benefit (1) (persistent memory about the codebase) needs a separate setup. This is the hard lesson I learned when developing software by myself: there are just so many little things about the software that you cannot keep track of, and that are not obvious even when you actually read the code.

As an example, let’s come back to our Undo feature. When you think about it, there is a lot to work out, so we start with a simple question: What counts as an edit? Does typing a character or typing a whole word count as an edit? Or should it be a whole paragraph? What about applying bold font? etc. The things that a non-developer often skips are the ones that keep developers up at night. Let’s say you decided that all changes related to text count as edits: no matter what it is, if it touches the text, that’s an edit. You thought you had really nailed it, and you moved on to other plans. Then after a while, you implement a new feature: embedding pictures. This allows the user to put pictures alongside text. That’s genuinely useful. But then the “What is an edit?” problem re-emerges: Does inserting a picture count as an edit?

The above is an oversimplified example, but I hope you get the idea: so many decisions are made without full insight, which requires us to be aware of them and check on them, because every new feature can potentially break an existing rule. For those of you who think “This is actually simple, I could remember them myself”: no, you can’t. I simplified it for the sake of this post, while in reality, the things you need to remember are probably 200-300 lines of random facts for just a small, personal piece of software. This means you might need to look at that list and check if ANY of them would break, EVERY time you want to add a cool new feature (that’s why we don’t get cool stuff like cat emojis!)

With the introduction of AI, surely this what-could-go-wrong job will be pushed in its face. Checking for constraints or remembering random things are some of the most efficient tasks to delegate to AI. But as my codebase grew, I faced a challenge: the damn model also does not want to do it!

Let me start at the beginning: one CLAUDE.md document for all the context about my codebase. CLAUDE.md is a very good place to store important information about your codebase because, per Anthropic’s design, Claude will always look for this file and read it first before doing anything to your code. This means: as long as we keep CLAUDE.md updated, the AI will have the latest context. Simple, I’ll just tell the AI at the end of execution to update CLAUDE.md.

AI-CONTEXT        WORKFLOW            PAPER-TRAIL

CLAUDE.md ────────►     input    ◄─────── idea_draft
                          │              (your input)
                          ▼
                         ...
                          │
                          ▼
                    ┌───────────┐
CLAUDE.md ◄──────── │  EXECUTE  │
            update  └───────────┘

This worked for about…a week. Yes, I already did everything I could to shorten what goes into CLAUDE.md, using ‘optimizing CLAUDE.md’ skills from other people to help reduce the length, etc. All that and it was still 500+ lines - there is simply too much information. But then, what’s the problem? 500 lines? AI can handle that easily. I thought so too, but when you have 500 lines of well-written, optimized, very important rules about your codebase, there are two problems: 1) every rule is important, so missing any one is a huge risk and 2) if you really have all-very-important rules in one file, you simply have too many. At this point, the AI revolted against me: when CLAUDE.md is too long, the AI will simply ignore parts or the whole of it (which is not nice at all).

Jokes aside, the real challenge is that AI cannot handle 500 very important rules at once with the same focus and quality. It will miss something, or not dedicate enough time to thinking about how a rule could go wrong.

So let’s break CLAUDE.md down into multiple files, and manage the AI’s context window around those files. Currently, my CLAUDE.md has these sections:

Project context & Behavior contract
Current Progress: what is built vs what is planned
Constraints: limitations that past designs put onto future designs
Open Questions: things we have yet to decide, because the current stage is not a good place to deal with them
Tech Debts: things we defer to prioritize the current implementation, or limitations of the design we have yet to solve
Architecture Decision Records: records of why we decided to build things this way
Dictionary: for terms specific to this project

The first obvious thing to move out is Current Progress - where we are on the roadmap. This information does not require constant monitoring, and should instead be treated as a brief. So we have a STATE.md file for this, which travels with CLAUDE.md and gets updated at the same time:

AI-CONTEXT           WORKFLOW            PAPER-TRAIL

  CLAUDE.md  ─────────►   input  ◄─────────── idea_draft
  STATE.md                  │                (your input)
                            ▼
                           ...
                            │
                            ▼
                       ┌───────────┐
  CLAUDE.md  ◄──────── │  EXECUTE  │
  STATE.md     update  └───────────┘

That might look like the old CLAUDE.md, but in two files. Believe me, the separation is small but so meaningful to the AI. I leave an instruction inside CLAUDE.md to tell it that STATE.md is just progress tracking: read once to catch up, nothing serious. This saves it from getting distracted by the same information had it been placed inside CLAUDE.md, and makes space in the AI’s memory to actually pay attention to other parts.

The above approach worked out pretty well, so I continued to do the same with other sections inside CLAUDE.md. Here is the full list of context-keeping documents and their scope:
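As a rough illustration of that split, the sketch below pulls one section out of a CLAUDE.md text so it can be written to STATE.md. It assumes the sections are marked with `## ` Markdown headings and that the section is titled `Current Progress`; the function name `extract_section` is made up for this post.

```python
from typing import Tuple


def extract_section(markdown: str, heading: str) -> Tuple[str, str]:
    """Split out one '## heading' section.

    Returns (remaining_text, extracted_section). The section runs from its
    '## ' heading line up to the next '## ' heading or the end of the text.
    """
    kept, moved = [], []
    in_section = False
    for line in markdown.splitlines():
        if line.startswith("## "):
            # A new section starts here: decide which bucket it goes to.
            in_section = line[3:].strip() == heading
        (moved if in_section else kept).append(line)
    return "\n".join(kept), "\n".join(moved)
```

Calling `extract_section(claude_text, "Current Progress")` gives back the slimmer CLAUDE.md text and the block that would go into STATE.md. In practice I just ask the AI to do the move, but the logic it follows is about this simple.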

CLAUDE.md : project context & behavior contract
STATE.md : current status of the build
CONSTRAINTS.md, OPEN QUESTIONS.md, TECH DEBTS.md: important guardrails when building the next features
ADR: records of decisions
CONTEXT: dictionary for shared language

Now it gets ridiculous.

Putting a reference to these files inside CLAUDE.md signals to the AI that they are to be read “as needed” - the most unreliable label ever. For one-time-read, low-stakes information like STATE.md, that is okay. But guardrails like those in CONSTRAINTS.md should not be treated that way, and loading them all onto the agent does not work either, because of the focus issues when the list gets too long.

To solve this problem, I tried the same trick I used with the workflow: multiple agents, each with its own responsibility. Let’s try it with the three guardrail documents: CONSTRAINTS.md, OPEN QUESTIONS.md, TECH DEBTS.md

The problem with these three docs is that they are essentially a checklist of things you need to be careful about when designing new features. Maybe it is missing logic you need to solve before implementing the new feature, or a constraint from 2 weeks ago that limits your options in the design phase. Whatever they are, they are the consequences of previous designs that you now have to deal with, or else you deal with the ultimate consequence: BBB - back-breaking bugs (literally breaking your back in search of bugs)

So instead of torturing the design agent to always think about the checklist while working on my task, I delegate to a subagent that reads all of them and checks them against the work of the design agent. So,

agent_1 designs
agent_2 tries to invalidate the design with the guardrails

They go back and forth until they reach a design that 1) does not violate any guardrail and 2) meets my requirements. This is the design we need.

This worked out pretty well, and fits naturally with the main workflow, so I expanded it, and here is the full flow:
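The back-and-forth between the two agents can be sketched as a simple loop. This is a toy Python version, not how Claude Code runs subagents: `propose` and `check_guardrails` stand in for the two agents, the round limit is my own assumption, and the check against my requirements is left to the human.

```python
from typing import Callable, List, Tuple


def design_until_clean(
    propose: Callable[[List[str]], str],
    check_guardrails: Callable[[str], List[str]],
    max_rounds: int = 5,
) -> Tuple[str, int]:
    """Loop agent_1 (propose) and agent_2 (check) until no violation is left."""
    feedback: List[str] = []
    for round_number in range(1, max_rounds + 1):
        design = propose(feedback)           # agent_1 designs, using feedback
        feedback = check_guardrails(design)  # agent_2 tries to invalidate it
        if not feedback:
            return design, round_number
    raise RuntimeError(f"No clean design after {max_rounds} rounds: {feedback}")


# Toy stand-ins for the two agents, for illustration only.
def toy_propose(feedback: List[str]) -> str:
    return "undo by edit count" if feedback else "undo by time window"


def toy_check(design: str) -> List[str]:
    return ["conflicts with 15s Auto-Save"] if "time window" in design else []
```

With the toy agents, the first design is rejected because it conflicts with Auto-Save, and the second one passes. The round limit matters: if the two agents cannot agree, I want the problem raised to me instead of a loop that runs forever.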

AI-CONTEXT               WORKFLOW                 PAPER-TRAIL

CLAUDE.md  ─────────────► input  ◄─────────────── idea_draft
STATE.md                    │                    (your input)
                            ▼
                      ┌───────────┐
CONTEXT.md ◄───────── │ INTERVIEW │─────────────► requirements
+ ADR        update   └───────────┘               (clarified)
                       (read codes)                    │
                            │   ◄──────────────────────┘
                            ▼
                      ┌───────────┐
GUARDRAILS ─────────► │  DESIGN   │─────────────► design/feature
     ▲       verify   └───────────┘               (done solution)
     │                (assumptions)                    │
     │                     │   ◄───────────────────────┘
     │                     ▼
     │                ┌───────────┐
     │                │   SPECS   │─────────────► specs/feature
     │                └───────────┘         (list out assumptions)
     │                 (verify)                       │
     │                     │   ◄──────────────────────┘
     │                     ▼
     │    update      ┌───────────┐
     └─────────────── │ RESEARCH  │─────────────► research/feature
                      └───────────┘          (verify specs/designs)
                            │   ◄──────────────────────┘
                            ▼
                      ┌───────────┐
                      │   PLAN    │─────────────► plan/feature
                      └───────────┘         (write detailed plan)
                            │   ◄──────────────────────┘
                            ▼
                      ┌───────────┐
CLAUDE.md  ◄───────── │  EXECUTE  │
STATE.md     update   └───────────┘

The agents collaborate as follows:

The Architect:
- Read CLAUDE.md, STATE.md to catch up to the current progress
- Receive user’s request, then interview user for clarified requirements
- After clarification, update CONTEXT.md (dictionary of special terms in this project), and ADR (user’s major decisions about the program)
- Then go to work on Design
The Guardian:
- Read all the guardrails docs (CONSTRAINTS.md, OPEN QUESTIONS.md, TECH DEBTS.md)
- Verify The Architect’s work and only allow the design to pass if none of the guardrails is violated
The Assumption Buster:
- Read the Design
- Transform Design brief into detailed specification
- Note down all the assumptions that The Architect made without clear evidence and warn the next agent to verify
The Researcher:
- Take the specs and see all the assumptions
- Read the code deeply to validate/invalidate each assumption
- If there is an inaccurate assumption, call in The Architect to re-design, or raise it to the user if the error is serious
The Planner:
- Read both the specs and the research
- Write plan for implementation
The Worker:
- Write the code according to the plan

The human at the wheel

At this point, the workflow is pretty good at managing the AI. Still, the biggest weakness remains unsolved: human control and decision making. AI is powerful, but it needs human instruction and input to know which direction it should go. This is not AI-driven development, this is Driving-AI development - the human is the one steering the wheel, not the AI.

For a developer, especially an experienced one, this is just “read the plan” that the AI presents, and correct it if something is off. Believe me, lots of things will be off due to the AI’s tendency to assume the user’s decisions (which still exists, even with the above workflow), or limitations in the AI’s thinking (it cannot think of all the edge cases!). However, non-developers like you and me cannot just “read the plan,” because we don’t even understand the plan. Add “never write or read the code” to that, and we are basically letting the AI run free.

Tempting as it might be, we cannot trust the AI yet, and thus we need to try our best to understand what it is planning to do. This can be achieved either by 1) actually learning how to code and build software or 2) just asking the AI until it is clear. The first option is a hard NO for non-developers (that’s why we are non-developers!), while the second option is ineffective and tiring - asking about every detail of the plan is time-consuming, and also takes up the AI’s context window.

Instead, I propose that we guide the AI to write the plan in ways that help us understand what it wants to do. In short, we use AI as a code-to-English translator. This approach is something I am still working on, because, simple as it may appear, there are difficult constraints and conflicts:

Simplification benefits understanding, but at the risk of misleading: Technical things are technical because there are a lot of nuances and edge cases that often require building up background knowledge before you can actually understand them. Forcing simplification onto these things risks missing all those details, and misleading people into a flawed model of how things work. Couple this with the tendency of AI to hallucinate under pressure, and you’ve got a disaster.
Cohesion between plan and implementation: Some amount of code is needed in the plan so the user can link it with the actual code. If the plan is written without mentioning any code, the user would be a complete stranger to the codebase, and later, if something goes wrong and needs bug fixing, the user would basically rely on the AI to do all the work, which is very dangerous. On the other hand, putting in too much code detail causes confusion and hinders understanding.
The user should not just understand, but should also be encouraged to engage with the code and technical stuff: You are building software here, mate. Unless you are happy with just adding some simple numbers together, you might want to start improving yourself as a non-developer vibe coder (I’ll think of a better name next time). If you are ambitious and want to build more complex software, you need to improve, and reading the AI’s plan is a perfect way to do that, but only when the plan is written in ways that help you understand, while still introducing some code to you.

As I said before, I am still working on this problem, and there’s also a personal aspect to it - your level of understanding will be different from mine. So I’ll present my current solution here, and I hope it gives you some ideas to start working on your own version.

I found that reading a block of text from the AI explaining the code is very confusing for me, and I cannot visualize in my head how things work, so my main technique is: “Draw me a diagram.” It still takes a lot of trial and error to explain to the AI how it should draw diagrams that fit my level of understanding, but it basically boils down to:

Drawing with English, no complex jargon
If you have to use jargon, attach an explanation for what it is and what it does
Each component is a block, with arrows to indicate how they interact
Build up step by step, not all at once

The last point is my hard-learned lesson. I found that if I tell the AI to draw me a diagram and explain all the details at once, I cannot understand any of it, but a simplified version is also not a good choice due to the risk of hallucination and misunderstanding. Instead, I strike a middle ground between the two extremes, telling the AI to start with a simple diagram first, then build up to more complex ones as we move on. Let me explain with my workflow, because this, again, maps really well onto a multi-step workflow:

┌───────────┐
│  DESIGN   │──────► Diagram_1: What are we building?
└───────────┘
  (assumptions)
      │
      ▼
┌───────────┐
│   SPECS   │──────► Diagram_2: How do components connect with each other?
└───────────┘
    (verify)
      │
      ▼
┌───────────┐
│ RESEARCH  │──────► Diagram_3 + Diagram_4: What's wrong + What needs to change?
└───────────┘
      │
      ▼
┌───────────┐
│   PLAN    │──────► Diagram_5: Why are things built this way?
└───────────┘

As you can see, I name the diagrams based on their purpose. I found these are the essential questions for understanding a system, and since I cannot handle them all at once, I break them out across the steps and tackle them one by one.

The key principle I give the AI when drawing these is: make the next diagram consistent with the first. This means that when the AI draws D2, it first needs to look at D1, and then make sure that things like the names of the blocks, the positions of the blocks, the flow of the arrows, etc. are consistent from D1 to D2. This is crucial to not confuse me as we move from simple to more complex diagrams.
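One cheap way to check the naming part of that rule, without comparing every diagram by eye, is to pull the block names out of two diagrams and compare them. This is only a sketch: it assumes blocks are drawn with the `│` box character like the diagrams in this post, and `block_names` and `dropped_blocks` are hypothetical helper names. It does not check positions or arrows.

```python
import re
from typing import List

# A block is text sitting between two vertical box characters, e.g. "│ PLAN │".
BLOCK = re.compile(r"│\s*([A-Za-z][^│]*?)\s*│")


def block_names(diagram: str) -> List[str]:
    """Return the block labels in the order they appear in the diagram."""
    return BLOCK.findall(diagram)


def dropped_blocks(earlier: str, later: str) -> List[str]:
    """Blocks named in the earlier diagram that the later one lost or renamed."""
    later_names = set(block_names(later))
    return [name for name in block_names(earlier) if name not in later_names]
```

If D1 has a block called `DESIGN` and D2 calls it `Design`, `dropped_blocks` returns `["DESIGN"]`, which is my cue to tell the AI to redraw D2 with the same names.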

Let me give you an example. Suppose we are building a flow for promoting notes [Phat’s note: Sorry, I haven’t written this part yet. I have yet to find a good enough example to show here. Check back later! Also, other things I will add to this article (the next time I continue working on it): Keeping track with SSOTs.]

Thanks for reading!