Most Claude Code users burn through credits and get inconsistent results because of five completely fixable mistakes. No CLAUDE.md, too many MCP servers, no voice files, no persistence between sessions, and loading everything upfront instead of on demand. This field note walks through each mistake, explains why it happens, and gives you the one thing to change. Fix all five and your sessions get longer, your output gets sharper, and you stop re-explaining yourself every time you open the terminal.
Who This Is For
You've installed Claude Code. You've used it for a few projects. It's useful, but something feels off. Sessions end too quickly. The output is generic. You keep repeating yourself. You're starting to wonder if you need a higher-tier plan.
You probably don't. You need to fix how your environment is set up.
I run my entire content pipeline through Claude Code — LinkedIn posts, YouTube scripts, blog articles, Instagram reels, Reddit engagement, consulting proposals. Dozens of skills across seven platforms. The system works because I've made (and fixed) every mistake on this list.
Here are the five that cost you the most time and credits.
Mistake 1: No CLAUDE.md File
The Symptom
Claude asks the same questions every session. "What framework are you using?" "What's the project structure?" "Do you want TypeScript or JavaScript?" It gives generic suggestions that don't match your codebase. You spend the first 5 minutes of every session re-establishing context.
Why It Happens
People think Claude "learns" their project just by being in the directory. It does read your codebase to some extent — it can see file names, scan imports, pick up patterns. But without explicit guidance, it wastes tokens rediscovering context that you could hand it for free.
Every session starts cold. Claude doesn't remember that your project uses Bicep instead of Terraform, that you deploy to canadacentral, that your naming convention is kebab-case with resource-type prefixes. Without CLAUDE.md, it has to figure all of that out again. Every. Single. Time.
The Fix
Create a CLAUDE.md file at your project root. Claude Code reads this automatically at the start of every session. Think of it as the briefing document you'd hand a new team member on day one.
What to include:
- Project overview — what this repo does, in 2-3 sentences
- Architecture notes — key patterns, directory structure, how things connect
- Commands — how to build, test, deploy
- Rules — naming conventions, coding standards, things Claude should never do
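For example, a minimal CLAUDE.md following that structure might look like the sketch below. The project name and details are placeholders drawn from the examples in this article; adapt them to your repo:

```markdown
# Project: contoso-platform

## Overview
Bicep templates and deployment scripts for our Azure landing zone.
Deploys to canadacentral. Production changes go through CI only.

## Architecture
- modules/: reusable Bicep modules, one resource type per file
- environments/: per-environment parameter files

## Commands
- Build: az bicep build --file main.bicep
- Deploy: ./scripts/deploy.sh <environment>

## Rules
- Naming: kebab-case with resource-type prefixes (app-, sql-, kv-)
- Always attach diagnostic settings to the shared Log Analytics workspace
- Never hardcode secrets; reference Key Vault instead
```

Fifteen lines like these are enough to stop Claude from guessing your region, naming convention, and deployment flow every session.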
Before vs After
Without CLAUDE.md:
You: "Write a Bicep template for an App Service with managed identity"
Claude: "Here's a basic App Service template..."
→ Uses eastus instead of your standard region
→ Names the resource "appService1" instead of your naming convention
→ Doesn't include the diagnostic settings you always add
→ Uses an API version from 2023

With CLAUDE.md that specifies your conventions:
You: "Write a Bicep template for an App Service with managed identity"
Claude: Reads CLAUDE.md → knows your region, naming pattern, standard modules
→ Uses canadacentral
→ Names it "app-myproject-prod" following your convention
→ Includes diagnostic settings pointing to your standard Log Analytics workspace
→ Uses current API versions

The difference isn't just convenience. It's the difference between output you can use and output you have to rewrite.
Keep CLAUDE.md under 200 lines. If you need more detail, link to knowledge files: "For voice guidelines, see knowledge/linkedin_voice.md." CLAUDE.md is the map — knowledge files are the territory.
Mistake 2: MCP Server Bloat
The Symptom
Your sessions feel short. Claude runs out of context faster than you'd expect. You see tool calls you never asked for — random searches, unnecessary API hits. The session seems to "forget" things you said 10 minutes ago.
Why It Happens
Every MCP (Model Context Protocol) server registers its tools at session start. Each tool has a name, description, and parameter schema. That's tokens. And they're loaded whether you use them or not.
Here's the math that nobody tells you: a typical MCP server registers 5-15 tools. Each tool definition costs roughly 150-300 tokens. If you have 6 servers running with a total of 84 tools, that's approximately 15,000-25,000 tokens consumed before you've typed a single character.
That's your context window getting eaten by tool definitions you're not going to use.
How to Audit
Check how many tools are registered in your current session. In Claude Code, you can see your MCP server configuration in two places:
- Global: ~/.claude/settings.json — applies to every project
- Project: .claude/settings.local.json — applies to this project only
Count the servers. Count the tools each one registers. Multiply by ~200 tokens per tool. That's your overhead.
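That back-of-the-envelope audit is easy to script. Here's a minimal sketch in Python; the server names and tool counts are made-up placeholders, not measurements, and ~200 tokens per tool is the same rough estimate used above:

```python
# Rough audit of MCP tool-definition overhead at session start.
# Assumes ~200 tokens per tool definition (a ballpark, not a measurement).
TOKENS_PER_TOOL = 200

# Placeholder inventory: server name -> number of tools it registers.
servers = {
    "notion": 14,
    "linear": 22,
    "gmail": 7,
    "sentry": 13,
}

def estimated_overhead(servers: dict[str, int]) -> int:
    """Total tokens consumed by tool definitions before you type anything."""
    return sum(servers.values()) * TOKENS_PER_TOOL

print(estimated_overhead(servers))  # 56 tools -> prints 11200
```

Run it against your real settings files and you'll usually find the number is higher than you expected.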
The Token Cost
Here's what typical MCP servers cost you in context overhead:
| MCP Server Type | Typical Tools | Estimated Token Cost |
|---|---|---|
| Notion | 12-15 tools | ~3,000 tokens |
| Linear | 20-25 tools | ~5,000 tokens |
| Gmail | 6-8 tools | ~1,500 tokens |
| Buffer | 10-12 tools | ~2,400 tokens |
| Sentry | 12-15 tools | ~3,000 tokens |
| File system / search | 3-5 tools | ~800 tokens |
| Total (all enabled) | 63-80 tools | ~15,700 tokens |
That's roughly a 12,000-word document's worth of context — gone before you start.
The Fix
Three things to do right now:
1. Use project-level settings instead of global. Move MCP servers from ~/.claude/settings.json to .claude/settings.local.json in the projects that actually need them. Your blog writing project doesn't need Sentry. Your debugging project doesn't need Buffer.
2. Disable servers you don't need for the current task. If you're writing a blog post, you need Notion and maybe Buffer. You don't need Linear, Sentry, and Gmail all loaded and consuming tokens in the background.
3. Enable deferred tool loading where available. Some MCP integrations support loading tool definitions only when invoked rather than at session start. This is the single biggest token saver for heavy setups.
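To illustrate fix #1, a project-scoped MCP config might look like the sketch below. Treat it as an assumption: the exact file name and schema vary by Claude Code version, and the server entry shown is a placeholder, so check the current docs before copying it.

```json
{
  "mcpServers": {
    "notion": {
      "command": "npx",
      "args": ["-y", "@notionhq/notion-mcp-server"]
    }
  }
}
```

The point is the scope: this server loads only in the one project that needs it, not in every session you ever open.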
Most people think they need a higher Claude plan when their usage runs out in 2 hours. They don't need more tokens — they need fewer tools. Audit your MCP servers before upgrading your subscription.
Mistake 3: No Knowledge Files
The Symptom
Claude writes content that sounds like it was written by a competent stranger. The information is correct, the grammar is fine, but it doesn't sound like you. If you're producing content across multiple platforms, each output sounds slightly different — but not in a deliberate, platform-appropriate way. Just... inconsistent.
You find yourself re-explaining your tone, your vocabulary, your opinions every time you ask Claude to write something.
Why It Happens
Without reference files, Claude has no "voice DNA" to match. It defaults to a helpful, slightly formal, generically professional tone. Which is fine for one-off tasks. But if you're building a brand — writing LinkedIn posts, YouTube scripts, blog articles, Reddit comments — generic tone is the enemy.
Claude can't match a voice it's never seen. It can match a voice it reads at the start of every relevant task.
The Fix
Create knowledge/*.md files for each output type. These aren't prompts — they're voice profiles. Style guides. Reference documents that Claude reads before generating content for a specific platform.
What goes in a voice file:
- Sentence patterns — do you use short sentences? Long, flowing ones? A mix?
- Vocabulary — words you always use, words you never use
- Opinion markers — how strongly do you state positions? Do you hedge or commit?
- Structural patterns — do you open with a story? A question? A bold claim?
- Anti-patterns — the generic phrases that should never appear in your content
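To make that concrete, here's a minimal sketch of what a knowledge/linkedin_voice.md could contain. The specific phrases are illustrative, echoing the before/after example below rather than prescribing a style:

```markdown
# LinkedIn Voice

## Sentence patterns
- Short, declarative openers. One idea per sentence.
- Paragraphs of 1-3 lines; white space is part of the rhythm.

## Vocabulary
- Use: "here's what I keep seeing", "set up wrong"
- Never: "game-changer", "I'm excited to share", "unlock"

## Opinion markers
- Commit to positions. No "it depends" without saying what it depends on.

## Structure
- Open with an observation from real work, not a definition.

## Anti-patterns
- No listicle openers ("5 things you should know...")
- No hashtag walls
```

Short is fine. A tight one-page profile beats a sprawling style manual Claude has to wade through on every draft.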
The Difference It Makes
Without a voice file:
You: "Write a LinkedIn post about Azure Private Endpoints"
Claude: "I'm excited to share my thoughts on Azure Private Endpoints!
Private Endpoints are a powerful feature that enables secure
connectivity to Azure services. Here are 5 things you should know..."
→ Generic opening
→ Listicle format you'd never use
→ No opinion, no experience, no edge

With knowledge/linkedin_voice.md loaded:
You: "Write a LinkedIn post about Azure Private Endpoints"
Claude: Reads voice file → matches your patterns
"Every week I see the same architecture mistake.
A team deploys 40 Azure resources. Locks them down with NSGs,
firewalls, conditional access. Then connects their SQL Database
over a public endpoint.
Private Endpoints fix this. But they're set up wrong more than
they're set up right. Here's what I keep seeing..."
→ Opens with observation from experience
→ States a position
→ Matches your sentence rhythm

Same topic. Completely different output. The knowledge file is the difference.
Your knowledge files aren't prompts. They're voice DNA. Write them like you're briefing a ghostwriter who needs to sound exactly like you. Include real examples of your writing — 3-5 paragraphs that capture your rhythm. That's worth more than a page of abstract style rules.
Knowledge File Types
You don't need one for everything. Start with these:
| File | Purpose | When Claude Reads It |
|---|---|---|
| linkedin_voice.md | LinkedIn tone, format, character limits | Before any LinkedIn draft |
| writing_style.md | Long-form voice for scripts and articles | Before video scripts, blog posts |
| content_strategy.md | Pillars, topics, audience segments | During brainstorming and ideation |
| about_me.md | Credentials, experience, brand identity | When establishing authority in content |
Build from there based on your platforms and content types.
Mistake 4: No Persistence Layer
The Symptom
Every session starts from scratch. Claude doesn't know what content you've already created. It suggests topics you covered last week. It can't tell you what's in your pipeline. You re-explain decisions, re-share context, and re-establish the state of your project every time you sit down.
Why It Happens
Claude Code doesn't persist state between sessions by default. There's no hidden memory, no conversation history carried over, no automatic "last time we were working on..." context. Each session is a clean slate.
People treat Claude Code like a chat app that remembers their history. It doesn't. When the session ends, the context is gone. If you didn't write it down somewhere Claude can read it, it doesn't exist.
The Fix
Write outputs to local files. Maintain index files. Give Claude something to read at the start of each session that tells it exactly where things stand.
The principle: vault is source of truth. External tools are downstream sync.
Here's what that looks like in practice. Instead of storing your content pipeline in Notion and querying it through MCP (slow, token-expensive, unreliable for filtering), you maintain a local file:
vault/_master_index.json:

```json
{
  "entries": [
    {
      "id": "li-20260405-private-endpoints",
      "channel": "linkedin",
      "title": "Private Endpoints Are Set Up Wrong More Than Right",
      "lifecycle": "published",
      "date_published": "2026-04-05",
      "topic_keywords": ["azure", "private-endpoints", "networking"],
      "cross_references": ["blog-20260401-pe-guide", "yt-20260320-pe-demo"]
    },
    {
      "id": "yt-20260410-bicep-modules",
      "channel": "youtube",
      "title": "Bicep Modules — Stop Writing Monolithic Templates",
      "lifecycle": "drafting",
      "topic_keywords": ["azure", "bicep", "iac", "modules"]
    }
  ]
}
```

One file. One read at session start. Claude now knows:
- What you've published — no duplicate suggestions
- What's in progress — it can pick up where you left off
- Cross-channel relationships — it knows the LinkedIn post relates to the YouTube video and the blog article
- Topic coverage — it can find gaps instead of retreading old ground
This single file replaces dozens of Notion API calls. It loads in a fraction of the tokens. And it never fails because the API is down.
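A session-start script (or a skill) can answer those questions in a few lines. Here's a sketch in Python, assuming the index schema shown above; field names like lifecycle and topic_keywords come from that example, not from any fixed spec:

```python
import json

def load_entries(path="vault/_master_index.json"):
    """Read the master index once at session start."""
    with open(path) as f:
        return json.load(f)["entries"]

def covered_keywords(entries):
    """Keywords already used by published content."""
    return {
        kw
        for e in entries
        if e.get("lifecycle") == "published"
        for kw in e.get("topic_keywords", [])
    }

def is_fresh_topic(keywords, entries):
    """True if none of the proposed keywords are already covered."""
    return not set(keywords) & covered_keywords(entries)

def in_progress(entries):
    """Drafts Claude can pick up where you left off."""
    return [e["id"] for e in entries if e.get("lifecycle") == "drafting"]
```

With helpers like these, "have I covered this?" becomes a local set lookup instead of a round-trip through an external API.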
The Architecture
Session starts → Claude reads vault/_master_index.json
→ Knows everything you've created
→ Knows what's in progress
→ Can suggest what's missing
→ Writes new content to vault/
→ Updates the index
→ Syncs to Notion/Buffer/wherever (downstream, non-blocking)
If the downstream sync fails — Notion is slow, Buffer API times out — your content is safe. It's in the vault. The sync can retry later. You never lose work because an external service had a hiccup.
Start simple. You don't need a complex index on day one. Even a single projects.md file that lists your active projects, their status, and key decisions is better than nothing. Grow the persistence layer as your usage grows.
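That starter file can be as small as this. A hypothetical projects.md, long before you need a JSON index (the project names are placeholders from this article's examples):

```markdown
# Active Projects

## blog-private-endpoints
- Status: drafting, intro done
- Decision: lead with the misconfiguration story, not the feature list

## yt-bicep-modules
- Status: outline approved, recording Friday
```

Claude reads it in seconds, and your next session starts where the last one ended.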
Mistake 5: Not Using Progressive Disclosure
The Symptom
Complex tasks hit token limits. Claude loses context halfway through a long conversation. Sessions start slow because everything loads upfront. You notice Claude "forgetting" instructions you gave it earlier — not because it's broken, but because the context window filled up and older content got pushed out.
Why It Happens
Loading everything at session start — all skill definitions, all knowledge files, all MCP tool schemas — is like importing every module in your application at boot time. Most of it won't be used. All of it costs tokens.
A skill definition might be 5,000 tokens. A voice file, 2,000 tokens. A strategy document, 3,000 tokens. If you have 10 skills and 6 knowledge files, that's potentially 60,000+ tokens loaded before Claude processes your first message.
Your context window is finite. Every token spent on "just in case" context is a token unavailable for actual work.
The Fix
Structure your skills and knowledge files so they load in stages. Only load what's needed, when it's needed.
Level 1: Frontmatter only (~100 tokens per skill)

```yaml
---
name: "linkedin"
description: "Draft, discover, and publish LinkedIn posts"
---
```

This is all Claude needs to know a skill exists. It scans these at session start to know what's available. Total cost for 10 skills: ~1,000 tokens.
Level 2: Full SKILL.md (~2,000-5,000 tokens)
Loaded only when the skill is invoked. You type /linkedin draft — now the full instructions load. Not before.
Level 3: Reference files (~1,000-3,000 tokens each)
Loaded only when the skill explicitly needs them. The LinkedIn skill loads linkedin_voice.md before drafting but not before discovering topics. The blog skill loads techrupt_voice.md only when writing for Techrupt, not for every blog post.
How This Looks in Practice
Session start:
→ Scan 10 skill frontmatters: ~1,000 tokens
→ Read CLAUDE.md: ~800 tokens
→ Total startup cost: ~1,800 tokens
You type: /linkedin draft "Azure Private Endpoints"
→ Load full linkedin/SKILL.md: ~3,000 tokens
→ Load linkedin_voice.md: ~2,000 tokens
→ Load shared-context.md: ~1,500 tokens
→ Total for this task: ~6,500 tokens
Compare to loading everything upfront: ~60,000+ tokens
That's roughly 8,300 tokens instead of 60,000, a 7x reduction even counting the task-specific loads. Which means far more context available for your actual work.
Think of it like lazy loading in web development. You don't ship the entire JavaScript bundle on page load. You code-split and load modules when the user navigates to them. Same principle, applied to AI context windows.
The Three-Layer Pattern
| Layer | What Loads | When | Token Cost |
|---|---|---|---|
| Frontmatter | Skill name + description | Session start | ~100 per skill |
| Full instructions | Complete SKILL.md | Skill invoked | ~2,000-5,000 |
| References | Voice files, configs, guides | Task requires them | ~1,000-3,000 each |
Design every skill with this pattern. Frontmatter tells Claude what's available. Full instructions tell it how to execute. References give it the context for quality output.
The Fix Checklist
One table. Five fixes. Print this out.
| Mistake | Symptom | One-Line Fix | Time to Fix |
|---|---|---|---|
| No CLAUDE.md | Claude doesn't know your project | Create CLAUDE.md at project root with overview, architecture, rules | 15 minutes |
| MCP bloat | Sessions feel short, tokens burn fast | Disable unused MCP servers, use project-level settings | 5 minutes |
| No knowledge files | Generic, inconsistent output | Create knowledge/*.md voice profiles for each platform | 30 min per file |
| No persistence | Starts from scratch every session | Write to local vault files, maintain a master index | 20 minutes |
| No progressive disclosure | Token waste, context overflow | Structure skills in layers: frontmatter, instructions, references | Ongoing |
The Order That Matters
If you're going to fix these one at a time, do them in this order:
1. CLAUDE.md first. This is the highest-impact, lowest-effort fix. Fifteen minutes of writing saves you 5 minutes of re-explaining at the start of every session. It pays for itself on the second use.
2. MCP audit second. Five minutes of disabling unused servers can double your effective session length. The ROI is immediate.
3. Knowledge files third. These take longer to write — you need to actually study your own voice and document it — but the output quality improvement is dramatic. Start with the platform you use most.
4. Persistence fourth. This requires a bit of architectural thinking, but even a simple status file is better than nothing. Start with a single JSON file that tracks your active projects.
5. Progressive disclosure last. This is the most sophisticated fix and only matters once you have enough skills and knowledge files that loading them all becomes a problem. For most people, the first four fixes eliminate 90% of the waste.
The Bigger Picture
Here's what all five mistakes have in common: they're all about giving Claude the right context at the right time.
Too little context (no CLAUDE.md, no knowledge files, no persistence) and you get generic output that doesn't match your project or voice.
Too much context (MCP bloat, no progressive disclosure) and you burn tokens on overhead instead of actual work.
The sweet spot is deliberate context — exactly what Claude needs for the current task, loaded when it's needed, stored where it persists.
Get that right and Claude Code stops feeling like a chatbot that forgets everything. It starts feeling like a tool that knows your project, matches your voice, and picks up where you left off.
That's the difference between burning credits and multiplying your output.
What's Next
- If you work with Azure, check out The Cloud Engineer's Field Guide to Claude Code — it covers how I use Claude Code for day-to-day infrastructure work.
- Building labs or training content? How I Use Claude Code to Build Cloud Labs 10x Faster walks through my complete workflow.
- For the full skill and MCP setup walkthrough, see Claude Code Skills and MCP Setup.