I've worked through most of the field: GitHub Copilot, Cursor, Kilo Code, Cline, Aider, Windsurf, Continue, Supermaven, Amazon Q, Tabnine. Some are good. Cursor in particular is a serious tool. But after running Claude Code on real production work for several months, I use the others significantly less — and the gap isn't about the underlying model. It's about how the agent is structured around it.

I use it daily on data engineering and AI platform work: ClickHouse pipelines, FastAPI services, LLM orchestration layers, analytics schemas. This is a direct account of what that looks like: where it saves real hours, where it produces quietly wrong output you have to catch, and what separates the people who get real leverage from it from those who tried it once and moved on.

What Claude Code actually is

Claude Code runs in your terminal inside a project directory. It reads files, writes code, executes commands, and loops — chaining multiple actions before stopping for input. Not autocomplete. Not a chat window where you paste snippets and get text back. It operates on your actual codebase, with full context of the files it's been given access to.

The model underneath is Claude Sonnet or Opus, but the model isn't the differentiator. The difference between Claude Code and pasting the same prompt into the API is the loop and the context: the agent reads the relevant files itself, runs the code to check its changes actually work, and iterates without you stage-managing every step. That feedback loop is what makes it categorically different from a chat interface.

Where it genuinely saves time in data engineering

01
Schema design & migration
Generating ClickHouse table definitions, materialized views, and migration scripts from a plain-language description of the data model. Handles ReplacingMergeTree, partitioning strategy, and index choices well when given enough context.
02
FastAPI boilerplate
Endpoint scaffolding, Pydantic model generation, dependency injection patterns — the parts of building an API that are mechanical but tedious. The agent completes these in one pass with consistent style if the project already has conventions established.
03
Pipeline debugging
Give it a failing test or an error trace, point it at the relevant files, and it traces the logic path. Faster than stepping through manually for classes of bugs where the root cause is in the data transformation rather than an environmental issue.
04
Test coverage
Writing unit and integration tests for existing code is where the time savings are most consistent. The agent reads the implementation and generates tests that cover the actual logic paths — not the obvious happy paths a developer writes when they're tired.

These tasks share a structure: clear input, verifiable output. The agent can run the migrations, execute the tests, hit the endpoint. When it can check its own work, results are consistent. When it can't — when correctness requires human judgement rather than a green test suite — that's where the limits start to show.
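
To make the first category concrete, here is roughly the kind of migration script that comes out of a well-specified schema task. It's a minimal sketch, not code from the project: the table and column names are invented, it assumes the clickhouse-connect client, and the conventions (ReplacingMergeTree with updated_at as the version column, monthly partitioning) are the ones from the CLAUDE.md shown further down.

Migration sketch (Python)
# Minimal sketch of a schema migration of the kind described above.
# Table and column names are invented; conventions mirror the CLAUDE.md
# later in this post (ReplacingMergeTree, updated_at version column,
# monthly partitioning). Assumes the clickhouse-connect client.
import clickhouse_connect

DDL = """
CREATE TABLE IF NOT EXISTS analytics.brand_mentions
(
    brand_id      UInt64,
    mention_date  Date,
    source        LowCardinality(String),
    sentiment     Float32,
    is_verified   UInt8,
    updated_at    DateTime
)
ENGINE = ReplacingMergeTree(updated_at)
PARTITION BY toYYYYMM(mention_date)
ORDER BY (brand_id, mention_date, source)
"""

def run_migration() -> None:
    # Connection details would come from config/env in the real project.
    client = clickhouse_connect.get_client(host="localhost")
    client.command("CREATE DATABASE IF NOT EXISTS analytics")
    client.command(DDL)
    # Verifiable output: the table either exists afterwards or the run fails.
    print(client.command("EXISTS TABLE analytics.brand_mentions"))

if __name__ == "__main__":
    run_migration()

The point is the verifiability: the agent can run this, check the result, and iterate if ClickHouse rejects the DDL.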

The CLAUDE.md file is not optional

Every project I run Claude Code on has a CLAUDE.md at the root. The agent reads it at the start of every session. It's a plain text file, and writing it well is the single most important thing you can do to make the agent useful beyond the first hour.

A thin CLAUDE.md that lists the tech stack produces mediocre results. A real one covers the data model, naming conventions, architectural decisions that aren't obvious from the code, and explicit no-go zones — files or patterns the agent shouldn't touch without asking. That's the difference between an agent that's reliably useful and one that's constantly producing work you have to correct.

From experience

Write your CLAUDE.md as if onboarding a developer who reads fast, learns immediately, but has zero institutional memory. Every implicit convention you leave out is a mistake you'll see twice — once when the agent makes it, once when you fix it.

For ClickHouse work specifically: document your schema conventions in detail. The agent doesn't know your project uses ReplacingMergeTree everywhere, that deduplication runs on a specific schedule, or that a particular column is nullable for a historical reason nobody wants to revisit. Without that context, it generates technically valid SQL that's contextually wrong — and technically valid wrong SQL is the worst kind.

Here's what a working CLAUDE.md for a ClickHouse/FastAPI project looks like — not a template, but an actual stripped-down version of the kind of file that produces consistent results:

CLAUDE.md
# Project: Market Intelligence Platform

## Stack
- Python 3.12, FastAPI 0.115
- ClickHouse (ReplacingMergeTree on all analytics tables)
- MongoDB for raw storage and processing queues
- Redis for API-layer caching

## ClickHouse conventions
- All tables use ReplacingMergeTree with `updated_at` as version column
- Deduplication job runs every 15 min — never assume instant dedup
- Partition by toYYYYMM(date_col) on all time-series tables
- Aggregations go in materialized views only — no GROUP BY on raw tables in API layer
- Column naming: snake_case, booleans prefixed with is_ or has_

## FastAPI conventions
- All endpoints are async
- Auth via dependency injection — see auth/deps.py
- Response models always explicit — never return raw dicts
- Pagination: cursor-based only, max 100 results, no offset

## Never touch without asking
- clickhouse/migrations/ — discuss before modifying existing migrations
- auth/ — no changes without explicit approval
- .env files — never read or modify

And here's what a typical working session looks like — the agent reading context, planning, executing, and verifying, without you managing each step:

Terminal session
$ claude "Add cursor-based pagination to /brands endpoint.
  Max 100 results per page. Same pattern as /keywords."

 Reading routes/keywords.py...
 Reading routes/brands.py...
 Reading models/pagination.py...

I'll follow the cursor pattern in /keywords. Plan:
  1. Add CursorPage[BrandResponse] return type to GET /brands
  2. Add optional cursor: str parameter
  3. Update ClickHouse query to filter by cursor value
  4. Add tests in tests/routes/test_brands.py

Proceed?

> yes

 Editing routes/brands.py...
 Editing tests/routes/test_brands.py...
 Running pytest tests/routes/test_brands.py...

  4 passed in 1.12s

Done. Endpoint now returns next_cursor in the response body.
Existing tests unaffected.
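
For a sense of what the resulting diff contains, this is a sketch of how the paginated route might end up looking. CursorPage, BrandResponse, and the cursor/limit parameters come from the session above; the fields, the fetch helper, and the cursor encoding are invented for illustration.

routes/brands.py (sketch)
from typing import Generic, TypeVar
from fastapi import APIRouter, Query
from pydantic import BaseModel

T = TypeVar("T")

class CursorPage(BaseModel, Generic[T]):
    # Explicit response model per project convention; never a raw dict.
    items: list[T]
    next_cursor: str | None = None  # opaque cursor, None when there is no next page

class BrandResponse(BaseModel):
    brand_id: int
    name: str
    mention_count: int

router = APIRouter()

async def fetch_brands(after_id: int, limit: int) -> list[BrandResponse]:
    # Stand-in for the real ClickHouse query; in the project this filters
    # WHERE brand_id > {after_id} ORDER BY brand_id LIMIT {limit}.
    return []

@router.get("/brands", response_model=CursorPage[BrandResponse])
async def list_brands(
    cursor: str | None = Query(default=None),
    limit: int = Query(default=100, le=100),  # convention: cursor-based, max 100
) -> CursorPage[BrandResponse]:
    after_id = int(cursor) if cursor else 0
    rows = await fetch_brands(after_id=after_id, limit=limit)
    # A full page means there may be more rows; the cursor is the last id seen.
    next_cursor = str(rows[-1].brand_id) if len(rows) == limit else None
    return CursorPage(items=rows, next_cursor=next_cursor)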

How the agentic loop actually works

The biggest difference between basic and advanced usage is knowing when to let the agent run autonomously and when to keep it on a short leash. Claude Code supports both: full autonomous execution where it chains steps end to end without stopping, and interactive mode where it proposes each change and waits.

Autonomous mode works well for low-risk, well-defined tasks — a new API endpoint that follows an established pattern, tests for a module with clear contracts. For anything touching core architecture, production data migrations, or code paths that interact with external systems: interactive, every time. Not because the agent gets it wrong more often in those cases, but because the cost of one wrong step isn't symmetric with the cost of one extra confirmation.

01
Give it a concrete task, not a direction
"Add pagination to the /brands endpoint, max 100 results per page, cursor-based, same pattern as /keywords" works. "Improve the API" doesn't. The more specific the deliverable, the less the agent needs to make scope decisions — which is where things go sideways.
02
Point it at the right files first
Reference the relevant files explicitly in the prompt (@-mentioning them works) before starting. The agent can search the codebase itself, but naming the files up front avoids spending tokens on exploration and occasionally loading the wrong ones. In a large project this matters.
03
Let it run tests, not just write them
Claude Code can execute your test suite directly. If you give it permission to run pytest or your equivalent, it closes the loop: write code, run tests, fix failures, repeat. This is the most productive configuration for feature work; without execution, it's writing code blind. A sketch of the kind of test file this loop runs against follows the list.
04
Review the diff, not just the output
Before accepting changes, read the git diff. The agent occasionally makes adjacent changes that seem reasonable in isolation but conflict with something elsewhere. The diff review is where you catch these — and it's usually faster than a full code review because you know what you asked for.
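
The test half of that loop, continuing the /brands example from the session: a sketch of the kind of test file the agent writes and then runs. The module path and response fields follow the sketches above; everything here is illustrative rather than lifted from the project.

tests/routes/test_brands.py (sketch)
from fastapi import FastAPI
from fastapi.testclient import TestClient

from routes.brands import router  # module path as referenced in the session

app = FastAPI()
app.include_router(router)
client = TestClient(app)

def test_brands_returns_cursor_page():
    resp = client.get("/brands")
    assert resp.status_code == 200
    body = resp.json()
    assert "items" in body
    assert "next_cursor" in body

def test_brands_rejects_limit_over_100():
    # CLAUDE.md convention: max 100 results per page.
    resp = client.get("/brands", params={"limit": 101})
    assert resp.status_code == 422

def test_brands_short_page_has_no_cursor():
    resp = client.get("/brands", params={"limit": 100})
    body = resp.json()
    # Fewer rows than the limit means there is no next page to point at.
    if len(body["items"]) < 100:
        assert body["next_cursor"] is None

With permission to run pytest, the agent executes exactly this kind of file after each edit and fixes what fails before coming back to you.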

Where it breaks down

Three failure patterns show up repeatedly. Worth understanding each before you've been burned by them.

Long-range consistency. The agent produces code that's locally correct — clean, well-structured, passes tests — but conflicts with a convention established somewhere else in the project. A different error-handling pattern, a different naming style for async functions, a slightly different abstraction for the same concept. It doesn't look wrong in isolation. It looks wrong six weeks later when two parts of the system don't fit together cleanly. This is why the CLAUDE.md conventions matter and why the diff review isn't optional.

Underspecified tasks. Give the agent a vague brief and it makes scope decisions. Those decisions are often plausible. Plausible and correct are not the same thing. "Improve the caching layer" produces something. Rarely the thing you wanted. The cost of fixing an agent's interpretation is higher than the cost of writing a precise spec upfront — which is the same lesson every engineering team learns with junior developers, and tends to get re-learned with agents.

Niche infrastructure. Claude Code is strong on mainstream stacks. ClickHouse is not mainstream. The quirks that matter at scale — aggregations on ReplacingMergeTree tables before deduplication has run, the behaviour of certain window functions, how partitioning interacts with query performance on large datasets — these are the places where the agent produces queries that look right, run without errors, and return wrong numbers. Always verify generated analytical queries on real data. Not representative samples. Real data, real scale.
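
The ReplacingMergeTree case is worth a concrete illustration, because the failure is silent. A minimal sketch, assuming the clickhouse-connect client and the invented brand_mentions table from earlier: the naive count runs cleanly but counts every not-yet-merged version of an updated row, while FINAL (or an argMax-style aggregation on the ORDER BY key) is what makes the number correct.

Dedup pitfall (Python)
import clickhouse_connect

client = clickhouse_connect.get_client(host="localhost")

# Looks right, runs without errors, returns inflated numbers: every updated
# row the background merge hasn't collapsed yet is counted once per version.
naive = client.command(
    "SELECT count() FROM analytics.brand_mentions "
    "WHERE mention_date >= '2025-01-01'"
)

# FINAL forces deduplication at query time. Correct, but heavier; on large
# tables the usual fix is argMax / GROUP BY on the ORDER BY key, or reading
# from a materialized view as the CLAUDE.md conventions require.
deduped = client.command(
    "SELECT count() FROM analytics.brand_mentions FINAL "
    "WHERE mention_date >= '2025-01-01'"
)

print(f"naive={naive} deduped={deduped}")  # these can disagree between merges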

The honest take

Claude Code doesn't change what good engineering looks like. It changes how fast you can do it. The judgement calls — what to build, how to structure it, where the abstraction boundaries belong — still require someone who understands the system. What the agent eliminates is the mechanical gap between a decision and its implementation. That gap is larger than most engineers admit.

Using it across a team

Individual productivity is the easy case. The harder question is how to integrate an AI coding agent into a team that has shared conventions, shared ownership, and people at different experience levels. A few things hold up in practice.

The CLAUDE.md becomes team documentation. It encodes the same institutional knowledge as your internal wiki, but in a form the agent can act on. Keeping it current forces the team to articulate conventions explicitly — naming rules, architectural decisions, things that are obvious to the person who made them and invisible to everyone else. That clarity is valuable independent of the agent.

Point it at the backlog nobody touches. Test coverage for legacy code. Stale documentation. Scripts written three years ago that technically work but violate every current convention. These are high-effort, low-excitement tasks where the cost of a mistake is low and the review burden is minimal. Claude Code clears them faster than a developer sprint, with less resentment.

The failure mode is invisible debt. If the agent writes something and nobody reviews it closely enough to understand it, you've shipped something unmaintainable. The agent accelerates implementation. It doesn't substitute for comprehension. Teams that treat it as a shortcut to skip understanding are building a codebase that nobody will be able to explain six months from now — which is a problem that's older than AI and not going away.

What advanced usage actually looks like

The gap between basic and advanced usage has nothing to do with prompt tricks. It's context, specificity, and review discipline — applied consistently.

Advanced usage looks like: a CLAUDE.md detailed enough that the agent rarely needs to ask about conventions, tasks scoped tightly enough that it doesn't have to make architectural decisions, a genuine habit of reading diffs rather than just running tests, and a clear internal sense of which task types belong in autonomous mode and which ones don't.

None of this is complicated. All of it requires deliberate setup that most people skip because they're eager to see what the agent can do. The teams getting real leverage are the ones who did the groundwork first. Without it, you have a faster autocomplete. With it, you have something that noticeably changes how much your team can ship.

Michele Mader
Technical Leader · Fortop S.R.L.

I lead the technical direction of AI-driven data products for enterprise clients — defining architecture, making stack decisions, and owning delivery from roadmap to production.

Connect on LinkedIn