My experiences with LLMs
I’ve been using LLMs for coding at work and in personal projects for over a year, and I want to capture and share my thoughts while they’re still fresh enough to look back on later.
The first proper LLM-based coding assistant I used was one of the earlier versions of ChatGPT. It could solve HackerRank questions, but it felt more like a gimmick than a professional tool.
GitHub Copilot was the first tool that felt useful professionally. It was far more effective than IntelliJ IDEA’s autocomplete and was good at boilerplate tests or single-file changes.
Cursor impressed me even more by running commands like npm install, mkdir, and curl directly. Then Claude Code topped that with better code, hooks, agents, and a strong CLI that worked well with Visual Studio Code.
For the first time, I was building a small game or command-line tool almost every day. Claude Code even generated usable frontend code and, with Playwright, proper UI tests.
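To give a flavor of what that looked like, here is a minimal sketch of the kind of Playwright UI test it would produce. The dev-server URL, button label, and test id are hypothetical placeholders, not from a real project:

```ts
// Minimal sketch of a Playwright UI test in the style Claude Code generated.
// The URL, button name, and test id below are assumptions for illustration.
import { test, expect } from '@playwright/test';

test('clicking Start shows the game board', async ({ page }) => {
  await page.goto('http://localhost:3000');                   // assumed local dev server
  await page.getByRole('button', { name: 'Start' }).click();  // assumed button label
  await expect(page.getByTestId('game-board')).toBeVisible(); // assumed data-testid
});
```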
The optimal way to use these tools isn’t set in stone yet. There are methodologies like BMAD, and plenty of tips on making the most of the available context window. MCPs add powerful capabilities, but they also contribute to context pollution. Extra documentation can help, but LLMs don’t always take it into account when generating new code. And unless you maintain it carefully, that documentation goes stale quickly.
Now I use Gemini, Windsurf, or Cursor daily at work, and ChatGPT and Gemini on my phone, but the novelty is gone. They’re just more tools: you need to operate them effectively, or they can waste your time or lie to you.
The hype around “100× productivity” has also mostly faded. Today, LLMs shine at boilerplate or small, well-defined changes, but they still need very careful use. Context limits, a confident tone that can mislead you, and hallucinations remain major issues. The “everything works now” followed by “I see the issue now!” one-two combo is almost a meme at this point. You really need to keep the LLM on track, especially if you are building something complex or serious.
Big players like OpenAI, Anthropic, Google, and Cursor keep pushing updates, but AGI feels no closer than it did two years ago.
The only moat for new companies is the millions of dollars they need to burn to keep data centers and models running. And because almost all the tools work similarly, there is close to zero brand attachment now.
I’m curious how this race, bubble, or transition will look in two years.