Leaders are calling for more AI use instead of hiring additional engineers, expecting developers to “10x” themselves. But there’s an art to actually being productive with AI coding assistants.
For starters, AI coding assistants have known strengths and weaknesses. AI is non-deterministic: the same prompt can produce different results, and assistants are prone to unexpected behaviors, such as randomly deleting code or introducing logic bugs, which can be a pain to grapple with.
Some limitations are inherent to the large language models (LLMs) that power coding assistants. Other errors, like code deletion or security gaps, can arise from how they’re used. And AI agents often get caught in recursive loops or endless testing cycles—a big productivity killer.
AI-assisted development is largely uncharted territory, an entirely new muscle that developers and tech leaders are just beginning to train. So, how do you get the most out of AI coding assistants?
Leaders in the field suggest a combination of new skills and tactics to improve the effectiveness of working with AI and the quality of your code. Consider these tips to realize a more successful working relationship with AI.
Improve your prompting skills
“The first step to leveraging AI coding assistants effectively is to begin with clear, well-defined prompts that address the domain-specific complexity of the codebase,” says Harry Wang, chief growth officer at Sonar.
In the report Guide to AI Assisted Engineering: 10x Your AI-Driven Development, developer productivity company DX outlines several prompting techniques that can significantly improve outcomes, such as the following (a brief sketch appears after the list):
- Meta-prompting: Embedding instructions within the prompt to help the model with its task.
- Prompt-chaining: Creating a chained workflow of prompts — good for specifications and planning.
- One-shot prompting: Including output references, like example code structures, in the prompt.
- System prompts: Updating the underlying system prompt to reflect project-specific conditions.
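For instance, prompt-chaining can be as simple as feeding one model response into the next prompt. Here is a minimal sketch using the OpenAI Python SDK; the model choice and the rate-limiter task are illustrative, not drawn from the DX report:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def ask(prompt: str) -> str:
    """Send a single prompt and return the model's text response."""
    response = client.chat.completions.create(
        model="gpt-4o",  # illustrative choice; use whatever model your team has vetted
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

# Prompt-chaining: the first response (a spec) becomes input to the second prompt.
spec = ask("Write a short functional spec for a request rate limiter.")
code = ask(f"Implement this spec in Python, with unit tests:\n\n{spec}")
print(code)
```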
Poor prompting also carries security implications, making it a skill worth sharpening. 2025 research from Backslash Security found that “naive” prompts led all major LLMs tested to generate code vulnerable to at least four of the 10 common weakness enumerations (CWEs) examined.
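To see why wording matters, compare a naive prompt with one that names the weaknesses to avoid up front. This contrast is hypothetical; the CWE reference is an example, not part of Backslash’s test set:

```python
# A naive prompt states only the functional requirement.
naive_prompt = (
    "Write a Python endpoint that looks up a user by name in our SQL database."
)

# A security-aware prompt names the weaknesses to avoid up front.
hardened_prompt = (
    "Write a Python endpoint that looks up a user by name in our SQL database. "
    "Use parameterized queries to prevent SQL injection (CWE-89), validate all "
    "input, and never log credentials or secrets."
)
```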
Keep the humans around
Developers with “moderate” generative AI usage were the highest performers, according to a 2024 report from BlueOptima, The Impact of Generative AI on Software Developer Performance, which examined 880 million commits from 218,354 enterprise software developers.
BlueOptima’s report indicates that the best outcomes come from an optimal balance between AI assistance and human expertise. In practice, that means outsourcing writing and validating code to AI while tasking humans with project design and final approval.
This places human checks on both ends of the workflow. “Professional software creation will gradually move to being human-defined, AI-developed, AI-verified, and human-approved,” says Sonar’s Wang.
Use the right LLM for the job
Choosing the right LLM is a trade-off between accuracy, speed, and cost. Use a cheap model for a complex job and you’ll get poor results. But use a powerful model for simple boilerplate code generation and you waste significant resources.
LLM Stats currently ranks Anthropic’s Claude 3.5 Sonnet highest in coding, citing the HumanEval benchmark. A separate study, SWE-Lancer: Can Frontier LLMs Earn $1 Million from Real-World Freelance Software Engineering?, also found Claude the best at completing real-world programming tasks.
Claude also ranks high in security. The aforementioned Backslash Security research found that Claude 3.7 Sonnet generates more secure code than OpenAI’s GPT-4o and Google’s Gemini.
Although Claude is a top AI programmer, LLM Stats lists OpenAI’s o3 as best at knowledge and DeepSeek’s R1 as best at reasoning. Other factors may be more critical, too. Gemini 1.5 Pro offers the largest context window, whereas Lambda is the most cost-effective.
“It’s really important to learn the boundaries of these tools and what they’re capable of to use them most effectively,” says Kevin Swiber, API strategist at Layered System. Swiber has developed a matrix to review AI coding agents across various dimensions, such as technical capabilities, workflow integrations, resource utilization, refactoring tasks, and debugging ability.
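One way to operationalize that trade-off is a simple routing table that maps task types to models. The sketch below is hypothetical; the identifiers and assignments loosely follow the rankings above, and any vetted models can be substituted:

```python
# Hypothetical task-to-model routing table, loosely following the rankings above.
# The identifiers are illustrative; substitute whatever models your team has vetted.
MODEL_FOR_TASK = {
    "boilerplate": "claude-3-5-haiku-latest",  # cheap and fast for simple scaffolding
    "coding": "claude-3-5-sonnet-latest",      # stronger model for complex changes
    "reasoning": "deepseek-r1",                # design questions and tricky logic
    "long-context": "gemini-1.5-pro",          # large inputs, such as whole repositories
}

def pick_model(task_type: str) -> str:
    """Return the model for a task type, defaulting to the general coder."""
    return MODEL_FOR_TASK.get(task_type, MODEL_FOR_TASK["coding"])

print(pick_model("boilerplate"))  # claude-3-5-haiku-latest
```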
Program and test iteratively
Experts recommend working piece by piece with AI. “Ask for small code changes, not big ones,” says Charity Majors, co-founder and CTO of Honeycomb. “Don’t generate an entire website at once, or an entire API at once, or an entire feature at once.”
If you’re working on a 3,000-line code file, you’ll probably need to break it apart. The thing is, refactoring with AI is challenging—AI has a habit of accidentally deleting code or moving things around without warning. “It will optimize for small things, and not keep big ideas in mind,” says Swiber.
As such, it’s a good idea not only to start small but also to test changes with each iteration. Majors recommends starting with an endpoint, component, or task, then generating tests, running the tests, generating more code, and so on.
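In practice, that loop can be enforced mechanically: request one small change, run the tests, and stop before building on broken code. A minimal sketch, in which the steps and the apply_ai_change stub are hypothetical placeholders for your actual assistant workflow:

```python
import subprocess

def tests_pass() -> bool:
    """Run the project's test suite and report whether it passed."""
    result = subprocess.run(["pytest", "-q"], capture_output=True, text=True)
    return result.returncode == 0

def apply_ai_change(instruction: str) -> None:
    """Placeholder: send one small instruction to your coding assistant."""
    print(f"assistant task: {instruction}")

# One small, tested step at a time, per Majors' advice.
steps = [
    "Add a GET /users/<id> endpoint that returns 404 for unknown ids.",
    "Generate unit tests for the new endpoint.",
    "Add input validation for the id parameter.",
]
for step in steps:
    apply_ai_change(step)
    if not tests_pass():
        break  # stop before building more code on top of a broken state
```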
Leave breadcrumbs
Certain AI coding issues can be solved by working in the code editor (as opposed to a chatbot interface) with a tool like GitHub Copilot, Cursor, or Continue that proposes changes as Git-style diffs. Some developers also have better luck making direct API calls.
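For the direct-API route, the raw call is straightforward. A minimal sketch using Anthropic’s Python SDK; it assumes ANTHROPIC_API_KEY is set in the environment, and the model name is illustrative:

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

message = client.messages.create(
    model="claude-3-5-sonnet-latest",  # illustrative; pick the model you've vetted
    max_tokens=1024,
    messages=[{
        "role": "user",
        "content": "Refactor this function to remove duplication:\n\n<paste code here>",
    }],
)
print(message.content[0].text)
```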
Another method is to plan things from the outset and walk the agent through the process. Swiber recommends composing a plan for AI assistants in a Markdown file that specifies the project’s goals and details your progress over time. He also recommends backing up original files so you can always revert to a previous version.
“Leave a trail of breadcrumbs, for yourself and agents to pick up,” Swiber says. You might even benefit from inline comments with explicit language, like “Don’t touch these lines.” “It’s an interesting practice to start leaving comments specifically for an LLM to not do something harmful,” he adds.
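Put together, the breadcrumbs might look like a pointer to the plan file plus guard-rail comments in the code itself. A hypothetical example:

```python
# See PLAN.md in the repo root for goals and current progress before editing.

# NOTE FOR AI ASSISTANTS: the retry values below are tuned from production
# incidents. Don't touch these lines, and don't "simplify" the retry loop
# when refactoring this module.
MAX_RETRIES = 3
RETRY_BACKOFF_SECONDS = 2.0
```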
Test, test, test
AI-generated code requires a thorough review if it’s ever to reach production. “Never ship something you don’t understand,” says Honeycomb’s Majors. “Don’t ship what you’ve generated until you understand what you’ve done.”
The massive productivity gains from AI development shouldn’t come at the expense of basic software engineering fundamentals. Testing is paramount from a security perspective. “Even if AI produces the code, humans will still be held accountable for its quality and security,” says Sonar’s Wang.
Arguably, AI creates the need for more testing. “AI-generated code needs even more rigorous review and testing to ensure that it’s correct, performant, and secure,” says Merrill Lutsky, CEO and co-founder at Graphite.
However, Lutsky adds that the traditional development cycle is becoming increasingly outmoded in the AI age. “Many companies are realizing that their ‘outer loop’ processes—code review, testing, and deployment—can’t keep up with the surge of AI-generated code changes,” he says.
Lutsky sees a clear opportunity in using AI to help solve the problems it creates. An AI agent, he suggests, could streamline devops, autonomously moving through review and testing and looping humans in as needed, reducing the manual steps that slow down CI/CD.
Focus on data access
Another tactic is arming AI with the proper context. “AI usually makes sound judgments when it has enough information,” says Spencer Kimball, CEO of Cockroach Labs. Although some limitations are inherent to today’s models, like context window sizes and external data accessibility, developers can still deploy certain strategies.
Feeding LLMs internal data, documentation, or entire projects can provide useful context. Making this material publicly available could even help train public LLMs on the nuances of your project (not to mention encourage AI to recommend your software to other engineers).
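A crude but effective version of this is simply prepending project documentation to the prompt. A minimal sketch, with hypothetical file paths and task:

```python
from pathlib import Path

# Prepend project docs so the model sees your conventions, not just the question.
docs = ["README.md", "docs/architecture.md"]  # hypothetical paths
context = "\n\n".join(Path(doc).read_text() for doc in docs)

prompt = (
    f"Project context:\n{context}\n\n"
    "Task: add a health-check endpoint consistent with the conventions above."
)
```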
Interestingly, Kimball views the newer, open-source-at-their-core companies as well-positioned in this era for exactly this reason: their source code and design documents are publicly available for LLM ingestion. “We need to be the obvious choice the AI recommends—we can appeal to the chief architects of the world,” he says.
At the very least, this is a strong case for using third-party tools built by organizations with a strong open source creed.
Significant headway is already being made around connectivity, too. For example, OpenAI’s agent SDK and Anthropic’s Model Context Protocol (MCP) are making strides in connecting AI with tools, data sources, and other AIs. This kind of automated intelligence will lead to massive productivity gains, Kimball predicts.
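As a concrete example, exposing an internal data source to assistants over MCP takes only a few lines with the Python SDK’s FastMCP interface. This sketch is illustrative; the server name, tool, and data are stand-ins:

```python
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("internal-data")  # illustrative server name

@mcp.tool()
def lookup_service_owner(service: str) -> str:
    """Return the owning team for an internal service (stubbed data here)."""
    owners = {"billing": "payments-team", "auth": "identity-team"}
    return owners.get(service, "unknown")

if __name__ == "__main__":
    mcp.run()  # an MCP-capable assistant can now call lookup_service_owner
```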
Future outlook
“We’re at a point of maturity where we all should be getting experience with these tools,” says Swiber. “For the stuff they’re good at doing, they’re a huge time-saver.”
AI coding assistants aren’t just for experimentation anymore. They’re set to become standard in enterprise development processes. Gartner predicts that by 2028, 75% of enterprise software engineers will be using AI coding assistants.
According to Kimball, we’re heading toward a future where we’ve democratized the creation of useful things, where there will be smaller, more bespoke markets. “Extraordinary new products and services will be created with AI,” he says.
Small-to-mid-sized companies are slated to get a significant boost, too. A company with $100 million in annual recurring revenue supported by just 15 people is no longer out of the question, Kimball says. With some recent Y Combinator startups reporting codebases that are 95% AI-generated, we’re clearly moving in that direction.
In the meantime, successful AI-assisted coding takes expert judgment to navigate the tooling and build truly productive workflows, especially when performance, cost, or quality is paramount. Results also hinge on supplying AI with the right context.
Knowing how to get the best results from AI coding tools is fast becoming yet another important skill in the developer’s toolkit.