
The best Large Language Models (LLMs) for Coding in 2024


The Best Large Language Models (LLMs) for coding tested by experts
The best Large Language Models (LLMs) for coding have been trained on code-related data and represent a new approach that developers are using to augment their workflows and improve efficiency and productivity. These coding assistants can be used for a wide range of code-related tasks, such as code generation, code analysis to help with debugging, refactoring, and writing test cases, as well as offering chat capabilities to discuss problems and inspire developers with solutions. For this guide we tested several different LLMs that can be used as coding assistants to work out which ones produce the best results in their given category.
The best large language models are an area of technology that is moving very quickly, so while we do our best to keep this guide as up to date as possible, you may want to check whether a newer model has been released and whether it fits your specific use case better.

The best large language models (LLMs) for coding

Best for Enterprises
Originally released in October 2021, GitHub Copilot is a version of Microsoft’s Copilot LLM specifically trained on data to assist coders and developers with their work, with the aim of improving efficiency and productivity. While the original release used OpenAI’s Codex model, a modified version of GPT-3 that was also trained as a coding assistant, GitHub Copilot was updated to use the more advanced GPT-4 model in November 2023.
A core feature of GitHub Copilot is the extension that integrates the LLM directly into the Integrated Development Environments (IDEs) popular among developers today, including Visual Studio Code, Visual Studio, Vim, Neovim, the JetBrains suite of IDEs, and Azure Data Studio. This direct integration allows GitHub Copilot to access your existing project to improve the suggestions made when given a prompt, while also giving users hassle-free installation and access to the features provided. For enterprise users, the model can also be granted access to your organization’s existing repositories and knowledge bases to further enhance the quality of outputs and suggestions.
When writing code, GitHub Copilot can offer suggestions in a few different ways. Firstly, you can write a prompt as an inline comment that is converted into a block of code. This works in a similar way to how you might use other LLMs to generate code blocks from a prompt, but with the added advantage that GitHub Copilot can use your existing project files as context to produce a better output. Secondly, GitHub Copilot can provide real-time suggestions as you write your code. For example, if you are writing a regex function to validate an email address, simply starting to write the function can trigger an autocomplete suggestion that provides the required syntax. You can also use the GitHub Copilot Chat extension to ask questions, request suggestions, and get help debugging code in a more context-aware fashion than you might get from LLMs trained on broader datasets. Users enjoy unlimited messages and interactions with GitHub Copilot’s chat feature across all subscription tiers.
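To make the comment-to-code workflow concrete, here is a rough sketch in Python (with a hypothetical validate_email function name) of the kind of completion Copilot might propose from an inline comment prompt; the actual suggestion will vary with your project context.

import re

# Inline prompt of the kind that triggers a Copilot suggestion in the editor:
# Validate that a string looks like a correctly formatted email address.
def validate_email(address: str) -> bool:
    # A pragmatic pattern: something@something.tld with no whitespace.
    # (Full RFC 5322 validation is considerably more involved.)
    pattern = r"^[^@\s]+@[^@\s]+\.[^@\s]+$"
    return re.fullmatch(pattern, address) is not None

print(validate_email("dev@example.com"))  # True
print(validate_email("not-an-email"))     # False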
GitHub Copilot is trained using data from publicly available code repositories, including GitHub itself. GitHub Copilot claims it can provide code assistance in any language where a public repository exists; however, the quality of the suggestions will depend on the volume of data available. All subscription tiers include a public code filter to reduce the risk of suggestions directly copying code from a public repository. For Business and Enterprise tier customers, GitHub Copilot excludes submitted data from being used to train the model further by default, and offers the ability to exclude files or repositories from being used to inform suggestions. Administrators can configure both features as needed based on your business use cases.
While these features aim to keep your data private, it’s worth keeping in mind that prompts aren’t processed locally and rely on external infrastructure to provide code suggestions, so you should factor this into whether this is the right product for you. Users should also be cautious about trusting any outputs implicitly: while the model is generally very good at providing suggestions, like all LLMs it is still prone to hallucinations and can make poor or incorrect suggestions. Always review any code generated by the model to make sure it does what you intend it to do.
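One practical way to review a suggestion before relying on it is to pin down the behaviour you expect with a few tests. Below is a minimal sketch using pytest against the hypothetical validate_email helper from the earlier example (the module name is also hypothetical).

# test_validate_email.py -- run with `pytest`
import pytest

from validate_email_example import validate_email  # hypothetical module name

@pytest.mark.parametrize("address, expected", [
    ("dev@example.com", True),
    ("first.last@sub.domain.org", True),
    ("missing-at-sign.com", False),
    ("spaces are@bad.com", False),
    ("trailing@dot.", False),
])
def test_validate_email(address, expected):
    # Make the intended behaviour explicit before trusting the generated code.
    assert validate_email(address) == expected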
In the future it’s possible that GitHub will upgrade GitHub Copilot to use the recently released GPT-4o model. GPT-4 was originally released in March 2023, with GitHub Copilot updated to use that model roughly seven months later. A further update would make sense given GPT-4o’s improved intelligence, reduced latency, and lower operating cost, though at this time there has been no official announcement.
If you want to try before you buy, GitHub Copilot offers a free 30-day trial of its cheapest package, which should be sufficient to test out its capabilities, with a $10 per month fee thereafter. Copilot Business costs $19 per user per month, while Enterprise costs $39 per user per month.

Best for individuals
CodeQwen1.5 is a version of Alibaba’s open-source Qwen1.5 LLM specifically trained on public code repositories to assist developers with coding-related tasks. This specialized version was released in April 2024, a few months after Qwen1.5 itself was released to the public in February 2024.
There are two different versions of CodeQwen1.5 available today. The base model is designed for code generation and suggestions but has limited chat functionality, while the second version can also be used as a chat interface that answers questions in a more human-like way. Both models have been trained on 3 trillion tokens of code-related data and support a very respectable 92 programming languages, including some of the most common languages in use today, such as Python, C++, Java, PHP, C#, and JavaScript.
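As a rough sketch of what running the chat variant locally can look like with the Hugging Face transformers library (assuming the weights are published under the Qwen/CodeQwen1.5-7B-Chat identifier and that you have enough GPU memory for fp16 inference of a 7B model):

# Minimal local inference sketch for the CodeQwen1.5 chat model.
# Assumes the transformers, torch and accelerate packages are installed.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/CodeQwen1.5-7B-Chat"  # assumed Hugging Face model identifier

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are a helpful coding assistant."},
    {"role": "user", "content": "Write a Python function that reverses a singly linked list."},
]

# Build the prompt in the chat format the model was fine-tuned on.
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=512)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))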
Unlike the base version of Qwen1.5, which is available to download in several different sizes, CodeQwen1.5 is only available in a single 7B size. While this is quite small compared to other models on the market that can be used as coding assistants, it brings a few advantages of its own. Despite its small size, CodeQwen1.5 performs incredibly well against some of the larger models that offer coding assistance, both open and closed source. CodeQwen1.5 comfortably beats GPT-3.5 in most benchmarks and provides a competitive alternative to GPT-4, though this can depend on the specific programming language. While GPT-4 may perform better overall, it’s important to remember that GPT-4 requires a subscription and has per-token costs that could make using it very expensive compared to CodeQwen1.5.