Does GitHub Copilot Improve Code Quality? (github.blog)
Microsoft-owned GitHub published a blog post asking "Does GitHub Copilot improve code quality? Here's what the data says."
Its first paragraph includes statistics from past studies — that GitHub Copilot has helped developers code up to 55% faster, leaving 88% of developers feeling more "in the flow" and 85% feeling more confident in their code.
But does it improve code quality? [W]e recruited 202 [Python] developers with at least five years of experience. Half were randomly assigned GitHub Copilot access and the other half were instructed not to use any AI tools... We then evaluated the code with unit tests and with an expert review conducted by developers.
Our findings overall show that code authored with GitHub Copilot has increased functionality and improved readability, is of better quality, and receives higher approval rates... Developers with GitHub Copilot access had a 56% greater likelihood of passing all 10 unit tests in the study, indicating that GitHub Copilot helps developers write more functional code by a wide margin. In blind reviews, code written with GitHub Copilot had significantly fewer code readability errors, allowing developers to write 13.6% more lines of code, on average, without encountering readability problems. Readability improved by 3.62%, reliability by 2.94%, maintainability by 2.47%, and conciseness by 4.16%. All numbers were statistically significant... Developers were 5% more likely to approve code written with GitHub Copilot, meaning that such code is ready to be merged sooner, speeding up the time to fix bugs or deploy new features.
"While GitHub's reports have been positive, a few others haven't," reports Visual Studio magazine: For example, a recent study from Uplevel Data Labs said, "Developers with Copilot access saw a significantly higher bug rate while their issue throughput remained consistent."
And earlier this year a "Coding on Copilot" whitepaper from GitClear said, "We find disconcerting trends for maintainability. Code churn — the percentage of lines that are reverted or updated less than two weeks after being authored — is projected to double in 2024 compared to its 2021, pre-AI baseline. We further find that the percentage of 'added code' and 'copy/pasted code' is increasing in proportion to 'updated,' 'deleted,' and 'moved' code. In this regard, AI-generated code resembles an itinerant contributor, prone to violate the DRY-ness [don't repeat yourself] of the repos visited."
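To make the churn definition quoted above concrete, here is a minimal Python sketch. It assumes each line of code is represented by the date it was authored and the date it was first reverted or updated (or `None` if it never was). The function name `churn_rate`, the data layout, and the sample dates are hypothetical illustrations, not GitClear's actual methodology or tooling.

```python
from datetime import date, timedelta
from typing import Optional

# Window taken from the quoted definition: "less than two weeks".
CHURN_WINDOW = timedelta(days=14)


def churn_rate(lines: list[tuple[date, Optional[date]]]) -> float:
    """Return the percentage of lines changed less than two weeks after authoring.

    Each entry is (authored_on, first_changed_on); first_changed_on is None
    when the line was never reverted or updated. This layout is an assumption
    made for illustration only.
    """
    if not lines:
        return 0.0
    churned = sum(
        1
        for authored, changed in lines
        if changed is not None and changed - authored < CHURN_WINDOW
    )
    return 100.0 * churned / len(lines)


# Hypothetical sample data, not taken from any study.
sample = [
    (date(2024, 1, 1), date(2024, 1, 5)),    # changed after 4 days: counts as churn
    (date(2024, 1, 1), date(2024, 3, 1)),    # changed after 60 days: not churn
    (date(2024, 1, 1), None),                # never changed: not churn
    (date(2024, 1, 10), date(2024, 1, 24)),  # exactly 14 days: not "less than" two weeks
]

print(churn_rate(sample))  # 25.0
```

Under this reading, a doubling of churn would mean the returned percentage for a given year is roughly twice the 2021 value, whatever the underlying line counts are.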