Appendix III. Code Generation Tools and Search

    tags: diss
    December 16th, 2022

    The latest generation of plugins for IDEs that some people121 suggest might replace web search are those that support code generation directly from comments written within the code. This is a variant of a larger class of tools, called code generation tools.122 One such plugin is GitHub’s Copilot, based on the OpenAI Codex model, itself based on the Generative Pre-trained Transformer (GPT) models from OpenAI. My interview research generally did not directly address GitHub’s Copilot.123 The data engineers I asked about it had not used it. I will not go into the technical mechanisms of these systems, other than to note that they are designed to take a prompt and predict the most likely text strings to follow. If I type “Mary had a” into the OpenAI GPT-3 Playground124, the system completes the nursery rhyme.
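
    As a rough illustration of what "predict the most likely text strings to follow" means, the following is a minimal sketch of a toy word-pair (bigram) model. It is not how GPT models work internally (they are neural networks trained on vastly more text and conditioned on much longer contexts); the tiny corpus and the names `train_bigrams` and `complete` are assumptions introduced here purely for illustration.

```python
from collections import Counter, defaultdict

# Tiny illustrative corpus (an assumption for this sketch, not real training data).
CORPUS = (
    "mary had a little lamb little lamb little lamb "
    "mary had a little lamb its fleece was white as snow"
)


def train_bigrams(text):
    """Count, for each word, how often each other word directly follows it."""
    follows = defaultdict(Counter)
    words = text.split()
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows


def complete(prompt, follows, max_words=2):
    """Greedily append the most frequent next word, up to max_words times."""
    words = prompt.lower().split()
    for _ in range(max_words):
        candidates = follows.get(words[-1])
        if not candidates:
            break  # the last word was never seen with a follower
        words.append(candidates.most_common(1)[0][0])
    return " ".join(words)


model = train_bigrams(CORPUS)
print(complete("Mary had a", model))
```

    The sketch only looks one word back and always picks the single most frequent follower, which is enough to show the prompt-in, continuation-out shape of the interaction described above.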

    GitHub Copilot, free to GitHub-verified students, teachers, and maintainers of popular open source projects125, is trained on portions of the significant amount of code uploaded to GitHub, which raises legal, ethical, and security concerns.126 It can perhaps be imagined as an advanced autocomplete. Rather than suggesting the completion of a function name or command in your code, the plugin will suggest an entire block of code, perhaps the entire function. When the user types a comment or a line of code, the plugin will suggest a completion. Some users appear to have been satisfied with these suggestions. GitHub wrote a blog post in July 2022 reporting on a survey of Copilot users combined with data on their shown and accepted Copilot suggestions (Ziegler, 2022). The post claims “[u]sing GitHub Copilot correlates with improved developer productivity”. GitHub continues to publish reports along these lines (Kalliamvakou, 2022), while there is also intense interest from external researchers in the use and effects of GitHub Copilot.

    External researchers are particularly examining security vulnerabilities (some in a manner similar to that of Fischer et al. (2017) and others mentioned in Extending searching). A team from NYU and the University of Calgary examined code suggested by GitHub Copilot in scenarios developed to be relevant to MITRE’s “Top 25” Common Weakness Enumeration (a regularly updated list of significant software vulnerabilities). Across the 89 scenarios they had Copilot produce over 1,500 programs, of which they found approximately 40% to be vulnerable (Pearce et al., 2021). They recommend that Copilot “should be paired with appropriate security-aware tooling during both training and generation to minimize the risk of introducing security vulnerabilities.”

    Security concerns, and such precautions, are also acknowledged by GitHub127:

    You should take the same precautions as you would with any code you write that uses material you did not independently originate. These include rigorous testing, IP scanning, and checking for security vulnerabilities. You should make sure your IDE or editor does not automatically compile or run generated code before you review it.

    These suggested responses to vulnerabilities in Copilot mirror some of what I discussed in Extending searching around the evaluation of search results and the decoupling from search.

    The plugins for IDEs, and the voice-search, free-text (as in the OpenAI GPT-3 Playground128), and chat-based (e.g., ChatGPT129) language interfaces, are user interface components that, in the language of Handoff, provide distinct engagements for interactions. One of the benefits of using a general-purpose search engine is the contestability and interrogatability, perhaps not of the ranking of the websites on the SERP, but of the results. Data engineer searchers can look at the websites where they find information to gain clues as to its provenance and trustworthiness. GitHub Copilot and ChatGPT are black boxed and do not currently provide access for that, though GitHub has announced future product changes that will allow some interrogation130, beyond directly engaging with the system for alternatives.

    The tool designers will likely continue to improve the tool, and IDEs and companies may adapt practices to pull in such untested code in a way primed for effective testing. OpenAI and others continue to do research looking into the hazards posed by such tools (Khlaaf et al., 2022). Copilot does and will likely continue to replace some subset of the searching done by some data engineers. And, just like the promises of automatic programming in the past (Ensmenger, 2010), if it does lower the cost of programming it will likely only increase the demand for programmers. The programming languages used by my research participants are far simpler to use and understand than even the “automatic programming” languages of the past, like FORTRAN and COBOL. The hard problems remain: how to use a tool to do something you or someone else wants.

    A web search engine provides some means to find what others, not only the search engine, say about top-ranking search results, in addition to viewing the source website. This is a capacity of the configuration of web search that is leveraged in misinformation research. Some search engines provide interface options to learn general information about a website, for instance to see whether a website presenting itself as a news organization is identified by Google as one. Mike Caulfield’s SIFT model for basic fact-checking practices has four moves: Stop, Investigate the source, Find better coverage, Trace the original context (Caulfield, 2019b). Those moves are not supported by the Copilot configuration itself. Users of Copilot and other such tools will still find it helpful to refer to web search, and turn to sources of search repair, for learning things they do not already know.131 GitHub’s investments in Copilot come alongside a significant redesign of their search platform, for searching for code within GitHub (GitHub, 2022)132, suggesting a recognition from the creators of the tool that search will not be fully replaced.

    Bibliography

    Avgustinov, P. (2021). Improving GitHub code search | The GitHub Blog. https://github.blog/2021-12-08-improving-github-code-search/ [avgustinov2021improving]

    Caulfield, M. (2019b). SIFT (the four moves) | Hapgood. https://hapgood.us/2019/06/19/sift-the-four-moves/ [caulfield2019sift]

    Ensmenger, N. (2010). The computer boys take over: Computers, programmers, and the politics of technical expertise. The MIT Press. [ensmenger2010computer]

    Fischer, F., Böttinger, K., Xiao, H., Stransky, C., Acar, Y., Backes, M., & Fahl, S. (2017). Stack Overflow considered harmful? The impact of copy&paste on Android application security. 2017 IEEE Symposium on Security and Privacy (SP), 121–136. [fischer2017stack]

    GitHub. (2022). Introducing an all-new code search and code browsing experience | GitHub Changelog. https://github.blog/changelog/2022-11-09-introducing-an-all-new-code-search-and-code-browsing-experience/ [github2022introducing]

    Kalliamvakou, E. (2022). Research: Quantifying GitHub Copilot’s impact on developer productivity and happiness | The GitHub Blog. https://github.blog/2022-09-07-research-quantifying-github-copilots-impact-on-developer-productivity-and-happiness/ [kalliamvakou2022research]

    Khlaaf, H., Mishkin, P., Achiam, J., Krueger, G., & Brundage, M. (2022). A hazard analysis framework for code synthesis large language models. arXiv. https://doi.org/10.48550/ARXIV.2207.14157 [khlaaf2022hazard]

    Metaphor. (2021). Today we’re releasing wanderer 2. Metaphor. https://twitter.com/metaphorsystems/status/1428793313663111170 [metaphor2021today_tweet]

    Pearce, H. A., Ahmad, B., Tan, B., Dolan-Gavitt, B., & Karri, R. (2021). An empirical cybersecurity evaluation of GitHub Copilot’s code contributions. ArXiv, abs/2108.09293. [pearce2021asleep]

    Salva, R. J. (2022). Preview: Referencing public code in GitHub Copilot | The GitHub Blog. https://github.blog/2022-11-01-preview-referencing-public-code-in-github-copilot/ [salva2022preview]

    Ziegler, A. (2022). Research: How GitHub Copilot helps improve developer productivity | The GitHub Blog. https://github.blog/2022-07-14-research-how-github-copilot-helps-improve-developer-productivity/ [ziegler2022research]


    1. I am referring to popular commentary on Twitter from software developers and data engineers. ↩︎

    2. I have used an older tool, TabNine, based on an earlier generation of OpenAI’s GPT, since the summer of 2019. It is installed in my text editor, Sublime Text, which I use for all of my writing and Python coding (until early 2022, when I started using PyCharm from JetBrains to gain familiarity with the sorts of integrated development environments available to my interviewees). I have used it in my Python coding and in any writing that I’ve done. It runs locally on my machine and provides several predictions to suggest an autocomplete for almost any string that I type. In my prose writing it is particularly helpful for spelling suggestions and remembering the shorthand for inserting a citation. It has not replaced a significant part of my web searching. ↩︎

    3. The first technical preview of Copilot was available in the summer of 2021, with the subscription service starting in the summer of 2022. ↩︎

    4. https://beta.openai.com/playground ↩︎

    5. I signed up for a free trial and started paying $100.00/year in September 2022. (While I am notionally a student, the GitHub verification process does not recognize UC Berkeley’s filing fee status.) I have found it at times both very frustrating and very useful. The coding I do is very different from that of data engineers coding within their company codebase and systems. My use gives me only limited insight, and I rely mostly on the findings from my interviews on the larger practices of the data engineers and on external commentary and research on Copilot. ↩︎

    6. I will not address the legal and ethical concerns here except to note the broad sort of concerns and that they have some (but not total) similarity to early conversations and legal suits regarding web search. The concerns include questions about the copyright protection for and licenses of the uploaded code as well as GitHub’s terms of service, the power imbalance between GitHub, now owned by Microsoft, and the many developers who have uploaded code, and the absence of profit sharing. ↩︎

    7. https://github.com/features/copilot ↩︎

    8. https://beta.openai.com/playground ↩︎

    9. OpenAI released their GPT-driven ChatGPT for public beta testing on November 30th, 2022. One of my interviewees sent me a screenshot of them using it, asking the chatbot “How do data engineers use google search for their job?” The response read:

      Data engineers can use Google Search to find information and resources related to data engineering. This can include finding tutorials, documentation, and other resources that can help them with their work. For example, a data engineer might use Google Search to look up the syntax for a particular SQL query, find a tutorial on how to use a certain data analysis tool, or search for information on best practices for data engineering.

      Additionally, data engineers can use Google Search to stay-to-date on the latest developments and trends in the…

      ↩︎
    10. GitHub announced plans in November 2022 to identify public repositories that contain code matching a suggestion, above a particular character count (Salva, 2022). ↩︎

    11. Google and other search engines already use large language models, such as BERT, as components in their search engines. These components generally are used to support the same configurations in the user interface of the SERP, a list of links with some rich features. Some search engines, like metaphor.systems, a large language model trained on Hacker News posts (Metaphor, 2021), have adapted the search box to be a longer free-text field, but still provide a list of results. Another, Andi, operates as a chatbot that also provides a list of results. Other general-purpose search engines and search tools have also introduced distinct plugins on the SERP making use of generative AI. You.com, a general-purpose web search engine, has applications that provide generative AI tooling for code writing, short prose, and image generation from prompts directly within the search bar. ↩︎

    12. Significant investments into GitHub’s code search were announced in December 2021 (Avgustinov, 2021). While my research participants reported searching GitHub, they were generally searching for particular issues posted to repositories for debugging. ↩︎