“Why Johnny Can’t Prompt”

    status: unread (2023-06-30 17:06:56)

    Why Johnny Can’t Prompt: How Non-AI Experts Try (and Fail) to Design LLM Prompts

    @inproceedings{zamfirescu-pereira2023johnny,
        author = {Zamfirescu-Pereira, J.D. and Wong, Richmond Y. and Hartmann, Bjoern and Yang, Qian},
        title = {Why {Johnny} Can’t Prompt: How Non-{AI} Experts Try (and Fail) to Design {LLM} Prompts},
        year = {2023},
        isbn = {9781450394215},
        publisher = {Association for Computing Machinery},
        address = {New York, NY, USA},
        url = {https://doi.org/10.1145/3544548.3581388},
        doi = {10.1145/3544548.3581388},
        abstract = {Pre-trained large language models (“LLMs”) like GPT-3 can engage in fluent, multi-turn instruction-taking out-of-the-box, making them attractive materials for designing natural language interactions. Using natural language to steer LLM outputs (“prompting”) has emerged as an important design technique potentially accessible to non-AI-experts. Crafting effective prompts can be challenging, however, and prompt-based interactions are brittle. Here, we explore whether non-AI-experts can successfully engage in “end-user prompt engineering” using a design probe—a prototype LLM-based chatbot design tool supporting development and systematic evaluation of prompting strategies. Ultimately, our probe participants explored prompt designs opportunistically, not systematically, and struggled in ways echoing end-user programming systems and interactive machine learning systems. Expectations stemming from human-to-human instructional experiences, and a tendency to overgeneralize, were barriers to effective prompt design. These findings have implications for non-AI-expert-facing LLM-based tool design and for improving LLM-and-prompt literacy among programmers and the public, and present opportunities for further research.},
        booktitle = {Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems},
        articleno = {437},
        numpages = {21},
        keywords = {end-users, design tools, language models},
        location = {Hamburg, Germany},
        series = {CHI '23}
    }

    bibtex from doi