search audits

    July 5th, 2023

    This is an incomplete doc about search audits, or search engine audits.

    Search audits, a form of “algorithm audit”, are systematic approaches to evaluating the performance of a search tool by interacting with it in various ways and documenting the results. These audits may be exploratory, like Gerhart’s (2004) development of a methodology and exploration of five controversial subtopics across three search engines (Google, Teoma, and AllTheWeb) and two “multi-searchers”. Or they may examine a single topic much more intensively, like Urman et al. (2022), who constructed 200 virtual agents to search seven queries on one topic across six search engines.
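
    As a rough illustration of what "systematic" means here, the sketch below enumerates a hypothetical audit design sized like the Urman et al. (2022) example (200 agents, seven queries, six engines) and compares two result lists by URL overlap. The names `audit_plan` and `jaccard`, the placeholder agent, query, and engine labels, and the choice of Jaccard overlap as a comparison measure are assumptions made for illustration; they are not taken from any of the studies listed in this doc.

```python
from itertools import product


def audit_plan(agents, queries, engines):
    """Enumerate every (agent, query, engine) combination to run and log."""
    return [
        {"agent": a, "query": q, "engine": e}
        for a, q, e in product(agents, queries, engines)
    ]


def jaccard(results_a, results_b):
    """Share of distinct URLs that two result lists have in common."""
    a, b = set(results_a), set(results_b)
    if not a and not b:
        return 1.0  # two empty lists are treated as identical
    return len(a & b) / len(a | b)


# Hypothetical placeholders; a real audit would use actual queries and engines.
agents = [f"agent-{i:03d}" for i in range(200)]
queries = [f"query-{i}" for i in range(7)]
engines = [f"engine-{i}" for i in range(6)]

plan = audit_plan(agents, queries, engines)
print(len(plan))  # 200 * 7 * 6 = 8400 planned observations

# Overlap between two made-up result lists: 2 shared URLs out of 4 distinct.
overlap = jaccard(["u1", "u2", "u3"], ["u2", "u3", "u4"])
print(overlap)  # 0.5
```

    Writing the design out as an explicit list makes the scale of the data collection visible before any query is issued, and gives each logged results page a record to attach to.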

    See also Metaxa et al. (2021) for a more comprehensive treatment of audits of search engines. Kulshrestha et al. (2017), also listed below, and Trielli & Diakopoulos (2018) note the importance of examining the role of “input bias” (from the corpus and from the user, respectively).

    Some related non-audit work takes a much smaller subset of query-result pairs, perhaps a single example, and conducts extensive theoretical and functional examination of the causes, effects, and implications (see, for example, Noble (2018), Mulligan & Griffin (2018), or Haider & Rödl (2023)). Other work may focus more on aspects of the search experience outside of the search engine results page, like how the search engine presents itself in response to searcher complaints (Griffin & Lurie, 2022).

    Academic research

    Here is a very incomplete list of search audits in academic research (they may not always use the term “audit”):

    • Gerhart’s “Do Web search engines suppress controversy?” (2004), in First Monday. https://doi.org/10.5210/fm.v9i1.1111 [gerhart2004web]
    • Diaz’s “Through the Google Goggles: Sociopolitical Bias in Search Engine Design” (2008), in Web Search: Multidisciplinary Perspectives. [diaz2008through]
    • Jiang’s “Search Concentration, Bias, and Parochialism: A Comparative Study of Google, Baidu, and Jike’s Search Results From China” (2014), in Journal of Communication. [jiang2014search]
    • Mart’s “The algorithm as a human artifact: implications for legal [Re]Search” (2017), in Law Libr. J. https://scholar.law.colorado.edu/articles/755/ [mart2017algorithm]
    • Kulshrestha et al.’s “Quantifying search bias: Investigating sources of bias for political searches in social media” (2017), in Proceedings of the 2017 ACM Conference on Computer Supported Cooperative Work and Social Computing. [kulshrestha2017quantifying]
    • Robertson et al.’s “Auditing the Personalization and Composition of Politically-Related Search Engine Results Pages” (2018), in Proceedings of the 2018 World Wide Web Conference on World Wide Web - WWW. https://doi.org/10.1145/3178876.3186143 [robertson2018personalization]
    • Hagan & Li’s “Legal Help Search Audit: Are Search Engines Effective Brokers of Legal Information?” (2020). https://ssrn.com/abstract=3623333 [hagan2020legal]
    • Mustafaraj et al.’s “The Case for Voter-Centered Audits of Search Engines During Political Elections” (2020), in FAT* ’20. [mustafaraj2020case]
    • Lurie & Mulligan’s “Searching for Representation: A sociotechnical audit of googling for members of U.S. Congress” (2021), from FAccTRec Workshop: Responsible Recommendation. https://arxiv.org/abs/2109.07012 [lurie2021searching_facctrec]
    • Ulloa et al.’s “Scaling up search engine audits: Practical insights for algorithm auditing” (2022), in Journal of Information Science. https://doi.org/10.1177/01655515221093029 [ulloa2022scaling]
    • Urman & Makhortykh’s “‘Foreign beauties want to meet you’: The sexualization of women in Google’s organic and sponsored text search results” (2022), in New Media & Society. https://doi.org/10.1177/14614448221099536 [urman2022foreign]
    • Urman et al.’s “Where the earth is flat and 9/11 is an inside job: A comparative algorithm audit of conspiratorial information in web search results” (2022), in Telematics and Informatics. https://doi.org/10.1016/j.tele.2022.101860 [urman2022earth]
    • Urman et al.’s “Auditing the representation of migrants in image web search results” (2022), in Humanit Soc Sci Commun. https://doi.org/10.1057/s41599-022-01144-1 [urman2022auditing]

    Press

    Press reporting on search audits includes the following (these audits may be commissioned or performed by the journalists or by external entities):

    • Yin & Jeffries’s “How We Analyzed Google’s Search Results” (2020), from The Markup. https://themarkup.org/google-the-giant/2020/07/28/how-we-analyzed-google-search-results-web-assay-parsing-tool [markup2020analyzed]
    • Asher-Schapiro’s “Gaming Google: Oil firms use search ads to greenwash, study says” (2022), from Context. https://www.context.news/climate-risks/gaming-google-oil-firms-use-search-ads-to-greenwash-study-says [asher-schapiro2022gaming]

    References

    Gerhart, S. (2004). Do web search engines suppress controversy? First Monday, 9(1). https://doi.org/10.5210/fm.v9i1.1111 [gerhart2004web]

    Griffin, D., & Lurie, E. (2022). Search quality complaints and imaginary repair: Control in articulations of Google Search. New Media & Society, 0(0), 14614448221136505. https://doi.org/10.1177/14614448221136505 [griffin2022search]

    Haider, J., & Rödl, M. (2023). Google search and the creation of ignorance: The case of the climate crisis. Big Data & Society, 10(1), 205395172311589. https://doi.org/10.1177/20539517231158997 [haider2023google]

    Kulshrestha, J., Eslami, M., Messias, J., Zafar, M. B., Ghosh, S., Gummadi, K. P., & Karahalios, K. (2017). Quantifying search bias: Investigating sources of bias for political searches in social media. Proceedings of the 2017 ACM Conference on Computer Supported Cooperative Work and Social Computing, 417–432. [kulshrestha2017quantifying]

    Metaxa, D., Park, J. S., Robertson, R. E., Karahalios, K., Wilson, C., Hancock, J., & Sandvig, C. (2021). Auditing algorithms: Understanding algorithmic systems from the outside in. Foundations and Trends® in Human–Computer Interaction, 14(4), 272–344. https://doi.org/10.1561/1100000083 [metaxa2021auditing]

    Mulligan, D. K., & Griffin, D. (2018). Rescripting search to respect the right to truth. The Georgetown Law Technology Review, 2(2), 557–584. https://georgetownlawtechreview.org/rescripting-search-to-respect-the-right-to-truth/GLTR-07-2018/ [mulligan2018rescripting]

    Noble, S. U. (2018). Algorithms of oppression: How search engines reinforce racism. New York University Press. https://nyupress.org/9781479837243/algorithms-of-oppression/ [noble2018algorithms]

    Trielli, D., & Diakopoulos, N. (2018). Defining the role of user input bias in personalized platforms. Paper presented at the Algorithmic Personalization and News (APEN18) workshop at the International AAAI Conference on Web and Social Media (ICWSM). https://www.academia.edu/37432632/Defining_the_Role_of_User_Input_Bias_in_Personalized_Platforms [trielli2018defining]

    Urman, A., Makhortykh, M., & Ulloa, R. (2022). Auditing the representation of migrants in image web search results. Humanit Soc Sci Commun, 9(1), 5. https://doi.org/10.1057/s41599-022-01144-1 [urman2022auditing]