Who is going to try to make web search better?

    Previously…

    This is a continuation of Towards “benchmarking” democratization of good search.


    draft

    to be developed further… Feedback welcome!

    Search is a mess: it seems either broken, bought, or shaped by a behemoth. But sparks of disruption in the energy around large language models (LLMs) have helped some to imagine alternative models of search.

    What organizations are best situated to apply their own resources and promote developer attention and energy to this problem and opportunity?

    I don’t think we will find this from the largest search-associated technology companies1. Below I will mention some organizations and provide a few comments. I am considering these with an eye towards both folks who might organize disruptive efforts and folks who might pursue open disruptions in search. This is largely an exercise in brainstorming for factors that might be relevant to engaging on this project. This is not directed towards making a prediction. Rather, I want to flexibly play with this question to focus my own attention and action.

    Who might be able to align their interests with disruptions in search (and/or disruption of search advertising)? Who might some portion of early adopting companies and independent developers “trust enough”? Who has a user base willing to engage in and extend experimentation? Who would explore any part of efforts to benchmark search experiences?

    Hugging Face? LangChain? LlamaIndex? Replit? Mozilla? Snowflake? (are some of the former Neeva folks looking to address Google’s continued dominance?) Databricks? Automattic?

    Anthropic? (does their safety model account for the dominance of a single way of searching the web?) Cohere? (They have a reranker and Coral (RAG / “grounded” / “a knowledge assistant for enterprises”)) Mistral? (in Perplexity AI’s pplx-7b-online & pplx-70b-online APIs, Hugging Face’s Chat, You.com’s “Safe search: Off (uncensored chat)”, and Nagato) Yahoo? (or Vespa?) Meta? (noting the “open sourcing”2 of their Llama models)

    Majority World developers?

    Salesforce? Getty Images? (who has a model for web search that is genuinely both pro-user and creator?) Disney or some other media conglomerate?

    Folks organizing to develop tooling for ignored, forgotten, or unseemly purposes?

    TikTok? (seems likely to continue to disrupt search practices but does not seem inclined (barring regulations) towards sharing technology and integrating with the open web)

    AI2? (see their incubator?) Wikimedia Foundation? Open Philanthropy? Open Society Foundations? Omidyar Network? Knight Foundation? Internet Archive? (see Internet Archive Scholar search) Other libraries? Protocol Labs? (See: Berjon’s “Fixing Search”)

    Other established search technology providers? (Algolia? Elasticsearch?) Generative search startups? (Perplexity AI has APIs and detailed technical blogs, Phind released models, You.com (API) & Andi Search have made comments about addressing the dominance of Google, and Metaphor Search has an API.)

    OpenWebSearch.eu? (see Mager (2023)) Glocal approaches? SEO-efforts? The academy? (is there PIT-UN work on search? What is the NSF doing? What multi-campus efforts exist around search? What are the big science or BigScience folks doing?)

    What might motivate these moves? The pursuit of “relevance”? Or a refusal of the status quo? A drive to repair? Efforts to meet the searchers where they are?


    Added

    Added December 12, 2023 02:56 PM (PST)

    Perhaps also answer.ai? See answer.ai & “A new old kind of R&D lab”



    Footnotes

    1. Perhaps mapping out such scenarios later would be useful.↩︎

    2. Some have challenged various companies’ recent uses of “open source”.↩︎

    References

    Mager, A. (2023). European search? How to counter-imagine and counteract hegemonic search with European search engine projects. Big Data & Society, 10(1), 205395172311631. https://doi.org/10.1177/20539517231163173