jump to search:/  select search:K  navigate results:/  navigate suggestions:/  close suggestions:esc

    Shared weblinks

    May 15th, 2023
    This page lists shared weblinks.

    December 7, 2023

    “there's pressure for every LLM to have a live web connection”
    @RichardSocher via Twitter on Dec 7, 2023

    With Google’s Gemini coming out, there’s pressure for every LLM to have a live web connection.

    Ping us at api@you.com if you want to get rid of hallucinations, keep your LLM answers up-to-date or offer citations for facts.
    We can help with a complete solution.

    I still think there is an important distinction between access to a regularly updated search index and dynamic web access, though I recognize it is possible that You.com is providing live webpage parsing in addition to search results.

    See: “What does ‘no knowledge cutoff’ mean?”

    Tags: knowledge-cutoffs, generative-search-APIs, You.com

    “Today marks the first year anniversary of the launch of perplexity.ai”
    @AravSrinivas via Twitter on Dec 7, 2023

    Today marks the first year anniversary of the launch of http://perplexity.ai, launched on Dec 7, 2022. Lot of people ask me why we ended up being the best product in this category. If I had to pick one word: it is conviction. We believed world needed answers rather than links.

    See thread.

    Tags: Perplexity-AI

    “The internal name for Deep Search was "Search Harder"”
    @RangThang via Twitter on Dec 5, 2023

    Funny story: The internal name for Deep Search was “Search Harder” but it didn’t test well. We asked GPT-4 to suggest some better names and it came up with “Deep Search”! So yeah, GPT-4 named this GPT-4 powered feature.

    Tags: Bing-Deep-Search

    “what do you prefer for live accurate information?”
    @AravSrinivas via Twitter on Dec 7, 2023

    what do you prefer for live accurate information?

    [omitted poll: perplexity OR grok]

    Tags: fresh-search, knowledge-cutoffs, Perplexity-AI, X-Grok

    ssr: “Do folks have favorite examples of systematic literature reviews? Better if in HCI, but okay if not.”
    @naz_andalibi via Twitter on Dec 6, 2023

    Do folks have favorite examples of systematic literature reviews? Better if in HCI, but okay if not. We are doing one and looking for details on methods.

    Tags: social-search-request, systematic-literature-reviews

    What questions should we ask when building software?
    Jim Herbsleb’s Jim Herbsleb’s Home Page on Dec 7, 2023

    Rather than ask
    How can I specify, design, and build the system that my stakeholders need?

    Maybe we should ask
    How can I set up the socio-technical ecosystem that will allow users, developers, businesses, and everyone else to cooperate and compete to build what everyone needs?

    I’ve been thinking about these sorts of questions a lot in relation to our current disruption in web search. See some of my musings in Towards “benchmarking” democratization of good search and Who is going to try to make web search better?

    Tags: incentives, constraints, social-construction, technology-delegation

    December 1, 2023

    “It would be really helpful to have other people dabble in search without having to build an entire search engine from scratch.”
    Tessel Renzenbrink’s Interview With Viktor Lofgren from Marginalia Search on Oct 10, 2023

    Viktor Lofgren is the creator of Marginalia Search, a search engine takes you of the beaten track by letting you find small quality web pages. These pages barely surface in commercial search engines because they are snowed under by larger commercial websites and marketing. We interviewed Viktor for FreeWebSearchDay. You can listen to the recording of the interview or read the edited transcript below.

    [ . . . ]

    “…when you design a ranking algorithm you basically encode your own values and perspectives into the software. So if you don’t have enough software diversity in the sense that you have multiple search engines build by multiple people than you get a very one-sided view of the world. And having someone else come and build a search engine with ther own ranking algorithm for example, is that they would promote different types of content. And that would benefit people in general.”

    [ . . . ]

    “It would be really helpful to have other people dabble in search without having to build an entire search engine from scratch.”

    [ . . . ]

    “…there are a lot of assumptions that have been around since the ‘80s, or the ‘70s even, on how to build a search engine. So having fresh eyes on the problem, even if it does mean occasionally reinventing the wheel, is refreshing.”

    [ . . . ]

    “I am hopeful for the future that something good will come out of this and something like the Linux of internet search engines will emerge. Where people can collaborate and build something great together, open source.”

    Tags: Marginalia-Search, alternative-search-engines, other-quality-goals, reimagine, interviews-with-search-producers

    Ollie.ai & “your personal shopper”
    @blennon_ via Twitter on May 16, 2023

    I’m super excited to share the first step in our journey of building your personal shopper @HeyOllieAi . Noise, bias, influence, infinite choice make shopping online just awful. Today is the first step where http://heyollie.ai helps you find the perfect gift.

    Note: The URL is now ollie.ai

    Tags: generative-search, niche-generative-search, Ollie

    “I love it when single devs can build stuff like a new search engine that actually work at scale.”
    @FloweeTheHub via Twitter on Dec 01, 2023

    I love it when single devs can build stuff like a new search engine that actually work at scale.

    Democratization of innovation, which I’m trying with my stuff too!

    Here is an interview with the author @MarginaliaNu, a “for the people” search-engine:

    nlnet.nl/news/2023/20231016-marginalia.html

    Tags: Marginalia-Search, alternative-search-engines

    “Product human evals are what matter to us”
    @AravSrinivas via Twitter on Dec 01, 2023

    I was asked why we never published metrics relative to GPT-4 for the pplx-chat and pplx-online LLMs and only compared to 3.5-turbo. Just like everyone else today, we are far away from GPT-4 capabilities, even on the narrow task of answering questions accurately with search grounding. Product human evals are what matter to us, not academic evals that can be gamed. Our own data flywheel and better base models are necessary ingredients to getting there. Mistral and Meta are doing incredible work to help the community to get closer. But at the same time, it’s important to acknowledge OpenAI’s tremendous work on GPT 4.

    Tags: evaluating-results-meta, Perplexity-AI, benchmarks

    “How many people who type "hi" want Adolf Hitler with a picture in autocomplete?”
    @OriZilbershtein via Twitter on Dec 01, 2023

    Hey @searchliaison can we stop doing these? How many people who type “hi” want Adolf Hitler with a picture in autocomplete?

    is it indicative on the amount of interest or how does this work?

    These are so basic…

    smh

    Image 1

    Zilbershtein continues:

    Stop showing Hitler on predictive and suggestive features exactly like you do not show gambling or porn or any other topic that demands someone to specifically search for it.

    I found the same:

    @danielsgriffin via Twitter on Dec 01, 2023

    Desktop, incognito, Seattle area.

    Image 1



    There is lack of transparency in the policy choices around both autocomplete and safesearch interventions. Here is a different angle on that problem in safesearch: The “SEARCHING FOR SOMETHING AND GETTING WHAT APPEARS TO BE EVERYTHING” section in the chapter on “TO REMOVE OR TO FILTER” in Gillespie (2018, pp. 186–194). An excerpt, from p. 187:

    To be safe, search engines treat all users as not wanting accidental porn: better to make the user looking for porn refine her query and ask again, than to deliver porn to users who did not expect or wish to receive it. Reasonable, perhaps; but the intervention is a hidden one, and in fact runs counter to my stated preferences. Search engines not only guess the meaning and intent of my search, based on the little they know about me, they must in fact defy the one bit of agency I have asserted here, in turning safesearch off.[41]


    1. Jemima Kiss, “YouTube Looking at Standalone ‘SafeTube’ Site for Families,”Guardian, May 29, 2009, https://www.theguardian.com/media/pda/2009/may/29/youtube-google.

    Without excusing the hidden interventions, note how “the one bit of agency” line both reflects and reconstructs articulations of search and perceptions of affordance.


    Footnotes

    1. That said, I do regularly see the SEO folks engaging in search quality complaints on behalf of those searching (not only those searched for or trying to be found). I think SEOs, partially because of their expertise and this relationship with Google, may have a responsibility to engage in more public advocacy for public interest searches. See my other comments on this in [tags:seo-for-social-good].↩︎

    Tags: search-liaison, search-engine-optimization, search-quality-complaints, seo-for-social-good

    “this is well within the model training (it has very likely seen a ton of code for pong)”
    @abacaj via Twitter on Nov 16, 2023

    Ok hear me out, I think this is not impressive. I would say this is well within the model training (it has very likely seen a ton of code for pong). If it can do other things that are less represented in the training then it would be more interesting to me

    Image 1

    Tags: hypecert

    November 15, 2023

    “Anyone else finding the new ChatGPT-4 with browsing to be a major downgrade?”
    @geoffreylitt via Twitter on Nov 14, 2023

    Anyone else finding the new ChatGPT-4 with browsing to be a major downgrade?

    Now instead of quickly hallucinating an approximate answer it slowly Googles (sorry, Bings?) the wrong thing and then gives a worse answer

    Tags: generative-search, OpenAI-ChatGPT

    “Bing Chat and Bing Chat Enterprise will now simply become Copilot.”
    @JordiRib1 via Twitter on Nov 15, 2023

    Since launching Bing Chat, I’m pleased to share that there have been more than 1 billion prompts. As we work to simplify the user experience across @Microsoft products and services, Bing Chat and Bing Chat Enterprise will now simply become Copilot.

    Tags: Microsoft-Copilot, Bing-Chat

    “HotGirls on TikTok, why do I look stupid at the gym?”

    Most of the social-search-requests that I share on here are ones that I am relaying for others to answer or consider or to find answers at the link (marked with “ssr”. I am sharing the request in this post as an example1 of what appears to be a very effective packaging of a question or request for help——it has a simple message, signals preparation for responses, and highlights possible issues already considered. While the transcript below provides some sense of the request, one would need to watch the video to get the full packaging.

    I discuss packaging of questions in my dissertation: Ch. 5. Repairing searching: Due diligence and packaging questions. I’m not suggesting this approach might work for everyone: this searcher is an actor.

    Hannah Brown’s HotGirls on TikTok, why do I look stupid at the gym? on Nov 07, 2023

    HotGirls on TikTok, why do I look stupid at the gym? Why do I look stupid at the gym? Don’t be nice to me. Don’t worry about my feelings. Give it to me straight because I know that I don’t look as hot at the gym as I could. But I don’t know what the problem is. Is it that I need a set? Is it the socks? Is it the shoes? I feel like these sneakers look fucking stupid. Is it the hairstyle? Is it the jewelry or lack thereof in this region? Why do I look stupid at the gym? Someone tell me. Because I want to look hot all the time.

    The screenshot for the video, from TikTok, showing the viewcount of over 7.7 million and the pinned status.

    Footnotes

    1. I am comfortable sharing this as an example because it has received over 7.7 million views at the time of this post, with over 461.6K likes and 20.5K comments. The question asker, Hannah Brown, has also pinned the post to her profile.↩︎

    Tags: social-search-request, TikTok, packaging

    ssr: “what are the most useful LLM-powered apps?”
    @binarybits via Twitter on Nov 15, 2023

    Aside from standalone chatbots (like ChatGPT) and code completion tools (like Github copilot), what are the most useful LLM-powered apps? Are people finding Microsoft 365 Copilot useful?

    Tags: social-search-request

    November 14, 2023

    Friday AI
    @andi_search via Twitter on Nov 14, 2023

    EXCITING ANNOUNCEMENT

    Andi X Friday AI

    We’re stoked to share that Andi is acquiring Friday AI!

    @FridayAIApp is an AI-powered educational assistant that helps thousands of busy college students with their homework.

    Grateful and excited to be a new home for their users’ educational and search needs! Welcome to Andi

    Here is Friday, as of today:
    friday.page’s Friday AI on Nov 14, 2023

    Hi, I’m Friday, your AI copilot for school! Learn a difficult topic, draft an email, or speed up your hw.

    Use a command to get started: Generate essay outline to draft an outline. New thread to start a new conversation. / to see a list of commands.

    Screencapture of Friday AI.

    Tags: Andi

    A syllabus for ‘Taking an Internet Walk’
    Spencer Chang & Kristoffer Tjalve’s Taking an Internet Walk on Nov 09, 2023

    In the 1950-70s, urban highways were built across many cities. It is beyond our syllabus to reason why parks, lakes, and sidewalks were sacrificed for additional car lanes,1 but, as these high-speed traffic veins warped the faces of neighborhoods, so have the introduction of search engines and social news feeds changed our online behavior. Fortunately, on the Internet, we still have the agency to wayfind through alternative path systems.

    Tags: alternative-search-engines

    “Don't make me type everything into a box, let me point at stuff.”
    @roberthaisfield via Twitter on Oct 19, 2023

    I wish more RAG and Agent apps would let me point at stuff on the screen. Like highlight text or draw a rectangle on the screen, and say “hey, change that” or “search for things like this.

    Don’t make me type everything into a box, let me point at stuff.

    Tags: search-outside-the-box

    “way more accessible to have a list of experiences to try”
    @nickadobos via Twitter on Nov 11, 2023

    There’s some serious OAI shills on here, after trying some GPTs I can say that the ones I tried are pretty useless? Like I can just prompt the model myself, I don’t know what the point is

    Tags: repository-of-examples

    November 9, 2023

    “building a search engine by listing "every query anyone might ever make"”
    @fchollet via Twitter on Nov 8, 2023

    The idea that you can build general cognitive abilities by fitting a curve on “everything there is to know” is akin to building a search engine by listing "every query anyone might ever make". The world changes every day. The point of intelligence is to adapt to that change. [emphasis added]

    What might this look like? What might some one learn, or know more intuitively, from doing this?


    See a comment on Chollet’s next post in the thread, in: True, False, or Useful: ‘15% of all Google searches have never been searched before.’

    Tags: speculative-design

    November 7, 2023

    “where all orgs, non-profits, academia, startups, small and big companies, from all over the world can build”
    @clementdelangue via Twitter on Nov 3, 2023

    The current 7 best trending models on @huggingface are NOT from BIG TECH! Let’s build a more inclusive future where all orgs, non-profits, academia, startups, small and big companies, from all over the world can build AI versus just use AI. That’s the power of open-source and collaborative approaches!

    Tags: open-source, hugging-face

    Intervenr
    Stephanie Wang, Danaë Metaxa, Michelle Lam, Rachit Gupta, Megan Mou, Poonam Sahoo, Colin Kalicki, Ayush Pandit, and Melanie Zhou’s Intervenr on Nov 7, 2023 (accessed)

    Intervenr is a platform for studying online media run by researchers at the University of Pennsylvania. You can learn more at our About page. This website does not collect or store any data about you unless you choose to sign up. We do not use third party cookies and will not track you for advertising purposes.

    [ . . . ]

    Intervenr is a research study. Our goal is to learn about the types of media people consume online, and how changing that media affects them. If you choose to participate, we will ask you to install our Chrome extension (for a limited time) which will record selected content from websites you visit (online advertisements in the case of our ads study). We will also ask you to complete three surveys during your participation.

    References

    Lam, M. S., Pandit, A., Kalicki, C. H., Gupta, R., Sahoo, P., & Metaxa, D. (2023). Sociotechnical audits: Broadening the algorithm auditing lens to investigate targeted advertising. Proc. ACM Hum.-Comput. Interact., 7(CSCW2). https://doi.org/10.1145/3610209 [lam2023sociotechnical]

    Tags: sociotechnical-audits

    November 3, 2023

    [how to change the oil on a 1965 Mustang]
    @superwuster via Twitter on Nov 3, 2023

    1. Google’s theory is that, as for every query, Google faces competition from Amazon, Yelp, AA.com, Cars.com and other verticals. The problem is that government kept bringing up searches that only Google / Bing / Duckduckgo and other GSs do.


    For example, only general search engines return links to websites with information you might be looking for, e.g., a site explaining how to change the oil on a 1965 Mustang. There’s no way to find that on cars.com.

    The second result on Google is the same as the second result when searching posts on Facebook

    What is searchable where? If it were on its own (i.e. outside tight integration in an argument about the very dominance of Google shaping not only the availability of alternatives but our concept of search) this claim seems to ignore searcher agency and the context of searches.

    Also, why is this the example? What other examples are there for search needs where “only general search engines return links to websites with information you might be looking for”?


    That said, it seems worth engaging with…

    1. Cars.com doesn’t even have a general site search bar.
    2. But there are many places that folks might try if avoiding general search.
    3. How many owners of 1965 mustangs are searching up on a general search engine how to conduct oil changes? I don’t know, maybe the government supplied that sort of information. I assume that many have retained knowledge of how to change the oil, reference old manuals on hand, or are plugged in to relevant communities (including forums or groups of various sorts—including Facebook (and Facebook groups) and all those online groups before it, let alone offline community). But maybe I’m way off. I think it is likely (partially in scanning Reddit results) that people looking to change the oil in the 1965 mustang is likely searching much more particular questions (at least that is the social searching that I saw on Reddit).
    4. You should be able to go to Ford.com and find a manual. A search for [how to change the oil on a 1965 Mustang] there shows a view titled “How do I add engine oil to my Ford?” though it is unclear to me if this information is wildly off base or not. It does refer to the owners manual. Ford does not provide, directly on their website, manuals for vehicles prior to 2014. Ford does have a link with the text “Where can I get printed copies of Owner Manuals or older Owner Manuals not included on this site?” to a Where can I get an Owner’s Manual? page. They link to Bishko for vehicles in that year. It seems you can pay $39.99 for the manual.. Ford does have a live chat, that may be fruitful, no clue.

    People are so creative, already come in with so much knowledge, and make choices about what to share or search services to provide in the context of the massive power of Google.

    Tags: Googlelessness, general-search-services, social-search, US-v-Google-2020

    November 2, 2023

    “I'm having a much harder time finding news clips via Google search than a few months ago.”
    @lmatsakis via Twitter on Nov 2, 2023

    I’m having a much harder time finding news clips via Google search than a few months ago. Instead, I often get tons of random blogs from law firms, coaching websites, etc. even when I include the name of the outlet in the search

    Tags: search-quality-complaints

    November 1, 2023

    ssr: “What's the phrase to describe when an algorithm doesn't take into effect it's own influence on an outcome?”

    References

    Selbst, A. D., Boyd, D., Friedler, S. A., Venkatasubramanian, S., & Vertesi, J. (2019). Fairness and abstraction in sociotechnical systems. Proceedings of the Conference on Fairness, Accountability, and Transparency, 59–68. https://doi.org/10.1145/3287560.3287598 [selbst2019fairness]

    Tags: social-search-request, ripple-effect-trap

    October 31, 2023

    ssr: “What is the equivalent of SEO, but for navigating the physical built environment?”
    @chenoehart via Twitter on Jul 10, 2023

    What is the equivalent of SEO, but for navigating the physical built environment? And for searching for physical destinations on a GPS map?

    References

    Ziewitz, M. (2019). Rethinking gaming: The ethical work of optimization in web search engines. Social Studies of Science, 49(5), 707–731. https://doi.org/10.1177/0306312719865607 [ziewitz2019rethinking]

    Tags: search-engine-optimization, wayfinding, blazing, navigation, GPS, social-search-request

    October 16, 2023

    “might be the biggest SEO development we've had in a long time”
    @lilyraynyc via Twitter on Oct 16, 2023

    For the record, I think Google testing the Discover feed on desktop might be the biggest SEO development we’ve had in a long time.

    Why do I think this is bigger than other developments, updates, etc.?

    Ask me later when you see how much traffic comes from Discover on desktop.

    Tags: Google-Discover, TikTok, Twitter&solX-Explore, personalized-search, prospective-search, recommendation

    October 11, 2023

    "the importance of open discussion of these new tools gave me pause"
    Davey Alba’s Even Google Insiders Are Questioning Bard AI Chatbot’s Usefulness on Oct 11, 2023

    [ . . . ]

    Daniel Griffin, a recent Ph.D. graduate from University of California at Berkeley who studies web search and joined the Discord group in September, said it isn’t uncommon for open source software and small search engine tools to have informal chats for enthusiasts. But Griffin, who has written critically about how Google shapes the public’s interpretations of its products, said he felt “uncomfortable” that the chat was somewhat secretive.

    The Bard Discord chat may just be a “non-disclosed, massively-scaled and long-lasting focus group or a community of AI enthusiasts, but the power of Google and the importance of open discussion of these new tools gave me pause,” he added, noting that the company’s other community-feedback efforts, like the Google Search Liaison, were more open to the public.

    [ . . . ]

    Initially shared on Twitter (Sep 13, 2023)

    I was just invited to Google’s “Bard Discord community”: “Bard’s Discord is a private, invite-only server and is currently limited to a certain capacity.”

    A tiny % of the total users. It seems to include a wide range of folks.

    There is no disclosure re being research subjects.

    The rules DO NOT say: ‘The first rule of Bard Discord is: you do not talk about Bard Discord.’ I’m not going to discuss the users. But contextual integrity and researcher integrity suggests I provide some of the briefest notes.

    The rules do include: “Do not post personal information.” (Which I suppose I’m breaking by using my default Discord profile. This is likely more about protecting users from each other though, since Google verifies your email when you join.)

    Does Google’s privacy policy cover you on Google’s uses of third party products?

    There are channels like “suggestion-box’ and”bug-reports’, and prompt-chat’ (Share and discuss your best prompts here with your fellow Bard Community members! Feel free to include screenshots…)

    I’ll confess it is pretty awkward being in there, with our paper—“Search quality complaints and imaginary repair: Control in articulations of Google Search”[1]—at top of mind.



    1. https://doi.org/10.1177/14614448221136505

    A lot of the newer search systems that I’m studying use Discord for community management. And I’ve joined several.

    References

    Griffin, D., & Lurie, E. (2022). Search quality complaints and imaginary repair: Control in articulations of Google Search. New Media & Society, 0(0), 14614448221136505. https://doi.org/10.1177/14614448221136505 [griffin2022search]

    Tags: griffin2022search, articulations, Google-Bard, Discord

    October 9, 2023

    "he hopes to see AI-powered search tools shake things up"
    Will Knight’s Chatbot Hallucinations Are Poisoning Web Search on Oct 05, 2023

    [ . . . ]

    Griffin says he hopes to see AI-powered search tools shake things up in the industry and spur wider choice for users. But given the accidental trap he sprang on Bing and the way people rely so heavily on web search, he says “there’s also some very real concerns.”

    [ . . . ]

    Tags: shake-things-up

    September 25, 2023

    "a public beta of our project, Collective Cognition to share ChatGPT chats"
    @teknium1 via Twitter on Sep 22, 2023

    Today @SM00719002 and I are launching a public beta of our project, Collective Cognition to share ChatGPT chats - allowing for browsing, searching, up and down voting of chats, as well as creating a crowdsourced multiturn dataset!

    https://collectivecognition.ai

    References

    Burrell, J., Kahn, Z., Jonas, A., & Griffin, D. (2019). When users control the algorithms: Values expressed in practices on Twitter. Proc. ACM Hum.-Comput. Interact., 3(CSCW). https://doi.org/10.1145/3359240 [burrell2019control]

    Cotter, K. (2022). Practical knowledge of algorithms: The case of BreadTube. New Media & Society, 1–20. https://doi.org/10.1177/14614448221081802 [cotter2022practical]

    Griffin, D. (2022). Situating web searching in data engineering: Admissions, extensions, repairs, and ownership [PhD thesis, University of California, Berkeley]. https://danielsgriffin.com/assets/griffin2022situating.pdf [griffin2022situating]

    Griffin, D., & Lurie, E. (2022). Search quality complaints and imaginary repair: Control in articulations of Google Search. New Media & Society, 0(0), 14614448221136505. https://doi.org/10.1177/14614448221136505 [griffin2022search]

    Lam, M. S., Gordon, M. L., Metaxa, D., Hancock, J. T., Landay, J. A., & Bernstein, M. S. (2022). End-user audits: A system empowering communities to lead large-scale investigations of harmful algorithmic behavior. Proc. ACM Hum.-Comput. Interact., 6(CSCW2). https://doi.org/10.1145/3555625 [lam2022end]

    Metaxa, D., Park, J. S., Robertson, R. E., Karahalios, K., Wilson, C., Hancock, J., & Sandvig, C. (2021). Auditing algorithms: Understanding algorithmic systems from the outside in. Foundations and Trends® in Human–Computer Interaction, 14(4), 272–344. https://doi.org/10.1561/1100000083 [metaxa2021auditing]

    Mollick, E. (2023). One useful thing. Now Is the Time for Grimoires. https://www.oneusefulthing.org/p/now-is-the-time-for-grimoires [mollick2023useful]

    Zamfirescu-Pereira, J. D., Wong, R. Y., Hartmann, B., & Yang, Q. (2023). Why johnny can’t prompt: How non-ai experts try (and fail) to design llm prompts. Proceedings of the 2023 Chi Conference on Human Factors in Computing Systems. https://doi.org/10.1145/3544548.3581388 [zamfirescu-pereira2023johnny]

    Tags: sharing interface, repository-of-examples

    Is there any paper about the increasing “arXivification” of CS/HCI?
    @IanArawjo via Twitter on Sep 24, 2023

    Is there any paper about the increasing “arXivification” of CS/HCI? (eg pre-print cultures and how they relate/intersect w increasingly sped-up processes of technology development and academic paper churn culture)?

    Tags: the scholarly economy, arXiv, preprints

    September 19, 2023

    scoreless peer review
    Stuart Schechter’s How You Can Help Fix Peer Review on Sep 19, 2023

    When we scrutinize our students’ and colleagues’ research work to catch errors, offer clarifications, and suggest other ways to improve their work, we are informally conducting author-assistive peer review. Author-assistive review is almost always a * scoreless*, as scores serve no purpose even for work being prepared for publication review.

    Alas, the social norm of offering author-assistive review only to those close to us, and reviewing most everyone else’s work through publication review, exacerbates the disadvantages faced by underrepresented groups and other outsiders.

    [ . . . ]

    We can address those unintended harms by making ourselves at least as available for scoreless author-assistive peer review as we are for publication review.7

    Tags: peer review

    ssr: uses in new instruct model v. chat models?
    @simonw via Twitter on Sep 19, 2023

    Anyone seen any interesting examples of things this new instruct model can do that are difficult to achieve using the chat models?

    Tags: social-search-request

    September 14, 2023

    [What does the f mean in printf]
    @brettsmth via Twitter on Sep 14, 2023

    Interesting that @replit Ghostwriter gave a better response than GPT-4 for a coding question. Ghostwriter has gotten noticeably better for me and I find myself using it more than GPT-4 for development

    @danielsgriffin via Twitter on Sep 14, 2023

    Oooh. This is a slippery one! Because both are right?

    They must assume/interpolate:
    What does the f [format specifier] [mean/stand for] in printf?
    What does the [letter] f [mean/stand for] in [the string] printf?

    Tags: end-user-comparison

    ssr: LLM libraries that can be installed cleanly on Python
    @simonw via Twitter on Sep 14, 2023

    Anyone got leads on good LLM libraries that can be installed cleanly on Python (on macOS but ideally Linux and Windows too) using “pip install X” from PyPI, without needing a compiler setup?

    I’m looking for the quickest and simplest way to call a language model from Python

    Tags: social-search-request

    September 12, 2023

    DAIR.AI's Prompt Engineering Guide
    Prompt Engineering Guide on Jun 6, 2023

    Prompt engineering is a relatively new discipline for developing and optimizing prompts to efficiently use language models (LMs) for a wide variety of applications and research topics. Prompt engineering skills help to better understand the capabilities and limitations of large language models (LLMs).

    Researchers use prompt engineering to improve the capacity of LLMs on a wide range of common and complex tasks such as question answering and arithmetic reasoning. Developers use prompt engineering to design robust and effective prompting techniques that interface with LLMs and other tools.

    Prompt engineering is not just about designing and developing prompts. It encompasses a wide range of skills and techniques that are useful for interacting and developing with LLMs. It’s an important skill to interface, build with, and understand capabilities of LLMs. You can use prompt engineering to improve safety of LLMs and build new capabilities like augmenting LLMs with domain knowledge and external tools.

    Motivated by the high interest in developing with LLMs, we have created this new prompt engineering guide that contains all the latest papers, learning guides, models, lectures, references, new LLM capabilities, and tools related to prompt engineering.

    References

    Yao, S., Zhao, J., Yu, D., Du, N., Shafran, I., Narasimhan, K., & Cao, Y. (2023). ReAct: Synergizing reasoning and acting in language models. http://arxiv.org/abs/2210.03629 [yao2023react]

    Tags: prompt engineering

    September 8, 2023

    ragas metrics

    github.com/explodinggradients/ragas:

    Ragas measures your pipeline’s performance against different dimensions

    1. Faithfulness: measures the information consistency of the generated answer against the given context. If any claims are made in the answer that cannot be deduced from context is penalized.

    2. Context Relevancy: measures how relevant retrieved contexts are to the question. Ideally, the context should only contain information necessary to answer the question. The presence of redundant information in the context is penalized.

    3. Context Recall: measures the recall of the retrieved context using annotated answer as ground truth. Annotated answer is taken as proxy for ground truth context.

    4. Answer Relevancy: refers to the degree to which a response directly addresses and is appropriate for a given question or context. This does not take the factuality of the answer into consideration but rather penalizes the present of redundant information or incomplete answers given a question.

    5. Aspect Critiques: Designed to judge the submission against defined aspects like harmlessness, correctness, etc. You can also define your own aspect and validate the submission against your desired aspect. The output of aspect critiques is always binary.


    Tags: RAG

    September 1, 2023

    searchsmart.org
    Search Smart FAQ on Sep 1, 2023

    Search Smart suggests the best databases for your purpose based on a comprehensive comparison of most of the popular English academic databases. Search Smart tests the critical functionalities databases offer. Thereby, we uncover the capabilities and limitations of search systems that are not reported anywhere else. Search Smart aims to provide the best – i.e., most accurate, up-to-date, and comprehensive – information possible on search systems’ functionalities.

    Researchers use Search Smart as a decision tool to select the system/database that fits best.

    Librarians use Search Smart for giving search advice and for procurement decisions.

    Search providers use Search Smart for benchmarking and improvement of their offerings.

    More…

    We defined a generic testing procedure that works across a diverse set of academic search systems - all with distinct coverages, functionalities, and features. Thus, while other testing methods would be available, we chose the best common denominator across a heterogenic landscape of databases. This way, we can test a substantially greater number of databases compared to already existing database overviews.

    We test the functionalities of specific capabilities search systems have or claim to have. Here we follow a routine that is called “metamorphic testing”. It is a way of testing hard-to-test systems such as artificial intelligence, or databases. A group of researchers titled their 2020 IEEE article “Metamorphic Testing: Testing the Untestable”. Using this logic, we test databases and systems that do not provide access to their systems.

    Metamorphic testing is always done from the perspective of the user. It investigates how well a system performs, not at some theoretical level, but in practice - how well can the user search with a system? Do the results add up? What are the limitations of certain functionalities?

    References

    Goldenfein, J., & Griffin, D. (2022). Google scholar – platforming the scholarly economy. Internet Policy Review, 11(3), 117. https://doi.org/10.14763/2022.3.1671 [goldenfein2022platforming]

    Gusenbauer, M., & Haddaway, N. R. (2019). Which academic search systems are suitable for systematic reviews or meta-analyses? Evaluating retrieval qualities of google scholar, pubmed and 26 other resources. Research Synthesis Methods. https://doi.org/10.1002/jrsm.1378 [gusenbauer2019academic]

    Segura, S., Towey, D., Zhou, Z. Q., & Chen, T. Y. (2020). Metamorphic testing: Testing the untestable. IEEE Software, 37(3), 46–53. https://doi.org/10.1109/MS.2018.2875968 [segura2020metamorphic]

    Tags: evaluating-search-engines, academic-search

    August 31, 2023

    "We really need to talk more about monitoring search quality for public interest topics."
    Dave Guarino (website | Twitter; “the founding engineer (and then Director) of GetCalFresh.org at Code for America”)
    @allafarce via Twitter on Jan 16, 2020

    We really need to talk more about monitoring search quality for public interest topics.

    References

    Arawjo, I., Vaithilingam, P., Swoopes, C., Wattenberg, M., & Glassman, E. (2023). ChainForge. https://www.chainforge.ai/. [arawjo2023chainforge]

    Guendelman, S., Pleasants, E., Cheshire, C., & Kong, A. (2022). Exploring google searches for out-of-clinic medication abortion in the united states during 2020: Infodemiology approach using multiple samples. JMIR Infodemiology, 2(1), e33184. https://doi.org/10.2196/33184 [guendelman2022exploring]

    Lurie, E., & Mulligan, D. K. (2021). Searching for representation: A sociotechnical audit of googling for members of U.S. Congress. https://arxiv.org/abs/2109.07012 [lurie2021searching_facctrec]

    Mejova, Y., Gracyk, T., & Robertson, R. (2022). Googling for abortion: Search engine mediation of abortion accessibility in the united states. JQD, 2. https://doi.org/10.51685/jqd.2022.007 [mejova2022googling]

    Mustafaraj, E., Lurie, E., & Devine, C. (2020). The case for voter-centered audits of search engines during political elections. FAT* ’20. [mustafaraj2020case]

    Noble, S. U. (2018). Algorithms of oppression how search engines reinforce racism. New York University Press. https://nyupress.org/9781479837243/algorithms-of-oppression/ [noble2018algorithms]

    Sundin, O., Lewandowski, D., & Haider, J. (2021). Whose relevance? Web search engines as multisided relevance machines. Journal of the Association for Information Science and Technology. https://doi.org/10.1002/asi.24570 [sundin2021relevance]

    Urman, A., & Makhortykh, M. (2022). “Foreign beauties want to meet you”: The sexualization of women in google’s organic and sponsored text search results. New Media & Society, 0(0), 14614448221099536. https://doi.org/10.1177/14614448221099536 [urman2022foreign]

    Urman, A., Makhortykh, M., & Ulloa, R. (2022). Auditing the representation of migrants in image web search results. Humanit Soc Sci Commun, 9(1), 5. https://doi.org/10.1057/s41599-022-01144-1 [urman2022auditing]

    Urman, A., Makhortykh, M., Ulloa, R., & Kulshrestha, J. (2022). Where the earth is flat and 9/11 is an inside job: A comparative algorithm audit of conspiratorial information in web search results. Telematics and Informatics, 72, 101860. https://doi.org/10.1016/j.tele.2022.101860 [urman2022earth]

    Zade, H., Wack, M., Zhang, Y., Starbird, K., Calo, R., Young, J., & West, J. D. (2022). Auditing google’s search headlines as a potential gateway to misleading content. Journal of Online Trust and Safety, 1(4). https://doi.org/10.54501/jots.v1i4.72 [zade2022auditing]

    Tags: public-interest-technology, seo-for-social-good, search-audits

    August 30, 2023

    "The robot is not, in my opinion, a skip."
    @mattbeane via Twitter on Aug 30, 2023

    I came across this in my dissertation today. It stopped me in my tracks.

    Most studies show robotic surgery gets equivalent outcomes to traditional surgery. You read data like this and you wonder about how much skill remains under the hood in the profession…

    The word 'skip' is highlighted in the sentence: The robot is not, in my opinion, a skip. The full paragraph of text: It's not the same as doing a weekend course with intuitive surgical and then saying you're a robotic surgeon and now offering it at your hospital [italics indicate heavy emphasis]. I did 300 and something cases a as a fellow on the robot and 300 and something cases laparoscopically. So a huuuge difference in the level of skill set since I was operating four days a week as opposed to the guy who's offering robotic surgery of surgery and does it twice a month, okay? The way I was trained, and the way I train my residents, my fellows and the people I train at the national level is that you need to know how to do a procedure laparoscopically first before you'd tackle it robotically. The robot is not, in my opinion, a skip. You don't jump from open to robot, although that is exactly what has happened in the last five years. For the vast majority, and it's a marketing, money issue driven by Intuitive. No concern for patient care. And unfortunately, the surgeons who don't have the laparoscopic training who have been working for 10 to 15 years - panic, because they're like "I can't do minimally invasive surgery, maybe I can do it with the robot." Right? And then that'll help with marketing and it's a money thing, so you're no longer thinking about patient care it's now driven by money from Intuitive's. perspective and from the practice perspective. This is all a mistake. This is a huge fucking mistake. - AP

    References

    Beane, M. (2017). Operating in the shadows: The productive deviance needed to make robotic surgery work [PhD thesis, MIT]. http://hdl.handle.net/1721.1/113956 [beane2017operating]

    Microsoft CFP: "Accelerate Foundation Models Research"

    Note: “Foundation model” is another term for large language model (or LLM).

    Microsoft Research on Aug 24, 2023
    Accelerate Foundation Models Research

    …as industry-led advances in AI continue to reach new heights, we believe that a vibrant and diverse research ecosystem remains essential to realizing the promise of AI to benefit people and society while mitigating risks. Accelerate Foundation Models Research (AFMR) is a research grant program through which we will make leading foundation models hosted by Microsoft Azure more accessible to the academic research community via Microsoft Azure AI services.

    Potential research topics
    Align AI systems with human goals and preferences

    (e.g., enable robustness, sustainability, transparency, trustfulness, develop evaluation approaches)

    • How should we evaluate foundation models?
    • How might we mitigate the risks and potential harms of foundation models such as bias, unfairness, manipulation, and misinformation?
    • How might we enable continual learning and adaptation, informed by human feedback?
    • How might we ensure that the outputs of foundation models are faithful to real-world evidence, experimental findings, and other explicit knowledge?
    Advance beneficial applications of AI

    (e.g., increase human ingenuity, creativity and productivity, decrease AI digital divide)

    • How might we advance the study of the social and environmental impacts of foundation models?
    • How might we foster ethical, responsible, and transparent use of foundation models across domains and applications?
    • How might we study and address the social and psychological effects of large language models on human behavior, cognition, and emotion?
    • How can we develop AI technologies that are inclusive of everyone on the planet?
    • How might foundation models be used to enhance the creative process?
    Accelerate scientific discovery in the natural and life sciences

    (e.g., advanced knowledge discovery, causal understanding, generation of multi-scale multi-modal scientific data)

    • How might foundation models accelerate knowledge discovery, hypothesis generation and analysis workflows in natural and life sciences?
    • How might foundation models be used to transform scientific data interpretation and experimental data synthesis?
    • Which new scientific datasets are needed to train, fine-tune, and evaluate foundation models in natural and life sciences?
    • How might foundation models be used to make scientific data more discoverable, interoperable, and reusable?

    References

    Hoffmann, A. L. (2021). Terms of inclusion: Data, discourse, violence. New Media & Society, 23(12), 3539–3556. https://doi.org/10.1177/1461444820958725 [hoffmann2020terms]

    Tags: CFP-RFP

    August 28, 2023

    caught myself having questions that I normally wouldn't bother
    @chrisalbon via Twitter on Aug 27, 2023

    Probably one of the best things I’ve done since ChatGPT/Copilot came out is create a “column” on the right side of my screen for them.

    I’ve caught myself having questions that I normally wouldn’t bother Googling but if since the friction is so low, I’ll ask of Copilot.

    [I am confused about this]
    @hyperdiscogirl via Twitter on Aug 27, 2023

    I was confused about someone’s use of an idiom so I went to google it but instead I googled “I am confused about this” and then stared at the results page, confused

    Tags: found-queries

    Tech Policy Press on Choosing Our Words Carefully

    https://techpolicy.press/choosing-our-words-carefully/

    This episode features two segments. In the first, Rebecca Rand speaks with Alina Leidinger, a researcher at the Institute for Logic, Language and Computation at the University of Amsterdam about her research– with coauthor Richard Rogers– into which stereotypes are moderated and under-moderated in search engine autocompletion. In the second segment, Justin Hendrix speaks with Associated Press investigative journalist Garance Burke about a new chapter in the AP Stylebook offering guidance on how to report on artificial intelligence.

    HTT: Alina Leidinger (website, Twitter)

    The paper in question: Leidinger & Rogers (2023)

    abstract:

    Warning: This paper contains content that may be offensive or upsetting.

    Language technologies that perpetuate stereotypes actively cement social hierarchies. This study enquires into the moderation of stereotypes in autocompletion results by Google, DuckDuckGo and Yahoo! We investigate the moderation of derogatory stereotypes for social groups, examining the content and sentiment of the autocompletions. We thereby demonstrate which categories are highly moderated (i.e., sexual orientation, religious affiliation, political groups and communities or peoples) and which less so (age and gender), both overall and per engine. We found that under-moderated categories contain results with negative sentiment and derogatory stereotypes. We also identify distinctive moderation strategies per engine, with Google and DuckDuckGo moderating greatly and Yahoo! being more permissive. The research has implications for both moderation of stereotypes in commercial autocompletion tools, as well as large language models in NLP, particularly the question of the content deserving of moderation.

    References

    Leidinger, A., & Rogers, R. (2023). Which stereotypes are moderated and under-moderated in search engine autocompletion? Proceedings of the 2023 Acm Conference on Fairness, Accountability, and Transparency, 1049–1061. https://doi.org/10.1145/3593013.3594062 [leidinger2023stereotypes]

    Tags: to-look-at, search-autocomplete, artificial intelligence

    open source project named Quivr...
    @bradneuberg via Twitter on Aug 26, 2023

    Open source project named Quivr that indexes your local files on your machine & allows you to query them with large language models. I want something like this but directly integrated into my Macs Apple Notes + all my browser tabs & history, local on PC

    Tags: local-search

    August 22, 2023

    "And what matters is if it works."
    This is a comment about Kabir et al. (2023), following a theme in my research. @NektariosAI is replying to @GaryMarcus saying: “the study still confirms something I (and others) have been saying: people mistake the grammaticality etc of LLMs for truth.”
    @NektariosAI via Twitter on Aug 10, 2023

    I understand. But when it comes to coding, if it’s not true, it most likely won’t work. And what matters is if it works. Only a bad programmer will accept the answer without testing it. You may need a few rounds of prompting to get to the right answer and often it knows how to correct itself. It will also suggest other more efficient approaches.

    References

    Kabir, S., Udo-Imeh, D. N., Kou, B., & Zhang, T. (2023). Who answers it better? An in-depth analysis of chatgpt and stack overflow answers to software engineering questions. http://arxiv.org/abs/2308.02312 [kabir2023answers]

    Widder, D. G., Nafus, D., Dabbish, L., & Herbsleb, J. D. (2022, June). Limits and possibilities for “ethical AI” in open source: A study of deepfakes. Proceedings of the 2022 ACM Conference on Fairness, Accountability, and Transparency. https://davidwidder.me/files/widder-ossdeepfakes-facct22.pdf [widder2022limits]

    Tags: treating information as atomic

    August 4, 2023

    Are prompts—& queries—not Lipschitz?
    @zacharylipton via Twitter on Aug 3, 2023

    Prompts are not Lipschitz. There are no “small” changes to prompts. Seemingly minor tweaks can yield shocking jolts in model behavior. Any change in a prompt-based method requires a complete rerun of evaluation, both automatic and human. For now, this is the way.

    References

    Hora, A. (2021, May). Googling for software development: What developers search for and what they find. 2021 IEEE/ACM 18th International Conference on Mining Software Repositories (MSR). https://doi.org/10.1109/msr52588.2021.00044 [hora2021googling]

    Lurie, E., & Mulligan, D. K. (2021). Searching for representation: A sociotechnical audit of googling for members of U.S. Congress. https://arxiv.org/abs/2109.07012 [lurie2021searching_facctrec]

    Trielli, D., & Diakopoulos, N. (2018). Defining the role of user input bias in personalized platforms. Paper presented at the Algorithmic Personalization and News (APEN18) workshop at the International AAAI Conference on Web and Social Media (ICWSM). https://www.academia.edu/37432632/Defining_the_Role_of_User_Input_Bias_in_Personalized_Platforms [trielli2018defining]

    Tripodi, F. (2018). Searching for alternative facts: Analyzing scriptural inference in conservative news practices. Data & Society. https://datasociety.net/output/searching-for-alternative-facts/ [tripodi2018searching]

    Tags: prompt engineering

    August 3, 2023

    Keyword search is dead?
    Keyword search is dead?

    Perhaps we might rather say that other search modalities are now showing more signs of life? Though perhaps also distinguish keyword search from fulltext search or with reference to various ways searching is mediated (from stopwords to noindex and search query length limits) When is keyword search still particularly valuable? (Cmd/Ctrl+F is still very alive?) How does keyword search have a role in addressing hallucination?

    Surely though, one exciting thing about this moment is how much people are reimagining what search can be.
    @vectara via Twitter on Jun 15, 2023

    Keyword search is dead. Ask full questions in your own words and get the high-relevance results that you actually need.
    🔍 Top retrieval, summarization, & grounded generation
    😵‍💫 Eliminates hallucinations
    🧑🏽‍💻 Built for developers
    ⏩ Set up in 5 mins
    vectara.com

    References

    Burrell, J. (2016). How the machine “thinks”: Understanding opacity in machine learning algorithms. Big Data & Society, 3(1), 2053951715622512. https://doi.org/10.1177/2053951715622512 [burrell2016machine]

    Duguid, P. (2012). The world according to grep: A progress from closed to open? 1–21. http://courses.ischool.berkeley.edu/i218/s12/Grep.pdf [duguid2012world]

    Tags: keyword search, hallucination, full questions, automation bias, opening-closing, opacity, musingful-memo

    OWASP Top 10 for Large Language Model Applications

    Here is the ‘OWASP Top 10 for Large Language Model Applications’. Overreliance is relevant to my research.

    (I’ve generally used the term “automation bias”, though perhaps a more direct term like overreliance is better.)

    You can see my discussion in the “Extending searching” chapter of my dissertation (particularly the sections on “Spaces for evaluation” and “Decoupling performance from search”) as I look at how data engineers appear to effectively address related risks in their heavy use of general-purpose web search at work. I’m very focused on how the searcher is situated and what they are doing well before and after they actually type in a query (or enter a prompt).

    Key lessons in my dissertation: (1) The data engineers are not really left to evaluate search results as they read them and assigning such responsibility could run into Meno’s Paradox (instead there are various tools, processes, and other people that assist in evaluation). (2) While search is a massive input into their work, it is not tightly coupled to their key actions (instead there are useful frictions (and perhaps fictions), gaps, and buffers).

    I’d like discussion explicitly addressing “inadequate informing” (wc?), where the information generated is accurate but inadequate given the situation-and-user.

    The section does refer to “inappropriate” content, but usage suggests “toxic” rather than insufficient or inadequate.

    OWASP on Aug 01, 2023

    The OWASP Top 10 for Large Language Model Applications project aims to educate developers, designers, architects, managers, and organizations about the potential security risks when deploying and managing Large Language Models (LLMs). The project provides a list of the top 10 most critical vulnerabilities often seen in LLM applications, highlighting their potential impact, ease of exploitation, and prevalence in real-world applications. Examples of vulnerabilities include prompt injections, data leakage, inadequate sandboxing, and unauthorized code execution, among others. The goal is to raise awareness of these vulnerabilities, suggest remediation strategies, and ultimately improve the security posture of LLM applications. You can read our group charter for more information

    OWASP Top 10 for LLM version 1.0

    LLM01: Prompt Injection
    This manipulates a large language model (LLM) through crafty inputs, causing unintended actions by the LLM. Direct injections overwrite system prompts, while indirect ones manipulate inputs from external sources.

    LLM02: Insecure Output Handling
    This vulnerability occurs when an LLM output is accepted without scrutiny, exposing backend systems. Misuse may lead to severe consequences like XSS, CSRF, SSRF, privilege escalation, or remote code execution.

    LLM03: Training Data Poisoning
    This occurs when LLM training data is tampered, introducing vulnerabilities or biases that compromise security, effectiveness, or ethical behavior. Sources include Common Crawl, WebText, OpenWebText, & books.

    LLM04: Model Denial of Service
    Attackers cause resource-heavy operations on LLMs, leading to service degradation or high costs. The vulnerability is magnified due to the resource-intensive nature of LLMs and unpredictability of user inputs.

    LLM05: Supply Chain Vulnerabilities
    LLM application lifecycle can be compromised by vulnerable components or services, leading to security attacks. Using third-party datasets, pre-trained models, and plugins can add vulnerabilities.

    LLM06: Sensitive Information Disclosure
    LLM’s may inadvertently reveal confidential data in its responses, leading to unauthorized data access, privacy violations, and security breaches. It’s crucial to implement data sanitization and strict user policies to mitigate this.

    LLM07: Insecure Plugin Design
    LLM plugins can have insecure inputs and insufficient access control. This lack of application control makes them easier to exploit and can result in consequences like remote code execution.

    LLM08: Excessive Agency
    LLM-based systems may undertake actions leading to unintended consequences. The issue arises from excessive functionality, permissions, or autonomy granted to the LLM-based systems.

    LLM09: Overreliance
    Systems or people overly depending on LLMs without oversight may face misinformation, miscommunication, legal issues, and security vulnerabilities due to incorrect or inappropriate content generated by LLMs.

    LLM10: Model Theft
    This involves unauthorized access, copying, or exfiltration of proprietary LLM models. The impact includes economic losses, compromised competitive advantage, and potential access to sensitive information.

    Tags: automation-bias, decoupling, spaces-for-evaluation, prompt-injection, inadequate-informing, Meno-Paradox

    July 31, 2023

    they answered the question
    This is partially about prompt engineering and partially about what a good essay or search does. More than answer a question, perhaps? (this is engaged with in the essay, though not to my liking). Grimm

    The linked essay includes a sentiment connected with a common theme that I think is unfounded: denying the thinking and rethinking involved in effective prompting or querying, and reformulating both, hence my tag: prompting is thinking too:

    there is something about clear writing that is connected to clear thinking and acting in the world
    I don’t think that prompting, in its various forms, encourages and supports the same exact thinking as writing, in its various forms, but we would be remiss not to recognize that significant thinking can and does take place in interacting with (and through) computational devices via UIs in different ways (across time). (The theme reminds me also of the old critique of written language itself—as relayed in Plato’s dialogues—. Such critiques were, also, both not entirely wrong and yet also very ungracious and conservative? (And it reminds me that literacy itself—reading and writing—is a technology incredibly unequally distributed, with massive implications.))
    @ianarawjo via Twitter on Jul 30, 2023

    “Then my daughter started refining her inputs, putting in more parameters and prompts. The essays got better, more specific, more pointed. Each of them now did what a good essay should do: they answered the question.”

    @CoreyRobin via Twitter on Jul 30, 2023

    I asked my 15-year-old to run through ChatGPT a bunch of take-home essay questions I asked my students this year. Initially, it seemed like I could continue the way I do things. Then my daughter refined the inputs. Now I see that I need to change course.

    https://coreyrobin.com/2023/07/30/how-chatgpt-changed-my-plans-for-the-fall/

    Tags: prompt engineering, prompting is thinking too, on questions

    July 28, 2023

    The ultimate question
    @aravsrinivas via Twitter on Jul 24, 2023

    The ultimate question is what is the question. Asking the right question is hard. Even framing a question is hard. Hence why at perplexity, we don’t just let you have a chat UI. But actually try to minimize the level of thought needed to ask fresh or follow up questions.

    @mlevchin via Twitter on Jul 24, 2023

    In a post-AI world perhaps the most important skill will be knowing how to ask a great question, generalized to knowing how to think through exactly what you want [to know.]

    Tags: search-is-hard, query-formulation, on-questions

    Cohere's Coral
    @aidangomezzz via Twitter on Jul 25, 2023

    We’re excited to start putting Coral in the hands of users!

    Coral is “retrieval-first” in the sense it will reference and cite its sources when generating an answer.

    Coral can pull from an ecosystem of knowledge sources including Google Workspace, Office365, ElasticSearch, and many more to come.

    Coral can be deployed completely privately within your VPC, on any major cloud provider.

    @cohere via Twitter on Jul 25, 2023

    Today, we introduce Coral: a knowledge assistant for enterprises looking to improve the productivity of their most strategic teams. Users can converse with Coral to help them complete their business tasks.

    https://cohere.com/coral


    Coral is conversational. Chat is the interface, powered by Cohere’s Command model. Coral understands the intent behind conversations, remembers the history, and is simple to use. Knowledge workers now have a capable assistant that can research, draft, summarize, and more.


    Coral is customizable. Customers can augment Coral’s knowledge base through data connections. Coral has 100+ integrations to connect to data sources important to your business across CRMs, collaboration tools, databases, search engines, support systems, and more.


    Coral is grounded. Workers need to understand where information is coming from. To help verify responses, Coral produces citations from relevant data sources. Our models are trained to seek relevant data based on a user’s need (even from multiple sources).


    Coral is private. Companies that want to take advantage of business-grade chatbots must have them deployed in a private environment. The data used for prompting, and the Coral’s outputs, will not leave a company’s data perimeter. Cohere will support deployment on any cloud.

    Tags: retrieval-first, grounded, Cohere

    this data might be wrong

    Screenshot of Ayhan Fuat Çelik’s “The Fall of Stack Overflow” on Observable omitted. The graph in question has since been updated.

    @natfriedman via Twitter on Jul 26, 2023

    Why the precipitous sudden decline in early 2022? That first cliff has nothing to do with ChatGPT.


    I also think this data might be wrong. Doesn’t match SimilarWeb visit data at all

    Tags: Stack Overflow, website analytics

    Be careful of concluding
    @jeremyphoward via Twitter on Jul 25, 2023

    Be careful of concluding that “GPT 4 can’t do ” on the basis you tried it once and it didn’t work for you.

    See the thread below for two recent papers showing how badly this line of thinking can go wrong, and an interesting example.

    Tags: prompt engineering, capability determination

    ssr: attention span essay or keywords?
    @katypearce via Twitter on Jul 26, 2023

    Does anyone have a quick link to a meta-analysis or a really good scholarly-informed essay on what evidence we have on the effect of technology/internet/whatever on “attention span”? Alternatively, some better search keywords than “attention span” would help too. Thanks!

    Tags: social-search-request, keyword-request

    OverflowAI
    @pchandrasekar via Twitter on Jul 27, 2023

    Today we officially launch the next stage of community and AI here at @StackOverflow: OverflowAI! Just shared the exciting news on the @WeAreDevs keynote stage. If you missed it, watch highlights of our announcements and visit https://stackoverflow.co/labs/.

    Tags: Stack Overflow, CGT

    Just go online and type in "how to kiss."
    Good Boys (2019), via Yarn
    We’re sorry. We just wanted to learn how to kiss.

    [ . . . ]

    Just go online and type in “how to kiss.”
    That’s what everyone does.

    Tags: search directive

    AnswerOverflow
    via answeroverflow.com on Jul 28, 2023

    Bringing your Discord channels to Google

    Answer Overflow is an open source project designed to bring discord channels to your favorite search engine. Set it up in minutes and bring discovery to your hidden content.

    Tags: social search, void filling

    Gorilla
    via cs.berkeley.edu on Jul 28, 2023

    🦍 Gorilla: Large Language Model Connected with Massive APIs

    Gorilla is a LLM that can provide appropriate API calls. It is trained on three massive machine learning hub datasets: Torch Hub, TensorFlow Hub and HuggingFace. We are rapidly adding new domains, including Kubernetes, GCP, AWS, OpenAPI, and more. Zero-shot Gorilla outperforms GPT-4, Chat-GPT and Claude. Gorilla is extremely reliable, and significantly reduces hallucination errors.

    Tags: CGT

    [chamungus]
    via r/NoStupidQuestions on Jul 13, 2023

    What does it mean when people from Canada and US say chamungus in meetings?

    I am from slovenia and this week we have 5 people from US and toronto office visiting us for trainings. On monday when we were first shaking hands and getting to know each other before the meetings they would say something like “chamungus” or “chumungus” or something along those lines. I googled it but I never found out what it means. I just noticed they only say that word the first time they are meeting someone.

    Anyone know what it means or what it is for?

    Tags: googled it, social search, void filling

    July 17, 2023

    Everything Marie Haynes Knows About Google’s Quality Raters
    There’s been a flurry of commentary recently on Twitter about Google’s search quality raters…

    Marie Haynes on Jul 12, 2023

    Everything We Know About Google’s Quality Raters: Who They Are, What They Do, and What It Means for Your Site If They Visit
    The inner workings of Google’s search algorithm remain shrouded in secrecy, yet one important piece of the ranking puzzle involves an army of over 16,000 contractors known as quality raters. Just what do these raters evaluate when they visit websites, and how much influence do their judgements have over search rankings?

    References

    Meisner, C., Duffy, B. E., & Ziewitz, M. (2022). The labor of search engine evaluation: Making algorithms more human or humans more algorithmic? New Media & Society, 0(0), 14614448211063860. https://doi.org/10.1177/14614448211063860 [meisner2022labor]

    Tags: Google, Search-Quality-Raters, UCIS

    Simon Willison (@simonw) on misleading pretending re LLMs and reading links
    @simonw via Twitter on Jul 14, 2023

    Just caught Claude from @AnthropicAI doing the thing where it pretends to be able to read links you give it but actually just hallucinates a summary based on keywords in the URL - using https://claude.ai

    [tweeted image omitted]

    I wrote about how misleading it is when ChatGPT does this a few months ago:

    Simon Willison on Mar 10, 2023:
    ChatGPT can’t access the internet, even though it really looks like it can
    A really common misconception about ChatGPT is that it can access URLs. I’ve seen many different examples of people pasting in a URL and asking for a summary, or asking it to make use of the content on that page in some way.
    A few weeks after I first wrote this article, ChatGPT added a new alpha feature called “Browsing” mode. This alpha does have the ability to access content from URLs, but when it does so it makes it very explicit that it has used that ability, displaying additional contextual information [ . . . ]

    Tags: hallucination, Anthropic-Claude, OpenAI-ChatGPT

    Should we not "just google" phone numbers?
    @swiftonsecurity via Twitter on Jul 17, 2023

    My firm went through hell on earth to get our phone number on Google Maps updated. Google has malicious insider or a process has been hacked to get all these scammer replacements.

    @Shmuli via Twitter on Jul 17, 2023

    My (???) flight got canceled from JFK. The customer service line was huge, so I google a Delta JFK phone number. The number was 1888-571-4869 Thinking I reached Delta, I started telling them about getting me on a new flight.

    Tags: Google, do not just google

    July 11, 2023

    Claude 2 on my Claude Shannon hallucination test

    Added September 28, 2023 11:18 PM (PDT)

    It appears that my attempts to stop the search systems from adopting these hallucinated claims have failed. I share on Twitter screenshots of various search systems, newly queried with my Claude Shannon hallucination test, highlighting an LLM response, returning multiple LLM response pages in the results, or citing to my own page as evidence for such a paper. I ran those tests after briefly testing the newly released Cohere RAG.

    Added October 06, 2023 10:59 AM (PDT)

    An Oct 5 article from Will Knight in Wired discusses my Claude Shannon “hallucination” test: Chatbot Hallucinations Are Poisoning Web Search

    A round-up here: Can you write about examples of LLM hallucination without poisoning the web?

    Reminder: I think “hallucination” of the sort I will show below is largely addressable with current technology. But, to guide our practice, it is useful to remind ourselves of where it has not yet been addressed.
    @AnthropicAI via Twitter on Jul 11, 2023

    Introducing Claude 2! Our latest model has improved performance in coding, math and reasoning. It can produce longer responses, and is available in a new public-facing beta website at http://claude.ai in the US and UK.

    Tags: hallucination, Anthropic-Claude, false-premise

    July 10, 2023

    "tap the Search button twice"
    @nadaawg via Threads on Jul 6, 2023

    But what about that feature where you tap the Search button twice and it pops open the keyboard?


    @spotify way ahead of the curve

    Single-tap.

    A Spotify mobile app search screen showing explore options. Screenshot taken manually on iOS at roughly: 2023-07-10 09:40


    Double-tap.

    A Spotify mobile app search screen showing keyboard ready to afford typing. Screenshot taken manually on iOS at roughly: 2023-07-10 09:40

    Tags: micro interactions in search

    July 7, 2023

    GenAI "chat windows"
    @gergelyorosz

    What are good (and efficient) alternatives to ChatGPT *for writing code* or coding-related topics?

    So not asking about Copilot alternatives. But GenAI “chat windows” that have been trained on enough code to be useful in e.g. scaffolding, explaining coding concepts etc.

    On Twitter Jul 5, 2023

    Tags: CGT

    "I wish I could ask it to narrow search results to a given time period"
    @mati_faure

    Thanks for the recommendation, it’s actually great for searching! I wish I could ask it to narrow search results to a given time period though (cc @perplexity_ai)

    On Twitter Jul 7, 2023

    Tags: temporal-searching, Perplexity-AI

    July 6, 2023

    Kagi and generative search
    https://blog.kagi.com/:
    Kagi is building a novel ad-free, paid search engine and a powerful web browser as a part of our mission to humanize the web.
    Kagi: Kagi’s approach to AI in search

    Kagi Search is pleased to announce the introduction of three AI features into our product offering.

    We’d like to discuss how we see AI’s role in search, what are the challenges and our AI integration philosophy. Finally, we will be going over the features we are launching today.

    on the open Web Mar 16, 2023

    Tags: generative-search, Kagi

    July 5, 2023

    [Please summarize Claude E. Shannon's "A Short History of Searching" (1948).]

    Added September 28, 2023 11:18 PM (PDT)

    It appears that my attempts to stop the search systems from adopting these hallucinated claims have failed. I share on Twitter screenshots of various search systems, newly queried with my Claude Shannon hallucination test, highlighting an LLM response, returning multiple LLM response pages in the results, or citing to my own page as evidence for such a paper. I ran those tests after briefly testing the newly released Cohere RAG.

    Added October 01, 2023 12:57 AM (PDT)

    I noticed today that Google's Search Console–in the URL Inspection tool–flagged a missing field in my schema:
    Missing field "itemReviewed"
    This is a non-critical issue. Items with these issues are valid, but could be presented with more features or be optimized for more relevant queries
    In the hopes of finding out how to better discuss problematic outputs from LLMs, I went back to Google's Fact Check Markup Tool and added the four URLs that I have for the generated false claims. I then updated the schema in this page (see the source, for ease of use, see also this gist that shows the two variants.)

    Added October 06, 2023 10:59 AM (PDT)

    An Oct 5 article from Will Knight in Wired discusses my Claude Shannon "hallucination" test: Chatbot Hallucinations Are Poisoning Web Search

    A round-up here: Can you write about examples of LLM hallucination without poisoning the web?

    The comment below prompted me to do a single-query prompt test for "hallucination" across various tools. Results varied. Google's Bard and base models of OpenAI's ChatGPT and others failed to spot the imaginary reference. You.com, Perplexity AI, Phind, and ChatGPT-4 were more successful.

    I continue to be impressed by Phind's performance outside of coding questions (their headline is "The AI search engine for developers").

    @anthonymoser via Bluesky on Jul 4, 2023

    I'm imagining an instructor somewhere making a syllabus with chat gpt, assigning reading from books that don't exist

    But the students don't notice, because they are asking chat gpt to summarize the book or write the essay

  • I generally think addressing hallucination of this second sort (summarizing fake papers) is low-hanging fruit. The remedies seem straight forward (though not free) and the incentives appear to be well-aligned.
  • But I was surprised at how poorly ChatGPT performed on a simplistic mock-attempt at the student prompt here. Running on other tools was also pretty disappointing.
  • Granted, models may perform worse if the title itself were hallucinated. It is likely the author-and-title tested below title is somewhat in their hallucinatory-space, whereas other titles may not be. For instance, ChatGPT correctly noted that neither Stephen Hawking nor Plato had a piece by that title
  • See also

    Added October 01, 2023 12:57 AM (PDT):

    ChatGPT [ Please summarize Claude E. Shannon's "A Short History of Searching" (1948). ]
    A ChatGPT.GPT-4[Please summarize Claude E. Shannon's "A Short History of Searching" (1948).] search. Screenshot taken with GoFullPage (distortions possible) at: 2023-07-04 23:06:36
    I conducted a follow-on test today and ChatGPT 3.5 still failed:
    "A Short History of Searching" is an influential paper written by Claude E. Shannon in 1948. In this paper, Shannon provides a historical overview of searching techniques and the development of information retrieval systems.

    Note: Andi does not hallucinate the contents of such a paper.
    Andi [ Please summarize Claude E. Shannon's "A Short History of Searching" (1948). ]
    A Andi[Please summarize Claude E. Shannon's "A Short History of Searching" (1948).] search. Screenshot taken with GoFullPage (distortions possible) at: 2023-07-04 23:32:24

    Bard [ Please summarize Claude E. Shannon's "A Short History of Searching" (1948). ]
    A Bard[Please summarize Claude E. Shannon's "A Short History of Searching" (1948).] search. Screenshot taken with GoFullPage (distortions possible) at: 2023-07-04 23:16:40

    Note: Perplexity AI takes the paper title at face value and hallucinates only briefly the contents before expanding on other work. (In a follow-on test (after querying Perplexity AI's Copilot), to account for my misordered test of You.com & You.com's GPT-4 version, does better at indicating the reference may be imaginary: Claude E. Shannon's "A Short History of Searching" is not mentioned in the search results....)
    Perplexity AI [ Please summarize Claude E. Shannon's "A Short History of Searching" (1948). ]
    A Perplexity AI[Please summarize Claude E. Shannon's "A Short History of Searching" (1948).] search. Screenshot taken with GoFullPage (distortions possible) at: 2023-07-04 23:15:29

    Inflection AI Pi [ Please summarize Claude E. Shannon's "A Short History of Searching" (1948). ]
    A Inflection AI Pi[Please summarize Claude E. Shannon's "A Short History of Searching" (1948).] search. Screenshot taken with GoFullPage (distortions possible) at: 2023-07-04 23:35:49 [screenshot manually trimmed to remove excess blankspace]

    Yes, even the namesake model struggles here.

    via Quora's Poe

    Claude Instant [ Please summarize Claude E. Shannon's "A Short History of Searching" (1948). ]
    A Claude Instant[Please summarize Claude E. Shannon's "A Short History of Searching" (1948).] search. Screenshot taken with GoFullPage (distortions possible) at: 2023-07-04 23:35:16 [screenshot manually trimmed to remove excess blankspace]

    ✅ Note: I messed up this test. The timestamp for the base model search on You.com is _after_ my search on the GPT-4 model. It is possible that their base model draws on a database of previous responses from the better model.
    You.com [ Please summarize Claude E. Shannon's "A Short History of Searching" (1948). ]
    A You.com[Please summarize Claude E. Shannon's "A Short History of Searching" (1948).] search. Screenshot taken with GoFullPage (distortions possible) at: 2023-07-05 11:22:19

    ✅ Note: While I believe GPT-4 was selected when I submitted the query, I am not sure (given it can be toggled mid-conversation?).
    You.com.GPT-4 [ Please summarize Claude E. Shannon's "A Short History of Searching" (1948). ]
    A You.com.GPT-4[Please summarize Claude E. Shannon's "A Short History of Searching" (1948).] search. Screenshot taken with GoFullPage (distortions possible) at: 2023-07-04 23:14:49


    Note: This is omitting the Copilot interaction where I was told-and-asked "It seems there might be a confusion with the title of the paper. Can you please confirm the correct title of the paper by Claude E. Shannon you are looking for?" I responded with the imaginary title again.
    Perplexity AI.Copilot [ Please summarize Claude E. Shannon's "A Short History of Searching" (1948). ]
    A Perplexity AI.Copilot[Please summarize Claude E. Shannon's "A Short History of Searching" (1948).] search. Screenshot taken with GoFullPage (distortions possible) at: 2023-07-04 23:39:13

    Phind [ Please summarize Claude E. Shannon's "A Short History of Searching" (1948). ]
    A Phind[Please summarize Claude E. Shannon's "A Short History of Searching" (1948).] search. Screenshot taken with GoFullPage (distortions possible) at: 2023-07-04 23:37:20

    ChatGPT.GPT-4 [ Please summarize Claude E. Shannon's "A Short History of Searching" (1948). ]
    A ChatGPT.GPT-4[Please summarize Claude E. Shannon's "A Short History of Searching" (1948).] search. Screenshot taken with GoFullPage (distortions possible) at: 2023-07-05 11:16:03

    Tags: hallucination, comparing-results, imaginary-references, Phind, Perplexity-AI, You.com, Andi, Inflection-AI-Pi, Google-Bard, OpenAI-ChatGPT, Anthropic-Claude, data-poisoning, false-premise

    June 30, 2023

    "the text prompt is a poor UI"

    This tweet is a reply—from the same author—to the tweet in: very few worthwhile tasks? (weblink).

    [highlighting added]

    @benedictevans

    In other words, I think the text prompt is a poor UI, quite separate to the capability of the model itself.

    On Twitter Jun 29, 2023

    Tags: text-interface

    June 29, 2023

    all you need is Sourcegraph's Cody?

    Downloaded.

    You’re all set

    Once embeddings are finished being generated, you can specify Cody’s context and start asking questions in the Cody Chat.

    Current status: “Generating repositories embeddings”
    @steve_yegge

    https://about.sourcegraph.com/blog/all-you-need-is-cody

    I’m excited to announce that Cody is here for everyone. Cody can explain, diagnose, and fix your code like an expert, right in your IDE. No code base is too challenging for Cody.

    It’s like having your own personal team of senior engineers. Try it out!

    Tweet Jun 28, 2023


    Added: 2023-06-30 16:18:08

    Current status: “Generating repositories embeddings”

    @tonofcrates

    Looking forward to it! If the tool is in beta, I might consider saying that more prominently. Neither Steve’s post nor the Sourcegraph website make that clear. I only just found “Cody AI is in beta” as a sentence in the VSCode plugin README.

    Tweet Jun 29, 2023

    Tags: CGT, Sourcegraph

    very few worthwhile tasks?

    What is a “worthwhile task”?

    [highlighting added]

    @benedictevans

    The more I look at chatGPT, the more I think that the fact NLP didn’t work very well until recently blinded us to the fact that very few worthwhile tasks can be described in 2-3 sentences typed in or spoken in one go. It’s the same class of error as pen computing.

    On Twitter Jun 29, 2023

    References

    Reddy, M. J. (1979). The conduit metaphor: A case of frame conflict in our language about language. In A. Ortony (Ed.), Metaphor and thought. Cambridge University Press. https://www.reddyworks.com/the-conduit-metaphor/original-conduit-metaphor-article [reddy1979conduit]


    Footnotes

    1. Reddy (1979):

      Human communication will almost always go astray unless real energy is expended.

      ↩︎

    References

    Gillespie, T. (2018). Custodians of the internet: Platforms, content moderation, and the hidden decisions that shape social media. Yale University Press. https://yalebooks.yale.edu/book/9780300261431/custodians-of-the-internet/ [gillespie2018custodians]

    Tags: autocomplete-in-web-search, safesearch, search-quality-complaints

    November 30, 2023

    TRUST ISSUES
    Data & Society’s TRUST ISSUES: Perspectives on Community, Technology, and Trust on Nov 29, 2023

    Can trust be built into systems that users have determined to be untrustworthy? Should we be thinking of trust as something that is declining or improving, something to be built into AI and other data-centric systems, or as something that is produced through a set of relations and in particular locations? Where else, besides large institutions and their technologies, is trust located? How do other frames of trust produce community-centered politics such as politics of refusal or data sovereignty? What can community-based expertise tell us about how trust is built, negotiated, and transformed within and to the side of large-scale systems? Is there a disconnect between the solutions to a broad lack of trust and how social theorists, community members, and cultural critics have thought about trust?

    [ . . . ]

    In our work together, we aim to move away from a concept of trust that is inherent to the object (e.g. information as trustworthy) or a concept of trust that is overly normative (prescribing trust as a goal that should be achieved), and toward a concept of trust as a relational process. We will work toward an empirical grounding of how trust is stymied, broken, established, reestablished, co-opted, and redirected among the powerful and among communities who have never been able to fully trust the institutions that shape their lives.

    Tags: CFP-RFP, trust, doubting

    November 29, 2023

    “I finally use zero Google products for my personal life”
    @mitchellh via Twitter on Nov 29, 2023

    It took about 6 months of gradual change (shorter than I expected) but I finally use zero Google products for my personal life. Moved my email to Fastmail with my own domain, browser to Safari, maps to Apple, search to Kagi. Less painful than I expected, no noticeable downsides.


    See also: refusal in shortcuts/goldenfein2022platforming/

    Tags: refusal, Kagi

    “search terms that confirm preconceptions”
    @danielsgriffin via Twitter on Nov 11, 2023

    +1
    In a “search terms that confirm preconceptions” vignette (Haider & Rödl 2023):
    “‘I do not sit down and google “milk good for health”, I google “milk bad”’ … through choice of search terms, Google Search can be said to serve as a tool for confirmation bias (Tripodi, 2022)…”


    This tweet quotes from and cites Haider & Rödl (2023). This tweet is in reply to this, from Gagan Ghotra:

    @gaganghotra_ via Twitter on Nov 11, 2023

    Keywords based Search can’t help you because it just show you what you want and ignore the real Truth

    Should I …… says Yes take the shower but warm

    Should I not ……. says no don’t take shower even if it’s warm

    Image 1

    Image 2

    References

    Haider, J., & Rödl, M. (2023). Google search and the creation of ignorance: The case of the climate crisis. Big Data &Amp; Society, 10(1), 205395172311589. https://doi.org/10.1177/20539517231158997 [haider2023google]

    Tags: confirmation-biased-queries

    Perplexity AI announcing generative search APIs
    @aravsrinivas via Twitter on Nov 29, 2023

    Excited to announce that pplx-api is coming out of beta and moving to usage based pricing, along with the first-ever live LLM APIs that are grounded with web search data and have no knowledge cutoff!

    @denisyarats via Twitter on Nov 29, 2023

    Excited to release our online LLMs! These models have internet access and perform very well on prompts that require factuality and up-to-date information.

    read more here: blog.perplexity.ai/blog/introducing-pplx-online-llms

    Image 1

    @perplexity_ai via Twitter on Nov 29, 2023

    We’re thrilled to announce two online LLMs we’ve trained: pplx-7b-online and pplx-70b-online! Built on top of open-source LLMs and fine-tuned to use knowledge from the internet. They are now available via Labs and in a first-of-its-kind live-LLM API.

    Tags: Perplexity-AI, generative-search, generative-search-API, knowledge-cutoffs

    November 27, 2023

    “Q: “What is Google doing” A: Selling ads.”
    @AravSrinivas via Twitter on Nov 27, 2023

    Q: “What is Google doing”
    A: Selling ads.

    Image 1

    Tags: advertising-in-search

    November 17, 2023

    ssr: “is there a well known service that expands any arbitrary links for LLM consumption?”
    @swyx via Twitter on Nov 16, 2023

    Dev friends - is there a well known service that expands any arbitrary links for LLM consumption?

    example:

    input: any url (twitter, youtube, HN, medium, github, discord, reddit)

    output: opengraph content, but also post body, optionally summary of the post content

    usecase:

    when people drop links, its useful for the LLM to “read it a little bit” rather than only read the URL (as @simonw has shown, LLMs can hallucinate content from URLs)

    Tags: social-search-request

    “I don't personally know a single professional who'd be using GPT for anything seriously pertaining to their work”
    @filippie509 via Twitter on Nov 13, 2023

    Anecdotal but I realized I don’t personally know a single professional who’d be using GPT for anything seriously pertaining to their work, beyond what I’d call amusement. People are “playing with it”, “probing it”, “experimenting” but nobody is actually “using” it for anything.

    I think there are two interesting things here:

    1. Just flagging a strong claim about actual use. Do we know? How much does serious work or not integrate with ideas like Graeber’s “Bullshit Jobs”? How much does non-serious and non-amusement still provide significant value? I’m not sure if the seriousness or not here is engaging with critiques of AGI or actual utility.

    2. The language of “playing with it”, “probing it”, “experimenting” makes me think of how the data engineers in my dissertation interviews would sort of hold web search at a distance or refer to only jokingly, not wanting, it seemed, to fully admit the significant role it played in the performance of their expertise.

    Tags: actual-use, admitting-searching, search-confessions

    “Personally I've found Bingchat with search, and GPT4 with Browse to be far more confused at all times than the no-RAG model”

    Comments below this tweet discuss explicitly instructing ChatGPT not to search. I think there is something here, in addition to concerns about the constrained browsing and limited extent of reformulating queries and follow-on searches conducted by some of these systems, that connects with concerns we might have about the various domains where we think “a quick search” or “just google it” is/isn’t appropriate.

    @teknium1 via Twitter on Nov 16, 2023

    Personally I’ve found Bingchat with search, and GPT4 with Browse to be far more confused at all times than the no-RAG model

    Tags: RAG, search-avoidance, a-quick-search, just-google-it

    ROBOTS.TXT PARSER
    Will Critchlow’s ROBOTS.TXT PARSER on Nov 17, 2023

    PARSE YOUR ROBOTS.TXT FILE THE SAME WAY GOOGLE’S CRAWLERS DO

    Tags: robots-txt

    Nov 16, 2023 update to Google's Search Quality Rater Guidelines
    @glenngabe via Twitter on Nov 16, 2023

    Heads-up, the Quality Rater Guidelines have been updated (as of 11/16)

    “Specifically, we’ve simplified the”Needs Met" scale definitions, added more guidance for different kinds of web pages and modern examples including newer content formats such as short form video, removed outdated and redundant examples, and expanded rating guidance for forum and discussion pages. None of these involve any major or foundational shifts in our guidelines." https://developers.google.com/search/blog/2023/11/search-quality-rater-guidelines-update…

    Image 1

    @cyrusshepard via Twitter on Nov 17, 2023

    Google released a new version of its Quality Rater Guidelines. One of the big changes, IMO:

    The Relationship between Page Quality and Needs Met

    In the new version, High-Quality pages (often big brands) have less of an advantage on Needs Met

    Is this to level the playing field?

    Image 1

    Image 2

    Tags: SQRG, Search-Quality-Raters

    “Perplexity user: “Microsoft banned your domain, so can't use it on work devices”.”

    This is interesting. But is it a signal or noise?

    @aravsrinivas via Twitter on Nov 13, 2023

    Perplexity user: “Microsoft banned your domain, so can’t use it on work devices”.

    Tags: blocking-search-tools, Perplexity-AI

    Open Philanthropy RFP: “studying and forecasting the real-world impacts of systems built from LLMs”
    Open Philanthropy’s Request for proposals: studying and forecasting the real-world impacts of systems built from LLMs on Nov 10, 2023

    seeking proposals for a wide variety of research projects which might shed light on what real-world impacts LLM systems could have over the next few years.

    Below are some examples of project ideas that could make for a strong proposal to this RFP, depending on details:

    • Conducting randomized controlled trials…
    • Polling members of the public…
    • In-depth interviews…
    • Collecting “in the wild” case studies…
    • Estimating and collecting key numbers…
    • Creating interactive experiences…
    • Eliciting expert forecasts…
    • Synthesizing, summarizing, and analyzing the various existing lines of evidence about what language model systems can and can’t do at present…

    This is interesting. As often is the case, the resources gathered in the CFP merit note. I could imagine a proposal developed around “studying and forecasting the real-world impacts of [search] systems built from LLMs”…

    Tags: CFP-RFP

    November 16, 2023

    Thread from @⁠searchliaison: “Last week, I gave a presentation about Google Search results not being perfect, how we...
    @searchliaison via Twitter on Nov 16, 2023

    Last week, I gave a presentation about Google Search results not being perfect, how we update to improve those results, and how our guidance to creators needs to improve. In this thread, I’ll share my slides and commentary for those who weren’t able to attend my talk…

    Further down in the thread is this line from Google’s Search Liaison (GSL):

    The gap between what Google says to creators and what creators hear about being successful in Google Search needs to get better.

    This, and remarks throughout the thread, connects with our discussion in Griffin & Lurie (2022): the GSL appears to be much more heavily engaged with the search engine optimization (SEO) community than searchers writ-large.1

    References

    Griffin, D., & Lurie, E. (2022). Search quality complaints and imaginary repair: Control in articulations of Google Search. New Media & Society, 0(0), 14614448221136505. https://doi.org/10.1177/14614448221136505 [griffin2022search]

    Tags: decontextualized, queries-and-prompts, extending searching

    definitions of prompt engineering evolving and shapeshifting

    the definition(s) and use(s) of “prompt engineering” will continue to evolve, shapeshift, fragment and become even more multiple and context-dependent. But still a useful handle?

    [highlighting added]

    @yoavgo

    1. i didnt look in details yet but this is roughly what i’d imagine a chaining tool api to look like (ahm langchain, ahm).
    2. its interesting how the definition of “prompt engineering” evolves and shapeshifts all the time.

    @jxnlco

    Why prompt engineer @openai with strings?

    Don’t even make it string, or a dag, make it a pipeline.

    Single level of abstraction:

    Tool and Prompt and Context and Technique? its the same thing, it a description of what I want.

    The code is the prompt. None of this shit
    “{}{{}} {}}”.format{“{}{}”

    PR in the next tweet.

    Tweet from @jxnlco Tweet Jun 29, 2023

    Tweet Jun 29, 2023

    Tags: prompt engineering

    June 27, 2023

    imagining OpenGoogle?
    @generativist via Twitter on Jun 26, 2023

    i imagine there’s no alpha left in adding the word “open” to various names anymore, right?

    Tags: speculative-design

    Perplexity, Ads, and SUVs

    I don’t think ads[1] are necessarily wrong to have in search results (despite the misgivings in Brin & Page (1998)), but people are definitely not happy with how the dominant search engine has done ads.


    1. relevant, clearly labelled, and fair (as in not unfair in the FTC sense)

    It is pretty striking to me how text-heavy Perplexity AI’s SERP is for this query: “highly-rated” x10?

    My experience has generally been much better, but I’m not normally doing queries like this.

    Here’s a link to the same query as that in the screenshot below (which, is likely not using their Copilot):

    Perplexity AI [ I want to buy a new SUV which brand is best? ]

    • note also the generated follow-on prompts under Related
    @jowyang

    Seven reasons why perplexity.ai is better than Google search:

    1. No ads.
    2. All the content in one place.
    3. No ads.
    4. You can chat with it and get additional details.
    5. No ads.
    6. Sources are provided with URLs.
    7. No ads.

    Here’s a screenshot of car reviews, as just one of infinite examples. Perplexity is focused on being the search tool in the age of AI.

    I saw a demo from @AravSrinivas at the Synthedia conference hosted by @bretkinsella. I’ll have closing remarks.

    Example search results for Perplexity AI On Twitter Jun 27, 2023

    References

    Brin, S., & Page, L. (1998). The anatomy of a large-scale hypertextual web search engine. Computer Networks, 30, 107–117. http://www-db.stanford.edu/~backrub/google.html [brin1998anatomy]

    Tags: ads, Perplexity-AI