AI/LLM: Difference between revisions

From Woozle Writes Code
==Links==
* '''2024-11-11''' [https://venturebeat.com/ai/ais-math-problem-frontiermath-benchmark-shows-how-far-technology-still-has-to-go/ AI’s math problem: FrontierMath benchmark shows how far technology still has to go]
* '''2024-11-10''' [https://arxiv.org/pdf/2406.03689 Evaluating the World Model Implicit in a Generative Model] (paper)
* '''2024-11-05''' [https://news.mit.edu/2024/generative-ai-lacks-coherent-world-understanding-1105 Despite its impressive output, generative AI doesn’t have a coherent understanding of the world]
* '''2024-11-04''' [https://direct.mit.edu/tacl/article/doi/10.1162/tacl_a_00713/125177/When-Can-LLMs-Actually-Correct-Their-Own-Mistakes When Can LLMs ''Actually'' Correct Their Own Mistakes? A Critical Survey of Self-Correction of LLMs]
* '''2024-10-07''' [https://arxiv.org/pdf/2410.05229 GSM-Symbolic: Understanding the Limitations of Mathematical Reasoning in Large Language Models] (paper by Apple researchers; [https://sfba.social/@drahardja/113310311247575811 via])
* '''2024-09-21''' [https://arxiv.org/abs/2409.14160 Hype, Sustainability, and the Price of the Bigger-is-Better Paradigm in AI] (paper)
* '''2024-08''' [https://aclanthology.org/2024.findings-acl.826/ LLMs cannot find reasoning errors, but can correct them given the error location] (paper)
* '''2024-07-27''' [https://arxiv.org/pdf/2404.03602v2 Evaluating LLMs at Detecting Errors in LLM Responses] (paper) - introduces the ReaLMistake benchmark
** [https://github.com/psunlpgroup/ReaLMistake ReaLMistake] code - but [https://github.com/psunlpgroup/ReaLMistake/blob/main/data.zip the data] is password-protected.
* '''2024-03-14''' [https://arxiv.org/abs/2310.01798 Large Language Models Cannot Self-Correct Reasoning Yet] (paper)
* '''2023-05-24''' [https://www.vice.com/en/article/3akd7y/worried-about-sending-your-data-to-a-chatbot-privategpt-is-here Worried About Sending Your Data to a Chatbot? 'PrivateGPT' Is Here]

Latest revision as of 00:22, 18 November 2024

Large Language Model AI (LLMs)

I often use these for answering questions (usually technical, occasionally political).

commercial services

ideas

  • LLMs could make federated search (of which I think YaCy is currently the only exemplar) actually workable/useful:
    • You type in your search -- either as keywords or as a natural-language query.
    • A search is done (however LLMs do it currently -- I don't know if they fine-tune the search parameters or just pass them on to a search engine).
    • The LLM then processes/glosses the top results to look for pages that seem to be most applicable to the request. If nothing passes a threshold test, it looks at the next bunch of results. If it can't find anything likely, it summarizes what it did find and notes that it may not be what you're looking for.
  • I'd like to see an LLM browser plugin that can find tabs based on natural-language queries -- "where's the one with that medical form?" or "Have I stashed any tabs about...[subject]?" Similarly, an app or library that can take a URL, load the corresponding web page, and produce a summary and keywords would also be hella useful.
  • (long-standing wish which now seems within reach) I need an LLM which can make phone calls to do basic adulting tasks, e.g. making appointments, renewing prescriptions, getting information ("is the car ready for pickup?").
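
The search idea in the first bullet (gloss the top results, fall through to the next bunch if nothing passes a threshold, otherwise summarize what was found and flag it as uncertain) can be sketched roughly as below. This is only a sketch under assumptions, not how any existing LLM or YaCy does it: `gloss_results`, `overlap_score`, the threshold, and the batch sizes are all hypothetical names and values, and the word-overlap scorer is a crude stand-in for whatever relevance judgment an LLM would actually make.

```python
import itertools
from typing import Callable, Iterable, List, Tuple


def overlap_score(query: str, page: str) -> float:
    """Stand-in scorer (assumption): fraction of query words found in the page.

    A real system would ask an LLM how applicable the page is instead.
    """
    q = set(query.lower().split())
    p = set(page.lower().split())
    return len(q & p) / len(q) if q else 0.0


def gloss_results(
    query: str,
    results: Iterable[str],
    score: Callable[[str, str], float] = overlap_score,
    threshold: float = 0.5,   # hypothetical cutoff
    batch_size: int = 10,     # hypothetical "bunch" size
    max_batches: int = 3,     # hypothetical limit on how far down to look
) -> Tuple[List[str], bool]:
    """Return (pages, confident).

    confident is True when at least one page in some batch met the threshold;
    otherwise the best few pages seen are returned with confident=False, so
    the caller can note that they may not be what was asked for.
    """
    seen = []
    remaining = iter(results)
    for _ in range(max_batches):
        batch = list(itertools.islice(remaining, batch_size))
        if not batch:
            break  # ran out of search results
        scored = [(score(query, page), page) for page in batch]
        seen.extend(scored)
        ranked = sorted(scored, key=lambda sp: sp[0], reverse=True)
        hits = [page for s, page in ranked if s >= threshold]
        if hits:
            return hits, True
    # Nothing passed the threshold: fall back to the best of what was seen.
    ranked = sorted(seen, key=lambda sp: sp[0], reverse=True)
    return [page for _, page in ranked[:3]], False


if __name__ == "__main__":
    pages = [
        "cat pictures",
        "medical form download",
        "weather today",
        "blank medical intake form",
    ]
    found, confident = gloss_results("intake form", pages, threshold=1.0, batch_size=2)
    print(found, confident)
```

Passing the scorer in as a parameter keeps the loop independent of how relevance is judged, so the word-overlap placeholder could be swapped for an LLM call without touching the batching logic.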

Links

Code Sources

  • Auto-GPT «chains together LLM "thoughts", to autonomously achieve whatever goal you set.»
    • Downside: not fully autonomous; still depends on OpenAI
  • privateGPT

Footnote

  1. ...in the sense that LLMs in general are not really what the term "AI" means to me.