LLMs
Large Language Models (LLMs) are artificial intelligence models trained on vast amounts of text data to understand and generate human-like language. Training on large text corpora lets them learn the patterns, relationships, and nuances of language. Their development has reshaped the field of NLP, enabling applications such as chatbots, virtual assistants, and machine translation tools, and has driven advances in research, education, and entertainment.
Resources
- eugeneyan/open-llms: 📋 A list of open LLMs available for commercial use. (github.com)
- SylphAI-Inc/LLM-engineer-handbook: A curated list of Large Language Model resources, covering model training, serving, fine-tuning, and building LLM applications (github.com)
- NiuTrans/ABigSurveyOfLLMs: A collection of 150+ surveys on LLMs (github.com)
- Question answering using embeddings-based search | OpenAI Cookbook
- Chat with your PDF: Using Langchain, F.A.I.S.S., and OpenAI to Query PDFs | by johnthuo | Medium
- The Large Language Model (LLM) Index | Sapling
- LLM Explorer: A Curated LLM List. Explore LLM List of the Open-Source LLM Models.
- Maxime Labonne - LLM Articles
- Maxime Labonne - Merge Large Language Models with MergeKit
- Decoding strategies in LLMs - understanding temperature, top_k, top_p, num_beams
- Pricing calculators:
APIs
- OpenAI
- https://openrouter.ai/ - A unified interface for LLMs
- Groq is Fast AI Inference
- HuggingFace Serverless Inference API
Benchmarks and evaluation
See LLMs evaluation
Training and Tuning LLMs
LLMs in production
See LLM Ops
Interfaces, model hubs and UIs
- Ollama
- Gollama - a macOS / Linux tool for managing Ollama models
- Open-WebUI - User-friendly WebUI for LLMs (Formerly Ollama WebUI)
- Jan
- LM Studio
- LLM farm
- Sanctum
- Msty
- jasonacox/TinyLLM: Setup and run a local LLM and Chatbot using consumer grade hardware. (github.com)
- Builds a local OpenAI API web service via ollama, vllm
- Serves up a Chatbot web interface with customizable prompts, accessing external websites (URLs), vector databases and other sources (e.g. news, stocks, weather)
Services and applications
See AI for scientific discovery
Search engines
Coding
- Aider - AI pair programming in your terminal
- Cursor - The AI Code Editor
- Warp: The intelligent terminal
Chatbots
- Chatbots | 🦜️🔗 Langchain
- Custom Chat Model | 🦜️🔗 Langchain
- 12 metrics for analyzing chatbot performance (in Spanish) - Planeta Chatbot
NER
Text-to-sql, data analysis
- #CODE microsoft/lida: Automatic Generation of Visualizations and Infographics using Large Language Models (github.com)
- Graphext (paid service - startup)
- Goodbye, Text2SQL: Why Table-Augmented Generation (TAG) is the Future of AI-Driven Data Queries! | by Pavan Emani | Sep, 2024 | Artificial Intelligence in Plain English
- Text2SQL is Not Enough: Unifying AI and Databases with TAG
- BIRD-bench
- The Death of Schema Linking? Text-to-SQL in the Age of Well-Reasoned Language Models
- LOTUS
Courses
- #COURSE Get into LLMs with roadmaps and Colab notebooks
- mlabonne/llm-course: Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks. (github.com)
- machinelearnear/curso-ia-generativa-y-llms: A course (in Spanish) for getting into everything about Generative AI and large language models (LLMs), with roadmaps and Colab notebooks. (github.com)
- #COURSE Large Language Models (Stanford CS324)
Code
- #CODE LMOps - General technology for enabling AI capabilities w/ LLMs and MLLMs
- #CODE openai/tiktoken - tiktoken is a fast BPE tokeniser for use with OpenAI's models
- #CODE LiteLLM - Python SDK, Proxy Server (LLM Gateway) to call 100+ LLM APIs in OpenAI format
- #CODE ExLlamaV2 - A fast inference library for running LLMs locally on modern consumer-class GPUs
- #CODE AdalFlow: The library to build & auto-optimize LLM applications.
Lifecycle and deployment
See LLM Ops
Frameworks
- Framework comparisons:
- #CODE huggingface/transformers: 🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX. (github.com)
- #CODE LangChain
- How to build a tool-using agent with LangChain | OpenAI Cookbook
- Unleashing the Power of LangChain Expression Language (LCEL): from proof of concept to production | by Tom Darmon
- Chains | 🦜️🔗 LangChain
- LangGraph: langchain-ai/langgraph: Build resilient language agents as graphs. (github.com)
- Command: A new tool for building multi-agent architectures in LangGraph
- #CODE microsoft/semantic-kernel: Integrate cutting-edge LLM technology quickly and easily into your apps (github.com)
- #CODE microsoft/autogen: A programming framework for agentic AI. Discord: https://aka.ms/autogen-dc. Roadmap: https://aka.ms/autogen-roadmap (github.com)
- #CODE microsoft/TaskWeaver
- A code-first agent framework for seamlessly planning and executing data analytics tasks (autonomous agent)
- It translates user requests into executable code and treats user-defined plugins as callable functions. TaskWeaver supports rich data structures, flexible plugin usage, dynamic plugin selection, and harnesses LLM coding capabilities for complex logic. It also incorporates domain-specific knowledge through examples and ensures the secure execution of generated code
- #PAPER TaskWeaver: A Code-First Agent Framework (arxiv.org)
- The Future of Agent Frameworks: TaskWeaver and Microsoft Autogen and Microsoft Semantic Kernel
- TaskWeaver beats AutoGen! 🚀 Microsoft's Code-First Agent Framework 🤯 ( LIVE DEMO Full Tutorial ) - YouTube
- #CODE Haystack | Haystack (deepset.ai)
- deepset-ai/haystack: :mag: LLM orchestration framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file converters) to pipelines or agents that can interact with your data. With advanced retrieval methods, it's best suited for building RAG, question answering, semantic search or conversational agent chatbots. (github.com)
- What is Haystack? | Haystack (deepset.ai)
- Haystack: An Alternative to Langchain carrying LLMs | by Amanatullah | 𝐀𝐈 𝐦𝐨𝐧𝐤𝐬.𝐢𝐨 | Medium
- Customizing Agent to Chat with Your Documents | Haystack (deepset.ai)
- LM Format Enforcer | Haystack (deepset.ai)
- Generating Structured Output with Loop-Based Auto-Correction | Haystack (deepset.ai)
- #CODE LlamaIndex, Data Framework for LLM Applications
- Llama-cloud, llama-parse: Introducing LlamaCloud and LlamaParse — LlamaIndex, Data Framework for LLM Applications
- #CODE stanfordnlp/dspy: DSPy: The framework for programming—not prompting—foundation models (github.com)
- #CODE embedchain/embedchain: The Open Source RAG framework (github.com)
- #CODE sgl-project/sglang: SGLang is a structured generation language designed for large language models (LLMs). It makes your interaction with models faster and more controllable. (github.com)
- #CODE lm-sys/FastChat: An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena. (github.com)
References
- #PAPER InstructGPT - Training language models to follow instructions with human feedback (Ouyang 2022)
- https://openai.com/blog/chatgpt
- https://en.wikipedia.org/wiki/ChatGPT
- See GPT-3 in AI/Deep Learning/Transformers
- ChatGPT, a generative pre-trained transformer (GPT), was fine-tuned on top of GPT-3.5 using both supervised learning and reinforcement learning, with human trainers involved in both stages. In the supervised step, the model was trained on conversations in which the trainers played both sides: the user and the AI assistant. In the reinforcement learning step, human trainers first ranked responses that the model had generated in earlier conversations. These rankings were used to train reward models, against which the model was then fine-tuned over several iterations of Proximal Policy Optimization (PPO).
- #PAPER LLaMA: Open and Efficient Foundation Language Models (Touvron 2023)
- Introducing LLaMA: A foundational, 65-billion-parameter large language model
- Paper explained
- #CODE Dalai - The simplest way to run LLaMA on your local machine
- #PAPER GPT-4 Technical Report (OpenAI 2023)
- #PAPER ChatGPT: Jack of all trades, master of none (Kocoń 2023)
- #PAPER #REVIEW Harnessing the Power of LLMs in Practice: A Survey on ChatGPT and Beyond (2023)
- #PAPER Sparks of Artificial General Intelligence: Early experiments with GPT-4 (Bubeck 2023)
- #PAPER Sequential Integrated Gradients: a simple but effective method for explaining language models (2023)
- #PAPER Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling (Biderman 2023)
- #PAPER Alpaca: A Strong, Replicable Instruction-Following Model (Taori 2023)
- LLaMA 7B model fine-tuned on 52K instruction-following demonstrations generated with the OpenAI API
- Performance qualitatively similar to OpenAI’s text-davinci-003, while being surprisingly small and easy/cheap to reproduce (<$600)
- Alpaca dataset
- https://the-decoder.com/stanfords-alpaca-shows-that-openai-may-have-a-problem/
- https://github.com/tatsu-lab/stanford_alpaca#fine-tuning
- How to finetune your own Alpaca 7B
- https://huggingface.co/mrm8488/Alpacoom
- #PAPER GSM-Symbolic: Understanding the Limitations of Mathematical Reasoning in Large Language Models (2024)
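Note on the ChatGPT entry above: the step where trainer rankings are turned into a reward model is commonly implemented with a pairwise ranking loss. The sketch below is a minimal illustration under that assumption, not the training code used by OpenAI. The function name `pairwise_ranking_loss` and the plain-float scores are hypothetical; in a real setup the scores would come from a neural reward model and the loss would be minimized by gradient descent.

```python
import math
from itertools import combinations


def pairwise_ranking_loss(scores_in_ranked_order):
    """Mean of -log(sigmoid(r_better - r_worse)) over all response pairs.

    `scores_in_ranked_order` holds hypothetical reward-model scores for
    several responses to one prompt, ordered from the trainer's
    best-ranked response to the worst-ranked one.
    """
    # combinations() keeps input order, so the first element of each
    # pair is always the response the trainer ranked higher.
    pairs = list(combinations(scores_in_ranked_order, 2))
    if not pairs:
        raise ValueError("need at least two ranked responses")

    total = 0.0
    for better, worse in pairs:
        diff = better - worse
        # -log(sigmoid(diff)) == log(1 + exp(-diff)), computed in a
        # numerically stable way (softplus of -diff).
        total += math.log1p(math.exp(-abs(diff))) + max(-diff, 0.0)
    return total / len(pairs)


# Scores that agree with the human ranking give a lower loss than
# scores that contradict it.
agree = pairwise_ranking_loss([2.0, 1.0, 0.0])
disagree = pairwise_ranking_loss([0.0, 1.0, 2.0])
print(agree < disagree)
```

Minimizing this loss pushes the reward model to score higher-ranked responses above lower-ranked ones; the resulting scores then serve as the reward signal during the PPO fine-tuning iterations.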
Causality and LLMs
See Causality
- #PAPER Tell me why! Explanations support learning relational and causal structure (2022)
- #PAPER Passive learning of active causal strategies in agents and language models (2023)
- #PAPER Understanding Causality with Large Language Models: Feasibility and Opportunities (2023)
- LLMs are not yet able to provide satisfactory answers for discovering new knowledge or for high-stakes decision-making tasks with high precision
- Current LLMs can answer causal questions by drawing on existing causal knowledge, acting like combined domain experts
- #PAPER Causal Reasoning and Large Language Models: Opening a New Frontier for Causality (2024)
- LLM-based methods establish new state-of-the-art accuracies on multiple causal benchmarks
- Metaculus Presents — Causal Inference and LLMs: A New Frontier - YouTube