Large Language Model (LLM)
Also known as: LLM, Foundation model, Frontier model
A neural network trained on massive text corpora to predict the next token, used for chat, coding, reasoning and as the brain inside AI agents.
A large language model (LLM) is a neural network, almost always a Transformer, trained on terabytes of text and code to predict the next token in a sequence. After pre-training, models are typically fine-tuned on supervised instruction data and then with reinforcement learning from human feedback (RLHF) so that they follow instructions, refuse harmful requests, and behave like an assistant.
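To make "predict the next token" concrete, here is a minimal sketch of the final step of a forward pass: turning raw scores (logits) into probabilities with a softmax and picking the most likely token. The four-word vocabulary and the logit values are made up for illustration; a real model has a vocabulary of tens of thousands of tokens or more and computes its logits with a Transformer.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical toy vocabulary and made-up logits for the prompt
# "The cat sat on the". A real model would compute these scores.
vocab = ["cat", "dog", "mat", "."]
logits = [1.0, 0.5, 3.0, -1.0]

probs = softmax(logits)

# Greedy decoding: pick the single most probable next token.
next_token = vocab[probs.index(max(probs))]
print(next_token)
```

Greedy selection is only one decoding strategy. In practice, models often sample from the distribution instead, which is why the same prompt can produce different replies.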
Modern LLMs range from a few billion parameters to hundreds of billions, and some are reported to exceed a trillion. Capabilities broadly improve with scale: small models handle autocomplete-style tasks, mid-size models can hold a conversation, and the largest frontier models (GPT-5, Claude Opus 4, Gemini 2.5 Pro) can reason across long documents, use tools, and act as the planning layer inside an AI agent.
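Hosted LLMs are usually priced per million tokens. A token is roughly 0.75 of an English word, so 1,000 tokens is about 750 words. Input tokens (your prompt) are typically 4–8× cheaper than output tokens (the model's reply), though the exact ratio depends on the provider and model.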
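The pricing arithmetic can be sketched in a few lines. The function names and the example prices below are hypothetical, chosen only to show the calculation; real prices vary by provider and model, and the 0.75 words-per-token figure is a rule of thumb for English text, not an exact conversion.

```python
import math

def estimate_tokens(word_count: int, words_per_token: float = 0.75) -> int:
    """Rough token estimate from an English word count (rule of thumb)."""
    return math.ceil(word_count / words_per_token)

def estimate_cost_usd(input_tokens: int, output_tokens: int,
                      input_price_per_m: float, output_price_per_m: float) -> float:
    """Cost of one request, given prices in USD per million tokens."""
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000

# Hypothetical example: a 1,500-word prompt and a 500-token reply,
# at assumed prices of $2 (input) and $10 (output) per million tokens.
prompt_tokens = estimate_tokens(1500)
cost = estimate_cost_usd(prompt_tokens, 500,
                         input_price_per_m=2.0, output_price_per_m=10.0)
print(prompt_tokens, round(cost, 4))
```

For an exact count, use the tokenizer that ships with the specific model rather than a word-based estimate, since token counts differ between tokenizers and languages.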
See also on SoftPerceptron
Related terms
- AI Agent
An LLM-based system that can plan, use tools and take multi-step actions toward a goal — not just answer a single prompt.
- Tokens
The atomic units that LLMs read and write — sub-word pieces produced by a tokenizer. Pricing and context limits are measured in tokens, not words.
- Context Window
The maximum number of tokens an LLM can read in a single request, including the prompt, retrieved documents and the model's own reply.
- Transformer
The neural-network architecture (Vaswani et al., 2017) that powers virtually every modern LLM, based on self-attention instead of recurrence.
- Fine-Tuning
Continuing to train a pre-trained model on your own data to specialise its behaviour, tone or domain knowledge.
- Multimodal AI
AI models that natively accept and/or produce more than one modality — text, image, audio, video — in a single model.
More to explore
Other wiki entries that touch on Large Language Model (LLM).
- Perceptron
The first algorithmically trained neural network, invented by Frank Rosenblatt in 1957, and the foundational unit that sparked the deep-learning revolution.
- Model Context Protocol (MCP)
An open standard from Anthropic for connecting LLMs to external tools and data sources through a uniform server interface.
- Soft-Perceptron
The conceptual bridge between Rosenblatt's hard binary perceptron and modern, differentiable, probabilistic neural networks — the soft engine that powers today's AI.