Reinforcement learning from human feedback

Definition

In machine learning, reinforcement learning from human feedback (RLHF) is a technique to align an intelligent agent with human preferences. It involves training a reward model to represent preferences, which can then be used to train other models through reinforcement learning.
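
As a rough illustration of the reward-model step described above, here is a minimal sketch that fits a reward function to pairwise preferences. It makes several assumptions that are not part of the definition: the reward model is a plain linear function of hand-made feature vectors (real systems use a neural network), preferences are modeled with a Bradley–Terry style logistic loss, and the names `reward`, `train_reward_model`, and the toy `pairs` data are hypothetical. The reinforcement learning stage that would consume the learned reward is left out.

```python
import math


def reward(w, x):
    """Linear reward model r(x) = w . x (assumed stand-in for a neural network)."""
    return sum(wi * xi for wi, xi in zip(w, x))


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_reward_model(pairs, dim, lr=0.1, epochs=200):
    """Fit w by gradient descent on -log sigmoid(r(chosen) - r(rejected))."""
    w = [0.0] * dim
    for _ in range(epochs):
        for chosen, rejected in pairs:
            margin = reward(w, chosen) - reward(w, rejected)
            # Gradient of the loss w.r.t. w is
            # -(1 - sigmoid(margin)) * (chosen - rejected), so we step the other way.
            scale = 1.0 - sigmoid(margin)
            for i in range(dim):
                w[i] += lr * scale * (chosen[i] - rejected[i])
    return w


# Toy preference data (made up): each response is a feature vector
# (helpfulness, rudeness); the first item of each pair is the one a
# human labeler preferred.
pairs = [
    ([0.9, 0.1], [0.2, 0.8]),
    ([0.7, 0.0], [0.6, 0.9]),
    ([0.8, 0.3], [0.1, 0.4]),
]

w = train_reward_model(pairs, dim=2)
for chosen, rejected in pairs:
    print(reward(w, chosen) > reward(w, rejected))
```

In a full pipeline, a reward function learned this way would score a policy's outputs during reinforcement learning, for example with a policy gradient method such as proximal policy optimization.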

Related concepts

15.ai; AAAI Conference on Artificial Intelligence; AI agent; AI alignment; AI anthropomorphism; AI boom; AI bubble; AI data center; AI effect; AI literacy; AI nationalism; AI safety; AI slop; AI takeover; AI veganism; AI winter; Action selection; Activation function; Active learning (machine learning); Actor-critic algorithm; Adam optimizer; Adobe Firefly; Adversarial machine learning; Agent2Agent; Aidan Gomez; Alan Turing; AlexNet; Alex Graves (computer scientist); Alex Krizhevsky; Algorithmic bias; Allen Newell; AlphaFold; AlphaGo; AlphaZero; Andrej Karpathy; Andrew Ng; Anomaly detection; Anthropic; Applications of artificial intelligence; Apprenticeship learning; Artificial Intelligence Act; Artificial Intelligence Cold War; Artificial general intelligence; Artificial human companion; Artificial intelligence; Artificial intelligence and elections; Artificial intelligence arms race; Artificial intelligence in architecture; Artificial intelligence in education; Artificial intelligence in fiction; Artificial intelligence in healthcare; Artificial intelligence in mental health; Artificial intelligence in video games; Artificial intelligence visual art; Artificial neural network; Artificial superintelligence; Ashish Vaswani; Association rule learning; Atari; Attention (machine learning); Aurora (text-to-image model); AutoGPT; Autoencoder; Automated machine learning; Automated reasoning; Automated theorem proving; Autoregressive; Autoregressive model
BERT (language model); BIRCH; BLOOM (language model); Backpropagation; Batch learning; Batch normalization; Bayesian network; Bernard Widrow; Bias–variance tradeoff; Boltzmann machine; Boosting (machine learning); Bootstrap aggregating; Bradley–Terry model; Bradley–Terry–Luce
CURE algorithm; Canonical correlation; Change of variables; ChatGPT; Chatbot psychosis; Chinchilla (language model); Christopher D. Manning; Claude (language model); Claude Shannon; Cliff Shaw; Cluster analysis; Coefficient of determination; Competition in artificial intelligence; Computational learning theory; Computer vision; Conditional random field; Conference on Neural Information Processing Systems; Confidence bound; Confusion matrix; Conjugate gradient method; Constitutional AI; Convergent series; Conversational agents; Convolution; Convolutional neural network; Cross-entropy; Crowdsourcing; Curriculum learning
DALL-E; DBRX; DBSCAN; Daniel Kokotajlo (researcher); Data augmentation; Data cleaning; Data mining; David Silver (computer scientist); Decision tree learning; DeepDream; DeepMind; DeepSeek (chatbot); Deep learning; Deep learning speech synthesis; Demis Hassabis; Density estimation; Differentiable neural computer; Diffusion model; Diffusion process; Dimensionality reduction; Discrete choice; Double descent; Dream Machine (text-to-video model)
ECML PKDD; Echo state network; Efficiency (statistics); Electrochemical RAM; ElevenLabs; Elo rating system; Empirical risk minimization; Ensemble learning; Environmental impact of artificial intelligence; Ethics of artificial intelligence; Expectation–maximization algorithm; Expected value; Explainable artificial intelligence; Exploration (reinforcement learning)
Facial recognition system; Factor analysis; Feature engineering; Feature learning; Feedback; Feedforward neural network; Fei-Fei Li; Fine-tuning (deep learning); Flux (text-to-image model); Frank Rosenblatt; François Chollet; Fuzzy clustering
GPT Image; Game the system; Gated recurrent unit; Gating mechanism; Gemini (chatbot); Gemini (language model); Gemma (language model); Generalization (learning); Generative AI; Generative adversarial network; Generative engine optimization; Generative model; Generative pre-trained transformer; Genie (world model); Geoffrey Hinton; GloVe; Glossary of artificial intelligence; Google; Gradient descent; Grammar induction; Graph neural network; Graphical model; Grok (chatbot)
Hallucination (artificial intelligence); Handwriting recognition; Herbert A. Simon; Hidden Markov model; Hierarchical clustering; Highway network; History of artificial intelligence; Huawei PanGu; Human-in-the-loop; Human image synthesis; Humanity's Last Exam; Hyperparameter (machine learning)
IBM Granite; IBM Watson; IBM Watsonx; Ian Goodfellow; Ideogram (text-to-image model); Ilya Sutskever; Imagen (text-to-image model); Imitation learning; Independent component analysis; InstructGPT; Intelligent agent; International Conference on Learning Representations; International Conference on Machine Learning; International Joint Conference on Artificial Intelligence; Interpretability (machine learning); Isolation forest
James Goodnight; Jan Leike; John Hopfield; John McCarthy (computer scientist); John Schulman; John von Neumann; Joseph Weizenbaum; Journal of Machine Learning Research; Jürgen Schmidhuber
K-means clustering; K-nearest neighbors algorithm; KL divergence; Kernel machines; Kling AI; Kullback–Leibler divergence; Kunihiko Fukushima
LaMDA; Labeled data; Lagrange multiplier; Language model; Large language model; Latent diffusion model; LeNet; Learning curve (machine learning); Learning to rank; Lethal autonomous weapon; Linear discriminant analysis; Linear model; Linear regression; List of artificial intelligence companies; List of artificial intelligence projects; List of datasets for machine-learning research; List of datasets in computer vision and image processing; Llama (language model); Local outlier factor; Logistic regression; Long short-term memory; Loss function; Loss functions for classification; Lotfi A. Zadeh
Machine Learning (journal); Machine learning; Mamba (deep learning architecture); Markov property; Marvin Minsky; Mathematical optimization; Maximum likelihood estimation; Maximum likelihood estimator; Mean shift; Mechanistic interpretability; Memoryless; Memtransistor; Meta-learning (computer science); Midjourney; MiniMax (company); Mode collapse; Model Context Protocol; MuZero; Multi-agent reinforcement learning; Multilayer perceptron; Multimodal learning; Music and artificial intelligence; Mustafa Suleyman
Naive Bayes classifier; Nathaniel Rochester (computer scientist); Natural language processing; Neural Turing machine; Neural field; Neural machine translation; Neural network (machine learning); Neural radiance field; Neuro-symbolic AI; Neuromorphic engineering; Noam Shazeer; Non-negative matrix factorization; Normalization (machine learning)
OPTICS algorithm; Oasis (Minecraft clone); Occam learning; Oliver Selfridge; Online machine learning; Ontology learning; OpenAI; OpenAI Five; Optical character recognition; Optimization algorithm; Oriol Vinyals; Outline of machine learning; Overfit; Overfitting
PaLM; Pairwise comparison (psychology); Parameter; Partition function (statistical mechanics); Paul Werbos; Perceptron; Physics-informed neural networks; Plackett–Luce model; Policy gradient method; Precautionary principle; Preference; Principal component analysis; Probably approximately correct learning; Project Debater; Prompt engineering; Proper generalized decomposition; Prospect theory; Proximal policy optimization
Q-learning; Quantum machine learning; Quasi-Newton method; Quoc V. Le; Qwen
Random forest; Random sample consensus; Reasoning model; Receiver operating characteristic; Recraft; Rectifier (neural networks); Recurrent neural network; Recursive self-improvement; Reflection (artificial intelligence); Regression analysis; Regret (decision theory); Regularization (mathematics); Regulation of artificial intelligence; Regulation of artificial intelligence in the United States; Reinforcement learning; Relevance vector machine; Representative sample; Reservoir computing; Residual neural network; Restricted Boltzmann machine; Retrieval-augmented generation; Reward-based selection; Riffusion; Robot control; Robotics simulator; Robust optimization; Rule-based machine learning; Runway (company)
Sampling (statistics); Score (game); Seedance 2.0; Self-driving car; Self-organizing map; Self-play (reinforcement learning technique); Self-supervised learning; Semantic analysis (machine learning); Semi-supervised learning; Seppo Linnainmaa; Seq2seq; Seymour Papert; Shun'ichi Amari; Sigmoid function; Softmax function; Sora (text-to-video model); Sparrow (chatbot); Sparse dictionary learning; Speech recognition; Spiking neural network; Stable Diffusion; State–action–reward–state–action; Statistical classification; Statistical distance; Statistical entropy; Statistical learning theory; Statistical mechanics; Stephen Grossberg; Stochastic gradient descent; String (computer science); Structured prediction; Suno (platform); Supervised learning; Support vector machine; Symbolic artificial intelligence
T-distributed stochastic neighbor embedding; T5 (language model); Takeo Kanade; Temporal difference learning; Text-to-image model; Text-to-video model; Text summarization; Timeline of artificial intelligence; Topological deep learning; Training, validation, and test data sets; Transformer (deep learning); Transformer (deep learning architecture)
U-Net; Udio; Uncanny valley; Uncertainty; Unsupervised learning
Vapnik–Chervonenkis theory; Variational autoencoder; Veo (text-to-video model); Vibe coding; Video game bot; Virtual politician; Vision transformer
Walter Pitts; Warren Sturgis McCulloch; WaveNet; Weak artificial intelligence; Weight initialization; Whisper (speech recognition system); Word2vec; Word embedding; Workplace impact of artificial intelligence; World model (artificial intelligence)
Xiaomi MiMo
Yann LeCun; Yoshua Bengio
