Definition
A vision transformer (ViT) is a transformer designed for computer vision. A ViT splits an input image into a sequence of patches, flattens each patch into a vector, and projects that vector to the model's embedding dimension (typically no larger than the flattened patch) with a single matrix multiplication. The resulting patch embeddings are then processed by a transformer encoder as if they were token embeddings.
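The patch-to-embedding step described above can be sketched in a few lines of NumPy. This is a minimal illustration, not a reference implementation: the function names `patchify` and `embed_patches`, the 224×224 RGB input, the patch size of 16, and the embedding width of 192 are all assumptions chosen for the example, and the weights are random rather than learned.

```python
import numpy as np


def patchify(image, patch_size):
    """Split an (H, W, C) image into flattened, non-overlapping patches.

    Returns an array of shape (num_patches, patch_size * patch_size * C).
    """
    h, w, c = image.shape
    assert h % patch_size == 0 and w % patch_size == 0, "image must tile evenly"
    gh, gw = h // patch_size, w // patch_size  # patch grid height and width
    # (gh, p, gw, p, C) -> (gh, gw, p, p, C) -> (gh * gw, p * p * C)
    patches = image.reshape(gh, patch_size, gw, patch_size, c)
    patches = patches.transpose(0, 2, 1, 3, 4)
    return patches.reshape(gh * gw, patch_size * patch_size * c)


def embed_patches(patches, weight, bias):
    """Project each flattened patch with one matrix multiplication."""
    return patches @ weight + bias


rng = np.random.default_rng(0)
image = rng.random((224, 224, 3))   # hypothetical RGB input
patch_size, embed_dim = 16, 192     # illustrative sizes, not from the source

patches = patchify(image, patch_size)
# Random stand-ins for what would be learned projection parameters.
weight = rng.normal(scale=0.02, size=(patches.shape[1], embed_dim))
bias = np.zeros(embed_dim)

tokens = embed_patches(patches, weight, bias)
print(patches.shape, tokens.shape)  # (196, 768) (196, 192)
```

Each of the 196 rows of `tokens` plays the role of one token embedding passed to the encoder. The sketch omits position information and any class token that practical ViTs typically add before the encoder.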