
Intro to AI 2

Question:

Transformer - Architecture

Author: Christian N



Answer:

Transformers are the basic architecture used in NLP (chatbots, translators).
• Unlike LSTMs, they do not process the input sequentially, which allows a high degree of parallelization.
• They need positional encoding to specify the position of each word (see the first sketch after this list).
• They use multiple attention layers to keep track of important information across the sentence.
• The output sentence is produced word by word.
• The network's output is a probability distribution over all words in the dictionary, used to predict the next word in the sentence.
• The process stops only when <EOS> is predicted (see the decoding sketch after this list).
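
To make the positional-encoding point concrete, here is a minimal sketch of the sinusoidal encoding scheme from the original Transformer paper; the function name and dimensions are illustrative, not taken from the answer above.

```python
import numpy as np

def positional_encoding(seq_len: int, d_model: int) -> np.ndarray:
    """Return a (seq_len, d_model) matrix of sinusoidal position encodings."""
    positions = np.arange(seq_len)[:, np.newaxis]       # (seq_len, 1)
    dims = np.arange(d_model)[np.newaxis, :]             # (1, d_model)
    angle_rates = 1.0 / np.power(10000.0, (2 * (dims // 2)) / d_model)
    angles = positions * angle_rates                      # (seq_len, d_model)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles[:, 0::2])   # even dimensions use sine
    pe[:, 1::2] = np.cos(angles[:, 1::2])   # odd dimensions use cosine
    return pe

# The encoding is added to the token embeddings so the otherwise
# order-agnostic attention layers can distinguish word positions:
# inputs = token_embeddings + positional_encoding(len(tokens), d_model)
```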
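The word-by-word generation and the <EOS> stopping rule can be sketched as a greedy decoding loop. This assumes a hypothetical `model(tokens)` callable that returns a probability distribution over the whole vocabulary for the next token; it is not a specific library API.

```python
from typing import Callable, List
import numpy as np

def greedy_decode(model: Callable[[List[int]], np.ndarray],
                  bos_id: int, eos_id: int, max_len: int = 50) -> List[int]:
    """Generate token ids one at a time until <EOS> or max_len is reached."""
    tokens = [bos_id]
    for _ in range(max_len):
        probs = model(tokens)             # distribution over all words in the dictionary
        next_id = int(np.argmax(probs))   # greedy choice: most likely next word
        tokens.append(next_id)
        if next_id == eos_id:             # generation stops when <EOS> is predicted
            break
    return tokens
```

In practice, beam search or sampling is often used instead of the greedy argmax shown here, but the stopping condition on <EOS> is the same.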


