
Neural machine translation

Deep learning approaches to machine translation


Neural machine translation (NMT) is an approach to machine translation that uses large neural networks to predict the likelihood of a sequence of words in the target language. Like statistical machine translation, neural machine translation is data-driven: it learns from examples of existing translations.

Neural networks

Neural networks use training data to learn a vector for every word, called a word embedding, that captures the word's relations to other words. Words with similar meanings cluster together, and a word with more than one meaning can appear in several clusters at once.

Cluster 1:

  • Bank
  • Lake
  • River
  • Stream
  • Terrain

Cluster 2:

  • Money
  • Finance
  • Credit
  • Bank
  • Banking

Neural networks use cluster information to disambiguate the meaning of input words and generate the most relevant translations.
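
The disambiguation step can be illustrated with a minimal sketch. It assumes made-up two-dimensional vectors and two hypothetical sense entries for "bank" (`bank_river`, `bank_money`). Real systems learn embeddings with hundreds of dimensions and do not store senses explicitly, so this shows only the idea of comparing a word's context against clusters.

```python
import math

# Hypothetical 2-D embeddings, invented for illustration only.
EMBEDDINGS = {
    "river": (0.9, 0.1),
    "lake": (0.8, 0.2),
    "money": (0.1, 0.9),
    "credit": (0.2, 0.8),
}

# Hypothetical sense vectors for the ambiguous word "bank".
BANK_SENSES = {
    "bank_river": (0.85, 0.15),
    "bank_money": (0.15, 0.85),
}


def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def disambiguate(context_words):
    """Pick the sense of "bank" closest to the average context vector."""
    vectors = [EMBEDDINGS[w] for w in context_words]
    mean = tuple(sum(dim) / len(vectors) for dim in zip(*vectors))
    return max(BANK_SENSES, key=lambda sense: cosine(mean, BANK_SENSES[sense]))
```

With these toy vectors, a context of "river" and "lake" selects the `bank_river` sense, while "money" and "credit" select `bank_money`.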

Sequence model

In general, neural machine translation can be seen as a sequence-to-sequence task: given an input sequence, the system predicts and generates an output sequence. The sequence model chooses the output words and their order by calculating the probability of candidate word sequences.
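
As a rough illustration of scoring word order by probability, the sketch below multiplies conditional probabilities word by word (the chain rule), working in log space. The `BIGRAM` table and its values are invented, and conditioning only on the previous word is a simplifying assumption. A neural translation model conditions on the whole source sentence and on all previously generated words.

```python
import math

# Hypothetical conditional probabilities P(next word | previous word).
BIGRAM = {
    ("<s>", "the"): 0.5,
    ("the", "cat"): 0.2,
    ("cat", "sleeps"): 0.3,
    ("the", "sleeps"): 0.01,
    ("sleeps", "cat"): 0.01,
}


def sequence_log_prob(words, floor=1e-6):
    """Sum of log P(word | previous word); unseen pairs get a small floor."""
    previous = "<s>"  # start-of-sentence marker
    total = 0.0
    for word in words:
        total += math.log(BIGRAM.get((previous, word), floor))
        previous = word
    return total


def best_order(candidates):
    """Return the candidate word sequence with the highest probability."""
    return max(candidates, key=sequence_log_prob)
```

Given the candidates "the cat sleeps" and "the sleeps cat", `best_order` prefers the first, because its conditional probabilities multiply to a larger value.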

Encoder/decoder framework

A neural machine translation architecture consists of an encoder and a decoder.

The encoder analyses the words of the input sequence and their relations. The result is a representation of the sentence called the context vector, which summarizes the entire input sequence in a single fixed-length vector.

The decoder takes that sequence representation and produces the translation.
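
The fixed-length property of the context vector can be sketched with a toy encoder. The word vectors and the update rule below (a `tanh` of the mixed previous state and current word) are assumptions chosen for brevity, not a trained model. The decoder is omitted, but it would take the returned vector as its starting point.

```python
import math

# Hypothetical 3-D word vectors, invented for illustration only.
VECTORS = {
    "the": (0.1, 0.0, 0.2),
    "cat": (0.7, 0.3, 0.1),
    "sleeps": (0.2, 0.9, 0.4),
    "soundly": (0.3, 0.5, 0.8),
}


def encode(words, size=3):
    """Fold a sentence of any length into one vector of `size` numbers."""
    state = [0.0] * size
    for word in words:
        vector = VECTORS[word]
        # Toy recurrent update: blend the previous state with the current word.
        state = [math.tanh(0.5 * s + x) for s, x in zip(state, vector)]
    return state
```

A two-word sentence and a four-word sentence both produce a vector of three numbers, which is why long sentences strain this representation. The result also depends on word order, since each step builds on the previous state.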

Attention mechanism

A single fixed-length vector is too limited to hold all the information in a long sentence.

A solution to this problem is to employ an attention mechanism. At each step of the translation, an attention mechanism focuses on the parts of the input sentence that are relevant to the word being generated, instead of treating the complete input sentence equally. The attention mechanism also learns the alignment between input words and output words.
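
One common form of this idea is dot-product attention, sketched below under simplifying assumptions. The `query` stands for the decoder's current state and `encoder_states` for one vector per input word. Both names and the example values are illustrative, and real models use learned projections and many more dimensions.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    highest = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, encoder_states):
    """Return attention weights and the weighted sum of encoder states."""
    scores = [
        sum(q * h for q, h in zip(query, state)) for state in encoder_states
    ]
    weights = softmax(scores)
    size = len(encoder_states[0])
    context = [
        sum(w * state[i] for w, state in zip(weights, encoder_states))
        for i in range(size)
    ]
    return weights, context
```

For example, with `encoder_states = [(1.0, 0.0), (0.0, 1.0), (0.5, 0.5)]` and `query = (4.0, 0.0)`, the first input position receives the largest weight. The weights can be read as a soft alignment between the output word being generated and the input words.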


Licensed under CC-BY-SA-4.0.
