The advent of NMT dramatically improved translation quality, and a large body of work has grown around it; findings from NMT research have also clearly radiated out to other tasks. Studying NMT is therefore not necessarily about building a great translation system; often the goal is to draw out ideas that inspire other work. Despite its notable success, NMT still faces many fundamental problems. The papers discussed here examine these problems mainly through empirical analysis (extensive experiments plus some theoretical interpretation, so the understanding may be somewhat biased).

#### Long Sentences

For very long sentences, the quality of the NMT system is dramatically lower, since it produces translations that are too short.


A 2014 paper from Bengio's group, *Overcoming the Curse of Sentence Length for Neural Machine Translation using Automatic Segmentation*, attributes the effect of sentence length on NMT to the following causes:

Training on long sentences is difficult because few available training corpora include sufficiently many long sentences, and because the computational overhead of each update iteration in training is linearly correlated with the length of training sentences.

Additionally, by the nature of encoding a variable-length sentence into a fixed-size vector representation, the neural network may fail to encode all the important details.
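The fixed-size bottleneck can be illustrated with a toy numpy RNN encoder (all sizes and weights below are hypothetical, randomly initialized values, not a real NMT model): whatever the input length, the whole sentence is squeezed into one vector of the same dimension.

```python
import numpy as np

rng = np.random.default_rng(0)

HIDDEN = 8   # fixed encoder state size (toy value)
VOCAB = 50   # toy vocabulary size

# Hypothetical toy RNN encoder parameters.
E = rng.normal(scale=0.1, size=(VOCAB, HIDDEN))   # token embeddings
W = rng.normal(scale=0.1, size=(HIDDEN, HIDDEN))  # recurrent weights
U = rng.normal(scale=0.1, size=(HIDDEN, HIDDEN))  # input weights

def encode(token_ids):
    """Compress a sentence of any length into one HIDDEN-dim vector."""
    h = np.zeros(HIDDEN)
    for t in token_ids:
        h = np.tanh(h @ W + E[t] @ U)
    return h

short = encode([3, 7, 11])       # 3-token sentence
long_ = encode(list(range(40)))  # 40-token sentence

# Both sentences end up in the same fixed-size representation:
assert short.shape == long_.shape == (HIDDEN,)
```

The longer the sentence, the more information competes for the same few dimensions, which is one intuition behind the segmentation approach the paper proposes.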


#### Word Alignment

Attention can realize word alignment to some extent, but what exactly is the relationship between the two? See the figure below.

Note that the attention model may produce better word alignments through guided alignment training, where supervised word alignments (such as the ones produced by fast-align) are provided to model training.
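One common formulation of guided alignment training (the exact loss in any given toolkit may differ) adds a cross-entropy penalty between the model's attention matrix and the supervised alignment. A minimal numpy sketch, where the function name and toy values are my own:

```python
import numpy as np

def guided_alignment_loss(attention, alignment, eps=1e-9):
    """Cross-entropy between the model's attention matrix and a
    supervised 0/1 alignment matrix (e.g. from fast-align).

    attention: (target_len, source_len), each row sums to 1.
    alignment: (target_len, source_len), 1 where words are aligned.
    """
    # Normalize the reference so each target row is a distribution.
    ref = alignment / np.clip(alignment.sum(axis=1, keepdims=True), 1, None)
    return -np.mean(np.sum(ref * np.log(attention + eps), axis=1))

# Toy example: 3 target words attending over 3 source words.
attn = np.array([[0.8, 0.1, 0.1],
                 [0.1, 0.8, 0.1],
                 [0.1, 0.1, 0.8]])
align = np.eye(3)  # diagonal (monotone) reference alignment
loss = guided_alignment_loss(attn, align)  # ≈ -log(0.8) ≈ 0.223
```

During training this term is added, with some weight, to the usual translation loss, nudging attention toward the external alignments.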


#### Beam Search

The main cause of deteriorating quality is the shorter translations produced under wider beams.
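A standard mitigation (not from this note's sources, but widely used, e.g. the GNMT-style length penalty) is to length-normalize hypothesis scores so that longer candidates are not unfairly penalized for accumulating more negative log-probabilities. A hedged sketch; the constants and `alpha` are the commonly cited defaults, not values prescribed here:

```python
import numpy as np

def length_normalized_score(logprobs, alpha=0.6):
    """Sum of token log-probs divided by a GNMT-style length penalty."""
    lp = ((5 + len(logprobs)) ** alpha) / ((5 + 1) ** alpha)
    return sum(logprobs) / lp

# A short candidate with confident tokens vs. a longer, adequate one.
short_cand = [np.log(0.6)] * 3  # total log-prob ≈ -1.53
long_cand = [np.log(0.8)] * 8   # total log-prob ≈ -1.79

# The raw model score prefers the shorter (incomplete) translation...
assert sum(short_cand) > sum(long_cand)
# ...while the length-normalized score prefers the longer one.
assert length_normalized_score(long_cand) > length_normalized_score(short_cand)
```

Wider beams surface more of these short, high-probability hypotheses, which is why the bias becomes visible as beam size grows.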


#### Interpretable

"In machine learning, Explainable and Interpretable are not the same. Explainable ML refers to building a second model to explain a black-box model, whereas Interpretable ML refers to a model that is designed from the start with the ability to explain itself."