Current evidence of post-editese: differences between post-edited neural machine translation output and human translation revealed through human evaluation
Abstract
The experiment reported in this paper is a follow-up to one conducted in 2017/2018. The new experiment aimed to establish whether the previously observed lexical impoverishment in machine translation post-editing (MTPE) has become more marked as technology has developed or whether it has attenuated. This was done by focusing on two n-grams that had previously been identified as MT markers, i.e., n-grams that give rise to translation solutions that occur with a higher frequency in MTPE than is natural in human translation (HT). The new findings suggest that lexical impoverishment in the two short texts examined has indeed diminished with DeepL Translator. The new experiment also considered possible syntactic differences, namely the number of text segments in the target text. However, no significant difference was observed. The participants were asked to complete a short questionnaire on how they went about their tasks. It emerged that it was helpful to consult the source language text while post-editing, and the original unedited raw output while self-revising, suggesting that monolingual MTPE of the two chosen texts would have been unwise.
Although the post-editors were not given specific guidelines, their productivity increased. If the ISO 18587:2017 recommendation of using as much of the MT output as possible had been strictly followed, the MTPE would have been easier to distinguish from HT. If this can be taken to be generally true, it suggests that it is neither necessary nor advisable to follow this recommendation when lexical diversity is crucial for making the translation more engaging.
Published in
International Conference on Human-Informed Translation and Interpreting Technology (HiT-IT 2023): proceedings. Naples, Italy, 7-9 July 2023.