To meet the growing demand for translation, post-editing of machine translation output (PEMT) is being increasingly adopted as a mainstream alternative working method. The compelling reason behind this trend is the widely reported increase in productivity compared to human translation together with a comparable and sometimes higher quality level. The skills required for post-editing are different from those needed for the editing of author-written texts and different from those required for translation. This workshop aims to familiarize attendees with post-editing methods by analysing the typical mistakes of both neural and statistical machine translation (MT). It also provides some insight into why certain errors occur in raw MT output through a presentation of the historical development of the technology. It will conclude with a discussion of when PEMT should and should not be used and how raw MT output can be improved through preparatory steps.
PART 1 After looking at various standard industry guidelines for light and full post-editing, half the attendees will translate short texts from various languages into English or vice versa and the other half will full-post-edit machine-translated versions of the same texts. The two groups will then come together in pairs according to language combination to compare the results, along with the speaker, both in terms of productivity increase and overall quality.
PART 2 In the second part of the Lab, all the attendees will receive a machine-translated text to post-edit from Italian or Spanish into English, or vice versa. While they are doing so, they will also be asked to use any knowledge they may have of how machine translation works to attempt a preliminary categorization of the errors they find. The speaker will then present an analysis of the errors in the raw outputs, as well as other typical errors which occur, in order to provide practical tips for post-editors. Most of the error types analysed are language-independent and attendees who do not normally work with Spanish or Italian but are familiar with a Neo-Latin language are still likely to find the practical exercise useful.
Comparing human translation with post-edited machine translation shows that certain turns of phrase, expressions and choices of terms occur more frequently in the latter than in the former. This implies that post-edited texts are, on average, poorer in the variety and inventiveness typical of human translation, and any attempt to eliminate what are to all intents and purposes machine translation markers would require further post-editing effort and cancel out most of the time savings and economic benefits. Of course, variety and inventiveness are not always desirable characteristics in a translation. However, there are many types of text in which homogenization and uniformity would make the translation less interesting to read and less intellectually stimulating. In these cases, failure to eliminate these markers may in the long run lead to the lexical impoverishment of the target language.
This presentation illustrates the risks associated with the indiscriminate use of post-edited machine translation, so that the language service provider (LSP) is in a position to judge when it is appropriate to use it.
Comparison shows that certain turns of phrase, expressions and choices of words occur with greater frequency in post-edited machine translation output than they do in human translation. This implies that post-edited texts, on average, lack the variety and inventiveness of human translation, and any attempt to eliminate what are effectively machine translation markers would require additional post-editing effort and nullify most, if not all, of the time and cost-saving advantages. Of course variety and inventiveness are not always desirable features. Nevertheless, there are various kinds of text where homogenization and uniformity would make the translation less interesting to read and less intellectually stimulating. In such cases, failure to eradicate these markers may eventually lead to lexical impoverishment of the target language.
This talk will illustrate the risks involved in using post-edited machine translation output indiscriminately and put the translator in a position to explain when its use might be detrimental.
Raw Output Evaluator is a freeware tool, which runs under Microsoft Windows. It allows quality evaluators to compare and manually assess raw outputs from different machine translation engines. The outputs may be assessed in comparison to each other and to other translations of the same input source text, and in absolute terms using standard industry metrics or ones designed specifically by the evaluators themselves. The errors found may be highlighted using various colours. Thanks to a built-in stopwatch, the same program can also be used as a simple post-editing tool in order to compare the time required to post-edit MT output with how long it takes to produce an unaided human translation of the same input text. The MT outputs may be imported into the tool in a variety of formats, or pasted in from the PC Clipboard. The project files created by the tool may also be exported and re-imported in several file formats. Raw Output Evaluator was developed for use during a postgraduate course module on machine translation and post-editing.
Or rather, an example of why a post-editor is needed, in a single title
Programme
A brief history of machine translation
How things work: machine translation
(or machine translation for dummies)
Post-editing guidelines
Post-editing versus human translation: a contest
The stink of machine translation
Identifying and classifying machine translation errors
Techniques for improving the quality of raw output
Two practical experiments from English into Italian are planned, to be carried out on your own laptop or tablet, which participants are invited to bring with them (with power cables if necessary). No special software is required other than an ordinary word processor, but at least a fair knowledge of English is needed.
* The main title is the raw output of a well-known neural machine translation engine, without post-editing.
The author has conducted an experiment for two consecutive years with postgraduate university students in which half do an unaided human translation (HT) and the other half post-edit machine translation output (PEMT). Comparison of the texts produced shows, rather unsurprisingly, that post-editors faced with an acceptable solution tend not to edit it, even when, as often happens, more than 60% of the translators tackling the same text prefer a range of other solutions. As a consequence, certain turns of phrase, expressions and choices of words occur with greater frequency in PEMT than in HT, making it theoretically possible to design tests to tell them apart. To verify this, the author successfully carried out one such test on a small group of professional translators. This implies that PEMT may lack the variety and inventiveness of HT, and consequently may not actually reach the same standard. It is evident that the additional post-editing effort required to eliminate what are effectively MT markers is likely to nullify a great deal, if not all, of the time and cost-saving advantages of PEMT. However, the author argues that failure to eradicate these markers may eventually lead to lexical impoverishment of the target language.
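As a rough illustration of how a test to tell PEMT from HT might be designed, the sketch below counts how often a set of marker phrases occurs per 100 words and flags texts above a cut-off. It is not the author's test: the marker phrases, the threshold and the function names are all hypothetical, and a real list of markers would have to be derived from a comparison of actual PEMT and HT texts in the target language.

```python
import re

# Hypothetical marker phrases, for illustration only. A real list would come
# from comparing post-edited and human translations of the same source texts.
MARKERS = ["in order to", "it is important to", "as a result of"]


def marker_rate(text, markers=MARKERS):
    """Return the number of marker occurrences per 100 words of text."""
    words = re.findall(r"\w+", text.lower())
    if not words:
        return 0.0
    # Rejoin the words so punctuation and line breaks cannot split a phrase.
    normalised = " ".join(words)
    hits = sum(
        len(re.findall(r"\b" + re.escape(marker) + r"\b", normalised))
        for marker in markers
    )
    return 100.0 * hits / len(words)


def looks_post_edited(text, threshold=2.0):
    """Flag a text whose marker rate exceeds an assumed, arbitrary threshold."""
    return marker_rate(text) > threshold
```

A text with a high density of markers would be flagged as probably post-edited. In practice the threshold would need to be calibrated on texts whose origin is known, and a single text may well be too short to classify reliably.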
Listen to Lisa Agostini’s interview with me about my presentation. Thanks to MET and Julian Mayers (Yada Yada) for their permission to post the interview here.
Post-editors are asked to do either light post-editing, to get rid of the worst machine translation errors, or full post-editing, to bring the output up to the same standard as human translation.
But is full post-editing in reality a pipe dream?
The speaker has conducted an experiment for two years running with groups of postgraduate university students in which half do an unaided human translation and the other half post-edit machine translation output. Comparison of the texts produced shows that certain turns of phrase, expressions and choices of words occur with greater frequency in the post-edited machine translation output than they do in human translation. There are two reasons for this. First, even neural machine translation systems seem to choose the statistically most frequent solution, even when that solution occurs less frequently than all the other possible solutions put together. Second, post-editors faced with an acceptable solution tend not to edit it. This, however, implies that post-edited machine translation output, on average, lacks the variety and inventiveness of human translation, and therefore does not in fact reach the same standard. It is evident that the additional post-editing effort required to eliminate what are effectively machine translation markers would nullify most, if not all, of the time and cost-saving advantages of post-edited machine translation. On the other hand, failure to eradicate these markers may eventually lead to lexical and syntactic impoverishment of the target language.
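The frequency argument above can be made concrete with a small numerical sketch. All the figures are invented for illustration and are not taken from the experiment: four hypothetical renderings of one phrase, the most popular of which is chosen by only 40% of human translators, and an assumed 10% chance that a post-editor replaces the machine's proposal. The sketch assumes the engine always proposes the most frequent rendering and uses Shannon entropy as a crude stand-in for variety.

```python
import math

# Hypothetical shares of human translators choosing each rendering of a phrase.
HUMAN_SHARES = {
    "rendering A": 0.40,
    "rendering B": 0.25,
    "rendering C": 0.20,
    "rendering D": 0.15,
}


def mt_choice(shares):
    """Return the most frequent rendering, as the engine is assumed to do."""
    return max(shares, key=shares.get)


def post_edited_shares(shares, edit_rate):
    """Return the shares after post-editing.

    Assumes the engine always proposes the top rendering and that post-editors
    replace it with probability edit_rate, choosing among the remaining
    renderings in proportion to human preferences.
    """
    top = mt_choice(shares)
    rest = 1.0 - shares[top]
    result = {}
    for name, share in shares.items():
        if name == top:
            result[name] = 1.0 - edit_rate
        else:
            result[name] = edit_rate * share / rest
    return result


def entropy(shares):
    """Shannon entropy in bits, used here as a rough measure of variety."""
    return -sum(p * math.log2(p) for p in shares.values() if p > 0)


pemt = post_edited_shares(HUMAN_SHARES, edit_rate=0.10)
print("Engine proposes:", mt_choice(HUMAN_SHARES))
print("Variety, human translation (bits):", round(entropy(HUMAN_SHARES), 2))
print("Variety, post-edited output (bits):", round(entropy(pemt), 2))
```

Under these assumptions the rendering preferred by a minority of human translators ends up in the large majority of post-edited texts, and the entropy of the post-edited distribution is lower than that of the human one. The size of the effect depends entirely on the assumed edit rate.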
The speaker provides examples of post-editing and translation from English into Italian. However, with the aid of some back-translations, the mechanisms at play should be equally clear to non-Italian speakers, particularly if they are familiar with other Neo-Latin languages.
Engaging copy translated literally into English, without taking account of differences in linguistic, semantic and cultural expressions, at best leaves much to be desired and at worst provokes hysterical laughter.
Thanks to my scientific background, I specialize in technical translations. Over the years I have acquired experience in transcreating advertising copy and press releases primarily for the promotion of technology products.