While the current trend is to build ever-larger models aimed at handling more and more language processing tasks, we obtained very good results in our work by going in the opposite direction: scaling things down and focusing on smaller amounts of more relevant, higher-quality data. As a result, we were able to show that our system improved on aspects such as lexical diversity and literalness compared with publicly available systems trained on far more data.
More importantly, we noted that its output was much closer to the human reference, not only in simple lexical choices, but also in reproducing more abstract strategies (omission of specific types of information, heavier syntactic reorganization, etc.) that are in line with the reference and that could be seen as indicative of adaptation to translator style.
If anything, this shows that we need to consider how to make MT less constraining and more inspiring for translators. User reception studies have been receiving more and more attention, but there is still some way to go, and we hope to contribute to that debate as a continuation of this project, as well as to the numerous ethical questions raised by the mere possibility of literary MT.
This article is part of a series that takes a deeper look at the research presented at the 2022 NeTTT conference. You can find the rest here:
MT gender bias is more than a technical problem: An interview with Beatrice Savoldi.