Special Issue - Generative tools using large language models and translation

Newsletter

Thousands of articles and think-pieces on generative machine learning tools such as ChatGPT and GPT-4 have appeared in recent months. A very unscientific perusal indicates that many are enthusiastic – these tools are undeniably impressive – but in the more critical pieces, themes of work displacement, alignment, and data reuse are prominent. In this short article, I want to briefly address these topics, in turn, from the perspective of a researcher in translation studies.

In 2017, Frey and Osborne predicted that 47% of jobs in the United States would be at risk due to ‘computerisation.’ More recent work, such as that by Felten et al. (2023), talks of ‘exposure’ to generative artificial intelligence (AI), with desk-based, cognitive work coming near the top of the list. However, translation has been exposed to machine learning tools trained on large amounts of human-created data for some time in the form of neural machine translation (NMT), and the effects have been mixed. Automation is not all-or-nothing: we see portions of translation work being replaced, portions augmented, and some tasks decomposed into small chunks to fit with what machines can be trusted to do. The rule of thumb is that the level of automation should relate to the shelf-life and value of the content, along with the level of risk in case of mistranslation (Guerberof-Arenas & Moorkens, 2023; Way, 2013). At times, a focus on cutting (labour) costs has led to the overuse of NMT, resulting in end users being exposed to risk, or in unhappy workers being underpaid to tidy up poor-quality NMT output. A lesson that should be learned for other occupations that will be affected by AI is that we need sustainable work systems, in which intrinsically motivating work maintains the satisfaction and interest of workers.

At the time of writing, many companies are scrambling to find uses for generative tools to add to their offerings. Discussion of aligning these tools with human needs is necessarily retrospective, as they were not built to directly address human needs, but rather because developers could build them, following a trajectory of supervised and unsupervised machine learning to maximise specific performance indicators (such as translation of decomposed sentences, text summarisation, or semantic evaluation). This means that we are gradually finding out what they are capable of, such as solving complex tasks in mathematics, coding, or vision, while occasionally failing at basic arithmetic or miscounting words (Bubeck et al., 2023). Research so far indicates that translation quality from generative tools is competitive with NMT for well-resourced languages (Hendy et al., 2023), with promising consideration of context (Castilho et al., 2023) and automatic translation evaluation (Kocmi & Federmann, 2023). We are likely to see these tools replace NMT for some use cases. NMT’s problems of hallucinations (inexplicable incorrect output) and bias (gender or racial) affect generative tools too. A worry is that hype, and too little concern for ethics and sustainability, will lead to the use of AI tools in inappropriate circumstances. Literacy about how these tools work and the data that drives them will be increasingly important. The problem remains that the internal workings of machine learning systems are opaque, meaning that we can’t interrogate the choices and decisions those systems make.

The Pirate Publisher. From Puck Magazine in 1886 around the time of the Berne Convention. Illustration by Joseph Ferdinand Keppler, Restoration by Adam Cuerden.

Joss Moorkens, a personal account

Joss Moorkens is an Associate Professor at the School of Applied Language and Intercultural Studies in Dublin City University (DCU), leader of the Transforming Digital Content group at the ADAPT Centre, and a member of DCU’s Institute of Ethics and Centre for Translation and Textual Studies. He has authored over 50 articles, chapters, and papers on translation technology, evaluation, and ethics. He is General Co-editor of the journal Translation Spaces and co-author of Translation Tools and Technologies (Routledge, 2023). He jointly leads the Technology working group as a board member of the European Masters in Translation network.

Finally, generative tools are based on huge amounts of data, much of it scraped from the internet. Web data has been widely used for NMT training for many years, but the hype around generative tools has stirred up new interest in training data, particularly from authors and artists who are unhappy with their text and images being used. The reuse of data for MT or for text generation is difficult to reverse engineer, as texts are broken down into words and subword chunks in NMT training, making the output difficult to recognise. However, recent work by Chang et al. (2023) found that ChatGPT and GPT-4 had been trained on copyrighted material from books. While web scraping is generally considered acceptable by developers, many publishers are likely to try to block the use of their materials as training data. The automatic generation of text is likely to cause further data issues. How much new web content will be automatically generated? Differentiating useful from useless data will be difficult; it has been a problem for MT developers scraping multilingual data from the web for some time. Can we avoid the ouroboros effect of feeding AI tools their own output?

These are not the only challenges in integrating generative tools into our lives and work. But these challenges have been part of the translation industry since at least 2016, and addressing them has been painful at times. Portions of the industry have been hollowed out, leading to claims of a ‘talent crunch’ in subtitling, for example, where pay rates have dropped and many talented workers are leaving the industry (Deryagin et al., 2021). It would be disappointing (if not entirely surprising) to see the same mistakes made in other fields. As we discover more about what generative tools can do, and as their capabilities improve, there should be great opportunities for their ethical use. To worry about unethical uses or their potential to widen the growing digital divides is not to ignore current and future capabilities. To quote Meadows (2008, pp. 169–170), the hope, ultimately, is that we might discover how the system’s “properties and our values can work together to bring forth something much better than could ever be produced by our will alone”.

Joss Moorkens
Associate Professor, Dublin City University, Ireland
joss.moorkens@dcu.ie
Twitter @jossmo
Mastodon @joss@mastodon.ie

References

Bubeck, S., Chandrasekaran, V., Eldan, R., Gehrke, J., Horvitz, E., Kamar, E., Lee, P., Lee, Y. T., Li, Y., Lundberg, S., Nori, H., Palangi, H., Ribeiro, M. T., & Zhang, Y. (2023). Sparks of Artificial General Intelligence: Early experiments with GPT-4. arXiv. https://doi.org/10.48550/arXiv.2303.12712

Castilho, S., Mallon, C., Meister, R., & Yue, S. (2023, June). Do online machine translation systems care for context? What about a GPT model? 24th Annual Conference of the European Association for Machine Translation (EAMT 2023), Tampere, Finland. https://doras.dcu.ie/28297/

Chang, K. K., Cramer, M., Soni, S., & Bamman, D. (2023). Speak, Memory: An Archaeology of Books Known to ChatGPT/GPT-4. arXiv. https://doi.org/10.48550/arXiv.2305.00118

Deryagin, M., Pošta, M., & Landes, D. (2021). Machine Translation Manifesto. Audiovisual Translators Europe. https://avteurope.eu/avte-machine-translation-manifesto/

Felten, E. W., Raj, M., & Seamans, R. (2023). Occupational Heterogeneity in Exposure to Generative AI. SSRN Electronic Journal. https://doi.org/10.2139/ssrn.4414065

Frey, C. B., & Osborne, M. A. (2017). The future of employment: How susceptible are jobs to computerisation? Technological Forecasting and Social Change, 114, 254–280. https://doi.org/10.1016/j.techfore.2016.08.019

Guerberof-Arenas, A., & Moorkens, J. (2023). Ethics and Machine Translation: The End User Perspective. In Towards Responsible Machine Translation: Ethical and Legal Considerations in Machine Translation (Vol. 4). Springer.

Hendy, A., Abdelrehim, M., Sharaf, A., Raunak, V., Gabr, M., Matsushita, H., Kim, Y. J., Afify, M., & Awadalla, H. H. (2023). How Good Are GPT Models at Machine Translation? A Comprehensive Evaluation. arXiv. https://doi.org/10.48550/ARXIV.2302.09210

Kocmi, T., & Federmann, C. (2023). Large Language Models Are State-of-the-Art Evaluators of Translation Quality. arXiv. https://doi.org/10.48550/ARXIV.2302.14520

Meadows, D. (2008). Thinking in Systems (D. Wright, Ed.). Chelsea Green Publishing.

Way, A. (2013). Traditional and Emerging Use-Cases for Machine Translation. Proceedings of Translating and the Computer 35.