UnbIAs - Google Translate Gender Bias Mitigation and Addition of Genderless Translation

Authors

V. Schenkel, B. Mello, S. J. Rigo, G. de Oliveira Ramos

DOI:

https://doi.org/10.22456/2175-2745.139902

Keywords:

Language Models, Gender Bias, Google Translate, Constrained Beam Search, Non-binary Gender

Abstract

Machine learning, increasingly present in everyday life, is subject to bias. These biases not only reflect social inequalities but also reinforce them. The present study seeks to mitigate gender bias in Google Translate, the most widely used translation system in the world. To this end, we created a translation model with high gender accuracy, performed a linguistic analysis with the spaCy tool, and identified entities with RoBERTa. The Constrained Beam Search technique is used to preserve the sentence structure produced by the commercial model while replacing gendered words with the correct gender indicated by the created model. The final sentence results from an alignment performed with the SimAlign tool. In addition, the present study provides an algorithm so that English sentences without gender indication receive translations inflected for feminine, masculine, and neutral gender. Our approach yields a BLEU score of 48.39. Relative to Google Translate, the model increased gender accuracy from 68.75 to 70.09, improved by 15.7% the score that measures the difference in accuracy between male and female entities, and improved stereotyped translations by 43%.
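The mitigation step described above relies on lexically constrained decoding. As a minimal sketch (not the authors' implementation: the vocabulary, token scores, and `grid_beam_search` helper below are invented for illustration), a grid beam search in the spirit of Hokamp & Liu keeps one beam per number of satisfied constraints, so a hypothesis carrying the required gender-inflected word is never pruned away by higher-scoring unconstrained alternatives:

```python
def grid_beam_search(step_scores, constraints, beam_size=2, max_len=3):
    """Toy grid beam search: one beam per count of satisfied lexical
    constraints; only hypotheses meeting every constraint may be returned."""
    beams = {0: [([], 0.0)]}  # constraints met -> [(tokens, log-prob)]
    for _ in range(max_len):
        grid = {}
        for hyps in beams.values():
            for tokens, score in hyps:
                for tok, logp in step_scores(tokens).items():
                    new = tokens + [tok]
                    met = sum(c in new for c in constraints)
                    grid.setdefault(met, []).append((new, score + logp))
        # prune each constraint-count bucket separately
        beams = {met: sorted(hyps, key=lambda h: h[1], reverse=True)[:beam_size]
                 for met, hyps in grid.items()}
    # accept only hypotheses that satisfied every constraint
    return max(beams[len(constraints)], key=lambda h: h[1])[0]

# Hypothetical log-probabilities standing in for a real NMT decoder;
# suppose the gender-accurate model requires the feminine form "médica".
VOCAB = {"ela": -0.7, "ele": -0.5, "é": -0.3, "médica": -1.2, "médico": -0.4}

def toy_scores(prefix):
    return {tok: lp for tok, lp in VOCAB.items() if tok not in prefix}

print(grid_beam_search(toy_scores, constraints=["médica"]))
```

In the actual system, the constraint set would come from the high-gender-accuracy model and the scores from the commercial translator; the abstract's SimAlign step then aligns the constrained output back to Google Translate's sentence.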



Published

2024-09-04

How to Cite

Schenkel, V., Mello, B., Rigo, S. J., & de Oliveira Ramos, G. (2024). UnbIAs - Google Translate Gender Bias Mitigation and Addition of Genderless Translation. Revista de Informática Teórica e Aplicada, 31(2), 74–90. https://doi.org/10.22456/2175-2745.139902

Issue

Section

Regular Papers
