This research introduces RTP-LX, a dataset for evaluating how well language models handle toxicity across 28 languages. Built through participatory design, the corpus targets culturally specific toxic content that automated detectors often miss, supporting safer deployment of multilingual models worldwide.
The work matters because it addresses the growing need for models that operate safely across languages and cultures. It establishes a foundation for future research, helping ensure that as these technologies reach a global audience, they remain inclusive and safe.