As Risks Multiply, Should We Redefine Quality Management?

Act 2: Moving Pictures

It’s been three years with GenAI in our lives.

You may think we understand what AI can or cannot do (at least those in the Global North think they do). But think again. There are entire regions of the world that are playing catch-up to the AI-first reality. And there are always newcomers to our industry who need guidance to set them on the right path.

Ultimately, progress will look different to every team and professional. It won’t be linear, and it may come at a cost. It may require us to redefine ourselves. In our quest onward, no one should be left behind, right?

In this section, read the articles that explore the human side of the (r)evolution unfolding around us.


Kateřina Gašová

Global Quality Solutions Strategist at Argos Multilingual

With almost 30 years in the language industry, Kateřina Gašová drives the development of language quality programs for Argos’ enterprise clients. She also owns the company’s overall language quality management strategy and helps introduce new language-related solutions.

What a time to be alive, no? This may come across as a very pro-AI question on the surface. 

We’re well advanced into the AI era, and it’s undeniable that tech is a fertile ground for discussions, controversy, fundamental questions of ethics, and reflections on the nature of language work (and the future of our jobs). 

Then there’s an explosion of content triggered by AI — good, bad, and every shade in between. And it’s not just text. Audio, video, immersive experiences and, yes, text now all coexist and seamlessly integrate across a multitude of surfaces and platforms, and are increasingly accessible to global users. Content is multimodal. Use cases number in the dozens.

And as AI-powered companies scale their content and user interactions, the risks multiply. More touchpoints, more complexity, more potential for something to go wrong.

In this world of “more is more,” what’s new in the realm of quality management?

The case for a risk-focused approach to quality management

Quality is (still) a hot topic — and a highly subjective one. Everyone has a definition, few agree, and the stakes keep rising. That’s why a shift is underway: from measuring perfection to managing risk. Companies are running pilots comparing MT and LLM output, asking just how hands-off they can afford to be — and whether C-level ambitions align with technological reality.

Over the past two years, localization programs have redefined what’s possible. The industry has coined new terms (agentic AI, AI-enabled LQA, automated quality scores) and developed the practice of prompt engineering to keep pace and build guardrails as we go. Quality pipelines are becoming more intricate and dynamic, and in many cases, more dependent on the machine. In parallel, we are trying to reimagine humans’ place in an increasingly automated process.

This is the most telling development: not what has been added, but what is being reconsidered. Some AI-first companies are now walking back their bets. The limits of automation have been tested. We’ve been busily deploying agentic solutions, and end users are not necessarily happy. Hallucinations remain a feature, and quality still doesn’t tick all the boxes (fluency, accuracy, and so on). It turns out automation cannot eliminate risk — it redistributes it.

This about-face gives one pause. Have these companies truly mapped the risks? Or did they unknowingly trade cost savings for the wrong kind of exposure?

In any case, these highly publicized examples of companies course-correcting are forcing us to refocus on two essential questions: What is the content, and what is it for? It’s not enough to tell an LLM that this is marketing content; we need to be much more precise. Is it an ad, a tagline, landing page copy? Each will come with vastly different requirements that the AI will need to take into account.
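To make that precision concrete, here is a minimal sketch in Python of how a content-type brief might be attached to a request before it reaches a model. The content types, fields, and limits below are invented for illustration, not a prescribed schema.

```python
# Hypothetical illustration only: the content types, fields, and limits
# below are invented to show what a precise brief could contain.

CONTENT_BRIEFS = {
    "ad": {
        "tone": "punchy and persuasive",
        "length_limit": "90 characters",
        "risk_notes": "no unapproved product claims or superlatives",
    },
    "tagline": {
        "tone": "memorable, in brand voice",
        "length_limit": "8 words",
        "risk_notes": "must survive back-translation; check trademark conflicts",
    },
    "landing page": {
        "tone": "informative and scannable",
        "length_limit": "300 words",
        "risk_notes": "terminology must match the product UI",
    },
}

def build_prompt(content_type: str, source_text: str, target_lang: str) -> str:
    """Turn a vague request into a precise, risk-aware instruction."""
    brief = CONTENT_BRIEFS[content_type]
    return (
        f"Translate the following {content_type} into {target_lang}.\n"
        f"Tone: {brief['tone']}. Length limit: {brief['length_limit']}.\n"
        f"Risk constraints: {brief['risk_notes']}.\n\n"
        f"Source: {source_text}"
    )

print(build_prompt("tagline", "Power your ideas.", "German"))
```

The point is not the code itself but the discipline: every content type carries its own tone, length, and risk constraints, and the model only knows them if we spell them out.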

In other words, when trusting the machine to do the work, are humans’ expectations aligned with what the content is intended for? Are the associated risks understood and reflected in the quality management framework? Is the tool producing trustworthy results?

Risk-free localization: tech-powered utopia or human-led orchestra?

The shift to a risk-based approach to quality management is not necessarily a novel concept, but rather a maturation of existing practices. You need to measure risks from the outset, especially if you have agentic workflows (where the agents are supposed to help you assess, manage, or mitigate those risks). You need to be precise and detailed when instructing the agent on what to do.

With this latest wave of AI implementation, instead of chasing validation through numbers on a dashboard, teams should be asking, “What could go wrong?” 

The mapping of possible risks begins at the source. Organizations must assess the data that feeds their tools, the failures they want to avoid, and how best to prompt or fine-tune models to produce safer outputs. One way to approach this is by blending different disciplines, such as pairing a terminologist with a prompt engineer to shape inputs and expectations. Not every team has the available talent or budget for this, but the principle holds: Before teams can instruct AI on what they want, they must first understand what they cannot afford to get wrong.

With large language models, generating content on virtually any topic has become trivial. At first glance, much of it will look fine. However, the question remains, “Is the answer correct?” The model did its job in replying to you (it is tuned to answer your prompt), but the crucial question is whether you can trust the answer. From a quality standpoint, the primary concern is the trustworthiness of the information the user receives from the system. Understanding this helps to pinpoint high-risk content types and surfaces.

A risk-based quality management strategy becomes all but inevitable when considering the broader context of AI governance. Legislation such as the EU AI Act is now setting clear expectations for when and how to use AI. In some industries, airtight quality isn’t just a goal — it’s a compliance requirement.

Quality management needs a reset. At its core, it is about controlling risks: identifying content use cases, understanding their requirements, aligning them with the specificities of the selected languages, and developing acceptance criteria for what passes as “quality” content. This clarity will help narrow AI’s focus, improve its output, and allow teams to deploy the most expensive asset — humans — where it matters. When applied well, humans not only review the risky parts of the process but also verify what is assumed to be safe. In the world of AI, quality management becomes an end-to-end process.
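As a thought experiment, risk-aware routing through such a framework might look like the sketch below, written in Python. The use cases, risk tiers, and review steps are illustrative assumptions rather than a standard.

```python
# Hypothetical sketch: the use cases, risk tiers, and review steps are
# illustrative assumptions, not a prescribed framework.

RISK_TIERS = {
    "legal disclaimer": "high",   # direct compliance exposure
    "ui string": "medium",        # user-facing, but bounded context
    "internal faq": "low",        # limited blast radius
}

def route_for_review(use_case: str) -> list[str]:
    """Decide which quality steps a piece of content passes through."""
    tier = RISK_TIERS.get(use_case, "high")  # unknown content defaults to high risk
    if tier == "high":
        return ["automated checks", "full human review", "sign-off"]
    if tier == "medium":
        return ["automated checks", "human spot-check"]
    # Even "safe" content gets sampled verification, so assumptions stay tested.
    return ["automated checks", "sampled human verification"]

for use_case in RISK_TIERS:
    print(f"{use_case} -> {route_for_review(use_case)}")
```

Note that even the lowest tier keeps a sampled human check, reflecting the principle that what is assumed to be safe still needs verifying.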

Multiple paths to quality results, all with a human somewhere in the loop

We have every incentive to pivot to a risk-based approach, and not just because it is mandated by legislation. In our industry, we see localization teams trying to align with stakeholder expectations, but these expectations also need to be managed effectively. 

The results of pilots are compelling evidence: Working backward from the results and clearly defining risks helps make the case that a human is still needed somewhere in the loop. In fact, I’d argue we need a broader spectrum of human skills so people can help with what goes into the AI models, and then ensure that the outputs are properly validated and verified for integrity. The public examples of companies backtracking on their AI-first experiments just show that the world is not yet ready to embrace AI-first — and that the technology needs to be managed to reach its potential. Localization is used to managing complexity, but now with AI, we’re also managing a wider variety of risks that were previously alien to localization.

Ultimately, there isn’t just one way to set up a quality framework (note the shift from talking about quality “programs”). Multiple pathways exist. Companies will increasingly resort to using AI for translation and editing, especially if they are pushed to produce something fast. But even then, the AI will need a quality style guide and terminology to produce decent results. They publish AI-made content, which is then also flagged as AI-generated, a practice that ties in nicely with the content labeling debate unfolding in parallel. In the meantime, critical touchpoints are revisited by humans. It’s often only after the fact that the content receives polish and the assets are expanded and cleaned so that the input data can be read well by AI in the next run. Risk is managed by feeding AI quality assets — and letting humans test and tune the system. For now, the humans retain their role in the loop.

Read the full 132-page Global Ambitions: (R)Evolution in Motion publication featuring vital perspectives from 31 industry leaders on the ongoing AI-spurred (r)evolution.
