How Marco Baglioni Built and Scaled Aqrate Using AI to Transform Translation Quality and Productivity

In this “AI in Action” interview, Marco Baglioni, Founder and CEO of Aqrate, breaks down how a real AI business was built around evaluation rather than generation. You will learn how his team identified the right use case for AI, validated demand through real customers, and designed AI as a support system—offering practical takeaways you can apply when building or scaling your own AI product.


1. Can you briefly introduce yourself and your AI business?

I’m Marco Baglioni, Founder and CEO of Aqrate. We build AI-powered solutions for the localization and translation industry.

As machine translation adoption accelerates globally, the challenge is no longer fluency—it’s accuracy and trust. Modern LLM-based engines produce translations that sound natural even when they’re subtly wrong, making errors harder and more time-consuming to detect.

That’s why we developed LanguageCheck.ai: a quality-evaluation platform designed to assess both machine and human translations. Instead of giving a generic score, we provide translators and language teams with actionable, line-by-line insights, highlighting exactly where and why a translation needs correction.
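To make the idea of "actionable, line-by-line insights" concrete, here is a minimal sketch of how such a segment-level report could be modeled. This is an illustration only, not LanguageCheck's actual data model; the field names, categories, and severity labels are hypothetical:

```python
from dataclasses import dataclass, field

@dataclass
class Finding:
    """One actionable issue on a single translated segment (hypothetical schema)."""
    segment: int       # line/segment index in the file
    category: str      # e.g. "omission", "terminology", "mistranslation"
    severity: str      # e.g. "minor", "major", "critical"
    explanation: str   # why this segment needs correction

@dataclass
class EvaluationReport:
    """A line-by-line report rather than a single aggregate score."""
    findings: list[Finding] = field(default_factory=list)

    def segments_needing_review(self) -> list[int]:
        # Translators can jump straight to the flagged lines.
        return sorted({f.segment for f in self.findings})

report = EvaluationReport([
    Finding(3, "omission", "major",
            "Source clause 'unless otherwise agreed' is missing."),
    Finding(7, "terminology", "minor",
            "'contract' should follow the client glossary: 'agreement'."),
])
print(report.segments_needing_review())  # -> [3, 7]
```

The point of the structure is the one made in the interview: the unit of output is an explained finding tied to a specific line, not a generic score.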

Our customers—translators, translation companies, and enterprises—use LanguageCheck to focus effort where it matters, reduce review time, and increase productivity without compromising quality.

2. What was the original insight that made you believe this AI business was worth building?

The initial insight came from observing how LLMs actually perform best. While most attention was on content generation, we noticed—early on—that LLMs are often more reliable when evaluating and comparing existing content than when creating it from scratch.

About three years ago, we began experimenting with AI-based translation quality evaluation. In 2024, we applied this internally to help customers clean their translation memories—large repositories of legacy translations that directly affect future output quality.

When we shared detailed evaluation reports with the vendors responsible for revising those translations, something interesting happened: they started asking if they could use the same system for their day-to-day translation work.

That was the moment we knew this wasn’t just an internal capability—it was a real productivity lever for the entire industry. Instead of abstract quality scores, we focused on explanations and guidance that translators could immediately act on. That shift made the opportunity very real.


3. How did you validate demand for your AI product before fully committing to it?

Validation happened through real usage, not assumptions.

Over a three-year period, we deployed early versions of the technology across multiple customer projects. This allowed us to observe how the tool performed under real constraints—different language pairs, domains, and quality levels.

In parallel, we involved professional linguists and translators at every iteration. After each major update, we gathered structured feedback on accuracy, usefulness, and trust.

Demand wasn’t theoretical: customers kept asking for broader access, more control, and standalone availability. That consistent pull from real users gave us confidence to invest fully and productize the solution.

4. At what point did your AI product move from experimentation to a real business?

There were two clear phases.

The first phase began in 2023, when the tool was primarily used internally and with select customers to improve translation memory quality. This validated the core technology but wasn’t yet a product.

The second phase started in 2024, when we made a deliberate decision to build a dedicated solution for translators and language service providers. We released a beta version of LanguageCheck.ai in May 2025 and officially launched in October 2025, after several months of paying customers and repeat usage.

That transition marked a mindset shift—from experimentation to accountability. Our priorities moved toward reliability, usability, support, and scalability, not just technical performance.

5. What were the biggest technical challenges you faced early on, and how did you overcome them?

The main challenges were hallucinations, false positives, inconsistent accuracy, and even language-output issues.

Solving them wasn’t about a single breakthrough—it required a deep understanding of how LLMs behave in evaluation tasks. We invested heavily in both theoretical study and empirical testing, analyzing failure cases as carefully as successes.

The key was finding the right balance: maximizing meaningful insights while keeping false positives at an acceptable level. That balance is never static, but our structured approach allowed us to continuously improve reliability without over-engineering. 

6. How did you decide what to build in-house versus what to rely on from existing AI platforms or models?

From the beginning, we recognized that LLM ecosystems were evolving extremely fast—sometimes every few months.

Rather than locking ourselves into a rigid in-house model stack, we focused on owning the evaluation logic, workflows, and user experience, while leveraging best-in-class LLMs through APIs. The real limitation wasn't anything we could fix by training our own models; it was the current state of the underlying technology itself.
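One common way to own the evaluation logic while treating the model as a swappable component is a thin provider interface. The sketch below is an assumption about how such a separation might look, not Aqrate's code; `ChatModel`, `EchoModel`, and the prompt wording are all invented for illustration:

```python
from typing import Protocol

class ChatModel(Protocol):
    """Minimal interface any LLM provider adapter must satisfy."""
    def complete(self, prompt: str) -> str: ...

class EchoModel:
    """Stand-in model for testing; a real adapter would call a vendor API."""
    def complete(self, prompt: str) -> str:
        return "OK" if "translation" in prompt else "ISSUE"

def evaluate_segment(model: ChatModel, source: str, target: str) -> str:
    # The prompt and workflow are the parts the product owns;
    # the model behind the interface can be replaced as the ecosystem evolves.
    prompt = (
        "Judge whether this translation preserves the source meaning.\n"
        f"Source: {source}\nTarget: {target}"
    )
    return model.complete(prompt)

print(evaluate_segment(EchoModel(), "Hello", "Bonjour"))  # -> OK
```

The design choice mirrors the interview's point: when a better model ships, only the adapter changes, while the evaluation logic, prompts, and UX stay intact.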

This approach gave us speed, flexibility, and scalability—while allowing us to adapt as models improve without rebuilding our core platform.

7. What was your initial go-to-market strategy, and how has it evolved since launch?


Our initial go-to-market strategy focused on enterprise customers looking to improve the quality of their existing translation assets.

However, as we collaborated with their vendors—translation companies and freelancers—we saw unexpected interest from the supply side. Translators immediately recognized how the tool could reduce review time and increase throughput.

That insight led us to pivot: we expanded our focus to directly serve translators and language service providers, tailoring the product to their workflows and productivity needs. That shift significantly accelerated adoption.

8. What has been the hardest part of turning AI capability into a clear business value proposition?

In our case, explaining the value was relatively straightforward because the outcome is tangible.

LanguageCheck doesn’t ask customers to “trust the AI blindly.” It shows them where issues exist, why they matter, and what to fix. The ability of AI to understand meaning across languages is immediately visible in the output.

By framing the tool as a decision-support system, not an autonomous judge, we made the value clear and built trust quickly with non-technical users.

9. How did you approach pricing your AI product in the early stages?

We chose simplicity and familiarity.

The translation industry already operates on a per-word pricing standard, so we aligned LanguageCheck’s pricing to the same unit of measure. This removed friction, avoided long explanations, and made ROI easy for customers to calculate.
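A toy calculation shows why aligning to per-word pricing makes ROI easy to reason about. The rates and the saved-effort fraction below are invented for illustration and are not LanguageCheck's actual prices:

```python
def review_roi(words: int, reviewer_rate_per_word: float,
               tool_rate_per_word: float, review_effort_saved: float) -> float:
    """Net saving when a QA tool lets reviewers skip a fraction of full review.

    All inputs share the industry's unit of measure: the word.
    """
    baseline = words * reviewer_rate_per_word               # full human review
    with_tool = (words * tool_rate_per_word                 # tool cost
                 + baseline * (1 - review_effort_saved))    # remaining review
    return baseline - with_tool

# 100k words, $0.02/word human review, $0.004/word tool, 40% effort saved:
print(review_roi(100_000, 0.02, 0.004, 0.40))  # -> 400.0
```

Because every term is priced per word, a customer can plug in their own volumes and rates and see the break-even point without any explanation of the pricing model.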

Early on, this clarity proved far more valuable than experimenting with complex pricing models.

10. What key metrics do you track to measure whether your AI business is truly growing?

We focus on metrics that reflect real usage and value, not vanity numbers.

We track the number of active users, user segments, processed word volume, and language combinations. These metrics help us understand adoption depth, workflow integration, and where the product delivers the most value.

Viewing these indicators from multiple angles allows us to validate strategy, detect shifts in behavior, and adjust priorities quickly.

11. How did you build your team around an AI-centric product?

One of our advantages was already having strong internal teams in both software engineering and linguistics.

This combination was critical during development and testing—it allowed rapid iteration with domain expertise built in. As we look ahead, our focus is on selectively expanding the team to strengthen scalability, product development, and customer support, without diluting quality or culture.

12. What mistakes did you make while building or scaling your AI business that taught you the biggest lessons?

Rather than major strategic mistakes, our challenges were mostly around prompt design and system behavior under edge cases.

Thanks to our long experience in the translation market and a strong research mindset, we avoided many common pitfalls. Interestingly, some issues we encountered during testing were later documented in academic papers months afterward, which validated our cautious approach.

The biggest lesson was to treat AI as a probabilistic system—not deterministic software—and design processes accordingly.
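Treating the model as probabilistic suggests defensive process patterns, such as sampling an evaluation several times and keeping only findings the model reports consistently. This sketch is one such pattern under stated assumptions; the runs are stubbed rather than produced by a real model, and the issue labels are hypothetical:

```python
from collections import Counter

def stable_findings(runs: list[set[str]], min_votes: int) -> set[str]:
    """Keep only issues flagged in at least `min_votes` independent runs,
    damping one-off spurious findings from a non-deterministic model."""
    votes = Counter(issue for run in runs for issue in run)
    return {issue for issue, n in votes.items() if n >= min_votes}

# Three evaluation passes over the same segment (stubbed outputs):
runs = [{"omission@3", "terminology@7"},
        {"omission@3"},
        {"omission@3", "style@9"}]
print(stable_findings(runs, min_votes=2))  # -> {'omission@3'}
```

The same mindset applies beyond voting: thresholds, retries, and human review gates are all process-level answers to a system that will never be deterministic.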

13. How do you balance rapid innovation with reliability and trust in your AI product?

Absolute accuracy and consistency in AI systems are unrealistic today. The real challenge is defining a performance threshold where users gain net positive value.

Initially, we were cautious about investing heavily in AI-dependent products that could be overtaken by model evolution. Over time, as our understanding deepened, decision-making became clearer—even while accepting that not all variables can be controlled.

Trust comes from transparency, continuous improvement, and setting correct expectations—not from promising perfection.

14. What does scaling an AI business look like in practice beyond just adding more users?

Scaling starts with infrastructure.

At Aqrate, we already had experience building highly scalable platforms. As early as 2015, we were among the few companies running cloud-native architectures on AWS and GCP.

For LanguageCheck, we followed the same philosophy: cloud-first, API-driven, and modular. While some advocated for on-premises AI solutions, we chose APIs because they offer the fastest path to scale with minimal capital risk.

That decision allowed us to grow without large upfront investments, even during periods when AI platforms still had processing limits.


15. What advice would you give to founders who want to build an AI business today but are unsure where to start?

Start with the problem, not the technology.

AI today is powerful but imperfect—it’s not fully reliable, not consistent, and still makes mistakes. The key is finding use cases where those imperfections don’t negate the value, and where AI still creates a meaningful advantage despite them.

Understand your market deeply, know your users’ real constraints, and design AI as a support system—not a replacement for judgment. That’s where sustainable businesses are built.
