Patronus AI conjures up an LLM evaluation tool for regulated industries

Last March, two seasoned AI experts, both with prior experience researching responsible AI at Meta, joined forces to found Patronus AI, a startup building a solution for evaluating and testing large language models, with a particular focus on regulated industries that have little room for error.

Rebecca Qian, the company’s CTO, had previously spearheaded responsible NLP research at Meta AI, while her co-founder and CEO, Anand Kannappan, had played a key role in developing explainable ML frameworks at Meta Reality Labs. Today, their startup is making significant strides by emerging from stealth mode, making their product widely available, and unveiling a $3 million seed funding round.

Patronus AI finds itself in the right place at the right time, offering a managed service in the form of a security and analysis framework for evaluating large language models. Its goal is to pinpoint potential problem areas, particularly the risk of hallucinations, where a model generates erroneous answers because it lacks the data to respond accurately.

“In our product, we aim to fully automate and scale the evaluation process for models and alert users when we detect issues,” Qian explained.

This process involves three key steps. First, they provide users with real-world scenario scores, assessing models against critical criteria such as hallucination probability. Next, the product generates test cases automatically, including adversarial test suites, and subjects the models to stress tests. Finally, it benchmarks models using various criteria, tailored to specific requirements, to determine the most suitable model for a given task. “We compare different models to help users identify the best model for their specific use case. For example, one model might have a higher failure rate and more hallucinations compared to another base model,” she added.
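The three-step flow described above can be sketched in miniature. The code below is a hypothetical illustration, not Patronus AI's actual product or API: the model names, test cases, and the exact-match "hallucination" check are all invented for demonstration, and a real evaluation framework would use far richer scoring criteria.

```python
# Hypothetical sketch of an LLM evaluation loop: run test cases against
# each model, compute a failure rate, and pick the best-scoring model.
# All names and data here are illustrative, not Patronus AI's API.
from dataclasses import dataclass


@dataclass
class EvalResult:
    model: str
    failures: int = 0
    total: int = 0

    @property
    def failure_rate(self) -> float:
        """Fraction of test cases the model got wrong (0.0 if none run)."""
        return self.failures / self.total if self.total else 0.0


def run_suite(model_name, test_cases, answer_fn):
    """Run a (possibly adversarial) test suite against one model.

    `answer_fn` stands in for a call to the model under test; a crude
    exact-match comparison flags wrong answers as failures.
    """
    result = EvalResult(model=model_name)
    for prompt, expected in test_cases:
        result.total += 1
        if answer_fn(prompt) != expected:
            result.failures += 1
    return result


def pick_best(results):
    """Benchmark models against each other: lowest failure rate wins."""
    return min(results, key=lambda r: r.failure_rate)
```

In practice, one suite of prompts would be run against several candidate models and the resulting failure rates compared, mirroring the base-model comparison Qian describes.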

Patronus AI primarily focuses on highly regulated industries where incorrect responses can have severe consequences. “We assist companies in ensuring the safety of the large language models they employ. We detect instances where these models produce sensitive business information and inappropriate outputs,” Kannappan explained.

The startup’s ultimate objective is to serve as a trusted third party for model evaluations. “While it’s easy for someone to claim that their large language model is the best, an unbiased, independent perspective is necessary. That’s where we come in. Patronus is the symbol of credibility,” he emphasized.

Currently, the company boasts six full-time employees, with plans to expand its workforce in the coming months to keep pace with the rapidly evolving landscape. Qian emphasized that diversity is a fundamental pillar of their company culture. “Diversity is a core value for us, starting at the leadership level at Patronus. As we grow, we’re committed to implementing programs and initiatives to ensure an inclusive workplace,” she noted.

Today’s $3 million seed funding round was led by Lightspeed Venture Partners, with participation from Factorial Capital and other industry experts.
