Artificial intelligence models from some of the biggest tech companies are falling short of European regulations in critical areas such as cybersecurity resilience and discriminatory output, according to a new report. As the European Union prepares to enforce its wide-reaching AI Act, these shortcomings could lead to significant fines and reputational damage.
The Push for AI Regulation
The EU has been discussing AI regulations for years, but the release of OpenAI’s ChatGPT in late 2022 intensified public debate over the potential risks of artificial intelligence. As a result, lawmakers quickly moved to draft specific rules governing “general-purpose” AI models (GPAI), like ChatGPT and others.
With the AI Act set to take effect in stages over the next two years, European officials are closely monitoring compliance. A new tool developed by Swiss startup LatticeFlow AI, alongside research partners ETH Zurich and Bulgaria’s INSAIT, is helping test AI models against these upcoming regulations.
LatticeFlow’s AI Compliance Testing
LatticeFlow’s framework assigns scores to AI models based on dozens of categories, such as technical robustness and safety, with scores ranging from 0 to 1. Recent tests evaluated AI models from companies like Meta (META), OpenAI, and Alibaba, with most models scoring 0.75 or higher.
Despite these relatively high scores, LatticeFlow’s “Large Language Model (LLM) Checker” identified several areas where models still struggle. For example, OpenAI’s “GPT-3.5 Turbo” received a low score of 0.46 in discriminatory output, an issue that reflects biases in gender, race, and other areas. Meta’s “Llama 2 13B Chat” also scored low (0.42) in preventing cyberattacks like “prompt hijacking.”
Regulatory Consequences
Under the AI Act, companies found non-compliant could face fines of up to 35 million euros ($38 million) or 7% of their global annual revenue. The LLM Checker provides an early look at where companies need to improve to avoid these penalties.
While the AI Act is still being finalized, experts are crafting a code of practice that will govern the use of generative AI models, with formal rules expected by spring 2025. LatticeFlow’s CEO, Petar Tsankov, emphasized that the test results offer valuable insights and a roadmap for companies to fine-tune their models.
“The EU is still working out all the compliance benchmarks, but we can already see some gaps in the models,” Tsankov said.
A Path to Compliance
Although the European Commission cannot verify external tools, it has acknowledged LatticeFlow’s work as an essential step in implementing the AI Act. This tool provides developers with a way to test their models’ compliance and prepare for the upcoming regulatory changes.
With compliance standards tightening, AI developers will need to focus on improving their models to align with the EU’s technical requirements.