WHY THIS MATTERS IN BRIEF
Third party AIs that act as go-betweens between your organisation and other AIs around the world could be a hacker’s dream come true.
A new concept is emerging in the Large Language Model (LLM) Ops workflow: Artificial Intelligence (AI) proxy middleware. And, as with everything that can be hacked and manipulated, it introduces yet another cyber threat into the AI models and environments that companies are becoming increasingly reliant on.
AI proxies are services that sit between an application and the model inference provider, such as OpenAI or Hugging Face. They consolidate important steps in the Generative AI developer workflow, including calling different models (LLaMA, GPT*, Mixtral) through a single API; monitoring usage, latency, and cost; caching and throttling inference requests; and managing API keys for inference providers.
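To make that concrete, the sketch below shows roughly what calling several models through one proxy might look like. It is illustrative only: it assumes a hypothetical OpenAI-compatible proxy at ai-proxy.internal that maps model names to the right provider and holds the real provider keys on the application's behalf.

```python
# Minimal sketch of calling different models through one AI proxy.
# The proxy endpoint and token below are hypothetical; the pattern assumes an
# OpenAI-compatible proxy that routes each model name to the right provider.
from openai import OpenAI

client = OpenAI(
    base_url="https://ai-proxy.internal/v1",  # the proxy, not the provider
    api_key="PROXY_TOKEN",                    # proxy credential, not an OpenAI key
)

for model in ("gpt-4o", "mixtral-8x7b", "llama-3-70b"):
    resp = client.chat.completions.create(
        model=model,  # the proxy maps this name to OpenAI, Mistral, Meta, etc.
        messages=[{"role": "user", "content": "Summarise our Q3 security report."}],
    )
    print(model, resp.choices[0].message.content[:80])
```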
The Future of AI and Cyber Security, a keynote by Matthew Griffin
AI proxy middleware sits between applications and model inference providers, yet this architecture may be the wrong solution to a set of valid problems. Instead, the capabilities these middlewares provide could be handled more gracefully by frameworks and protocols that avoid middleware altogether.
The problems AI proxies aim to solve are indeed significant: separating concerns, decoupling model-specific logic from application code, enabling applications to invoke different models with a consistent API surface, monitoring generative AI usage, latency, and cost, caching inference requests, and managing throttling and API keys for different inference providers. However, introducing a proxy middleware creates additional challenges.
A monolithic proxy folds in monitoring, observability, and caching, which are already well-established concerns in the software development workflow, each with dedicated systems. The extra service layer is itself a security risk, because encrypted requests and user-specific data must pass through it. The proxy also adds a second network hop to the LLM provider, potentially degrading performance and making debugging harder, and it typically does not support local models, which are becoming increasingly important as models get smaller and more efficient. Moreover, many proxy middlewares are closed, managed services from third-party providers, creating a critical external dependency without a failover strategy.
An open source AI framework and storage format can replace the AI proxy layer, providing a uniform API while connecting to relevant services to handle monitoring, caching, and key management separately. AIConfig, a config-driven framework, manages prompts, models, and inference settings as JSON-serializable configs. These configs can be version controlled, evaluated, monitored, and edited in a notebook-like playground, integrating directly into the developer workflow.
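As a rough idea of what such a JSON-serializable config might contain, the sketch below builds and saves one from Python. The field names and values are illustrative, not a definitive copy of the AIConfig schema.

```python
# Illustrative only: a simplified, hypothetical shape for a prompt/model config,
# saved as JSON so it can be version controlled alongside the application code.
import json

config = {
    "name": "support_assistant",
    "metadata": {
        "models": {
            "gpt-4o": {"temperature": 0.2, "max_tokens": 512},
            "mixtral-8x7b": {"temperature": 0.2},
        },
        "default_model": "gpt-4o",
    },
    "prompts": [
        {
            "name": "summarise_ticket",
            "input": "Summarise this support ticket: {{ticket_text}}",
            "metadata": {"model": "gpt-4o"},
        }
    ],
}

with open("support_assistant.aiconfig.json", "w") as f:
    json.dump(config, f, indent=2)
```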
The premise behind AIConfig is that prompts, models, and inference settings should be saved as config, not code, and that a common storage format, model-agnostic and multi-modal, allows for straightforward switching between different models. Breaking the monolithic service down into its constituent parts allows existing service providers to be used for inference, monitoring, caching, and key management (KMS). AIConfig stores and iterates on prompts separately from application code, providing a uniform API surface across any model and modality.
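A minimal usage sketch, assuming the aiconfig Python SDK's AIConfigRuntime.load, run, and get_output_text entry points, and carrying over the illustrative file and prompt names from the config sketch above:

```python
# Minimal sketch: load a saved config and run a named prompt through a uniform API,
# regardless of which model the prompt is bound to. Assumes the aiconfig Python SDK.
import asyncio
from aiconfig import AIConfigRuntime

async def main():
    config = AIConfigRuntime.load("support_assistant.aiconfig.json")
    # Resolve the template parameters and run the prompt; swapping the underlying
    # model is a config change, not an application code change.
    await config.run("summarise_ticket", params={"ticket_text": "VPN drops every hour."})
    print(config.get_output_text("summarise_ticket"))

asyncio.run(main())
```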
The framework offers callback handlers for usage tracking, integrating monitoring for generative AI into existing application monitoring services. Solutions like GPTCache for semantic caching can be integrated straightforwardly with a framework instead of a proxy. Existing KMS services can manage inference endpoint keys, addressing API key management.
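As a sketch of what such a callback handler might look like, assuming aiconfig exposes CallbackManager and CallbackEvent as described in its documentation (treat the exact import path and attributes as assumptions), with a standard logger standing in for whatever monitoring service the application already uses:

```python
# Hypothetical wiring of a usage-tracking callback into the framework,
# assuming aiconfig's CallbackManager / CallbackEvent interface.
import logging
from aiconfig import AIConfigRuntime, CallbackManager, CallbackEvent

logger = logging.getLogger("genai.usage")

async def track_usage(event: CallbackEvent) -> None:
    # Forward the event to the existing monitoring stack (Datadog, Prometheus, etc.)
    # instead of relying on a proxy's built-in dashboard.
    logger.info("aiconfig event: %s", event)

config = AIConfigRuntime.load("support_assistant.aiconfig.json")
config.callback_manager = CallbackManager([track_usage])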
In addition to these capabilities, frameworks enable critical generative AI workflow elements for building production applications. Evaluation is supported through dedicated config artifacts, defining evals and triggering eval runs as part of CI/CD whenever the config changes. A framework allows for collapsing experimentation and productionization into a single workflow, enabling local experimentation in a notebook-like playground for visual editing and rapid prototyping. Governance and version control ensure reproducibility and provenance of the generative AI components of an application, making AIConfig a comprehensive solution for modern AI development.
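One way the eval step could look in practice is a small script that CI runs whenever the config file changes. The prompt name and assertions below are purely illustrative; real evals would live in their own config artifact and cover many more cases.

```python
# Illustrative eval harness a CI job might run when the .aiconfig.json changes.
# Prompt names and checks are hypothetical, not the author's actual eval setup.
import asyncio
from aiconfig import AIConfigRuntime

async def run_evals() -> None:
    config = AIConfigRuntime.load("support_assistant.aiconfig.json")
    await config.run("summarise_ticket", params={"ticket_text": "VPN drops every hour."})
    output = config.get_output_text("summarise_ticket")
    # Simple smoke checks; fail the build if behaviour regresses.
    assert output, "expected a non-empty summary"
    assert "VPN" in output, "summary should mention the reported issue"

if __name__ == "__main__":
    asyncio.run(run_evals())
```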