Loading…
Wednesday June 24, 2026 4:00pm - 4:25pm PDT
In the agentic era, deploying proprietary AI on-premises raises a critical question: how do you protect model IP when infrastructure admins have full hardware access? This session introduces Private Model as a Service (PMaaS), a production-ready reference architecture that secures AI model weights across their entire lifecycle using hardware-rooted Trusted Execution Environments (TEEs).

We dive into the technical orchestration of Confidential Containers (CoCo) and KServe to build a cryptographically verified inference pipeline with vLLM. Model weights are distributed and decrypted exclusively inside hardware-verified CPU TEEs (Intel TDX, AMD SEV-SNP) with GPU memory protection (NVIDIA H100/B200). Remote attestation via a Key Broker Service (KBS) ensures decryption keys are only released to policy-compliant, verified environments.

We also cover the challenges of running vLLM inside restricted TEEs and our work upstreaming GPU attestation logic into Kata Containers and CoCo. Attendees leave with a practical blueprint for deploying zero-trust confidential AI workloads that decouple model security from infrastructure trust.
Speakers
avatar for Marcos Entenza

Marcos Entenza

Sr. Principal Product Manager, Red Hat
Marcos Entenza, a.k.a Mak, works on the core Red Hat OpenShift Container Platform for hybrid and multi-cloud environments to enable customers to run Red Hat OpenShift anywhere. Mak is an experienced Product Manager passionate about building scalable infrastructures, and he oversees... Read More →
Wednesday June 24, 2026 4:00pm - 4:25pm PDT
Mint Ballroom

Sign up or log in to save this to your schedule, view media, leave feedback and see who's attending!

Share Modal

Share this link via

Or copy link