NVIDIA’s Multi-Instance GPU (MIG) technology partitions a GPU's computing power into separate hardware instances and is intended to give each instance full resource isolation. That isolation does not extend to the last-level TLB (L3 TLB), however, which remains shared among instances, so co-running tenants can interfere with one another's address translations and lose performance. The proposed STAR method dynamically adjusts how TLB entries are shared in order to optimize address translation and reduce this interference, improving performance by an average of 30.2% across various workloads.
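The interference described above can be illustrated with a small toy model. The sketch below is not STAR and does not reflect NVIDIA's actual TLB organization: the fully associative LRU structure, the capacities, the tenant names, and the access patterns are all assumptions chosen only to show how one tenant's streaming accesses can evict another tenant's translations from a shared TLB.

```python
from collections import OrderedDict


class LruTlb:
    """Toy fully associative TLB with LRU replacement (an assumption for illustration)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()  # key: (tenant, virtual page)

    def access(self, tenant, page):
        """Return True on a hit; on a miss, install the translation and return False."""
        key = (tenant, page)
        if key in self.entries:
            self.entries.move_to_end(key)  # mark as most recently used
            return True
        if len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)  # evict the least recently used entry
        self.entries[key] = None
        return False


def hit_rate_of_tenant_a(shared, rounds=100):
    """Hit rate of tenant A, which loops over 8 pages while tenant B streams.

    shared=True : both tenants use one 16-entry TLB.
    shared=False: each tenant gets a private 8-entry TLB.
    """
    if shared:
        tlb_a = tlb_b = LruTlb(16)
    else:
        tlb_a, tlb_b = LruTlb(8), LruTlb(8)

    hits = total = 0
    next_b_page = 0
    for _round in range(rounds):
        for page in range(8):
            hits += tlb_a.access("A", page)
            total += 1
            # Hypothetical tenant B touches 3 never-reused pages per A access.
            for _ in range(3):
                tlb_b.access("B", next_b_page)
                next_b_page += 1
    return hits / total


if __name__ == "__main__":
    print("shared TLB, tenant A hit rate:     ", hit_rate_of_tenant_a(shared=True))
    print("partitioned TLB, tenant A hit rate:", hit_rate_of_tenant_a(shared=False))
```

In this toy setup tenant A never hits in the shared TLB, because tenant B's stream pushes A's entries out before they are reused, while with private partitions A misses only on its first pass over its 8 pages. The sketch models only the interference problem; it does not model STAR's dynamic adjustment of entry sharing.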
The study targets address-translation efficiency in modern datacenters that rely on GPU virtualization. By improving TLB utilization, STAR raises overall performance, a result that is directly relevant to cloud computing and other multi-tenant GPU environments.