
KT Cloud Introduces AI SERV: GPU Slicing for Cost-Effective AI Inference

Source: KT Cloud

Seoul, South Korea - KT Cloud introduced a new infrastructure service, AI SERV, targeting the AI inference sector. 

The service uses GPU slicing technology to address the economic inefficiency of running both AI training and inference on the same high-capacity GPUs.

Why it matters: 

AI training needs large amounts of GPU capacity, but typically only for short periods, while AI inference requires a smaller but constant GPU presence.

This mismatch has led to unnecessary costs when organizations use the same GPU resources for both functions.

The Key Points

  • AI SERV aims to mitigate this problem by letting organizations allocate GPU capacity in units of 0.2 GPU at five-minute intervals, which provides financial benefits and greater flexibility. This granularity could significantly affect an organization's AI development cost structure.

  • On performance, the service runs on NVIDIA A100 GPUs and is built on NVIDIA's CUDA platform. According to KT Cloud, this delivers compute speeds twice as fast as competing services, which could shift the competitive landscape in the AI infrastructure market.

  • The service is said to maintain 100% performance even when slicing is in use. This contrasts with existing slicing technologies, which are often known to degrade performance.
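To make the granularity point concrete, here is a minimal Python sketch comparing the GPU-hours an inference workload would consume when capacity is allocated in 0.2-GPU slices per five-minute interval versus whole GPUs. The slice size and interval come from the announcement; the rounding-up rule, the function names, and the demand figures are illustrative assumptions, not KT Cloud's actual billing logic.

```python
SLICE_PERCENT = 20     # one slice = 0.2 of a GPU (figure from the announcement)
INTERVAL_MINUTES = 5   # allocation interval (figure from the announcement)


def ceil_div(a: int, b: int) -> int:
    """Integer ceiling division, avoiding floating-point rounding."""
    return -(-a // b)


def gpu_hours(demand_percent_per_interval, unit_percent: int) -> float:
    """GPU-hours consumed when each interval's demand is rounded up
    to a multiple of unit_percent (assumed allocation rule)."""
    allocated = sum(
        ceil_div(demand, unit_percent) * unit_percent
        for demand in demand_percent_per_interval
    )
    # allocated is in "percent of a GPU" x intervals; convert to GPU-hours.
    return allocated * INTERVAL_MINUTES / (100 * 60)


# Hypothetical one-hour inference load: percent of one GPU needed
# in each of the twelve five-minute intervals.
demand = [30] * 6 + [70] * 3 + [10] * 3

sliced = gpu_hours(demand, SLICE_PERCENT)  # allocate in 0.2-GPU slices
whole = gpu_hours(demand, 100)             # allocate whole GPUs only

print(f"sliced: {sliced:.2f} GPU-hours, whole GPU: {whole:.2f} GPU-hours")
```

Under these made-up demand figures, sliced allocation consumes less than half the GPU-hours of a dedicated whole GPU. The real saving depends on actual pricing and workload, neither of which the announcement specifies.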

The Big Picture

KT Cloud's roadmap for AI SERV includes monitoring, container image cloning, and an "auto-scaling" feature that automatically adjusts infrastructure based on load, making it easier for users to scale.

The launch of AI SERV follows KT Cloud's previous milestones in 2022. 

That year, the company launched HAC (Hyperscale AI Computing), the first pay-as-you-go infrastructure service, and collaborated with "Rebellions" and "Moreh" to develop AI frameworks and cloud semiconductor chips.

It also launched South Korea's first high-performance, low-power NPU (Neural Processing Unit) service in June.

KT Cloud aims to strengthen its AI infrastructure offerings further, including new NPU and GPU services, focusing on cost-effectiveness and high performance.

AI SERV brings flexibility and cost-effectiveness to the AI infrastructure market by enabling more customized resource allocation.