Development of core NPU technology that improves ChatGPT inference performance by over 60%
The Korea Advanced Institute of Science and Technology (KAIST)
Reports and Proceedings
The latest generative AI models, such as OpenAI's GPT-4 (ChatGPT) and Google's Gemini 2.5, require not only high memory bandwidth but also large memory capacity. This is why companies operating generative AI clouds, such as Microsoft and Google, purchase hundreds of thousands of NVIDIA GPUs. As a solution to the core challenges of building such high-performance AI infrastructure, Korean researchers have succeeded in developing an NPU (Neural Processing Unit)* core technology that improves the inference performance of generative AI models by an average of more than 60% while consuming approximately 44% less power than the latest GPUs.
*NPU (Neural Processing Unit): An AI-specific semiconductor chip designed to rapidly process artificial neural networks.
- Funders
- National Research Foundation of Korea; the Institute for Information & Communications Technology Planning & Evaluation; AI Semiconductor Graduate School Support Project