Sungmin Yun
My ultimate goal is to reduce the computing cost of AI models and make them more accessible to
everyone.
My research focuses on computer architectures and systems for AI, with an emphasis on the efficient serving of
large-scale AI models.
I have worked on various AI models, including recommender systems and graph neural networks, and my recent
work focuses on large language models (LLMs).
I am a postdoctoral researcher at Seoul National University (SNU).
I received my Ph.D. in Artificial Intelligence from SNU, under the supervision of Prof. Jung Ho Ahn in the SCALE Lab.
Before pursuing my Ph.D., I received a B.S. in Integrated Information Technology from Yonsei University in 2020.
Publications
- The New LLM Bottleneck: A Systems Perspective on Latent Attention and Mixture-of-Experts
  arXiv 2025
  Sungmin Yun, Seonyong Park, Hwayong Nam, Younjoo Lee, Gunjun Lee, Kwanhee Kyung, Sangpyo Kim, Nam Sung Kim, Jongmin Kim, Hyungyo Kim, Juhwan Cho, Seungmin Baek, Jung Ho Ahn
- SSD Offloading for LLM Mixture-of-Experts Weights Considered Harmful in Energy Efficiency
  IEEE CAL 2025
  Kwanhee Kyung, Sungmin Yun, Jung Ho Ahn
- COSMOS: A CXL-Based Full In-Memory System for Approximate Nearest Neighbor Search
  IEEE CAL 2025
  Seoyoung Ko, Hyunjeong Shim, Wanju Doh, Sungmin Yun, Jinin So, Yongsuk Kwon, Sang-Soo Park, Si-Dong Roh, Minyong Yoon, Taeksang Song, Jung Ho Ahn
- Anaheim: Architecture and Algorithms for Processing Fully Homomorphic Encryption in Memory
  HPCA 2025
  Jongmin Kim, Sungmin Yun, Hyesung Ji, Wonseok Choi, Sangpyo Kim, Jung Ho Ahn
- Duplex: A Device for Large Language Models with Mixture of Experts, Grouped Query Attention, and Continuous Batching
  MICRO 2024
  Sungmin Yun, Kwanhee Kyung, Juhwan Cho, Jaewan Choi, Jongmin Kim, Byeongho Kim, Sukhan Lee, Kyomin Sohn, Jung Ho Ahn
- CLAY: CXL-based Scalable NDP Architecture Accelerating Embedding Layers
  ICS 2024
  Sungmin Yun, Hwayong Nam, Kwanhee Kyung, Jaehyun Park, Byeongho Kim, Yongsuk Kwon, Eojin Lee, Jung Ho Ahn
- GraNDe: Efficient Near-Data Processing Architecture for Graph Neural Networks
  IEEE TC 2023
  Sungmin Yun, Hwayong Nam, Jaehyun Park, Byeongho Kim, Jung Ho Ahn, Eojin Lee
- GraNDe: Near-Data Processing Architecture with Adaptive Matrix Mapping for Graph Convolutional Networks
  IEEE CAL 2022
  Sungmin Yun, Byeongho Kim, Jaehyun Park, Hwayong Nam, Jung Ho Ahn, Eojin Lee
- TRiM: Enhancing Processor-Memory Interfaces with Scalable Tensor Reduction in Memory
  MICRO 2021
  Jaehyun Park, Byeongho Kim, Sungmin Yun, Eojin Lee, Minsoo Rhu, Jung Ho Ahn