Stanley Zheng
Stanley's Blog

Reading Notes for FasterMoE
Created 2025-03-28 | Reading Paper
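Summary: Abstract & Introduction & Background and Challenges
The opening is once again a brief introduction to MoE, much the same as in other papers. This paper is also about training, and it names three challenges:
  • dynamic load imbalance: called "Dynamic expert selection" in the intro. This one is fairly obvious: the experts selected are different every time.
  • inefficient synchronous execution mode: called "Inefficient synchronous operations" in the intro. Experts have dependencies, so a worker needs data from other workers and has to wait for it.
  • congested all-to-all communication: called "Mismatch of model design and network topology" in the intro. My reading is that current systems only handle the placement of the experts' computation...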
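To make the first challenge concrete, here is a minimal toy sketch of how top-1 routing can leave experts with unequal token counts that change from step to step. It is not FasterMoE's code: the function names `route_top1` and `imbalance`, the `skew` parameter, and the sampling-based gate are all assumptions made for illustration, standing in for a learned gate.

```python
import random
from collections import Counter


def route_top1(num_tokens, num_experts, skew, seed):
    """Assign each token to exactly one expert (toy top-1 routing).

    Hypothetical stand-in for a learned gate: expert i is sampled with
    weight skew**i, so skew=1.0 is uniform and smaller values favor
    low-numbered experts.
    """
    rng = random.Random(seed)
    weights = [skew ** i for i in range(num_experts)]
    choices = rng.choices(range(num_experts), weights=weights, k=num_tokens)
    counts = Counter(choices)
    # Return a dense list so experts that received no tokens show up as 0.
    return [counts.get(e, 0) for e in range(num_experts)]


def imbalance(counts):
    """Ratio of the busiest expert's load to the mean load (1.0 = balanced)."""
    mean = sum(counts) / len(counts)
    return max(counts) / mean


# Each "step" uses a different seed, mimicking a new batch of tokens.
for step in range(3):
    counts = route_top1(num_tokens=1024, num_experts=4, skew=0.6, seed=step)
    print(step, counts, round(imbalance(counts), 2))
```

In a synchronous setup, every worker would wait for the busiest expert, so an imbalance ratio well above 1.0 in this toy model corresponds to idle time on the other workers.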
Hi, I am Stanley. I am currently a CS student at the University of Wisconsin-Madison.
©2019 - 2026 By Stanley Zheng