Zhongzhu / Charlie
Tag: #Pipeline Parallelism
15 posts tagged with this label. Back to all tags or the main feed.
2026
07-02
EN
Tangram: Hiding GPU Heterogeneity for Efficient LLM Parallelization
07-02
ZH
Tangram: An Efficient LLM Parallelization System That Hides Hardware Differences in Heterogeneous GPU Clusters
06-25
EN
ReMP: Low-Downtime Runtime Model-Parallelism Reconfiguration for LLM Serving
06-25
ZH
ReMP: Low-Downtime Runtime Parallel-Topology Reconfiguration for LLM Inference Serving
06-11
EN
MegaScale: Engineering 55% MFU at 12,288 GPUs for LLM Training
06-11
ZH
MegaScale: How ByteDance Reached 55% MFU on 12,288 GPUs for Large-Scale LLM Training
05-20
EN
Sarathi-Serve: Taming the Throughput–Latency Tradeoff in LLM Inference — Technical Review
05-20
ZH
Sarathi-Serve: Taming the Throughput-Latency Tradeoff in LLM Inference with chunked-prefill (Reading Notes)
05-14
EN
DisagMoE: Disaggregating Attention and FFN to Beat the MoE All-to-All Bottleneck
05-14
ZH
DisagMoE: Breaking Through the All-to-All Bottleneck in MoE Training by Disaggregating Attention and FFN
05-07
EN
Piper: Efficient Large-Scale MoE Training via Resource Modeling and Pipelined Hybrid Parallelism
04-16
EN
PipeDream: Turning Pipeline Parallelism into a Practical Training System — Deep Technical Review
04-16
ZH
PipeDream: Making Pipeline Parallelism a Truly Trainable System (In-Depth Reading Notes)
04-02
EN
GPipe: Easy Scaling with Micro-Batch Pipeline Parallelism — In-Depth Technical Review
04-02
ZH
GPipe: Large-Scale Model Training with Micro-Batch Pipeline Parallelism (In-Depth Reading Notes)