Zhongzhu / Charlie



200 Posts 25 Tags

© 2019 - 2026 Zhongzhu Zhou

Tag

#Pipeline Parallelism

15 posts tagged with this label.

2026

07-02 EN

Tangram: Hiding GPU Heterogeneity for Efficient LLM Parallelization
07-02 中

Tangram: An Efficient LLM Parallelization System That Hides Hardware Differences in Heterogeneous GPU Clusters
06-25 EN

ReMP: Low-Downtime Runtime Model-Parallelism Reconfiguration for LLM Serving
06-25 中

ReMP: Low-Downtime Runtime Parallel-Topology Reconfiguration in LLM Inference Serving
06-11 EN

MegaScale: Engineering 55% MFU at 12,288 GPUs for LLM Training
06-11 中

MegaScale: How ByteDance Achieves 55% MFU in Large-Scale LLM Training on 12,288 GPUs
05-20 EN

Sarathi-Serve: Taming the Throughput–Latency Tradeoff in LLM Inference — Technical Review
05-20 中

Sarathi-Serve: Taming the Throughput-Latency Tradeoff in LLM Inference with Chunked Prefill (Reading Notes)
05-14 EN

DisagMoE: Disaggregating Attention and FFN to Beat the MoE All-to-All Bottleneck
05-14 中

DisagMoE: Breaking Through the All-to-All Bottleneck in MoE Training by Disaggregating Attention and FFN
05-07 EN

Piper: Efficient Large-Scale MoE Training via Resource Modeling and Pipelined Hybrid Parallelism
04-16 EN

PipeDream: Turning Pipeline Parallelism into a Practical Training System — Deep Technical Review
04-16 中

PipeDream: Turning Pipeline Parallelism into a Truly Trainable System (In-Depth Reading Notes)
04-02 EN

GPipe: Easy Scaling with Micro-Batch Pipeline Parallelism — In-Depth Technical Review
04-02 中

GPipe: Large-Scale Model Training with Micro-Batch Pipeline Parallelism (In-Depth Reading Notes)

Zhongzhu Zhou / Charlie Zhou

Efficient machine learning, systems and research notes.

