Practice of large language model training optimization based on a large-scale AI cluster with more than 10,000 domestic NPUs

To address the problems of low compute utilization, poor stability, high difficulty of training optimization, and the immature technology ecosystem of domestic accelerators in AI cluster model training with more than 10,000 NPUs, a large language model training optimization solution based...

Detailed description

Bibliographic details
Journal: Dianxin kexue
Main authors: LOU Tao, NIU Hongweihua, ZHANG Pengfei, DONG Jiangfan, LI Panpan, LI Daotong, XU Weidong, YAO Chenghui, XUE Lianhao, TANG Ting, XIANG Jie
Format: Article
Language: Chinese
Published: Beijing Xintong Media Co., Ltd, 2025-07-01
Subjects:
Online access: http://www.telecomsci.com/zh/article/doi/10.11959/j.issn.1000-0801.2025166/