the crazy part in Tingbo He's paper that I missed is that Ascend isn't going to use LogicFolding until the 2030s. They don't have the tools to design their NPUs in 3D. Frankly this looks very bad for them, so bad I think it'll be revised.
Huawei technical paper reveals Ascend AI roadmap delays 3D LogicFolding to the 2030s, relying on 2.5D packaging
Story Overview
A Huawei technical paper sketches an Ascend AI accelerator path that keeps near-term chips on chiplets and 2.5D packaging through 2026 while shifting denser 3D LogicFolding techniques to roughly 2030, with wafer-scale designs eyed for that later window.
Smartphone chips may adopt the folding approach years earlier than AI accelerators.
Separate reports place first commercial LogicFolding silicon in Kirin smartphone SoCs targeted for fall 2026, creating a split timeline where consumer devices test the architecture well before data-center parts.
Power and density numbers for the 2030 parts rest on limited public detail.
Social posts cite 30 kW single-chip draw and transistor density above 400 Mtr/mm² for the planned wafer-scale engines, yet broader reporting has not confirmed those exact figures or the intermediate yield assumptions behind them.
Supportive replies marvel at the computing power of Huawei's planned 30 kW wafer-scale AI chips, while critical replies object to the extreme energy consumption.
Most Activity
ok what the hell. I missed this completely. So, Huawei expects to deploy LogicFolding Ascends in 2030/31, and have >400 Mtr/mm^2, *and* have single-chip power draw of 30 kW. I think this must mean heavy design for yield and, likely, Cerebras-style wafer-scale engines.
What's the largest WSE-3-based cluster? I see there were plans for max 2048 "systems" (i.e., wafers). That's 47 MW, plus change. Huawei 950 SuperCluster is likely >500 MW, 960 (2027) >1 GW? And they want >30 kW *chips* by 2030. Imagine. 32K wafers, working as one.
WSE-2 consumes 20kW while WSE-3 consumes 23kW
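The power figures quoted in the thread can be sanity-checked with a few lines of arithmetic. This is a rough sketch, not a statement about any real deployment: it counts chip power only (no cooling, networking, or host overhead, the "plus change"), reads "32K" as 32,768, and uses a hypothetical helper name, `cluster_power_mw`.

```python
# Back-of-the-envelope check of the cluster power numbers in the thread.
# Assumptions: chip power only; "32K wafers" is read as 32 * 1024.

WSE3_KW = 23            # per-wafer draw quoted in the thread
WSE3_SYSTEMS = 2048     # planned maximum cluster size quoted in the thread
HUAWEI_CHIP_KW = 30     # single-chip draw cited for the ~2030 parts
HUAWEI_CHIPS = 32 * 1024  # assumed meaning of "32K wafers"


def cluster_power_mw(units: int, kw_per_unit: float) -> float:
    """Total chip power in megawatts for `units` chips at `kw_per_unit` kW each."""
    return units * kw_per_unit / 1000


wse3_mw = cluster_power_mw(WSE3_SYSTEMS, WSE3_KW)
huawei_mw = cluster_power_mw(HUAWEI_CHIPS, HUAWEI_CHIP_KW)

print(f"2048 x WSE-3:       {wse3_mw:.1f} MW")    # about 47 MW
print(f"32K x 30 kW chips:  {huawei_mw:.1f} MW")  # just under 1 GW
```

Under these assumptions the 2048-wafer figure comes out at about 47 MW, matching the post, and 32K chips at 30 kW each lands just under 1 GW before any overhead.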
@zephyr_z9 WSE-3 is on 5nm. If they pursue same effective density of both logic and memory via LF, this would be roughly in line

@teortaxesTex W2W HB is easier than D2D or D2W

@teortaxesTex I think they’ll need that much time to figure out the thermals? That’s a big challenge with 3D stacking logic, and with a smartphone SoC it’s maybe manageable.

@zephyr_z9 Yes, but this is a very ambitious design on its own terms. I guess it might not be that hard though. Something like SRAM on the bottom, diamond with microchannels on top, they could cool it.

@teortaxesTex I want to see the cooling plans. Cryogenic possibly

@teortaxesTex In addition, they are carrying out major internal reforms to the fundamental architecture of optical interconnects, and we should start seeing the results in compute card clusters over the next few years. All optical media are manufactured by their own internal departments.

@YGWEvan Just in September, 970 was not finalized. I don't think precommitting to 980/990 designs now is prudent.

@teortaxesTex Holy crap, this compute is going to blow up

@teortaxesTex Huawei's numbers here are truly mind-blowing

@teortaxesTex This power consumption is practically burning through a small country

@teortaxesTex it means around 2028/29, a realistic goal.

@teortaxesTex The Ascend 960 has already completed tape-out, and the designs for the 970 and 980 are finished. Logic folding will only be introduced with the Ascend 990 in 2029, which means you’ll see a massive leap in single-chip performance after 2029.

@Pranavenstein LogicFolding on Kirins promises better thermals. I don't think 5 years are needed to figure this out, but maybe that's a part

@teortaxesTex WSE-4 specs