A behind-the-scenes look at Broadcom's design labs

(techbrew.com)

13 points | by giuliomagnifico 10 days ago

7 comments

  • rkagerer 2 days ago

    Given Broadcom's history (exemplified by the VMware debacle), I expect their "design" lab resembles a boardroom full of MBAs plotting how to milk the last drop.

  • DiabloD3 a day ago

    Man, given Broadcom's history, I wouldn't want their products in a datacenter I owned, purely out of spite. I don't care if Trident, Tomahawk, etc. basically run the world; there are still alternatives that perform just as well.

  • supermatt 2 days ago

    Probably naive question: They talk a lot about how these network switches are geared for "AI data centers"; AI is mentioned at least once in almost every paragraph. How/why would these switches for the "AI world" differ from any other high-performance, low-latency switch?

    • DiabloD3 a day ago

      Because AI is pushing Ethernet standards exclusively for universal RDMA use.

      The minimum to get into this party is to already have 400GbE (QSFP112) or 800GbE (QSFP-DD 112 and QSFP224), and to already be working on 800GbE (QSFP-DD 112) or 1600GbE (QSFP-DD 224).
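      A rough back-of-the-envelope check of those tiers (the lane counts per form factor and the ~100GbE-per-112G-lane / ~200GbE-per-224G-lane mapping below are the usual optics conventions, assumed here rather than taken from the article):

          # Sketch only: nominal port rate = lanes x per-lane Ethernet rate.
          LANES = {"QSFP112": 4, "QSFP224": 4, "QSFP-DD 112": 8, "QSFP-DD 224": 8}
          GBE_PER_LANE = {"112": 100, "224": 200}   # nominal GbE per serdes lane

          for form_factor, lanes in LANES.items():
              rate = GBE_PER_LANE["224" if "224" in form_factor else "112"]
              print(f"{form_factor}: {lanes} x {rate}GbE = {lanes * rate}GbE")

          # -> 400GbE (QSFP112), 800GbE (QSFP-DD 112 and QSFP224),
          #    1600GbE (QSFP-DD 224), matching the tiers above.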

      Broadcom doesn't belong to any of the AI-era SIGs, so they're trying to drag their networking fabric stuff up to speed to match. They do belong to the Ethernet Alliance and, technically, the IBTA, but the IBTA hasn't been relevant since Nvidia bought Mellanox.

      The SIG they need to belong to is the UALink Consortium, which is moving past simple RoCE/iWARP-style RDMA over Ethernet to running a CPU bus over Ethernet. In other words, Ultra Ethernet is trying to do multi-vendor supercomputer stuff, like how AMD did HyperTransport over Mellanox circa 2001-2015 (this is why Nvidia bought Mellanox, btw: they wanted to deprive AMD of an advantage AMD no longer needed... AMD had already moved to an external PCI-E fabric to replace Mellanox, hence the brand switch to Infinity Fabric).
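      To make the "RDMA over Ethernet vs CPU bus over Ethernet" distinction concrete, here's a toy Python sketch (purely illustrative; RdmaEndpoint and LoadStoreWindow are made-up names, not any real verbs or UALink API): RoCE/iWARP-style RDMA means registering memory and posting explicit transfer operations, while bus-style semantics let plain loads and stores hit remote memory as if it were local.

          # Toy illustration only; neither class corresponds to a real API.

          class RdmaEndpoint:
              """One-sided RDMA: register memory, then post explicit operations."""
              def __init__(self):
                  self.regions = {}                 # rkey -> remotely writable buffer

              def register(self, rkey, size):
                  self.regions[rkey] = bytearray(size)

              def rdma_write(self, rkey, offset, data):
                  # explicit posted operation, akin to a work request + completion
                  self.regions[rkey][offset:offset + len(data)] = data

          class LoadStoreWindow:
              """Bus-style semantics: remote memory behaves like local memory."""
              def __init__(self, size):
                  self.mem = bytearray(size)        # stands in for a remote node's RAM

              def __getitem__(self, addr):          # a plain load
                  return self.mem[addr]

              def __setitem__(self, addr, value):   # a plain store
                  self.mem[addr] = value

          nic = RdmaEndpoint()
          nic.register(rkey=7, size=16)
          nic.rdma_write(rkey=7, offset=0, data=b"hi")   # explicit one-sided write

          window = LoadStoreWindow(16)
          window[0] = 0x68                               # just a store, no verbs posted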

      UALink is a socket-to-socket protocol that is PHY-independent, and can use CXL (common with Intel-focused supercomputing), PCI-E (PCI over PCI-E, i.e., normal non-ccNUMA hardware being babysat by the local CPU), Infinity Fabric/xGMI (AMD CPU, AMD GPU), and others, while having native support for RDMA over Ultra Ethernet (200GbE and up) to glue clusters together across NUMA/UALink domains.

      The UALink Consortium was founded by Alibaba, AMD, Apple, Astera Labs, AWS, Cisco, Google, Hewlett Packard Enterprise, Intel, Meta, Microsoft and Synopsys... notice the lack of Broadcom in that list. Nvidia is also not a member of this, as they desperately want a moat to keep the rest of the industry out.

    • m4rtink a day ago

      They cost 2x as much and have a sticker saying "AI", but are otherwise the same.

    • giuliomagnifico 2 days ago

      The article never said that these switches are “made for AI”; instead, it says that due to the high demand for chips for AI workloads, data centers need lots of switches to connect all the units.

      • supermatt a day ago

        I never said “made for AI”. The terms I used come directly from the article, which is literally talking about developing switches for AI data centers. My question was how/why the needs would differ.