{"id":24310,"date":"2024-11-13T23:21:17","date_gmt":"2024-11-14T02:21:17","guid":{"rendered":"https:\/\/mastertrend.info\/?p=24310"},"modified":"2026-04-28T16:41:07","modified_gmt":"2026-04-28T19:41:07","slug":"the-secrets-of-xai-colossus","status":"publish","type":"post","link":"https:\/\/mastertrend.info\/en\/los-secretos-de-xai-colossus\/","title":{"rendered":"The Secrets of xAI Colossus: 100,000 GPUs"},"content":{"rendered":"<h2>\ud83c\udf1f The Secrets of xAI Colossus: Discover Elon Musk&#039;s 100,000 GPU AI Cluster \ud83d\ude80<\/h2>\n<p>If you&#039;re passionate about artificial intelligence and cutting-edge technology, you can&#039;t miss what Elon Musk is doing with his AI cluster. This giant, known as xAI Colossus, is making waves in the tech world. With the processing power of a staggering 100,000 GPUs, this cluster is a true marvel of modern engineering. \ud83e\udd16\ud83d\udcbb<\/p>\n<p>In this article, we are going to unravel the secrets behind this amazing <a href=\"https:\/\/mastertrend.info\/en\/boston-dynamics-atlas\/\" data-wpil-monitor-id=\"810\" target=\"_blank\">technological<\/a> innovation. We will explore how xAI Colossus is revolutionizing the field of artificial intelligence and what this means for the future. \ud83c\udf1f Get ready for a fascinating journey into the heart of one of the greatest <a href=\"https:\/\/mastertrend.info\/en\/most-valuable-company-in-the-world-2\/\" data-wpil-monitor-id=\"997\" target=\"_blank\">technological<\/a> feats of our time. \ud83d\ude80 Don&#039;t miss it!<\/p>\n<p>Elon Musk&#039;s expensive new project, the xAI Colossus AI supercomputer, has been detailed for the first time. YouTuber ServeTheHome was given access to the Supermicro servers inside the 100,000 GPU beast, showing off various facets of this supercomputer. Musk&#039;s xAI Colossus supercluster has been online for almost two months, following an assembly that took 122 days. 
\ud83d\udd27\ud83d\udca1<\/p>\n<div class=\"youtube-video youtube-facade\" data-nosnippet=\"\">\n<div class=\"video-aspect-box\">\n<div class=\"watch-on-iframe-Jf8EPSBZU7Y\" title=\"Inside the World&#039;s Largest AI Supercluster xAI Colossus - YouTube\" data-yt-video-token=\"Jf8EPSBZU7Y\"><span class=\"youtube-video-title\">Inside the world&#039;s largest AI supercluster, xAI Colossus \u2013 YouTube<\/span><\/div>\n<div title=\"Inside the World&#039;s Largest AI Supercluster xAI Colossus - YouTube\" data-yt-video-token=\"Jf8EPSBZU7Y\">\n<div class=\"jeg_video_container jeg_video_content\"><iframe title=\"Inside the World&#039;s Largest AI Supercluster xAI Colossus\" width=\"500\" height=\"281\" src=\"https:\/\/www.youtube.com\/embed\/Jf8EPSBZU7Y?feature=oembed\" frameborder=\"0\" allow=\"accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share\" referrerpolicy=\"strict-origin-when-cross-origin\" allowfullscreen><\/iframe><\/div>\n<h2>What&#039;s inside a 100,000 GPU cluster? \ud83e\udd14<\/h2>\n<p>Patrick from ServeTheHome takes us on a camera tour through different parts of the server, offering a bird&#039;s-eye view of its operations. Some of the more specific <a href=\"https:\/\/mastertrend.info\/en\/category\/tutorials\/\" target=\"_blank\" rel=\"noopener\" data-wpil-monitor-id=\"1891\">details<\/a> about the supercomputer, such as its power consumption and the size of its pumps, could not be revealed because of a confidentiality agreement, and xAI blurred and censored parts of the video before its release. \ud83c\udfa5<\/p>\n<p>Despite this, the most important parts, such as the Supermicro <a href=\"https:\/\/mastertrend.info\/en\/future-of-nvidia-geforce-rtx-50-series\/\" target=\"_blank\" rel=\"noopener\" data-wpil-monitor-id=\"2950\">GPU<\/a> servers, remained virtually untouched throughout the footage. 
These GPU servers are <a href=\"https:\/\/mastertrend.info\/en\/the-ualink-consortium\/\" target=\"_blank\" rel=\"noopener\" data-wpil-monitor-id=\"107\">Nvidia HGX<\/a> H100 systems, a powerful server solution featuring eight H100 GPUs each. \ud83d\ude80 The HGX H100 platform is integrated within Supermicro&#039;s 4U Universal GPU Liquid <a href=\"https:\/\/mastertrend.info\/en\/arctic-s12038\/\" target=\"_blank\" rel=\"noopener\" data-wpil-monitor-id=\"1507\">Cooled<\/a> system, providing easily hot-swappable liquid cooling for each GPU. \u2744\ufe0f<\/p>\n<p>These servers are organized in racks containing eight servers each, totaling 64 <a href=\"https:\/\/mastertrend.info\/en\/rtx-5080-fe-instability\/\" target=\"_blank\" rel=\"noopener\" data-wpil-monitor-id=\"3007\">GPUs<\/a> per rack. 1U manifolds are sandwiched between the HGX H100 servers, providing the necessary liquid cooling for the servers. At the bottom of each rack, we find another 4U Supermicro unit, this time equipped with a redundant pump system and a rack monitoring system. \ud83d\udd0d<\/p>\n<picture data-hydrate=\"true\"><\/picture><picture data-hydrate=\"true\"><img decoding=\"async\" class=\"image-wrapped__image image__image\" src=\"https:\/\/mastertrend.info\/wp-content\/uploads\/2024\/10\/1730144712_202_Primera-mirada-en-profundidad-al-cluster-de-IA-de-100000.jpg\" alt=\"Four banks of xAI HGX H100 server racks, each with capacity for eight servers. 
\" data-normal=\"https:\/\/vanilla.futurecdn.net\/cyclingnews\/media\/img\/missing-image.svg\" data-pin-media=\"https:\/\/mastertrend.info\/wp-content\/uploads\/2024\/10\/1730144712_202_Primera-mirada-en-profundidad-al-cluster-de-IA-de-100000.jpg\" data-pin-nopin=\"true\" data-slice-image=\"true\" title=\"\"><\/picture><span class=\"caption-credit__credit\">(Image credit: ServeTheHome)<\/span> <picture data-hydrate=\"true\"><source class=\"image-wrapped__image image__image\" type=\"image\/webp\" data-normal=\"https:\/\/vanilla.futurecdn.net\/cyclingnews\/media\/img\/missing-image.svg\" data-original-mos=\"https:\/\/mastertrend.info\/wp-content\/uploads\/2024\/10\/1730144712_686_Primera-mirada-en-profundidad-al-cluster-de-IA-de-100000.jpg\" data-pin-media=\"https:\/\/mastertrend.info\/wp-content\/uploads\/2024\/10\/1730144712_686_Primera-mirada-en-profundidad-al-cluster-de-IA-de-100000.jpg\" data-pin-nopin=\"true\" data-slice-image=\"true\" \/><source class=\"image-wrapped__image image__image\" type=\"image\/jpeg\" data-normal=\"https:\/\/vanilla.futurecdn.net\/cyclingnews\/media\/img\/missing-image.svg\" data-original-mos=\"https:\/\/mastertrend.info\/wp-content\/uploads\/2024\/10\/1730144712_686_Primera-mirada-en-profundidad-al-cluster-de-IA-de-100000.jpg\" data-pin-media=\"https:\/\/mastertrend.info\/wp-content\/uploads\/2024\/10\/1730144712_686_Primera-mirada-en-profundidad-al-cluster-de-IA-de-100000.jpg\" data-pin-nopin=\"true\" data-slice-image=\"true\" \/><img decoding=\"async\" class=\"image-wrapped__image image__image\" src=\"https:\/\/mastertrend.info\/wp-content\/uploads\/2024\/10\/1730144712_686_Primera-mirada-en-profundidad-al-cluster-de-IA-de-100000.jpg\" alt=\"The rear access of an xAI Colossus GPU server. Nine Ethernet cables run out of each server, with four power supplies on each. 
The power and water cooling hoses are also visible.\" data-normal=\"https:\/\/vanilla.futurecdn.net\/cyclingnews\/media\/img\/missing-image.svg\" data-pin-media=\"https:\/\/mastertrend.info\/wp-content\/uploads\/2024\/10\/1730144712_686_Primera-mirada-en-profundidad-al-cluster-de-IA-de-100000.jpg\" data-pin-nopin=\"true\" data-slice-image=\"true\" title=\"\"><\/picture><span class=\"caption-credit__credit\">(Image credit: ServeTheHome)<\/span><\/p>\n<p>\ud83d\udda5\ufe0f These racks are organized in groups of eight, allowing for 512 <a href=\"https:\/\/mastertrend.info\/en\/asus-comments-on-q-release-slim\/\" target=\"_blank\" rel=\"noopener\" data-wpil-monitor-id=\"3097\">GPUs<\/a> per array. Each server is equipped with four redundant <a href=\"https:\/\/mastertrend.info\/en\/armando-mi-pc\/\" target=\"_blank\" rel=\"noopener\" data-wpil-monitor-id=\"2767\">power supplies<\/a>. At the back of the <a href=\"https:\/\/mastertrend.info\/en\/laptop-rtx-50\/\" target=\"_blank\" rel=\"noopener\" data-wpil-monitor-id=\"3420\">GPU<\/a> racks, there are three-phase power supplies, Ethernet switches, and a rack-sized manifold that provides all the liquid cooling. \ud83d\udca7<\/p>\n<p>There are more than 1,500 <a href=\"https:\/\/mastertrend.info\/en\/neural-texture-compression\/\" target=\"_blank\" rel=\"noopener\" data-wpil-monitor-id=\"3833\">GPU<\/a> racks in the Colossus cluster, spread across close to 200 arrays of racks. According to Nvidia CEO Jensen Huang, the GPUs in these 200 arrays were fully installed in just three weeks. 
\ud83d\ude80<\/p>\n<p>Since an AI supercluster constantly training models requires huge bandwidth, xAI went all out on its <a href=\"https:\/\/mastertrend.info\/en\/birthday-notifications-on-facebook\/\" target=\"_blank\" rel=\"noopener\" data-wpil-monitor-id=\"3164\">network<\/a> interconnect. Each graphics card has a dedicated 400GbE NIC (network interface controller), with an additional 400GbE NIC per server. \ud83d\udd17 With eight GPUs per server, that makes nine 400GbE links, so each HGX H100 server has 3.6 terabits per second of Ethernet. Impressive, isn&#039;t it? And yes, the entire cluster runs on Ethernet, rather than InfiniBand or other exotic connections that are standard in the supercomputing world. \ud83c\udf10<\/p>\n<picture data-hydrate=\"true\"><source class=\"image-wrapped__image image__image\" type=\"image\/webp\" data-normal=\"https:\/\/vanilla.futurecdn.net\/cyclingnews\/media\/img\/missing-image.svg\" data-original-mos=\"https:\/\/mastertrend.info\/wp-content\/uploads\/2024\/10\/1730144713_456_Primera-mirada-en-profundidad-al-cluster-de-IA-de-100000.jpg\" data-pin-media=\"https:\/\/mastertrend.info\/wp-content\/uploads\/2024\/10\/1730144713_456_Primera-mirada-en-profundidad-al-cluster-de-IA-de-100000.jpg\" data-pin-nopin=\"true\" data-slice-image=\"true\" \/><source class=\"image-wrapped__image image__image\" type=\"image\/jpeg\" data-normal=\"https:\/\/vanilla.futurecdn.net\/cyclingnews\/media\/img\/missing-image.svg\" data-original-mos=\"https:\/\/mastertrend.info\/wp-content\/uploads\/2024\/10\/1730144713_456_Primera-mirada-en-profundidad-al-cluster-de-IA-de-100000.jpg\" data-pin-media=\"https:\/\/mastertrend.info\/wp-content\/uploads\/2024\/10\/1730144713_456_Primera-mirada-en-profundidad-al-cluster-de-IA-de-100000.jpg\" data-pin-nopin=\"true\" data-slice-image=\"true\" \/><img decoding=\"async\" class=\"image-wrapped__image image__image\" src=\"https:\/\/mastertrend.info\/wp-content\/uploads\/2024\/10\/1730144713_456_Primera-mirada-en-profundidad-al-cluster-de-IA-de-100000.jpg\" 
alt=\"A shot looking down at the waves and waves of yellow Ethernet cables connecting the xAI Colossus cluster to itself. Several layers of overly wide cables are embedded in the ceiling.\" data-normal=\"https:\/\/vanilla.futurecdn.net\/cyclingnews\/media\/img\/missing-image.svg\" data-pin-media=\"https:\/\/mastertrend.info\/wp-content\/uploads\/2024\/10\/1730144713_456_Primera-mirada-en-profundidad-al-cluster-de-IA-de-100000.jpg\" data-pin-nopin=\"true\" data-slice-image=\"true\" title=\"\"><\/picture><span class=\"caption-credit__credit\">(Image credit: ServeTheHome)<\/span><picture data-hydrate=\"true\"><source class=\"image-wrapped__image image__image\" type=\"image\/webp\" data-normal=\"https:\/\/vanilla.futurecdn.net\/cyclingnews\/media\/img\/missing-image.svg\" data-original-mos=\"https:\/\/mastertrend.info\/wp-content\/uploads\/2024\/10\/1730144713_114_Primera-mirada-en-profundidad-al-cluster-de-IA-de-100000.jpg\" data-pin-media=\"https:\/\/mastertrend.info\/wp-content\/uploads\/2024\/10\/1730144713_114_Primera-mirada-en-profundidad-al-cluster-de-IA-de-100000.jpg\" data-pin-nopin=\"true\" data-slice-image=\"true\" \/><source class=\"image-wrapped__image image__image\" type=\"image\/jpeg\" data-normal=\"https:\/\/vanilla.futurecdn.net\/cyclingnews\/media\/img\/missing-image.svg\" data-original-mos=\"https:\/\/mastertrend.info\/wp-content\/uploads\/2024\/10\/1730144713_114_Primera-mirada-en-profundidad-al-cluster-de-IA-de-100000.jpg\" data-pin-media=\"https:\/\/mastertrend.info\/wp-content\/uploads\/2024\/10\/1730144713_114_Primera-mirada-en-profundidad-al-cluster-de-IA-de-100000.jpg\" data-pin-nopin=\"true\" data-slice-image=\"true\" \/><img decoding=\"async\" class=\"image-wrapped__image image__image\" src=\"https:\/\/mastertrend.info\/wp-content\/uploads\/2024\/10\/1730144713_114_Primera-mirada-en-profundidad-al-cluster-de-IA-de-100000.jpg\" alt=\"xAI&#039;s Colossus CPU computing servers, which look exactly like Supermicro&#039;s storage servers, are also 
used extensively on site.\" data-normal=\"https:\/\/vanilla.futurecdn.net\/cyclingnews\/media\/img\/missing-image.svg\" data-pin-media=\"https:\/\/mastertrend.info\/wp-content\/uploads\/2024\/10\/1730144713_114_Primera-mirada-en-profundidad-al-cluster-de-IA-de-100000.jpg\" data-pin-nopin=\"true\" data-slice-image=\"true\" title=\"\"><\/picture><span class=\"caption-credit__credit\">(Image credit: ServeTheHome)<\/span><\/p>\n<p>Of course, a supercomputer like Colossus, which trains AI models such as the Grok 3 chatbot, needs more than just <a title=\"\u26a0\ufe0f A GPU compatible with D3D11 is required: Quick Fix!\" href=\"https:\/\/mastertrend.info\/en\/a-d3d11-compatible-gpu-is-required\/\" target=\"_blank\" rel=\"noopener\" data-wpil-monitor-id=\"4197\">GPUs<\/a> to run at its best. \ud83d\udd25 While details on storage and CPU servers in Colossus are somewhat limited, thanks to Patrick&#039;s video and the <a href=\"https:\/\/www.servethehome.com\/inside-100000-nvidia-gpu-xai-colossus-cluster-supermicro-helped-build-for-elon-musk\/3\/\" target=\"_blank\" rel=\"noopener\" data-schema-attribute=\"mentions\">blog post<\/a>, we know that these servers are mostly housed in Supermicro chassis. \ud83d\ude80<\/p>\n<p>NVMe-forward 1U servers with x86 CPUs inside are used, providing both storage and computing power, and are equipped with liquid cooling at the rear. \ud83d\udca7 Additionally, densely packed banks of Tesla Megapack batteries can be seen outside. \u26a1\ufe0f<\/p>\n<p>The start-and-stop nature of the array, with its milliseconds of latency between banks, was too much for the conventional power grid or Musk&#039;s diesel generators to handle. So several Tesla Megapacks (each with a capacity of 3.9 MWh) are used as an energy buffer between the power <a href=\"https:\/\/mastertrend.info\/en\/category\/networks\/\" target=\"_blank\" rel=\"noopener\" data-wpil-monitor-id=\"1922\">grid<\/a> and the supercomputer. 
\ud83d\udda5\ufe0f\ud83d\udd0b This ensures optimal and efficient operation, avoiding interruptions. \ud83d\udea6\u2728<\/p>\n<h2>\ud83c\udf1f How Colossus is used, and Musk&#039;s stable of supercomputers \ud83c\udf1f<\/h2>\n<p>The xAI Colossus supercomputer is currently, according to Nvidia, the largest AI supercomputer in the world. \ud83e\udd2f While many of the world&#039;s leading supercomputers are used in research by contractors or academics to study weather patterns, diseases, or other complex tasks, Colossus is solely responsible for training X&#039;s (formerly Twitter) various AI models. Most notable is Grok 3, Elon&#039;s &quot;anti-woke&quot; chatbot that&#039;s available only to X Premium subscribers. \ud83e\udd16<\/p>\n<p>Additionally, ServeTheHome was informed that Colossus is training <a href=\"https:\/\/mastertrend.info\/en\/microsoft-will-release-deepseek-r1-for-npu\/\" target=\"_blank\" rel=\"noopener\" data-wpil-monitor-id=\"3843\">AI models<\/a> &quot;of the future&quot;: models whose uses and capabilities are supposedly beyond the current capabilities of AI. \ud83d\ude80 The first phase of Colossus&#039;s construction is complete and the cluster is fully operational, but it is not all finished yet. The Memphis supercomputer will soon be upgraded to double its GPU capacity, with an additional 50,000 H100 GPUs and 50,000 next-generation H200 GPUs. \ud83d\udd25<\/p>\n<p>This <a href=\"https:\/\/mastertrend.info\/en\/make-reddit-faster\/\" target=\"_blank\" rel=\"noopener\" data-wpil-monitor-id=\"1488\">upgrade<\/a> will also more than double its power consumption, which is already too much for the 14 diesel generators Musk added to the site in July to handle. 
\u26a1 While this falls short of Musk&#039;s promise of 300,000 H200s inside Colossus, that could be part of Phase 3 of the <a href=\"https:\/\/mastertrend.info\/en\/update-windows-drivers\/\" target=\"_blank\" rel=\"noopener\" data-wpil-monitor-id=\"1546\">upgrades<\/a>. \ud83d\udd0b<\/p>\n<p>Separately, the 50,000 GPU Cortex supercomputer at Tesla&#039;s &quot;Giga Texas&quot; plant also belongs to a Musk company. Cortex is dedicated to training Tesla&#039;s self-driving AI technology using camera feeds and image detection, as well as Tesla&#039;s autonomous robots and other AI projects. \ud83e\udd16\ud83d\ude97<\/p>\n<p>Additionally, Tesla will soon see the construction of the Dojo supercomputer in Buffalo, New York, a $500 million project. \ud83d\udcb8 Meanwhile, industry speculators like Baidu CEO Robin Li predict that 99% of AI companies could collapse when the bubble bursts. Whether Musk&#039;s record spending on AI will backfire or pay off remains to be seen. \u23f3<\/p>\n<\/div>\n<\/div>\n<\/div>","protected":false},"excerpt":{"rendered":"<p>\ud83c\udf1f The Secrets of xAI Colossus: Discover Elon Musk&#039;s 100,000 GPU AI Cluster 
\ud83d\ude80<\/p>","protected":false},"author":1,"featured_media":83215,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"ai_generated_summary":"","iawp_total_views":23,"jnews-multi-image_gallery":[],"jnews_single_post":{"format":"standard","override":[{"template":"1","parallax":"1","fullscreen":"1","layout":"right-sidebar","sidebar":"default-sidebar","second_sidebar":"default-sidebar","sticky_sidebar":"1","share_position":"top","share_float_style":"share-monocrhome","show_share_counter":"1","show_view_counter":"1","show_featured":"1","show_post_meta":"1","show_post_author":"1","show_post_author_image":"1","show_post_date":"1","post_date_format":"default","post_date_format_custom":"Y\/m\/d","show_post_category":"1","show_post_reading_time":"1","post_reading_time_wpm":"300","post_calculate_word_method":"str_word_count","show_zoom_button":"1","zoom_button_out_step":"2","zoom_button_in_step":"3","show_post_tag":"1","show_prev_next_post":"1","show_popup_post":"1","show_comment_section":"1","number_popup_post":"1","show_author_box":"1","show_post_related":"0","show_inline_post_related":"0"}],"image_override":[{"single_post_thumbnail_size":"crop-500","single_post_gallery_size":"crop-500"}],"trending_post_position":"meta","trending_post_label":"Trending","sponsored_post_label":"Sponsored 
by","disable_ad":"0","subtitle":""},"jnews_primary_category":[],"jnews_social_meta":[],"jnews_review":[],"enable_review":"","type":"percentage","name":"","summary":"","brand":"","sku":"","good":[],"bad":[],"score_override":"","override_value":"","rating":[],"price":[],"jnews_override_counter":{"view_counter_number":"0","share_counter_number":"0","like_counter_number":"0","dislike_counter_number":"0"},"footnotes":""},"categories":[880],"tags":[1445,1709,1622],"class_list":["post-24310","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-ia","tag-evergreencontent","tag-gpu","tag-innovacion"],"_links":{"self":[{"href":"https:\/\/mastertrend.info\/en\/wp-json\/wp\/v2\/posts\/24310","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/mastertrend.info\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/mastertrend.info\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/mastertrend.info\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/mastertrend.info\/en\/wp-json\/wp\/v2\/comments?post=24310"}],"version-history":[{"count":2,"href":"https:\/\/mastertrend.info\/en\/wp-json\/wp\/v2\/posts\/24310\/revisions"}],"predecessor-version":[{"id":103867,"href":"https:\/\/mastertrend.info\/en\/wp-json\/wp\/v2\/posts\/24310\/revisions\/103867"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/mastertrend.info\/en\/wp-json\/wp\/v2\/media\/83215"}],"wp:attachment":[{"href":"https:\/\/mastertrend.info\/en\/wp-json\/wp\/v2\/media?parent=24310"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/mastertrend.info\/en\/wp-json\/wp\/v2\/categories?post=24310"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/mastertrend.info\/en\/wp-json\/wp\/v2\/tags?post=24310"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}