Nvidia AI霸主地位受挑戰?推論運算新戰場開打!

Ai




Nvidia在AI訓練賽中獲勝,但推理仍然是未知數

除了像Google的TPU或Amazon的Trainium這類定制雲矽,當前大多數AI訓練集群都是由Nvidia的GPU驅動。然而,儘管Nvidia在AI訓練方面已經佔據了優勢,推理的競爭卻遠未結束。

到目前為止,焦點主要集中在構建更好、更強大和更可靠的模型上。大多數推理工作負載則以概念驗證和AI聊天機器人、圖像生成器等簡單應用為主。因此,大多數AI計算都是針對訓練而優化,而非推理。

隨著這些模型的進步、應用變得更加複雜,以及AI逐漸深入我們的日常生活,這一比例在未來幾年內有望發生劇變。面對這一變化,許多錯過AI訓練機會的芯片公司正摩拳擦掌,準備挑戰Nvidia的市場主導地位。

尋找利基市場

與幾乎普遍需要大量計算資源的訓練相比,推理是一種更為多樣化的工作負載。

在推理方面,性能主要由三個核心因素決定:

1. 內存容量決定了能運行哪些模型。
2. 內存帶寬影響響應生成的速度。
3. 計算能力影響模型響應的時間和同時處理的請求數量。

但你重視哪一項,則取決於模型的架構、參數數量、托管位置和目標受眾。

例如,一個小型的低延遲模型可能更適合使用低功耗的NPU或甚至CPU,而一個多萬億參數的LLM則需要數據中心級別的硬件,擁有TB級的超快內存。

後者正是AMD的MI300系列GPU所針對的對象,該系列擁有192 GB至256 GB的高速HBM內存。更大的內存意味著AMD能在單個服務器中容納比Nvidia更多的前沿模型,這也解釋了為何Meta和Microsoft等公司如此渴望採用這些產品。

在另一端,像Cerebras、SambaNova和Groq等公司則優先考慮速度,利用其以SRAM為主的芯片架構和推測解碼等技術,使模型的運行速度比最佳的基於GPU的推理服務供應商快五倍、十倍甚至二十倍。

隨著推理鏈式思維模型的興起,這些模型可能需要生成數千個單詞——或更具體地說,數千個標記來回答問題,閃電般的推理從一個有趣的花招變成了真正有用的工具。

因此,像d-Matrix這樣的初創公司也在尋求進入“快速推理”市場。該公司預計其Corsair加速器將在第二季度上市,能以每個標記低至2毫秒的延遲運行類似Llama 70B的模型,按我們的估算,這相當於每秒500個標記。該公司對下一代Raptor系列芯片的期望更高,將使用垂直堆疊的DRAM以提升內存容量和帶寬。

在低端市場,我們看到越來越多的供應商如Hailo AI、EnCharge和Axelera正在開發面向邊緣和PC市場的低功耗高性能芯片。

提到PC市場,AMD、Intel、Qualcomm和Apple等較為成熟的芯片製造商正在競相將更強大的NPU集成到其SoC中,以支持AI增強的工作流程。

最後,我們不能忽視雲和超級計算提供商,許多公司將繼續購買Nvidia硬件,同時在內部硅上進行風險對沖。

不要低估Nvidia

儘管Nvidia面臨的競爭比以往更加激烈,但它仍然是AI基礎設施中最大的名字。憑藉其最新一代GPU,Nvidia顯然正在為大規模推理部署的轉變做好準備。

Nvidia於去年推出的GB200 NVL72擴展了其NVLink計算域至72個GPU,總計超過1.4 exaFLOPS和13.5 TB的內存。

在此之前,Nvidia最強大的系統每個節點最多只能支持八個GPU,並且擁有640 GB至1.1 TB的vRAM。這意味著像GPT-4這樣的大型前沿模型必須分佈在多個系統中,不僅是為了將所有參數放入內存,還是為了實現合理的吞吐量。

如果相信Nvidia的預測,NVL72的高速互連結構將能為1.8萬億參數級別的專家模型(如GPT-4)提供30倍的吞吐量提升,這與一個八節點、64 GPU的H100集群相比。

更重要的是,這些是通用GPU,這意味著它們不僅限於訓練或推理。它們可以用來訓練新模型,然後再重新調整來運行它們——這並非所有競爭者的硅都能做到的。

隨著GTC會議即將開始,Nvidia預計將詳細介紹其下一代Blackwell-Ultra平台。如果這與其H200系列GPU類似,那麼應該會專門針對推理進行調整。

考慮到Nvidia早前推出的基於Blackwell的RTX卡,我們也不會驚訝看到L40的繼任者或一些更新的工作站級產品。

最終,推理是一場每美元獲得標記的遊戲

無論AI服務提供商最終在其數據中心中配備什麼硬件,推理的經濟學最終都歸結為每美元獲得的標記數。

我們並不是說開發者不會願意為訪問最新模型或更高吞吐量支付額外費用,尤其是當這能幫助他們的應用或服務脫穎而出時。

但從開發者的角度來看,這些服務不過是連接到他們應用的API接口,讓標記按需流動。

他們使用Nvidia的Blackwell部件或某些你未曾聽說過的定制加速器的事實,完全被抽象化,通常最終以OpenAI兼容的API端點呈現。

這一切顯示出AI推理市場的複雜性和多樣性,未來的競爭將不僅是技術上的較量,更是商業模式的博弈。隨著各大企業持續投入資源和創新,AI推理的潛力無疑將成為未來數位經濟的重要推動力。

以上文章由特價GPT API KEY所翻譯及撰寫。而圖片則由FLUX根據內容自動生成。

🎨 Nano Banana Pro 圖像生成器|打幾句說話就出圖

想畫人像、產品圖、插畫?SSFuture 圖像生成器支援 Flux Gemini Nano Banana Pro 改圖 / 合成, 打廣東話都得,仲可以沿用上一張圖繼續微調。

🆓 Flux 模型即玩,不用登入
🤖 登入後解鎖 Gemini 改圖
📷 支援上載參考圖再生成
⚡ 每天免費額度任你玩
✨ 即刻玩 AI 畫圖
人物:人物姿態表情動作衣著都不變,色調:富士底片日系風格,暖色調,日系輕透感,光影:維持照片中的光影邏輯,輕灑在少女身上,像是日系風格清爽,必須符合原本照片中的光影邏輯,背景:一片海洋與藍天,天空天氣非常晴朗,海與藍天在畫面中維持一半的比例,海的顏色是鮮紅色的海,海非常鮮紅、一片平靜的死海,海上有陽光帶來的一點光班,場景:少女坐在海堤防邊,面對著畫面,而少女的後方是一片遙遙無極的海與藍天,完美的呈現一半的比例,在一個清晨的光線中,陽光並沒有太刺眼,陽光像是輕灑在少女身上還有紅色的海面上,透視關係:構圖不改變,維持原本照片的構圖,平面構圖 {
"intro": "Create an ultra realistic 8K UHD DSLR photo based on the attached image as a reference of facial features, maintaining 100% likeness.",

"subject": {
"identity": "A stylish beautiful woman portrayed as Cleopatra, the eternal Queen of Egypt, exuding power, seduction, and divine authority.",
"angle": "Full-body editorial portrait captured at a refined 3/4 angle, with both the subject and her throne positioned diagonally, rendered in ultra-crisp clarity with no blur.",
"pose": {
"body_position": "She is seated regally on a luxurious Egyptian throne angled slightly to the side, her torso and legs elegantly turned to match the diagonal composition, enhancing her curves and royal poise.",
"hands": "One arm rests gracefully along the angled armrest of the throne, while the other cradles a magnificent royal cat against her body.",
"expression": "She looks directly into the camera with a composed, intelligent, and seductive gaze—calm authority mixed with magnetic allure."
}
},

"appearance": {
"outfit": "An exceptionally bongga, sexy, and ultra-colorful Cleopatra couture gown designed as a high-fashion masterpiece. The gown features a sculpted corset bodice encrusted with multicolored gemstones—turquoise, lapis blue, emerald green, ruby red, amethyst violet, and molten gold—arranged in intricate Egyptian patterns. The fabric transitions into layered sheer silks in jewel tones that cascade dramatically, creating movement and depth. A daring thigh-high slit reveals her leg, while illusion panels and crystal embroidery contour her waist and hips. The gown shimmers with every hue, bold yet luxurious, sensual yet undeniably royal.",
"accessories": "A dramatic Egyptian crown with raised cobra centerpiece and iridescent gemstone inlays, oversized multi-layered gold collar necklace, engraved arm cuffs, crystal-encrusted finger rings, an ornate gold waist belt, anklets with delicate charms, and elegant flat Egyptian sandals.",
"hair": "Her hair is shoulder-length, sleek, and glossy with soft movement, modernized yet inspired by ancient Egyptian elegance, no bangs.",
"makeup": "High-impact, colorful Egyptian glam makeup—intensely elongated kohl eyeliner, bold eyeshadow blended in gold, turquoise, teal, emerald, and hints of violet, sculpted cheekbones with luminous gold highlight, flawless bronzed skin, defined brows, and rich nude-to-berry satin lips with a sensual glow."
},

"props": {
"animal": "A stunning, regal Egyptian cat of exceptional beauty, with sleek, glossy fur patterned in warm sand, charcoal, and soft gold tones. The cat has large almond-shaped eyes that glow amber-gold, finely sculpted features, and an elegant posture. It wears a delicate gold collar adorned with tiny gemstones and a miniature Bastet charm, symbolizing protection, divinity, and royal favor."
},

"background": {
"macro_environment": "A grand royal palace courtyard in ancient Egypt at golden hour, composed diagonally to echo the angled throne, with towering sandstone columns, carved relief walls, and distant pyramids beneath a richly colored desert sky.",
"midground_details": "Palm trees gently swaying, monumental statues of Bastet and other Egyptian deities, ceremonial fire torches, flowing silk banners in jewel tones, and distant palace attendants positioned subtly for scale.",
"micro_elements": "Ultra-sharp hieroglyph carvings, visible stone grain and chisel marks, fine desert sand particles, radiant gemstone reflections, metallic gold highlights, intricate embroidery threads, and realistic sun-cast shadows—every element sharply defined with zero blur."
},

"lighting": {
"type": "Cinematic natural golden-hour lighting enhanced with soft reflective fill light.",
"effect": "Warm sunlight amplifies the vivid colors of the gown and gemstones while sculpted shadows define her face, body, throne, and the cat, creating a dramatic yet luxurious editorial mood."
},

"camera": {
"camera_type": "DSLR",
"resolution": "8K UHD",
"lens": "50mm prime lens",
"aperture": "f/8 for maximum sharpness across subject and background",
"iso": 100,
"shutter_speed": "1/200s",
"focus": "Extreme sharp focus from foreground to background, no bokeh, no blur"
},

"style": "High-fashion editorial, cinematic realism, ultra-luxury Egyptian couture, vibrant jewel-toned palette, historical grandeur fused with modern sensuality, extremely detailed, sharp, powerful, and seductive"
} Prompt:
Use my image in Ultra-realistic, hyper-detailed, 8K cinematic portrait of a young stylish man, using the uploaded image for exact face and hairstyle.
Outfit: An oversized red knit sweater with white hearts, exactly as described in the prompt.
Pose: A hyper-realistic close-up portrait with a messy, cropped framing showing only the boy holding the book. His left hand rests on the wooden table and covers part of his cheek, with a subtle smile on his lips. His other hand holds the book titled "Something I Never Told You" with the word "YOU" written in pink, exactly as
described in the prompt. Background: Not specified.