阿里Qwen3深度拆解:超大語料+混合推理訓練揭秘




解構阿里巴巴Qwen3訓練之路:大模型新時代的養成秘訣

Qwen3的誕生:訓練數據從何而來?

Qwen3,阿里巴巴最新的混合推理大型語言模型(LLM),再次改寫人工智能研究與應用的格局。其強大能力背後,是一個極為嚴謹的訓練流程——由龐大的預訓練數據、創新架構設計,以及多階段後訓練組成。本文將深入拆解Qwen3的訓練過程,逐步揭示從原始數據到精細調校、推理部署的每一環節。

數據規模再升級:從兆級到數十兆級的語料

Qwen3的基礎來自一個史無前例的龐大語料庫——超過36兆(trillion)個詞元(tokens),涵蓋119種語言及方言。這個規模接近Qwen2.5的兩倍(18兆tokens),讓Qwen3能夠吸收更豐富的語言模式、世界知識及各專業領域內容。

多元數據來源:網絡、PDF與合成內容

為了建立如此龐大的數據集,阿里巴巴結合網頁爬蟲及Qwen2.5-VL處理的PDF文檔,確保能高質量提取技術文本及學術資料。此外,還利用Qwen2.5-Math及Qwen2.5-Coder生成目標明確的合成數據,為語料庫加入數以百萬計的數學題解與程式碼片段,進一步強化STEM及編程能力。

Qwen3的預訓練流程如何設計?

階段一:打好基礎知識

在第一階段(S1),Qwen3以標準4K上下文長度的Transformer骨幹,訓練超過30兆tokens,建立基本語言理解及通用知識,相當於人類「學字母」的階段。

階段二:知識密集型能力強化

進入第二階段(S2),數據集重新平衡,著重於知識密集型內容——STEM文本、編程挑戰及推理任務。再額外加入5兆tokens,令模型能應對更複雜的學術及技術問題。

階段三:延展上下文長度

最後,長上下文預訓練階段利用高質量文檔,將Qwen3的原生上下文窗口延長至32K tokens,使其能處理如學術論文、長步驟指令等長篇輸入。

Qwen3性能背後的架構創新

Dense與專家混合(MoE)模型

Qwen3提供Dense(密集)及Mixture-of-Experts(MoE,專家混合)兩種架構。Dense模型參數由0.6B至32B,而MoE模型則只激活小部分專家(如128個專家中只用8個),大幅節省計算資源(最多可減少90%),但效能無損。

注意力與正規化升級

如每個head的QK正規化、重新設計的注意力偏置等創新,令模型在大規模下更穩定。這些改良讓Qwen3(如235B-A22B型號的94層深度)能高效收斂,隨著參數增加,效能穩步提升。

Qwen3的混合推理實現方式

思考模式 vs 非思考模式

Qwen3一大特色是「混合推理」:

– 「思考模式」:啟動chain-of-thought(CoT)推理,將問題分解為多步驟,逐步推理後才給答案。
– 「非思考模式」:不經明確中間推理,直接快速回應。用戶可透過`enable_thinking`參數或內嵌標籤(如/think、/no_think)自由切換,根據任務複雜度調整推理深度。

推理資源分配

Qwen3能針對推理步驟分配「計算預算」,難題可分配更多資源進行深度推理,簡單查詢則保持高效,實現推理質量與成本的精細調控。

Qwen3的後訓練流程包括什麼?

CoT冷啟動微調

第一個後訓練階段,Qwen3會在多元長CoT數據(如數學、邏輯、編程)上微調,為後續強化學習打好推理基礎。

推理強化學習(RL)

第二階段通過規則驅動的強化學習(RL),利用手工設計的獎勵函數,引導模型探索推理路徑,提升產生連貫中間步驟的能力。

思考模式融合與通用RL

第三階段將推理與指令微調數據融合(思考模式融合),結合深層推理與一般指令執行能力。最後,第四階段針對20多個通用任務(如格式遵從、代理功能)進行RL訓練,矯正不良行為並提升流暢度。

Qwen3與Qwen2.5有何不同?

Qwen3在多方面大幅升級:

| 功能 | Qwen2.5 | Qwen3 |
| — | — | — |
| 參數規模 | 最多72B(Dense) | 最多235B(MoE)+Dense選項 |
| 上下文窗口 | 16K tokens | 128K tokens(大部分型號) |
| 語言覆蓋 | 29種語言 | 119種語言及方言 |
| 推理整合 | 分離推理模型 | 思考/非思考模式統一 |
| 開源權重 | 是(Apache 2.0) | 是(Apache 2.0) |

這些升級令Qwen3更靈活、精確,並具備全球化應用潛力。

Qwen3如何實現實時部署?

Qwen3不只著重訓練,更強調低延遲推理及可擴展部署,支持生產級智能代理及Copilot。

硬件加速:Cerebras協助

Cerebras展示了Qwen3-32B在其晶圓級引擎上的實時推理能力,1.2秒內完成回應,比同類推理模型快60倍,得益於專為Qwen3優化的推理核心。

雲端部署與API就緒

阿里雲通過API套件提供Qwen3,配備自動擴容GPU集群及推理優化CPU節點。開發者可利用內建LoRA支持,針對需求微調並部署Qwen3,令大規模AI服務更經濟易用。

開發者如何善用Qwen3?

Qwen3已以Apache 2.0授權開源,歡迎全球科研及企業開發者採用、改造及擴展模型家族,應用於各種專業場景。

可選型號

– Dense模型(0.6B、3B、22B、32B):適合本地部署及邊緣場景,易於集成。
– MoE模型(總參數235B,活躍22B):針對高吞吐雲端服務,推理深度及多語言能力最強,資源利用最優。

API與本地部署的分別

– 阿里雲API:託管端點,自動擴容,適合快速原型及全球分發。
– 自行部署:提供Docker及Kubernetes方案,符合資料駐留及安全需求。
– CometAPI:統一REST介面,聚合多款AI模型,方便管理及調用。

社群與生態支持

– 開源庫:Qwen GitHub提供模型權重、訓練腳本及微調工具,鼓勵社群創新。
– 預製集成:支援主流機器學習框架及第三方平台,加快落地。
– 研究協作:阿里巴巴已將完整技術報告發佈於arXiv,確保架構及訓練方法透明。

評論與啟發:Qwen3的突破意味著什麼?

Qwen3的訓練與設計,體現了中國AI企業在全球大模型賽道上的野心與實力。它不僅追趕甚至挑戰OpenAI、Google等國際巨頭,還以開源姿態推動全球社群共同進步。值得注意的是,Qwen3不單是「大」——它在推理模式(思考/非思考)、多語言支持、低延遲部署等細節上都有突破,這些技術細節才是真正拉開差距的關鍵。

對香港及華語圈開發者而言,Qwen3的多語言及方言支持,將有力推動本地化AI應用的創新。而其開源策略,令中小企業、創業團隊也能平等享有頂級AI技術,降低數字鴻溝。

當然,Qwen3的成功也反映了大數據、大算力時代的「資本密集型」特徵。未來AI模型的競爭,除了算法創新,更考驗數據治理、能效優化及生態協同。Qwen3的混合推理、MoE結構、硬件協同部署等新嘗試,值得本地AI業界深思——我們如何在有限資源下,做出有特色、可落地的AI產品?這或許是Qwen3帶給香港科技圈最重要的啟示。

Qwen3示意圖

入門指引

CometAPI提供統一REST介面,聚合數百款AI模型,方便開發者在同一端點管理API金鑰、用量配額及帳單。開發者可於CometAPI Playground探索Qwen3能力,並參考API指南進行接入。使用前請先註冊並取得API金鑰。

總結

Qwen3以多階段大規模預訓練、架構創新及精細後訓練流程,創下混合推理新標桿。其靈活推理模式、高效MoE變體及豐富部署生態,令其成為開源AI前沿力量。對香港及全球開發者而言,這不僅是技術紅利,更是參與未來AI生態的入場券——我們準備好迎接這場智能革命了嗎?

🎨 Nano Banana Pro 圖像生成器|打幾句說話就出圖

想畫人像、產品圖、插畫?SSFuture 圖像生成器支援 Flux Gemini Nano Banana Pro 改圖 / 合成, 打廣東話都得,仲可以沿用上一張圖繼續微調。

🆓 Flux 模型即玩,不用登入
🤖 登入後解鎖 Gemini 改圖
📷 支援上載參考圖再生成
⚡ 每天免費額度任你玩
✨ 即刻玩 AI 畫圖
一隻在香港茶餐廳喝奶茶的貓 Create a photorealistic and highly detailed image featuring the attached image walking confidently down a modern city street, accompanied by Jason Statham, Dwayne “The Rock” Johnson, and Jason Momoa acting as bodyguards.
John Wick (Keanu Reeves) is walking just beside or slightly behind the subject, holding an umbrella over him to shield from light rain.
The subject should be the central figure, wearing stylish casual clothing — like a fitted jacket, dark jeans, and sunglasses — exuding calm authority and cool charisma.
Statham, The Rock, and Momoa are dressed in black tactical-style suits, maintaining alert, protective stances, scanning the surroundings like professional bodyguards. John Wick wears his signature black suit and tie, looking composed as he holds the umbrella.
The setting is a downtown urban street with wet pavement reflecting city lights, parked luxury cars, and paparazzi in the background snapping photos.
The photo should look like a real paparazzi shot — slightly off-angle, mid-step motion blur, with realistic lighting and reflections.
Lighting: natural daylight with overcast skies, reflections from wet concrete, realistic shadows, subtle raindrops on the umbrella and clothing.
Camera realism: crisp detail on facial features and clothing textures, shallow depth of field emphasizing the group, with lens flare or light bloom for authenticity.
Mood & tone: grounded, cinematic, and stylish — feels like a moment from a celebrity entourage photo or action-movie press capture, taken with an iPhone by paparazzi.
Style: ultra-realistic, documentary-style street photography with modern cinematic sharpness. {
"intro": "Create an ultra realistic 8K UHD DSLR photo based on the attached image as a reference of facial features, maintaining 100% likeness.",

"subject": {
"identity": "A stylish beautiful woman portrayed as Cleopatra, the eternal Queen of Egypt, radiating supreme authority, elegance, and divine power.",
"angle": "Full-body editorial portrait captured at a cinematic 3/4 angle, both Cleopatra and the horse positioned diagonally, rendered in ultra-crisp clarity with no blur.",
"pose": {
"body_position": "She is riding a majestic white horse, seated confidently with impeccable royal posture, her torso slightly turned to a 3/4 angle to emphasize grace and dominance.",
"hands": "One hand gently holds the gold-accented reins, while the other rests elegantly near her waist, displaying ornate jewelry.",
"expression": "She looks directly at the camera with a calm, commanding, and seductive gaze—the unmistakable presence of a queen born to rule."
}
},

"appearance": {
"outfit": "An opulent, ultra-bongga Cleopatra couture gown in pristine white and radiant gold. The gown features a sculpted corset bodice richly embroidered with gold hieroglyphic motifs, sun-disk patterns, and crystal beadwork. Flowing white silk, chiffon, and sheer organza panels cascade dramatically from the waist and shoulders, creating powerful movement as she rides. A daring thigh-high slit reveals her leg, balancing sensuality with imperial elegance. Every seam is traced with gold-thread embroidery for a luminous, goddess-like silhouette.",
"accessories": "She wears the exact Cleopatra headpiece from the attached reference image: a regal black-and-gold striped nemes-style headdress with a polished gold cobra (uraeus) centerpiece at the forehead, structured side panels, and intricate gold detailing. Paired with a wide Egyptian collar necklace with turquoise accents, engraved gold arm cuffs, crystal finger rings, an ornate gold waist belt, delicate anklets, and elegant flat Egyptian sandals.",
"hair": "Her hair is fully concealed beneath the nemes headpiece as shown in the reference image, ensuring perfect historical accuracy and symmetry.",
"makeup": "Ultra-bold, highly colorful Egyptian eye makeup with ceremonial ink artistry. Her eyes feature sharp elongated black kohl eyeliner extended dramatically past the outer corners, layered with vivid turquoise, teal, emerald green, sapphire blue, violet, and metallic gold pigments blended in high-fashion gradients. Beneath each eye, intricate hand-drawn ink designs inspired by ancient Egyptian symbolism are visible—fine black and gold lines, dots, and sacred motifs echoing hieroglyphs and protective markings, following the natural curve of the lower eye and cheekbone. Her complexion is flawless and softly bronzed with luminous highlights, brows are sculpted and powerful, cheeks carry a subtle coral-rose flush, and lips are finished in a refined nude-rose satin tone to balance the intense, artistic eye look."
},

"props": {
"animal": "A powerful, majestic white horse with a flowing ivory mane, sculpted muscles, and intelligent dark eyes. The horse wears elegant white-and-gold tack engraved with Egyptian motifs, symbolizing royal conquest, divine favor, and sovereignty."
},

"background": {
"macro_environment": "A vast open desert landscape near the Egyptian palace and pyramids at golden hour, stretching endlessly beneath a dramatic sky glowing with gold, amber, and soft ivory tones.",
"midground_details": "Distant pyramids rising from the sand, monumental stone statues, ceremonial banners moving in the wind, and faint silhouettes of royal guards and attendants placed far behind for scale.",
"micro_elements": "Fine desert sand lifted by the horse’s movement, sharply defined gold engravings on the reins, visible embroidery threads on the gown, subtle translucency of sheer fabrics, radiant light reflections on gold surfaces, and crisp, realistic shadows—everything rendered with extreme clarity and zero blur."
},

"lighting": {
"type": "Cinematic natural golden-hour lighting with soft reflective highlights.",
"effect": "Warm sunlight intensifies the white-and-gold palette, ignites the vibrant eye makeup colors and ink details, and illuminates the horse’s coat, creating a radiant, divine, editorial glow."
},

"camera": {
"camera_type": "DSLR",
"resolution": "8K UHD",
"lens": "50mm prime lens",
"aperture": "f/8 for maximum sharpness across subject and background",
"iso": 100,
"shutter_speed": "1/250s to freeze motion while maintaining realism",
"focus": "Extreme sharp focus from foreground to background, no bokeh, no blur"
},

"style": "High-fashion editorial, cinematic realism, divine Egyptian royalty, white-and-gold couture contrasted with vibrant ceremonial eye art, powerful, sensual, ultra-detailed, sharp, majestic"
}