DeepSeek's new AI model: can domain expertise win without relying on compute?

DeepSeek's new model suggests AI expertise may matter more than compute

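The AI community is excited about DeepSeek's new R1 model and is racing to digest what it means. The flagship model, developed by Chinese AI startup DeepSeek, performs on par with OpenAI's o1 series on key reasoning benchmarks, while its distilled 7B model outperforms larger open-source models on some of them.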

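Beyond the immediate excitement about democratization and performance, however, DeepSeek hints at something more profound: a new path by which domain experts can build powerful specialized models with limited resources. The breakthrough has three main implications for our industry. First, application developers gain powerful new open-source models to build on. Second, the major labs will likely use these efficiency innovations to push larger models even further.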

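Third, and most intriguing, DeepSeek's approach suggests that deep domain expertise may matter more than raw compute when building the next generation of AI models and intelligent applications.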

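Beyond raw compute: the rise of smart training

What makes DeepSeek R1 particularly interesting is how it achieves strong reasoning capabilities. Rather than relying on expensive human-labeled datasets or massive compute, the team focused on two key innovations: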

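First, they generated training data that could be verified automatically, focusing on domains such as mathematics where correctness is unambiguous. Second, they developed efficient reward functions that could identify which new training examples would actually improve the model, avoiding wasted compute on redundant data.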
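
To make the idea of automatically verifiable training data concrete, here is a minimal Python sketch of a rule-based correctness reward for math answers. It is an illustration, not DeepSeek's code: the function names, the `\boxed{...}` answer convention, and the 0/1 reward scale are all assumptions made for this example.

```python
import re
from fractions import Fraction
from typing import Optional


def extract_final_answer(completion: str) -> Optional[str]:
    """Return the text inside the last \\boxed{...} in a completion, if any.

    Assumption: the model is asked to put its final answer in \\boxed{...}.
    """
    matches = re.findall(r"\\boxed\{([^{}]*)\}", completion)
    return matches[-1].strip() if matches else None


def correctness_reward(completion: str, reference: str) -> float:
    """Hypothetical reward: 1.0 if the final answer matches the reference, else 0.0."""
    answer = extract_final_answer(completion)
    if answer is None:
        return 0.0  # no parseable final answer, so no reward
    try:
        # Compare numerically so that "0.75" and "3/4" count as the same answer.
        return 1.0 if Fraction(answer) == Fraction(reference) else 0.0
    except (ValueError, ZeroDivisionError):
        # Non-numeric answers fall back to an exact string comparison.
        return 1.0 if answer == reference else 0.0
```

Because the check is a pure function of the model output and a known reference answer, it can score any number of generated solutions without a human labeler in the loop.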

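The results speak for themselves: on the AIME 2024 mathematics benchmark, DeepSeek R1-Zero reached 71.0% accuracy, not far behind o1-0912's 74.4%. Their distilled 7B model even reached 55.5%, beating QwQ-32B-Preview's 50.0%. Even their 1.5B-parameter model reached an impressive 28.9% on AIME and 83.9% on MATH-500, showing how focused training can deliver strong results in a specific domain with limited compute.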

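A gift to application developers

The immediate impact of DeepSeek's work is clear: the six smaller models they released as open source, ranging from 1.5B to 70B parameters, give application developers powerful new options for building on capable reasoning models. Their distilled 14B model in particular outperforms larger open-source alternatives on key benchmarks, making it an attractive foundation for developers focused on applications.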

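Accelerating the leaders

For the major AI labs, DeepSeek's training-efficiency innovations won't slow the race toward larger models; they will accelerate it. These techniques will likely be combined with massive compute to push the boundaries of general-purpose models even further. The compute race at the top will continue, just with better fuel.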

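A new path for domain experts

But the most interesting implication may be for teams with deep domain expertise. The prevailing industry narrative holds that startups should focus on building applications on top of existing models rather than creating their own. DeepSeek shows another way: apply deep domain expertise to create highly optimized, specialized models at lower cost.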

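Notably, DeepSeek came out of High-Flyer, a hedge fund, where the reward function is unambiguous: financial returns. It is reasonable to imagine that they already apply these techniques to financial modeling, using automatic verification of predictions against market data to drive efficient training.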

This pattern could extend to any domain with clear success metrics. Consider teams with deep expertise in:

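– Code generation, using application performance, commit history, and validation/tests as feedback
– Financial modeling, using market data for validation
– Medical diagnostics, using clinical outcomes as ground truth
– Legal analysis, using case outcomes for validation
– Industrial operations, using real-world performance data to create feedback loops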

With DeepSeek's techniques, these teams could:

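– Generate synthetic training data that can be verified automatically against domain rules
– Create reward functions that efficiently identify high-value training examples
– Focus compute on the specific capabilities that matter most in their domain
– Vertically integrate specialized models with domain-specific applications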
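
The first two steps in this list could be sketched roughly as follows. This is a hypothetical illustration, not DeepSeek's published method: problems are generated from a rule (here, multiplication) so every answer is known in advance, and an example counts as high-value only if a model solves it in some attempts but not all. The `model` callable, the pass-rate filter, and all function names are assumptions made for the example.

```python
import random
from typing import Callable, List, Tuple

Problem = Tuple[str, int]  # (prompt, answer known by construction)


def make_problems(n: int, seed: int = 0) -> List[Problem]:
    """Generate synthetic problems whose answers follow from a domain rule."""
    rng = random.Random(seed)
    problems = []
    for _ in range(n):
        a, b = rng.randint(10, 99), rng.randint(10, 99)
        problems.append((f"What is {a} * {b}?", a * b))
    return problems


def pass_rate(model: Callable[[str], int], problem: Problem, samples: int = 8) -> float:
    """Fraction of sampled model answers that match the known answer."""
    prompt, answer = problem
    hits = sum(1 for _ in range(samples) if model(prompt) == answer)
    return hits / samples


def select_informative(
    model: Callable[[str], int], problems: List[Problem], samples: int = 8
) -> List[Problem]:
    """Keep problems the model gets right sometimes, but not always.

    Assumption: always-solved and never-solved problems carry little
    training signal, so compute is spent on the examples in between.
    """
    kept = []
    for problem in problems:
        rate = pass_rate(model, problem, samples)
        if 0.0 < rate < 1.0:
            kept.append(problem)
    return kept
```

In a real domain, the multiplication rule would be replaced by whatever check the team's expertise supplies, such as a test suite for generated code or a simulator for an industrial process.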

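The power of this approach is evident in DeepSeek's distillation results. Their 32B-parameter model reached 72.6% accuracy on AIME 2024 and 94.3% on MATH-500, significantly outperforming previous open-source models. This suggests that focused training can overcome the limits of raw parameter count.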

The future of model development

Looking ahead, we are likely to see model development split into three directions:

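1. Application developers building on increasingly capable open-source foundations
2. Major labs using efficiency techniques to advance general-purpose models
3. Domain experts creating highly optimized, specialized models on limited compute budgets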

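The third direction, domain experts building their own models, is the most compelling. It suggests that the most interesting AI developments may come to depend not on who has the most compute, but on who can most effectively combine domain expertise with clever training techniques.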

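We are entering an era in which smart training may matter more than raw compute, at least for those who wisely focus on the right problems. DeepSeek has shown one path forward; others will follow, adding their own domain-specific variations to these basic innovations.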

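All of this suggests that future AI development will depend not only on technical advances but also on deep domain knowledge, which opens up both opportunities and challenges for startups and specialist teams.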

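The above article was translated and written using a discounted GPT API key. The images were generated automatically by FLUX based on the content.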
