Stanford's New AI Framework: BIOMEDICA Advances Biomedical Vision-Language Models

Stanford Researchers Introduce BIOMEDICA: A Scalable AI Framework for Advancing Biomedical Vision-Language Models

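Developing vision-language models (VLMs) for biomedicine faces many challenges, chiefly the lack of large-scale, annotated, publicly accessible multimodal datasets. Many datasets have been built from biomedical literature such as PubMed, but they tend to concentrate on specific domains such as radiology and pathology while overlooking complementary areas, such as molecular biology and pharmacogenomics, that are essential to comprehensive clinical understanding. Privacy concerns, the complexity of expert-level annotation, and logistical constraints further hinder the creation of comprehensive datasets. Earlier efforts such as ROCO, MEDICAT, and PMC-15M relied on domain-specific filtering and supervised models to extract millions of image-caption pairs, but these strategies often fail to capture the broader diversity of biomedical knowledge needed to advance general-purpose biomedical VLMs.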

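Beyond dataset limitations, training and evaluating biomedical VLMs poses its own challenges. Contrastive learning approaches such as PMC-CLIP and BiomedCLIP have shown promise by using literature-derived datasets and vision transformer models for image-text alignment. Compared with general-purpose VLMs, however, their performance is limited by smaller datasets and constrained compute. Current evaluation protocols also focus mainly on radiology and pathology tasks and lack standardization and broader applicability. Reliance on additional learnable parameters and narrow datasets undermines the reliability of these evaluations, underscoring the need for scalable datasets and robust evaluation frameworks that meet the diverse needs of biomedical vision-language applications.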
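
To make "contrastive image-text alignment" concrete, here is a minimal sketch of the symmetric cross-entropy objective commonly used for CLIP-style training, written in plain Python. It is an illustration only: the function names, the toy embeddings, and the temperature of 0.07 are assumptions, and nothing here is taken from the PMC-CLIP or BiomedCLIP code.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]

def cross_entropy(logits, target):
    # Numerically stable: log-sum-exp of the row minus the target logit.
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[target]

def clip_loss(image_embs, text_embs, temperature=0.07):
    """Symmetric contrastive loss; pair k is (image_embs[k], text_embs[k])."""
    imgs = [normalize(v) for v in image_embs]
    txts = [normalize(v) for v in text_embs]
    n = len(imgs)
    # logits[i][j] = scaled cosine similarity of image i and text j.
    logits = [[dot(i, t) / temperature for t in txts] for i in imgs]
    # Image-to-text: each row should pick out its own caption.
    loss_i2t = sum(cross_entropy(logits[k], k) for k in range(n)) / n
    # Text-to-image: each column should pick out its own image.
    loss_t2i = sum(
        cross_entropy([logits[r][k] for r in range(n)], k) for k in range(n)
    ) / n
    return (loss_i2t + loss_t2i) / 2

# Hypothetical 2-D embeddings: matched pairs give a much lower loss.
aligned = clip_loss([[1, 0], [0, 1]], [[1, 0], [0, 1]])
shuffled = clip_loss([[1, 0], [0, 1]], [[0, 1], [1, 0]])
```

The loss falls as each image embedding moves toward its own caption and away from the other captions in the batch, which is why larger and more varied batches of image-caption pairs tend to help this kind of training.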

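Researchers at Stanford University have introduced BIOMEDICA, an open-source framework for extracting, annotating, and organizing the entire PubMed Central Open Access subset into a user-friendly dataset. The archive contains more than 24 million image-text pairs from 6 million articles, along with metadata and expert annotations. They also released BMCA-CLIP, a suite of CLIP-style models pretrained on BIOMEDICA via streaming, which avoids storing 27 TB of data locally. Across 40 tasks spanning radiology, dermatology, molecular biology, and other areas, the models achieve state-of-the-art performance, with an average improvement of 6.56% in zero-shot classification and reduced compute requirements.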
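
The idea behind streaming can be sketched with the standard library alone. The example assumes tar shards in the WebDataset style that the article names as the serialization format, where files sharing a key (for example `fig0001.jpg` and `fig0001.txt`) form one sample. The shard contents, keys, and helper names below are hypothetical, and the key parsing is simplified to "everything before the first dot". This is not the BIOMEDICA loader.

```python
import io
import tarfile

def make_shard(samples):
    """Build an in-memory tar shard holding <key>.jpg and <key>.txt per sample."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for key, (image_bytes, caption) in samples.items():
            for ext, payload in (("jpg", image_bytes),
                                 ("txt", caption.encode("utf-8"))):
                info = tarfile.TarInfo(name=f"{key}.{ext}")
                info.size = len(payload)
                tar.addfile(info, io.BytesIO(payload))
    buf.seek(0)
    return buf

def stream_samples(fileobj):
    """Yield one dict per sample while reading the tar strictly in order."""
    current_key, sample = None, {}
    # Mode "r|" reads the archive as a forward-only stream (no seeking),
    # so the same loop would work on a network response body.
    with tarfile.open(fileobj=fileobj, mode="r|") as tar:
        for member in tar:
            if not member.isfile():
                continue
            key, _, ext = member.name.partition(".")
            if key != current_key and sample:
                yield sample
                sample = {}
            current_key = key
            sample["__key__"] = key
            sample[ext] = tar.extractfile(member).read()
    if sample:
        yield sample

# Hypothetical two-sample shard with placeholder image bytes.
shard = make_shard({
    "fig0001": (b"\xff\xd8fake", "Chest X-ray"),
    "fig0002": (b"\xff\xd8fake2", "Western blot"),
})
samples = list(stream_samples(shard))
```

Because each sample is yielded as soon as its files have been read, memory use depends on the sample size rather than the shard size, which is the property that makes training possible without a local copy of the full dataset.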

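The BIOMEDICA curation pipeline consists of dataset extraction, concept labeling, and serialization. Articles and media files are downloaded from NCBI servers, and metadata, captions, and figure references are extracted using the nXML files and the Entrez API. Images are clustered using DINOv2 embeddings and labeled with an expert-refined hierarchical taxonomy. Labels are assigned by majority vote and propagated across clusters. The dataset contains more than 24 million image-caption pairs with extensive metadata and is serialized in WebDataset format for easy streaming. With 12 global and 170 local image concepts, the taxonomy covers categories such as clinical imaging, microscopy, and data visualization, with an emphasis on scalability and accessibility.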
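
The labeling step can be illustrated with a small sketch. It reads "majority vote and propagation" as follows: annotators vote on a label for each cluster, the winning label becomes the cluster label, and every image in that cluster inherits it. That reading, the tie rule (no label on a tie), and all names and data below are assumptions for illustration, not the project's actual procedure.

```python
from collections import Counter

def majority_label(votes):
    """Return the most common label, or None if empty or tied for first place."""
    counts = Counter(votes).most_common()
    if not counts:
        return None
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None  # assumed tie rule: leave the cluster unlabeled
    return counts[0][0]

def propagate_labels(cluster_votes, image_clusters):
    """Copy each cluster's majority label to every image assigned to it."""
    cluster_labels = {c: majority_label(v) for c, v in cluster_votes.items()}
    return {img: cluster_labels.get(c) for img, c in image_clusters.items()}

# Hypothetical votes per cluster id and image-to-cluster assignments.
cluster_votes = {
    0: ["microscopy", "microscopy", "clinical imaging"],
    1: ["data visualization", "clinical imaging"],
}
image_clusters = {"a.jpg": 0, "b.jpg": 0, "c.jpg": 1, "d.jpg": 2}
labels = propagate_labels(cluster_votes, image_clusters)
```

Labeling clusters rather than individual images is what keeps expert effort manageable at this scale: a few hundred cluster-level decisions can cover millions of figures.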

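Continued pretraining on the BIOMEDICA dataset was evaluated on 39 established biomedical classification tasks and a new retrieval dataset from Flickr, for 40 datasets in total. The classification benchmarks include pathology, radiology, biology, surgery, dermatology, and ophthalmology tasks. The metrics were average accuracy for classification and retrieval recall at 1, 10, and 100. Concept filtering, which excludes over-represented topics, outperformed both concept balancing and pretraining on the full dataset. Models trained on BIOMEDICA achieved state-of-the-art results, substantially surpassing previous methods and improving performance on classification, retrieval, and microscopy tasks while using less data and compute.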
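
For readers unfamiliar with the retrieval metric, recall at k can be computed as in the sketch below. It assumes each query has exactly one correct match, as is typical when every image has a single paired caption. The function names and toy rankings are hypothetical.

```python
def recall_at_k(ranked_ids, relevant_id, k):
    """1.0 if the correct item appears in the top k results, else 0.0."""
    return 1.0 if relevant_id in ranked_ids[:k] else 0.0

def mean_recall_at_k(rankings, targets, k):
    """Average recall at k over all queries."""
    hits = [recall_at_k(r, t, k) for r, t in zip(rankings, targets)]
    return sum(hits) / len(hits)

# Hypothetical results: each inner list is a caption ranking for one image query.
rankings = [["c1", "c2", "c3"], ["c3", "c1", "c2"]]
targets = ["c1", "c2"]
r1 = mean_recall_at_k(rankings, targets, 1)
r3 = mean_recall_at_k(rankings, targets, 3)
```

Reporting recall at several cutoffs, such as 1, 10, and 100, separates models that rank the right match first from those that only place it somewhere near the top.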

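In summary, BIOMEDICA is a comprehensive framework that turns the PubMed Central Open Access (PMC-OA) subset into the largest deep-learning-ready dataset, with 24 million image-caption pairs and 27 metadata fields. The framework is designed to address the shortage of diverse, annotated biomedical datasets, offering a scalable open-source solution that extracts and annotates multimodal data from more than 6 million articles. By continually pretraining CLIP-style models on BIOMEDICA, the framework achieves state-of-the-art zero-shot classification and image-text retrieval across 40 biomedical tasks while using 10x less compute and 2.5x less data. All resources, including models, datasets, and code, are publicly available.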
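
Zero-shot classification with a CLIP-style model amounts to comparing an image embedding with the text embeddings of candidate class descriptions and choosing the closest one. The sketch below shows only that final comparison step. The embeddings are hand-written toy vectors standing in for encoder outputs, and the function names and class labels are hypothetical.

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def zero_shot_classify(image_emb, class_text_embs):
    """Return the label whose text embedding is most similar to the image."""
    scores = {label: cosine(image_emb, emb)
              for label, emb in class_text_embs.items()}
    return max(scores, key=scores.get), scores

# Toy vectors; a real pipeline would obtain these from the image and text encoders.
image_emb = [0.9, 0.1]
class_text_embs = {"radiograph": [1.0, 0.0], "histology slide": [0.0, 1.0]}
predicted, scores = zero_shot_classify(image_emb, class_text_embs)
```

No classifier head is trained for the target task, which is why this style of evaluation avoids the extra learnable parameters the article criticizes in earlier protocols.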

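The significance of this work is that BIOMEDICA not only fills a gap in biomedical datasets but also gives future researchers a powerful tool for multimodal data analysis. That could advance academic research and may also benefit clinical practice by supporting more comprehensive patient care and diagnosis. As the demand for data in biomedicine grows, open and scalable resources like this are likely to become essential and may change how the field operates.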

This article was translated and written using a discounted GPT API key. The image was generated automatically by FLUX based on the content.
