The highlights of Meta's open-source large language model Llama 4 are as follows:
- Innovative Architecture
- It introduces the Mixture-of-Experts (MoE) architecture to the Llama family for the first time. In an MoE model, only a fraction of the total parameters is activated for any single token. Meta alternates dense layers with MoE layers; each MoE layer contains 128 routed experts plus one shared expert, and every token is sent to the shared expert and to exactly one of the 128 routed experts (a minimal routing sketch follows below). This architecture improves computational efficiency during training and inference, delivers higher quality than dense models under a fixed training-FLOPs budget, and lowers model-serving cost and latency.
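A minimal sketch of the routing pattern described above, assuming a top-1 router over the routed experts plus an always-on shared expert; the layer shapes and module names are illustrative, not Llama 4's actual implementation:

```python
import torch
import torch.nn as nn

class MoELayer(nn.Module):
    """Toy MoE layer: every token goes through a shared expert plus one routed expert."""
    def __init__(self, d_model: int, d_ff: int, n_experts: int = 128):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)   # produces routing scores per token
        make_expert = lambda: nn.Sequential(
            nn.Linear(d_model, d_ff), nn.SiLU(), nn.Linear(d_ff, d_model))
        self.shared_expert = make_expert()
        self.experts = nn.ModuleList(make_expert() for _ in range(n_experts))

    def forward(self, x: torch.Tensor) -> torch.Tensor:              # x: [tokens, d_model]
        weights, top1 = self.router(x).softmax(dim=-1).max(dim=-1)   # top-1 routing per token
        out = self.shared_expert(x)                   # shared expert sees every token
        routed = torch.zeros_like(x)
        for e in top1.unique().tolist():              # only the chosen experts actually run
            mask = top1 == e
            routed[mask] = weights[mask].unsqueeze(-1) * self.experts[e](x[mask])
        return out + routed

# Usage: 10 tokens through a small layer with 8 routed experts
layer = MoELayer(d_model=64, d_ff=256, n_experts=8)
print(layer(torch.randn(10, 64)).shape)   # torch.Size([10, 64])
```

Because only one routed expert (plus the shared expert) runs per token, the active parameter count per token stays far below the total parameter count, which is what makes the large-total, small-active configurations below economical to serve.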
- Powerful Multimodality
- It is a natively multimodal model that uses an early-fusion technique to seamlessly integrate text and visual tokens into a unified model backbone, which allows the model to be jointly pre-trained on large amounts of unlabeled text, image, and video data and to process and reason over these modalities together (a fusion sketch follows below). The vision encoder is an improved version of MetaCLIP, trained in conjunction with a frozen Llama model so that its output is better adapted to the LLM.
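A minimal sketch of what early fusion means in practice, under common assumptions: a vision encoder produces patch embeddings, a projection maps them into the text embedding space, and the concatenated sequence is processed by one shared transformer backbone. All module names and sizes here are illustrative.

```python
import torch
import torch.nn as nn

class EarlyFusionBackbone(nn.Module):
    """Toy early-fusion model: image patches and text tokens share one transformer."""
    def __init__(self, vocab_size: int, d_model: int, patch_dim: int):
        super().__init__()
        self.text_embed = nn.Embedding(vocab_size, d_model)
        self.vision_proj = nn.Linear(patch_dim, d_model)     # projects vision-encoder outputs
        layer = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, num_layers=4)

    def forward(self, patch_feats: torch.Tensor, text_ids: torch.Tensor) -> torch.Tensor:
        # patch_feats: [batch, n_patches, patch_dim], text_ids: [batch, seq_len]
        vis_tokens = self.vision_proj(patch_feats)           # visual tokens in model space
        txt_tokens = self.text_embed(text_ids)               # text tokens in model space
        fused = torch.cat([vis_tokens, txt_tokens], dim=1)   # one unified token sequence
        return self.backbone(fused)                          # single backbone sees both modalities

# Usage: a batch of 2 examples, each with 16 image patches and 8 text tokens
model = EarlyFusionBackbone(vocab_size=32000, d_model=256, patch_dim=512)
print(model(torch.randn(2, 16, 512), torch.randint(0, 32000, (2, 8))).shape)  # [2, 24, 256]
```

The key property is that image and text tokens are mixed before the backbone rather than bolted on through a late adapter, so the whole network is trained jointly on both modalities.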
- Diverse Versions with Distinct Advantages
- Llama 4 Scout: Billed as the best multimodal model in the world in its class. It has 16 experts, 17 billion active parameters, and 109 billion total parameters, and can run on a single NVIDIA H100 GPU with Int4 quantization. It supports a context window of up to 10 million tokens, equivalent to several million words of text. In many benchmarks its performance exceeds that of well-known models such as Gemma 3, Gemini 2.0 Flash-Lite, and Mistral 3.1, and it is particularly strong at image grounding. It is well suited to scenarios such as document summarization and reasoning over large codebases.
- Llama 4 Maverick: It has 128 experts, 17 billion active parameters, and 400 billion total parameters, and fits on a single H100 host. It beats GPT-4o and Gemini 2.0 Flash across a wide range of benchmarks and matches the new DeepSeek-V3 on reasoning and coding while using only about half as many active parameters. Its experimental chat version scores an ELO of 1417 on LMArena, and its inference cost is in the range of $0.19-$0.49 per million input and output tokens, close to or even lower than DeepSeek V3.1 ($0.48). It is well suited to general-purpose assistant and chat applications.
- Llama 4 Behemoth: It has 16 experts, 288 billion active parameters, and nearly 2 trillion total parameters. Its performance on several STEM benchmarks exceeds that of GPT-4.5, Claude Sonnet 3.7, and Gemini 2.0 Pro. It serves as a teacher model for codistilling smaller models such as Maverick (a generic distillation sketch follows this list). It is pre-trained on 30T multimodal tokens across 32,000 GPUs using FP8 precision. Once officially released, it is expected to provide strong support for complex task processing and frontier research.
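Behemoth's teacher role can be pictured with a standard soft-label knowledge-distillation objective. The sketch below is a generic formulation (cross-entropy on hard labels blended with a temperature-scaled KL term against the teacher's logits), not Meta's published recipe:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits: torch.Tensor,
                      teacher_logits: torch.Tensor,
                      labels: torch.Tensor,
                      temperature: float = 2.0,
                      alpha: float = 0.5) -> torch.Tensor:
    """Blend hard-label cross-entropy with KL divergence against the teacher's soft targets."""
    hard = F.cross_entropy(student_logits, labels)
    soft = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2                     # conventional T^2 scaling of the soft term
    return alpha * hard + (1 - alpha) * soft

# Usage with dummy logits over a 32-entry vocabulary for 4 token positions
student = torch.randn(4, 32, requires_grad=True)
teacher = torch.randn(4, 32)
labels = torch.randint(0, 32, (4,))
distillation_loss(student, teacher, labels).backward()
```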
- High Cost-Effectiveness
- The efficiency of Llama 4 significantly reduces inference cost. For example, the experimental chat version of Llama 4 Maverick costs roughly $0.19-$0.49 per million input and output tokens, close to or even lower than DeepSeek V3.1 ($0.48), while Llama 4 Scout and Maverick can process a million input tokens for about 15 cents and 24 cents respectively, significantly less than models such as GPT-4 and Claude 3.7 Sonnet (a small cost calculator follows below).
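A back-of-the-envelope helper for turning these per-million-token prices into request costs; the prices are the figures quoted above, and the request size is an illustrative assumption:

```python
# USD per 1M input tokens, using the rates quoted in this section
PRICE_PER_M_INPUT = {
    "llama-4-scout": 0.15,
    "llama-4-maverick": 0.24,
}

def input_cost(model: str, input_tokens: int) -> float:
    """Cost in USD of the input tokens for one request at the quoted rate."""
    return PRICE_PER_M_INPUT[model] * input_tokens / 1_000_000

# Example: feeding a 200,000-token document to Scout for summarization
print(f"${input_cost('llama-4-scout', 200_000):.4f}")   # -> $0.0300
```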
- Broad Ecosystem Integration
- Llama 4 can be downloaded from llama.com and Hugging Face, and will soon be available on mainstream cloud and data platforms, edge silicon, and through global system integrators. It has also been integrated into WhatsApp, Messenger, Instagram Direct, and the Meta.AI website, where users can try Meta AI built on Llama 4. This makes it easy for developers and researchers to experiment with and integrate the models (a minimal loading sketch follows below), and strongly promotes innovative applications of AI across many fields.
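As a minimal getting-started sketch from Hugging Face, assuming the `transformers` library and a gated checkpoint whose repository ID is taken here to be `meta-llama/Llama-4-Scout-17B-16E-Instruct`; verify the exact ID, accept the license on the model page, and check the model card for the recommended loading code, since pipeline dispatch for this multimodal architecture may depend on your transformers version:

```python
import torch
from transformers import pipeline

# Repository ID is an assumption for illustration; confirm it on the Hugging Face Hub.
chat = pipeline(
    "text-generation",
    model="meta-llama/Llama-4-Scout-17B-16E-Instruct",
    torch_dtype=torch.bfloat16,
    device_map="auto",            # shard the weights across available GPUs
)

messages = [{"role": "user", "content": "Summarize the Llama 4 model family in two sentences."}]
result = chat(messages, max_new_tokens=128)
print(result[0]["generated_text"][-1]["content"])   # the assistant's reply
```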