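Google Announces Video and Image Generation Updates: Veo 2, Imagen 3, and Whisk

Google has introduced Veo 2, Imagen 3, and Whisk. Veo 2 supports cinematic 4K video generation with precise control over lens parameters and film texture; Imagen 3 markedly improves image detail and realism; Whisk gives creative workers a new tool. Together, the three strengthen the controllability and expressiveness of AI in professional visual creation.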

Published December 17, 2024, 04:02
Editor: 零重力瓦力

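Google has announced a series of major updates: its latest video generation model, Veo 2; the image generation model Imagen 3; and a new creative tool, Whisk. Together they open up new possibilities for video creation, image generation, and creative expression.

Veo 2: More Cinematic Video Generation

Veo 2 is Google's next-generation video generation model. It produces high-quality video across a wide range of subjects and styles, with a better understanding of real-world physics and human movement and the ability to capture cinematic detail and texture. For example, you can ask it for a low-angle tracking shot that glides through a scene; a close-up of a scientist concentrating at a microscope in a lab; or a rural beehive scene shot with a 35mm lens, with sunlight falling on the beekeeper and a jar of honey. The footage reaches resolutions of up to 4K, has an artistic, cinematic feel, and can hold a consistent style across videos several minutes long. Veo 2 also lets users specify lens parameters (such as "18mm lens") or visual effects (such as "shallow depth of field") in the prompt to produce professional-looking shots.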

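In tests against other leading AI video models, Veo 2's output quality compared favorably. It also reduces common AI generation artifacts, such as extra objects or implausible details, so the results look more realistic and natural.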

Veo 2 prompt: Cinematic shot of a female doctor in a dark yellow hazmat suit, illuminated by the harsh fluorescent light of a laboratory. The camera slowly zooms in on her face, panning gently to emphasize the worry and anxiety etched across her brow. She is hunched over a lab table, peering intently into a microscope, her gloved hands carefully adjusting the focus. The muted color palette of the scene, dominated by the sickly yellow of the suit and the sterile steel of the lab, underscores the gravity of the situation and the weight of the unknown she is facing. The shallow depth of field focuses on the fear in her eyes, reflecting the immense pressure and responsibility she bears.
Veo 2 prompt: This medium shot, with a shallow depth of field, portrays an adorable cartoon girl with wavy brown hair and lots of character, sitting upright in a 1980s kitchen. Her hair is medium length and wavy. She has a small, slightly upturned nose, and small, rounded ears. She is very animated and excited as she talks to the camera, laughing and giggling with a huge grin.
Veo 2 prompt: The camera floats gently through rows of pastel-painted wooden beehives, buzzing honeybees gliding in and out of frame. The motion settles on the refined farmer standing at the center, his pristine white beekeeping suit gleaming in the golden afternoon light. He lifts a jar of honey, tilting it slightly to catch the light. Behind him, tall sunflowers sway rhythmically in the breeze, their petals glowing in the warm sunlight. The camera tilts upward to reveal a retro farmhouse with mint-green shutters, its walls dappled with shadows from swaying trees. Shot with a 35mm lens on Kodak Portra 400 film, the golden light creates rich textures on the farmer’s gloves, marmalade jar, and weathered wood of the beehives.
Veo 2 prompt: A low-angle shot captures a flock of pink flamingos gracefully wading in a lush, tranquil lagoon. The vibrant pink of their plumage contrasts beautifully with the verdant green of the surrounding vegetation and the crystal-clear turquoise water. Sunlight glints off the water's surface, creating shimmering reflections that dance on the flamingos' feathers. The birds' elegant, curved necks are submerged as they walk through the shallow water, their movements creating gentle ripples that spread across the lagoon. The composition emphasizes the serenity and natural beauty of the scene, highlighting the delicate balance of the ecosystem and the inherent grace of these magnificent birds. The soft, diffused light of early morning bathes the entire scene in a warm, ethereal glow.
Veo 2 prompt: A perfect cube rotates in the center of a soft, foggy void. The surface shifts between different hyper-real textures—smooth marble, velvety suede, hammered brass, and raw concrete. Each material reveals subtle details: marble veins slowly spreading, suede fibers brushing with wind, brass tarnishing in slow motion, and concrete crumbling to reveal polished stone inside. Ends with a soft glow surrounding the cube as it transitions to a smooth mirrored surface, reflecting infinity.
Veo 2 prompt: A cinematic shot captures a fluffy Cockapoo, perched atop a vibrant pink flamingo float, in a sun-drenched Los Angeles swimming pool. The crystal-clear water sparkles under the bright California sun, reflecting the playful scene. The Cockapoo's fur, a soft blend of white and apricot, is highlighted by the golden sunlight, its floppy ears gently swaying in the breeze. Its happy expression and wagging tail convey pure joy and summer bliss. The vibrant pink flamingo adds a whimsical touch, creating a picture-perfect image of carefree fun in the LA sunshine.
Veo 2 prompt: The sun rises slowly behind a perfectly plated breakfast scene. Thick, golden maple syrup pours in slow motion over a stack of fluffy pancakes, each one releasing a soft, warm steam cloud. A close-up of crispy bacon sizzles, sending tiny embers of golden grease into the air. Coffee pours in smooth, swirling motion into a crystal-clear cup, filling it with deep brown layers of crema. Scene ends with a camera swoop into a fresh-cut orange, revealing its bright, juicy segments in stunning macro detail.
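
The prompts above tend to pair a subject with explicit camera language: a shot angle, a lens, sometimes a film stock, and a visual effect. As a rough illustration only, the Python sketch below assembles such a prompt from its parts. The `ShotSpec` class and `build_prompt` function are hypothetical names invented for this example; the code only composes text and does not call Veo 2 or any Google API.

```python
from dataclasses import dataclass


@dataclass
class ShotSpec:
    """Hypothetical container for the parts of a video prompt."""
    subject: str          # what the shot shows
    angle: str = ""       # e.g. "low-angle shot"
    lens: str = ""        # e.g. "35mm lens"
    film_stock: str = ""  # e.g. "Kodak Portra 400 film"
    effect: str = ""      # e.g. "shallow depth of field"


def build_prompt(spec: ShotSpec) -> str:
    """Join the non-empty parts of a ShotSpec into one prompt sentence."""
    if not spec.subject.strip():
        raise ValueError("subject must not be empty")
    # Put the shot angle first when one is given, as the sample prompts do.
    opening = f"{spec.angle} of {spec.subject}" if spec.angle else spec.subject
    terms = [opening, spec.lens, spec.film_stock, spec.effect]
    text = ", ".join(term for term in terms if term)
    # Capitalize the first character and close the sentence.
    return text[0].upper() + text[1:] + "."


if __name__ == "__main__":
    spec = ShotSpec(
        subject="a beekeeper holding a jar of honey",
        angle="low-angle shot",
        lens="35mm lens",
        film_stock="Kodak Portra 400 film",
        effect="shallow depth of field",
    )
    print(build_prompt(spec))
```

Keeping the camera terms in separate fields makes it easy to vary one element, such as swapping the lens, while holding the rest of the shot constant.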

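Veo 2 highlights

  • High quality and high resolution: generates video at up to 4K, with length extended to several minutes.
  • Cinematic prompt understanding: supports camera language such as focal length, depth of field, and shot angle.
  • Safety and transparency: every generated video carries an embedded SynthID watermark identifying it as AI-generated, to help prevent misuse or deception.

Veo 2 is currently available in Google Labs' VideoFX tool, where some users can try it early. It will later be extended to platforms such as YouTube Shorts.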

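Imagen 3: More Refined and Varied Image Generation

Compared with the previous generation, Imagen 3 generates markedly better images. Whether the target is photorealism or abstract art, it renders the user's idea with finer detail and richer texture.

For example, prompts can produce a richly textured pottery scene in which a craftsperson shapes clay with golden energy; a red squirrel in a snowy forest with clearly visible fur detail; or a 1940s European train station filled with steam and the emotion of farewell. Imagen 3 follows user prompts more accurately and moves easily between artistic styles, from Impressionism to anime and from abstract art to realistic photography.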

Imagen 3 prompt: An extreme close-up of a craftsperson's hands shaping a glowing piece of pottery on a wheel. Threads of golden, luminous energy connect the potter’s hands to the clay, swirling dynamically with their movements. The workspace is filled with rich textures—dusty shelves lined with tools, scattered clay fragments, and beams of natural light piercing through wooden shutters. The interplay of light and energy creates an ethereal, almost magical atmosphere

Imagen 3 prompt: A close-up shot captures a winter wonderland scene – soft snowflakes fall on a snow-covered forest floor. Behind a frosted pine branch, a red squirrel sits, its bright orange fur a splash of color against the white. It holds a small hazelnut. As it enjoys its meal, it seems oblivious to the falling snow.

Imagen 3 prompt: A foggy 1940s European train station at dawn, framed by intricate wrought-iron arches and misted glass windows. Steam rises from the tracks, blending with dense fog. Two lovers stand in an emotional embrace near the train, backlit by the warm, amber glow of dim lanterns. The departing train is partially visible, its red tail lights fading into the mist. The woman wears a faded red coat and clutches a small leather diary, while the man is dressed in a weathered soldier’s uniform. Dust motes float in the air, illuminated by the soft golden backlight. The atmosphere is melancholic and timeless, evoking the bittersweet farewell of wartime cinema.

Imagen 3 prompt: A portrait of an Asian woman with neon green lights in the background, shallow depth of field.

Imagen 3 prompt: A close-up, macro photography stock photo of a strawberry intricately sculpted into the shape of a hummingbird in mid-flight, its wings a blur as it sips nectar from a vibrant, tubular flower. The backdrop features a lush, colorful garden with a soft, bokeh effect, creating a dreamlike atmosphere. The image is exceptionally detailed and captured with a shallow depth of field, ensuring a razor-sharp focus on the strawberry-hummingbird and gentle fading of the background. The high resolution, professional photographers style, and soft lighting illuminate the scene in a very detailed manner, professional color grading amplifies the vibrant colors and creates an image with exceptional clarity. The depth of field makes the hummingbird and flower stand out starkly against the bokeh background.

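Imagen 3 highlights

  • More striking images: more vivid color, stronger composition, and richer detail.
  • Support for many artistic styles, from realistic to fantastical, with smooth transitions between them.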

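Whisk: Sparking Ideas with Images Instead of Text

Google also introduced Whisk, a new experimental tool that lets users generate creative images from pictures rather than traditional text prompts. It is aimed at rapid visual ideation: users drag images for subject, scene, and style into the tool, and the AI remixes them into a one-of-a-kind new work.

Whisk combines Imagen 3 with the Gemini model. Gemini writes a detailed description of each input image, and Imagen 3 turns those descriptions into the final image. Whether the target is a digital plushie, an enamel pin, or a sticker design, Whisk turns the user's idea into an image quickly.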
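
The describe-then-generate flow can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Whisk's actual implementation: `WhiskInputs`, `compose_prompt`, and `remix` are hypothetical names, the way the three descriptions are merged into one prompt is a guess, and the `describe` and `generate` callables are stand-ins for the roles played by Gemini and Imagen 3. No real API is called.

```python
from dataclasses import dataclass
from typing import Callable

# Stand-in types: `Describe` plays the role of an image-captioning model,
# `Generate` the role of a text-to-image model. Both are assumptions.
Describe = Callable[[bytes], str]
Generate = Callable[[str], bytes]


@dataclass
class WhiskInputs:
    """Hypothetical bundle of the three images a user supplies."""
    subject_image: bytes
    scene_image: bytes
    style_image: bytes


def compose_prompt(subject: str, scene: str, style: str) -> str:
    """Merge three text descriptions into one prompt (assumed format)."""
    return f"Subject: {subject}. Scene: {scene}. Style: {style}."


def remix(inputs: WhiskInputs, describe: Describe, generate: Generate) -> bytes:
    """Describe each input image, then generate one image from the text."""
    prompt = compose_prompt(
        describe(inputs.subject_image),
        describe(inputs.scene_image),
        describe(inputs.style_image),
    )
    return generate(prompt)


if __name__ == "__main__":
    # Toy stand-ins so the sketch runs without any model.
    def fake_describe(image: bytes) -> str:
        return f"an image of {len(image)} bytes"

    def fake_generate(prompt: str) -> bytes:
        return prompt.encode("utf-8")

    result = remix(WhiskInputs(b"a", b"bb", b"ccc"), fake_describe, fake_generate)
    print(result.decode("utf-8"))
```

Passing the two models in as callables keeps the pipeline testable with toy stand-ins and makes the intermediate text prompt easy to inspect or edit before generation.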

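More on Whisk: "Google Launches Whisk: Redefining Creative Expression with Images and AI"

Whisk highlights

  • Generates ideas from dragged-in images.
  • Simple and fast, with no need for complex prompts; well suited to quick creative experiments.

A New Era for AI Creative Technology

Veo 2, Imagen 3, and Whisk from Google Labs show the potential of generative AI in video and image creation. From cinematic video production to personalized art, these tools give professional creators convenient creative solutions and let everyday users experiment easily. Google plans to extend these technologies to more products and platforms so that AI can support everyone's creative expression.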

References
State-of-the-art video and image generation with Veo 2 and Imagen 3
