Hunyuan · @TencentHunyuan

View this X/Twitter post published by @TencentHunyuan on August 28, 2025 at 04:20. The post contains 1 video.

Published
August 28, 2025, 04:20
Thread entries
1
Media count
1

Post overview

Today we're announcing the open-source release of HunyuanVideo-Foley, our new end-to-end Text-Video-to-Audio (TV2A) framework for generating high-fidelity audio.🚀

This tool empowers creators in video production, filmmaking, and game development to generate professional-grade audio that precisely aligns with visual dynamics and semantic context, addressing key challenges in V2A generation.🔊

Key Innovations:

🔹Exceptional Generalization: Trained on a massive 100k-hour multimodal dataset, the model generates contextually-aware soundscapes for a wide range of scenes, from natural landscapes to animated shorts.

🔹Balanced Multimodal Response: Our innovative multimodal diffusion transformer (MMDiT) architecture ensures the model balances video and text cues, generating rich, layered sound effects that capture every detail—from the main subject to subtle background elements.

🔹High-Fidelity Audio: Using a Representation Alignment (REPA) loss function and a powerful Audio VAE, we've improved generation stability and produce professional-grade audio, free of noise and inconsistencies.

HunyuanVideo-Foley achieves SOTA on multiple benchmarks, surpassing all open-source models in audio quality, visual-semantic alignment, and temporal alignment.

👉Try it now: 
🌐Project Page: 
🔗Code: 
📄Technical Report: 
🤗Hugging Face: 
