We're helping AI to see the 3D world in motion as humans do. 🌐

Enter D4RT: a unified model that turns video into 4D representations faster than previous methods - enabling it to understand space and time. This is how it works 🧵
To perceive a 3D scene captured on 2D video, an AI must track every pixel of every object as it moves. 🔍

Capturing this level of geometry and motion usually requires computationally intensive processes, leading to slow and fragmented reconstructions. But D4RT takes a different approach.
D4RT encodes the input video into a compressed representation, then queries that representation in parallel using a lightweight decoder.

This makes it extremely fast and scalable - whether to track just a few points, or to reconstruct an entire scene.  🖼️
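
To make the encode-once, query-in-parallel idea concrete, here is a minimal toy sketch. It is not D4RT's actual code or architecture: `encode`, `decode`, `decode_all`, and the `(u, v, t)` query format are all hypothetical stand-ins, and the "latent" is just one brightness value per frame. The only point it illustrates is that each query reads the shared representation and nothing else, so any number of queries can run side by side.

```python
from concurrent.futures import ThreadPoolExecutor

def encode(video):
    # Hypothetical encoder stand-in: one mean-brightness value per frame.
    # A real model would produce a learned latent representation instead.
    return [sum(sum(row) for row in frame) / (len(frame) * len(frame[0]))
            for frame in video]

def decode(latent, query):
    # Hypothetical lightweight decoder: answers one (u, v, t) query using
    # only the shared latent, so queries never depend on each other.
    u, v, t = query
    depth = 1.0 + latent[t]  # toy depth value, not a real prediction
    return (u * depth, v * depth, depth)

def decode_all(latent, queries, workers=4):
    # Independent queries can be answered in parallel; map() keeps their order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda q: decode(latent, q), queries))

video = [
    [[0.0, 0.0], [0.0, 0.0]],  # frame 0: 2x2 dark image
    [[1.0, 1.0], [1.0, 1.0]],  # frame 1: 2x2 bright image
]
latent = encode(video)  # done once per video
points = decode_all(latent, [(0.5, 0.5, 0), (0.5, 0.5, 1)])
```

Because the expensive step (`encode`) runs once and each query is cheap, the same setup covers both "a few points" and "the entire scene": only the length of the query list changes.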
Many 4D tasks can now be solved with one model, enabling us to:
👉 Predict a pixel’s 3D trajectory by looking for its location across different time steps.
⏱️ Freeze time and the camera viewpoint to generate a scene's complete 3D structure. 
D4RT can even create and align 3D snapshots of a single moment from different viewpoints - easily recovering the camera's trajectory. 🎥 
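
One way to picture how a single model can cover these tasks is that they differ only in which questions get asked. The sketch below is an illustration under assumptions, not D4RT's real interface: the 5-tuple `(u, v, t_src, t_tgt, t_cam)` and both helper names are hypothetical. Tracking varies the target time for one pixel, while a frozen-time snapshot fixes time and camera and sweeps over every pixel.

```python
def track_queries(u, v, t_src, num_frames):
    # One pixel, picked in frame t_src, asked about at every time step.
    # The camera reference is held at t_src here (an assumption).
    return [(u, v, t_src, t_tgt, t_src) for t_tgt in range(num_frames)]

def snapshot_queries(width, height, t):
    # Every pixel of frame t, with time and camera viewpoint both frozen at t.
    return [(x, y, t, t, t) for y in range(height) for x in range(width)]

track = track_queries(u=3, v=2, t_src=0, num_frames=4)
snapshot = snapshot_queries(width=2, height=2, t=1)
```

Here `track` holds 4 queries (one per time step) and `snapshot` holds 4 queries (one per pixel of a 2x2 frame); a decoder that answers each query with a 3D point would turn the first list into a trajectory and the second into a point cloud.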
4D reconstruction often fails on dynamic objects, resulting in ghosting artifacts or processing lag. ⏳

D4RT can continuously understand what's moving while running 18x–300x faster than previous methods - processing a 1-minute video in roughly 5 seconds on a single TPU chip.
We believe this research could have wide-ranging applications in the real world.

From providing spatial awareness in robotics 🤖, to leveling up efficiency in AR 🕹️, to expanding the capabilities of world models 🌐, D4RT is a necessary step on the path towards AGI.

Find out more → 