Google DeepMind · @GoogleDeepMind

We're helping AI to see the 3D world in motion as humans do. ๐ŸŒ Enter D4RT: a unified model that turns video...

View this X/Twitter post from @GoogleDeepMind, published on January 22, 2026 at 3:01 PM. This post contains 1 video and 1 image.

We're helping AI to see the 3D world in motion as humans do.

Enter D4RT: a unified model that turns video into 4D representations faster than previous methods - enabling it to understand space and time. This is how it works 🧵

To perceive a 3D scene captured on 2D video, an AI must track every pixel of every object as it moves.

Capturing this level of geometry and motion requires computationally intensive processes, leading to slow and fragmented reconstructions. But D4RT takes a different approach.

D4RT encodes the input video into a compressed representation, then a lightweight decoder queries that representation in parallel.

This makes it extremely fast and scalable - whether to track just a few points, or to reconstruct an entire scene. 🖼️

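The encode-once, query-many pattern described above can be illustrated with a toy sketch. This is not D4RT's architecture or code: the functions `encode_video`, `decode_query` and `decode_queries`, the `(u, v, t)` query format, and the hand-made latent features are all hypothetical, chosen only to show why independent queries against a shared compressed representation scale from a few points to a dense grid.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def encode_video(frames, num_latents=4):
    """Toy encoder (hypothetical): summarise the video once as a few latent vectors."""
    chunk = max(1, math.ceil(len(frames) / num_latents))
    latents = []
    for start in range(0, len(frames), chunk):
        end = min(start + chunk, len(frames))
        pixels = [p for frame in frames[start:end] for p in frame]
        # Normalised mid-point of this time chunk, in [0, 1].
        mid_time = (start + end - 1) / 2 / max(1, len(frames) - 1)
        latents.append([sum(pixels) / len(pixels), min(pixels), max(pixels), mid_time])
    return latents


def decode_query(latents, query):
    """Toy decoder (hypothetical): one softmax-attention read of the latents."""
    u, v, t = query
    q = [u, v, t, 1.0]  # assumed query embedding, same width as a latent
    scale = math.sqrt(len(q))
    scores = [sum(a * b for a, b in zip(q, z)) / scale for z in latents]
    top = max(scores)  # subtract the max for a numerically stable softmax
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [sum(w * z[i] for w, z in zip(weights, latents)) / total
            for i in range(len(q))]


def decode_queries(latents, queries, workers=4):
    """Each query only reads the shared latents, so queries can run concurrently."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda q: decode_query(latents, q), queries))


# Eight tiny 4x4 "frames" with made-up pixel values.
frames = [[((t + i) % 5) / 4 for i in range(16)] for t in range(8)]
latents = encode_video(frames)  # the expensive step runs once

# A few points, or a dense grid: same latents, same decoder.
few = decode_queries(latents, [(0.5, 0.5, 0.0), (0.5, 0.5, 1.0)])
dense = decode_queries(latents, [(x / 3, y / 3, 0.5) for y in range(4) for x in range(4)])
print(len(latents), len(few), len(dense))
```

In this sketch the cost of encoding is paid once per video, while the cost of decoding grows only with the number of queries, which is the property the thread attributes to D4RT's design.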
Many 4D tasks can now be solved with one model, enabling us to:
👉 Predict a pixel's 3D trajectory by looking for its location across different time steps.
⏱️ Freeze time and the camera viewpoint to generate a scene's complete 3D structure.

D4RT can even create and align 3D snapshots of a single moment from different viewpoints - easily recovering the camera's trajectory. 🎥

4D reconstruction often fails on dynamic objects, resulting in ghosting artifacts or processing lag. ⏳

D4RT can continuously understand what's moving while running 18x–300x faster than previous methods - processing a 1-minute video in roughly 5 seconds on a single TPU chip.

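To put the quoted figure in perspective, here is a small worked calculation. The 60-second and 5-second values come from the post; the 30 frames-per-second rate is an assumption for illustration only, since the post does not state the video's frame rate.

```python
def realtime_factor(video_seconds, processing_seconds):
    """How many times faster than playback speed the video is processed."""
    return video_seconds / processing_seconds


def frames_per_second(video_seconds, processing_seconds, video_fps):
    """Processed frames per wall-clock second, for an assumed source frame rate."""
    return video_seconds * video_fps / processing_seconds


print(realtime_factor(60, 5))        # 12.0, i.e. about 12x faster than real time
print(frames_per_second(60, 5, 30))  # 360.0 frames/s, if the video were 30 fps (assumed)
```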
We believe this research could have a wide range of applications in the real world.

From providing spatial awareness in robotics 🤖 and leveling up efficiency in AR 🕹️ to expanding the capabilities of world models, D4RT is a necessary step on the path towards AGI.

Find out more →