Molmo 2 is an 8B-parameter model that surpasses the 72B-parameter Molmo in accuracy, temporal understanding, and pixel-level ...
Imagine a coffee cup sitting on a table. Now imagine a book partially obscuring the cup. As humans, we still recognize the coffee cup even though we can't see all of it. But a robot might be ...
The humanoid robot David perceives a cup with his RGB-D camera; next, he will grasp it and fill the dishwasher. Tracking objects and kinematic structures in 3D space and determining their ...
Can a robot figure out how heavy or soft an object is without using a single camera or force sensor? According to a recent arXiv paper, the answer is yes—and the solution lies entirely in how the ...