Constructing robots to accomplish long-horizon tasks is a long-standing challenge within artificial intelligence. Approaches using generative methods, particularly diffusion models, have gained attention due to their ability to model continuous robotic trajectories for planning and control. However, we show that these models struggle with long-horizon tasks that involve complex decision-making and, more generally, are prone to confusing different modes of behavior, leading to failure.
To remedy this, we propose to augment continuous trajectory generation by simultaneously generating a high-level symbolic plan. We show that this requires a novel combination of discrete-variable diffusion and continuous diffusion, which dramatically outperforms the baselines. In addition, we illustrate how this hybrid diffusion process enables flexible trajectory synthesis, allowing us to condition synthesized actions on partial or complete discrete plans.
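To make the idea of mixing discrete and continuous diffusion concrete, here is a minimal sketch of a hybrid forward (noising) process. It is an illustration under stated assumptions, not the method's actual implementation: the source does not specify the discrete corruption scheme, so the sketch assumes a masking (absorbing-state) process, and the names `MASK`, `noise_continuous`, `noise_discrete`, `hybrid_forward`, and the `given` argument are all hypothetical. Conditioning on a partial discrete plan is modeled by clamping the given symbols so they are never corrupted.

```python
import math
import random

MASK = -1  # hypothetical id for the masked (absorbing) symbolic state


def noise_continuous(traj, alpha_bar, rng):
    """Gaussian forward step: x_t = sqrt(a) * x_0 + sqrt(1 - a) * eps."""
    scale, sigma = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)
    return [[scale * x + sigma * rng.gauss(0.0, 1.0) for x in step] for step in traj]


def noise_discrete(plan, mask_prob, rng, given=None):
    """Masking forward step: each symbol becomes MASK with probability mask_prob.

    Positions listed in `given` are treated as conditions and are never masked
    (an assumption about how partial conditioning could be realized).
    """
    given = given or {}
    noisy = []
    for i, token in enumerate(plan):
        if i in given:
            noisy.append(given[i])  # clamp the conditioned symbol
        elif rng.random() < mask_prob:
            noisy.append(MASK)
        else:
            noisy.append(token)
    return noisy


def hybrid_forward(traj, plan, alpha_bar, mask_prob, rng, given=None):
    """Corrupt both modalities; a joint denoiser would be trained to invert this."""
    noisy_traj = noise_continuous(traj, alpha_bar, rng)
    noisy_plan = noise_discrete(plan, mask_prob, rng, given)
    return noisy_traj, noisy_plan


rng = random.Random(0)
traj = [[0.0, 0.0], [0.5, 0.1], [1.0, 0.2]]  # toy 2-D waypoints
plan = [2, 0, 1]                             # toy symbolic plan, e.g. a block order

# Fully mask the plan except for the first symbol, which is given as a condition.
noisy_traj, noisy_plan = hybrid_forward(
    traj, plan, alpha_bar=0.5, mask_prob=1.0, rng=rng, given={0: 2}
)
```

Passing an empty `given` corresponds to unconditional generation, while supplying every position corresponds to conditioning on a complete plan. Using separate noise levels (`alpha_bar` and `mask_prob`) for the two modalities is a design choice of this sketch, not something the source confirms.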
Diffusion baseline: struggles with long-horizon decision-making tasks, mixing behaviors present in the dataset.
Hybrid diffusion (ours): solves complex tasks by combining symbolic planning with continuous trajectory generation.
Method | X-Arm Sorting | Arrange Blocks | Hook Task |
---|---|---|---|
Diffuser | 46% | 67% | 38% |
Joint Diffuser | 41% | 61% | 48% |
Separate Diffuser | 38% | 62% | 43% |
Hybrid (Ours) | 83% | 74% | 60% |
Beyond the benchmark results above, we measure the robustness of all methods as task complexity increases by varying the number of blocks to sort. Hybrid Diffusion Planning is substantially more robust than the baselines: in the results below it reaches 70% on the Sorting Task and 60% on the Hook Task, while no baseline exceeds 20% on either.
Method | Sorting Task | Hook Task |
---|---|---|
Diffuser | 20% | 6.7% |
Joint Diffuser | 10% | 10% |
Separate Diffuser | 0% | 6.7% |
Hybrid (Ours) | 70% | 60% |