Flow models parameterized as time-dependent velocity fields can generate data from noise by integrating an ODE. These models are often trained using flow matching, i.e. by sampling random pairs of noise and target points $(x_0, x_1)$ and ensuring that the velocity field is aligned, on average, with $x_1 - x_0$ when evaluated along a segment linking $x_0$ to $x_1$. While these pairs are sampled independently by default, they can also be selected more carefully by matching batches of $n$ noise points to $n$ target points using an optimal transport (OT) solver. Although promising in theory, the OT flow matching (OT-FM) approach is not widely used in practice. Zhang et al. (2025) recently pointed out that OT-FM truly starts paying off when the batch size $n$ grows significantly, which only a multi-GPU implementation of the Sinkhorn algorithm can handle. Unfortunately, the cost of running Sinkhorn can quickly balloon, requiring $O(n^2/\varepsilon^2)$ operations for every $n$ pairs used to fit the velocity field, where $\varepsilon$ is a regularization parameter that should typically be small to yield better results. To fulfill the theoretical promises of OT-FM, we propose to move away from batch-OT and rely instead on a semidiscrete formulation that leverages the fact that the target dataset distribution is usually of finite size $N$. The semidiscrete OT (SD-OT) problem is solved by estimating a dual potential vector using SGD; using that vector, freshly sampled noise vectors at train time can then be matched with data points at the cost of a maximum inner product search (MIPS). Semidiscrete FM (SD-FM) removes the quadratic dependency on $n/\varepsilon$ that bottlenecks OT-FM. SD-FM beats both FM and OT-FM on all training metrics and inference budget constraints, across multiple datasets, on unconditional/conditional generation, or when using mean-flow models.
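The semidiscrete recipe described above (a dual potential vector fitted by SGD, then matching by MIPS) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation and rests on several assumptions: standard Gaussian noise, uniform weights on the $N$ data points, a squared Euclidean cost, no entropic regularization, and a brute-force inner product search. The function names `assign`, `fit_potential`, and `sample_fm_batch`, as well as the step-size schedule, are hypothetical choices made for illustration.

```python
import numpy as np


def assign(x, y, g):
    """Match each noise row of x to a data index with one inner product search.

    For the cost 0.5 * ||x - y_j||^2, minimizing cost minus potential,
        argmin_j 0.5 * ||x - y_j||^2 - g_j,
    equals argmax_j <x, y_j> + (g_j - 0.5 * ||y_j||^2),
    because the 0.5 * ||x||^2 term does not depend on j.
    """
    scores = x @ y.T + (g - 0.5 * np.sum(y**2, axis=1))
    return np.argmax(scores, axis=1)


def fit_potential(y, steps=2000, batch=256, lr=1.0, seed=0):
    """Stochastic gradient ascent on the unregularized semidual objective.

    Assumes uniform weights 1/N on the data points. The stochastic gradient
    for g_j is (1/N) minus the fraction of the noise batch assigned to j.
    """
    rng = np.random.default_rng(seed)
    n_data, dim = y.shape
    g = np.zeros(n_data)
    for step in range(steps):
        x = rng.standard_normal((batch, dim))
        freq = np.bincount(assign(x, y, g), minlength=n_data) / batch
        # Decaying step size (an illustrative choice, not taken from the paper).
        g += lr / np.sqrt(step + 1) * (1.0 / n_data - freq)
    return g


def sample_fm_batch(y, g, n, rng):
    """Draw n flow-matching training triples (t, x_t, x_1 - x_0)."""
    x0 = rng.standard_normal((n, y.shape[1]))  # fresh noise
    x1 = y[assign(x0, y, g)]                   # matched data points
    t = rng.uniform(size=(n, 1))
    xt = (1.0 - t) * x0 + t * x1               # point on the segment x0 -> x1
    return t, xt, x1 - x0


if __name__ == "__main__":
    data = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 3.0], [-3.0, -3.0]])
    potential = fit_potential(data)
    t, xt, target = sample_fm_batch(data, potential, 8, np.random.default_rng(1))
    print(potential, target.shape)
```

In this sketch the per-sample cost of matching is a single search over the $N$ data points, so no $n \times n$ coupling is ever formed; at scale, the brute-force `argmax` would presumably be replaced by a dedicated MIPS index.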
- ** Work done while at Apple
