Embedding projections are popular for visualizing large datasets and models. However, people often encounter “friction” when using embedding visualization tools: (1) barriers to adoption, e.g., tedious data wrangling and loading, scalability limits, and no integration of results into existing workflows, and (2) limitations in possible analyses, since the tools lack integration with external tools that would additionally show coordinated views of metadata. In this paper, we present Embedding Atlas, a scalable, interactive visualization tool designed to make interacting with large embeddings as easy as possible. Embedding Atlas uses modern web technologies and advanced algorithms, including density-based clustering and automated labeling, to provide a fast and rich data analysis experience at scale. We evaluate Embedding Atlas with a competitive analysis against other popular embedding tools, showing that Embedding Atlas’s feature set specifically helps reduce friction, and report a benchmark of its real-time rendering performance with millions of points. Embedding Atlas is available as open source to support future work in embedding-based analysis.
