Document Similarity

Overview

With embeddings, users can easily query and explore unstructured data. Embeddings are also useful for developing a more holistic understanding of your training data. For more details on embeddings, please refer to the following link

Embedding projector

An embedding projector is a tool for uncovering patterns in unstructured data that can be used to diagnose systemic model and labeling errors. The projector applies dimensionality reduction algorithms so that users can explore embeddings interactively in 2D.
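As a rough illustration of what reducing embeddings to 2D involves, the sketch below implements PCA (one of the algorithms the projector supports) using only the Python standard library. It is not the projector's own implementation: the function names, the power-iteration approach, and the toy data are all assumptions made for the example.

```python
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def center(rows):
    """Subtract the per-dimension mean from every row."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]

def leading_direction(rows, iters=100, seed=0):
    """Power iteration on X^T X: approximates the unit direction of largest variance."""
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in rows[0]]
    for _ in range(iters):
        scores = [dot(row, v) for row in rows]          # X v
        w = [dot(scores, col) for col in zip(*rows)]    # X^T (X v)
        norm = dot(w, w) ** 0.5
        v = [x / norm for x in w]
    return v

def pca_2d(embeddings):
    """Project embeddings onto (approximately) their first two principal components."""
    rows = center(embeddings)
    pc1 = leading_direction(rows)
    # Deflate: remove the pc1 component so the second direction is orthogonal to it.
    residual = [[x - dot(row, pc1) * p for x, p in zip(row, pc1)] for row in rows]
    pc2 = leading_direction(residual, seed=1)
    return [(dot(row, pc1), dot(row, pc2)) for row in rows]

# Hypothetical data: two groups of 20 documents with 8-dimensional embeddings.
rng = random.Random(42)
group_a = [[rng.gauss(0.0, 0.1) for _ in range(8)] for _ in range(20)]
group_b = [[rng.gauss(1.0, 0.1) for _ in range(8)] for _ in range(20)]
points_2d = pca_2d(group_a + group_b)
```

Each entry of `points_2d` is an (x, y) pair that could be drawn on a scatter plot. In this toy data the two groups end up on opposite sides of the first axis, which is the kind of visual separation the projector is meant to reveal.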

To open the embedding projector, click the projector icon in the top right corner of the LABEL page.


We use the standard TensorFlow projector, which supports the t-SNE, UMAP, and PCA algorithms. For details, please refer to the following link

In the demo below, we looked at different bank authority documents. With the UMAP algorithm, similar bank authorities are clustered together, and the nearest points in the original space are listed in detail on the right side of the page.
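The "nearest points in the original space" are found by comparing the full embedding vectors rather than their 2D projections. The sketch below shows one way such a lookup could work. It assumes cosine similarity as the measure, and the document IDs and three-dimensional vectors are made up for illustration; real embeddings have many more dimensions.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return sum(a * b for a, b in zip(u, v)) / (norm_u * norm_v)

def nearest_documents(query_id, embeddings, k=3):
    """Return the IDs of the k documents most similar to query_id."""
    query = embeddings[query_id]
    scored = [
        (cosine_similarity(query, vec), doc_id)
        for doc_id, vec in embeddings.items()
        if doc_id != query_id  # skip the query document itself
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [doc_id for _, doc_id in scored[:k]]

# Hypothetical document IDs with toy embeddings.
embeddings = {
    "bank_a_form_1": [0.9, 0.1, 0.0],
    "bank_a_form_2": [0.8, 0.2, 0.1],
    "bank_b_form_1": [0.1, 0.9, 0.2],
    "bank_b_form_2": [0.0, 0.8, 0.3],
}
neighbors = nearest_documents("bank_a_form_1", embeddings, k=2)
```

Here the closest neighbor of `bank_a_form_1` is `bank_a_form_2`, mirroring how documents from the same bank authority appear next to each other in the projector's neighbor list.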


For details on the different algorithms and dimensionality reduction, please refer to the link below.