Document Similarity
Overview
With embeddings, users can easily query and explore unstructured data. Embeddings are also useful for developing a more holistic understanding of your training data. For more details on embeddings, please refer to the following link.
Embedding projector
An embedding projector is a tool for uncovering patterns in unstructured data, which can help diagnose systemic model and labeling errors. Users can apply dimensionality reduction algorithms in the projector to explore embeddings interactively in 2D.
To open the embedding projector, click the projector icon in the top-right corner of the LABEL page.
![Screen Shot 2022-03-31 at 11.56.45 am.png 1627](https://files.readme.io/b1dc49b-Screen_Shot_2022-03-31_at_11.56.45_am.png)
We use the standard TensorFlow projector, which supports the t-SNE, UMAP, and PCA algorithms. For details, please refer to the following link.
In the demo below, we look at different bank authority documents. With the UMAP algorithm, similar bank authorities are clustered together, and the nearest points in the original space are listed on the right side of the page.
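To make the idea of reducing embeddings to 2D more concrete, here is a minimal sketch of PCA, the simplest of the three algorithms, in plain Python. It is an illustration only and not the projector's own implementation: the function name `pca_2d`, the power-iteration approach, and the sample vectors are all assumptions made for this example.

```python
import math
import random


def pca_2d(vectors, iterations=200, seed=0):
    """Project vectors onto their top two principal components (illustrative sketch)."""
    n, dim = len(vectors), len(vectors[0])

    # Center the data so the components describe variance around the mean.
    mean = [sum(v[j] for v in vectors) / n for j in range(dim)]
    centered = [[v[j] - mean[j] for j in range(dim)] for v in vectors]

    # Covariance matrix of the centered data (dim x dim).
    cov = [
        [sum(row[a] * row[b] for row in centered) / n for b in range(dim)]
        for a in range(dim)
    ]

    rng = random.Random(seed)
    components = []
    for _ in range(2):
        vec = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
        for _ in range(iterations):
            # Power iteration step: multiply by the covariance matrix.
            vec = [sum(cov[a][b] * vec[b] for b in range(dim)) for a in range(dim)]
            # Keep the vector orthogonal to components already found.
            for comp in components:
                dot = sum(x * y for x, y in zip(vec, comp))
                vec = [x - dot * y for x, y in zip(vec, comp)]
            norm = math.sqrt(sum(x * x for x in vec))
            if norm < 1e-12:
                # No variance left in any remaining direction.
                vec = [0.0] * dim
                break
            vec = [x / norm for x in vec]
        components.append(vec)

    # Coordinates of each centered vector along the two components.
    return [
        [sum(row[j] * comp[j] for j in range(dim)) for comp in components]
        for row in centered
    ]


# Made-up 3-dimensional "embeddings", purely for illustration.
sample_vectors = [
    [-2.0, -0.5, 1.0],
    [-1.0, 0.5, 1.0],
    [0.0, 0.0, 1.0],
    [1.0, -0.5, 1.0],
    [2.0, 0.5, 1.0],
]
points_2d = pca_2d(sample_vectors)
for point in points_2d:
    print(point)
```

Each input vector becomes a pair of coordinates that can be drawn on a 2D scatter plot. t-SNE and UMAP serve the same purpose but use non-linear methods, so they often separate clusters more clearly than PCA.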
![embedding projector page.png 1642](https://files.readme.io/cb3a88d-embedding_projector_page.png)
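The nearest-points list is computed in the original embedding space, not in the 2D view. The sketch below shows one way such a lookup could work, assuming cosine similarity as the measure. The function names, document IDs, and vectors are hypothetical and are not taken from the product.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest_neighbors(query_id, embeddings, k=3):
    """Return the k documents most similar to query_id, best match first."""
    query = embeddings[query_id]
    scored = [
        (other_id, cosine_similarity(query, vector))
        for other_id, vector in embeddings.items()
        if other_id != query_id
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical document IDs with made-up embedding vectors.
embeddings = {
    "bank_a_form_1": [0.9, 0.1, 0.0],
    "bank_a_form_2": [0.8, 0.2, 0.1],
    "bank_b_form_1": [0.1, 0.9, 0.2],
    "bank_b_form_2": [0.0, 0.8, 0.3],
}

for doc_id, score in nearest_neighbors("bank_a_form_1", embeddings, k=2):
    print(doc_id, round(score, 3))
```

Because the ranking uses the full-dimensional vectors, two points that look close in the 2D plot are not guaranteed to be neighbors in this list, and vice versa.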
For details on the different algorithms and on dimensionality reduction, please refer to the link below.