Han Xiao's Post

CEO@Jina AI (e/acc)

One cool thing about ColBERT-based search compared with cosine-based vector retrieval is that you get interpretability for free as a byproduct of the MaxSim computation. It works a bit like the Lucene highlighter: you can pull the most relevant snippets out of a long document and show users where their query matches.

With Jina-ColBERT-v1, which supports up to 8K tokens and was released by Jina AI earlier this February, the visualization of the late interaction between a query and a document is almost... artistic. The video shows the late interaction between the query "Elephants eat 150 kg of food per day." and the Wikipedia article on the "Indian Elephant". Darker colors indicate stronger interactions. The darkest area corresponds to "The species is classified as a megaherbivore and consume up to 150 kg (330 lb) of plant matter per day." from the original article.

You can use Jina-ColBERT-v1 via our Embedding API at https://jina.ai/embeddings. Make sure to select this model in the model dropdown. Don't forget to check out this article, where we explain how those graphs were made: https://lnkd.in/eQCHbTsN
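For readers who want to see why MaxSim yields highlights as a side effect, here is a minimal sketch. It is not Jina's implementation: the function name `maxsim`, the random stand-in embeddings, and the 8-dimensional vectors are all assumptions for illustration. In practice the per-token vectors would come from a ColBERT-style model.

```python
import numpy as np

def maxsim(query_vecs, doc_vecs):
    """Late-interaction score between per-token embeddings (illustrative sketch).

    query_vecs: (num_query_tokens, dim), doc_vecs: (num_doc_tokens, dim).
    """
    # L2-normalize so the dot product below is cosine similarity.
    q = query_vecs / np.linalg.norm(query_vecs, axis=1, keepdims=True)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    # Token-by-token similarity matrix; this is what a heatmap would plot.
    sim = q @ d.T
    # For each query token, the document token it matches best.
    best_doc_token = sim.argmax(axis=1)
    # MaxSim: sum over query tokens of the best match in the document.
    score = float(sim.max(axis=1).sum())
    return score, sim, best_doc_token

# Random vectors stand in for real model output (assumption).
rng = np.random.default_rng(0)
query_vecs = rng.normal(size=(4, 8))
doc_vecs = rng.normal(size=(10, 8))
score, sim, best_doc_token = maxsim(query_vecs, doc_vecs)
```

The `sim` matrix and the `best_doc_token` indices are computed anyway to produce the score, so a highlighter can reuse them to mark the document tokens that contributed most to the match.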

Andre Zayarni

Co-founder at Qdrant, Vector Database.

1w

Finally, Jina ColBERT can be used in Qdrant: https://qdrant.tech/blog/qdrant-1.10.x/

Daniel Svonava

Vector Compute @ Superlinked | xYouTube

6d

How does the computational cost of ColBERT-based search compare to traditional cosine-based vector retrieval, especially for large-scale applications?
