Benchmarks only paint part of the picture, but it's still a decent place to start looking into recent models:
https://huggingface.co/spaces/mteb/leaderboard
Cohere's embed-v4.0 is my daily driver as far as a high performance model is concerned. I do a lot of cluster analysis and data visualization and I like that there's an `input_type="clustering"` mode in addition to the standard `input_type="search"` mode.
For a fast, open, and local model, I've found it hard to beat https://huggingface.co/sentence-transformers/all-MiniLM-L6-v...
I've liked Qwen and EmbeddingGemma for local search: Qwen because 32K is enough to basically fit a whole page into the context window, and EmbeddingGemma because it's crazy efficient.
I'm using the OpenAI small embedding model with custom compression. It's super cheap. You can read more at https://corvi.careers/blog/vector-search-embedding-compressi...
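The linked post isn't quoted here, but one common way to compress embeddings is to truncate to a leading prefix of dimensions (Matryoshka-style), re-normalize, and quantize to int8. This is a hypothetical sketch of that general idea, not necessarily the scheme the post describes; the `compress` function and its parameters are made up for illustration:

```python
import numpy as np

def compress(embedding: np.ndarray, dims: int = 256) -> np.ndarray:
    """Truncate a float embedding to its first `dims` dimensions,
    restore unit norm, then quantize to int8 (scalar quantization)."""
    truncated = embedding[:dims].astype(np.float64)
    truncated /= np.linalg.norm(truncated)  # re-normalize after truncation
    # map [-1, 1] onto [-127, 127] and round to signed bytes
    return np.clip(np.round(truncated * 127), -127, 127).astype(np.int8)

# toy 1536-dim vector standing in for a small OpenAI embedding
vec = np.random.default_rng(0).standard_normal(1536).astype(np.float32)
small = compress(vec)
print(small.shape, small.dtype)  # (256,) int8
```

That's a 24x size reduction (1536 float32 values down to 256 bytes) at some cost in retrieval quality; whether the trade-off is acceptable depends on the corpus.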
Just FYI: for RAG/similarity search, adding a reranker was a much bigger payoff than switching embedding models.
What top K do you use for vector search before passing into the reranker?
I’ve been using MixedBread, which is a pretty old model at this point. Recently, I tried comparing it to some newer models and was disappointed that the results weren’t dramatically and uniformly better.
You probably can’t go wrong if you pick a recent one that scores decently well on benchmarks and is at the right price point (or memory requirement) for whatever you’re trying to do.
Feels like embeddings are underrated compared to the LLM hype, but they're doing great.
Why do you feel like embeddings are underrated? What is it with embeddings that deserves more attention?
Meta's Perception Encoder Audio-Visual. It's CLIP-like but has three modalities: audio, video, and text.
I’m partial to jina.ai — they have open models for code and prose, all easily runnable locally.
Embeddings are easy to fine-tune. Try ModernBERT.
E5 (Microsoft)
gemma4
Does anyone know a tool for rug checks in crypto?