Language support in Meilisearch

February 1, 2026
1:15 PM – 1:45 PM

UB4.136

many

Meilisearch (https://www.meilisearch.com/) is a popular open-source search engine written in Rust, with more than 50k stars on GitHub, that focuses on performance and ease of use. Meilisearch is designed to be available worldwide, which requires supporting multiple languages through word tokenization. But how difficult is it to segment and normalize words? And how much does this process differ from one language to another?

Meilisearch core maintainers share how they handled language support, the difficulties they faced, and the solution they found.

Additional information

Live Stream	https://live.fosdem.org/watch/ub4136
Type	devroom
Language	English

More sessions

2/1/26	Implementing Block-Max Pruning in Rust: Faster Learned Sparse Retrieval for Modern Search Search UB4.136 Learned sparse retrieval models such as SPLADE, uniCOIL, and other transformer-based sparse encoders have become popular for delivering neural-level relevance while preserving the efficiency of inverted indexes. But these models also produce indexes with statistical properties radically different from classic BM25: longer queries, compressed vocabularies, and posting lists with unusual score distributions. As a result, traditional dynamic pruning algorithms like WAND and Block-Max WAND often ...
2/1/26	Deriving Maximum Insight: Open-Source Graph-Enhanced RAG for Complex Question Answering Search Mykyta Kemarskyi UB4.136 Traditional QA pipelines, even those using baseline RAG, struggle with complex reasoning tasks such as multi-hop inference, contradiction detection, entity linking, temporal consistency, and large-scale cross-document understanding. These limitations become critical in domains like investigative journalism, scientific research, and legal analysis, where answers depend on relationships spread across many documents rather than isolated text chunks. This talk will demonstrate how ...
2/1/26	OpenSearch v3: A New Era of Search Innovation - From Neural Sparse ANN to Agentic Workflows and everything in-between Search UB4.136 The OpenSearch v3 major release, introduced in the past year, represents a significant leap forward in open source search technology, delivering breakthrough innovations across neural search, AI-driven search experiences, and performance optimization. This talk explores the major features that define the 3.x releases and their impact on modern search applications. We'll dive into differentiating capabilities like scalable Neural Sparse ANN Search using the SEISMIC algorithm, and the ...
2/1/26	Multi-Vector embeddings revolution? or evolution? Search UB4.136 What are multi-vector embeddings? How do they differ from regular embeddings? And how can you build an AI-powered OCR system in under 5 minutes without paying a fortune for infrastructure? If you're curious for answers, join me! I'll break down ColBERT embeddings, explore how MUVERA compression is revolutionizing the way we work with multi-vectors, and show you how to leverage it all to build an AI-powered OCR system on resource-constrained devices such as Raspberry Pi. Weaviate DB: ...
2/1/26	Multi-Stage Retrieval in Elasticsearch - Present and Future Search Carlos Delgado UB4.136 Search in Elasticsearch keeps evolving, from traditional BM25 keyword retrieval to multi-stage search that combines lexical, vector, and language-model-driven intelligence. In this talk, we’ll explore how Elasticsearch APIs enable developers to build hybrid search systems that mix classical scoring, dense vector search, and semantic reranking in a single coherent workflow. We’ll use ES|QL, Elasticsearch’s new query language, and show how constructs like FORK, FUSE, RERANK, ...

FOSDEM 2026

1/31/26 – 2/1/26
