Posts
January 07, 2026
From RAG to AI Agent
A step-by-step guide to transform your RAG pipelines into effective AI agents.
December 11, 2025
What are the "experts" in Mixture-of-Experts LLMs?
And how can 8 or 16 of them cover all possible domains of expertise?
November 04, 2025
What's hybrid retrieval good for?
We've been told embedding search is strictly superior to BM25 and all other keyword-search algorithms. But keyword search still has a role in modern search pipelines.
October 29, 2025
Making sense of KV Cache optimizations, Ep. 4: System-level
Let's make sense of the zoo of system-level techniques that exist out there.
October 28, 2025
Making sense of KV Cache optimizations, Ep. 3: Model-level
Let's make sense of the zoo of model-level techniques that exist out there.
October 27, 2025
Making sense of KV Cache optimizations, Ep. 2: Token-level
Let's make sense of the zoo of token-level techniques that exist out there.
October 26, 2025
Making sense of KV Cache optimizations, Ep. 1: An overview
Let's make sense of the zoo of techniques that exist out there.
October 23, 2025
How does prompt caching work?
Nearly all inference libraries can do it for you. But what's really going on under the hood?
October 17, 2025
What is prompt caching?
Caching prompts can have an outsized impact on the cost and latency of your AI apps. But what exactly to cache and how?
October 09, 2025
Why using a reranker?
And is the added latency worth it? Let's understand what they do and how they can improve the quality of your RAG pipelines so drastically.
September 15, 2025
Trying to play "Guess Who" with an LLM
I expected a different kind of fun.
June 02, 2025
Can you really interrupt an LLM?
You never see that in the demos... why?
May 21, 2025
A simple vibecoding exercise
Can GenAI help you finish your side-projects?
May 16, 2025
Using Llama Models in the EU
The ban's terms are surprisingly little known among users of these popular "open-source" LLMs.
May 12, 2025
Beyond the hype of reasoning models: debunking three common misunderstandings
This is a teaser for my upcoming talk at ODSC East 2025, "LLMs that Think: Demystifying Reasoning Models". If you want to learn more, join the webinar!
October 30, 2024
Building Reliable Voice Bots with Open Source Tools - Part 2
A practical guide on the best techniques to build performant and cost-effective voice bots.
September 20, 2024
Building Reliable Voice Bots with Open Source Tools - Part 1
A deep look at the main challenges of building performant and cost-effective voice bots.
June 10, 2024
The Agent Compass
Agent means everything and nothing in today's GenAI landscape. Let's shed some light on this topic.
May 06, 2024
Generating creatures with Teranoptia
Having fun with fonts doesn’t always mean obsessing over kerning and ligatures. Sometimes, writing text is not even the point!
April 29, 2024
RAG, the bad parts (and the good!)
A summary of my recent talk at ODSC East about RAG, just in case you haven't heard enough of it already.
April 14, 2024
Explain me LLMs like I'm five: build a story to help anyone get the idea
Let's explore a high-level way to explain clearly to the average person what LLMs are good for, and to help them reason about it.
February 28, 2024
ClozeGPT: Write Anki cloze cards with a custom GPT
Writing good Anki cards is a chore. Let's bring LLMs to the rescue.
February 21, 2024
Is RAG all you need? A look at the limits of retrieval augmentation
This blogpost is a teaser for my upcoming talk at ODSC East 2024 in Boston, April 23-25.
January 06, 2024
Headless WiFi setup on Raspberry Pi OS "Bookworm" without the Raspberry Pi Imager
Setting up a headless Pi used to be simpler. Is it still possible to do it without the RPi Imager?
November 09, 2023
The World of Web RAG
What if our RAG application could fetch data directly from the web, live? Let's build this pipeline with Haystack 2.0.
November 05, 2023
Indexing data for RAG applications
RAG apps need data to work. Let's see how to pre-process our data to make our Haystack 2.0 RAG pipeline perform even better.
October 27, 2023
RAG Pipelines from scratch
Let's build a simple RAG Pipeline with Haystack 2.0 by just connecting three components: a Retriever, a PromptBuilder and a Generator.
October 26, 2023
A New Approach to Haystack Pipelines
Haystack 2.0 comes with a brand new pipeline concept. Let's discover it!
October 15, 2023
Haystack's Pipeline - A Deep Dive
What are Haystack's pipelines and how do they work?
October 11, 2023
Why rewriting Haystack?!
Before even diving into what Haystack 2.0 is, how it was built, and how it works, let’s spend a few words on the whats and the whys.
October 10, 2023
Haystack 2.0: What is it?
December is finally approaching, and with it the release of Haystack 2.0.
September 10, 2023
An (unofficial) Python SDK for Verbix
If you need a Python SDK for a verb conjugator, try this one while it's still alive.
December 11, 2021
My Dotfiles
What Linux developer would I be if I didn't also have my very own dotfiles repo?