YouTube link, slides, Colab (live coding). All the material can also be found here.
A Practical Introduction to Image Retrieval
by Sara Zanzottera from deepset
Search should not be limited to text only. Recently, Transformer-based NLP models started crossing the boundaries of text data and exploring the possibilities of other modalities, like tabular data, images, audio files, and more. Text-to-text generation models like GPT now have their counterparts in text-to-image models, like Stable Diffusion. But what about search? In this talk we’re going to experiment with CLIP, a text-to-image search model, and use it to look for animals matching specific characteristics in a dataset of pictures. Does CLIP know which one is “The fastest animal in the world”?
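To give an idea of what CLIP does under the hood, here is a minimal sketch of scoring a text query against a handful of images with the Hugging Face transformers implementation of CLIP. The checkpoint name, image paths, and query are illustrative, not taken from the talk.

```python
# Minimal sketch: score one text query against a few images with CLIP.
from PIL import Image
import torch
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

# Hypothetical dataset of animal pictures
image_paths = ["cheetah.jpg", "sloth.jpg", "eagle.jpg"]
images = [Image.open(path) for path in image_paths]

query = "The fastest animal in the world"
inputs = processor(text=[query], images=images, return_tensors="pt", padding=True)

with torch.no_grad():
    outputs = model(**inputs)

# logits_per_text has shape (n_texts, n_images): similarity of the query to each image
scores = outputs.logits_per_text.softmax(dim=-1)
best = scores.argmax().item()
print(f"Best match for '{query}': {image_paths[best]}")
```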
For the 7th OpenNLP meetup I presented the topic of Image Retrieval, a feature that I’ve recently added to Haystack in the form of a MultiModal Retriever (see the Tutorial).
The talk consists of 5 parts:
- An introduction to the topic of image retrieval
- A mention of the current SOTA model (CLIP)
- An overview of Haystack
- A step-by-step description of how image retrieval applications can be implemented with Haystack (see the sketch after this list)
- A live coding session where I start from a blank Colab notebook and build a fully working image retrieval system from the ground up, to the point where I can run queries live.
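As a rough guide to what the live-coded system looks like, here is a minimal sketch using Haystack’s MultiModalRetriever (Haystack v1 API as of this talk). The model name, image paths, and query are illustrative; the linked Tutorial covers the full walkthrough.

```python
# Minimal sketch of a text-to-image retrieval system with Haystack's MultiModalRetriever.
from haystack import Document, Pipeline
from haystack.document_stores import InMemoryDocumentStore
from haystack.nodes.retriever.multimodal import MultiModalRetriever

# CLIP ViT-B/32 embeddings are 512-dimensional
document_store = InMemoryDocumentStore(embedding_dim=512)

retriever = MultiModalRetriever(
    document_store=document_store,
    query_embedding_model="sentence-transformers/clip-ViT-B-32",
    query_type="text",
    document_embedding_models={"image": "sentence-transformers/clip-ViT-B-32"},
)

# Index the animal pictures: each Document points at an image file on disk
images = [
    Document(content="animals/cheetah.jpg", content_type="image"),
    Document(content="animals/sloth.jpg", content_type="image"),
]
document_store.write_documents(images)
document_store.update_embeddings(retriever=retriever)

# Query the index with natural-language text
pipeline = Pipeline()
pipeline.add_node(component=retriever, name="retriever", inputs=["Query"])
results = pipeline.run(query="The fastest animal in the world", params={"retriever": {"top_k": 1}})
print(results["documents"][0].content)
```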
Towards the end I briefly mention an even more advanced version of this image retrieval system, which I had no time to implement live. However, I later built a notebook implementing such a system, and you can find it here: Cheetah.ipynb
The slides were generated from the linked Jupyter notebook with `jupyter nbconvert Dec_1st_OpenNLP_Meetup.ipynb --to slides --post serve`.