ILIAS:
Instance-Level Image retrieval At Scale

Giorgos Kordopatis-Zilos Vladan Stojnić Anna Manko Pavel Šuma Nikos Efthymiadis Nikolaos-Antonios Ypsilantis Zakaria Laskar Jiří Matas Ondřej Chum Giorgos Tolias

and data contributors

Visual Recognition Group, Faculty of Electrical Engineering, Czech Technical University in Prague

Explore

Dataset intro

This work introduces ILIAS, a new test dataset for Instance-Level Image retrieval At Scale. It is designed to support future research in image-to-image and text-to-image retrieval for particular objects, and it also serves as a large-scale benchmark for evaluating the representations of foundation vision-and-language models (VLMs).

ILIAS includes queries and positive images for 1,000 object instances, covering diverse conditions and domains. Retrieval is performed against 100M distractor images from YFCC100M. To avoid false negatives (FNs), only query objects that emerged after 2014, the YFCC100M compilation date, are included.
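As an illustration of the retrieval setup described above, the following is a minimal sketch that ranks a small database of positives and distractors by cosine similarity to a query descriptor. The descriptors, image ids, and function names (`l2_normalize`, `rank_database`) are hypothetical toy values chosen for illustration; this is not the benchmark's evaluation code, and a real 100M-image search would rely on an optimized index rather than a Python loop.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def cosine_similarity(a, b):
    """Cosine similarity between two global descriptors."""
    return sum(x * y for x, y in zip(l2_normalize(a), l2_normalize(b)))


def rank_database(query, database):
    """Return database image ids sorted by decreasing similarity to the query."""
    scored = [(cosine_similarity(query, vec), image_id)
              for image_id, vec in database.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [image_id for _, image_id in scored]


# Toy 3-D descriptors (hypothetical): two positives and two distractors.
database = {
    "positive_1": [0.9, 0.1, 0.0],
    "positive_2": [0.8, 0.3, 0.1],
    "distractor_1": [0.0, 1.0, 0.0],
    "distractor_2": [0.1, 0.0, 1.0],
}
query = [1.0, 0.2, 0.0]

ranking = rank_database(query, database)
print(ranking)
```

In this toy case both positives rank above both distractors. A distractor that actually depicted the query object would be counted as a wrong result even though it is correct, which is the false-negative problem the post-2014 constraint is meant to prevent.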

Key insights from extensive benchmarking:

models fine-tuned on specific domains, such as landmarks or products, excel in that domain but fail on ILIAS
learning a linear adaptation layer using multi-domain class supervision results in performance improvements, especially for vision-and-language models
local descriptors in retrieval re-ranking are still a key ingredient, especially in the presence of severe background clutter
the text-to-image performance of the vision-language foundation models is surprisingly close to the corresponding image-to-image case.
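The re-ranking insight above can be sketched as a two-stage pipeline: a global-descriptor shortlist is re-ordered by a similarity computed from local descriptors. The matching rule below (counting query local descriptors whose best dot product with an image's local descriptors reaches a threshold) is a simplified stand-in chosen for illustration, not the method of any benchmarked model; all names, the threshold, and the toy unit-norm descriptors are assumptions.

```python
def count_local_matches(query_locals, image_locals, threshold=0.9):
    """Count query local descriptors with a sufficiently similar image descriptor.

    Descriptors are assumed to be unit-norm, so the dot product is the cosine.
    """
    matches = 0
    for q in query_locals:
        best = max(
            (sum(a * b for a, b in zip(q, d)) for d in image_locals),
            default=0.0,
        )
        if best >= threshold:
            matches += 1
    return matches


def rerank(shortlist, query_locals, local_db, top_k=2):
    """Re-order the first top_k shortlist entries by local match count.

    Entries beyond top_k keep their global-descriptor order. Python's sort is
    stable, so ties within the top_k also keep their original order.
    """
    head, tail = shortlist[:top_k], shortlist[top_k:]
    head = sorted(
        head,
        key=lambda image_id: count_local_matches(query_locals, local_db[image_id]),
        reverse=True,
    )
    return head + tail


# Hypothetical shortlist from a global-descriptor search: a cluttered
# distractor was ranked first, ahead of the true positive.
shortlist = ["cluttered_distractor", "positive", "other"]
query_locals = [[1.0, 0.0], [0.0, 1.0]]
local_db = {
    "cluttered_distractor": [[0.6, 0.8]],
    "positive": [[1.0, 0.0], [0.0, 1.0], [0.6, 0.8]],
    "other": [[0.0, 1.0]],
}

reranked = rerank(shortlist, query_locals, local_db, top_k=2)
print(reranked)
```

Restricting the re-ranking to a short prefix of the list keeps the cost of local matching independent of the database size, which matters when the first stage searches on the order of 100M images.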

Instances were manually collected to capture challenging conditions and diverse domains

Large-scale retrieval is conducted against 100M distractor images from YFCC100M.

To avoid false negatives, all the objects are known to have been designed after 2014, the YFCC100M compilation date.

SOTA models were evaluated during the dataset collection

Benchmark

Global representation models for Image-to-Image

checkpoint	year	repo	arch	dims	Dataset	Data Size	Train Res	Test Res	5M	100M^†	100M
alexnet.tv_in1k	2012	torchvision	CNN	sup	in1k	1M	224	384	1.9	1.3	1.5
convnext_base.clip_laion2b_augreg	2022	timm	CN-B	vla	laion2b	2B	256	384	18.1	14.0	7.9
convnext_base.fb_in1k	2022	timm	CN-B	sup	in1k	1M	288	384	3.9	2.7	2.0
convnext_base.fb_in22k	2022	timm	CN-B	sup	in22k	14M	224	384	9.9	7.6	6.4
convnext_large.fb_in1k	2022	timm	CN-L	sup	in1k	1M	288	384	4.2	2.9	2.2
convnext_large.fb_in22k	2022	timm	CN-L	sup	in22k	14M	288	384	9.1	6.9	6.6
convnext_large_mlp.clip_laion2b_ft_soup_320	2022	timm	CN-L	vla	laion2b	2B	320	512	22.9	18.3	9.6
cvnet_resnet101	2022	github	R101	sup	gldv2	1M	512	724	4.2	3.1	3.0
cvnet_resnet50	2022	github	R50	sup	gldv2	1M	512	724	3.5	2.6	2.9
deit3_base_patch16_224.fb_in1k	2021	timm	ViT-B	sup+dist	in1k	1M	224	384	2.7	1.8	1.2

Showing 1 to 10 of 64 entries

Global representation models for Text-to-Image

checkpoint	year	repo	arch	dims	Dataset	Data Size	Train Res	Test Res	5M	100M
convnext_base.clip_laion2b_augreg	2022	convnext	timm+oc	CN-B	640	laion2b	2B	256	384	6.6
convnext_large_mlp.clip_laion2b_ft_soup_320	2022	convnext	timm+oc	CN-L	768	laion2b	2B	320	512	10.9
eva02_base_patch16_clip_224.merged2b	2023	evaclip	timm+oc	ViT-B	512	merged2b	2B	224	384	4.4
eva02_large_patch14_clip_336.merged2b	2023	evaclip	timm+oc	ViT-L	768	merged2b	2B	336	512	10.0
RN50.openai	2021	oc	R50	1024	openai	400M	224	384	2.2	1.4
vit_base_patch16_clip_224.metaclip_2pt5b	2024	timm+oc	ViT-B	768	2pt5b	2.5B	224	384	7.0	4.5
vit_base_patch16_clip_224.openai	2021	timm+oc	ViT-B	512	openai	400M	224	384	2.8	1.6
vit_base_patch16_siglip_224.webli	2023	timm+hf	ViT-B	768	webli	10B	224	384	9.5	6.7
vit_base_patch16_siglip_256.webli	2023	timm+hf	ViT-B	768	webli	10B	224	384	9.7	7.0
vit_base_patch16_siglip_384.webli	2023	timm+hf	ViT-B	768	webli	10B	384	512	13.6	10.4

Showing 1 to 10 of 17 entries

Local representation models for Image-to-Image with Reranking

checkpoint	year	repo	arch	dims	Dataset	Data Size	Train Res	Test Res	5M	100M^†	100M
No data available in table

Explore the collected data for your instance-level research!

Discover ILIAS

Get in touch

Citation

If you find our project useful, please consider citing us:

@article{
#coming-soon,
title={ILIAS: Instance-Level Image retrieval At Scale},
author={},
journal={#coming-soon},
year={#coming-soon},
}

Results

Submit your results here:

Open submission form

If you have any further questions, please don't hesitate to reach out to georgekordopatis.gmail.com

Acknowledgment

We are grateful to everyone who contributed to ILIAS. A special thank you to Larysa Ivashechkina for her invaluable work in data annotation. We also appreciate the efforts and participation of all the external data collectors: Yankun Wu, Noa Garcia, Yannis Kalantidis, Dmytro Mishkin, Tomáš Jelínek, Dimitris Karageorgiou, Markos Zampoglou, Celeste Abreu, Aggeliki Tserota, Christina Tserota, Eleni Karantali, Eva Tsiliakou, Kelly Kordopati, Panagiotis Tassis, Ruslan Rozumnyi.