A fast and low-cost method to detect nearduplicate Images in large dataset based on fingerprint extraction and Deep Learning

Nasri Shandiz, Fatemeh

Please use this identifier to cite or link to this item: https://hdl.handle.net/20.500.12104/92292

Title:	A fast and low-cost method to detect nearduplicate Images in large dataset based on fingerprint extraction and Deep Learning
Other titles:	A Thesis in the Field of Artificial intelligence and computer vision For the Doctorate Degree in Information Technology
Author:	Nasri Shandiz, Fatemeh
Director:	Dr. Orizaga Trejo, José Antonio
Advisors:	Dra. Maciel Arellano, Ma. Del Rocío; Dra. Gaytán Lugo, Laura Sanely; Dr. Beltrán Ramírez, Jesús Raúl
Keywords:	Deep Learning;low-Cost Method;images In Large
Degree date:	16-Mar-2023
Publisher:	Biblioteca Digital wdg.biblio Universidad de Guadalajara
Abstract:	Recognizing near-duplicate images in large datasets is a crucial task in image retrieval and content identification. Finding similar images in order to reduce redundancy is time-consuming in large datasets. Most image representation methods aimed at conventional image retrieval, when used to detect duplicates, are either computationally expensive to extract and match or have limited robustness. In this work, we propose a fast, computationally low-cost, and effective method for detecting near-duplicate images in a large dataset, which uses image fingerprints to determine the similarity between a query image and near-duplicate images in the dataset. For each image in the dataset, we extract a series of fingerprints that combine global and local features, and also use a deep learning model as a fingerprint, and we store them in a separate database. We then apply successive filters to the query image, discarding non-similar images along the way until a final set of near-duplicate images is reached. The method discards most of the non-similar images in the early stages of the process and focuses on robustness in the later stages, where the set of near-duplicate candidate images is significantly smaller. This allows the query to be processed on the fly. The proposed method and the experimental results offer a good compromise between accuracy and speed in detecting near-duplicate images in a large dataset, even on a low-performance computer such as a home laptop or a workstation.
URI:	https://wdg.biblio.udg.mx https://hdl.handle.net/20.500.12104/92292
Academic program:	DOCTORADO EN TECNOLOGIAS DE INFORMACION
Appears in collections:	CUCEA
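
The abstract describes a cascade in which cheap fingerprints discard most non-similar images early and more robust comparisons run only on the few remaining candidates. The abstract does not say which fingerprints or thresholds the thesis uses, so the following Python sketch is only an illustration of that general idea under stated assumptions. The names `Fingerprint`, `fingerprint`, and `find_near_duplicates`, the three stages (mean brightness, an 8x8 average hash, and a vector comparison), and all threshold values are hypothetical. In particular, the intensity histogram is a simple stand-in for the deep-learning fingerprint, which cannot be reproduced from this record.

```python
from dataclasses import dataclass
from math import sqrt

# Assumption: an image is a grayscale grid of ints in 0..255, at least 8x8.
Image = list[list[int]]


@dataclass(frozen=True)
class Fingerprint:
    mean: float                 # global feature: average brightness
    ahash: int                  # 64-bit average hash over an 8x8 grid
    vector: tuple[float, ...]   # placeholder for a deep-model embedding


def block_means(img: Image, grid: int = 8) -> list[float]:
    """Downsample to grid x grid cells by averaging pixel blocks."""
    h, w = len(img), len(img[0])
    means = []
    for gy in range(grid):
        for gx in range(grid):
            rows = range(gy * h // grid, (gy + 1) * h // grid)
            cols = range(gx * w // grid, (gx + 1) * w // grid)
            block = [img[y][x] for y in rows for x in cols]
            means.append(sum(block) / len(block))
    return means


def fingerprint(img: Image) -> Fingerprint:
    """Compute all fingerprints once, so they can be stored in an index."""
    cells = block_means(img)
    mean = sum(cells) / len(cells)
    ahash = 0
    for c in cells:
        ahash = (ahash << 1) | int(c > mean)  # 1 bit per cell
    # Stand-in for a learned embedding: unit-length 16-bin histogram.
    hist = [0] * 16
    for row in img:
        for p in row:
            hist[p // 16] += 1
    norm = sqrt(sum(v * v for v in hist))
    return Fingerprint(mean, ahash, tuple(v / norm for v in hist))


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


def find_near_duplicates(query: Image, index: dict[str, Fingerprint],
                         max_mean_diff: float = 10.0,
                         max_hamming: int = 8,
                         min_cosine: float = 0.95) -> list[str]:
    """Apply successive filters, cheapest first; thresholds are illustrative."""
    q = fingerprint(query)
    # Stage 1: one subtraction per image discards clearly different images.
    cands = [k for k, f in index.items()
             if abs(f.mean - q.mean) <= max_mean_diff]
    # Stage 2: Hamming distance between 64-bit hashes.
    cands = [k for k in cands
             if hamming(index[k].ahash, q.ahash) <= max_hamming]
    # Stage 3: cosine similarity of unit vectors, on the few survivors only.
    return [k for k in cands
            if sum(a * b for a, b in zip(index[k].vector, q.vector))
            >= min_cosine]
```

In this sketch the index is a plain dictionary from image identifier to precomputed fingerprint, standing in for the separate fingerprint database mentioned in the abstract. Only the query image is fingerprinted at query time.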

Files in this item:

File	Size	Format
DCUCEA10120FT.pdf	5.86 MB	Adobe PDF
