Automatic Identification and Description of Jewelry Through Computer Vision and Neural Networks for Translators and Interpreters

Authors
Alcalde-Llergo, José M.
Ruiz Mezcua, Aurora
Ávila Ramírez, Rocío
Zingoni, Andrea
Taborri, Juri
Yeguas-Bolívar, Enrique
Publisher
MDPI
Date
2025
Subject
Image captioning
Accessory classification
Jewelry recognition
Deep learning
Computer vision
Natural language descriptions
Abstract
Identifying jewelry pieces presents a significant challenge due to the wide range of styles and designs. Currently, precise descriptions are typically limited to industry experts. However, translators and interpreters often require a comprehensive understanding of these items. In this study, we introduce an innovative approach to automatically identify and describe jewelry using neural networks. This method enables translators and interpreters to quickly access accurate information, aiding in resolving queries and gaining essential knowledge about jewelry. Our model operates at three distinct levels of description, employing computer vision techniques and image captioning to emulate expert analysis of accessories. The key innovation involves generating natural language descriptions of jewelry across three hierarchical levels, capturing nuanced details of each piece. Different image captioning architectures are utilized to detect jewels in images and generate descriptions with varying levels of detail. To demonstrate the effectiveness of our approach in recognizing diverse types of jewelry, we assembled a comprehensive database of accessory images. The evaluation process involved comparing various image captioning architectures, focusing particularly on the encoder–decoder model, crucial for generating descriptive captions. After thorough evaluation, our final model achieved a captioning accuracy exceeding 90%.
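The abstract's core mechanism is an encoder–decoder image captioner: an encoder condenses the jewelry image into a feature vector, and a decoder emits a description token by token. The sketch below is a minimal, hypothetical illustration of that pattern with greedy decoding; the toy vocabulary, random weights, and dimensions are assumptions for demonstration only and do not reflect the authors' trained model.

```python
import numpy as np

# Hypothetical toy vocabulary; the paper's real vocabulary is not given.
VOCAB = ["<start>", "<end>", "a", "gold", "ring", "with", "gemstone"]
W2I = {w: i for i, w in enumerate(VOCAB)}

rng = np.random.default_rng(0)
EMB = rng.normal(size=(len(VOCAB), 8))      # token embeddings (untrained)
W_h = rng.normal(size=(8 + 16 + 8, 8))      # recurrent state update weights
W_out = rng.normal(size=(8, len(VOCAB)))    # projection to vocabulary logits

def encode(image_features: np.ndarray) -> np.ndarray:
    """Stand-in encoder: mean-pool a CNN feature map into one 16-d vector."""
    return image_features.mean(axis=0)

def greedy_caption(image_features: np.ndarray, max_len: int = 10) -> list[str]:
    """Decode a caption greedily, one argmax token per step."""
    ctx = encode(image_features)            # fixed image context vector
    h = np.zeros(8)                         # decoder hidden state
    tok = W2I["<start>"]
    words = []
    for _ in range(max_len):
        x = EMB[tok]                        # embed previously emitted token
        h = np.tanh(np.concatenate([x, ctx, h]) @ W_h)  # update state
        tok = int(np.argmax(h @ W_out))     # pick most likely next token
        if VOCAB[tok] == "<end>":
            break
        words.append(VOCAB[tok])
    return words

feats = rng.normal(size=(49, 16))           # e.g. a 7x7 CNN feature map
print(greedy_caption(feats))
```

With trained weights, the same loop would produce the hierarchical descriptions the abstract refers to; the three levels of detail could be obtained by training separate decoders (or conditioning one decoder) on captions of increasing specificity.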
