ARCHITECTING RETAIL-SCALE VECTOR STORE SYSTEMS FOR AGENTIC GENERATIVE AI
DOI:
https://doi.org/10.34218/IJCET_17_01_001
Keywords:
Retail AI Systems, Vector Stores, Agentic Generative AI, Retrieval-Augmented Generation, Big Data Pipelines, Approximate Nearest Neighbor Search
Abstract
Modern retail-facing generative AI systems increasingly rely on retrieval-augmented generation (RAG) and agentic execution models to deliver accurate, context-aware responses across product catalogs, technical documentation, and operational metadata. At retail scale, these systems must support semantic retrieval over heterogeneous and continuously evolving knowledge while maintaining predictable latency and operational reliability. This paper presents a big-data-driven architecture for vector store systems designed to support retail-scale agentic generative AI. The architecture integrates batch-oriented ingestion pipelines with incremental updates, document-aware chunking, multimodal embedding generation, and scalable approximate nearest-neighbor indexing. Vector stores are treated as core, governed infrastructure components and are accessed by agents through explicit retrieval tools rather than implicit prompt augmentation. Using representative synthetic benchmarks derived from retail workloads, we analyze ingestion throughput, retrieval latency, and agent-level efficiency across large embedding corpora constructed from product attributes and visually rich documents. The results show that, in most retail scenarios, chunk structure, metadata filtering, and retrieval precision have a greater impact on end-to-end agent behavior than marginal index-level latency improvements.
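To make the "explicit retrieval tool" pattern concrete, the following is a minimal, hypothetical sketch of a vector store exposing a search operation that an agent could invoke as a tool, with metadata filtering applied before nearest-neighbor ranking. All names (`VectorStore`, `search`, the `filters` parameter) are illustrative assumptions, not APIs from the paper, and brute-force cosine similarity stands in for a production approximate nearest-neighbor index.

```python
import math

class VectorStore:
    """Toy in-memory vector store; a real deployment would use an ANN index."""

    def __init__(self):
        self.items = []  # list of (doc_id, vector, metadata) tuples

    def add(self, doc_id, vector, metadata):
        self.items.append((doc_id, vector, metadata))

    @staticmethod
    def _cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(x * x for x in b))
        return dot / (na * nb) if na and nb else 0.0

    def search(self, query_vec, k=3, filters=None):
        # Metadata filtering narrows the candidate set before similarity
        # ranking, reflecting the abstract's point that filtering and
        # retrieval precision dominate raw index-level latency gains.
        candidates = [
            (doc_id, vec, meta)
            for doc_id, vec, meta in self.items
            if not filters or all(meta.get(f) == v for f, v in filters.items())
        ]
        ranked = sorted(candidates,
                        key=lambda item: self._cosine(query_vec, item[1]),
                        reverse=True)
        return [(doc_id, meta) for doc_id, _, meta in ranked[:k]]

# Illustrative usage: an agent queries only within a product category.
store = VectorStore()
store.add("sku-1", [1.0, 0.0], {"category": "apparel"})
store.add("sku-2", [0.9, 0.1], {"category": "electronics"})
store.add("sku-3", [0.0, 1.0], {"category": "apparel"})
hits = store.search([1.0, 0.0], k=2, filters={"category": "apparel"})
```

In an agentic setting, `search` would be registered as a callable tool so that retrieval is an explicit, auditable action rather than implicit prompt augmentation.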
License
Copyright (c) 2026 Karthik Perikala (Author)

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.