ARCHITECTING RETAIL-SCALE VECTOR STORE SYSTEMS FOR AGENTIC GENERATIVE AI
DOI:
https://doi.org/10.34218/IJCET_17_01_001
Keywords:
Retail AI Systems, Vector Stores, Agentic Generative AI, Retrieval-Augmented Generation, Big Data Pipelines, Approximate Nearest Neighbor Search
Abstract
Modern retail-facing generative AI systems increasingly rely on retrieval-augmented generation (RAG) and agentic execution models to deliver accurate, context-aware responses across product catalogs, technical documentation, and operational metadata. At retail scale, these systems must support semantic retrieval over heterogeneous and continuously evolving knowledge while maintaining predictable latency and operational reliability. This paper presents a big-data-driven architecture for vector store systems designed to support retail-scale agentic generative AI. The architecture integrates batch-oriented ingestion pipelines with incremental updates, document-aware chunking, multimodal embedding generation, and scalable approximate nearest-neighbor indexing. Vector stores are treated as core, governed infrastructure components and are accessed by agents through explicit retrieval tools rather than implicit prompt augmentation. Using representative synthetic benchmarks derived from retail workloads, we analyze ingestion throughput, retrieval latency, and agent-level efficiency across large embedding corpora constructed from product attributes and visually rich documents. The results show that, in most retail scenarios, chunk structure, metadata filtering, and retrieval precision have a greater impact on end-to-end agent behavior than marginal index-level latency improvements.
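To make the "explicit retrieval tool" pattern concrete, the following is a minimal, hypothetical sketch of a vector store exposing a search operation that an agent could invoke as a tool, with metadata filtering applied before nearest-neighbor ranking. All names (`VectorStore`, `search`, the `filters` parameter) are illustrative assumptions, not APIs from the paper, and brute-force cosine similarity stands in for a production approximate nearest-neighbor index.

```python
import math

class VectorStore:
    """Toy in-memory vector store; a real deployment would use an ANN index."""

    def __init__(self):
        self.items = []  # list of (doc_id, vector, metadata) tuples

    def add(self, doc_id, vector, metadata):
        self.items.append((doc_id, vector, metadata))

    @staticmethod
    def _cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(x * x for x in b))
        return dot / (na * nb) if na and nb else 0.0

    def search(self, query_vec, k=3, filters=None):
        # Metadata filtering narrows the candidate set before similarity
        # ranking, reflecting the abstract's point that filtering and
        # retrieval precision dominate raw index-level latency gains.
        candidates = [
            (doc_id, vec, meta)
            for doc_id, vec, meta in self.items
            if not filters or all(meta.get(f) == v for f, v in filters.items())
        ]
        ranked = sorted(candidates,
                        key=lambda item: self._cosine(query_vec, item[1]),
                        reverse=True)
        return [(doc_id, meta) for doc_id, _, meta in ranked[:k]]

# Illustrative usage: an agent queries only within a product category.
store = VectorStore()
store.add("sku-1", [1.0, 0.0], {"category": "apparel"})
store.add("sku-2", [0.9, 0.1], {"category": "electronics"})
store.add("sku-3", [0.0, 1.0], {"category": "apparel"})
hits = store.search([1.0, 0.0], k=2, filters={"category": "apparel"})
```

In an agentic setting, `search` would be registered as a callable tool so that retrieval is an explicit, auditable action rather than implicit prompt augmentation.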
License
Copyright (c) 2026 Karthik Perikala (Author)

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.