Databases

Authors and titles for recent submissions

  • Tue, 10 Feb 2026
  • Mon, 9 Feb 2026
  • Fri, 6 Feb 2026
  • Thu, 5 Feb 2026
  • Wed, 4 Feb 2026

Total of 43 entries

Tue, 10 Feb 2026 (showing 16 of 16 entries)

[1] arXiv:2602.08588 [pdf, html, other]
Title: MMTS-BENCH: A Comprehensive Benchmark for Time Series Understanding and Reasoning
Yao Yin, Zhenyu Xiao, Musheng Li, Yiwen Liu, Sutong Nan, Yiting He, Ruiqi Wang, Zhenwei Zhang, Qingmin Liao, Yuantao Gu
Subjects: Databases (cs.DB)
[2] arXiv:2602.08546 [pdf, html, other]
Title: Semantics and Multi-Query Optimization Algorithms for the Analyze Operator
Marios Iakovidis, Panos Vassiliadis
Comments: A short version of this paper has been accepted at DOLAP 2026: M. Iakovidis, P. Vassiliadis. Multi-Query Optimization for the novel Analyze Operator. Accepted at 28th International Workshop on Design, Optimization, Languages and Analytical Processing of Big Data (DOLAP) co-located with EDBT/ICDT 2026, Tampere, Finland - March 24, 2026
Subjects: Databases (cs.DB)
[3] arXiv:2602.08482 [pdf, html, other]
Title: CLEAR: A Knowledge-Centric Vessel Trajectory Analysis Platform
Hengyu Liu, Tianyi Li, Haoyu Wang, Kristian Torp, Yushuai Li, Tiancheng Zhang, Torben Bach Pedersen, Christian S. Jensen
Comments: 4 pages, and 5 Figures
Subjects: Databases (cs.DB); Artificial Intelligence (cs.AI)
[4] arXiv:2602.08320 [pdf, html, other]
Title: Making Databases Searchable with Deep Context
Alekh Jindal, Shi Qiao, Shivani Tripathi, Niloy Debnath, Kunal Singh, Pushpanjali Nema, Sharath Prakash, Aditya Halder, Ronith PR, Sadiq Mohammed, Abdul Hameed, Karan Hanswadkar, Ayush Kshitij, Sarthak Bhatt, Rony Chatterjee, Jyoti Pandey, Christina Pavlopoulou, Ravi Shetye
Subjects: Databases (cs.DB)
[5] arXiv:2602.08226 [pdf, html, other]
Title: ByteHouse: A Cloud-Native OLAP Engine with Incremental Computation and Multi-Modal Retrieval
Yuxing Han, Yu Lin, Yifeng Dong, Xuanhe Zhou, Xindong Peng, Xinhui Tian, Zhiyuan You, Yingzhong Guo, Xi Chen, Weiping Qu, Tao Meng, Dayue Gao, Haoyu Wang, Liuxi Wei, Huanchen Zhang, Fan Wu
Subjects: Databases (cs.DB)
[6] arXiv:2602.08190 [pdf, html, other]
Title: ZipFlow: a Compiler-based Framework to Unleash Compressed Data Movement for Modern GPUs
Gwangoo Yeo, Zhiyang Shen, Wei Cui, Matteo Interlandi, Rathijit Sen, Bailu Ding, Qi Chen, Minsoo Rhu
Subjects: Databases (cs.DB); Hardware Architecture (cs.AR); Distributed, Parallel, and Cluster Computing (cs.DC)
[7] arXiv:2602.08186 [pdf, html, other]
Title: Nexus: Inferring Join Graphs from Metadata Alone via Iterative Low-Rank Matrix Completion
Tianji Cong, Yuanyuan Tian, Andreas Mueller, Rathijit Sen, Yeye He, Fotis Psallidas, Shaleen Deep, H. V. Jagadish
Subjects: Databases (cs.DB); Artificial Intelligence (cs.AI)
[8] arXiv:2602.07612 [pdf, html, other]
Title: How to evaluate NoSQL Database Paradigms for Knowledge Graph Processing
Rosario Napoli, Antonio Celesti, Massimo Villari, Maria Fazio
Comments: Accepted at the IEEE/ACM 12th International Conference on Big Data Computing, Applications and Technologies (BDCAT 2025)
Journal-ref: Proceedings of the IEEE/ACM 12th International Conference on Big Data Computing, Applications and Technologies (BDCAT 2025), Article No. 6, pp 1, 10
Subjects: Databases (cs.DB)
[9] arXiv:2602.07584 [pdf, html, other]
Title: Building an OceanBase-based Distributed Nearly Real-time Analytical Processing Database System
Quanqing Xu, Chuanhui Yang, Ruijie Li, Dongdong Xie, Hui Cao, Yi Xiao, Junquan Chen, Yanzuo Wang, Saitong Zhao, Fusheng Han, Bin Liu, Guoping Wang, Yuzhong Zhao, Mingqiang Zhuang
Subjects: Databases (cs.DB)
[10] arXiv:2602.07371 [pdf, html, other]
Title: DeepPrep: An LLM-Powered Agentic System for Autonomous Data Preparation
Meihao Fan, Ju Fan, Yuxin Zhang, Shaolei Zhang, Xiaoyong Du, Jie Song, Peng Li, Fuxin Jiang, Tieying Zhang, Jianjun Chen
Subjects: Databases (cs.DB)
[11] arXiv:2602.07336 [pdf, html, other]
Title: Learned Query Optimizer in Alibaba MaxCompute: Challenges, Analysis, and Solutions
Lianggui Weng, Dandan Liu, Wenzhuang Zhu, Rong Zhu, Junzheng Zheng, Bolin Ding, Zhiguo Zhang, Jingren Zhou
Comments: 17 pages, 16 figures
Subjects: Databases (cs.DB)
[12] arXiv:2602.07303 [pdf, html, other]
Title: KRONE: Hierarchical and Modular Log Anomaly Detection
Lei Ma, Jinyang Liu, Tieying Zhang, Peter M. VanNostrand, Dennis M. Hofmann, Lei Cao, Elke A. Rundensteiner, Jianjun Chen
Subjects: Databases (cs.DB); Artificial Intelligence (cs.AI); Software Engineering (cs.SE)
[13] arXiv:2602.08793 (cross-list from cs.CL) [pdf, html, other]
Title: LakeHopper: Cross Data Lakes Column Type Annotation through Model Adaptation
Yushi Sun, Xujia Li, Nan Tang, Quanqing Xu, Chuanhui Yang, Lei Chen
Subjects: Computation and Language (cs.CL); Databases (cs.DB)
[14] arXiv:2602.08590 (cross-list from cs.LG) [pdf, html, other]
Title: SDFed: Bridging Local Global Discrepancy via Subspace Refinement and Divergence Control in Federated Prompt Learning
Yicheng Di, Wei Yuan, Tieke He, Zhanjie Zhang, Ao Ma, Yuan Liu, Hongzhi Yin
Comments: 13 pages, 6 figures
Subjects: Machine Learning (cs.LG); Databases (cs.DB)
[15] arXiv:2602.07721 (cross-list from cs.LG) [pdf, html, other]
Title: ParisKV: Fast and Drift-Robust KV-Cache Retrieval for Long-Context LLMs
Yanlin Qi, Xinhang Chen, Huiqiang Jiang, Qitong Wang, Botao Peng, Themis Palpanas
Comments: 25 pages, 16 figures. Under review
Subjects: Machine Learning (cs.LG); Computation and Language (cs.CL); Databases (cs.DB)
[16] arXiv:2602.07517 (cross-list from cs.CR) [pdf, other]
Title: MemPot: Defending Against Memory Extraction Attack with Optimized Honeypots
Yuhao Wang, Shengfang Zhai, Guanghao Jin, Yinpeng Dong, Linyi Yang, Jiaheng Zhang
Subjects: Cryptography and Security (cs.CR); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Databases (cs.DB)

Mon, 9 Feb 2026 (showing 2 of 2 entries)

[17] arXiv:2602.06721 [pdf, other]
Title: Filtered Approximate Nearest Neighbor Search Cost Estimation
Wenxuan Xia, Mingyu Yang, Wentao Li, Wei Wang
Comments: 12 pages
Subjects: Databases (cs.DB)
[18] arXiv:2602.06594 [pdf, html, other]
Title: Machine Learning Practitioners' Views on Data Quality in Light of EU Regulatory Requirements: A European Online Survey
Yichun Wang, Kristina Irion, Paul Groth, Hazar Harmouch
Subjects: Databases (cs.DB)

Fri, 6 Feb 2026 (showing 12 of 12 entries)

[19] arXiv:2602.05944 [pdf, html, other]
Title: "Detective Work We Shouldn't Have to Do": Practitioner Challenges in Regulatory-Aligned Data Quality in Machine Learning Systems
Yichun Wang, Kristina Irion, Paul Groth, Hazar Harmouch
Subjects: Databases (cs.DB)
[20] arXiv:2602.05928 [pdf, html, other]
Title: Even Faster Geosocial Reachability Queries
Rick van der Heijden, Nikolay Yakovets, Thekla Hamm
Subjects: Databases (cs.DB)
[21] arXiv:2602.05708 [pdf, html, other]
Title: Cost-Efficient RAG for Entity Matching with LLMs: A Blocking-based Exploration
Chuangtao Ma, Zeyu Zhang, Arijit Khan, Sebastian Schelter, Paul Groth
Subjects: Databases (cs.DB); Computation and Language (cs.CL)
[22] arXiv:2602.05674 [pdf, html, other]
Title: Fast Private Adaptive Query Answering for Large Data Domains
Miguel Fuentes, Brett Mullins, Yingtai Xiao, Daniel Kifer, Cameron Musco, Daniel Sheldon
Subjects: Databases (cs.DB); Cryptography and Security (cs.CR)
[23] arXiv:2602.05651 [pdf, html, other]
Title: One Size Does NOT Fit All: On the Importance of Physical Representations for Datalog Evaluation
Nick Rassau, Felix Schuhknecht
Subjects: Databases (cs.DB)
[24] arXiv:2602.05540 [pdf, html, other]
Title: Taking the Leap: Efficient and Reliable Fine-Grained NUMA Migration in User-space
Felix Schuhknecht, Nick Rassau
Subjects: Databases (cs.DB); Operating Systems (cs.OS)
[25] arXiv:2602.05503 [pdf, html, other]
Title: Repairing Property Graphs under PG-Constraints
Christopher Spinrath, Angela Bonifati, Rachid Echahed
Comments: This paper, without the appendix, has been accepted for publication in Volume 19 of PVLDB
Subjects: Databases (cs.DB)
[26] arXiv:2602.05452 [pdf, html, other]
Title: DistillER: Knowledge Distillation in Entity Resolution with Large Language Models
Alexandros Zeakis, George Papadakis, Dimitrios Skoutas, Manolis Koubarakis
Subjects: Databases (cs.DB)
[27] arXiv:2602.04926 [pdf, html, other]
Title: Pruning Minimal Reasoning Graphs for Efficient Retrieval-Augmented Generation
Ning Wang, Kuanyan Zhu, Daniel Yuehwoon Yee, Yitang Gao, Shiying Huang, Zirun Xu, Sainyam Galhotra
Subjects: Databases (cs.DB); Computation and Language (cs.CL); Machine Learning (cs.LG)
[28] arXiv:2602.05818 (cross-list from cs.AI) [pdf, html, other]
Title: TKG-Thinker: Towards Dynamic Reasoning over Temporal Knowledge Graphs via Agentic Reinforcement Learning
Zihao Jiang, Miao Peng, Zhenyan Shan, Wenjie Xu, Ben Liu, Gong Chen, Ziqi Gao, Min Peng
Subjects: Artificial Intelligence (cs.AI); Databases (cs.DB)
[29] arXiv:2602.05792 (cross-list from cs.NI) [pdf, html, other]
Title: Data analysis of cloud virtualization experiments
Pedro R. X. do Carmo, Eduardo Freitas, Assis T. de Oliveira Filho, Judith Kelner, Djamel Sadok
Subjects: Networking and Internet Architecture (cs.NI); Databases (cs.DB)
[30] arXiv:2602.05134 (cross-list from cs.LG) [pdf, other]
Title: SemPipes -- Optimizable Semantic Data Operators for Tabular Machine Learning Pipelines
Olga Ovcharenko, Matthias Boehm, Sebastian Schelter
Subjects: Machine Learning (cs.LG); Databases (cs.DB)

Thu, 5 Feb 2026 (showing 8 of 8 entries)

[31] arXiv:2602.04430 [pdf, html, other]
Title: The Stretto Execution Engine for LLM-Augmented Data Systems
Gabriele Sanmartino, Matthias Urban, Paolo Papotti, Carsten Binnig
Subjects: Databases (cs.DB)
[32] arXiv:2602.04314 [pdf, other]
Title: Identifying knowledge gaps in biodiversity data and their determinants at the regional level
Didier Alard (BioGeCo), Anaïs Guéry
Subjects: Databases (cs.DB)
[33] arXiv:2602.04261 [pdf, html, other]
Title: Data Agents: Levels, State of the Art, and Open Problems
Yuyu Luo, Guoliang Li, Ju Fan, Nan Tang
Journal-ref: SIGMOD 2026 Tutorial
Subjects: Databases (cs.DB)
[34] arXiv:2602.04190 [pdf, html, other]
Title: LatentTune: Efficient Tuning of High Dimensional Database Parameters via Latent Representation Learning
Sein Kwon, Youngwan Jo, Seungyeon Choi, Jieun Lee, Huijun Jin, Sanghyun Park
Comments: 11 pages
Subjects: Databases (cs.DB)
[35] arXiv:2602.04181 [pdf, html, other]
Title: Piece of CAKE: Adaptive Execution Engines via Microsecond-Scale Learning
Zijie Zhao, Ryan Marcus
Subjects: Databases (cs.DB); Machine Learning (cs.LG)
[36] arXiv:2602.04029 [pdf, html, other]
Title: PluRel: Synthetic Data unlocks Scaling Laws for Relational Foundation Models
Vignesh Kothapalli, Rishabh Ranjan, Valter Hudovernik, Vijay Prakash Dwivedi, Johannes Hoffart, Carlos Guestrin, Jure Leskovec
Comments: Code: this https URL
Subjects: Databases (cs.DB); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[37] arXiv:2602.04004 [pdf, html, other]
Title: StraTyper: Automated Semantic Type Discovery and Multi-Type Annotation for Dataset Collections
Christos Koutras, Juliana Freire
Subjects: Databases (cs.DB)
[38] arXiv:2602.04068 (cross-list from cs.LG) [pdf, html, other]
Title: An Empirical Survey and Benchmark of Learned Distance Indexes for Road Networks
Gautam Choudhary, Libin Zhou, Yeasir Rayhan, Walid G. Aref
Comments: Preprint (Under Review). 14 pages, 2 figures
Subjects: Machine Learning (cs.LG); Databases (cs.DB)

Wed, 4 Feb 2026 (showing 5 of 5 entries)

[39] arXiv:2602.03278 [pdf, other]
Title: A Pipeline for ADNI Resting-State Functional MRI Processing and Quality Control
Saige Rutherford, Zeshawn Zahid, Robert C. Welsh, Andrea Avena-Koenigsberger, Vincent Koppelmans, Amanda F. Mejia
Subjects: Databases (cs.DB)
[40] arXiv:2602.03189 [pdf, html, other]
Title: StreamShield: A Production-Proven Resiliency Solution for Apache Flink at ByteDance
Yong Fang, Yuxing Han, Meng Wang, Yifan Zhang, Yue Ma, Chi Zhang
Subjects: Databases (cs.DB); Distributed, Parallel, and Cluster Computing (cs.DC)
[41] arXiv:2602.03069 [pdf, html, other]
Title: Skill-Based Autonomous Agents for Material Creep Database Construction
Yue Wu, Tianhao Su, Shunbo Hu, Deng Pan
Subjects: Databases (cs.DB)
[42] arXiv:2602.02999 [pdf, html, other]
Title: ResQ: Realistic Performance-Aware Query Generation
Zhengle Wang, Yanfei Zhang, Chunwei Liu
Comments: 13 pages, 4 figures
Subjects: Databases (cs.DB)
[43] arXiv:2602.03633 (cross-list from cs.CL) [pdf, other]
Title: BIRDTurk: Adaptation of the BIRD Text-to-SQL Dataset to Turkish
Burak Aktaş, Mehmet Can Baytekin, Süha Kağan Köse, Ömer İlbilgi, Elif Özge Yılmaz, Çağrı Toraman, Bilge Kaan Görür
Comments: Accepted by EACL 2026 SIGTURK
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Databases (cs.DB)