An Efficient Primary Indexing Method with Sibling Pointers for Large-Scale Database Systems

Mohammad Al Khaldy; Ameen Shaheen; Wael Alzyadat; Aysh Alhroob

doi:10.48084/etasr.11575

Authors

Mohammad Al Khaldy, Department of Business Intelligence and Data Analytics, University of Petra, Amman, Jordan, https://orcid.org/0009-0009-7502-4668
Ameen Shaheen, Department of Software Engineering, Al-Zaytoonah University, Amman, Jordan
Wael Alzyadat, Department of Software Engineering, Al-Zaytoonah University, Amman, Jordan, https://orcid.org/0000-0002-0068-1526
Aysh Alhroob, Department of Software Engineering, Al-Zaytoonah University, Amman, Jordan, https://orcid.org/0000-0002-4653-7386

Volume: 15 | Issue: 4 | Pages: 24691-24697 | August 2025 | https://doi.org/10.48084/etasr.11575

Received: 18 April 2025 | Revised: 21 May 2025 | Accepted: 25 May 2025 | Online: 2 August 2025

Corresponding author: Mohammad Al Khaldy

Abstract

Efficient data retrieval is critical in modern database systems as data volumes grow. Traditional indexing with B+ trees and bitmap indexes improves query performance, but it introduces storage overhead, handles duplicate keys inefficiently, and is difficult to scale. This study presents a new primary indexing approach that adds sibling pointers to reduce query execution time and eliminate extraneous lookups, enabling efficient management and retrieval of duplicate keys. In an experimental study on a MySQL database of up to 10 million records, the proposed approach achieved faster query execution (up to 33.5%) and lower storage overhead (up to 25%) than classical techniques. The proposed method performs consistently across query types and scales better on large databases, offering a low-cost, scalable, and storage-friendly solution for high-traffic workloads and very large datasets. Future directions include its use in distributed and cloud-based environments, with opportunities for improvement through adaptive and AI-driven indexing approaches.

Keywords:

database optimization, B tree, storage efficiency, large-scale data systems


