An Efficient Primary Indexing Method with Sibling Pointers for Large-Scale Database Systems
Received: 18 April 2025 | Revised: 21 May 2025 | Accepted: 25 May 2025 | Online: 2 August 2025
Corresponding author: Mohammad Al Khaldy
Abstract
Efficient data retrieval is critical in modern database systems, where data volumes keep growing. Although traditional indexing with B+ trees and bitmap indexes improves query performance, it introduces storage overhead, handles duplicate keys inefficiently, and is difficult to scale. This study presents a new primary indexing approach that adds sibling pointers to the index, reducing query execution time and eliminating extraneous lookups while enabling efficient management and retrieval of duplicate keys. In an experimental study on a MySQL database of up to 10 million records, the proposed approach reduced query execution time by up to 33.5% and storage overhead by up to 25% compared to classical techniques. The proposed method performs consistently across query types and scales better on large databases, offering a low-cost, scalable, and storage-friendly indexing solution for high-traffic workloads and very large datasets. Future directions include its use in distributed and cloud-based environments, with opportunities for improvement through adaptive and AI-driven indexing approaches.
Keywords:
database optimization, B tree, storage efficiency, large-scale data systems
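The abstract does not spell out the layout of the index, so the following is only a minimal sketch of one way sibling pointers could chain duplicate keys, not the authors' implementation. It assumes that the index holds a single entry per distinct key and that each record stores a pointer to the next record with the same key. The names `Record`, `SiblingIndex`, `next_sibling`, `head`, and `tail` are hypothetical, and a Python dict stands in for the tree structure to keep the example short.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterator, List, Optional


@dataclass
class Record:
    key: int
    payload: Any
    # Assumed design: slot of the next record sharing this key, or None.
    next_sibling: Optional[int] = None


class SiblingIndex:
    """Hypothetical primary index: one entry per distinct key, duplicates chained."""

    def __init__(self) -> None:
        self.records: List[Record] = []  # stand-in for the table's storage
        self.head: Dict[int, int] = {}   # key -> slot of the first record
        self.tail: Dict[int, int] = {}   # key -> slot of the last record

    def insert(self, key: int, payload: Any) -> int:
        slot = len(self.records)
        self.records.append(Record(key, payload))
        if key in self.head:
            # Duplicate key: link from the current last sibling, no new index entry.
            self.records[self.tail[key]].next_sibling = slot
        else:
            self.head[key] = slot
        self.tail[key] = slot
        return slot

    def lookup(self, key: int) -> Iterator[Any]:
        # One index probe, then follow sibling pointers through the duplicates.
        slot = self.head.get(key)
        while slot is not None:
            record = self.records[slot]
            yield record.payload
            slot = record.next_sibling


index = SiblingIndex()
for k, v in [(7, "a"), (3, "b"), (7, "c"), (7, "d")]:
    index.insert(k, v)
print(list(index.lookup(7)))  # ['a', 'c', 'd']
```

In this sketch the index grows with the number of distinct keys rather than the number of records, and a lookup on a duplicated key probes the index once before walking the chain. Whether this matches the storage and lookup savings reported above depends on details of the actual method that the abstract does not give.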
License
Copyright (c) 2025 Mohammad Al Khaldy, Ameen Shaheen, Wael Alzyadat, Aysh Alhroob

This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors who publish with this journal agree to the following terms:
- Authors retain the copyright and grant the journal the right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) after its publication in ETASR with an acknowledgement of its initial publication in this journal.
