Why do I need to learn about advanced database management systems?
Modern applications deal with huge, complex, and diverse data. Learning how to build a database management system equips you with the skills to design scalable, distributed, and high-performance data systems, making you a better problem solver and innovator in data management.
What can I do after finishing learning about advanced database management systems?
You will understand how data is stored, indexed, and retrieved efficiently, providing deep insight into the foundations of all database systems. You will also learn how queries are processed, transactions are managed, and consistency is maintained, which is essential for reliable, multi-user applications.
With this knowledge, you will be able to design and optimize high-performance databases tailored to specific applications. You can also develop custom or specialized data systems, such as vector, time-series, graph, or embedded databases.
Additionally, you will gain the skills to build distributed, scalable, and fault-tolerant systems, or to improve the reliability and efficiency of existing database platforms.
You can also contribute to database research, develop new query languages, design novel storage engines, or create data-intensive products in AI, analytics, or fintech.
That sounds useful! What should I do now?
First, please read this book to learn about database system concepts: Abraham Silberschatz et al. (2019). Database System Concepts. McGraw-Hill Education.
Alternatively, if you want to follow the concepts with interactive explanations, you can audit this course: CMU 15-445 – Introduction to Database Systems (Carnegie Mellon University).
After that, please read the books below to learn how to design distributed databases and understand how they work:
- Martin Kleppmann (2017). Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems. O’Reilly Media
- Alex Petrov (2019). Database Internals. O’Reilly Media
Terminology Review:
- Database Files.
- Storage Manager.
- Database Pages vs. Hardware Pages vs. OS Pages.
- Page Storage Architecture: Heap Files, Tree Files, Sequential / Sorted File Organization (ISAM), Hashing Files.
- Page Directory.
- Page Header.
- Slotted Pages.
- Tuple Layout.
- Tuple Header.
- Tuple Data.
- Record Identifiers.
- Log-Structured Storage: MemTable, SSTables, Compaction.
- Index-Organized Storage.
- Tuple Storage.
- Word-Aligned Tuples.
- Data Representation: Variable-Precision Numeric Type, Null Data Type, Large Values, Overflow Pages.
- System Catalogs.
- N-ary Storage Model (NSM).
- Decomposition Storage Model (DSM).
- Partition Attributes Across (PAX) Storage Model.
- Columnar Compression: Run-Length Encoding (RLE), Bit-Packing Encoding, Bitmap Encoding, Delta Encoding, Dictionary Encoding.
- Data-Intensive Applications.
- Graph Databases.
- Distributed Databases.
- Distributed Relational Databases.
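To make two of the columnar compression terms above concrete, here is a minimal Python sketch of run-length encoding and dictionary encoding applied to a single in-memory column. The function names (`rle_encode`, `rle_decode`, `dictionary_encode`) and the sample column are illustrative assumptions, not taken from any particular database system. Real engines work on binary pages and are considerably more involved.

```python
from itertools import groupby


def rle_encode(values):
    """Collapse consecutive runs of equal values into (value, run_length) pairs."""
    return [(value, sum(1 for _ in run)) for value, run in groupby(values)]


def rle_decode(pairs):
    """Expand (value, run_length) pairs back into the original column."""
    return [value for value, length in pairs for _ in range(length)]


def dictionary_encode(values):
    """Replace each value with a small integer code; return (dictionary, codes)."""
    dictionary = {}
    codes = []
    for value in values:
        # Codes are assigned in order of first appearance (hypothetical choice).
        code = dictionary.setdefault(value, len(dictionary))
        codes.append(code)
    return dictionary, codes


if __name__ == "__main__":
    column = ["US", "US", "US", "DE", "DE", "US"]  # made-up sample data
    print(rle_encode(column))
    print(dictionary_encode(column))
```

RLE tends to pay off when a column is sorted or has long runs of repeated values. Dictionary encoding helps when a column has few distinct values, even if they are not adjacent.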
After finishing advanced database management systems, please click on Topic 16 – Advanced Software Design to continue.