Today I finished reading ‘Designing Data-Intensive Applications’ by Martin Kleppmann, and I would like to share my review of it.
In my view, designing and developing any distributed system is hard work. My opinion is based on over a decade of experience in this area. The problem is that all modern applications are distributed in some way. If a database is hosted on a different computer than the web server, there is a communication link between them. You have two services, a database service and an application service, which share an unreliable network as their communication channel. The browser, which hosts the client code that connects to the web server, makes the system distributed as well.
The most difficult part for developers and architects is designing a system that is as bulletproof as possible. To do this, you need to know all the risks related to the design you choose. A developer should understand the main concept of the application and write code that does not violate its rules at the code level. You cannot handle every risk (because of time, resource, and budget constraints), but you should know them in order to design the best possible system. ‘Designing Data-Intensive Applications’ describes these risks and shows how to handle them. It explains how to prepare for designing an application and helps you choose reliable and scalable solutions. The book walks you through the implementation of the various types of databases available on the market. It shows how they store data and how they guarantee consistency and fault tolerance, and it discusses the pros and cons of each type.
‘Designing Data-Intensive Applications’ covers various problems that you may encounter when designing a distributed system, for example how to guarantee the consistency of your system and how to keep it from breaking when a failure occurs. You learn that not only is the network unreliable, but using global time (e.g. UTC) to determine the order of events is neither predictable nor safe, even if you think the clocks are well synchronized. I thought I knew a lot about the risks related to time, but I learned many new things from this book in this area as well.
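To make the point about time more concrete, here is a minimal sketch of my own (it is not taken from the book). It assumes two hypothetical nodes, `a` and `b`, where the clock of `a` runs 50 ms fast. All names, the skew value, and the timestamps are made up for illustration. Node `a` writes first and `b` writes after seeing that write, yet sorting by wall-clock timestamp puts the events in the wrong order. A simple logical counter in the style of a Lamport clock, used here as one possible alternative, keeps the causal order.

```python
from dataclasses import dataclass


@dataclass
class Event:
    node: str
    description: str
    wall_clock_ms: int  # timestamp read from the node's own, possibly skewed, clock
    counter: int        # logical counter (Lamport-style), independent of wall time


class Node:
    """Hypothetical node with a skewed wall clock and a logical counter."""

    def __init__(self, name: str, clock_skew_ms: int) -> None:
        self.name = name
        self.clock_skew_ms = clock_skew_ms
        self.counter = 0

    def _stamp(self, description: str, true_time_ms: int) -> Event:
        # The node only sees its own clock: true time plus its skew.
        local_time = true_time_ms + self.clock_skew_ms
        return Event(self.name, description, local_time, self.counter)

    def write(self, description: str, true_time_ms: int) -> Event:
        # Local event: advance the logical counter by one.
        self.counter += 1
        return self._stamp(description, true_time_ms)

    def write_after(self, seen: Event, description: str, true_time_ms: int) -> Event:
        # Event that happens after receiving `seen`: jump past its counter.
        self.counter = max(self.counter, seen.counter) + 1
        return self._stamp(description, true_time_ms)


a = Node("a", clock_skew_ms=50)  # assumption: this clock runs 50 ms fast
b = Node("b", clock_skew_ms=0)

first = a.write("x = 1", true_time_ms=1000)                # stamped 1050 by a
second = b.write_after(first, "x = 2", true_time_ms=1010)  # stamped 1010 by b

events = [first, second]
by_wall_clock = [e.description for e in sorted(events, key=lambda e: e.wall_clock_ms)]
by_counter = [e.description for e in sorted(events, key=lambda e: e.counter)]

print(by_wall_clock)  # ['x = 2', 'x = 1'] -> the later write appears to be first
print(by_counter)     # ['x = 1', 'x = 2'] -> the causal order is preserved
```

If the system resolved conflicts by keeping the value with the highest wall-clock timestamp, the later write `x = 2` would be silently discarded in this scenario, even though the clocks differ by only 50 ms.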
If you are a computer science student, a junior or an experienced developer, or a software architect with many years of experience, this book is for you. If you want to be a good engineer and take your skills to a higher level, it is for you as well. I am going to convince my colleagues in the IT sector to read it. ‘Designing Data-Intensive Applications’ collects a lot of knowledge about distributed (and not only distributed) systems in a readable form. By showing you edge cases, it gives you the opportunity to learn about known, already analyzed problems before they appear in your application. If you want to drill down into any subject described in the book, hundreds or maybe thousands of references are included, so your knowledge can keep growing through a multidimensional space of related papers.
With this review of Martin Kleppmann's book, I want to thank him for really, really good work. I believe that when IT people start reading this book, our software will become better. If you work in IT, just read it and learn how to build better systems.