When you build a distributed system, it is difficult to design a good mechanism for transaction guarantees. If your communication is asynchronous, you talk to remote services through a messaging system. You probably already know that the network is not reliable, so a message can be lost. To guarantee delivery, you have to resend the message when the confirmation is missing. There are solutions that make the receiving service idempotent. What I want to share is my idea for bringing an e-mail sending service as close to idempotent as possible.
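To make the resend-and-deduplicate idea concrete, here is a minimal sketch in Python. It is not the design from the article itself. `EmailService`, `deliver`, and the `lose_ack_times` parameter are hypothetical names invented for this illustration, and the lost confirmation is only simulated.

```python
import uuid


class EmailService:
    """Hypothetical receiver that remembers which message IDs it has handled."""

    def __init__(self):
        self.processed_ids = set()
        self.sent = []  # stands in for real e-mail delivery

    def handle(self, message_id, recipient, body):
        # A resend carries the same ID, so a duplicate is confirmed but not sent again.
        if message_id not in self.processed_ids:
            self.sent.append((recipient, body))
            self.processed_ids.add(message_id)
        return True  # the confirmation


def deliver(service, recipient, body, lose_ack_times=0, max_attempts=5):
    """Resend until a confirmation arrives; the first confirmations can be 'lost'."""
    message_id = str(uuid.uuid4())  # generated once and reused on every resend
    for attempt in range(1, max_attempts + 1):
        ack = service.handle(message_id, recipient, body)
        if attempt <= lose_ack_times:
            ack = False  # simulate the confirmation getting lost on the network
        if ack:
            return attempt
    raise RuntimeError("no confirmation received")


service = EmailService()
attempts = deliver(service, "alice@example.com", "Hello", lose_ack_times=2)
print(attempts, len(service.sent))
```

The sender needed three attempts, yet only one e-mail was recorded as sent. In a real service the act of sending the e-mail and the act of recording the ID are not a single atomic step, which is presumably why the goal is to get as close as possible to idempotence rather than to reach it.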
Today I finished reading the book ‘Designing Data-Intensive Applications’ by Martin Kleppmann. I would like to share my review of it with you.
In my view, designing and developing any distributed system is hard work. My opinion is based on over a decade of experience in this field. The problem is that all modern applications are distributed in some way. If a database is hosted on a different computer than the web server, there is a communication link between them. You have two services, a database service and an application service, which share an unreliable network as their communication channel. A browser that hosts client code and connects to the web server creates a distributed system as well.
Workflow engines are very advanced tools, and I have seen many projects where they created great business value. They sometimes simplify development. In this article I do not argue that you should not use them. My goal is to show you that there is an alternative: transforming a workflow into a set of services distributed in the future.
Developers treat traditional SQL databases as fully safe storage. The ACID properties are intuitive to us and give us a sense of safety during application development. We know that if a network error occurs during a database transaction and we receive an exception, the whole transaction is rolled back as one atomic unit. To get around network issues we can retry the operation later, and that should solve the problem. This is true in most cases, but there is one where it is not so easy. Let me drill down into this subject.
Sometimes system flexibility makes the solution complicated 😉. Imagine a ‘simple’ situation: you have a system in which a client can define a new attribute and assign multiple values to it. Later, the client can assign these values to Customers. If you do not want to develop a new set of attributes every time a client wants to make a change, you will need to prepare a ‘dynamic’ structure in the database.
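One possible shape for such a ‘dynamic’ structure is sketched below with Python's built-in `sqlite3` module. It is only an illustration under assumptions: the table names (`attribute`, `attribute_value`, `customer`, `customer_attribute_value`) and the sample data are hypothetical and do not come from the article.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE attribute (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
CREATE TABLE attribute_value (
    id INTEGER PRIMARY KEY,
    attribute_id INTEGER NOT NULL REFERENCES attribute(id),
    value TEXT NOT NULL
);
CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE customer_attribute_value (
    customer_id INTEGER NOT NULL REFERENCES customer(id),
    attribute_value_id INTEGER NOT NULL REFERENCES attribute_value(id),
    PRIMARY KEY (customer_id, attribute_value_id)
);
""")

# The client defines a new attribute and its values at run time, as plain rows.
conn.execute("INSERT INTO attribute (id, name) VALUES (1, 'Segment')")
conn.executemany(
    "INSERT INTO attribute_value (id, attribute_id, value) VALUES (?, 1, ?)",
    [(1, "Retail"), (2, "Wholesale")],
)

# Later, one of those values is assigned to a customer.
conn.execute("INSERT INTO customer (id, name) VALUES (1, 'Acme')")
conn.execute("INSERT INTO customer_attribute_value VALUES (1, 2)")

rows = conn.execute("""
    SELECT c.name, a.name, v.value
    FROM customer_attribute_value cav
    JOIN customer c ON c.id = cav.customer_id
    JOIN attribute_value v ON v.id = cav.attribute_value_id
    JOIN attribute a ON a.id = v.attribute_id
""").fetchall()
print(rows)
```

New attributes become rows instead of columns, so no schema change is needed when a client adds one. The price is visible in the query: reading a single customer's attributes already takes three joins.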
Big companies have huge internal structures. Their problem is that these huge structures have to be mapped onto a permission model. One company I worked for had over 300k groups in Active Directory. As a worldwide organization, it had multiple domains in its AD forest. Of course, different groups had different memberships, so the structure was really complicated.
Connecting the front end and the back end is difficult. The main problem is the protocol itself: HTTP is stateless, and a connection may be re-created for each request. HTTP/2 changes this behavior a little bit, but only at the transport layer. Multiple requests and responses can be handled in a different order than they were sent. This means there are many traps waiting for a developer.
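One common way to guard against the ordering trap is to number each request on the client and ignore any response older than the newest one already applied. The sketch below shows that idea in Python. In a browser the same logic would be written in JavaScript or TypeScript. `LatestResponseTracker` and its methods are hypothetical names for this illustration, not part of any library.

```python
class LatestResponseTracker:
    """Applies a response only if it is newer than the last one applied."""

    def __init__(self):
        self._next_seq = 0
        self._latest_applied = -1
        self.state = None

    def new_request(self):
        # Hand out an increasing sequence number to send along with the request.
        seq = self._next_seq
        self._next_seq += 1
        return seq

    def on_response(self, seq, payload):
        # A response to an older request arrived late: drop it.
        if seq <= self._latest_applied:
            return False
        self._latest_applied = seq
        self.state = payload
        return True


tracker = LatestResponseTracker()
first = tracker.new_request()   # e.g. a search for "a"
second = tracker.new_request()  # e.g. a search for "ab"

# The responses come back in the opposite order.
applied_second = tracker.on_response(second, "results for 'ab'")
applied_first = tracker.on_response(first, "results for 'a'")
print(applied_second, applied_first, tracker.state)
```

The late response to the first request is discarded, so the user keeps seeing the results for the most recent request. This handles only the "last request wins" case. Requests that each change server state need a different treatment.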
A SQL query engine is designed to return correct results as quickly as possible. But correctness here means being correct from a mathematical perspective. The problem I will describe is obvious, yet the result is not always what you expect from a SQL engine.