Disclaimer

Workflow engines are advanced tools, and I have seen many projects where they created real business value. They can simplify development. In this article I do not argue that you should not use them. My goal is to show that there is an alternative: transforming a workflow into a set of services that can be distributed later.

Introduction

People like their comfort zone. They feel good when they find a pattern that suits their needs. When they see a diagram representing a workflow, they assume the best choice is a workflow engine. A workflow engine does solve many problems. It lets you split a problem into multiple subtasks (steps) called activities. It can handle long-running processes by providing hydration and dehydration mechanisms, that is, persisting an idle workflow instance to storage and reloading it when it resumes, so you do not have to worry about server restarts. It lets you develop a complicated business data flow in a single place using a dedicated tool. Your flow is presented as a nice diagram, which is worth more than a thousand words 😉.

Sample Workflow

The diagram is compiled (validated) by your development environment, which means it is verified automatically at some level of abstraction. To exchange data between activities in a uniform way, a kind of global object called the workflow context is introduced. Each activity or state of the workflow is recorded in dedicated database storage, so it is traceable. Your workflow can be executed on multiple machines, distributed by the workflow engine. Workflows are great and let you solve a lot of problems.

Everything looks fine until you focus on a few aspects. By using workflows you create a monolithic work manager at the top level. By creating sub-workflows you create monoliths at the lower levels as well. Many of the workflow engines I have seen are not source-control friendly: the problem is the diagram design part, because diagrams are stored in binary or complicated XML form. Dedicated workflow products are also not very popular among developers, so they require more specialized ‘experts’ to work with them.

Workflows give you many built-in mechanisms such as scalability, state storage, and traceability, but they can become a bottleneck in the development process after a few years. Editing growing diagrams with sub-diagrams is complicated, especially when source control support is poor. Without a source control mechanism you lose the traceability of software changes.

The monolithic architecture of workflows can also become a problem when you want to delegate part of the work outside. There are solutions that let you delegate workflow activities to external services. As a consequence you split the work between developers (workflow and service), but you also duplicate logic. Each activity that calls an external service becomes a translator between the extracted service and the workflow context. This violates the Single Responsibility Principle, because each change has to be implemented twice: in the activity and in the external service.

Alternative Solution – Service Bus.

Let’s summarize the matter so far. The main workflow components are activities (states), represented by boxes (graph nodes), and transitions, represented by arrows (graph edges). The workflow context is the data exchange object passed between activities.

Let’s look at the workflow from a different angle. If each activity became a process (service) and each transition arrow became a message carrying the data transferred between activities, your workflow would be a distributed system by design. The only thing you have to guarantee is durable asynchronous message delivery (I know frameworks that do that). Each workflow activity is a service. Each state transition is a message. All activities can run in parallel and be optimized independently. In other words, by design you split your monolith into a multi-component architecture.

You may say that you have lost the whole workflow picture (which is worth more than a thousand words). But don’t worry. If you audit all messages in the system and attach a correlation ID to the initial message, one that every follow-up message carries, you can still manage the whole process as a workflow from the business perspective. A diagram is just a set of nodes and edges. You can generate it with tools that read your service interconnection configuration, so you are not forced to draw diagrams yourself. There are many tools that visualize your service infrastructure and show you the data flow. With a correlation ID on the input message (the initial state) you can see not only how the flow was originally designed, but also which interactions the system actually performs for specific input data. To use a metaphor: instead of a class diagram you see an object diagram in action (see Particular Software > Sequence Diagram).

Because your workflow diagram is generated from code and configuration, you can see exactly whether the code meets the business needs. You get a cross-check between what was designed by the Business Analyst and what was built by the Development Team. Once the workflow is transformed into an independent set of activities, the code of each activity can also be written independently. And because an activity turned into a service is an isolated unit of work that communicates with the outside world through messages, you can test it independently.
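
To make the idea more concrete, here is a minimal Python sketch of activities turned into services, transitions turned into messages, and a correlation ID used to rebuild the flow from an audit log. Everything in it is an assumption made for illustration: the InMemoryBus class, the message kinds and the three services are hypothetical, and the bus keeps messages only in memory, so it does not provide the durable delivery a real message bus must guarantee.

```python
from collections import deque
from dataclasses import dataclass
import uuid


@dataclass
class Message:
    kind: str             # message type, e.g. "OrderPlaced" (hypothetical)
    payload: dict
    correlation_id: str   # shared by every message of one business process
    sender: str = "client"


class InMemoryBus:
    """Toy stand-in for a message bus. Not durable: illustration only."""

    def __init__(self):
        self.handlers = {}     # message kind -> (service name, handler)
        self.pending = deque()
        self.audit = []        # (correlation_id, sender, kind, receiver)

    def subscribe(self, kind, service, handler):
        self.handlers[kind] = (service, handler)

    def publish(self, message):
        self.pending.append(message)

    def run(self):
        # Deliver messages one by one and audit every delivery.
        while self.pending:
            msg = self.pending.popleft()
            service, handler = self.handlers.get(msg.kind, (None, None))
            self.audit.append((msg.correlation_id, msg.sender, msg.kind, service))
            if handler is not None:
                for out in handler(msg):
                    self.publish(out)


# Each former workflow activity is an isolated function: message in, messages out.
def validate_order(msg):
    return [Message("OrderValidated", msg.payload, msg.correlation_id, "validation")]


def charge_payment(msg):
    paid = {**msg.payload, "paid": True}
    return [Message("PaymentCharged", paid, msg.correlation_id, "payment")]


def ship_order(msg):
    return []  # last step, emits nothing


def trace(bus, correlation_id):
    """Rebuild the edges (sender, message, receiver) of one process instance."""
    return [(s, k, r) for c, s, k, r in bus.audit if c == correlation_id]


bus = InMemoryBus()
bus.subscribe("OrderPlaced", "validation", validate_order)
bus.subscribe("OrderValidated", "payment", charge_payment)
bus.subscribe("PaymentCharged", "shipping", ship_order)

order_id = str(uuid.uuid4())
bus.publish(Message("OrderPlaced", {"order": 1}, order_id))
bus.publish(Message("OrderPlaced", {"order": 2}, str(uuid.uuid4())))
bus.run()

for sender, kind, receiver in trace(bus, order_id):
    print(f"{sender} --{kind}--> {receiver}")
```

The printed edges are the diagram of one concrete process instance, filtered out of the interleaved traffic by its correlation ID. Each service function can also be tested on its own by passing it a single Message and inspecting what it returns.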

A distributed set of services does not mean that you have to build a cluster of servers. You can start on a single machine, with your services communicating through local queues. When you decide that you need to scale out, you will be ready. The two key things you have to take care of are a reliable message bus for your system and a sensible way of keeping nodes and edges consistent. Fortunately, dedicated products are already on the market.
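
As a rough sketch of the single-machine starting point, the Python example below runs two services as threads in one process and connects them with local queues from the standard library. The service names, the order fields and the flat fee of 5 are made up for illustration. Note also that queue.Queue lives in memory only, so it is an assumption standing in for a reliable local queue, not a replacement for one.

```python
import queue
import threading

STOP = object()  # sentinel telling a service to shut down


def run_service(inbox, outbox, handle):
    """Consume messages from inbox until STOP, forwarding results to outbox."""
    while True:
        msg = inbox.get()
        if msg is STOP:
            outbox.put(STOP)  # pass the shutdown signal downstream
            break
        outbox.put(handle(msg))


# Hypothetical services: each one knows only its own input and output message.
def validate(order):
    return {**order, "valid": order["amount"] > 0}


def price(order):
    return {**order, "total": order["amount"] + 5}  # flat fee, made up


# Local queues play the role of the arrows between activities.
orders_q, validated_q, priced_q = queue.Queue(), queue.Queue(), queue.Queue()

workers = [
    threading.Thread(target=run_service, args=(orders_q, validated_q, validate)),
    threading.Thread(target=run_service, args=(validated_q, priced_q, price)),
]
for worker in workers:
    worker.start()

for amount in (10, 25):
    orders_q.put({"amount": amount})
orders_q.put(STOP)

for worker in workers:
    worker.join()

# Drain the final queue to collect the processed orders.
results = []
while True:
    item = priced_q.get()
    if item is STOP:
        break
    results.append(item)

print(results)
```

The services never reference each other, only their inbox and outbox. Under that assumption, scaling out means replacing the local queues with a networked message bus while the service functions stay the same.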