Background
I have a processing chain with 3 steps. I am going to design my application for very high throughput.
Getting into details
The system solves incoming tasks. Each step of the processing chain (A, B and C) has an input and an output:
A's input is a task to be solved. A's output is a list of sub tasks to be solved. A produces multiple outputs for a single input (all related to the same task).
B's input is a task to be solved. B's output is a single task targeted at C.
C's input is a list of messages, aggregated by the "parent task". Once all the items for a specific task are completely solved, C marks the task as completed.
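To make the shape of the chain concrete, here is a minimal sketch of the three steps as plain Python functions. All names (`SubTask`, `Result`, `step_a`, `step_b`, `step_c`) and the "split on whitespace" and "upper-case" behaviour are assumptions made purely for illustration; only the fan-out in A, the one-to-one mapping in B and the grouping by parent task in C come from the description above.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class SubTask:
    task_id: str  # the "parent task" this item belongs to
    index: int    # position of this item among A's outputs
    payload: str


@dataclass(frozen=True)
class Result:
    task_id: str
    index: int
    value: str


def step_a(task_id: str, payload: str) -> List[SubTask]:
    """A: one task in, several sub tasks out (one per word, illustrative only)."""
    return [SubTask(task_id, i, word) for i, word in enumerate(payload.split())]


def step_b(sub: SubTask) -> Result:
    """B: one sub task in, exactly one message for C out."""
    return Result(sub.task_id, sub.index, sub.payload.upper())


def step_c(results: List[Result]) -> Dict[str, List[Result]]:
    """C: group incoming messages by their parent task."""
    grouped: Dict[str, List[Result]] = {}
    for result in results:
        grouped.setdefault(result.task_id, []).append(result)
    return grouped
```

Note that nothing in `Result` tells C how many siblings to expect, which is where the difficulty described later comes from.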
One possible architecture, using Google Cloud, is to write a Google Cloud Storage object into a bucket for every new incoming task. Turn on a Google Cloud Function notification for each newly created storage object. This function will do the work of A (from the processing chain). Its output will be written into a different bucket, which will fire another Function notification (B). The output of B will be written into a 3rd bucket for processing by C.
Note: when a function finishes processing a task, it also deletes the corresponding object.
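The bucket-to-function flow can be simulated locally to reason about it. The sketch below is an in-memory stand-in, not the Cloud Storage or Cloud Functions API: the `Bucket` class, its `on_create` hook, the bucket names and the comma-separated payload are all hypothetical. It only mirrors the idea that writing an object fires a function, and that each function deletes its input when done.

```python
from typing import Callable, Dict, Optional


class Bucket:
    """In-memory stand-in for a bucket with an 'object created' trigger."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.objects: Dict[str, str] = {}
        self.on_create: Optional[Callable[["Bucket", str], None]] = None

    def write(self, key: str, data: str) -> None:
        self.objects[key] = data
        if self.on_create is not None:
            self.on_create(self, key)  # simulates the notification firing

    def delete(self, key: str) -> None:
        del self.objects[key]


bucket_a = Bucket("tasks")
bucket_b = Bucket("sub-tasks")
bucket_c = Bucket("results")


def function_a(bucket: Bucket, key: str) -> None:
    """A: split one task into several objects in the second bucket."""
    for i, item in enumerate(bucket.objects[key].split(",")):
        bucket_b.write(f"{key}/{i}", item)
    bucket.delete(key)  # the function deletes its input at the end


def function_b(bucket: Bucket, key: str) -> None:
    """B: turn one sub task into one object in the third bucket."""
    bucket_c.write(key, bucket.objects[key].upper())
    bucket.delete(key)


bucket_a.on_create = function_a
bucket_b.on_create = function_b

# One incoming task with three items ends up as three objects in bucket C.
bucket_a.write("task-1", "a,b,c")
```

In this simulation the parent task id survives only as a key prefix (`task-1/0`, `task-1/1`, ...), so bucket C can group objects by task but cannot tell from the objects alone whether more are still on the way.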
Let's assume that a specific task produced 10 items for Function B to process. So, in bucket C you will eventually find 10 different objects. Function C's mission is to detect the exact moment when ALL the items (A's output) for a specific task have been completely processed. Once all the items have been processed, C has to mark the task as completed.
The Problem
Sounds like we have to count how many outputs A had, and compare it to how many inputs C had.
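For reference, the counting approach described above could look roughly like the following. This is a sketch under assumptions: `CompletionTracker`, `expect` and `record` are hypothetical names, and the state lives in process memory, whereas independent function instances would need some shared store instead. Seen items are kept in a set rather than a plain counter so that, if a notification happens to be delivered more than once, the same item is not counted twice.

```python
from typing import Dict, Set


class CompletionTracker:
    """Compares the number of outputs A announced with the items C has seen."""

    def __init__(self) -> None:
        self.expected: Dict[str, int] = {}
        self.seen: Dict[str, Set[str]] = {}

    def expect(self, task_id: str, count: int) -> None:
        """Called by A: announce how many outputs this task produced."""
        self.expected[task_id] = count

    def record(self, task_id: str, item_key: str) -> bool:
        """Called by C for each input; returns True once the task is complete."""
        self.seen.setdefault(task_id, set()).add(item_key)
        return self.is_complete(task_id)

    def is_complete(self, task_id: str) -> bool:
        expected = self.expected.get(task_id)
        if expected is None:
            return False  # A has not announced a count for this task
        return len(self.seen.get(task_id, set())) == expected
```

A would call `expect("task-1", 10)` after fanning out, and C would call `record("task-1", key)` for every object it receives, marking the task as completed the first time `record` returns `True`.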
Is it possible to change the system design to avoid the need for "counting messages"?