27

I have a function with a sensitive operation:

function doSomeThingSensitive() {
    customers = getAllCustomers()
    for each customer in customers
        if customer is from Europe
            Give them a 1000 euro giveaway coupon
}

I want to dry run this function in order to calculate how many sensitive operations (aka the 1000-euro giveaways) will be performed. The dry run function will look something like:

function doSomeThingSensitiveDryRun() {
    counter = 0
    customers = getAllCustomers()
    for each customer in customers
        if customer is from Europe
            counter++
    print 'we will pay ' + counter * 1000
}

The problem with this approach is that the dry run function is not maintainable at all. Any slight change in the initial function will break the logic on the dry run function, which makes its maintenance very difficult. Is there a different way I can approach this?

An idea is to run the initial function as a unit test and stub the sensitive operations. A different idea can be to run the initial function with a Boolean parameter, dryRun, and if it is set to true, the sensitive operation will not happen. Is there a better and maybe simpler way I can tackle this?

13
  • 47
    Welcome to the world of mocking. Commented Jan 25, 2022 at 21:19
  • 1
    At the heart of this problem is a query. That query is being done on objects which would be slower than many databases. However if the data were in an underlying database, you could easily answer questions like this in seconds. The main difference is whether you do a SELECT COUNT(*) or SELECT * and iterate over the records. I know not all databases are SQL, but there are ways to get the same results from all the database types I know of. Commented Jan 26, 2022 at 20:29
  • 6
    I would write two separate functions. Once computes the list of customers that should receive a coupon, the other unconditionally gives a coupon to the list of customers it receives as an argument.
    – chepner
    Commented Jan 26, 2022 at 20:51
  • 12
    So you want a DRY dry run? Commented Jan 27, 2022 at 9:00
  • 2
    @moonman239 Being a separate function means it will be hard to keep it in sync with the original function, and it would be impossible to properly verify that the changes in both functions are actually the same, and there will likely be no automatic way to enforce that a later change to real function will be reflected in the dry run version at all. It's not just difficult to maintain, it is truly impossible to maintain the dry run correctness reliably.
    – Frax
    Commented Jan 28, 2022 at 14:03

5 Answers 5

57

You're doing multiple different things in that function. Start off by splitting it into separate functions. Here's an example using a closure for the "dry run" function.

function doWithCustomers(action)
    customers = getAllCustomers()
    for each customer in customers
        if customer is from Europe
            action(customer)

function sendCoupon(customer)
    give coupon to customer

Then you can provide the appropriate action depending on whether you're testing or running in production:

function main()
    doWithCustomers(sendCoupon)

function dryRun()
    count=0
    doWithCustomers(() -> count++)
    print 'we will pay' + counter * 1000

It should be fairly trivial to convert the closure to an object if your language has the latter but lacks the former.

For more functional-oriented languages, you may want to look up the fold function (also commonly known as reduce) as a generic language built-in replacement for the doWithCustomers function. Filter and map may also help.

7
  • 5
    A further generalization od doWithCustomers(action) to doWithCustomers(condition, action) might be desirable. Though at that point, that function has basically become a combination of filter and map. Commented Jan 26, 2022 at 9:59
  • 3
    Re. functional languages: you could go one step further and write the function in question over a free monad, and have two different interpreters. Commented Jan 26, 2022 at 10:48
  • 1
    If you're working with side effects, foreach might be better than fold. Commented Jan 27, 2022 at 2:09
  • @TamoghnaChowdhury Then the filter is duplicated in each function (mock/real). Do it too much, and you're back to the original question! Commented Jan 27, 2022 at 14:58
  • 1
    I can't tell you how many times I've had to explain to coworkers that making something hard to test, hard to fail, hard to mock etc. also make it hard to maintain in general. The paradigms we use for testing are not accidents - they are there because they're the easiest way to test well-written code. You're not writing it "so it can be tested" - you're writing it "the good way" which testing frameworks have been built around!
    – corsiKa
    Commented Jan 29, 2022 at 1:28
38

There are two solutions that seem reasonable:

In practice, we might do this as follows. First, our function contains the business logic of deciding which customer receives how much. Instead of applying that action directly, we return a list of objects representing the action to be taken:

class SendCouponAction:
  customer: Customer
  value: Money

def do_something_sensitive() -> Iterable[SendCouponAction]:
  for customer in get_all_customers():
    if is_in_europe(customer):
      yield SendCouponAction(customer, Money.euro(1000))

Now normally, we would just perform the actions:

actions = list(do_something_sensitive())

for action in actions:
  coupon = generate_coupon(action.customer, action.value)
  coupon.send()

But before doing that, we can log the actions to be taken. For example, we can log the total value:

logger.info(
  "generated %d coupons worth %d in total",
  len(actions),
  sum(coupon.value for coupon in actions),
)

You could now easily make it conditional to actually carry out the action. You can also easily write unit tests for the logic that decides who receives which coupons (though there's still a dependency on the get_all_customers() function that would make tests difficult – its result could be passed as a parameter instead).

Is this a good design? Sometimes.

  • Having a “functional core” is great when possible, because such functions that do not carry out external actions are really easy to test. You don't even have to mock anything, you just provide inputs and look at the outputs.

  • But this can get problematic if carrying out the action is the more interesting part – you might have a core that is essentially empty. For example, I don't think this would be worth it for a basic CRUD application that doesn't really contain any business logic.

  • Trying to achieve such an architecture also gets really tricky if the business logic requires lots of back and forth communication with a data source. You could extract the purely functional parts, but that would end up with a tortured encoding of a state machine that obfuscates the actual business logic.

8
  • 3
    This is the best answer to the essence of OP's question, which is: "how can I faithfully audit a sensitive function?". The DI answer is "run a different function" -- but if someone added an extra 0 in the live function but not the audit function, you're hosed. With this approach, the number is only entered once, and you audit that number. This "functional core, imperative shell" approach is extremely useful in many contexts. Commented Jan 26, 2022 at 17:00
  • 2
    I'd offer that this is missing the 3rd option: mocking the sensitive function, as suggested in the top comment on the question. Commented Jan 26, 2022 at 18:01
  • 2
    @shadowtalker Mocking is equivalent to dependency injection, which I mention in my first bullet point as an alternative.
    – amon
    Commented Jan 26, 2022 at 18:03
  • 1
    Also: I think the "functional core" approach that @JounceCracklePop suggested is a special case of "algebraic effects". Instead of your program "doing things" imperatively, your program encodes branching tree of "instructions", which are then "interpreted" by effect handlers. I believe there is a lot of CS research happening on this topic, including some Haskell libraries (e.g. freer-effects) and research languages (e.g. Koka). I'd love to see an answer from someone who has experience in that area. Commented Jan 26, 2022 at 18:08
  • 2
    @shadowtalker I think the general idea is to replace control flow with data flow. The “functional core” idea is already appealing on that level. Viewing this example through the lens of effect systems is definitely interesting though, as a kind of language-level support for this pattern.
    – amon
    Commented Jan 26, 2022 at 18:17
8

I would break the operation into two. As long as it's a single operation, regardless of how you program it, there is a chance you run it live instead of dry by mistake.

Conceptually with your coupon example:

  1. produce a list of coupons, save to a database.
  2. send the tokens out to customers

Now I can run and rerun 1 all day and no money is paid out. I can check the result for accuracy, have manual intervention, etc.

When I'm happy, I can perform 2, which I'm careful to program with as little logic as possible and all the checks and safeguards I can think of.

Because I have the set of fully generated coupons, you can put in a check, such as "if total > £1,000,000 don't send".

1

You have coupled the definition of your target group with the action being performed. Instead you can have a function which produces a list or iterator of customers, and a separate function that performs the action:

function getTargetCustomers() {
  targetCustomers = []
  customers = getAllCustomers()
    for each customer in customers
        if customer is from Europe
          targetCustomers += customer
  return targetCustomers
}

function doSomeThingSensitive() {
    customers = getTargetCustomers()
    print 'we will pay ' + customers.size() * 1000
    for each customer:
      Give them a 1000 euro giveaway coupon
}
1
  • +1 aka CQS(not to be confused with CQRS) Commented Jan 28, 2022 at 12:31
0

I'd have merged the dry run code into the sensitive function, and use a Boolean switch to verify if you were just doing a logging counter, or if you were going live - because currently you're one "Dry Run where the called function is the one that isn't called Dry Run" away from executing the main logic, and that seems more dangerous than just decoupling.

You could use a Boolean parameter, but I've also seen a more static variable/global variable context been used in projects as well - showcasing the static private class variable context, I would go with:

dryRun = true; //True if not wanting to run sensitive component, false if sensitive component is fine. 
    
function doSomeThingSensitive() {
    counter = 0
    customers = getAllCustomers()
    for each customer in customers
        if customer is from Europe
            counter++
            if dryRun is true
                log "We would be logging this customer!"
            else
                Give them a 1000 euro giveaway coupon
     if dryRun is true
        log 'we will pay' + counter * 1000
     else
        log 'we *actually did* pay' + counter * 1000
    

Depending on your setup, you may want to make dryRun be a variable you pass in via the function parameter, or as a configuration value in your environment, but you'd want it to be that the default state is to Dry Run the code in your logic, but only as a check against actually sensitive actions, thus keeping your logic mostly the same regardless of if it's a Dry Run or a Sensitive Action Run; if it's a Dry Run when it needs to be a Sensitive Run, you can always make a quick change and then run it again.

2
  • 2
    This is a very brittle approach. You have a lot of copy-paste that is easy to mess up when modifying, and any tests you run with dryRun==true are not actually testing your bussiness logic. It is way better than having 2 separate functions, but not a robust approach.
    – Frax
    Commented Jan 27, 2022 at 20:55
  • Admittedly, it can be improved by putting the dryRun boolean check in the sendCoupon() function itself, if it's using a configuration global value, but it would allow you to be explicit about what parts of a function are, in fact, dangerous to run without checking that it's working as expected. Commented Jan 27, 2022 at 22:28

Not the answer you're looking for? Browse other questions tagged or ask your own question.