Design decisions for an efficient OrderBook

Question

I am designing code for OrderBook based in C++, based on STL library. Note that there is a related question here, however it is more simple - it does not support deletion of orders, and I intend to add deletion as well as modification.

The prices can be floats here, and I have modeled the orderbook as a map of key = price and value = std::list of Orders. for quick deletion, I have created an unordered_map of key = Order ID and value = iterator of the std::list. I used std list because since its iterator is very robust - it becomes invalidated only when the element itself is deleted, this should be safe. as to the complexity, insertion is logarithmic in complexity (of the number of orders in a level) and constant for deletion.

Please provide suggestions on how to improve performance of this code, and to improve the code quality. The order book supports multiple instruments, as well as deletion based on Order IDs.

The code provided below is working, though I haven't added extensive tests.

#include <string>
#include <fstream>
#include <iostream>
#include <list>
#include <map>
#include <vector>
#include <sstream>
#include <iomanip>
#include <unordered_map>

typedef std::vector<std::string> vector_str;
constexpr char string_delim = ' ';

std::vector<std::string> split(const std::string &s)
{
  std::vector<std::string> result;
  std::stringstream ss(s);
  std::string item;

  while (std::getline(ss, item, string_delim))
  {
    result.push_back(item);
  }

  return result;
}
enum side_t : bool
{
  BUY = false,
  SELL = true
};

double convertLongLongToDouble(long long value)
{
  double fractionalPart = static_cast<double>(value % 100000) / 100000.0;
  double integerPart = static_cast<double>((value / 100000) % 10000000);
  return integerPart + fractionalPart;
}

std::string format_7_5(long long number)
{
  std::ostringstream out;
  out << std::fixed << std::setprecision(5) << convertLongLongToDouble(number);
  return out.str();
}
struct Node
{
public:
  int oid;
  std::string symbol;
  side_t side;
  int qty;
  long long price;
  Node(int oid,
       std::string symbol,
       side_t side,
       int qty,
       long long price)
  {
    this->oid = oid;
    this->symbol = symbol;
    this->side = side;
    this->qty = qty;
    this->price = price;
  }
  std::list<Node>::iterator list_itr;
};

class OrderBook
{
  std::map<int, Node *> *orders_map_;

public:
  std::map<long long, std::list<Node>> Booklevel_sell_, Booklevel_buy_;
  OrderBook(std::map<int, Node *> *orders_map) : orders_map_(orders_map)
  {
  }

private:
  // Utility function to process book levels
  void processBookLevel(bool is_buy, const std::map<long long, std::list<Node>> &bookLevel, vector_str &results)
  {
    // Iterating over bookLevel in reverse order
    for (auto level_itr = bookLevel.rbegin(); level_itr != bookLevel.rend(); ++level_itr)
    {
      const auto &curr_list = level_itr->second;

      if (is_buy)
      {
        // Process curr_list in normal order for buy orders
        for (const auto &item : curr_list)
        {
          results.emplace_back("P " + std::to_string(item.oid) + " " +
                               format_7_5(item.price) + " " +
                               std::to_string(item.qty));
        }
      }
      else
      {
        // Process curr_list in reverse order for sell orders
        for (auto item_ritr = curr_list.rbegin(); item_ritr != curr_list.rend(); ++item_ritr)
        {
          const auto &item = *item_ritr;
          results.emplace_back("P " + std::to_string(item.oid) + " " +
                               format_7_5(item.price) + " " +
                               std::to_string(item.qty));
        }
      }
    }
  }

public:
  void print(vector_str &results)
  {
    processBookLevel(false, Booklevel_sell_, results);
    processBookLevel(true, Booklevel_buy_, results);
  }

  vector_str insert_order(int oid, std::string &symbol, side_t side, int qty, long long price)
  {
    // Check for duplicate orders by order ID.
    // If an order with the same ID already exists in our orders_map, return an error message.
    if (auto itr = orders_map_->find(oid); itr != orders_map_->end())
    {
      return vector_str{"E " + std::to_string(oid) + " Duplicate order id"};
    }
    vector_str final_lst{};

    // Lambda to push completed trade info (Fills) into a list for reporting.
    auto pushToFinalList = [&final_lst](int oid, const std::string &symbol, int qty, int price)
    {
      final_lst.push_back("F " + std::to_string(oid) + " " + symbol + " " + std::to_string(qty) + " " + format_7_5(price));
    };

    auto qty_to_fill = qty; // Quantity of the order to be filled.

    // Lambda for handling the trade process.
    // It iterates through the order book, executes trades based on price conditions,
    // and modifies the order quantities or removes filled orders.
    auto handleTrade = [&](auto begin, auto end, auto priceCondition, auto eraser)
    {
      for (auto itr = begin; itr != end && priceCondition(itr->first) && qty_to_fill > 0;)
      {
        auto &curr_list = itr->second;

        // Iterate through each order in the current price level list.
        for (auto list_itr = curr_list.begin(); list_itr != curr_list.end() && qty_to_fill > 0;)
        {
          // Calculate the quantity to trade, which is the minimum of qty_to_fill and the current order's quantity.
          int qty_to_trade = std::min(qty_to_fill, list_itr->qty);

          // Record the trade details for both participating orders.
          pushToFinalList(oid, symbol, qty_to_trade, price);                               // The newly inserted order.
          pushToFinalList(list_itr->oid, list_itr->symbol, qty_to_trade, list_itr->price); // The matched order.

          qty_to_fill -= qty_to_trade; // Decrease the remaining quantity to fill.

          // If the current order is fully filled, remove it and update iterator; otherwise, decrease its quantity.
          if (list_itr->qty <= qty_to_trade)
          {
            orders_map_->erase(list_itr->oid);    // Remove the order from the global map.
            list_itr = curr_list.erase(list_itr); // Remove the order from the current price level.
          }
          else
          {
            list_itr->qty -= qty_to_trade; // Deduct the traded quantity.
            ++list_itr;                    // Move to the next order in the list.
          }
        }

        // If the current list (price level) is empty after trades, remove it from the book.
        if (curr_list.empty())
        {
          itr = eraser(itr);
        }
        else
        {
          // Implies that the current price level is not empty after qty_to_fill is exhausted.
          break;
        }
      }
    };

    // Process trades for BUY orders against SELL book and vice versa.
    // Utilizes the handleTrade lambda with specific iterators and conditions based on the order side.
    if (side == BUY)
    {
      handleTrade(
          Booklevel_sell_.begin(), Booklevel_sell_.end(),
          [price](double p)
          { return p <= price; }, // Matching condition for BUY: sell price should be <= buy order price.
          [&](auto itr)
          { return Booklevel_sell_.erase(itr); }); // Eraser for direct iterator (forward list).
    }
    else
    {
      handleTrade(
          Booklevel_buy_.rbegin(), Booklevel_buy_.rend(),
          [price](double p)
          { return p >= price; }, // Matching condition for SELL: buy price should be >= sell order price.
          [&](auto itr) -> decltype(Booklevel_buy_.rbegin())
          {
            // Eraser for reverse iterator, requires conversion for erase and then to reverse iterator again.
            return std::make_reverse_iterator(Booklevel_buy_.erase(std::next(itr).base()));
          });
    }

    // If there's any quantity left unfilled after attempting to match, add the remaining part of the order to the book.
    if (qty_to_fill > 0)
    {
      // Select the appropriate book (buy or sell) based on the order side.
      std::map<long long, std::list<Node>> &bookLevel = (side == BUY) ? Booklevel_buy_ : Booklevel_sell_;
      // Add the unfilled order to the book.
      bookLevel[price].push_back(Node(oid, symbol, side, qty_to_fill, price));
      // Keep a direct reference to the order in the global orders map for quick access.
      (*orders_map_)[oid] = &*(bookLevel[price].rbegin());
      (*orders_map_)[oid]->list_itr = --(bookLevel[price].end());
    }

    return final_lst; // Return the list of filled trades.
  }
};
class UnifiedOrderBook
{
  std::unordered_map<std::string, OrderBook> ob_;
  std::map<int, Node *> orders_map_;

public:
  vector_str print()
  {
    auto ans = vector_str();
    for (auto it = ob_.begin(); it != ob_.end(); ++it)
    {
      it->second.print(ans);
    }
    return ans;
  }
  bool cancel_order(int oid)
  {
    auto itr = orders_map_.find(oid);
    if (itr == orders_map_.end())
    {
      return false;
    }

    auto &curr_node = itr->second;
    auto &ob = ob_.find(curr_node->symbol)->second;
    auto &bookLevel = curr_node->side == BUY ? ob.Booklevel_buy_ : ob.Booklevel_sell_;

    // Directly accessing the price level.
    auto &priceLevel = bookLevel[curr_node->price];

    // Erase the order from the list.
    priceLevel.erase(curr_node->list_itr);

    // If the price level becomes empty after erasure, remove it from the bookLevel.
    if (priceLevel.empty())
    {
      bookLevel.erase(curr_node->price);
    }

    orders_map_.erase(itr);
    return true;
  }
  vector_str insert_order(int oid, std::string &symbol, side_t side, int qty, long long price)
  {
    auto book_itr = ob_.find(symbol);
    if (book_itr == ob_.end())
    {
      auto insert_result = ob_.emplace(std::make_pair(symbol, OrderBook(&orders_map_)));
      book_itr = insert_result.first;
    }
    return book_itr->second.insert_order(oid, symbol, side, qty, price);
  }
};

// UI to access the orderbook
class SimpleCross
{
public:
  vector_str process(const std::string &line)
  {
    auto tokens_vec = split(line);
    switch (tokens_vec.size())
    {
    case 1:
      if (tokens_vec[0] == "P")
      {
        return uob_.print();
      }
      break;
    case 2:
      if (tokens_vec[0] == "X")
      {
        return uob_.cancel_order(std::stoi(tokens_vec[1])) ? vector_str{line} : vector_str{"E"};
      }
      break;
    case 6:
      double price;
      int order_id, qty;
      side_t side;
      try
      {
        order_id = std::stoi(tokens_vec[1]);
      }
      catch (...)
      {
        return vector_str{"E Invalid order id"};
      }
      try
      {
        qty = std::stoi(tokens_vec[4]);
      }
      catch (...)
      {
        return vector_str{"E Invalid quantity"};
      }
      try
      {
        price = std::stod(tokens_vec[5]);
      }
      catch (...)
      {
        return vector_str{"E Invalid price"};
      }
      if (tokens_vec[3] == "B")
      {
        side = BUY;
      }
      else if (tokens_vec[3] == "S")
      {
        side = SELL;
      }
      else
      {
        return vector_str{"E Invalid side provided"};
      }
      return uob_.insert_order(order_id, tokens_vec[2], side, qty, static_cast<long long>(price * 1e5));
    default:
      return vector_str{"E invalid arguments provided"};
    }
    return (vector_str{"E"});
  }
  vector_str action(const std::string &line)
  {
    return process(line);
  }

private:
  UnifiedOrderBook uob_;
};

int main(int argc, char **argv)
{
  SimpleCross scross;
  std::string line;
  std::ifstream actions("actions.txt", std::ios::in);
  while (std::getline(actions, line))
  {
    vector_str results = scross.action(line);
    for (vector_str::const_iterator it = results.begin(); it != results.end(); ++it)
    {
      std::cout << *it << std::endl;
    }
  }
  return 0;
}

Sample Input

O 10000 AAPL B 10 100.0
O 10001 AAPL B 10 99.0
O 10002 AAPL S 5 101.0
O 10003 AAPL S 5 100.0
O 10004 AAPL S 6 100.0
X 10002
O 10005 AAPL B 10 99.0
O 10006 AAPL B 10 100.0
O 10007 AAPL S 10 101.0
O 10008 AAPL S 10 102.0
O 10008 AAPL S 10 102.0
O 10009 AAPL S 10 102.0
P
O 10010 AAPL B 13 102.0

Sample Output

F 10003 AAPL 5 100.000000
F 10000 AAPL 5 100.000000
F 10004 AAPL 5 100.000000
F 10000 AAPL 5 100.000000
X 10002
F 10006 AAPL 1 100.000000
F 10004 AAPL 1 100.000000
E 10008 Duplicate order id
P 10009 102.000000 10
P 10008 102.000000 10
P 10007 101.000000 10
P 10006 100.000000 9
P 10001 99.000000 10
P 10005 99.000000 10
F 10010 AAPL 10 102.000000
F 10007 AAPL 10 101.000000
F 10010 AAPL 3 102.000000
F 10008 AAPL 3 102.000000

You specifically called out that the Public API models currencies as IEEE-754 floats, a representation that famously runs into impedance mismatches when it encounters the notion of "a dime". That seems like a deal breaker. Recommend you offer scaled integers, as millionths of a dollar or another convenient unit. Physicists grumble about roundoff error, but often a "mean error is zero" argument will assuage their fears. Accountants? No, they just aren't that way. Off by a penny, and there's paperwork to file. — J_H, Commented Oct 28, 2023 at 2:57
Understood, will use scaled millionths of integers - a long long version of these maps. — Avengerx9, Commented Oct 28, 2023 at 4:20
C++17 or C++20? It doesn't make sense to tag with both. If you need to support C++17 compilers, then you can't use some C++20 features. — Toby Speight, Commented Oct 28, 2023 at 8:07
I was okay with both,, but can proceed with c++17 for now. Any suggestions for C++20 are welcome too. — Avengerx9, Commented Oct 28, 2023 at 8:28
Do you at least have a sample input file and corresponding expected output, that reviewers can use to exercise the program? — Toby Speight, Commented Oct 28, 2023 at 10:47

Toby Speight · Accepted Answer · 2023-10-28 12:12:19Z

I have to admit that I struggle to follow this. The code seems quite jumbled, rather than well-organised and clear.

The biggest issue I have is with representing everything as arrays of strings, completely losing all information about types.

For example, we seem to be using a single-letter prefix to indicate which type of entry we have, but instead of using enum for that (which could help us write a switch where our compiler can ensure all cases are covered), we've embedded knowledge of the i/o format throughout the code. That will make it hard to adapt to future changes in that format.

One thing that I find quite problematic is this:

    switch (tokens_vec.size())

Here, we're using a consequent property of the order type to determine which type of input we're looking at. That's fragile (because if we ever add a field, we may end up having to merge or split cases - and if any command gets an optional field, then it gets really messy). Instead, we want to be switching on the command verb - but really, we should be parsing to a std::variant so that we can then just std::visit all the entries.

Let's start with a code walk-through, from top to bottom:

I find it helpful to sort the #include lines in a consistent order. This makes it easier to spot duplicates or omissions.
In C++, I prefer using to create type aliases, rather than C-style typedef.
The split() function would be more efficient if it made use of std::move:
```
std::vector<std::string> split(std::string s)
{
    std::istringstream ss(std::move(s));
    std::vector<std::string> result;

    std::string item;
    while (std::getline(ss, item, string_delim)) {
        result.push_back(std::move(item));
    }

    return result;
}
```
That said, if we really want to be efficient, we don't want to be constructing a string-stream - we would be better off accepting a string-view and using result.emplace_back() to copy specific substrings of that.

This function should probably have internal linkage (via C-style static keyword, or anonymous namespace) rather than being visible to other translation units, particularly as it has such a likely name. This observation applies to most of the functions in this program; I won't bother repeating it for each one.
Does it really matter that BUY is false, and SELL is true, rather than the other way around? It shouldn't do, so we shouldn't be specifying values. And consider enum class for improved type safety here.
convertLongLongToDouble() is very convoluted. Why do we take the value modulo 1000000000000? That could seriously upset the most wealthy users.

And why are we even using double in a financial program anyway? That can lead to the kind of error which, albeit small in value, makes accounts fail to match. That's the kind of thing that matters immensely to accountants, as well as raising suspicions that the code is intentionally fraudulent!

The next function, format_7_5() is one place where we use this double. That's just wrong; we should be printing the exact value, perhaps like this:

constexpr int money_decimal_digits = 6;
constexpr long long money_scale = 100000;

std::string format_7_5(long long number)
{
    std::ostringstream out;
    out << number / money_scale
        << '.'
        << std::setw(money_decimal_digits) << std::setfill('0')
        << number % money_scale;
    return out.str();
}

Moving on to struct Node, its constructor should be initialising its members, rather than leaving them default-initialised and then assigning them:

    Node(int oid,
         std::string symbol,
         side_t side,
         int qty,
         long long price)
      : oid{oid},
        symbol{std::move(symbol)},
        side{side},
        qty{qty},
        price{price},
        list_itr{}
    {
    }

But it turns out we don't need to write a constructor at all, because this structure is perfectly suited to aggregate initialisation:

struct Node
{
    int oid;
    std::string symbol;
    side_t side;
    int qty;
    long long price;
    std::list<Node>::iterator list_itr = {};
};

Are we correct to be using signed integer types for id, quantity and price? That doesn't look reasonable to me, especially for quantity. On the other hand, if we use a signed value for quantity, we shouldn't need side at all, as the sign could indicate the direction of trade.
The OrderBook has a raw-pointer member orders_map_, but we don't ever change what it points to. We could make it std::map<int, Node *> *const orders_map_ instead - but we'd be better off using a reference instead. As well as the constness, that also conveys to readers that it mustn't be a null pointer, that the OrderBook doesn't own the map, and it correctly addresses the warning we get from g++ -Weffc++.
Prefer to use a separate line for each member, and provide initialisers even where default-initialisation is well-defined (since -WeffC++ doesn't distinguish):
```
    std::map<long long, std::list<Node>> Booklevel_sell = {};
    std::map<long long, std::list<Node>> Booklevel_buy = {};
```

The reverse loops in processBookLevel() are hard to read. If/when we can move to C++20, we can include <ranges> which can give us a reverse view of each list (and even in C++11, we can eliminate the duplication of the string concatenation by using a local lambda function and std::for_each() from <algorithm>):

    // Iterate over bookLevel in reverse order
    for (auto const& level: bookLevel | std::views::reverse) {
        auto const& curr_list = level.second;

        auto add_order = [&results](auto const& item) {
            results.emplace_back("P " + std::to_string(item.oid) + ' ' +
                                 format_7_5(item.price) + ' ' +
                                 std::to_string(item.qty));
        };

        if (is_buy) {
            // Process curr_list in forward order for buy orders
            std::ranges::for_each(curr_list, add_order);
        } else {
            // Process curr_list in reverse order for sell orders
            std::ranges::for_each(curr_list | std::views::reverse, add_order);
        }
    }

In insert_order(), I don't think we need to modify the symbol passed in - it ought to be std::string const& symbol.
pushToFinalList() truncates price to int. That's really a sign of failing to enable a reasonable set of compiler warnings, but it could be devastating to the operation of the program.
The calls to handle_trade have conditions that take the price as a double, which may have less precision than long long. Use auto instead, so that the type automatically agrees with what's passed:
```
          [price](auto p){ return p <= price; },
```
```
          [price](auto p) { return p >= price; }, 
```
In UnifiedOrderBook::print(), why the iterator loop rather than range-based for or std::for_each()?
Since we don't use any command-line arguments, we can use the simplest signature for the main function: int main().
Is it right that we always return 0 (which would be better written as EXIT_SUCCESS, from <cstdlib>)? Even when we can't open actions.txt? That doesn't look right to me.
There's another opportunity to make the code simpler and more readable by using range-based for when printing results.
It shouldn't be necessary to flush std::cout by using std::endl every time around the loop. Just write plain '\n', and add std::flush only where specifically needed.

N.B. with a suitable simple ipow() function (which can be constexpr) we could derive money_scale from money_decimal_digits. — Toby Speight, Commented Oct 28, 2023 at 12:18

G. Sliepen · Accepted Answer · 2023-10-28 12:08:36Z

Create a type for currency

Instead of passing currency around as a long long or double, I strongly suggest you create a new type for it, or even better, use a library that provides such types for you, like moneycpp.

If you create your own type, then using an integer internally to store the value is the way to go. You should never have to convert it to a float or double. If you want to format an integer with a decimal point, you can still easily do that, for example like so:

std::string format_7_5(long long number) {
    return std::format("{:d}.{:05d}", number / 100'000, number % 100'000);
}

Create a type (alias) for order IDs

You use int to store order IDs. What if you process more than about 2 billion orders? What if the format of order IDs changes to something non-numeric? You can avoid all this up-front by creating a new type or type alias for it. For example, you can start with simply writing:

using OrderID = int;
…
struct Node
{
    OrderID oid;
    …
};

Now if you want to change the type of the order ID, you only have to change one line of code.

With using you create a type alias, so in the above example, oid is still exactly the same as an int. This doesn't provide any type safety. For example, if you accidentily write oid += qty, the compiler will not warn you about this mistake. If you make it a struct, then this will no longer compile, so it will be safer, with just a slight inconvenience in converting integers to OrderIDs when reading or printing them.

Use `std::unordered_map()` where appropriate

std::map enforces an ordering on its elements. This has some overhead, and insertions, removals and lookups will be \$O(\log N)\$, where \$N\$ is the number of elements already in the map. If you use std::unordered_map, then this all becomes \$O(1)\$. You already used it for ob_, but I would also use it for orders_map_.

Missing error checking

The only error checking I see is when calling std::stoi() and std::stod() in process(). However, there is so much more that can go wrong that you don't check for. Consider that there might be an error reading lines, there could be multiple orders with the same order ID (which would cause data to be overwritten without any warning), and there might be other issues I haven't noticed.

Use range-based `for`-loops

C++11 gave us range-based for-loops, which are much nicer to work with and result in much less chance of mistakes. Combined with C++17's structured bindings and C++20's std::ranges::reverse_view, you can rewrite something like:

for (auto level_itr = bookLevel.rbegin(); level_itr != bookLevel.rend(); ++level_itr)
{
    const auto &curr_list = level_itr->second;
    …

Into:

for (auto& [price, curr_list]: bookLevel | std::views::reverse)
{
    …

If you are stuck with C++17, then you could implement your own reverse(). Alternatively, if you only ever iterate in the reverse way over a std::map, consider using a custom comparison operator that will reverse the map's ordering.

Stack Exchange Network

Design decisions for an efficient OrderBook

2 Answers 2

Create a type for currency

Create a type (alias) for order IDs

Use `std::unordered_map()` where appropriate

Missing error checking

Use range-based `for`-loops

Not the answer you're looking for? Browse other questions tagged
c++
c++17
finance
or ask your own question.

Linked

Hot Network Questions

Design decisions for an efficient OrderBook

2 Answers 2

Create a type for currency

Create a type (alias) for order IDs

Use std::unordered_map() where appropriate

Missing error checking

Use range-based for-loops

Not the answer you're looking for? Browse other questions tagged c++c++17finance or ask your own question.

Linked

Related

Hot Network Questions

Use `std::unordered_map()` where appropriate

Use range-based `for`-loops

Not the answer you're looking for? Browse other questions tagged
c++
c++17
finance
or ask your own question.