
I did some profiling using this code:

#include "Timer.h"
#include <iostream>

enum class BackendAPI {
    B_API_NONE,
    B_API_VULKAN,
    B_API_DIRECTX_12,
    B_API_WEB_GPU,
};

namespace Functional
{
    typedef void* VertexBufferHandle;

    namespace Vulkan
    {
        struct VulkanVertexBuffer {};

        VertexBufferHandle CreateVertexBuffer(size_t size)
        {
            return nullptr;
        }

        __forceinline void Hello() {}
        __forceinline void Bello() {}
        __forceinline void Mello() {}
    }

    class RenderBackend {
    public:
        RenderBackend() {}
        ~RenderBackend() {}

        void SetupBackendMethods(BackendAPI api)
        {
            switch (api)
            {
            case BackendAPI::B_API_VULKAN:
            {
                CreateVertexBuffer = Vulkan::CreateVertexBuffer;
                Hello = Vulkan::Hello;
                Bello = Vulkan::Bello;
                Mello = Vulkan::Mello;
            }
            break;
            case BackendAPI::B_API_DIRECTX_12:
                break;
            case BackendAPI::B_API_WEB_GPU:
                break;
            default:
                break;
            }
        }

        VertexBufferHandle (*CreateVertexBuffer)(size_t size) = nullptr;
        void (*Hello)() = nullptr;
        void (*Bello)() = nullptr;
        void (*Mello)() = nullptr;
    };
}

namespace ObjectOriented
{
    struct VertexBuffer {};

    class RenderBackend {
    public:
        RenderBackend() {}
        virtual ~RenderBackend() {}

        virtual VertexBuffer* CreateVertexBuffer(size_t size) = 0;
        virtual void Hello() = 0;
        virtual void Bello() = 0;
        virtual void Mello() = 0;
    };

    class VulkanBackend final : public RenderBackend {
        struct VulkanVertexBuffer : public VertexBuffer {};

    public:
        VulkanBackend() {}
        ~VulkanBackend() {}

        __forceinline virtual VertexBuffer* CreateVertexBuffer(size_t size) override
        {
            return nullptr;
        }

        __forceinline virtual void Hello() override {}
        __forceinline virtual void Bello() override {}
        __forceinline virtual void Mello() override {}
    };

    RenderBackend* CreateBackend(BackendAPI api)
    {
        switch (api)
        {
        case BackendAPI::B_API_VULKAN:
            return new VulkanBackend;
            break;
        case BackendAPI::B_API_DIRECTX_12:
            break;
        case BackendAPI::B_API_WEB_GPU:
            break;
        default:
            break;
        }

        return nullptr;
    }
}

int main()
{
    constexpr int maxItr = 1000000;

    for (int i = 0; i < 100; i++)
    {
        int counter = maxItr;
        Timer t;

        auto pBackend = ObjectOriented::CreateBackend(BackendAPI::B_API_VULKAN);
        while (counter--)
        {
            pBackend->Hello();
            pBackend->Bello();
            pBackend->Mello();

            auto pRef = pBackend->CreateVertexBuffer(100);
        }

        delete pBackend;
    }

    std::cout << "\n";

    for (int i = 0; i < 100; i++)
    {
        int counter = maxItr;
        Timer t;

        {
            Functional::RenderBackend backend;
            backend.SetupBackendMethods(BackendAPI::B_API_VULKAN);
            while (counter--)
            {
                backend.Hello();
                backend.Bello();
                backend.Mello();

                auto pRef = backend.CreateVertexBuffer(100);
            }
        }
    }
}

where the included `Timer.h` is:

#pragma once
#include <chrono>

/**
 * Timer class.
 * This measures the total time taken from the creation of the object until its destruction.
 */
class Timer {
public:
    /**
     * Default constructor.
     */
    Timer()
    {
        // Set the time point at the creation of the object.
        startPoint = std::chrono::high_resolution_clock::now();
    }

    /**
     * Default destructor.
     */
    ~Timer()
    {
        // Get the time point at the object's destruction.
        auto endPoint = std::chrono::high_resolution_clock::now();

        // Convert time points.
        long long start = std::chrono::time_point_cast<std::chrono::microseconds>(startPoint).time_since_epoch().count();
        long long end = std::chrono::time_point_cast<std::chrono::microseconds>(endPoint).time_since_epoch().count();

        // Print the time to the console.
        printf("Time taken: %15I64d\n", static_cast<__int64>(end - start));
    }

private:
    std::chrono::time_point<std::chrono::high_resolution_clock> startPoint; // The start time point.
};

After plotting the output in a graph (compiled using the Release configuration in Visual Studio 2019), the results are as follows: [performance graph]

Note: The above code is meant to profile the performance difference between the functional and object-oriented approaches when building a large-scale library. The profiling was done by running the application 5 times, recompiling the source code between runs. Each run has 100 iterations. The tests were done both ways (object-oriented first, functional second, and vice versa), but the performance results are more or less the same.

I am aware that virtual methods are somewhat slow because the call has to resolve the function address from the vtable at runtime. But the part I don't understand is this: if I'm correct, function pointers are also resolved at runtime, which means the program still needs to fetch the function's address before executing it.
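To make the two call paths concrete, here is a minimal hand-rolled sketch of the two dispatch schemes (invented names, not what the compiler literally emits): a virtual-style call goes through a table stored behind a pointer in the object, while the functional version stores the function pointer directly in the object.

// What a virtual call roughly corresponds to: the object carries a pointer
// to a table of function pointers, so the call loads the table pointer,
// then the slot, then jumps.
struct VTable { void (*Hello)(); };
struct VirtualLike { const VTable* vtable; };

// What Functional::RenderBackend does: the function pointer is stored
// directly in the object, so the call is one load plus an indirect jump.
struct PointerLike { void (*Hello)(); };

void VulkanHello() {}

int main()
{
    static const VTable vulkanVTable{ &VulkanHello };
    VirtualLike v{ &vulkanVTable };
    PointerLike p{ &VulkanHello };

    v.vtable->Hello(); // two dependent loads (vtable pointer, then slot), then the call
    p.Hello();         // one load (the stored pointer), then the call
}

In both cases the target is only known at runtime; the virtual form simply performs one extra dependent load per call.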

So my questions are,

  1. Why do the function pointers perform somewhat better than the virtual methods?
  2. Why do the virtual methods have performance drops at some points while the function pointers stay somewhat stable?

Thank You!

  • Have you tried swapping the order of your program and doing the functional tests first? Commented Nov 6, 2020 at 8:28
  • @AlanBirtles Let me try that out.
    – D-RAJ
    Commented Nov 6, 2020 at 8:30
  • @AlanBirtles The results are the same.
    – D-RAJ
    Commented Nov 6, 2020 at 8:43
  • Have you ever heard the story about the drunk under the street light looking for his keys? When asked why he was looking there he says "because that's where the light is!" While you have a vaguely interesting question, if what you care about is performance you're looking in the wrong place. In fact, the way to maximize performance is not to come in with a prior guess like this one, but to be totally open to being surprised at what the problems actually are. Here's how I and others do it. Commented Nov 8, 2020 at 2:34

1 Answer


Virtual method lookup tables need to be accessed (basically) every time the method is called. This adds another indirection to every call.

When you initialize a backend and then save the function pointers, you essentially take out this extra indirection and pre-compute it once at the start.

It is thus not a surprise to see a small performance benefit from direct function pointers.
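As a rough sketch of that idea (invented names, not the code from the question): resolving the target once, outside the loop, leaves a single indirect call per iteration, which is essentially what SetupBackendMethods does for the whole backend.

struct Base {
    virtual void Hello() = 0;
    virtual ~Base() = default;
};

struct ExampleBackend final : Base {
    void Hello() override {}
};

void FreeHello() {}

int main()
{
    Base* pBackend = new ExampleBackend;
    void (*helloFn)() = &FreeHello; // indirection resolved once, up front

    for (int i = 0; i < 1000000; ++i)
    {
        pBackend->Hello(); // vptr load + vtable slot load + call, every iteration
        helloFn();         // single indirect call; the lookup already happened
    }

    delete pBackend;
}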

  • But why do the virtual methods spike at some points while the function pointers stay somewhat smooth?
    – D-RAJ
    Commented Nov 6, 2020 at 8:42
  • 1
    @DhirajWishal The spikes are almost certainly caused by the CPU cache. Adding extra indirections always causes instability in performance due to the CPU cache.
    – orlp
    Commented Nov 6, 2020 at 8:44
