0

I have an import that needs to grab data from a REST service and import into an web store. It's basically an ETL type of service, but because the REST service can be slow and I don't want to call it for every entity in the loop, I'm finding myself having to pass in a lot of parameters into the methods needed for import. Basically anytime I need something from the service, I have to add another param for that data. Looking for ideas on how this code could be better or if maybe nobody sees any issues with it.

    public void RunImport()
    {
        List<Product> srcProducts = new List<Product>();
        List<ProductOptions> srcProductOptions = new List<ProductOptions>();
        List<ProductCustomFields> srcProductCustomFields = new List<ProductCustomFields>();

        List<Order> srcOrders = new List<Order>();
        List<Customer> srcCustomers = new List<Customer>();
        List<Country> srcCountries = new List<Country>();

        if (ShouldImportProducts)
        {
            srcProducts = restService.GetProducts();
            srcProductOptions = restService.GetProductOptions();
            srcProductCustomFields = restService.GetCustomFields();
            if (srcProducts.Count > 0)
            {
                ImportProducts(srcProducts, srcProductOptions, srcProductCustomFields);
            }
        }

        if (ShouldImportCustomers)
        {
            srcCustomers = restService.GetCustomers();
            if (srcCountries == null) srcCountries = restService.GetCountries();
            if (srcCustomers.Count > 0)
            {
                ImportCustomer(srcCustomers, srcCountries);
            }
        }

        if (ShouldImportOrders)
        {
            srcProducts = restService.GetProducts();
            srcCustomers = restService.GetCustomers();
            if (srcCountries == null) srcCountries = restService.GetCountries();
            if (srcCustomers.Count > 0)
            {
                ImportOrders(srcOrders, srcProducts, srcCustomers, srcCountries);
            }
        }

    }

    public void ImportOrders(List<Order> srcOrders, List<Product> srcProducts, List<Customer> srcCustomers, List<Country> srcCountries)
    {
        foreach (Order srcOrder in srcOrders)
        {
            var existingOrder = FindExistingOrder(srcOrder);
            var destinationOrder = ConvertToDestinationOrder(srcOrder, existingOrder, srcProducts, srcCustomers, srcCountries);
            SaveDestinationOrder(destinationOrder);
        }
    }

    public DestinationOrder ConvertToDestinationOrder(Order srcOrder, DestinationOrder existingOrder, List<Product> srcProducts, List<Customer> srcCustomers, List<Country> srcCountries)
    {
        var destinationOrder = new DestinationOrder();
        var productId = srcProducts.Where(x => x.Sku == srcOrder.Sku);
        var customerId = srcCustomers.Where(x => x.Email == srcOrder.Email);
        var countryId = srcCountries.Where(x => x.Code == srcOrder.Country);

        destinationOrder.ProductId = productId;
        destinationOrder.CustomerId = customerId;
        destinationOrder.CountryId = countryId;

        //...
        return destinationOrder;
    }

1 Answer 1

1

I would add an additional layer here that hides the abstraction better. The RunImport method should only have a high-level view over the import process, the details of each individual import should be abstracted away.

You are dealing with a lot of data reuse, and I would redesign this code to have specific data containers whose job it is to provide the data (and fetch it if they haven't already). This cuts down on the complexity of your import logic itself.

Something along the lines of:

public class CustomerRepository : ICustomerRepository
{
    private readonly MyRestService restService;
    private IEnumerable<Customer> customers = null;

    public CustomerRepository(MyRestService restService)
    {
        this.restService = restService;
    }

    private void LoadIfNotLoaded()
    {
        if(customers == null)
            customers = restService.GetCustomers();
    }

    public IEnumerable<Customer> GetAll()
    {
        LoadIfNotLoaded();

        return customers;
    }

    public Customer FindByEmail(string email)
    {
        LoadIfNotLoaded();

        return customers.SingleOrDefault(c => c.Email == email);
    }
}

This is analogous for all other resources you fetch. Some comments:

  • If the name Repository creates ambiguity with any repositories your codebase may already implement, feel free to change this name.
  • There are shorter approaches to implement this class (e.g. directly exposing customers and letting it be handled this way, or even just using a Lazy<T> instead of a custom class), but I prefer the above because it means you can reuse the same filtering logic instead of having all consumers of the class write their own filter logic.
  • I used MyRestService because the type is not apparent in your example.
  • This approach ensures that no REST call will be made until you actually have need of the data. This prevents needing to repeatedly check if the data needs to be loaded (but not a second time) when the next import process starts.
  • I have omitted the implementation of the ICustomerRepository interface. Essentially, use it to describe all public behavior.
  • It is possible to implement these repositories inside the restService instead of the importer. You didn't post the implemenation details of the restService so I cannot judge this.

These repositories can be injected into your importer class as dependencies (unless you have a reason to not, which is not clear from the question)

public class Importer
{
    private readonly ICustomerRepository customerRepository;

    public Importer(ICustomerRepository customerRepository)
    {
        this.customerRepository = customerRepository;
    }
}

This may become a lengthy list of constructor parameters. Usually, I agree with trying to avoid those. However, import logic is a very common exception here, because imports are so very prone to touching on multiple domain objects at the same time. In your general code, that suggests a bad separation between your domain objects, but imports are a special case here as they can effectively be responsible for loading all your domain objects.

As an aside, I have toyed with the idea of making separate importers (CustomerImporter, ProductImporter, ...). Because you have the neatly separated repositories, it is possible and relatively simple to do so. However, it would require some additional configuration to ensure that your injected repositories have a shared instance between importers (to avoid unnecessary double REST calls).
For the sake of brevity, I'm bypassing this additional step. I'll only focus on one of the imports anyway. I leave it up to you whether you implement this logic in separate importers or a single one. I would suggest separating them.

As for your RunImport method, I would divide your logic into two abstraction layers (methods), to separate the higher-level import flow from the more nitty gritty implementation details of each import.

The top level should be relatively clear, so that the flow is easy to read:

public class Importer
{
    public void RunImport()
    {
        if(ShouldImportProducts)
        {
            foreach(var product in GetProducts())
            {
                Store(product);               
            } 
        }
    }
}

The Get methods should compile your entities, which in your case means combining several of the repositories. One such implementation could be:

private IEnumerable<ProductDto> GetProducts()
{
    var products = productRepository.GetAll();

    var mapped = products.Select(product => new ProductDto()
    {
        Id = product.Id,
        Name = product.Name,
        Options = productOptionsRepository.GetByProductId(product.Id),
        CustomFields = productOptionsRepository.GetByProductId(product.Id)
    });

    return mapped;
}

Change the mapping of the fields as necessary, of course.

The Store logic would in essence be doing what your ImportProduct method already does, but without needing to iterate over the collection.

private void Store(ProductDto product)
{
    var existingProduct = FindExistingProduct(product.Id);

    // Map all the fields
    existingProduct.Name = product.Name;

    SaveProduct(existingProduct);
}

One mention: your code does not account for orders that don't already exist. I can only assume that FindExistingOrder would return a new instance which will then be created (instead of updated) in your mapping logic. If that is indeed the case, I would rename FindExistingOrder to FindOrCreateOrder, just to make that clear.

1
  • I like the idea of lazy loading - I thought about that as well, but was a little worried about being able to control when to refresh the data
    – user204588
    Commented Mar 31, 2022 at 13:40

Not the answer you're looking for? Browse other questions tagged or ask your own question.