Shyanti Bhattacharjee’s Post


Analytics Manager || Failing and Learning at my own pace || Curious and always ready to explore

What happens when there is a business use case where you have to build models, but the data you need sits in different places: some of it in on-premise systems (Hadoop, DB2 or an EDW) and some of it in the cloud (Azure, AWS or GCP)? The data required to create the models exists only in chunks across these sources. How do you work with it when there is no single source?

One solution: bring the data into one place so you can work on it. How do you bring the chunks together? Migrate the required data from the on-premise systems to the cloud, then perform your data manipulations on the full dataset using platforms like Databricks or Snowflake.

But this approach has a drawback. Migrating the data comes with a cost to the business: there are ingress and egress charges attached to every byte moved from one place to another, and if the data volume is large, that cost grows quickly. It also creates data duplication, since the same data now lives in both places.

So can we look for another approach? Is there a way to reduce data duplication, or to serve this use case without moving the data from on-premise to the cloud at all?

Yes, and this is where Denodo comes into the picture. Denodo is a data virtualization tool that lets you create connections to various data sources and perform your manipulations inside Denodo itself. There is no need to move your data, and no duplication occurs. Denodo serves as a platform where you can manipulate data, create models and join data that lives in different sources, whether in the cloud or on-premise. There is no requirement to move the chunks of data into Denodo: you just create the connections and you are good to go. #data #usecase #analytics #Denodo #costoptimization #datavirtualization
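
To make the idea concrete, here is a minimal sketch of what consuming the virtual layer could look like from Python, assuming the Denodo server is already reachable through a pre-configured ODBC DSN. The DSN name, credentials, view names and columns below are hypothetical illustrations, not details from the post itself.

# Minimal sketch, assuming Denodo is exposed through an ODBC DSN named
# "denodo_vdp" (hypothetical) and that virtual views "hadoop_sales" and
# "azure_customers" (hypothetical) were created inside Denodo over the
# on-premise and cloud sources respectively.
import pyodbc

conn = pyodbc.connect("DSN=denodo_vdp;UID=analyst;PWD=secret")
cursor = conn.cursor()

# The join is written once against the virtual views; the data itself is
# never copied out of the source systems into a separate store.
cursor.execute("""
    SELECT c.customer_id, c.region, SUM(s.amount) AS total_spend
    FROM azure_customers c
    JOIN hadoop_sales s ON s.customer_id = c.customer_id
    GROUP BY c.customer_id, c.region
""")

for row in cursor.fetchall():
    print(row.customer_id, row.region, row.total_spend)

conn.close()

The point of the sketch is the shape of the workflow: the client talks to one virtual endpoint and writes one query, while the joining across on-premise and cloud sources is handled by the virtualization layer rather than by a migration pipeline.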

Pradip Basak

Sr. Data Engineer @ Daimler Truck | Jadavpur University

2mo

You can take a look at Microsoft Fabric as well. It's much more feature-rich and takes away the pain of integration altogether.

