Google recently launched the Google Cloud Cortex Framework to reduce the time-to-value of getting SAP data into Google BigQuery (GBQ). This is not a surprising development. If data is considered a valuable asset in the enterprise world, then surely SAP is the treasure trove. The surprising fact is that it has taken Google this long to get here, and that the other hyperscalers still lack the ability to onboard SAP data efficiently.
Getting data out of SAP is not difficult, but making sense of SAP data is.
Traditionally, SAP was a very closed ecosystem and getting data out of SAP was technically difficult. This all changed when SAP introduced the ‘Operational Data Provisioning Framework’ (ODP) in 2017. The ODP framework provides APIs for data extraction, as well as subscription services which give you regular data feeds. There are several third-party ETL (Extract, Transform and Load) tools available which consume the APIs and provide a user-friendly interface for creating subscriptions and ad-hoc data extracts from SAP. In addition, many popular programming languages have libraries for using the same APIs, so developers can build their own data pipelines in their language of choice. The APIs are a standard part of SAP environments and there are no license restrictions. You can take any and all data out of SAP and put it on the platform of your choice, as long as you do this using standard SAP APIs. Getting the data out of SAP is only the beginning though. Making sense of SAP data is a different story.
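To illustrate what a subscription with regular data feeds means in practice, here is a minimal sketch in Python. It is a toy simulation only: the class ToyDeltaQueue and its methods are hypothetical names invented for this example and are not part of the ODP API or any SAP library. The idea it shows is that a subscriber receives a full initial load once, and afterwards only the changes recorded since its last fetch.

```python
from dataclasses import dataclass, field


@dataclass
class ToyDeltaQueue:
    """Hypothetical stand-in for a source system that serves delta feeds.

    This is NOT the SAP ODP API; it only illustrates the subscription idea.
    """
    changes: list = field(default_factory=list)   # append-only change log
    pointers: dict = field(default_factory=dict)  # subscriber -> next unread index

    def record_change(self, row: dict) -> None:
        """Append a change made in the source system to the log."""
        self.changes.append(row)

    def subscribe(self, subscriber: str) -> list:
        """Initial load: return everything so far and remember the position."""
        self.pointers[subscriber] = len(self.changes)
        return list(self.changes)

    def fetch_delta(self, subscriber: str) -> list:
        """Return only the changes recorded since the subscriber's last fetch."""
        start = self.pointers[subscriber]
        self.pointers[subscriber] = len(self.changes)
        return self.changes[start:]


queue = ToyDeltaQueue()
queue.record_change({"order": "1001", "status": "created"})

# First contact: the subscriber gets the full history.
initial = queue.subscribe("warehouse_loader")

# Later changes are delivered only once, on the next fetch.
queue.record_change({"order": "1001", "status": "shipped"})
queue.record_change({"order": "1002", "status": "created"})
delta = queue.fetch_delta("warehouse_loader")
empty = queue.fetch_delta("warehouse_loader")  # nothing new since last fetch
```

A real extraction tool adds error handling, retries and persistence of the pointer, but the contract with the consumer is essentially this one.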
SAP Business Content and accelerators
It is difficult to overestimate the complexity of SAP systems. There are hundreds of thousands of tables in a typical SAP system, and each organization has configured its SAP system differently. Take a sales order as an example. To provide complete insight into the sales orders in your company, you not only need to know the hundreds of tables related to a sales order (including master data tables and text tables), you also need to know which types of sales orders are used for which processes, and much more. The fact that many table names are four-character abbreviations of German words does not help either. Luckily, SAP has done much of the legwork already by building a semantic layer on top of the raw dataset. At the lowest level there are database views and data extractors which allow you to get to the data at entity level (like ‘sales order’) instead of having to interpret the raw data. These models have been around for more than 20 years and have been constantly extended, which means that today there is an incredibly rich semantic model available for data warehouse engineers. At the higher end of the spectrum, SAP also delivers pre-defined models where data is organized to best support analytical processes. Even the transformation from entity level to an analytical model is delivered ‘out of the box’ by SAP. So why has it taken Google and others so long to get to where we are today, and why is there still such a long way to go until SAP data is truly integrated in every relevant business context?
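To make the idea of an entity-level view on top of raw tables concrete, here is a small sketch using Python's built-in sqlite3 module. VBAK (sales document header) and VBAP (sales document item) are real SAP table names, and VBELN, AUART, KUNNR, POSNR, MATNR and NETWR are real field names, but the tables are reduced to a handful of columns, the sample rows are made up, and the view name sales_order_item is an assumption chosen for this example. It is not one of the views SAP delivers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Raw tables, heavily simplified; the real ones have far more columns.
CREATE TABLE VBAK (VBELN TEXT PRIMARY KEY, AUART TEXT, KUNNR TEXT);
CREATE TABLE VBAP (VBELN TEXT, POSNR TEXT, MATNR TEXT, NETWR REAL);

-- Made-up sample data: one order with two items.
INSERT INTO VBAK VALUES ('0000001001', 'TA', 'C-100');
INSERT INTO VBAP VALUES ('0000001001', '000010', 'M-01', 250.0);
INSERT INTO VBAP VALUES ('0000001001', '000020', 'M-02', 100.0);

-- Hypothetical semantic view: readable names, header and items joined.
CREATE VIEW sales_order_item AS
SELECT h.VBELN AS sales_order,
       h.AUART AS order_type,
       h.KUNNR AS sold_to_customer,
       i.POSNR AS item_number,
       i.MATNR AS material,
       i.NETWR AS net_value
FROM VBAK AS h
JOIN VBAP AS i ON i.VBELN = h.VBELN;
""")

# Consumers query the entity, not the cryptic raw tables.
totals = conn.execute(
    "SELECT sales_order, SUM(net_value) FROM sales_order_item "
    "GROUP BY sales_order"
).fetchall()
print(totals)
```

The point of the sketch is the division of labour: whoever defines the view needs to know what VBAK and NETWR mean, while everyone downstream can work with 'sales order' and 'net value'.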
Knowledge, culture and people
There is still a hard divide in the world of enterprise data: those who understand SAP data and SAP technology, and those who want to avoid working with SAP data and technology at all costs. The SAP specialists typically know very little about technologies, data and processes outside the SAP world. You can have a successful lifelong career in SAP Analytics and never see a SQL data warehouse or an ETL tool at work. Or you can be a data warehouse guru and know everything about Data Vault and dimensional modelling, yet be clueless about how to set up a data warehouse in SAP Business Warehouse. So although it is easy enough to get data out of SAP and onto an open cloud platform, you probably have two problems:
• Your SAP data warehouse team does not know how to build a SQL data warehouse
• Your SQL data warehouse team does not understand the SAP data well enough
And that is the real crux of the problem. Not technology, but people. It follows that the solution is also about people, not technology. Bring your teams together, cross-fertilize the knowledge about the SAP and the non-SAP world, and you will truly see a seamless integration of SAP and non-SAP data.
Does Google Cortex help?
If anyone can bring “Google search” to corporate data, it is Google. They continue to develop amazing technology, are probably further ahead than anyone else in applying AI and Machine Learning to everyday processes, and billions of people around the world are happy to interact with Google. But where Google goes, others follow quickly. The ETL vendors will increasingly support the replication of the SAP-delivered analytical models to cloud platforms, and other hyperscalers will undoubtedly come up with their own packaged solutions for SAP integration.
Interestingly, SAP has made some great product improvements in data warehousing and analytics. SAP has the additional benefit of having a better understanding of enterprise data, as this has been their core business for the last 40 years. I applaud Google for once again being a trailblazer, but I don’t expect a huge shift from customers to GBQ for enterprise data warehousing as a result. Enterprises already on GBQ will benefit from Cortex. Those who have chosen a different data platform will have plenty of other tools to choose from to integrate SAP data.
Organizations whose SAP data is inaccessible and siloed should look at their operating model and think about how they can bring their different teams together. Today’s technology has pushed the boundaries of what is possible way beyond what anyone had imagined only five years ago. Now it is time for the different teams to collaborate and ensure technology is used to its full potential.
“Ok Google, which company can help me with bringing the world of SAP and non-SAP data together?”