How AI Is Rewriting the Enterprise Data Playbook

Luka Jasionyte, from the marketing team at Snap Analytics, catches up with Tom Bruce, Group Managing Director & Co-founder, to get the lowdown on his journey through data and analytics, and to hear his take on the most important trends, AI initiatives and challenges facing enterprise businesses today.

What inspired you to get into data and analytics, Tom? 

My love for sports statistics and data, along with their transformative impact on sport itself, inspired me. Books like Moneyball showed me how data and analytics can help organisations, making me realise the considerable influence I could have on companies with a relatively small amount of resource. 

What is the most exciting trend right now? 

It’s the obvious answer, but AI is clearly already transformational, and seeing the applications, and more importantly the evolutions, that will happen in this space is going to be exciting. The pace of development keeps increasing, and its impact on productivity, on the way we work, and on the benefits it brings to business makes it a thrilling development to be part of. We’ve been working on some exciting and transformational AI initiatives with clients, and everyone in this space shares the buzz about the potential and opportunities available now and into the future.

What are some common data-related challenges that large enterprises typically face? 

I’ve been seeing two big challenges that large enterprises face when it comes to data. Neither is exactly new, but both are becoming even more crucial with the push towards AI-based solutions.

First off, there’s data quality and governance. It’s all about having the right data, properly managed and owned in a way that’s governed and easily accessible to everyone who needs it. The challenge here is ensuring that the data is accurate, consistent, and reliable. Without good data quality and governance, any AI solution built on top of it is likely to be flawed or ineffective. 

Then there’s data modelling. This is about setting up the right foundations with an optimised and easily understandable data model. It’s key to making the most out of AI solutions. A solid foundation ensures that everything else works smoothly and efficiently, allowing AI to deliver its full potential. If the data model is too rigid or poorly designed, it can limit the effectiveness of AI and make it harder to adapt to new requirements or changes in the business environment. 

So, while these challenges aren’t new, they’re definitely more important now than ever before. 

What are the top client priorities for those looking to drive successful outcomes in data and analytics? 

We’ve often seen that a lot of investment has gone into cloud solutions, but there hasn’t always been control over the costs associated with these investments. Cost optimisation for cloud solutions is something that is increasingly coming up on our clients’ agendas. It’s becoming clear that while the cloud offers immense flexibility and scalability, without proper cost management, it can quickly become a financial burden. Clients are now more focused on finding ways to optimise their cloud spending to ensure they are getting the best value for their investment. 

Similarly, there have been numerous Gen AI exploration projects as organisations are just getting started with AI. We’ve noticed that there often isn’t a suitable business case for these initiatives, and most aren’t taken through to production. It’s crucial to properly identify the value of AI initiatives and ensure that more focus is given to clearly scoped AI projects with tangible business benefits. By doing so, organisations can avoid wasting resources on projects that don’t deliver real value and instead concentrate on initiatives that have a clear and measurable impact on their business.  

The key priorities for clients looking to drive successful outcomes in data and analytics are cost optimisation for cloud solutions and a focused approach to AI initiatives with well-defined business cases. 

Tom, why Snap Analytics? 

We’re experts in delivering the data foundations essential for successful AI projects. Our team is dedicated to ensuring our customers get the best service, reflected in our high retention rates and excellent feedback. What sets us apart is our focus on our customers. We understand every business is unique, and we tailor our solutions to meet each client’s specific needs. Whether it’s optimising data quality, implementing robust governance frameworks, or developing cutting-edge AI models, we’ve got you covered.

Our clients know they can rely on us to deliver results. We’ve built a reputation for being the team businesses turn to for new and innovative solutions. We’re constantly pushing the boundaries of what’s possible and are excited to help our clients achieve their goals. We’re here to provide the expertise, support, and innovation you need to thrive in today’s data-driven world. 

Part 2 – Scaling and Flexing: Data Vault Benefits

If you’re wondering what exactly Data Vault is, check out Part 1 of our Data Vault blog, where we outline exactly that. Otherwise, keep reading to find out all the great benefits of using the Data Vault methodology!

Scalability

Probably the biggest benefit, given the current data landscape we find ourselves in, is the scalability of Data Vault. In my experience, data warehouses become more resistant to change over time and much less able to adapt to the changing needs of the business. Once certain reports and datasets have been created, developing new functionality means testing all of the existing functionality to ensure it hasn’t broken, and refactoring it if it has. This is where Data Vault really helps: the model is easily extensible without any worry about losing historical data. New link tables and satellites can easily be added to the model, and existing link tables can be closed off when required. When virtual information marts are used it’s even more adaptable, as only the joins need updating to handle the changes.
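To make that concrete, here’s a toy sketch in Python (with made-up table and field names, not a real warehouse) of why extending a vault is low-risk: adding a new link table touches nothing that already exists, and a virtual information mart is just a join computed on demand.

```python
# Toy vault: hubs are plain tables. Extending the model means adding a
# new table, never altering the ones that already exist.
hub_product = [{"product_hk": "p1", "product_id": "SKU-1"}]
hub_supplier = [{"supplier_hk": "s1", "supplier_id": "ACME"}]

# New business requirement: track which supplier provides which product.
# Adding this link leaves both hubs (and any reports built on them)
# completely untouched -- no regression testing of old structures.
link_product_supplier = [{"product_hk": "p1", "supplier_hk": "s1"}]

def product_supplier_mart():
    """Virtual information mart: a join computed on demand, not persisted.
    When the model changes, only this join logic needs updating."""
    return [
        {"product_id": p["product_id"], "supplier_id": s["supplier_id"]}
        for link in link_product_supplier
        for p in hub_product if p["product_hk"] == link["product_hk"]
        for s in hub_supplier if s["supplier_hk"] == link["supplier_hk"]
    ]
```

Calling `product_supplier_mart()` here returns `[{"product_id": "SKU-1", "supplier_id": "ACME"}]`; the point is that the “view” is just code over stable tables, so change is absorbed in the joins rather than in the model.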

Flexibility

Link tables model every relationship as many-to-many, so virtually no work is required when a relationship changes. All historical data is tracked by default in the satellite tables, meaning the flexibility is there to report on historical changes across the data warehouse. For example, if the business decides it actually needs to report on the historical status of a master data object that previously wasn’t a requirement, this can be provided simply by altering the joins to pull the correct data from the master data satellite tables. We would previously have worried that this approach stores lots of redundant data, but with the rise of cloud storage the cost of keeping it is so trivial that we can now design and build with future change in mind.
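A minimal Python sketch of the idea (illustrative only; the satellite rows and status values are invented): because a satellite appends a new row on every change rather than updating in place, a point-in-time question that was never an original requirement is just a different filter over data you already have.

```python
from datetime import date

# Satellite: full history of descriptive attributes, keyed by the hub's
# hash key plus a load date. Rows are only ever appended, never updated.
sat_customer = [
    {"customer_hk": "a1", "load_date": date(2023, 1, 1), "status": "PROSPECT"},
    {"customer_hk": "a1", "load_date": date(2023, 6, 1), "status": "ACTIVE"},
    {"customer_hk": "a1", "load_date": date(2024, 2, 1), "status": "LAPSED"},
]

def status_as_of(satellite, customer_hk, as_of):
    """Point-in-time lookup: the latest satellite row on or before `as_of`.
    A brand-new historical report needs only this join/filter logic --
    no remodelling and no backfill."""
    rows = [r for r in satellite
            if r["customer_hk"] == customer_hk and r["load_date"] <= as_of]
    return max(rows, key=lambda r: r["load_date"])["status"] if rows else None
```

With this history in place, `status_as_of(sat_customer, "a1", date(2023, 7, 15))` gives `"ACTIVE"`: the answer existed in the vault all along, waiting for someone to ask the question.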

Auditability and Traceability

This tracking of history also means that the data warehouse is fully auditable. Data Vault practitioners like to say it becomes the “single version of the facts” rather than the “single version of the truth”. All data is loaded into the Data Vault exactly as it was in the source, rather than only cleansed data for a specific reporting purpose. This means you can fully reconcile the data back to the source system at the point of entry, and data lineage becomes much easier to track. Business users will no longer see the data warehouse as a ‘black box’; instead it becomes the trusted source for all of their business data!

Big Data & Loading

Traditional modelling approaches were created long before the advent of semi-structured and machine-generated data. Data Vault doesn’t require data to be cleansed and conformed into a star schema, which means huge volumes of data can be loaded very quickly. Data Vault also uses hashed keys, which reduce the dependencies between hub, satellite and link tables during loading, so they can be loaded in parallel, making use of the scalable processing power available in cloud data warehouse solutions like Snowflake. On top of the vault sits a virtual information mart layer for reporting, which again increases performance because cubes don’t need to be persisted and loaded. This is a key feature in making real-time data warehousing achievable.
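Here’s a short Python sketch of why hashed keys remove those loading dependencies (the business keys and column names are invented for illustration). Because a hash key is computed purely from the business key, any process can derive it independently, so hub, link and satellite rows built from the same source row will agree on their keys without any lookups against previously loaded tables.

```python
import hashlib

def hash_key(*business_keys: str) -> str:
    """Derive a deterministic surrogate key by hashing normalised business
    key(s). MD5 is a common Data Vault choice; SHA-256 also works."""
    joined = "||".join(k.strip().upper() for k in business_keys)
    return hashlib.md5(joined.encode("utf-8")).hexdigest()

# One source row split into hub and link rows, with no dependency on any
# previously loaded table -- each target can be loaded in parallel.
source_row = {"customer_id": "C001", "order_id": "O42"}

hub_customer = {"customer_hk": hash_key(source_row["customer_id"]),
                "customer_id": source_row["customer_id"]}
hub_order = {"order_hk": hash_key(source_row["order_id"]),
             "order_id": source_row["order_id"]}
link_row = {"link_hk": hash_key(source_row["customer_id"],
                                source_row["order_id"]),
            "customer_hk": hub_customer["customer_hk"],
            "order_hk": hub_order["order_hk"]}
```

Because the keys are recomputable, the link row’s foreign keys match the hub rows even if the hubs are loaded later, or by an entirely separate process — which is exactly what makes parallel loading safe.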

All of this sounds great, so why isn’t Data Vault more widely used when building data warehouses? I think the simple answer comes down to a couple of key factors. The first is that cloud data warehousing has only recently seen widespread adoption, and it’s only with the scalability of cloud storage and processing that a Data Vault structure becomes truly viable. The other is that for a small project, building with the Data Vault methodology may simply be overkill. However, given the increasing importance of data and stricter data regulations (e.g. GDPR), I would argue that in many cases the Data Vault methodology is the right choice to ensure a modern, scalable data warehouse.

Written by Tom Bruce, Delivery Lead and Co-founder of Snap Analytics.