How to Automatically Shut Down an Azure Matillion Instance After a Schedule Finishes

This blog follows on from the How to Automatically Shut Down an AWS Matillion Instance After a Schedule Finishes blog, but provides the steps relevant for Azure rather than AWS.
I would strongly recommend reading the introduction and The “Death Loop” Issue sections in that blog before proceeding with the steps below. Fortunately, configuring this for Azure is simpler than for AWS, because Azure gives instances managed identities by default, whereas AWS requires the instance to be granted a custom role with a policy allowing it to turn itself off.

Step 1: Installing the Azure CLI

The Azure CLI is a powerful tool for interacting with the Azure Cloud Platform in various ways. Here, we will use a simple CLI command to deallocate an Azure VM.  To begin, you will need to install the Azure CLI on the Matillion VM, which can be done by following this installation guide by Microsoft.
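
As a rough sketch, on most Linux distributions the install boils down to running Microsoft’s install script over SSH (always defer to the linked guide for the current, distribution-specific instructions):

# Download and run Microsoft's Azure CLI install script (check the official guide for your OS version)
curl -L https://aka.ms/InstallAzureCli | bash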

Step 2: Creating a Deallocate Bash Script

SSH into the VM and create a file containing the script below at the following path, ensuring that the centos user owns the file:

/home/custom_scripts/deallocate_server

#!/bin/bash
# Give the Matillion schedule a short grace period to finish before deallocating
sleep 30
# Authenticate the Azure CLI with the VM's managed identity, then deallocate the VM
az login --identity
az vm deallocate --resource-group <MY_RESOURCE_GROUP> --name <MY_VM_NAME>

The first command sleeps for 30 seconds to ensure that the Matillion schedule has enough time to complete safely before the VM is deallocated. The second command authenticates with the Azure CLI with the VM’s managed identity. The final command executes the VM deallocation using the Azure CLI.
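
If the file was created as root (or any user other than centos) while SSHed in, ownership can be handed over with standard commands along these lines; the exact permissions are a judgement call, as the file only needs to be readable by the user running it:

# Make the centos user the owner of the deallocate script and keep it readable/executable
sudo chown centos:centos /home/custom_scripts/deallocate_server
sudo chmod 744 /home/custom_scripts/deallocate_server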

A couple of things to note:

  • If you have a separate production Matillion instance, the above steps will need to be redone on that instance, and the new resource group and VM name will need to be used in the deallocate_server script.
  • The VM’s Enterprise Application in Azure will need at least the ‘Desktop Virtualization Power On Off Contributor’ role on the VM. Usually, the VM will already have sufficient privileges for this.
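
If the identity does turn out to be missing that role, it can be granted with the Azure CLI. The sketch below assumes placeholder values for the managed identity’s object ID, subscription, resource group and VM name:

# Grant the VM's managed identity the role needed to deallocate the VM (placeholder IDs assumed)
az role assignment create \
  --assignee <MANAGED_IDENTITY_OBJECT_ID> \
  --role "Desktop Virtualization Power On Off Contributor" \
  --scope /subscriptions/<SUBSCRIPTION_ID>/resourceGroups/<MY_RESOURCE_GROUP>/providers/Microsoft.Compute/virtualMachines/<MY_VM_NAME>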

Step 3: Implementing in Matillion

From here, we will use a Bash Script component to execute the deallocate_server script created above. A wrapper job will be needed around your main pipeline so that you can attach a Bash Script component to the end of the pipeline (this wrapper job will be the one run by your Matillion schedule). Important: the flow from the main pipeline (in this case e2e_nightly) will need to be unconditional (grey) so that the server is turned off regardless of whether the pipeline was successful. If the Bash Script component only executes when the main pipeline succeeds, your VM will stay on whenever the pipeline fails (unless you have perfect pipelines…).

Within the Bash Script, place the below command which will execute the deallocate_server script that we created on the VM in step 2.

sh /home/custom_scripts/deallocate_server >/tmp/deallocate_server.log &

Crucially, the ampersand symbol (&) at the end of the command allows it to be executed without waiting for the script to finish. This lets the Bash Script component immediately flag as completed in the eyes of the Matillion task scheduler, and therefore the schedule will be marked as complete. This avoids the aforementioned “death loop”, as there is no dependency on the deallocation commands completing before the Matillion schedule can finish. Additionally, the command redirects the output of the deallocate script to a log file for auditing purposes.
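
Once the first scheduled run has completed, it is worth confirming the behaviour from another machine (or the Azure portal). Assuming you have the Azure CLI installed and are logged in, a check along these lines should report the VM as deallocated:

# Show the current power state of the Matillion VM (placeholder names assumed)
az vm show -d --resource-group <MY_RESOURCE_GROUP> --name <MY_VM_NAME> --query powerState -o tsv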

Final thoughts

The solution proposed in this blog uses the Azure CLI to deallocate your Matillion VM by simply running a Bash Script component. It should be noted that there are a number of alternative ways to achieve this, such as using message queues to trigger a cloud function to shut down the VM, which is equally valid.

Once you have this deallocation functionality configured, you can rest assured that your Matillion VM will dynamically shut down once your schedule completes. Please feel free to reach out to me on LinkedIn or drop a comment on this blog if you have any further questions.

How to Automatically Shut Down an AWS Matillion Instance After a Schedule Finishes

Matillion customers, in their effort to optimise credit consumption, are eager to reduce unnecessary costs by minimizing the uptime of their instances. One particularly tricky aspect of this optimisation is managing instance shutdown after a routine schedule has completed, be it a successful or failed run. Unfortunately, Matillion doesn’t offer an inherent feature to automatically switch off instances as part of a pipeline. Furthermore, the execution duration of these schedules can vary due to factors like data volumes and the day of the week, making it impractical to implement a fixed-time shutdown. Consequently, a flexible alternative solution is required. The configuration process for enabling this functionality is slightly different between AWS and Azure.

This blog will cover the steps for AWS; the steps for Azure can be found here.

The “Death Loop” issue discussed below is relevant to any instance: AWS, Azure or other.

The “Death Loop” Issue

Before delving into the steps for enabling this functionality, it is crucial to address an issue concerning VM deallocation during a running job. Consider this scenario: your nightly schedule is running, all jobs complete (regardless of success or failure), and you want the last component in your pipeline to deallocate the VM (we’ll cover how to create a deallocate component in the following sections). Matillion will expect the deallocation component to return a success or failure response, like any other component, before it can mark the running task as complete. However, the Matillion task manager will never see the deallocation component complete, because the server deallocates at that very instant. Consequently, when the VM is switched back on, the task scheduler detects that the job didn’t fully complete and automatically resumes it from where it left off, which was at the “Deallocate Server” component. As a result, the VM enters what I like to call a “death loop”, where it repeatedly switches itself off every time it’s turned on. Breaking this loop is challenging, so the approach below avoids the problem altogether by decoupling the deallocation from the scheduled job: rather than embedding the deallocation command directly in the Matillion job, the job launches a bash script on the VM in the background and completes immediately. Below are the steps to achieve this.

Step 1: Assigning a Role to the Instance

Firstly, an AWS role with the ability to turn off the instance needs to be created and given to the Matillion EC2 instance.

  1. Create a policy in AWS via IAM by selecting ‘Create policy’ in the Policies page.
  2. Select ‘EC2’ as the Service.
  3. Search for and select the ‘StopInstances’ Action.
  4. We will want to restrict this to only work for the specific Matillion instance so select ‘Add ARNs’. In the pop-up choose the appropriate account radio box and enter the resource’s region and ID.
  5. Feel free to add request conditions such as the requester’s IP address being the Matillion IP. Click ‘Next’.
  6. Provide a Policy name, then create the policy (a CLI equivalent of this policy is sketched after these steps).
  7. Next, we need to create a role to assign the policy to. Select ‘Create role’ in the Roles page.
  8. Select the ‘AWS service’ Trusted entity type and ‘EC2’ as the Use case. Click ‘Next’.
  9. Search for and select the Policy created in the previous steps. In my case, this is ‘EC2StopInstancePolicy’. Click ‘Next’.
  10. Provide a Role name, then create the role.
  11. Lastly, we need to assign the newly created role to the Matillion EC2 instance. Head to the EC2 Dashboard, and then to the Instances page.
  12. Select the Matillion instance, in the top right click ‘Actions’ > ‘Security’ > ‘Modify IAM role’.
  13. Select the Role created in the previous steps and click ‘Update IAM role’.
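
As an aside, the policy from steps 1–6 could equally be created from the AWS CLI rather than the console. A rough sketch, with placeholder values assumed for the region, account and instance ID, might look like this:

# Create a policy that only allows stopping the specific Matillion instance (placeholder values assumed)
aws iam create-policy \
  --policy-name EC2StopInstancePolicy \
  --policy-document '{
    "Version": "2012-10-17",
    "Statement": [{
      "Effect": "Allow",
      "Action": "ec2:StopInstances",
      "Resource": "arn:aws:ec2:<REGION>:<ACCOUNT_ID>:instance/<INSTANCE_ID>"
    }]
  }'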

Step 2: Installing the AWS CLI

The AWS CLI is a powerful tool for interacting with the AWS Cloud Platform in various ways. Here, we will use a simple CLI command to deallocate an EC2 instance. You will need to install the AWS CLI on the Matillion VM, which can be done by following this installation guide.
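
At the time of writing, installing version 2 of the AWS CLI on a 64-bit Linux VM typically boils down to the following three commands (the linked guide remains the authoritative reference):

# Download, unpack and install AWS CLI v2 for Linux x86_64
curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"
unzip awscliv2.zip
sudo ./aws/install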

Step 3: Creating a Deallocate Bash Script

SSH into the VM and create a file containing the script below at the following path, ensuring that the centos user owns the file:

/home/custom_scripts/deallocate_server

#!/bin/bash
# Give the Matillion schedule a short grace period to finish before stopping the instance
sleep 30
aws ec2 stop-instances --instance-ids <Your Instance ID>

The first command sleeps for 30 seconds to ensure that the Matillion schedule has enough time to complete safely before the VM is deallocated. The second command executes the VM deallocation using the AWS CLI. It is worth mentioning that if you have a separate production Matillion instance in a different AWS account, the above steps will need to be redone in that account, and the new instance ID will need to be used in the deallocate_server script.
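
Before wiring this into Matillion, you can check that the role from step 1 actually grants the stop permission without switching anything off, using the CLI’s dry-run flag; if the permission check passes, the command should fail with a “DryRunOperation” message rather than an “UnauthorizedOperation” one:

# Check the stop permission without actually stopping the instance (placeholder instance ID assumed)
aws ec2 stop-instances --instance-ids <Your Instance ID> --dry-run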

Step 4: Implementing in Matillion

From here, we will use a Bash Script component to execute the deallocate_server script created above. A wrapper job will be needed around your main pipeline so that you can attach a Bash Script component to the end of the pipeline (this wrapper job will be the one run by your Matillion schedule). Important: the flow from the main pipeline (in this case e2e_nightly) will need to be unconditional (grey) so that the server is turned off regardless of whether the pipeline was successful. If the Bash Script component only executes when the main pipeline succeeds, your VM will stay on whenever the pipeline fails (unless you have perfect pipelines…).

Within the Bash Script, place the below command which will execute the deallocate_server script that we created on the VM in step 2.

sh /home/custom_scripts/deallocate_server >/tmp/deallocate_server.log &

Crucially, the ampersand symbol (&) at the end of the command allows it to be executed without waiting for the script to finish. This lets the Bash Script component immediately flag as completed in the eyes of the Matillion task scheduler, and therefore the schedule will be marked as complete. This avoids the aforementioned “death loop”, as there is no dependency on the deallocation commands completing before the Matillion schedule can finish. Additionally, the command redirects the output of the deallocate script to a log file for auditing purposes.
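
As a final sanity check after the first scheduled run, a query along these lines from another machine with the AWS CLI configured should show the instance as stopped (placeholder instance ID assumed):

# Query the current state of the Matillion instance
aws ec2 describe-instances \
  --instance-ids <Your Instance ID> \
  --query 'Reservations[0].Instances[0].State.Name' \
  --output text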

Final thoughts

The solution proposed in this blog uses the AWS CLI to deallocate your Matillion VM by simply running a Bash Script component. It should be noted that there are a number of alternative ways to achieve this, such as using message queues to trigger a cloud function to shut down the VM, which is equally valid.

Once you have this deallocation functionality configured, you can rest assured that your Matillion VM will dynamically shut down once your schedule completes. Please feel free to reach out to me on LinkedIn or drop a comment on this blog if you have any further questions.

5 main challenges getting data out of SAP and how to overcome them

One of the most common questions I get from clients is “why is getting data out of SAP so hard? Isn’t it just another source system?”. After a while pondering this question, I thought I would list out the reasons, based on our numerous projects getting data from SAP into cloud systems such as Snowflake. Once I’d started, I couldn’t stop. Here are my top five.

1. The data is complex

SAP systems are the nerve centres of global enterprises. Many business-critical processes are managed and controlled with SAP systems. Consequently, SAP systems contain the most treasured information large organisations have – namely their financial and operational data. As a result, these systems are complex both in terms of the number of processes they support and in the volume of data they hold. A typical SAP system contains hundreds of thousands of tables, with many complex relationships between them.

2. SAP systems and SAP data are heavily governed

As the system holding an organisation’s most crucial and sensitive data, it’s only right that there is a lot of governance and process in place to protect the data and the system itself. That means added time and complexity when trying to get the data out of SAP. Various SAP teams and stakeholders will often need to be involved to ensure that the correct access is granted and that operational processes are not jeopardised by the process of ‘getting the data out’.

3. SAP at its core is old technology

SAP is 50 years old this year. When SAP started, memory was precious, to the extent that table names and field names were abbreviated to four and six characters respectively. SAP was originally developed in Germany, so the abbreviations are German abbreviations. Over time, SAP has created several metadata layers which can sometimes help to surface more descriptive names in a data model, but when you look at a system today, at its core you still find the incomprehensible abbreviations. This is why you need to be an SAP functional expert to understand how to both get to and make sense of the data that you need.

4. Lots of different options and frameworks for ‘getting the data’

SAP systems contain a wide variety of SAP-specific object types – extractors, ABAP reports, BADIs, iDocs, ADSOs, CompositeProviders etc. – which are not immediately obvious to those who don’t come from an SAP background.

Lastly, to use features such as the ODP framework with SAP extractors, you will need to set up various things on the SAP system itself. This will often require the help of SAP Basis teams to ensure that you can use delta extraction, enabling incremental loading – a must given the data volumes of some key sources such as GL line items. For more information on the SAP extraction options, please read this excellent blog from the SAP guru Jan van Ansem here.

5. A complex licensing model, which means options that are technically available may not be permitted under the license agreement

Often customers think that the easiest way to extract the data is simply to connect to the HANA database and replicate the data they need. Whilst this is a relatively simple process, there are licensing constraints that prevent most organisations from doing so. Those that have a HANA Runtime license (which is the majority of clients) are not able to extract from the database layer and can only extract from the application layer. SAP has been known to sue for some extremely large sums of money when their licensing constraints are broken by their clients, as Anheuser Busch found out to their peril.

Hopefully the above gives you a bit of an idea of why loading from SAP to cloud data platforms is not just the same as other source systems and why it’s imperative to have people that are both experts in SAP systems and cloud data platforms and architectures. Luckily at Snap we are a team of SAP data experts with a focus on modern data cloud technologies such as Snowflake and Matillion and have a range of accelerators to simplify the process of extracting your data from SAP systems. Please do reach out to us if you’re interested in maximising the value of your SAP data in the cloud.

Photo credit: Mitchell Luo on Unsplash

Lessons Learned from a Sustainability ESG Reporting Project

What is ESG and EPR?

Organizations in the UK involved in packaging supply or importation must now adhere to the ‘Extended Producer Responsibility’ (EPR) regulation. This regulation carries significant weight as it is legally binding, and not complying could cause serious brand damage.

The processes related to managing packaging fall within the realm of ‘Environmental and Social Governance’ (ESG). Having recently completed an ESG reporting project for a global food manufacturer, I thought it useful to share some lessons learned here.

Companies are expected to provide evidence regarding the recyclable and non-recyclable components of their packaging. This requirement has been in effect since 2023, and achieving automation in this process necessitates a verified enterprise data set and a suitable platform for generating the required outputs.

Managers responsible for this task may find it daunting. Manual execution of the work is excessively time-consuming, labour-intensive, and susceptible to errors.

The data is complex and requires subject matter experts throughout the project.

The data required is complex. It entails product master data and bills of materials (BOMs) typically stored in SAP ERP systems. Additional data may be necessary from other packaging specification databases. Multiple versions of BOMs may exist, and packaging specifications reside within the system, incorporating various fields related to weight and dimensions. Some packaging items are composite in nature, consisting of both plastic and cardboard, requiring separation in calculations. Addressing all these factors requires careful consideration and understanding in collaboration with business subject matter experts (SMEs) and data owners.

Requirements will change. Be prepared to adapt.

The reporting output requirements are still unclear and evolving. 2023 marks the inaugural year for the formal collection of EPR data and reporting, but the precise details of what data should be reported and how are yet to be finalized. However, certain agreed classifications include:

  • Packaging activity: How the packaging is supplied.
  • Packaging type: Whether the packaging is household or non-household.
  • Packaging class: Whether the packaging is primary, secondary, shipment, or tertiary.
  • Packaging material and weight.

This calls for a solution that can swiftly adapt to changes, likely necessitating a platform separated from the strict internal SAP change control process.

Traditional methods of piecing together reports are inadequate. Attempting to manipulate sales volume data at the BOM component level for all products across multiple sites using MS Excel often results in unwieldy and unmanageable files.   The ideal solution for this scenario involves leveraging a cloud-based Data Warehouse equipped with robust capabilities to handle substantial data volumes. It should be accompanied by an efficient ETL (Extract, Transform, Load) tool capable of seamlessly extracting and loading data from SAP and other databases. Additionally, a versatile toolkit enabling flexible manipulation of the data into reusable assets is crucial. To effectively present the data in diverse report formats, a data visualization tool would be essential.

Check the quality of your data.

Jumping directly to final report outputs will lead to frustration. It is crucial to comprehend the state of your data beforehand. Master data may be incomplete or inconsistent. Begin by creating master data reports that allow for comprehensive data analysis and filtering. Generate exception reports highlighting products with missing weight data, for instance. Correct any underlying data issues and then proceed to generate the required reports with specific calculations in a second phase.

Choose the appropriate technology stack.

At Snap, we possess extensive experience in extracting data from SAP environments and combining SAP data with other sources in cloud data warehouses such as AWS Redshift, Snowflake, and Google BigQuery. We use best-of-breed cloud data platform tools, such as Matillion, to manage the data warehouse processes in an effective way, and modern BI platforms to provide actionable insights in the ESG context. We also have expertise in data visualization across various tools.

Find a partner with prior experience.

Collaborating with a team that has gone through this process before will expedite your work and minimize risks. ESG responsibilities encompass more than just EPR and often require similar data transformation projects.

I hope this article helps you with your ESG reporting. If you would like to discuss your requirements and how to create a flexible reporting and analytics platform for ESG, please contact us at Snap Analytics.

Be a Data Hero and deliver Net Zero!

The biggest problem in the WORLD!

It is clear that we need radical changes to save our planet. Governments, the private sector and individuals aspire to achieve ‘Net Zero’ – but radically changing the way we operate is not going to be easy.

Achieving this goal is going to be a huge challenge for big, complex organisations. There are so many areas to explore, from reducing travel and fossil fuel consumption and leveraging renewable energy, to improving the efficiency of existing equipment or simple behaviour change. With so much complexity the task can be daunting.

Can data save us?…

Starting with data can help you to understand where the quickest and biggest wins are.  This helps you to understand what to focus on first.  As Peter Drucker once famously said “You can’t manage what you don’t measure”.

To create a link between desired outcomes and measurable targets you can use a ‘Data Value Map’. Whilst I love technology and data…it’s only useful when it drives actions and creates positive change.  The Data Value Map helps to visualise how data can help you to achieve your goals.  If your goal is Net Zero…it could look something like this:

Data Value Maps can be created using a mind mapping or collaboration tool (I like Mindmeister and Miro) and are best done as a highly collaborative team workshop…don’t forget to bring the coffee and cakes!

Now you have a clear view of what data is required to measure and act on (your “use cases”) to deliver the Net Zero goal. Next, you can score these in terms of Value and Complexity. Something like a prioritisation matrix can help:

By focusing in on the ‘high priority’ and ‘low complexity’ use cases you can deliver quick wins to the business.  This will help you to demonstrate you are a true Data Hero and can help your organisation to fly!

Once you have prioritised your use cases, you can start to map out the underpinning systems and processes that are needed to deliver connected, structured data to drive your Net Zero goals. 

Delivering at lightning speed…

There are numerous technologies out there that can help you connect all of this data, but we love Matillion for being able to easily and quickly connect to almost any source and transform and join data to make it useful.  As a data platform Snowflake is fantastic for virtually unlimited storage, blistering speed, data warehousing and data science capabilities.  These technologies will certainly enable you to hone your capabilities as a true Data Hero!! There are also many other fantastic cloud solutions that can help you to supercharge your Net Zero data capabilities.

Join the Data League!

Snap Analytics’ team of Data Heroes are helping one of the UK’s largest food manufacturers to leverage data to drive positive change…but if we’re going to solve humanity’s greatest threat…it’s going to take a whole Justice League of Data Heroes. So join us on this mission to save the planet, and let’s all make sure the decision makers in our organisations have the data they need to drive positive change. Don’t delay…be a Data Hero today!

We believe that businesses have a responsibility to look after our earth…it’s the only one we have!  We will give any organisation a 15% discount on our standard rates for any work directly linked to making a positive change to the environment!

5 common data project challenges and how to avoid them

Any data project is filled with challenges; solving these problems or avoiding them altogether is the key to success. Narrowing this list down to just five was a challenge in itself!

Challenge 1: My data is getting too complicated

Our mantra is ‘keep things simple’. That applies to the amount of code we write as well as reducing unnecessary maintenance and management of technology. Unfortunately this isn’t everyone’s experience. For example, attempting to integrate a variety of existing systems presents your average developer with a tempting challenge: ‘I know how to fix this,’ they immediately shout out, ‘but it’s going to take a while!’ Chances are, somebody, somewhere has already solved that exact same problem. Rather than waste time and money reinventing the wheel, you can simplify these things by using out-of-the-box connectors and automated data pipelines which allow you to connect to your source systems straightaway. They’re pre-built and remotely managed, so you can be working on your data in minutes not months!

Challenge 2: Nobody’s using my dashboard

The dashboard is the place to splash your important findings in a way that is accessible and easily understood. So how come nobody is looking at it? It’s a gripe we often hear from stakeholders who’ve asked for the dashboard and the developers who sweat over them. The truth is, you can make the best, most functional dashboard in the world but nobody will look if it’s carrying the wrong information. The solution lies not in greater technical or creative prowess, but during the planning stages of your entire data project. First define the overall strategy and goals for the business and then ask how your data can help reach those goals. Secondly, identify those individuals in the business who are accountable for the numbers and targets. Without clear ownership of numbers and without insights that actually drive action, then why should anyone care what’s on the dashboard?

Challenge 3: The tech team is arguing with the business team

Traditionally there’s been ‘them’, the business team and ‘us’, the IT guys, and communication between the two can be more dysfunctional than a bad tempered debate in the House of Commons. Cross purposes and hidden agendas can make a long term data project a fraught affair, but there is a straightforward solution: nominate one developer with the appropriate people and communications skills to be embedded in the business to act as a single point of contact. This ensures that information and requests get passed on to all the relevant parties in a language they understand and you’ll avoid tedious delays and costly repetition of work.

Challenge 4: My data project is going over budget

This comes back to our golden rule of keeping things simple. Many data firms will spend three months or more setting up infrastructure and installing software before they even begin to analyse data. You needn’t do any of that if you use a data streaming service like Fivetran or a cloud data warehouse like Snowflake. Rather than wait until day 91 of a project, you can start integrating data straightaway. And the quicker you begin loading and looking at the data, the sooner you’ll start to see results. Beyond this, we like to use an agile approach in any project, delivering in regular stages which gives us the chance to make mistakes and reset if need be. As we like to say: fail fast, but fail cheap.

Challenge 5: My data project is taking too long

This is perhaps the number one gripe for many companies involved in complex data projects. The causes are many, and can be solved by successfully overcoming the other challenges on this list. Making sure you clearly understand the requirements up front, and have a clear view of the problem you are trying to solve is key. In addition, we recommend clear planning with measurable outcomes at every stage and employing a good project manager to keep everything (and everyone) on track and to deadline. Find out how our own agile approach to analytics could help you quickly get the data your business needs. 

Evolution not Revolution – An Agile Approach to Analytics

A commonly quoted adage in technology consulting is that “the last 20% of the work takes 80% of the time”. A report that will be used by a wide audience will often be designed by one or two key individuals that represent the wider business. However, due to this responsibility there is often the impression that the report must be fully completed with all functionality before it can be unleashed to the wider business. This can result in it taking a much longer time than expected to deliver new reports. However, by delivering the ‘must have’ information quickly and enhancing the report over time there can be significant benefits realised.

Faster Time to Value

By analysing the entire scope of the requirements given by the key business users it is possible to determine the core functionality which represents the Minimum Viable Product (MVP). If you are able to develop the core 80% of the functionality required within the initial 20% of the overall time that it takes for the full scope to be delivered, then there is a huge amount of benefit that the business users are able to get by starting to use this core functionality. This ensures that users are able to start getting value from the reporting product much quicker than if they have to wait for the full scope to be delivered, with the “nice to have” requirements often representing a significant proportion of the total development time.

Increased User Adoption

Another common challenge in analytics projects is user adoption. Too frequently, a lot of time is spent creating a bunch of reports which nobody actually uses. Users can often be frustrated by the amount of time it takes to deliver reports, and the danger is that if this takes too long they will find alternative solutions to their reporting problems. However, using an agile approach can massively help with increasing user adoption by delivering the core reporting functionality to users as early as possible. Additionally, the MVP will be simpler than the final report, meaning it is easier for users to start understanding how to use it. This also means they will be more likely to use the more complex functionality that might be delivered in a later iteration.

Better Quality

Lastly, and most importantly, the overall quality of the final reporting deliverable is likely to be much higher using an agile and iterative approach. The fact that business representatives often give the initial requirements can also mean that some key requirements from the wider business community are not captured. It can also be difficult for users to visualise and understand the reporting requirement without actually getting their hands on the data and starting to interact with it. However, by socialising the MVP version of the report you are able to receive much more feedback at an early stage of the development from a much wider user base. It can also be a lot easier to adapt to change and reduce the amount of rework that is needed. For example, if you’re baking a cake it’s a lot easier to adapt the ingredients when you’re mixing the cake than it is if you’ve added all of the ingredients and put the cake into the oven. This is very much the case when developing a report as often the functionality will build upon everything that has already been built, meaning it’s harder to change at the very end.

Using an agile approach to building reports can often lead to a solution that both meets the needs of the business users better and delivers value quicker. Viva la evolution!

How will IR35 impact your data project – and what can you do about it?

There are many things to consider when running a major data project. Weighing up the long term benefits over sometimes considerable short term expenditure can cause project managers sleepless nights. But come April 2020, many businesses will be asking themselves whether they can even afford to keep their project going. 

The latest factor to consider is IR35, which might sound like a secret branch of the security services, but is in fact a piece of HMRC legislation that will affect many thousands of companies and 230,000 freelance contractors. It aims to deal with the problem of ‘deemed employment’: the practice of using workers on a self-employed basis, often through an intermediary company, when permanent employment would be more appropriate. Using these ‘disguised employees’ can save companies considerable sums in tax and National Insurance contributions, and deny workers employment rights. Under the new arrangements it is up to the employer to assess whether anyone working for them falls within the bracket of IR35.

On paper this sounds like a fair way of tackling tax dodging and unscrupulous employment practices. But in the public sector where IR35 has already been rolled out many IT projects have been put on hold due to fears over rising costs and investigation by HMRC. Understandably the legislation has caused considerable trepidation across the business world. A survey by Be Digital UK found that four out of ten businesses are considering phasing out contractors altogether. The result of this could see many IT projects grind to a halt.  

What can you do to lessen the impact on your data project? 

The rules surrounding who does and doesn’t fall inside IR35 are rather opaque. There is no one definitive rule, rather a number of questions that you need to answer to assess a contractor’s status. Many employers remain concerned they may still fall foul of the legislation. The most foolproof way of ensuring you avoid the IR35 trap is to make everybody employees. Of course this is a significant long term investment, and could leave you with the additional problem of what to do with them once a project has come to an end. 

Terms of employment 

A key thing to consider is whether the role a contractor fulfils falls firmly within the parameters of the project they have been employed to do. It is often common practice for people to shift around as projects change. Contractors can find themselves on the organisation chart next to regular staff and being moved to different parts of the business. This is definitely not OK under the new rules. 

Milestones 

Project based work, particularly in IT is commonly based on milestones, rather than days worked. Projects billed on milestones are great for ensuring that contractors are clearly delineated as contract workers, rather than slipping into the area of ‘deemed employment’. 

Think big 

Peace of mind can come from working with a large consultancy who can guarantee they only use their own people; this circumvents the IR35 headache for the client. This has been the go to solution for many public sector organisations, but it’s one that comes with a considerable price tag. 

Think small 

You could reach out to smaller IT consultancies, as under the new legislation businesses employing fewer than 50 people and turning over less than £10.2 million annually will be exempt. If they need to hire contractors the rules won’t apply to them. Furthermore, they are likely to offer significantly lower costs than the big consultancy firms and can still deliver great value, especially for specialised pieces of work. 

Undoubtedly the introduction of IR35 will cause anxiety, particularly in the short term, about the viability of certain projects. There is no single best way to deal with the new IR35 legislation and ultimately the right choice for your data project will depend on a variety of different factors, however if you would like to speak to Snap Analytics to see how we can help then please get in touch: 

[email protected]

5 things to do before starting a data project

You’re about to start a big data project. Fantastic! We’re big believers in the fact that every business can gain a real competitive advantage through analysing their data. It’s why we do what we do. 

But just before you go running off all excited, stop for a moment. If you really want your data project to be a success, you need to think about five key things before you even start. 

Understand the problem that you are trying to solve 

Chances are you’re looking to data analytics to fix a specific need, something which is causing inefficiencies and costing you money. Don’t assume though that by shaking the data tree enough times a solution will magically fall into your lap. First you need to look at your existing systems to see what exactly needs fixing. It’s only once you have a clearly defined vision and end point in mind, that we can see exactly how we can help. 

Define what success will look like  

Having identified what you want, it is time to think about what a successful outcome might look like. It helps nobody to embark on a data project without setting any specific goals or measurable outcomes. So make a plan, draw up a list of milestones, devise ways of measuring what’s happening and then track the results against that. One useful approach we’ve found is to run a user survey six months down the line and find out how people are using, or benefitting from the findings.  

Align with company strategy 

It’s all very well you dreaming up fantastic, innovative data-driven projects that will change the very fabric of your business and the world generally. But it might be best, first of all, to check that your goals fall in line with the wider strategic direction of the business. Is this a problem you should even be solving? Is it a business priority? Will it help tick some important boxes when the annual report comes round? If the answer is yes to all the above, fantastic – you’re on your way to getting managerial buy-in and tapping up a healthy budget for an important piece of work.

Data for the people 

You’ve addressed the needs of the bigger cheeses but don’t forget about the little guys, the people on the front line who are working hard to produce this data in the first place. Think about how this is going to benefit them in the long term: how will it make their day-to-day work easier, more efficient, or more effective? This is particularly pertinent if your business is going through a restructuring process. We’re great believers in the power of data analysis, but if you’re losing half your team it might not be perceived as the best use of company resources.

Build the right team 

One commonly held assumption we come across is the idea that data is purely a tech led process: you identify a problem or need and the nerds crunch the numbers. It’s not that simple of course. To produce an effective outcome, you need quality input from people on the business side, members of the team who can provide insights into how the company works and what its goals and strategies are. You should bring together people who use the data in different ways and can provide the broadest possible range of experience. That way the insights we produce will be deeper, richer and ultimately more valuable. 

How to get buy-in for your data project

Despite the talk of a data-driven revolution, the reality for many companies often lags some way behind the ideal of a business built on reliable, detailed information analysed using AI. According to a McKinsey Digital report on leadership and analytics, CEOs cite their biggest challenges to investing in data as “uncertainty over which actions should be taken” and “lack of financial resources.” Fundamentally, it appears that some business leaders still don’t believe that analytics have a high enough ROI.

If you know that a data project could deliver huge value for your business but you’re struggling to get anyone else to appreciate the importance of all that expensive tech and numbers nonsense, here are a few ways to help change their mind. 

Speak their language 

There’s no point trying to convert anyone into data evangelists at this stage, instead work within the parameters of your organisation. Align your project to existing business priorities and show that you understand the CEO’s strategies. The best way to do this is to create a link between the results of your project and the financial benefits. Get those graphs at the ready! 

Remember that this is a business transaction and not a technical pitch. You’re wasting your time if you can’t convincingly demonstrate that you are addressing a particular business need, so resist using too many technical terms. Instead, explain using visuals which demonstrate that the outcomes from your data project align with the strategic objectives of the business and will bring tangible benefit to the company and the individuals who work there. 

You may need to convince people that you are addressing a demand that they didn’t even know they had. By the end you want not just buy in, but for them to believe that it was their idea all along! 

Recruit key players 

It’s not enough to expect the techies to wave a magic wand and sprinkle stardust over the business. Successful data projects are a partnership between the project team and key business users who work with the numbers on a daily basis. Don’t expect that a diktat from the 17th floor is going to be enough to drive them into making this thing a success, it’s your job to involve as many key players as possible, make them feel they’re being listened to and that they have some control over the direction of the project. 

Throughout the course of any data project, key business users should have frequent and regular opportunities to provide feedback. This agile approach will help to ensure that the solution is actually what the business wants – unless they are able to see the solution in action it’s impossible for them to really know what they want! An added bonus is that they will feel that they have shaped the project and will have a vested interest in the outcome. 

Recoup your investment 

From the C-Suite down to the office floor, what everybody wants is something that makes their work easier and the business more successful. If you can demonstrate that what you are doing is going to achieve tangible results they will sit up and listen. For example, a report which currently takes 5 hours will take 1 hour as a result of this project. And if that report is created by 100 people at a cost of £50 per hour, the bill drops from £25,000 to £5,000. 

To hammer that point home, McKinsey carried out analysis over a five year period which showed that companies who put data at the heart of their operation enjoyed marked improvements across all departments, with sales and marketing ROI increasing by 15%-20%. Investing in creating a data driven culture is vital for the growth of any business determined to stay ahead of the pack.