Mind the Gap: 4 key actions data engineers can take to help bridge the digital divide

Digital exclusion is a pressing concern. According to the UK Government’s report on the Data Skills Gap, between 2019 and 2022 approximately 46% of businesses struggled to recruit for roles that required basic data skills. Moreover, about 25% of businesses reported a lack of data skills in machine learning, 22% in programming, 23% in knowledge of emerging technologies and solutions, and 22% in advanced statistics within their sectors. It is estimated that by 2030 the UK will face its largest skills gap in basic digital abilities. AI has gained significant popularity over time; however, without targeted action, its growing use will widen the divide between marginalized communities and those who are digitally connected. While regulatory bodies will lead most of the targeted actions, data engineers can also contribute significantly by making small changes to ensure everyone has access to the benefits of AI. In this article we will look at what ‘digital exclusion’ means, and how simple changes in data engineering practices can make a difference.

The integration of AI has emerged as a game-changer, enabling businesses to personalize strategies, optimize processes, and enhance customer experiences. AI-driven analytics has revolutionized how companies connect with their target audience. However, concerns remain regarding digital exclusion, which can present itself in the form of the digital divide or algorithmic bias. As data engineers, it’s essential to recognize these challenges and proactively address the risks, ensuring that AI’s transformative potential benefits all users equitably. Later, I’ll present 4 actions data engineers can employ to mitigate the impact of algorithmic bias to help bridge this digital divide.

Digital divide

The digital divide describes the gap between people who have easy access to computers, phones, or the internet and those who do not. Access barriers, the obstacles that prevent people from using or benefiting from technology, play a major role in widening this gap. These can include high costs, lack of infrastructure, limited digital literacy, and restrictive policies that block access to devices, the internet, and digital services. In 2023, the House of Lords Communications and Digital Committee highlighted that digital exclusion remains a critical issue, with basic digital skills projected to be the UK’s most significant skills gap by 2030. The committee noted that the cost-of-living crisis has worsened the situation, making it even harder for people to afford internet access (Tudor, 2024(1)).

Algorithmic biases

Algorithmic bias refers to discriminatory treatment that may stem from biases embedded within algorithms. As a result, certain groups of people may be advantaged or disadvantaged. This bias can fall along dimensions such as race, gender, ethnicity, age, or socioeconomic status. Furthermore, algorithmic biases can make unfair situations worse by leaving out some groups or reinforcing stereotypes as a result of skewed user demographics, leading to inaccurate consumer profiling and discriminatory targeting.

What you can do

Navigating these challenges requires proactive measures to mitigate biases. Data engineers can carefully scrutinize AI algorithms and implement transparent data practices. These include employing bias detection and mitigation algorithms, ensuring diverse and inclusive data collection and model development processes, and enhancing transparency and accountability in AI development and deployment. Scoring datasets is one method that can be used to achieve this. When it comes to scoring datasets on diversity properties, the goal is to assess how diverse the data is in terms of representation across different demographic groups or attributes. 4 key actions to follow to score these datasets include:

  1. Defining diversity metrics – Identify the key diversity dimensions or attributes relevant to your context, such as gender, age, or ethnicity.
  2. Quantifying diversity – Measure how each group is represented; this could involve calculating representation percentages.
  3. Setting thresholds or benchmarks – Base these on organisational goals, industry standards, or regulatory requirements.
  4. Scoring diversity – For example, a dataset with balanced representation across different demographic groups would receive a higher diversity score.
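
The 4 steps above can be sketched in a few lines of Python. This is a minimal illustration rather than a production tool: the sample data, the entropy-based scoring formula, and the 0.9 benchmark are all assumptions chosen for the example.

```python
from collections import Counter
import math

def diversity_score(values, threshold=0.9):
    """Score how balanced a demographic attribute is, using normalized
    Shannon entropy: 1.0 means perfectly balanced, 0.0 means one group only."""
    counts = Counter(values)                                   # step 1: chosen dimension
    total = sum(counts.values())
    shares = {group: n / total for group, n in counts.items()} # step 2: representation %
    if len(counts) < 2:
        return shares, 0.0, False
    entropy = -sum(p * math.log(p) for p in shares.values())
    score = entropy / math.log(len(counts))                    # step 4: normalize to [0, 1]
    return shares, score, score >= threshold                   # step 3: compare to benchmark

# Illustrative applicant data with a skewed gender attribute
genders = ["female"] * 30 + ["male"] * 60 + ["non-binary"] * 10
shares, score, meets_benchmark = diversity_score(genders)
print(shares)            # representation percentages
print(round(score, 2))   # diversity score, ~0.82
print(meets_benchmark)   # False: below the 0.9 benchmark, so action is needed
```

A perfectly balanced dataset would score 1.0; the skew towards one group here pulls the score below the benchmark, flagging the dataset for attention.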

Alternatively, data engineers can conduct representation analysis paired with fairness analysis to assess whether different demographic groups are represented equally both in the data and in the outcomes produced by the algorithm. Initially, a baseline comparison of the data across the chosen demographics can be conducted. Fairness metrics such as demographic parity, equal opportunity, and disparate impact can then be used to evaluate how the algorithm treats different groups. From the results, appropriate adjustments can be made to ensure greater representation.
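
As a sketch of the fairness side, the snippet below computes two of the metrics mentioned, the demographic parity difference and the disparate impact ratio, over hypothetical binary outcomes. The group labels, the data, and the 0.8 "four-fifths" benchmark in the comment are illustrative assumptions, not prescribed values.

```python
from collections import defaultdict

def selection_rates(outcomes):
    """Positive-outcome rate per demographic group, from (group, outcome) pairs."""
    totals, positives = defaultdict(int), defaultdict(int)
    for group, positive in outcomes:
        totals[group] += 1
        positives[group] += int(positive)
    return {g: positives[g] / totals[g] for g in totals}

def fairness_report(outcomes):
    rates = selection_rates(outcomes)
    hi, lo = max(rates.values()), min(rates.values())
    return {
        "selection_rates": rates,
        "demographic_parity_diff": hi - lo,  # 0 means parity between groups
        "disparate_impact_ratio": lo / hi,   # >= 0.8 is a commonly used benchmark
    }

# Illustrative outcomes: group A selected at 50%, group B at only 25%
outcomes = [("A", 1)] * 50 + [("A", 0)] * 50 + [("B", 1)] * 25 + [("B", 0)] * 75
report = fairness_report(outcomes)
print(report["disparate_impact_ratio"])  # 0.5: group B is treated unequally
```

A ratio well below 1.0, as here, would flag the algorithm's treatment of the less-selected group for the adjustment step described above.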

Snap Analytics has progressed from a start-up to a scale-up. While diversity is a priority, formal measurement of diversity has only recently been implemented. By leveraging HR platforms and applicant tracking systems, valuable insights are being gathered. Snap’s approach includes 2 of the 4 key steps: (1) defining diversity metrics and (3) setting thresholds or benchmarks. Gender has been identified as the key diversity dimension, with the organization striving towards a 50/50 gender balance. However, as the company grows, they plan to expand the range of diversity metrics. Currently, diversity is measured through the following methods:

  • Diversity of candidates applying for roles at Snap.
  • Diversity within the organisation, across the different levels.
  • Job satisfaction.
  • Employee retention.
  • Employee engagement.
  • Exit interviews, conducted when someone leaves, with a follow-up survey focusing on inclusivity, culture and diversity.

Businesses must prioritize diverse and representative datasets to mitigate inherent biases and provide users with the best experience possible. Additional ways to mitigate digital exclusion include implementing rigorous testing and validation procedures to help identify and rectify any biases present in AI algorithms. Training team members in ethical awareness is also crucial, ensuring responsible deployment of AI technologies. Furthermore, ongoing monitoring and adjustment of AI systems are essential to address emerging biases and uphold ethical standards.

Policy makers have recently presented the EU AI Act, which outlines regulations to ensure ethical AI usage, protect consumer privacy, and promote transparency. However, the gap between the well connected and the poorly connected will not close if we leave it to government legislation alone. Socially responsible enterprises must develop and demonstrate plans to reach marginalized communities, using algorithms and datasets that avoid favouring majority groups. Data engineers can take the initiative by employing diversity metrics or representation analysis paired with fairness analysis to identify unequal outcomes across different groups.


Sources

(1) Tudor, S. (2024, January 30). Digital exclusion in the UK: Communications and Digital Committee report. House of Lords Library, UK Parliament.

GOV.UK. (2021, May 18). Quantifying the UK Data Skills Gap – Full report.
