AI Bias: What is it and how can it be avoided?

The risk of bias in AI systems is widely recognised. Systems trained on biased data can reproduce that bias in their algorithms, which creates unfair outcomes when those systems are used to make decisions.

Most approaches to testing for bias rely on access to demographic data about individual users. For example, to test whether a recruitment system unfairly disadvantages applicants in certain age groups, you would need to know applicants’ ages in order to check whether age correlates with unfavourable outcomes.
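
As a minimal sketch of what that kind of test might look like in practice, the snippet below compares shortlisting rates across age bands and flags any band whose rate falls below four-fifths of the best-performing band. The data and the four-fifths threshold are illustrative assumptions, not figures from the CDEI report.

```python
from collections import defaultdict

# Hypothetical (age_band, shortlisted) records for applicants
applicants = [
    ("18-34", True), ("18-34", True), ("18-34", False),
    ("35-54", True), ("35-54", False), ("35-54", False),
    ("55+",   True), ("55+",   False), ("55+",   False),
]

totals, shortlisted_counts = defaultdict(int), defaultdict(int)
for band, shortlisted in applicants:
    totals[band] += 1
    shortlisted_counts[band] += shortlisted  # bool counts as 0/1

rates = {band: shortlisted_counts[band] / totals[band] for band in totals}
best = max(rates.values())
for band, rate in rates.items():
    # Flag bands below 80% of the best rate (the "four-fifths" rule of thumb)
    flag = "possible disparate impact" if rate < 0.8 * best else "ok"
    print(f"{band}: shortlisting rate {rate:.0%} ({flag})")
```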

Accessing demographic data can be a challenge for developers. A recent report from the Centre for Data Ethics and Innovation (CDEI) highlights some innovative approaches, and the potential for a new ecosystem of solution providers to tackle that challenge in the UK. These include:

Data intermediaries

Data intermediaries are organisations that facilitate sharing demographic data. The CDEI report focuses on intermediaries that act as stewards of demographic data. These intermediaries could facilitate bias audits in a couple of ways:

  • they could provide a developer with access to demographic data about individuals in a controlled way, so it is only used for the bias audit; and
  • they could store users’ demographic data and conduct the bias audit themselves, so that the AI developer only receives the results of the audit and never has access to the underlying demographic data. This provides an additional layer of privacy protection for individuals (a sketch of this pattern follows this list).
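
The following is a minimal sketch of that second pattern, assuming a simple in-memory store and invented names throughout: the developer submits only (user ID, outcome) pairs, while the intermediary holds the demographic data, joins the two, and returns aggregate results only.

```python
from collections import defaultdict

class DemographicIntermediary:
    """Hypothetical steward: holds demographic data, returns only audit results."""

    def __init__(self, demographics):
        self._demographics = demographics  # user_id -> age band; never shared

    def audit(self, outcomes):
        """Take (user_id, favourable_outcome) pairs from the developer and
        return the rate of favourable outcomes per demographic group."""
        totals, favourable = defaultdict(int), defaultdict(int)
        for user_id, favourable_outcome in outcomes:
            group = self._demographics.get(user_id)
            if group is None:
                continue  # no demographic record held for this user
            totals[group] += 1
            favourable[group] += favourable_outcome  # bool counts as 0/1
        return {group: favourable[group] / totals[group] for group in totals}

intermediary = DemographicIntermediary({"u1": "18-34", "u2": "55+", "u3": "55+"})
print(intermediary.audit([("u1", True), ("u2", False), ("u3", True)]))
# {'18-34': 1.0, '55+': 0.5} -- the developer sees rates, never individual ages
```

The key design point is that the raw demographic data never crosses the boundary: the developer’s only interface is the audit result.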

The CDEI report identifies some real-world examples of intermediaries along these lines, such as the ONS Secure Research Service and the US National Institute of Standards and Technology Face Recognition Vendor Test programme.

However, the report also notes a lack of organisations offering these kinds of data intermediary services. A market for these services has not yet developed in the UK, although the UK government has committed to supporting the development of an intermediary ecosystem.

Proxies for demographic data

When AI developers cannot access demographic data, one solution is to use proxies instead. In short, this means using existing data to infer demographic data. To give a basic example, a developer might use forename as a proxy for gender.
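
A minimal sketch of that forename example, assuming a tiny hand-built lookup table (real proxy methods rely on much larger reference datasets and typically return a confidence score rather than a single label):

```python
# Hypothetical reference table mapping forenames to an inferred gender
FORENAME_GENDER = {"alice": "F", "claire": "F", "james": "M", "omar": "M"}

def infer_gender(forename):
    """Return an inferred gender, or None if the name is unknown or ambiguous."""
    return FORENAME_GENDER.get(forename.strip().lower())

for name in ["Alice", "James", "Sam"]:
    print(name, "->", infer_gender(name))  # Sam -> None: no inference possible
```

The `None` case matters: names the table cannot resolve mean the proxy has incomplete coverage, which can itself skew a bias audit.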

Inferring data raises its own ethical, legal and practical challenges. Accuracy of inferences is a particular concern. The CDEI report explores these challenges in detail but suggests that in certain circumstances, proxy data might be a viable solution for bias detection so long as robust safeguards and risk mitigations are in place. The report suggests that developers using proxy data should:

  • establish a strong use case for the use of proxies as opposed to direct demographic data;
  • select an appropriate proxy method, considering guidance from the Information Commissioner’s Office, the risk of model drift, and the feasibility of testing the accuracy of the method (a basic accuracy check is sketched after this list); and
  • implement robust safeguards and mitigations, for example measures to inform individuals about how their data will be used and privacy-enhancing techniques.
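
As one illustration of the accuracy point, a proxy method can be validated against a small sample of self-reported demographic data before it is relied on for an audit. The sketch below reuses the hypothetical forename proxy from earlier together with a made-up labelled sample; both are assumptions for illustration only.

```python
FORENAME_GENDER = {"alice": "F", "claire": "F", "james": "M", "omar": "M"}

def infer_gender(forename):
    return FORENAME_GENDER.get(forename.strip().lower())

# Hypothetical sample where individuals self-reported their gender
labelled_sample = [("Alice", "F"), ("James", "M"), ("Sam", "M")]

predictions = [(infer_gender(name), truth) for name, truth in labelled_sample]
covered = [(pred, truth) for pred, truth in predictions if pred is not None]
accuracy = sum(pred == truth for pred, truth in covered) / len(covered)

print(f"coverage: {len(covered)}/{len(labelled_sample)} names received any inference")
print(f"accuracy on covered names: {accuracy:.0%}")
```

Reporting coverage alongside accuracy matters because a proxy that only works for some names can introduce its own sampling bias.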

The CDEI report notes that many of the most popular proxy methods and tools have been developed in the US. UK developers considering those tools will need to take account of the UK’s different approach to data protection law when evaluating them.

Future opportunities

As AI systems proliferate, the need to test for and mitigate their biases will only grow. Access to demographic data is a significant obstacle to doing so, but one that could be overcome if the UK creates the right regulatory and commercial environment. The CDEI report suggests that the government recognises the opportunity to create an ecosystem of data intermediaries and similar services to meet this growing need.

Until that ecosystem is more developed, collecting demographic data directly from individual users will often remain the best option. Data protection law does not necessarily prevent that; with the right compliance approach, it acts more like safety rails than a barrier to innovation.
