Mitigate Risk of Bias And Hallucination with Data Diversity

Data diversity is one way to mitigate the risk of your AI models capturing internal bias and “hallucinating,” or, plainly speaking, making mistakes. Gartner coined the term “wide data” a few years ago to differentiate it from “big data,” but I find the term “diverse data” easier to understand. For example, if an HR department wants to build a profile for a specific role in its organization, using only internal data would capture only the characteristics of past employees in that role. To get a more representative picture of who they might hire, the HR team would want to incorporate some external data to reduce any existing bias. ADP Payroll and Demographic Data and Workforce Data Analytics from Revelio are two potential sources of diverse data.
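To make the HR example a bit more concrete, here is a minimal Python sketch of one way to check how far internal data drifts from an external benchmark. Everything in it is an assumption for illustration: the function names, the "background" categories, and the benchmark shares are made up and do not come from ADP, Revelio, or any real dataset.

```python
from collections import Counter


def category_shares(values):
    """Return each category's share of the total as a fraction between 0 and 1."""
    counts = Counter(values)
    total = sum(counts.values())
    return {category: count / total for category, count in counts.items()}


def representation_gaps(internal_values, benchmark_shares):
    """Internal share minus benchmark share, for every category in either source."""
    internal_shares = category_shares(internal_values)
    categories = set(internal_shares) | set(benchmark_shares)
    return {
        category: internal_shares.get(category, 0.0) - benchmark_shares.get(category, 0.0)
        for category in sorted(categories)
    }


# Hypothetical internal data: backgrounds of ten past employees in the role.
internal_backgrounds = ["engineering"] * 8 + ["business"] * 2

# Hypothetical external benchmark: share of each background in the wider market.
external_benchmark = {"engineering": 0.5, "business": 0.3, "design": 0.2}

gaps = representation_gaps(internal_backgrounds, external_benchmark)
for category, gap in gaps.items():
    # A positive gap means the category is over-represented internally.
    print(f"{category}: {gap:+.2f}")
```

A category with a large positive gap is over-represented in the internal data, and one that appears only in the benchmark (here, "design") is a group the internal data cannot describe at all. That second case is where external data adds the most.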

Here are five steps to data diversity:

  1. Break down internal silos to access cross-functional sources
  2. Transform unstructured data to expand available internal data
  3. Collaborate with partners to access different data sources
  4. Acquire third-party external data sources
  5. Explore the applicability of creating synthetic data

I’d love to hear your thoughts. More to follow...
