An interesting article on the difficulties of de-biasing language, from a machine learning viewpoint. The author notes that simple approaches can hide bias in automated systems without removing it: for example, an algorithm trained on a biased dataset may learn that "programmer" clusters with words found more often on men's resumes, words that may be irrelevant to job qualification. At the same time, the effort is worth making; even if a completely unbiased algorithm isn't possible with current methods in a society with baked-in prejudices, a less-biased one will get better results if the goal is (say) to hire qualified programmers, or to make loan decisions based on ability to repay rather than on race or gender.
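To make that mechanism concrete, here is a minimal sketch (entirely synthetic data; all variable names are hypothetical, not from the article) of how a model can stay biased even after the protected attribute is removed: a proxy feature that happens to correlate with gender in the historical data carries the bias through.

    # A toy illustration of hidden bias via a proxy feature.
    # Assumptions: synthetic data, invented feature names.
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    n = 5000

    # Synthetic "resumes": gender is NOT a model input, but a hobby word
    # (the proxy) is strongly correlated with it in the historical data.
    gender = rng.integers(0, 2, n)                 # 0 = woman, 1 = man (hidden)
    skill = rng.normal(size=n)                     # genuinely job-relevant
    proxy = (rng.random(n) < 0.15 + 0.7 * gender)  # e.g., a gendered hobby word

    # Historical hiring labels: partly skill, partly past bias toward men.
    hired = (skill + 1.5 * gender + rng.normal(scale=0.5, size=n)) > 1.0

    # Train only on the "de-biased" features: the gender column is removed.
    X = np.column_stack([skill, proxy])
    model = LogisticRegression().fit(X, hired)

    # The model still scores men higher on average, via the proxy feature.
    scores = model.predict_proba(X)[:, 1]
    print("mean score, men:  ", scores[gender == 1].mean())
    print("mean score, women:", scores[gender == 0].mean())
    print("proxy coefficient:", model.coef_[0][1])

Running this, the proxy gets a positive coefficient and men receive higher average scores, even though gender was never given to the model. Dropping the proxy is no cure either: with enough features, some other correlate usually takes its place, which is the article's point about bias being hidden rather than removed.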
The problem we’re facing in natural language processing (as in any application of machine learning) is that fairness is aspirational and forward looking; data can only be historical, and therefore necessarily reflects the biases and prejudices of the past. Learning how to de-bias our applications is progress, but the only real solution is to become better people.
(via Richard Mateosian, on Copyediting-L.)