Latest Tech News

Stay updated with the latest in technology, AI, cybersecurity, and more

A major AI training data set contains millions of examples of personal data

Indeed, the curators of DataComp CommonPool knew that personally identifiable information (PII) would likely appear in the data set, and they took some measures to preserve privacy, including automatically detecting and blurring faces. But in the limited portion of the data set they examined, Hong’s team found and validated over 800 faces that the algorithm had missed, and they estimated that the algorithm had missed 102 million faces across the entire data set. On the other hand, they did not apply filters that could have recognize