Posts

Showing posts from August 2024.

Coordinate masking of multiple connected signals – Mind the centroid!

A while ago I came across a paper by Gao et al. (2019), who try to anonymize the coordinates of geo-located tweets by Twitter users (now X users). The data in question links the geo-coordinates through a pseudonymized user ID, meaning an analyst can quickly find user-specific clusters, which typically identify the user's home and/or work location.

Figure: Toy example of a user's geo-located tweets. The clusters clearly identify the home location (blue) and a second location (red).

The problem is notoriously difficult (see, for instance, Zang & Bolot, 2011, or de Montjoye et al., 2013). Gao et al. manage only a slight anonymization, and that at the price of completely destroying the clusters in the data, arguably squandering its analytical utility. This failure is instructive.

The approach of Gao et al. – and why it doesn't work

The authors in Gao et al. (2019) apply an independent random perturbation per point, of the type I discussed in an earlier post. [Note: They also consider alter...
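To see why independent per-point perturbation protects a cluster so poorly, and why the title says to mind the centroid, here is a minimal sketch. It is not the mechanism of Gao et al.: the Gaussian noise, the planar coordinates in metres, the noise level `sigma`, and all function names are assumptions made purely for illustration.

```python
import math
import random


def perturb_independently(points, sigma, rng):
    """Add independent Gaussian noise (std. dev. sigma) to every coordinate."""
    return [(x + rng.gauss(0.0, sigma), y + rng.gauss(0.0, sigma)) for x, y in points]


def centroid(points):
    """Arithmetic mean of a list of (x, y) points."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


def distance(p, q):
    """Euclidean distance between two planar points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


rng = random.Random(42)

# Hypothetical home location and 200 hypothetical tweets scattered around it
# (assumed planar coordinates in metres, spread of about 30 m).
home = (0.0, 0.0)
tweets = [(home[0] + rng.gauss(0.0, 30.0), home[1] + rng.gauss(0.0, 30.0))
          for _ in range(200)]

# Assumed noise level: every point is moved by several hundred metres.
sigma = 500.0
masked = perturb_independently(tweets, sigma, rng)

mean_point_shift = sum(distance(p, q) for p, q in zip(tweets, masked)) / len(tweets)
centroid_shift = distance(centroid(tweets), centroid(masked))

print(f"mean displacement per point:  {mean_point_shift:.0f} m")
print(f"displacement of the centroid: {centroid_shift:.0f} m")
```

Each individual point moves by hundreds of metres, but the independent errors average out: for n points, the noise on the centroid has a standard deviation of only sigma / sqrt(n) per coordinate. An analyst who knows that the points belong to one user can therefore still recover the cluster centre far more precisely than any single masked point suggests.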