72% of anonymous browsing histories can be linked to real people

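Researchers at Stanford and Princeton found that users today are more likely to click links shared on social networks by their friends or by friends of friends.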

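Building on this observation, the researchers found it easy to match an anonymous browsing history against publicly available social network information and so identify the user's real identity.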

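In the paper "De-anonymizing Web Browsing Data with Social Networks", the researchers report that their algorithm achieved a success rate of about 70% when tested on 374 sets of anonymous browsing histories.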

Paper: http://randomwalker.info/publications/browsing-history-deanonymization.pdf

 

Can online trackers and network adversaries de-anonymize web browsing data readily available to them? We show, theoretically, via simulation, and through experiments on real user data, that de-identified web browsing histories can be linked to social media profiles using only publicly available data. Our approach is based on a simple observation: each person has a distinctive social network, and thus the set of links appearing in one’s feed is unique. Assuming users visit links in their feed with higher probability than a random user, browsing histories contain tell-tale marks of identity. We formalize this intuition by specifying a model of web browsing behavior and then deriving the maximum likelihood estimate of a user’s social profile. We evaluate this strategy on simulated browsing histories, and show that given a history with 30 links originating from Twitter, we can deduce the corresponding Twitter profile more than 50% of the time. To gauge the real-world effectiveness of this approach, we recruited nearly 400 people to donate their web browsing histories, and we were able to correctly identify more than 70% of them. We further show that several online trackers are embedded on sufficiently many websites to carry out this attack with high accuracy. Our theoretical contribution applies to any type of transactional data and is robust to noisy observations, generalizing a wide range of previous de-anonymization attacks. Finally, since our attack attempts to find the correct Twitter profile out of over 300 million candidates, it is, to our knowledge, the largest-scale demonstrated de-anonymization to date.
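
The intuition in the abstract (a history is more likely to contain links from its owner's feed) can be illustrated with a toy maximum likelihood ranking. This is a minimal sketch under a simplified model assumed here for illustration, not the estimator derived in the paper: each visit is drawn independently, and a link in the candidate's feed is `boost` times as likely to be visited as any other link. The names `log_likelihood`, `most_likely_profile`, `boost`, and `universe_size`, and the sample data, are all hypothetical.

```python
import math


def log_likelihood(history, feed, universe_size, boost=10.0):
    """Log-likelihood of a history under the toy model (an assumption,
    not the paper's model): every visit is an independent draw, and links
    in `feed` carry weight `boost` while all other links carry weight 1."""
    visits = len(history)
    hits = sum(1 for link in history if link in feed)
    # Total weight over all links, used to normalize each draw.
    normalizer = boost * len(feed) + (universe_size - len(feed))
    return hits * math.log(boost) - visits * math.log(normalizer)


def most_likely_profile(history, feeds, universe_size, boost=10.0):
    """Return the candidate whose feed makes the history most likely."""
    return max(
        feeds,
        key=lambda name: log_likelihood(history, feeds[name], universe_size, boost),
    )


# Hypothetical data: three candidate profiles and one anonymous history.
feeds = {
    "alice": {"a", "b", "c"},
    "bob": {"c", "d", "e"},
    "carol": {"f", "g", "h", "i"},
}
history = ["a", "b", "x", "c"]
best = most_likely_profile(history, feeds, universe_size=1000)
```

With this sample data the history shares three links with one candidate's feed and at most one with the others, so that candidate ranks first. The normalizer also penalizes very large feeds, which would otherwise match many histories by chance.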
