Web Browsers' 'Visited' Feature Creates Privacy Concerns

from the just-visiting dept

Ben Adida points to an interesting hack that takes advantage of a bug/feature (depending on your perspective) of modern browsers. When a webpage is rendered, the browser will typically display links that have been previously visited in a different color. Under the hood, this is implemented by applying the CSS “:visited” style to those links. A website can use JavaScript to read that style and report it back to the server, and could even do something sneaky like adding “hidden” links not actually visible to users just to find out whether you had visited certain sites. This behavior was noticed by the Mozilla community way back in 2002, but because of the way the spec was written, there wasn’t any easy solution.

Now somebody has figured out at least one useful purpose for this particular data leak: reducing the number of links some websites provide to social networking sites. As Digg, Reddit, and dozens of social news competitors have proliferated, blogs and news sites have increasingly faced the challenge of supporting ways to submit stories to those sites without unnecessarily cluttering up their pages. But this guy has developed some JavaScript code that uses the “visited” data leak to determine which social networking sites the user has visited and displays badges only for those sites. It’s a clever hack, albeit one that will make privacy sticklers’ skin crawl. Browser vendors ought to fix the underlying privacy issue, which will break this little hack in the process, but in the meantime it doesn’t hurt to put it to a useful purpose.
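To make the badge idea concrete, here is a minimal sketch of the selection step only. It is not the script the post links to: the site list, the URLs, and the names SOCIAL_SITES and badgesToShow are all hypothetical, and the isVisited check is passed in by the caller rather than implemented here.

```javascript
// Hypothetical site list; the URLs are placeholders chosen for illustration.
const SOCIAL_SITES = [
  { name: "Digg", urls: ["http://digg.com/", "http://www.digg.com/"] },
  { name: "Reddit", urls: ["http://reddit.com/", "http://www.reddit.com/"] },
];

// isVisited is supplied by the caller (assumption: it returns true when the
// browser treats that exact URL as visited). A site earns a badge if any of
// its candidate URLs is reported as visited.
function badgesToShow(sites, isVisited) {
  return sites
    .filter((site) => site.urls.some((url) => isVisited(url)))
    .map((site) => site.name);
}
```

A page would call badgesToShow(SOCIAL_SITES, someVisitedCheck) and render badges only for the names it returns. Listing several URLs per site is a design choice, since the check only works on exact URLs.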


Comments on “Web Browsers' 'Visited' Feature Creates Privacy Concerns”

27 Comments
Yakko Warner says:

Re: Re: New Feature

Translation:

If this “privacy exploit” is used as a “feature”, and that “feature” becomes popular, then people will require that the “exploit” remains in order to maintain support for the “feature”.

In other words, if Aza’s script gains enough popularity, but then browser authors fix the leak it depends on, it could cause a backlash against fixing the leak.

More clear?

Scotty the Menace (user link) says:

turn off browser "History"?

Forgive me for being melodramatic, but it is just creepy the way so many of these social networking and ecommerce sites want to know my private business.

If I turn off my browser History (or set it to “0”) does that prevent this hack? If I turn off my browser history, does that prevent a URL from being flagged as “visited”?

ryan says:

Re: turn off browser

“If I turn off my browser History (or set it to “0”) does that prevent this hack? If I turn off my browser history, does that prevent a URL from being flagged as “visited”?”

I’d also be interested in an answer to this. Anyone know yet? There is something about this whole story that seems a bit over dramatic to me anyway.

Kozlo (user link) says:

Re: Re: turn off browser

Setting the Browser History to 0 will only cause the browser to clear the cache at the end of the session. The browser will still hold your history until every browser window is closed.

Seems the JavaScript uses the differences in a site-specified style sheet. So this means the site is telling your browser to display a hyperlink differently depending on whether your browser recognizes it as previously visited or not. It is then able to collect the variations in the links. This can be stopped by either disabling JavaScript in your browser or customizing the default fonts, as Woadan writes above.

Note: This is not a browser-specific concern; it affects Firefox as well as IE.

some old guy says:

Fascinating

I am totally at odds with myself over this. I love the way that a site can tailor itself better to the user. But I don’t like the implications of what a somewhat less respectable site could do.

Imagine a dumb site loading up a monster script that checks some 5,000 pages to build a report of metrics on its competitors… or an advertiser that wants to see just how effective its ads really are, and starts not only serving an ad for a product but also checking to see if you did any additional research on that product (it could check if you did a Google search for the product name, it could check to see if you visited the product on a slew of e-commerce sites… the limits are rather not limiting…)

jonnyq says:

Re: Fascinating

“…checking to see if you did any additional research on that product (it could check if you did a google search for the product name,…”

A link is only “visited” if you’ve visited that UNIQUE URL recently. Some search engines create fairly unique URLs for search result pages (sometimes they contain browser information, user information, etc.). So that web site would have to know the EXACT URL (in addition to the EXACT keywords) a user would have visited when searching Google for a product.

I almost NEVER visit http://www.techdirt.com/. I click links in my RSS feed that go directly to articles. Someone wouldn’t be able to just know that I read Techdirt – they’d have to detect specific articles.
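The exact-URL point can be illustrated with a toy model. This sketch assumes, purely for illustration, that history behaves like a set of exact URL strings; real browsers keep richer records, and both URLs stored below are hypothetical.

```javascript
// Toy model: history as a set of exact URL strings (an assumption for illustration).
const browsingHistory = new Set([
  "http://www.google.com/search?q=acme+widget&hl=en",    // hypothetical search URL
  "http://www.techdirt.com/articles/example-post.shtml", // hypothetical article URL
]);

// A URL counts as visited only if that exact string is in the set.
function wasVisited(url) {
  return browsingHistory.has(url);
}

console.log(wasVisited("http://www.google.com/search?q=acme+widget&hl=en")); // true
console.log(wasVisited("http://www.google.com/search?q=acme+widget"));       // false
console.log(wasVisited("http://www.techdirt.com/"));                         // false
```

Under this model, a site probing for the home page or for a search URL with slightly different parameters learns nothing about the reader who only ever opened the two stored URLs.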

But yeah, I love Techdirt, but this Timothy took some old news and really dumbed it down for this post. Yeah, the new hack some guy is using is sort of interesting, but it’s hardly novel.

It’s not really something browser vendors should even attempt to solve unless they’re going to drop support for the :visited CSS pseudoclass altogether – and that would be dumb, too.

In the long list of possible privacy issues, this should be near the bottom of the “who cares” pile.

Woadan says:

This is probably due to the way links are handled in HTML. You specify styles for a link (never been clicked on), an active link (in the middle of being clicked), and a visited link (already in the browser’s history/cache).

Most sites/pages specify this activity in the CSS style sheet (preferred method), or in the HTML code itself (deprecated, meaning browsers still support this method now, but will probably stop supporting it in the future).

Depending on the browser, you may be able to specify your own link action/color, and this may fool anyone looking at this.

You can have Internet Explorer use your own CSS style sheet if you like (from OS help files):

If you want to have the fonts and colors you specify in Internet Explorer to be used for all websites, regardless of the fonts that have been set by the website designer, follow these steps:

1. Click to open Internet Explorer.
2. Click the Tools button, and then click Internet Options.
3. Click the General tab, and then click Accessibility.
4. Select the Ignore colors specified on webpages, Ignore font styles specified on webpages, and Ignore font sizes specified on webpages check boxes, and then click OK twice.

The only thing the steps don’t cover is creating your style sheet. You’ll have to look it up on the web.

The only other caution I would add is to be sure you make the link, active link, and visited link colors all the same. Making them different from the main text color is also advisable.

I could find no similar functionality in Firefox, though it is possible there is an add-on available that does it.

In Opera, go to Tools, Preferences, and click on the Web Pages tab. Change the Normal Links and Visited Links colors so they are the same, and check or uncheck BOTH of the underline checkboxes.

I couldn’t see anything to configure in Safari. (Too bad Apple fanboys!)

I don’t know if this will actually work, never having implemented it. And it may be a function of your history, not the CSS style sheet or links configuration in the browser, or a combination of both.

Woadan

jonnyq says:

Re: Re:

just setting fonts and colors wouldn’t do it… all I have to do is

a:visited { position: relative; z-index: 5; }

And then use JavaScript to read the zIndex property of a link to see if it’s been visited. All I have to do is find ONE CSS property that you’re not overriding in a custom stylesheet and check for that.
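A rough sketch of that check follows. It assumes the page’s stylesheet contains the a:visited rule above and that the browser reports :visited styles to scripts, which a given browser may not do. The names detectVisited and VISITED_Z_INDEX are hypothetical, and the document object and style reader are passed in so the logic does not depend on any particular environment.

```javascript
// Assumes the page carries: a:visited { position: relative; z-index: 5; }
// Computed style values are strings, so the marker is compared as "5".
const VISITED_Z_INDEX = "5";

// doc:      a document-like object (in a browser, window.document)
// getStyle: returns the computed style for an element
//           (in a browser, (el) => window.getComputedStyle(el))
function detectVisited(urls, doc, getStyle) {
  const visited = [];
  for (const url of urls) {
    const link = doc.createElement("a");
    link.href = url;
    doc.body.appendChild(link); // must be in the document for the rule to apply
    if (getStyle(link).zIndex === VISITED_Z_INDEX) {
      visited.push(url);
    }
    doc.body.removeChild(link); // clean up the probe link
  }
  return visited;
}
```

In a browser this might be called as detectVisited(["http://digg.com/"], document, (el) => window.getComputedStyle(el)). Any property set only by the :visited rule would serve as the marker; zIndex is just the one used in the comment.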

The only real “fix” would be to use a Greasemonkey script in either Firefox or Opera that would scan a page’s stylesheet and remove any :visited rules from any stylesheet on the page.

Again, I still think this is useless since it only exposes very specific URLs (do you always go to a site’s home page?) and there are much more important privacy issues in the world we could be discussing.

jonnyq says:

Re: Re: Another thing...

And then again, even if you manage to override/remove every :visited CSS rule, I think there’s a Javascript function supported by some browsers where you can pass it a CSS selector (such as “a:visited”) and tell if an element matches that selector.

Or hell… my Javascript could just go ahead and add new CSS rules – regardless of how much CSS-rule-removing your code did – and then check for that.

It’s a game that a web site author could win if he wanted unless browsers allow you to completely disable :visited support.

Michael Sherrin (profile) says:

Simple solution

I think this privacy issue is a non-starter. If you’re talking about the same JavaScript I had looked at, there’s not an easy way for me to look at a list of the sites you’ve visited. I can, however, provide a list and only show ones you’ve seen. If you then click one of the links, I know you clicked that link, but not the others that showed up. The browser’s history is only exposed to the browser. An easy solution is to clear your history. Sure the JavaScript might be upgraded to collect more statistics, but exposing your browser history is still near impossible. Just pray no site provides a list of links to porn sites to see which ones you visited.

Yakko Warner says:

Re: Simple solution

That’s true, this code does not show how to get an arbitrary list of viewed sites. It does require the coder to provide a list of sites (from the comments on Aza’s page, that list has to be very specific, down to the exact page/URL), and each URL can be queried in turn. I don’t think it’s a non-issue, considering how targeted the query has to be; but I do think the issue exists.

As far as the list to porn sites, I don’t think that’ll be a problem. On my home network, I run my own DNS. I have a list of known ad and spyware domains that I’ve added to the DNS to resolve to 0.0.0.0, the upshot being that all requests to any of those domains from any machine inside my home network fail to get routed. I once tried to do the same with a list of known porn domains. I created the file, added it to the DNS config, and restarted the server — which choked and barfed on the sheer volume of information being thrown at it.

In other words, I think any attempt to go through a list of porn sites to see if you visited certain URLs on them would probably fail miserably. 😉

itchyfish says:

Who cares?

There are a couple of people with a “who cares?” attitude. Just this bug/feature alone might warrant that response, but my question is what else could this lead to that I would care about? If someone knows I visited gmail, could they then target my cache and pull out the link to my inbox? I know a lot of webmail sites where you can get right back into your inbox even though you’ve “logged out”. What about other things, e.g. private company forums, or a link back to Amazon with your CC info? This one specific example doesn’t do that, but don’t think for a minute that someone out there isn’t a lot smarter than you give them credit for, and can’t use this as a springboard for a real attack against you.

jonnyq says:

Re: Who cares?

“If someone knows I visited gmail, could they then target my cache and pull out the link to my inbox?”

That’s not something even remotely possible with this hack. They would have to know the link to your inbox before being able to tell that you visited it.

“I know a lot of webmail sites where you can get right back into your inbox even though you’ve “logged out”.”

Most services put your session id in a cookie (the most common method). Some services also allow the session id in the URL (allowing you in after logging “out”), but any service worth its salt would lock that session id down to a single IP address.
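As a sketch of what locking a session id to one IP address could look like on the server side: the in-memory store, the function name isSessionValid, and the sample ids and addresses are all hypothetical, and a real service would likely do more than this.

```javascript
// Hypothetical in-memory store: session id -> IP address it was issued to.
const sessions = new Map([["abc123", "203.0.113.7"]]);

// A request is accepted only if the session exists and comes from the
// same IP address the session was bound to when it was created.
function isSessionValid(store, sessionId, requestIp) {
  const boundIp = store.get(sessionId);
  return boundIp !== undefined && boundIp === requestIp;
}

console.log(isSessionValid(sessions, "abc123", "203.0.113.7"));  // true
console.log(isSessionValid(sessions, "abc123", "198.51.100.9")); // false
console.log(isSessionValid(sessions, "nope", "203.0.113.7"));    // false
```

With a check like this, someone who learned a URL containing the session id would still be refused when connecting from a different address.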

“What about other things e.g. private company forums, a link back to Amazon with your CC info?”

Again with the same issues – you have to know the specific URL, and even if that URL contained a session id, Amazon’s probably smart enough to lock that to an IP address.

I think it’s been pretty well described exactly what this thing does (even though Timothy used some absolutely horrible verbiage in his summary). If you understand it, it should be clear why this can’t be a “springboard for a real attack”. I’m not being a jerk, but I hope I’m at least being clear.

itchyfish says:

RE: RE: Who cares?

@jonnyq You’re not being a jerk, you were very clear. I appreciate the comment.

However, if you read some of the security papers linked deeper from the article (e.g. https://www.indiana.edu/~phishing/browser-recon/) you’ll see that this bug/feature can most certainly lead to something more dangerous. This attack may not give me the link to the inbox, but confirming that you’ve been to gmail means I can probably write some additional code to go fishing through your history/cache to find the link with the session id, etc. This gives a nice, specific target to look for instead of trying to guess if you’ve been somewhere I’d be interested in. I guess what I’m saying is that this bug/feature enables an attacker to focus his attention much more narrowly on something that is valuable instead of just randomly trying stuff.

Chris says:

This is tiny compared to what is possible

I have a technique that allows me to use a script on a web page to get the entire URL history of the entire browsing session, without having to make stupid guesses as to where the user might have gone.

The :visited technique is a very old one (at least 3 years). The edges (and leaks) of the Javascript sandbox are much better known now. I am not going to give specifics about how the history grab is done because I don’t want it in general use. But believe me, it’s doable in ALL major browsers except Konqueror.

Prometheus says:

Do not accept any gifts from Zeus

Another chapter in the continuing saga of “Javascript and the Unintended Consequences”

Bottom line – allowing execution of arbitrary code from an external source is a very bad idea. There really is no way around this basic concept. I’m guessing that the reason it is so pervasive is because of its exploitability. There are some who think that they can open Pandora’s box, extract the one thing they desire and then close it without causing any trouble at all. Well, it has been demonstrated over and over that this is just not true.

Anonymous Coward says:

Kill "social" buttons

I wish these “social” links disappeared or I could adblock them (they’re just ads anyway). They’re obnoxious. They replicate in the wrong place what should be a browser functionality. On some sites they even pop up on mouseover and hide content. If the content of a webpage doesn’t warrant the effort of copypasting the URL or using a browser extension or bookmarklet, people shouldn’t advertise that page in the first place.
