
TQMC


Sunday, August 17, 2008

Google Proxy Hacking: How A Third Party Can Remove Your Site From Google SERPs



In June of 2006, while working to resolve some indexing issues for a client, I discovered a bug in Google's algorithm that allowed third parties to literally hack a web page out of Google's index and search results. I notified a contact at Google soon after, once I had confirmed that what we thought we were seeing was really happening.
The problem still exists today, so I am making this public in the hope that it will spur some action.
I have sat on this information for more than a year now. A good friend has allowed his reputation to suffer, rather than disclose what we knew. I continue to see web sites that are affected by this issue. After giving Google more than a year to resolve the issue, I have decided that the only way to spur them to action is to publish what I know.
Disclaimer: What you're about to read is as accurate as it can be, given the fact that I do not work at Google, and have no access to inside information. It's also potentially disruptive to the organic results at Google, until they fix the problem. I hope that publishing this information is for the greater good, but I can't control what others do with it, or how Google responds.
I am also not the only person who knows about this hack.

Alan Perkins (who along with many others stayed quiet about the 302 redirect bug for 2 years) knew about it the day after I found it.
Danny Sullivan has known nearly as long, and I suspect that his behind the scenes efforts are the reason why the major search engines all decided to publish "how to validate our spider" instructions after SES San Jose last year.
Bill Atchison knows, because he helped me figure out a defensive strategy for my client's sites… and along with me danced around this issue on the "Bot Obedience" panel at SES last year - trying to warn people without telling them too much.
My (now former) client Brad Fallon knows… and he's been subjected to a lot of unfair criticism that he could have easily answered by making this public. It cost him a lot of money, and "a lot of money" to Brad is a lot more than it is for most of us.
"Someone else" knows, because they were actively exploiting this bug to knock one of Brad's sites off of Google's SERPs. I suspect many other "black hats" know about it by now… because other sites are being affected. I can't believe that they're all accidents.
This is going to be a long story, I'm afraid… but bear with me, because you need to understand this, and how to defend yourself.
The story begins over a year ago…
My friend Brad Fallon had been having some trouble with Google and one of his web sites, My Wedding Favors. In June of 2006, after exhausting all of his other options, Brad (who knows his way around SEO) hired me to direct his search marketing efforts and, in simple terms, "figure out what the hell is going on with Google."
The first thing I discovered was that he wasn't "banned," but that Google was indexing everything on his site except for the home page. It took about two weeks of research and testing before I developed a working theory. When we searched Google for phrases that should have been completely unique to the My Wedding Favors home page, we kept finding a particular kind of duplicate content: proxies.
For those who don't know what a proxy is, it's a web server that's set up to deliver the content from other web sites. Among other things, proxies have been set up to allow people to surf the internet "anonymously," since the requests come from the proxy server's IP address and not their own. Some of them are set up to allow people to get to content that is blocked by firewalls and URL blocking on corporate, educational, and other networks.
The diagram above shows what this looks like when a proxy is used innocently by a human being.
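For readers who prefer code to pictures, here is a minimal sketch of the same idea. It is a toy in-memory model, not a real network proxy: the class names `OriginSite` and `Proxy`, the page content, and both IP addresses are hypothetical and exist only for illustration. It shows the one property that matters here: the origin site only ever sees the proxy's address, and the proxy hands back the origin's content unchanged.

```python
from dataclasses import dataclass, field


@dataclass
class OriginSite:
    """Stands in for an ordinary web site; records who asked for each page."""
    pages: dict
    access_log: list = field(default_factory=list)

    def handle(self, requester_ip: str, path: str) -> str:
        # Log the address the request appears to come from, then serve the page.
        self.access_log.append((requester_ip, path))
        return self.pages.get(path, "404 Not Found")


@dataclass
class Proxy:
    """Stands in for a web proxy that has its own IP address (hypothetical)."""
    ip: str

    def handle(self, client_ip: str, origin: OriginSite, path: str) -> str:
        # The proxy makes the request itself, so the origin sees the proxy's
        # IP. The client's IP is never passed along in this model.
        return origin.handle(self.ip, path)


# Hypothetical addresses taken from the ranges reserved for documentation.
origin = OriginSite(pages={"/": "<h1>Home page</h1>"})
proxy = Proxy(ip="203.0.113.10")

# A visitor at 198.51.100.7 asks the proxy for the origin's home page.
body = proxy.handle("198.51.100.7", origin, "/")

print(body)               # the origin's content, relayed unchanged by the proxy
print(origin.access_log)  # only the proxy's address appears in the log
```

In this model the visitor gets the home page exactly as the origin served it, while the origin's log contains only the proxy's address. A real proxy does the same thing over HTTP, which is why the proxied copy of a page reads like a duplicate of the original.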

