Monday, June 29th, 2009

How to protect your content from scrapers

Two weeks ago, I got pingback notifications from a website. Though the domain name sounded like a legitimate web development site, getting pings to multiple articles from it made me suspicious. Visiting the site, I found that it was nothing but a feed scraper. There was no information about the owner of the site, and the about page was just the default WordPress about page. All the posts were taken from other sites with no attribution given. The WebMaster View RSS feed includes a link back to each article within the article content itself, so when the scraper republished my posts, those links triggered the pingbacks.
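To illustrate why a self-link in the feed content exposes scrapers, here is a minimal Python sketch that appends an attribution link to a feed item's HTML before it is published. It is only an illustration under assumptions, not the actual WordPress setup behind this feed: the function name `add_source_link` and the example.com URL are hypothetical.

```python
from html import escape


def add_source_link(content_html, title, permalink):
    """Return the item content with a link back to the original post appended.

    Hypothetical helper: a scraper that republishes the content verbatim
    also republishes this link, which can trigger a pingback to the source.
    """
    footer = '<p>Originally published at <a href="{url}">{title}</a>.</p>'.format(
        url=escape(permalink, quote=True),  # escape & and quotes for the attribute
        title=escape(title),                # escape the visible link text
    )
    return content_html + "\n" + footer


# Example usage with made-up values
item = add_source_link(
    "<p>Hello</p>",
    "A & B",
    "http://example.com/post?a=1&b=2",
)
print(item)
```

The link only helps if the scraper copies the item content unchanged; one that strips links would not trigger any pingback.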

Three days later, I got another wave of pingback notifications for my new posts. This time, I decided to do something about it.