Watching ReadWriteWeb Content Scrapers in Action - Via WordPress Pingbacks
January 26, 2011
Last night the great folks at ReadWriteWeb wrote about one of my blog posts (about Node.js) in their "ReadWriteHack" channel. How did I first find out?
One of the spammy sites that scraped RWW's content sent a pingback to my blog post!
In WordPress, which I use on Code.DanYork.com, a pingback comes in as a comment... and over the next few hours I received more and more. I'm still getting them now. I've visited several of the sites and they are typical content scrapers: they take someone's legitimate, high-quality content and surround it with various ads (and without any links back to the original article), hoping you will find their site first. These are the kind of sites that Google needs to keep working on devaluing (and they are).
You can see here some of the ones I've already killed as spam:
Now, I'm sure these are not all of the sites scraping RWW's content... I can see many other links in a Twitter search that certainly could be spammy sites, too.
I do wonder whether these sites wanted to send a pingback or whether they just installed WordPress and kept the default checkbox checked:
It very well may be that they do want the pingbacks, as they would appear as legitimate comments on my site, which might then drive traffic over to them. They might be trying to game SEO, but WordPress sets all the comment links to rel='external nofollow', which will kill any SEO value.
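If you want to check for yourself that the links in a pingback comment carry that attribute, here is a minimal Python sketch using only the standard library. The sample markup and the scraper.example URL are made up for illustration; this is not WordPress's actual output, just a fragment shaped like a comment link with rel='external nofollow'.

```python
from html.parser import HTMLParser


class RelCollector(HTMLParser):
    """Collect (href, rel tokens) for every <a> tag in a fragment of HTML."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        # rel is a space-separated list of tokens; it may be missing or empty.
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        self.links.append((attr_map.get("href"), rel_tokens))


def nofollow_links(html):
    """Return a list of (href, has_nofollow) pairs for the links in html."""
    parser = RelCollector()
    parser.feed(html)
    parser.close()
    return [(href, "nofollow" in rel) for href, rel in parser.links]


# Hypothetical pingback markup, invented for this example.
sample = (
    "<a href='http://scraper.example/post' rel='external nofollow'>"
    "[...] ReadWriteWeb [...]</a>"
)

for href, flagged in nofollow_links(sample):
    print(href, "nofollow" if flagged else "followed")
```

Any link reported as "nofollow" is one that search engines are being asked not to count as an endorsement, which is why the pingback is unlikely to help the scraper's ranking even if the comment gets approved.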
Regardless... it's been interesting to watch... and, frankly, as a content creator myself, sad to see. But it is just another facet of this world of online content we live in today...
If you found this post interesting or useful, please consider either:
- following me on Mastodon;
- following me on Twitter;
- following me on SoundCloud;
- subscribing to my email newsletter; or
- subscribing to the RSS feed.