  • 9 Ways to Reduce Scraping and Catch Blog Post Theft

    Content scraping, or content theft, is when original content is stolen from a blog and republished on another blog or site.  This is considered plagiarism or copyright infringement, but since the Web is not completely regulated yet, scraping has become an extremely common problem.  Although the process can be done manually, it is normally done by automated software that can easily scrape content from any blog.

    Blog Post Theft

    Some people scrape to intentionally harm others, while others do not realize they are causing harm.  When original content is republished on other blogs or websites, it can cause duplicate content issues, especially if the copy outranks the original in the search engines (see my blog post on duplicate content).

    I have experienced this problem myself in the last few months, and I have let it slide because I know the people who copied my posts meant no harm.  Although I appreciate and encourage the sharing of content, there are netiquette rules that should be followed (I will cover this topic in a different post).

    Unfortunately, we cannot completely prevent this growing problem, but there are steps you can take to reduce it.

    Simple Steps

    1. Put a copyright statement with every blog post.  This can be done manually or automatically (I will share some plug-ins in the next post).
    2. Inspect your blog’s backlinks often.  This can be done manually in the search engines or with a backlink checker.
    3. Set up keywords related to your blog posts in Google Alerts.  This method will alert you to similar posts and may help you monitor scraping.
    4. Use absolute links if you are interlinking any blog posts or pages of your site (e.g., a href="http://www.domain.com/pagename.html" vs. a href="pagename.html"), so that links inside a scraped copy still point back to your site.
    5. Place a link to your blog in the footer of each post.  This can also be done manually or automatically if you have a WordPress blog.
    6. Link to your blog posts within your RSS feeds.
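
    Steps 1, 4, and 5 can be automated.  The following is only a minimal sketch, not any particular plug-in's method: the function names, the blog name, and the post address are made up for illustration, and the simple pattern only handles href attributes written with double quotes.

```python
import re
from urllib.parse import urljoin

# Hypothetical site address, reused from the example in step 4.
BLOG_URL = "http://www.domain.com/"

# Matches href="..." attributes written with double quotes only.
HREF_RE = re.compile(r'href="([^"]*)"')


def absolutize_links(html: str, base_url: str) -> str:
    """Rewrite every href so it carries the full site address."""
    def fix(match):
        return 'href="{}"'.format(urljoin(base_url, match.group(1)))
    return HREF_RE.sub(fix, html)


def add_footer(html: str, post_url: str, blog_name: str, year: int) -> str:
    """Append a copyright line that links back to the original post."""
    footer = (
        '<p>Copyright &copy; {year} {name}. '
        'Originally published at <a href="{url}">{url}</a>.</p>'
    ).format(year=year, name=blog_name, url=post_url)
    return html + "\n" + footer


post = '<p>See <a href="pagename.html">this page</a>.</p>'
protected = add_footer(
    absolutize_links(post, BLOG_URL),
    post_url=BLOG_URL + "my-post.html",  # hypothetical post address
    blog_name="My Blog",                 # hypothetical blog name
    year=2010,
)
print(protected)
```

    If a scraper copies the processed post as-is, both the internal link and the footer link keep pointing at the original site.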

    Advanced Steps

    If you are not a techie, you can ask an experienced webmaster to perform the following steps for you.

    1. Use cloaking to serve automated programs different content from what visitors see on the blog, making it harder to scrape.  This is normally thought of as a black-hat SEO technique, so it should be used sparingly and with caution (only use it as a means to protect your content if you are having a huge theft problem).
    2. Use IP blocking to prevent future content theft.  First find the IP addresses of the sites publishing your content without permission, then block them in your site’s .htaccess file (on an Apache server).
    3. Use captchas to block automated scraping.  Captchas are randomly generated strings of letters and numbers displayed as an image.  Some scrapers have found ways around captchas, however, and a captcha will not stop someone who is copying your content manually.
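
    For the IP blocking step, the rules themselves are short.  The sketch below is one possible way to generate them, assuming an Apache 2.4 server (older Apache 2.2 servers use the Order/Deny directives instead).  The function name is made up, and the addresses are placeholders from the ranges reserved for documentation, not real scrapers.

```python
import ipaddress


def build_htaccess_block(addresses):
    """Return Apache 2.4 access rules that refuse the given IP addresses."""
    lines = ["<RequireAll>", "    Require all granted"]
    for raw in addresses:
        # ip_address() raises ValueError on a mistyped address,
        # so a typo never ends up in the .htaccess file.
        ip = ipaddress.ip_address(raw.strip())
        lines.append("    Require not ip {}".format(ip))
    lines.append("</RequireAll>")
    return "\n".join(lines)


# Placeholder addresses; replace them with the IPs you actually found.
scraper_ips = ["203.0.113.7", "198.51.100.23"]
print(build_htaccess_block(scraper_ips))
```

    The printed block can be pasted into .htaccess.  Keep a backup of the file first, since a mistake in it can make the whole site unreachable.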

    If you use the above methods and find that your blog posts have been scraped, I recommend you leave a comment on the copy with a link to your original post, to warn both the scraper and the blog’s readers.
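
    Before leaving that comment, it can help to confirm how much of a suspect page really comes from your post.  The sketch below is one simple way to do that, not a replacement for a dedicated checker: it measures what share of your post's five-word sequences also appear in the other text.  The function names and the five-word window are assumptions chosen for illustration.

```python
import re


def shingles(text, size=5):
    """Return the set of consecutive word sequences of the given length."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def overlap(original, suspect, size=5):
    """Share of the original's word sequences that also appear in the suspect text."""
    source = shingles(original, size)
    if not source:
        return 0.0  # original is shorter than one window
    return len(source & shingles(suspect, size)) / len(source)


original = "Content scraping is when original content is stolen from a blog"
suspect = "Welcome to my site. " + original + " Thanks for reading."
print(overlap(original, suspect))
```

    A value close to 1 means nearly all of your post appears word for word in the other text; a value close to 0 means the two only share a topic.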

    Look for my next post on WordPress plug-ins that will help you streamline your efforts.

    If you’ve caught content thieves in the past, share your story in the comments below.

    Posted May 11th, 2010 in Blogging, Search Engine Optimization (SEO) | 15 Comments
  • Pingback: Data-Scraping Lawsuit Sheds Light On Risk To Databases | Harddrive Data Recovery - Data Storage

  • http://www.meetusinghal.com Meetu Singhal

    Very helpful and informative article, Mirna!
    Question: do tools like Copyscape help?  What is the role and scope of Creative Commons in plagiarism?

  • http://www.echimarketing.com Eureka Janet

    Hi again, Mirna (Seems we’re connecting more these days~! good for us~!)

    In January I wrote a blog about Scraping…with Integrity. Being a relatively new’ish blogger I struggled with content, like everyone else. (Now I have too much content and NO time~!) Anyway…I do truly believe, like you state, if done properly Scraping can be of benefit to everyone.

    If you don’t mind me pasting my link…the post I refer to is here: http://eurekajanet.wordpress.com/2010/01/29/how-to-scrape-with-integrity/

    Thanks again for being so darn Relevant~! Cheers…

  • http://www.shescookin.com Priscilla

    Thanks for sharing such relevant information. I have a WP (.org) website – how do I add an automatic link back to my website in the footer of each post?

  • http://www.mirnabard.com Mirna Bard

    Meetu, yes, Copyscape does help; however, you should not rely on that alone. If you combine it with some of the techniques I mention, it will be good. I am not sure I understood the other part of your question…please feel free to email me.

    Eureka, nice to “see” you again! Your blog post made me laugh…it was excellent! Thanks for sharing :)

    Priscilla, please see today’s post: 7 WordPress Plug-ins to Help You Control Content Scraping http://bit.ly/dpl1eY.

  • http://JasonWheeler.biz Jason Wheeler

    Never mind… Now I know what you mean by scraping. Do you think it is okay to take a portion of someone’s content and link back to them?

  • http://www.mirnabard.com Mirna Bard

    Jason, there is nothing wrong with taking part of the content, or summarizing it in your own words, and then linking back to the original.

    • http://JasonWheeler.biz Jason Wheeler

      I’ve found that even when you use others’ content, as long as you reference them most people don’t mind. It just gets passed around more. It’s a big problem though if someone takes your content and does not credit you. Have you had any experience with that?

  • http://www.mirnabard.com Mirna Bard

    Jason, I have not experienced that yet (at least not that I know of), but I know many who have. I will be writing a short post on some tips soon!

  • http://onlinecasinoredbook.wordpress.com/2010/01/13/online-gaming-industry/ bestroulettecasinosonline

    Couldn’t be written any better. Reading this post reminds me of my old room mate! He always kept talking about this. I will forward this article to him. Pretty sure he will have a good read. Thanks for sharing!

  • http://usaonlinecasinos2.blogspot.com/2010/05/welcome-to-usa-online-casinos.html Rebecca from onlinecasino

    Valuable information and excellent design you got here! I would like to thank you for sharing your thoughts and time into the stuff you post!! Thumbs up

  • http://www.mirnabard.com Mirna

    Thank you very much for your comment Rebecca, and you are very welcome! I love sharing my knowledge, that’s for sure!

  • http://www.onlinecasinotx.com onlinecasinotx.com

    Abundant equipment as usual…

  • http://www.disastercover.com/2010/06/usa-online-casinos/ onlinecasino

    Great stuff as usual…

Copyright © 2011 MirnaBard.com