Content scraping – the final solution and reality check

Viewing 15 posts - 1 through 15 (of 140 total)
  • Author
  • #594679

    Hi all,

    I have been a member here for the last few years, visiting the forum almost daily, not posting a lot in public but helping many members here by PM with problems, suggestions and tips.

    Now, this is going to be a LONG message. It will probably aggravate many people here, it will for sure make a lot of people rethink their working strategies, and I do expect a lot of people to respond with “now I am pissed” etc., so I suggest you go to the kitchen, get a cup of coffee, and be mentally prepared to read and re-read this and think about your strategy going forward.

    Lately there was an incident where one senior member here complained about their content being scraped by us and re-published. It was not us, and that member was misled into believing that we did it, so we helped that member and other very senior members here to identify who really did it and to clear our name.

    All of this would have stayed behind the curtains (like many incidents in the past where we worked to identify misbehavior by publishers in this industry and wrongdoing such as DDOS attacks and hacks used by competitors who did not play by the rules). BUT in this case, after identifying the guilty person, we also looked at how to combat content scraping in general, and we have come to some very interesting conclusions that I would like to share with you.

    What we see happening now is that many newcomers to this affiliate industry try to game Google, Yahoo and the other SEs to get rankings. They build sites with many pages of content, where a real visitor to these sites is redirected to an ad-driven page that makes this person money.

    So now let’s break this into pieces so we can analyze it better: the site creation and the ads page.

    The site creation: he is creating a site with a lot of content pages. We all know what content creation is and how long it takes to produce, so that person is taking a shortcut: he is re-using content that you worked hard on. Now let’s get a reality check here. If that person has stolen a whole page from you and put up the text, you can send a DMCA complaint and after a while get him, his ISP or the SE to take the content down. This is doable, see on how to identify and do that.
    BUT!!! In reality, even before the DMCA does its work, his page most probably will not rank well simply because of duplicate content issues, so this guy will probably NOT copy your entire page but only parts of it.
    So here is the tricky part. Let’s say he copies a few sentences from your site: is that considered wrongdoing in the eyes of the law? Does it justify a DMCA complaint, and will it force the ISP/SE to take it down? It really depends on the extent and size of the text quoted from your page. The fact that your site name is, for example, PokerRoomsAndNIceChipsToPlayWith and that this one word appears all over that person’s sites does not justify a DMCA complaint. Even if a complete sentence or paragraph was taken, it may still be OK (consult your lawyer, there are some good measures of what is a “quote” and what is “content stealing”).
    Now it gets complicated: if you do not want people to quote you, how come you allow Google and all the other ‘000 of SE spiders to read your site and publish parts of it as snippets?
    Now, and this is the reality check: what if this guy did not even access your site but is actually scraping Google or MSN or another small SE’s results pages? That may be OK, no one says it is not. Maybe he is even scraping results pages that show results from other scraper sites that scraped Google… etc.
    So in my mind, until Google and the other SEs change their algorithms, you can’t do anything about content scraping as long as it is a few sentences (but always consult your lawyer, maybe in a few incidents you can).
    Hell, even if you have a trademark on a specific name, let’s take “coke” for example, if I am a webmaster and I have used this word on my page, no one can say it was illegal content scraping and take the page down.
    Again, I am not doing content scraping, so I am not trying to justify anyone’s activity, but I do want to give a reality check here to the people who are hurt (and we are hurt too, since our content also appears on the scrapers’ pages).
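
    Since so much above turns on the “extent and size” of what was copied, here is a minimal Python sketch of one way to put a number on it: the fraction of your page’s sentences that reappear word for word on another page. The function names and the sample texts are hypothetical, the sentence splitting is deliberately crude, and the result says nothing about what a lawyer or a court would consider a quote.

```python
import re


def sentences(text: str) -> set[str]:
    """Split text into normalized sentences (lowercase, collapsed whitespace)."""
    parts = re.split(r"[.!?]+", text)
    return {" ".join(p.lower().split()) for p in parts if p.strip()}


def copied_fraction(original: str, suspect: str) -> float:
    """Fraction of the original's sentences that also appear in the suspect page."""
    orig = sentences(original)
    if not orig:
        return 0.0
    return len(orig & sentences(suspect)) / len(orig)


# Hypothetical page texts, for illustration only.
original_page = (
    "Poker rooms compared. We test every bonus. "
    "Payouts are checked monthly. Support is rated too."
)
suspect_page = "Cheap chips here! We test every bonus. Play now."

fraction = copied_fraction(original_page, suspect_page)
print(f"{fraction:.0%} of the original sentences reappear")
```

    A low number here matches the “few sentences” case described above; a number close to 100% is the whole-page case where a complaint is more likely to be worth the effort.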

    So, in regards to the content appearing on his sites, in 99.999% of the cases you cannot do anything within the boundaries of the law, and that is where I would like to stay.

    More than this, since the process of creating these content pages is automated, and we saw that these people are putting up ‘000000 of new pages on an hourly basis on ‘0000 of free hosted domains and TLDs, there is no point complaining about a specific domain, since they are all short-lived ones that will be replaced with many more. Such is life.

    For the people who do not know me: you must be thinking that I do not know SEO if I think that this content creation will give you great results in Google and other SEs, and that links are all that matter. You are right, but the issue is that today those people, and many other legitimate webmasters and SEOs, are getting links easily from a lot of legitimate, good PR sites by using services like, so getting links is not a problem anymore these days.

    So how do you fight it? You don’t. There is nothing here you can really do until Google and such fix their stuff. Concentrate on building good sites and growing a community of readers by providing them with good service and content.

    Now let’s move to the other part – the ads page.

    A lot of talking has been done on it. Hell, even 888 was banned by members here (IMHO, they are the most honest online casino affiliate program we have worked with, and the sequence of events that led to their banning by members here is simply a miscommunication fueled by some interested competitors. But I do not want to get into this discussion or try to get 888 out of the hole they are in; in time they will clear it up, and this is not the subject of this thread, so let’s not go there).

    So members here thought: OK, if we cannot fight them on the legal side, let’s push the vendors not to work with them.
    Now, here is the tricky part: they are fly-by-night operations working in different industries. They do not even try to establish relationships with vendors or get into their affiliate programs (in most cases). What they do is simply send traffic to some shitty small PPC engines and get paid by the click. Sure, the payout is very low compared to what they would have made with a proper affiliate relationship, but they do not care. They work on ‘00000 of terms and many industries, and sending the traffic to a PPC engine is the fastest way to make money without exposing themselves. And WHO BUYS PPC ADS ON THESE PPC ENGINES? We all do! The vendors, the affiliates and even other larger PPC engines that are looking for more traffic for a specific keyword on their PPC page…
    Now there are ‘0000000000 of small PPC engines and we do not know the owners (like we do with the online casino affiliate programs), so they are safe…
    IMHO, the practice of buying ads on PPC pages will not go away anytime soon, so the ban strategy will not work.

    More than this, we have seen incidents where we thought that these scraper sites were sending traffic to a specific vendor. Looking into it further, we found out that these crooks are simply sending traffic to some PPC engines, but instead of sending it to the PPC page where the user has to click an ad, they send it directly to the ad code! This is why some affiliates that bought media on PPC were accused of using scraper sites to send traffic (at least from what we saw). We were in that situation a few times in the past and had to terminate our relationship with the media provider, since some of the people they got media from were, without our knowledge or theirs, active in these practices.

    So what can be done here? Again, almost nothing!

    Again, as I said in the beginning, this is NOT an easy message that I am carrying, but people, this is the industry. There are much harder problems we are seeing, with blackmail, DDOS, spam posted in blogs with your domain name just in order to take your site down, and other things. I do not have a good strategy going forward besides hoping it will be solved in the long term, and I think that only Google and the other SEs can solve it.

    If someone else has other ideas, I will be more than happy to help and use new ways to survive in this industry.

    Take a deep breath, re-read it and think before you comment.



    I must say, a well thought through post, and also unfortunately a typical blackhat approach: abusing the fact that this young industry, with its very limited legal protection, can do very little to protect itself from people with a different morality than ours. Having read many of your SEO posts in the past and knowing how brilliant you are, I can’t help but respect you for your SEO genius and abilities. However, what you are trying to accomplish is twofold.

    You want webmasters to understand what you are doing is not wrong and what 888 is condoning is not wrong. In your ideal world this might be true, however on these forums it must be made clear that we do not support these actions because no matter how you spin the wheel what you are doing is wrong even though legally there might not be recourse, IMO it is still wrong.

    You state nothing can be done to combat sites which steal content and use it in the form of scraper sites, but I disagree completely. Pressure can be applied on the programs turning a blind eye to their affiliate partners who actively participate in this unethical and illegal activity. Indeed to sit back and take no action whatsoever, allows the industry to become more tarnished.

    Reviving the wild west on the internet is not good from the user’s point of view. That, at the end of the day, is my approach to good ethics. The internet belongs to the user: does your site provide the user with good-quality unique content that is a valuable contribution to the internet, or are you the gangster who bombs people with crap just to get the clicks for your own selfish financial gain?

    As far as the DDOS attacks are concerned, I have no opinion. To me it has always been the act of your typical hacker, something I also did when I was in my teens. Whether it was you behind it, I doubt it, as I have way too much respect for you to believe that you would be guilty of such a childish act.


    just to make it clear: you said “You want webmasters to understand what you are doing is not wrong and what 888 is condoning is not wrong. In your ideal world this might be true, however on these forums it must be made clear that we do not support these actions because no matter how you spin the wheel what you are doing is wrong even though legally there might not be recourse, IMO it is still wrong. “

    i do not do content scraping like this. this is a technique which was used by BH 5 years ago; the techniques today are much better and do not involve content scraping at all, but that is for another discussion.

    and i also do not support these techniques (if that was not clear from my prev. post). i do say that such is life, and i am trying to think what we can all do to fight this (and our sites are scraped too, take a look at that asshole’s sites that were reported).

    the problem is that there are newcomers to the BH world who are not that advanced, so they are using these scraping techniques.

    you also said that “Pressure can be applied on the programs turning a blind eye to their affiliate partners who actively participate in this unethical and illegal activity. Indeed to sit back and take no action whatsoever, allows the industry to become more tarnished.” let me show you a case and see how you would act on it. let’s say a BH seo is doing the wrong thing and then redirects his traffic to the term “casino” on a small PPC provider. a larger PPC provider buys a listing on that term from the little PPC provider (happens all the time). we are all bidding for that term on that big PPC provider. is it wrong to bid there? do we really know the source of that PPC traffic? all I am saying is that, looking a few steps ahead, the approach of banning and crying is pointless and you had better direct your efforts into enlarging your community.
    just my 2 cents. i am not trying to justify anyone, i am really trying to get a reality check, and maybe as a result of this thread and the provocative things said here a solution will emerge. but i am looking for a REAL solution and not something which i can see even now is not working.


    Hi Janet

    I am sorry for the misunderstanding. I replied to your post according to my understanding of what your intention for the post was. If you are posting to be objective, by all means distance yourself from my remarks and opinions, but I hardly view your opinion as a reality check, as it is just another typical blackhat approach to the internet being a free-for-all, happy-go-lucky world. This is still your opinion, or an opinion, and I raise mine to debate it.

    Your PPC example sure creates another debate about where to draw the line; however, I am very particular about where I spend my marketing budget. Sites I advertise with are thoroughly investigated and I will not support any doubtful businesses.


    Give me a few days, I will respond with the advice of Raymond, Hobbit, Steve m, Frankilin Davis, Kous van dout and others. It will take a while to collect all the personal insights into this ever-growing problem. Perhaps Eric will respond, I hope he makes it to VEGAS. greek39/esr/


    If someone capitalizes on my work without permission it is theft.

    Very simple.

    Most times I just ignore it, but when it interferes with my conducting my own business you can be sure I will go after it.


    The personal views from the people I mentioned are all from contributors to the internet, not the www. They also strongly believe in the hacker ethic, i.e. cracking is okay for fun and exploration as long as no theft, vandalism or breach of trust has occurred.

    But this will take valuable time away from my main goal, and just may take longer than expected. We will see where this productive discussion leads. greek39^*/


    yep, let’s name a few: google, yahoo, msn etc. i am not joking. YOU, by not writing some restrictive rules in the htaccess or robots.txt or the page, allowed them to take your content, index it, show snippets of it and even distribute it using the web and rss to 3rd parties (in your case some of those 3rd parties are people who take this content and build “new” content pages) without them promising you ANY financial gain whatsoever.
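
    To make the robots.txt point concrete, here is a small Python sketch, using only the standard library, that checks what a given set of rules would allow for a given crawler. The robots.txt content, the crawler name `ExampleScraperBot` and the example.com URLs are all made up for illustration. Keep in mind that robots.txt is a voluntary convention: the big SE spiders follow it, but nothing forces a scraper to.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: shut out one named crawler entirely,
# and keep /members/ away from every other crawler.
ROBOTS_TXT = """\
User-agent: ExampleScraperBot
Disallow: /

User-agent: *
Disallow: /members/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent: str, url: str) -> bool:
    """True if the rules above permit this crawler to fetch this URL."""
    return parser.can_fetch(agent, url)


print(allowed("ExampleScraperBot", "https://example.com/reviews.html"))  # False
print(allowed("Googlebot", "https://example.com/reviews.html"))          # True
print(allowed("Googlebot", "https://example.com/members/stats.html"))    # False
```

    The same check can be run against a rule set before it is published, to confirm that it blocks what is intended without also shutting out the SEs that send real visitors.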

    all of these SEs capitalize on your work, sell ads and clicks etc. (and they are doing much more than we all do together)

    so where do you draw the lines both legally and morally ?

    i know you are pissed and i know you take it personally, but i am still trying to be objective and play the devil’s advocate. again, i am not endorsing scraping behavior, but i am trying to put things in the right perspective and bring a reality check to it.

    they may have not even scraped your site, they may have scraped google SERPs ….

    I will repeat it again for the guys who did not get it from the prev discussions. If someone takes your whole site, that is illegal. A few sentences is not illegal just because someone thinks it is “immoral” and reports it to vendors, when in fact the ONLY ISSUE IS THEY UTILIZE SEO LESS. In no way does it harm a site’s ranking or anything else to have a few of your sentences displayed on other sites. When I copy a block of content from some of my “real sites” into google search, it sometimes returns 5k results! These “real sites” rank fine. To me that says it has no effect whatsoever; if anything the effect is good, since i am getting more backlinks.

    To the point: in what way could the person scraping have possibly harmed any webmaster? Even if a couple of sentences appeared on his site, as they say HE CLOAKS and no one sees it, so this obviously would NEVER be the reason the scraper ranks. Where is the harm? i also think the scraper does not crawl anyone’s sites for content.

    Lastly, it is not the vendor’s responsibility to determine whether a site in their network is acting within the fair use laws. Lawyers can barely determine this, much less affiliate managers!

    and if a vendor doesn’t want the traffic, it takes 10 minutes for the scraper to point it elsewhere (there are over 1200 vendors out there). I must say that this problem is not unique to our industry. it is everywhere, and none of the vendors care about any of this noise (and i am talking about vendors like the biggest banks, credit card companies, insurance, auto, and online commerce like ebay and such).

    but i still want to get some constructive ideas in how to resolve this.


    If it interferes with my being able to do business I have to go after it.

    What do you expect, people sitting back and watching their businesses shrivel up and saying: oh, ok, nothing constructive to be done here?

    Someplace theory stops and reality sets in.

    And in reality, these practices need to be stopped and we know exactly where and when.

    Who all else does the same thing with no ill effect is irrelevant.

    And the only recourse affiliates have is the affiliate programs who fund this stuff.

    If you can’t go after it with the law, you have to cut off the money supply.

    Also, I am not at all convinced this cannot be fought legally, I have varying opinions on that.



    white hat can achieve the serps it wants to target too, without upsetting other people’s hard-earned SERPS.

    Proof in point:


    I must ask Janet respectfully, do you endorse marketing strategies? What are some of your brief thoughts on the matter? I am pretty sure you are aware of the script auction house, a place to go for malicious hacker scripts, where the highest neophyte bidder takes it.


    Dominique wrote:
    If it interferes with my being able to do business I have to go after it.

    it is not yet clear to me how this interferes with you being able to do business. i have NO INDICATION whatsoever that it hurts your ranking; if anything, it just improves it due to the additional links and the brand recognition.

    also, the content of the scraper site is not seen by regular users, since they are immediately redirected to a “money” page. but take google: your snippet is seen there, and it may well be that the user who was interested in the information you provided already gets the answer by looking at the snippet of text in google AND NEVER EVEN GOES TO YOUR SITE and simply clicks one of the ads in google.

    but yet you allow google to show snippets. think about it, who hurts you more? and more important than this, what can you do?

    the thread is becoming more productive by the moment …

    Janet wrote:

    the thread is becoming more productive by the moment …

    It just hit a brick wall because I am not a working example of what this does and does not do to profitability.

    I have never watched my SERPS, and I am not about to go into a big research project to prove my point here, I have a site to take care of. I am all customer service, and not SEO.

    You need to look at what it does to a smaller site that actually can identify the location of the loss of a couple hundred SE positions.

    The most I can detect is a slowing of growth. How much money is actually lost – it would cost me a bundle to conduct the research to find out.

    But it doesn’t take rocket science to figure out that every lost SE position is a loss of income.

    When you have 15,000 sites scraping you, it has to have an impact. Or else the scraper isn’t worth anything anyway and he will just disappear shortly.


    Just had a quick check on google, using the names of the sites that have rogued 888, written without spaces and with their tld extension, and it appears many have been hit by these sites with .pl and .info domain names.

    Sites affected and which have subsequently accrued these thousands of backlinks are:

    This is definitely a crude attempt at google bowling our sites, and there are many more being hit in addition to those four sites listed above.
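
    For anyone who wants to repeat that kind of check on their own site, a rough Python sketch: take a list of pages linking to you and count them by the last part of the hostname, so a sudden pile of .pl or .info links stands out. The URLs below are invented placeholders; a real list would have to come from a search or a backlink report, and the last hostname label is only a crude stand-in for the TLD.

```python
from collections import Counter
from urllib.parse import urlparse


def tld_counts(urls: list[str]) -> Counter:
    """Count linking pages by the last label of their hostname (a rough TLD)."""
    counts = Counter()
    for url in urls:
        host = urlparse(url).hostname
        if host and "." in host:
            counts[host.rsplit(".", 1)[1]] += 1
    return counts


# Hypothetical backlink URLs, for illustration only.
backlinks = [
    "http://a.example.pl/page1",
    "http://b.example.pl/page2",
    "http://c.example.info/x",
    "http://blog.example.com/post",
]

for tld, n in tld_counts(backlinks).most_common():
    print(tld, n)
```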

    This means we were correct in our action in roguing 888 en masse for their failure to police their affiliate network. 888 and/or the Super Scraper who promotes them using stolen content must be very pissed off and be hurting by our just and right course of action in roguing 888.

    Fully well knowing this, means I will be sinking a few beers in celebration as a result :)

    greek39 wrote:
    I must ask Janet respectfully, do you endorse marketing strategies? What are some of your brief thoughts on the matter? I am pretty sure you are aware of the script auction house, a place to go for malicious hacker scripts, where the highest neophyte bidder takes it.

    greek39, marketing strategies are their own biz. i am not the one to say if it is OK or not. if someone feels offended he needs to clear it up with them, and if not then take some other action, as long as it can benefit him or the group in the long run.
    my thoughts on the matter (and i do not want to offend anyone, just to be completely frank) are that people are looking at their traffic and seeing scraper sites that are doing better SEO than them (see my 1st post in this thread) rank higher than them. yep, it is upsetting, but as long as the scraper has not done anything that is against the law, i do not see anything you can do. the best you can do is be better at SEO, learn the tricks of the trade, and not assume that content is everything, since we all know that it is not.
    i was not familiar with the website you mentioned. some people mistake people who are using aggressive SEO for malicious hackers. it is not the same: you can do aggressive SEO and still be within the boundaries of the law.
    morality is a very subjective thing, you cannot take someone to court based on that …

    if someone is doing a better job at SEO than you and you cannot compete, then drop out of the race and do not pay attention to your position in the SERPs. it is that simple. if you do want to compete, then learn, use tools, consult with others. so many people have asked me SEO questions in private and i helped them.
    no one will do the SEO work for you. the fact that you ASSUME that your site serves the visitor better than the scraper site is a very subjective thing. the scraper may think that by redirecting the traffic directly to the vendor he serves the visitor better than by showing him a lot of content while he just wants to start playing. the SE view on this is also subjective: google has one subjective view of which sites rank for a specific term, while MSN and every other SE each have a different one.

    this is the reality. now, I do think white hat sites can win, and it may be even easier for them to win if they utilize the tools out there. there is no legal issue in using these tools, maybe moral ones (like getting links from places you did not talk with directly but where you both agreed to some link-dealing agreement).

    I also think that there is no such thing as pure white hat. if you changed your title tag, or exchanged or even asked for a link, you’re not white hat anymore …
    it is all different shades of gray …

    the shade level of gray is in the eye of the beholder

    I hope that the message is clear.
