Did you know that it is possible to tell whether or not Google has disavowed a link that is in your disavow file? This is HUGE news. John Mueller from Google has told us several times that some links can take up to six months to be recrawled and subsequently disavowed. What this means is that even if you have done a thorough disavow, if Penguin refreshes before the majority of your links in your disavow file have been recrawled then the Penguin algorithm will still look at your site as untrustworthy. This is why some sites need to see two Penguin refreshes in order to escape the Penguin algorithm. But, if we can tell whether our links have actually been disavowed, then perhaps we can know whether we’re ready for a Penguin refresh.
In this article I’ll explain how you can tell if a link has been disavowed and I’ll also share with you some very interesting information from some testing that I did. I tested to see if it really did take 6 months for links to be disavowed and in the process I made a discovery that might be a reason why some sites never escape from Penguin or other link related penalties/algo changes.
A cached link is a disavowed link
I asked John Mueller a question in a Webmaster Central Forum hangout recently. What I wanted to know was this:
If a site is in my disavow file and I see that that site has been recached since I disavowed can I assume it’s been disavowed?
His answer surprised me. I figured he was going to say that the cache date was not at all connected to the date when the link was disavowed, but actually it is!
John said that every time they crawl a url, if it’s in your disavow file then the disavow gets applied. (When a disavow is applied it is essentially the same as Google applying an invisible nofollow tag to the link that is pointing to you.) So, if you have disavowed the home page of a site and you check the home page and see that it has been recached then you can assume your disavow has taken effect. The same thing applies for individual pages. Now, John says it’s possible for the Google index to be updated and the page NOT to be cached. But, if you do see that the page is cached then your link really should be disavowed.
In other words:
a) If you see that the page has been cached on a date that is AFTER the date that you disavowed the url or domain then your link is disavowed.
b) If you don’t see a new cache since your disavow was filed then there’s still a chance that the page has been revisited and the disavow applied, but it’s possible that it has not yet been disavowed.
It’s important to note that if you disavowed the entire domain, it’s not enough for the home page to be recached. The actual url that contains your link(s) needs to be recached before you can know that your link has been disavowed.
John was asked if there is any other way that we can check whether our link has been disavowed. He said that the cached page is a reasonable way to check. Another way, if the actual pages have a date stamp on them, is to do a site: query and search by date to see if Google has updated the index.
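The rule of thumb in (a) and (b) can be written down as a small check. This is only a sketch of the logic described above, not anything Google provides: the function name `disavow_status`, the status labels and the dates are all made up for illustration, and treating a cache made on the same day as the disavow as "not yet confirmed" is my own cautious assumption.

```python
from datetime import date
from typing import Optional


def disavow_status(disavow_filed: date, cache_date: Optional[date]) -> str:
    """Classify one disavowed url using the cache-date rule of thumb.

    disavow_filed: the day the disavow file was uploaded.
    cache_date: the date shown on Google's cached copy of the linking
                page, or None if no cached copy was found.
    """
    if cache_date is None:
        # No cached copy at all: the cache tells us nothing either way.
        return "no cache"
    if cache_date > disavow_filed:
        # Rule (a): the page was cached after the disavow was filed.
        return "disavowed"
    # Rule (b): no new cache since the disavow. Google may still have
    # revisited the page, but the cache cannot confirm it.
    return "possibly pending"


if __name__ == "__main__":
    filed = date(2014, 6, 1)  # hypothetical filing date
    print(disavow_status(filed, date(2014, 6, 21)))
    print(disavow_status(filed, date(2014, 5, 15)))
    print(disavow_status(filed, None))
```

Remember that for a domain-level disavow the cache date that matters is the one for the page that actually contains the link, not the home page.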
How can you tell if a page has been cached?
To tell if a page has been cached, what you do is search for the url (in quotes) in Google and then click on the little green arrow that appears after the url in the Google search results. You can then click on “Cached”:
Then, you’ll see the cache date in the grey box at the top of the page:
In the example above, the page was last cached on Jun 21, 2014. What this means in regards to our disavow file is that if this were a link that I had disavowed, and I had filed my disavow prior to June 21, I could consider this link disavowed.
You can also check cached urls in bulk using a tool such as Scrapebox’s Google Cache Extractor Addon. I did just that as a little experiment.
I did some testing – were my clients’ links actually disavowed or were we still waiting for the links to be revisited and cached?
This next part of this article is interesting. I did a bunch of tests to see how I could benefit from knowing the cache date of urls that I had disavowed.
Does it really take 6 months to revisit a link?
John Mueller has said in the past that it can take up to 6 months for a link to be revisited and disavowed. What I did was take a look at five of my past penalty removal clients who had very spammy link profiles. My assumption was that the spammiest of links would be the ones that rarely got recrawled. I ran all of their urls through Scrapebox’s Google Cache Extractor Addon. For a large number of the links, no cache of the page was found. I’ll write more on this later on in this article. But, for the remaining links I looked at the cache dates and extracted some interesting information:
Oldest Cache Date
The oldest cache date that I found was March 27, 2014, which is under 3 months ago (it is June 23 as I write this). This means that if this link had been disavowed in April or May, it would still be considered a live link by Google’s algorithms because Google has not yet revisited this url to apply my disavow.
Age of Cache Dates
- 0% of the urls had a cache date older than 3 months.
- 14% of the urls had a cache date that was between 2 and 3 months ago.
- 23% of the urls had a cache date that was between 1 and 2 months ago.
- 63% of the urls had a cache date that was less than a month ago.
Now, we can’t make any set-in-stone predictions based on this data as it comes from a relatively small sample that looked at the backlinks of only 5 sites. But, what I can see is that while the majority of my unnatural links that are in the Google cache get revisited (and subsequently disavowed) within a month, a good percentage of links are going to take 2-3 months to get recrawled and disavowed. What this means is that if you file a disavow today and Penguin refreshes two weeks from now, there’s a good chance that Penguin is still going to be affecting your site and that you will need to wait until the next refresh in order for your disavow to be fully recognized.
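If you export cache dates in bulk, a breakdown like the one above takes only a few lines to compute. The sketch below is illustrative only: the function names and the sample dates are hypothetical, and I am assuming a "month" means 30 days, which may not match exactly how the percentages above were tallied.

```python
from collections import Counter
from datetime import date

BUCKETS = ["under 1 month", "1-2 months", "2-3 months", "over 3 months"]


def age_bucket(cache_date: date, today: date) -> str:
    """Place one cache date into an age bucket (assumes 30-day months)."""
    days = (today - cache_date).days
    if days < 30:
        return BUCKETS[0]
    if days < 60:
        return BUCKETS[1]
    if days < 90:
        return BUCKETS[2]
    return BUCKETS[3]


def bucket_shares(cache_dates: list, today: date) -> dict:
    """Return the rounded percentage of urls that fall in each bucket."""
    if not cache_dates:
        return {bucket: 0 for bucket in BUCKETS}
    counts = Counter(age_bucket(d, today) for d in cache_dates)
    return {b: round(100 * counts[b] / len(cache_dates)) for b in BUCKETS}


if __name__ == "__main__":
    today = date(2014, 6, 23)  # the date this article was written
    # Hypothetical cache dates, not the real client data.
    sample = [date(2014, 6, 21), date(2014, 6, 1),
              date(2014, 5, 10), date(2014, 3, 27)]
    print(bucket_shares(sample, today))
```

By this measure the oldest cache date in my data, March 27, 2014, is 88 days old as of June 23, which puts it in the 2-3 month bucket.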
What about links that are NOT in the Google cache?
When I first ran my disavowed links through Scrapebox’s Google Cache checker I saw a lot of results that looked like this:
There were a large number of links for which Scrapebox reported “No Cache”. At first I thought that this was a problem with Scrapebox or perhaps with the proxies I was using, but after manually checking a number of these I realized that they actually were not in the Google cache. In fact, a good number of the bad links that were pointing to most of these sites are no longer in the Google index. Now, of course that doesn’t surprise me as Google works hard to deindex spammy sites. But, it got me thinking.
What happens to links in my disavow file that are no longer crawled and indexed by Google? What if I have disavowed a link and Google never revisits it again? Can it be disavowed? Or is it always going to remain as a bad link to my site?
Do these deindexed/no longer cached pages just drop out of a site’s backlink profile?
I figured that perhaps most of these deindexed pages would just drop out of my link graph so I did another test. What I did was take one of my clients and determine how many of the links that I had disavowed were no longer in their Webmaster Tools list of links. I did the following:
- I made a list of all of the links that were in my disavow and that Scrapebox was reporting “no cache” for.
- I pulled out the Google Webmaster Tools (GWT) links that I had downloaded for this client back in October of 2013.
- Next, I used a VLOOKUP to determine how many of my disavowed and no longer cached links were in the list of links from GWT back in October of 2013.
- This showed me that there were links from 363 domains that were originally in our GWT list but are no longer in Google’s cache.
- I then downloaded the most recent set of links from GWT for this client and did another VLOOKUP to see how many of my disavowed, no longer cached links were in our current list of backlinks.
My assumption was that the majority of the links that came from sites that had been deindexed and no longer existed in Google’s cache would have just dropped out of the link profile and would no longer be counting towards my client’s site.
However, it turns out that out of 363 linking domains, 206 of them still remained as live links according to GWT! That means 57% of these linking domains were still in Google’s list of backlinks in Webmaster Tools! Have these links been disavowed? If there are no pages linking to these spammy linking pages, how will Google ever revisit them to apply my disavow directive?
Added later: I asked John Mueller about this and he told me that pages that are not in the Google cache are not used for algorithmic calculations.
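The two VLOOKUP steps in this test are really just set comparisons, so they can be done outside of a spreadsheet as well. Here is a rough sketch under my own assumptions: the function names are invented, the domains are placeholders, and it assumes each list has already been reduced to one linking domain per entry.

```python
def normalise(domains) -> set:
    """Lowercase and trim each domain so the lists compare cleanly."""
    return {d.strip().lower() for d in domains if d.strip()}


def still_listed(no_cache_disavowed, old_gwt, current_gwt):
    """Compare disavowed, no-longer-cached domains against two GWT exports.

    Returns (originally_listed, remaining, percent_remaining).
    """
    uncached = normalise(no_cache_disavowed)
    # First lookup: which of these domains were in the older GWT export?
    originally_listed = uncached & normalise(old_gwt)
    # Second lookup: which of those are still in the latest GWT export?
    remaining = originally_listed & normalise(current_gwt)
    if originally_listed:
        percent = 100 * len(remaining) / len(originally_listed)
    else:
        percent = 0.0
    return originally_listed, remaining, percent


if __name__ == "__main__":
    # Placeholder domains, not real client data.
    uncached = ["spam-a.example", "spam-b.example", "spam-c.example"]
    october_export = ["spam-a.example", "Spam-B.example", "good.example"]
    latest_export = ["spam-b.example", "good.example"]
    listed, remaining, percent = still_listed(
        uncached, october_export, latest_export)
    print(len(listed), len(remaining), round(percent))
```

With my client’s numbers, 206 remaining out of 363 works out to about 57%.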
Can you force Google to revisit an old page?
Many people have asked me if it is possible to speed up the disavow process. John Mueller was asked in a hangout a while back if you could use the submit to Google feature to get Google to revisit a url but John said that that wouldn’t work. Another possible solution is to actually build links to the pages that you want Google to revisit. The idea is that as Google spiders the pages that carry these links, it will end up spidering the spammy pages as it follows the links. To do this you’d likely want to set up the links on sites that you don’t care about and that aren’t connected with your own sites. There is no specific penalty that I am aware of that Google would give for this type of action but my gut tells me that it’s something that might be frowned upon by the webspam team. Still…I’m in the process of doing some experiments. I’ll let you know if they work out.
Conclusions
My hope is that Google has accounted for this in their disavow calculations, but who knows. John Mueller was asked in a hangout in March of 2014 whether we should disavow links from deindexed pages and he said that we should. The reasons he gave were that these pages could pop back in the index and also that sometimes deindexed pages can still pass PageRank. This means that these deindexed and no longer cached pages can still be passing bad link signals to your site and I think that it is possible that they will never get recrawled and never get disavowed!
Who knows…perhaps this is one of the reasons why Penguin has not refreshed in so long. Perhaps Google has realized that too many sites that have done good thorough cleanups are not going to escape Penguin because Google can’t recognize that they have disavowed so many sites that are no longer in the Google index. Maybe they’re still trying to find ways to recognize ALL of a site’s disavowed links.
What do you think?
Comments
Marie, this is GOLDEN information! This is the kind of article that I like to read from the SEO community. Thanks so much for sharing your experiment with us. You just got yourself a very loyal follower. All the best! =)
Thanks Gent! I had a few people tell me that I give away too much info, but really, I’m not worried that my competitors get to learn this stuff. There’s plenty of penalty work to go around. 🙂
“John Mueller was asked in a hangout a while back if you could use the submit to Google feature to get Google to revisit a url but John said that that wouldn’t work. ”
What are your thoughts on this? Surely if Google revisits a URL it recaches it? I’ve got plenty of URLs recached in minutes using that tool …
Nice one Marie. Part of this relates to a question I had asked earlier. I’m going to act on it. (surreptitiously in some way, needless to say) 😀
This is great work Marie, thanks for sharing it with the rest of us.
Thanks for the encouragement Saijo. Glad you found it useful!
Sweet stuff. Marie, could you not submit a “disavow-sitemap.xml” file? It would have a list of all the backlinks in your disavow file.
Hmmmm…interesting question. I have to admit that I don’t know as much about sitemaps as I could. Can you submit a sitemap with links to sites that are not yours? Also, I wonder if there would be any negative component to giving Google more associations between your site and the spammy ones?
Certainly a risk. Although, you could submit the disavow from any domain (similar to linking from a random domain that is not associated with the main site).
Ideally, one would create a separate Google account, then use a throwaway blog to house the disavow sitemap.
Thanks. I may consider trying this. I’m currently running some experiments to find ways to cache pages quicker and I’m having some success with some methods. May publish once I’m done.
Thanks for the great info! What would you recommend for a webmaster who is seeing small amounts of link spam that are not in the previous disavow file? Would you risk resubmitting another disavow file before the next penguin refresh or wait it out?
Most definitely disavow them. There is no risk in filing another disavow, but there is certainly a risk that the next Penguin refresh would take into account those bad links if you don’t disavow them!
Thanks for the reply!
Interesting article. Everything just got a bit more complicated. 🙂 Maybe Google doesn’t even know. I would like to see how your current experiments are going as of 5 months ago…
Wow you’re such a smart cookie.
Good, and interesting info re: disavowed links – thanks 🙂
It’s very easy to tell if the disavowed link has taken effect. Just look at your inbound links in the Google Webmaster tools page! They disappear when they are disavowed. Like, duh.
Hi Matt,
I wish this was true, but it’s not. Disavowed links will still be in your list within Google Search Console (Webmaster Tools). John Mueller has said several times that they stay there just like nofollowed links do.
I have taken your findings and put them to the test. I disavowed 30 spammy blog posts. It’s been almost two months now. All but fifteen have been deindexed. Out of the fifteen that are live, Google has revisited six since the disavow. (According to the cache date they should be disavowed.) From my research, if the page is not indexed it provides the site no link juice.
Great article!
Thanks Dan. It’s interesting that after two months there are still nine of these posts that have not been recrawled by Google!
Okay, I fully understand what you are saying, Marie, and I have been doing the same thing to check. However, what I have been doing is using a mass ping tool to ping all the domains and urls that I disavow and, boom, Google revisits those spammy sites and my disavowed links take effect in 1-2 weeks instead of months. This method also works pretty well, like 70-85% disavowed, if you just submit the root domain to a mass pinger, and then Google sees that the entire domain is spammy and pretty much your site is nice and clean. For small sites I just use the Google URL submitter and that makes it almost instant.
Can you share with me which tool you used? I tested a bunch of pinging tools and I found that I could not get sites crawled any faster with any of the tools I used. But, perhaps it is time for me to do another experiment!
Hi Marie, thanks so much for writing this. Still so relevant.
So helpful to have content like this in an industry that is so full of smoke and mirrors.
Joe
Hi Marie,
Thank you for the most comprehensive and easy to understand article on ‘Disavowed links’. It really is a pain of a subject. I only found out about disavowing a link last month.
I then followed step by step directions to disavow a spammy looking website that has backlinked to my website. What I found was interesting to me: 1) in the ‘Moz’ research tools that site seems to have 423 back links to me – but – with 0 spam score! Nevertheless, I disavowed the root domain of this site because I know it has nothing to do with my site and content! This happened about 3-4 weeks ago.
As soon as I read your article I checked again, and sure enough, this site still appears to have back links to my site, in the same Moz research tool.
I did the ‘cache’ check you describe and it appears that the date of the cache page is 2 days ago.
What should I presume? That my disavow has taken effect? That I still need to wait since I see it still has back links to my site?
Any advice is deeply appreciated… 🙂
Hi Tony,
You may not actually need to disavow that link. The disavow tool should be used by people who have a large number of links that were made with the primary intention of manipulating Google. When Google takes action (either algorithmically or manually) because of unnatural links it’s always on people who have been actively trying to create their own links.
Don’t worry about the fact that one site has 423 links. It’s probably a sitewide link. Unless that was a link that you paid for, or that you made yourself for SEO reasons, it’s not going to cause you problems.
Regarding Moz’s spam score, it’s something that can sometimes be useful in helping to make decisions on disavowing. However, it’s often not accurate. I’ve seen really bad links that have low spam scores and the opposite.
Disavowed links will still appear in your list of links from Google Search Console and also other backlink checkers and it certainly won’t remove your link from that site. All that happens is that Google will no longer use that link in their calculations of PageRank.
And yes, if the url has been cached since you filed the disavow then you can assume that the disavow has taken effect.
Hope that helps!
Hi Marie,
Thank you very much for your really prompt and helpful response! I feel better and not as worried now 🙂 I certainly learned more after reading this article – it’s a big scary world out there 🙂
Thanks again for your time!
Hi Marie,
I’m new to the Disavow and just submitted my first Disavow list to Google. After doing so I noticed that one of my biggest and best domain links was buried in the list. Are you kidding me? I quickly went in and deleted the previous list and uploaded a revised list without my cherished link. Did I catch it in time or am I doomed?
This occurred within a 10 minute window.
Sick to my stomach…
Ha! True confession – I have done this too. You really should be fine now that you have removed the domain from the list.
You mentioned that the actual url that contains your link needs to be visited/recached for your link to be disavowed. What happens if a spammy site linked to me 400 times and then the host closed their account? Is it still possible to disavow links that will never be visited by Google anymore? Is there any way to undo this damage?
If the site linking to you is not indexed, then Google won’t be using that link in their calculations. So, you can probably ignore links like this.
Hi Marie,
There are 700 spammy links pointing to my site. I have disavowed all those links, it’s been 2 months and nothing…my ranking has been dropping. Those links are still live but no longer indexed in Google. Is there anything I can do?
Hi. A lot has changed since I first wrote this post. While disavowing still can help *some* sites, Google’s algorithms are now good at ignoring spammy links. The types of links that can still harm a site in my opinion are paid links and links made with the intention of improving your rankings. The rest are likely just ignored. I wouldn’t expect it to help to disavow spammy links these days. I’ll still disavow them if I’m already filing a disavow for other reasons, but wouldn’t recommend a disavow for a site that hasn’t been actively involved in making links for SEO. (The rare exception would be a site that was under a sophisticated negative SEO link attack – not spammy links.)
This is one amazing article, Marie! I’m spell-bound. I just have 2 questions in my mind.
How relevant is this now, in 2019? Should I now (in 2019) check if Google has cached all the URLs that I disavowed?
[Our website has bought a lot of paid links in the past. Presently we are buying high quality backlinks alone.]
Thanks Arun. Here are our latest thoughts on disavowing in 2019:
https://www.mariehaynes.com/does-disavowing-links-work-2019/
And yes, if you’ve paid for links in the past, even if it was years ago, you should be looking at disavowing those.