I am seeing a lot of confusion on the web regarding Google employee Gary Illyes' statement that Penguin will be refreshing in the next few weeks. According to an article on Search Engine Land, Gary said the following about disavow files:
Gary also said that if you disavow bad links now or as of about two weeks ago, it will likely be too late for this next Penguin refresh.
A lot of people are confused by this. Here are some of the questions that I have seen:
I thought it would be a good idea to write a post explaining what is happening and why you likely do not need to worry about your disavow file, provided that you had a decent one in place a couple of weeks ago.
A refresher on how the disavow tool works
This will likely be old news to most of you, but let's talk first about what happens when we use the disavow tool. Once a link is in your disavow file, the next time Google crawls a page containing that link to your site, it will add something like an invisible nofollow tag to the link. This invisible nofollow tag does two things:
1. It stops the flow of PageRank (or "link juice") through that link.
2. It tells Google not to use that link when making calculations for link-based algorithms such as Penguin.
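To make the idea concrete, here is a minimal Python sketch of a disavow list acting as a filter on backlinks. It is only an illustration of the description above, not how Google actually implements it. The file layout (comment lines starting with #, domain: entries, and full URLs) follows the disavow file format; the function names, the example URLs, and the choice to treat subdomains as covered by a domain: entry are my own assumptions.

```python
from urllib.parse import urlparse


def parse_disavow(text):
    """Split disavow file text into a set of domains and a set of exact URLs."""
    domains, urls = set(), set()
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.lower().startswith("domain:"):
            domains.add(line[len("domain:"):].strip().lower())
        else:
            urls.add(line)
    return domains, urls


def is_disavowed(link_url, domains, urls):
    """True if the linking page is listed directly or sits on a listed domain."""
    if link_url in urls:
        return True
    host = (urlparse(link_url).hostname or "").lower()
    # Assumption for this sketch: a domain entry also covers its subdomains.
    return any(host == d or host.endswith("." + d) for d in domains)


def links_counted(backlinks, disavow_text):
    """Return the backlinks that would still count after the 'invisible nofollow'."""
    domains, urls = parse_disavow(disavow_text)
    return [u for u in backlinks if not is_disavowed(u, domains, urls)]
```

Given a disavow file that lists domain:spam.example and one specific URL, links_counted would drop every link from spam.example and that single page, and keep everything else.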
A refresher on how Penguin works
Now, I don't believe that anyone outside of Google can really say how Penguin works. But the way I understand it, Penguin makes an overall measure of how much trust Google can put in the links pointing to your site. We know that there are on-page factors that can contribute to Penguin, but Google has said that links are the primary area of concern.
@joshbachynski saw your comment on Barry's post. Certainly links are a primary area to monitor. Been true all this year; expect to continue.
— Matt Cutts (@mattcutts) August 16, 2012
If Penguin determines that your links are untrustworthy, then the algorithm can cause your entire site to perform poorly in the search results. John Mueller from Google describes it as having an anchor that is pulling your site down.
The important part: Penguin does not run all the time
It is important to know that at this point, Penguin is not running all the time. What happens is this:
- Google gathers data about your links. This is likely a one-time process each time Penguin runs. The way I imagine it, someone from Google presses a button and the "Penguin crawler" is unleashed to gather all of this data. This is where they take into account the fact that you have disavowed links, or possibly the fact that it looks like you have been building new unnatural links.
- Calculations are made. At this point, Google takes all of this new data and decides to what degree Penguin will affect your site. It's probably a very complicated process. But the simplified version as I see it is that they decide how big of an anchor they will be placing on you to suppress you in the rankings. Many people do not know that Penguin can affect sites to different degrees. Sites that have done some serious spamming can be severely demoted by Penguin, while others may just see a small decline.
- Penguin reruns. Up until this point, Google is just holding on to this data. Nothing has changed in the search results. But, when Google decides to refresh or update the algorithm (more on the difference below), THEN, you see the results of these calculations and you will see your site either increase or decrease in the search results.
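Here is a toy Python model of those three steps, written only to show why timing matters. Everything in it is hypothetical: the class name, the idea of scoring a site by the share of unnatural links, and the numbers are my own simplifications, not anything Google has described.

```python
class PenguinCycle:
    """Toy model: gather a snapshot, calculate from it, publish on rerun."""

    def __init__(self):
        self.snapshot = []   # links captured when data was gathered
        self.pending = None  # calculated but not yet visible in search
        self.live = 0.0      # suppression currently reflected in rankings

    def gather(self, links, disavowed):
        # links: list of (url, is_unnatural). Disavowed links are dropped here,
        # so anything disavowed after this call is not seen until the next cycle.
        self.snapshot = [(url, bad) for url, bad in links if url not in disavowed]

    def calculate(self):
        # Made-up measure: fraction of counted links that look unnatural.
        if not self.snapshot:
            self.pending = 0.0
        else:
            self.pending = sum(bad for _, bad in self.snapshot) / len(self.snapshot)

    def rerun(self):
        # Only now do the rankings change.
        if self.pending is not None:
            self.live = self.pending
```

If you call gather with one bad link disavowed, then disavow a second bad link afterwards, calculate still counts the second link, and live does not move at all until rerun is called.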
Getting back to Gary Illyes' statement about recently disavowed links
In the bullet points above, I mentioned that part of the process of running Penguin is for Google to gather data about your links. I believe that what Gary is saying is that this part of the process has already been completed for Penguin 3.0. Google is currently processing this data, running tests on what the results would look like depending on how they deal with it, and then, at some point they will rerun the algorithm and we will all see the results.
What this means is that if you disavow a link today, it's not going to be included in the calculations for this impending Penguin update. But all of the links you disavowed before Google gathered its Penguin data are the ones that matter right now.
Gary did also comment that in the future, they will be running Penguin more frequently. So, by all means, if you have links to disavow, do it now. They will likely be taken into account with the next Penguin run.
The difference between a Penguin refresh and an update
When Google updates an algorithm like Penguin, what they are doing is changing how they deal with the data that they get. For example, a new Penguin update may put greater weight on a certain type of unnatural link. Or, it might do different calculations to determine just how much suppression a Penguin-hit site will get in the search engine results. A refresh, on the other hand, uses the same calculations as the last time the algorithm ran.
An analogy would be to look at a car engine. An update of the engine would be if you tinkered with it and replaced some parts and perhaps made changes to how it processes the oil and gasoline. A refresh of the engine would mean that you just turned it on again.
In this upcoming Penguin rerun we will be seeing an update. Google is making changes to how the algorithm is run. In the future though, they'll likely do refreshes. With each refresh, Google will gather disavow data, apply the current algorithm to it, and then roll out the results. With a refresh, the algorithm stays the same, but you can still see changes for your site if you have disavowed or removed links or if you have obtained new unnatural links.
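The same distinction can be shown in a few lines of Python. This is purely a sketch: both scoring functions, the link types, and the double weighting are invented for illustration. A refresh runs the same function on new data, while an update swaps in a different function.

```python
def score_v1(links):
    """Baseline calculation: share of links flagged as unnatural."""
    if not links:
        return 0.0
    return sum(1 for kind, unnatural in links if unnatural) / len(links)


def score_v2(links):
    """'Updated' calculation: unnatural widget links count double (made-up rule)."""
    if not links:
        return 0.0
    weight = sum((2 if kind == "widget" else 1)
                 for kind, unnatural in links if unnatural)
    return weight / len(links)


# Each link is (type, is_unnatural); the data is hypothetical.
old_links = [("widget", True), ("article", True),
             ("editorial", False), ("editorial", False)]
# Same site after the widget link has been disavowed.
new_links = [("article", True), ("editorial", False), ("editorial", False)]

refresh_result = score_v1(new_links)  # same calculation, new data
update_result = score_v2(old_links)   # new calculation, same data
```

In both cases the site's result changes compared with score_v1(old_links), but for different reasons: in the refresh the data changed, and in the update the calculation changed.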
Questions?
Hopefully that clears things up a little! But if there are still areas where you are confused about Penguin, I'm happy to see if I can help. Leave a comment below.
Comments
Great write up. I’ve been meaning to do a disavow clean up for a few weeks but didn’t get round to it. Hopefully I won’t soon be kicking myself. It does annoy me that I even have to do this in the first place, it’s the third parties that scrape my site and send dodgy links to me.