If you have been working diligently on cleaning up your backlink profile in order to recover from the Penguin algorithm, you may also want to take a thorough look at your on page factors. John Mueller from Google has just said in a Webmaster Central hangout that Penguin is not just about links.
I asked John this question because he had previously hinted in a Webmaster Central hangout that the Penguin algorithm could take into account more than just bad links. During the December 16th hangout, I asked the following question:
Does the Penguin algorithm only take into account links, or are there other factors that could be contributing?
You can hear John's answer (also transcribed below) at the 27:52 mark here:
I think, with the Penguin algorithm, when we rolled it out we did a blog post about that that kind of shows the other issues that we look at there. The Penguin algorithm is a webspam algorithm and we try to take a variety of webspam issues into account and use them with this algorithm. It does also take into account links from spammy sites or unnatural links in general, so that's something that we would take into account there, but I wouldn't only focus on links. Also a lot of times what we see is that when a website has been spamming links for quite a bit maybe they're also doing some other things that are kind of borderline or even against our webmaster guidelines. So I wouldn't only focus on links there. I would make sure that you are cleaning up all of the webspam issues as completely as possible.
On Page Factors that Could Contribute to Penguin
The article that John mentioned, which Google published when Penguin rolled out, is here. Here are some quotes that are key to our discussion:
- "a few sites use techniques that don’t benefit users, where the intent is to look for shortcuts or loopholes that would rank pages higher than they deserve to be ranked. We see all sorts of webspam techniques every day, from keyword stuffing to link schemes that attempt to propel sites higher in rankings."
- "The change will decrease rankings for sites that we believe are violating Google’s existing quality guidelines."
The article shows the following example of keyword stuffing that could be affected, but also states that many sites that are affected by Penguin would not have as obvious an issue:
The article also references the quality guidelines. The guidelines are fairly long, so I have summarized here the parts describing webspam infractions that I believe could contribute to a site being affected by Penguin:
- Keep the links on a given page to a reasonable number. We have recently seen Matt Cutts say that the "100 links per page" rule is not so much an issue now. However, if a site has pages with hundreds of links that are obviously just there for search engines and not readers then this could be a potential Penguin factor.
- Make pages primarily for users, not for search engines. This likely falls under the same category as keyword stuffing, but if you have page after page of articles that are written for search engines and not users then this could possibly be affecting you in the eyes of Penguin.
- Don't deceive your users.
- Avoid tricks intended to improve search engine rankings. A good rule of thumb is whether you'd feel comfortable explaining what you've done to a website that competes with you, or to a Google employee. Another useful test is to ask, "Does this help my users? Would I do this if search engines didn't exist?"
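As a rough illustration of the first point above, here is a minimal Python sketch for spotting pages that carry an unusually high number of links. The names `count_links` and `flag_link_heavy_pages` and the default threshold of 300 are my own assumptions for illustration only. Google has not published any such number, and nothing here reflects how Penguin actually works.

```python
from html.parser import HTMLParser


class LinkCounter(HTMLParser):
    """Counts <a> tags that carry a non-empty href attribute."""

    def __init__(self):
        super().__init__()
        self.link_count = 0

    def handle_starttag(self, tag, attrs):
        # Only count anchors that actually point somewhere.
        if tag == "a" and any(name == "href" and value for name, value in attrs):
            self.link_count += 1


def count_links(html: str) -> int:
    """Return the number of links found in one HTML document."""
    parser = LinkCounter()
    parser.feed(html)
    return parser.link_count


def flag_link_heavy_pages(pages: dict[str, str], threshold: int = 300) -> list[str]:
    """Return the URLs whose link count exceeds the threshold.

    The threshold is an arbitrary, hypothetical cutoff for manual review,
    not a figure taken from Google.
    """
    return [url for url, html in pages.items() if count_links(html) > threshold]
```

A page that gets flagged is only a candidate for a manual look. Whether the links are there for readers or for search engines is still a judgment call.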
Google then gives an informative list of things that could be considered webspam:
- Automatically generated content
- Cloaking
- Sneaky redirects
- Hidden text or links
- Doorway pages
- Scraped content
- Participating in affiliate programs without adding sufficient value
- Loading pages with irrelevant keywords
- Creating pages with malicious behavior, such as phishing or installing viruses, trojans, or other badware
- Abusing rich snippets markup
- Sending automated queries to Google
They also recommend the following, although I don't know for certain whether these could be Penguin factors:
- Monitoring your site for hacking and removing hacked content as soon as it appears
- Preventing and removing user-generated spam on your site
How important are these on page factors?
Although it is useful to know that Penguin can take into account more than just links, it is still vitally important to address bad links pointing to your site. Shortly after Penguin rolled out, Matt Cutts tweeted the following:
Certainly links are a primary area to monitor. Been true all this year; expect to continue.
In my opinion, links are still the most important factor when it comes to dealing with the Penguin algorithm. However, if you have been affected by Penguin, I think that it is important to look for on page issues that may be considered attempts to manipulate the search engine results or deceive users.
More articles by Marie Haynes on Penguin:
- Is Penguin recovery possible?
- Penguin recovery is possible but link cleanup is not enough
- Penguin 2.0 "recoveries"
- A theory about how Penguin and the disavow tool work
I regularly tweet about Penguin and unnatural links issues. You can follow me here.
What on page factors do you think are important in the eyes of Penguin?
I'd love to hear your thoughts. Do you have examples of sites that needed to clean up on page factors before recovering from Penguin? How important do you think on page issues are with regard to recovery? Please leave a comment below.
Comments
Off the top of my head I can see abuse of internal anchor text as being potentially problematic. It is pretty rare to see a site that has been struck by Penguin that does not have some obvious on page stuff that might be considered spammy. I think that once a site has been hit by a penalty (manual or algorithmic), it is far more susceptible to other algorithmic nuances that can affect the site in a negative way. It seems that the thresholds that can trigger these filters are lowered substantially for sites that are in (or have been in) the grip of a penalty. My advice is always to make sure that, regardless of the type of penalty in play, the rest of the site is as squeaky clean as possible.
Good advice, Paul. Regarding internal anchor text, in this video Matt Cutts says that it’s usually not an issue unless extremely overdone: http://www.youtube.com/watch?v=6ybpXU0ckKQ. With that said, “extremely overdone” could be a Penguin factor!
Hi Paul,
Could you give me a bit more detail on what you would consider abuse of internal anchor text? I would usually expect it to be normal and beneficial to be quite keyword descriptive with internal anchor text.
It will be nice once we have a better grasp of what’s recovering and why… till then it’s tread carefully 🙂
Hey John, one of my blogs, http://www.zonatechno.com, was getting around 4000 pageviews per day, and in the last week, I don't know what went wrong, but my organic traffic dropped by almost 95%! I tweeted Matt Cutts about it, and he replied “check your webmaster tools”. I didn't find anything fishy in Webmaster Tools, so what could be the reason behind this penalty?
Hi Suumit. You won’t find John here. I just write about what he says. 🙂
Did your traffic drop around May 20? There was a huge Panda update that happened that day. The official date was May 20 but there were lots of people that saw changes starting May 18.
Around the same time, there was a new iteration of the Payday Loans algorithm which went after sites with REALLY spammy links.
Ohh I see! Thanks for the reply 🙂
I generally explain Penguin to people as a filter aimed at the actions of SEOs.
In most cases that's mainly links, but I can easily see it also covering any tactic used to manipulate the signals used by Google: internal anchor text, keywords where they don't really belong, etc.