Do 404 Errors Hurt SEO?

February 01st 2016 // SEO

Do 404 errors hurt SEO? It’s a simple question. However, the answer is far from simple. Most 404 errors don’t have a direct impact on SEO, but they can eat away at your link equity and user experience over time.

There’s one variety of 404 that might be quietly killing your search rankings and traffic.

404 Response Code

What is a 404 exactly? A 404 response code is returned by the server when there is no matching URI. In other words, the server is telling the browser that the content is not found.

404s are a natural part of the web. In fact, link rot studies show that links regularly break. So what’s the big deal? It’s … complicated.

404s and Authority

One of the major issues with 404s is that they stop the flow of authority. It just … evaporates. At first, this sort of bothered me. If someone linked to your site but that page or content is no longer there, the citation is still valid. At that point in time, the site earned that link.

But when you start to think it through, the dangers begin to present themselves. If authority passed through a 404 page I could redirect that authority to pages not expressly ‘endorsed’ by that link. Even worse, I could purchase a domain and simply use those 404 pages to redirect authority elsewhere.

And if you’re a fan of conspiracies then sites could be open to negative SEO, where someone could link toxic domains to malformed URLs on your site.

404s don’t pass authority and that’s probably a good thing. It still makes sense to optimize your 404 page so users can easily search and find content on your site.

Types of 404s

Google is quick to say that 404s are natural and not to obsess about them. On the other hand, they’ve never quite said that 404s don’t matter. The 2011 Google post on 404s is strangely convoluted on the subject.

The last line of the first answer seems to be definitive but why not answer the question simply? I believe it’s because there’s a bit of nuance involved. And most people suck at nuance.

While the status code remains the same there are different varieties of 404s: external, outgoing and internal. These are my own naming conventions so I’ll make it clear in this post what I mean by each.

Because some 404s are harmless and others are downright dangerous.

External 404s

External 404s occur when someone else is linking to a broken page on your site. Even here, there is a small difference since there can be times when the content has legitimately been removed and other times when someone is linking improperly.

External 404 Diagram

Back in the day many SEOs recommended that you 301 all of your 404s so you could reclaim all the link authority. This is a terrible idea. I have to think Google looks for sites that employ 301s but have no 404s. In short, a site with no 404s is a red flag.

A request for domain.com/foobar should return a 404. Of course, if you know someone is linking to a page incorrectly, you can apply a 301 redirect to get them to the right page, which benefits both the user and the site’s authority.
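That policy can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not any particular server's configuration: the `LIVE_PAGES` set, the `REDIRECTS` map and the `respond` function are hypothetical names, and the paths are made up. Only URLs you have deliberately mapped get a 301. Everything else that doesn't exist still returns a 404.

```python
from typing import Optional, Tuple

# Hypothetical data: the pages that actually exist on the site.
LIVE_PAGES = {"/", "/about", "/blog/do-404-errors-hurt-seo"}

# Hypothetical, hand-curated map of known bad inbound URLs to the page
# the linker clearly meant. Nothing is added here automatically.
REDIRECTS = {"/abuot": "/about"}


def respond(path: str) -> Tuple[int, Optional[str]]:
    """Return (status_code, redirect_target) for a requested path."""
    if path in LIVE_PAGES:
        return 200, None
    if path in REDIRECTS:
        # A known, verified mistake: send users and authority to the right page.
        return 301, REDIRECTS[path]
    # Anything else is legitimately not found and should say so.
    return 404, None
```

Keeping the redirect map explicit is the design choice that matters here. A catch-all rule that 301s every unknown path would reproduce the "no 404s at all" pattern described above.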

External 404s don’t bother me a great deal. But it’s smart to periodically look to ensure that you’re capturing link equity by turning the appropriate 404s into 301s.

Outgoing 404s

Outgoing 404s occur when a link from your site to another site breaks and returns a 404. Because we know how often links evaporate this isn’t uncommon.

Outgoing 404 Diagram

Google would be crazy to penalize sites that link to 404 pages. Mind you, it’s about scale to a certain degree. If 100% of the external links on a site were going to 404 pages then perhaps Google (and users) would think differently about that site.

They could also be looking at the age of the link and making a determination on that as well. Or perhaps it’s fine as long as Google saw that the link was at one time a 200 and is now a 404.

Overall these are the least concerning of 404 errors. It’s still a good idea, from a user experience perspective, to find those outgoing 404s in your content and remove or fix the link.
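One way to find those links is to pull every `<a href>` out of a page, keep the ones that point at another host, and check what each one returns. The sketch below uses only Python's standard library. The `broken_outgoing_links` function and the `get_status` callable are hypothetical names, and `get_status` is left for you to supply (for example, a wrapper around whatever HTTP client you already use) so the example makes no assumptions about networking.

```python
from html.parser import HTMLParser
from typing import Callable, List
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in a document."""

    def __init__(self) -> None:
        super().__init__()
        self.hrefs: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def broken_outgoing_links(
    html: str, page_url: str, get_status: Callable[[str], int]
) -> List[str]:
    """Return links on page_url that point to another host and return a 404."""
    collector = LinkCollector()
    collector.feed(html)
    own_host = urlparse(page_url).netloc
    broken = []
    for href in collector.hrefs:
        absolute = urljoin(page_url, href)  # resolve relative links
        parsed = urlparse(absolute)
        if parsed.scheme not in ("http", "https"):
            continue  # skip mailto:, tel:, javascript: and similar
        if parsed.netloc == own_host:
            continue  # same host, so not an outgoing link
        if get_status(absolute) == 404:
            broken.append(absolute)
    return broken
```

Host comparison here is an exact match, so `www.example.com` and `example.com` would count as different sites. Adjust that to fit how your own domain is set up.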

Internal 404s

The last type of 404 is an internal 404. This occurs when a site links to a ‘not found’ page on its own domain. In my experience, internal 404s are very bad news.
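With all three varieties defined, the difference comes down to two questions: is the page containing the link yours, and is the broken URL yours? Here is a small sketch of that logic. The `classify_404` function and the example hostnames are hypothetical, and the labels are simply the naming conventions used in this post.

```python
from urllib.parse import urlparse


def classify_404(linking_page: str, broken_url: str, site_host: str) -> str:
    """Label a broken link using this post's naming conventions."""
    source_is_ours = urlparse(linking_page).netloc == site_host
    target_is_ours = urlparse(broken_url).netloc == site_host
    if source_is_ours and target_is_ours:
        return "internal"   # your site links to a missing page on your site
    if target_is_ours:
        return "external"   # someone else links to a missing page on your site
    if source_is_ours:
        return "outgoing"   # your site links to a missing page elsewhere
    return "unrelated"      # neither end of the link is yours
```

For instance, `classify_404("https://www.example.com/post", "https://www.example.com/old-page", "www.example.com")` returns `"internal"`.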

Internal 404 Diagram

Over the past two years I’ve worked on squashing internal 404s for a number of large clients. In each instance I believe that removing these internal 404s had a positive impact on rankings.

Of course, that’s hard to prove given all the other things going on with the site, with competitors and with Google’s algorithm. But all things being equal eliminating internal 404s seems to be a powerful piece of the puzzle.

Why Internal 404s Matter

If I’m Google I might look at the number of internal 404s as a way to determine whether the site is well cared for and has an attention to detail.

Does a high-quality site have a lot of internal 404s? Unlikely.

Taken a step further, could Google determine the odds of a user encountering a 404 on a site and then use that to demote the site in search? I think it’s plausible. Google doesn’t want their users having a poor experience so they might steer folks away from a site they know has a high probability of ending in a dead end.

That leads me to think about the user experience when encountering one of these internal 404s. When a user hits one of these they blame the site and are far more likely to leave the site and return to the search results to find a better result for their query. This type of pogosticking is clearly a negative signal.

Internal 404s piss off users.

The psychology is different with an outgoing 404. I believe most users don’t blame the site for these but the target of the link instead. There’s likely some shared blame, but the rate of pogosticking shouldn’t be as high.

In my experience internal 404s are generally caused by bugs and absolutely degrade the user experience.

Finding Internal 404s

You can find 404s using Screaming Frog or Google Search Console. I’ll focus on Google Search Console here because I often wind up finding patterns of internal 404s this way.

In Search Console you’ll navigate to Crawl and select Crawl Errors.

404s in Google Search Console

At that point you’ll select the ‘Not found’ tab to find the list of 404s Google has identified. Click on one of these URLs and you get a pop-up where you can select the ‘Linked from’ tab.

Linked from Details on 404 Error

I was actually trying to get Google to recognize another internal 404 but they haven’t found it yet. Thankfully I muffed a link in one of my posts and the result looks like an internal 404.

Malformed Link Causes Internal 404

What you’re looking for are instances where your own site appears in the ‘Linked from’ section. On larger sites it can be easy to spot a bug that produces these types of errors by just checking a handful of these URLs.
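If you copy the ‘Linked from’ details into a simple structure, picking out the internal cases is a short filter. The sketch below assumes a dictionary that maps each not-found URL to the pages linking to it. That shape is my own assumption for illustration, not a format Search Console exports, and `internal_404s` is a hypothetical helper name.

```python
from typing import Dict, List
from urllib.parse import urlparse


def internal_404s(
    linked_from: Dict[str, List[str]], site_host: str
) -> Dict[str, List[str]]:
    """Keep only the not-found URLs that are linked from the site itself."""
    result: Dict[str, List[str]] = {}
    for missing_url, sources in linked_from.items():
        # Keep the linking pages whose host matches our own site exactly.
        own_pages = [s for s in sources if urlparse(s).netloc == site_host]
        if own_pages:
            result[missing_url] = own_pages
    return result
```

Sorting the result by linking page, or by a shared URL prefix, is often enough to make a template bug stand out.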

In this case I’ll just edit the malformed link and everything will work again. It’s usually not that easy. Most often I’m filing tickets in a client’s project tracking system and making engineers groan.

Correlation vs Causation

Some of you are probably shrieking that internal 404s aren’t the problem and that Google has been clear on this issue and that it’s something else that’s making the difference. #somebodyiswrongontheinternet

You’re right and … I don’t care.

You know why I don’t care? Every time I clean up internal 404s, it produces results. I’m not particularly concerned about exactly why it works. Mind you, from an academic perspective I’m intrigued but from a consulting perspective I’m not.

In addition, if you’re in the new ‘user experience optimization’ camp, then eliminating internal 404s fits very nicely, doesn’t it? So is it the internal 404s themselves that matter, the way users behave once those 404s are gone, or something else entirely? I don’t know.

Not knowing why eliminating internal 404s works isn’t going to stop me from doing it.

This is particularly true since 404 maintenance is entirely in our control. That doesn’t happen much in this industry. It’s shocking how many people ignore 404s that are staring them right in the face, whether by not looking at Google Search Console or by not tracking down the 404s that crop up in weblog reports or deep crawls.

Make it a habit to check and resolve your Not found errors via Search Console or Screaming Frog.
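For the weblog route, the referrer field tells you whether a 404 was reached from one of your own pages. The sketch below assumes your server writes the common ‘combined’ access log format (request, status, bytes, referrer, user agent). If your logs are laid out differently, the pattern will need adjusting. `LOG_LINE` and `internal_404_counts` are hypothetical names.

```python
import re
from collections import Counter
from typing import Iterable
from urllib.parse import urlparse

# Assumed layout (combined log format):
# host ident user [time] "METHOD path protocol" status bytes "referrer" "user-agent"
LOG_LINE = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] "(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ "(?P<referer>[^"]*)"'
)


def internal_404_counts(lines: Iterable[str], site_host: str) -> Counter:
    """Count 404 hits whose referrer is a page on site_host."""
    counts: Counter = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if not match or match.group("status") != "404":
            continue
        referer = match.group("referer")
        # A referrer of "-" (none recorded) has no host and is skipped.
        if urlparse(referer).netloc == site_host:
            counts[(match.group("path"), referer)] += 1
    return counts
```

Each key in the result is a `(missing path, linking page)` pair, so the most frequent entries point straight at the pages that need fixing.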

TL;DR

404 errors themselves may not directly hurt SEO, but they can indirectly. In particular, internal 404s can quietly tank your efforts, creating a poor user experience that leads to a low-quality perception and pogosticking behavior.

Comments About Do 404 Errors Hurt SEO?

// 23 comments so far.

  1. Kristopher // February 01st 2016

    Ironically subscribe without commenting produces a 404 error.

  2. AJ Kohn // February 01st 2016

    And my 404 page is broken as well. Thanks for reporting this problem Kristopher. I’ll get around to looking at it … soon. Not as soon as I’d like but soon.

  3. Andrew Akesson // February 03rd 2016

    Hey AJ

    In my experience, removing internal 404s has produced increases in organic traffic and I agree with you completely – “You know why I don’t care? Every time I clean up internal 404s, it produces results.” I feel like that on quite a few Tech SEO aspects.

    If Google didn’t care about 404s, then they certainly wouldn’t mention them in the Search Console, and the same goes for soft 404s.

    I feel that anything Google shows us in the Search Console is there for a reason, otherwise they wouldn’t bother taking the time to develop something for us that provides meaningless information.

  4. AJ Kohn // February 03rd 2016

    Andrew,

    Thanks so much for your comment. It’s great to hear others who have embraced removing this type of 404 and seen the results.

    I tend to agree with you regarding what is displayed in Google Search Console. The information isn’t there just for giggles. If Google is pushing it back to us there’s a good reason for it. I’m always surprised and, frankly, saddened by the number of clients who simply ignore Google Search Console.

  5. Rick B // February 03rd 2016

    Strictly speaking of internal 404s, on a couple occasions I have seen significant performance hits when there are multiple on a page. In each case, the pages began improving almost immediately after they were fixed. Seems logical that there is a connection.

    Of course fixing the internal nav would also fix backlink issues but it is the former which seems more critical.

    Watch out for the odd 404 header on pages that load properly in the browser. This one often gets by dev & QA processes.

  6. AJ Kohn // February 04th 2016

    Thanks for the comment Rick. Yes, I surmise that internal 404s carry some added negative weight but it’s impossible to tell really. I just know that they hurt when you’re throwing them and it gets much better once you’re not.

    And yes, the page loads and looks like a 200 but throws a 404 code is devilish. That’s a subject for a whole other post where you’d introduce many folks to cURL.

  7. JR Oakes // February 06th 2016

    Just a technical note, but it is incorrect to say someone has a 404 page. Someone has a page that returns a 404 status code (and probably a view indicating that status).

    ie.
    If authority passed through a 404 page I could redirect that authority to pages not expressly ‘endorsed’ by that link. Even worse, I could purchase a domain and simply use those 404 pages to redirect authority elsewhere.

    I know what you are saying, but the language was difficult to parse in the way it is worded.

  8. AJ Kohn // February 06th 2016

    JR,

    Thanks for your comment. Yes, you’re returning a specific page based on the fact that there’s a 404 status code. And that links on that page don’t pass any authority. If they did, then the endorsement becomes detached from the content.

    So any links that would be on a page that is returned based on a 404 status code would have this issue, whether you create a specific 404 page or not.

  9. Charles Atkins // February 07th 2016

    Hi AJ,
    I must commend you for writing such a detailed article with proper sections and explanations. Google Console is very important to me as well. It definitely is very unfortunate that many people just prefer to ignore Google Search Console.

  10. Stephen Hamilton // February 08th 2016

    I totally agree, AJ. I’ve seen instances following CMS migrations etc where redirects have not been managed well and this has resulted in a drastic increase in internal 404s. Not only did this correspond with a fall in organic traffic, but fixing them led to a recovery that took longer than the drop to occur, but did happen.

    I particularly liked your point about not redirecting every 404 that shows up. I see a lot of inbound links from crappy sites that I don’t even want to dignify by redirecting their (obviously made up) links to an actual page. I can’t imagine it would do me any good giving Google an indication that I either care or endorse such crappy/spammy sites.

  11. Anne // February 12th 2016

    We have bunch of 404s in our site… So thanks for writing this article, I should not worry to my site’s performance anymore.

  12. AJ Kohn // February 12th 2016

    Err, well Anne, make sure they aren’t the harmful variety. And be sure to see if you can redirect any of those appropriately.

  13. Anne // February 14th 2016

    Aww, ok thanks AJ.. Have to check again. So as much as possible, those 404’s should be redirected?

    I have 100+ 404s.. 🙁

  14. ant // February 27th 2016

    “Out of stock” problems are considered by Google Shopping. So it should be important, but not critical.

  15. Jo // March 14th 2016

    Hi AJ,

    Great article and very timely for me as I have just moved my site from WordPress to another platform. The first set of 404 errors is picking up trackbacks – any clues on the best way to fix these?

    Should I just treat them as external 404s coming into my site and ignore them? Or 301 redirect them to the page/post the trackback refers to?
    Many Thanks
    Jo

  16. Derek Sanders // March 23rd 2016

    AJ,

    This is interesting information. Any time I see an increase in 404s error in Google Webmaster Tools I always want to drop everything and get it fixed. I am glad to read it doesn’t directly affect SEO. I do agree that it does cause user problems which could indirectly hurt SEO in the long run.

    Thanks for the great information.

    ~Derek

  17. AJ Kohn // May 21st 2016

    Thanks for the kind words Derek. And it’s still a smart idea to drop everything and fix those internal 404s because they will hurt your SEO.

  18. SEO Guy // October 05th 2016

    One of my main keywords lost ranking overnight, from 3 to 45. I did the usually freak out, trip to get coffee then I checked all my stats. Bam… overnight 404 errors up the ass. I finally read this article and found out how to check where the links were coming from and they were internal “links.” Actually caused by our automated canonicalization. I’ll let you guys know when I get a chance to fix them, but I’m pretty sure that was the cause.

  19. AJ Kohn // October 08th 2016

    Sorry to hear about your situation but I’m interested to know if fixing them helps you recover.

  20. Loy Lauden // May 31st 2017

    Hi AJ,
    I have 707 soft 404 responses and most are internal links to photos that have now been properly renamed on my new site, or photos that are no longer being used.
    How should I correct these? should I do a 410 “gone” or 301 redirect and then “mark as fixed” in the Google Console?
    Thanks for your reply and a really informative post.

  21. Ryan // August 01st 2017

    Yes and no. Having a 404 that the SE finds is not going to hurt you, it will eventually remove it from its index, but if you have a broken link pointing to a 404 then yes you are leaking PR. Seeing you did a scan, which usually follows links, I think you must have the latter.

  22. AJ Kohn // August 01st 2017

    Next time try reading the content before commenting Ryan. I promise it’ll go better for you.

  23. Alex // January 19th 2018

    I was just thinking about this the other day.

    If I was Google the first filter I’d have is to check if it’s likely a SEO is involved. Then I’d apply my filters differently based on this. I’d then check how many errors a site has. A site that has no errors tells me they’re tech savvy and probably SEO involved.

    I’d then tick the box ‘SEO Involved’ on site and set it to = TRUE. Then I’d apply whatever other filters are required. These would be more strict filters than those with SEO Involved = FALSE.

    I’m only speculating though!
