404 Error Pages

Welcome to 404 Error Pages .com

Overview

The Error 404 "Page not found" is the error page displayed whenever someone asks for a page that is not available on your site. Typically, a link pointing to the page is wrong, or the page has recently been removed from the site. As there is no web page to display, the web server sends a page that simply says "404 Page not found".

The 404 error message is an HTTP (Hypertext Transfer Protocol) standard status code. This "Not Found" response code indicates that the client could communicate with the server, but the server could not find what was requested or was configured not to fulfill the request.

The 404 "Not Found" error is not the same as the "Server Not Found" error which you see whenever a connection to the destination server could not be established at all.

The default 404 error page as shown on Internet Explorer is given below.

[Image: default 404 error page in Internet Explorer]

HTTP Status Code

Whenever you visit a web page, your browser requests data from a server through HTTP. Before the requested page is displayed, the web server sends an HTTP header that contains the status code, which describes the outcome of the request. A normal web page gets the status code 200, but you do not see it because the server goes on to send the contents of the page. You see a status code, such as 404 Not Found, only when there is an error.
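To make the idea of a status code in the response header concrete, here is a minimal Python sketch that splits the first line of an HTTP/1.x response into its parts. The function name and the sample status lines are illustrative assumptions, not taken from any particular server.

```python
def parse_status_line(line: str) -> tuple[str, int, str]:
    """Split an HTTP/1.x status line into (version, status code, reason phrase)."""
    version, code, *reason = line.strip().split(" ", 2)
    return version, int(code), reason[0] if reason else ""


# Hypothetical status lines, shown only for illustration.
for raw in ("HTTP/1.1 200 OK", "HTTP/1.1 404 Not Found"):
    version, code, reason = parse_status_line(raw)
    print(version, code, reason)
```

The browser normally hides this line; the sketch only shows where the number 404 sits in the response.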

Origin of Status Codes

HTTP status codes were not part of the original HTTP 0.9 protocol, whose responses carried no status line. Tim Berners-Lee, who invented the web and the first web browser in 1990, defined the status codes in the 1992 draft of the HTTP specification that led to HTTP 1.0.

List of Status Codes

A brief overview of HTTP status codes is given below.

Code | Meaning | Description
100 | Continue | The server has received the first part of the request; the client should continue with the rest of the request, or ignore this response if the request has already been completed.
101 | Switching Protocols | The server is switching to the protocol specified in the Upgrade message header field for the current connection.
200 | OK | Standard response for successful requests.
201 | Created | Request fulfilled and a new resource created.
202 | Accepted | Request accepted, but not yet processed.
203 | Non-Authoritative Information | Returned meta information is not the definitive set from the origin server.
204 | No Content | Request succeeded without requiring the return of an entity-body.
205 | Reset Content | Request succeeded, but requires resetting the document view that caused the request.
206 | Partial Content | Partial GET request was successful.
300 | Multiple Choices | Requested resource has multiple choices at different locations.
301 | Moved Permanently | Resource permanently moved to a different URL.
302 | Found | Requested resource temporarily resides under a different URL; the client should continue to use the original URL.
303 | See Other | Response to the request is at a different URL and should be retrieved with a GET request.
304 | Not Modified | Resource not modified since the last request.
305 | Use Proxy | Requested resource should be accessed through the proxy specified in the Location field.
306 | (Unused) | Used in an earlier version of the specification; no longer used, and the code is reserved.
307 | Temporary Redirect | Resource has been moved temporarily to a different URL.
400 | Bad Request | Syntax of the request not understood by the server.
401 | Unauthorized | Request requires user authentication.
402 | Payment Required | Reserved for future use.
403 | Forbidden | Server refuses to fulfill the request.
404 | Not Found | Document or file requested by the client was not found.
405 | Method Not Allowed | Method specified in the Request-Line is not allowed for the specified resource.
406 | Not Acceptable | Requested resource can only generate response entities whose content characteristics are not acceptable according to the Accept headers of the request.
407 | Proxy Authentication Required | Request requires authentication with the proxy.
408 | Request Timeout | Client failed to send a request in the time allowed by the server.
409 | Conflict | Request was unsuccessful due to a conflict in the state of the resource.
410 | Gone | Requested resource is no longer available and no forwarding address is known.
411 | Length Required | Server does not accept the request without a valid Content-Length header field.
412 | Precondition Failed | A precondition given in the request header fields evaluated to false.
413 | Request Entity Too Large | Request unsuccessful as the request entity is larger than the server allows.
414 | Request-URI Too Long | Request unsuccessful as the URL is longer than the server is willing to process.
415 | Unsupported Media Type | Request unsuccessful as the entity of the request is in a format not supported by the requested resource.
416 | Requested Range Not Satisfiable | Request included a Range request-header field, and none of the requested ranges overlaps the current extent of the resource.
417 | Expectation Failed | Expectation given in the Expect request-header could not be met by the server.
422 | Unprocessable Entity | Request well-formed, but could not be processed because of semantic errors.
423 | Locked | Resource being accessed is locked.
424 | Failed Dependency | Request failed because of the failure of a previous request.
426 | Upgrade Required | Client should switch to a different protocol, such as Transport Layer Security.
500 | Internal Server Error | Request unsuccessful because of an unexpected condition encountered by the server.
501 | Not Implemented | Request unsuccessful as the server does not support the functionality needed to fulfill the request.
502 | Bad Gateway | Server received an invalid response from the upstream server while trying to fulfill the request.
503 | Service Unavailable | Request unsuccessful due to the server being down or overloaded.
504 | Gateway Timeout | Upstream server failed to send a response in the time allowed by the gateway server.
505 | HTTP Version Not Supported | Server does not support the HTTP version specified in the request.

Meaning of 404

When we expand the code 404, the first digit "4" represents a client error. The server indicates that you made a mistake, such as misspelling the URL or requesting a page that is no longer available.

The HTTP specification gives only the first digit a categorization role. The remaining digits, "04", do not carry a separate meaning of their own.

Together they simply identify this specific error within the 4xx group.
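The grouping of status codes by their first digit can be sketched in a few lines of Python. The dictionary and function names below are illustrative; the class names follow the five groups listed in the status code table.

```python
# The five response classes, keyed by the first digit of the status code.
STATUS_CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}


def status_class(code: int) -> str:
    """Return the class of an HTTP status code based on its first digit."""
    if not 100 <= code <= 599:
        raise ValueError(f"not a valid HTTP status code: {code}")
    return STATUS_CLASSES[code // 100]


print(404, status_class(404))
print(301, status_class(301))
```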

The HTTP specification states that 404 Not Found should be used when the server cannot find the requested resource and does not know whether the condition is temporary or permanent. When a page is known to have been permanently removed, the status code should be 410. In practice, 410 pages are rarely seen, and 404 Not Found has become the most commonly used error page.

Content of a 404 Error Page

A 404 response code is followed by a human-readable reason phrase, as required by the HTTP specification. By default, a web server generally issues an HTML page that contains the 404 code and the "Not Found" phrase. You can configure a web server to display a branded page with a better description and a search form. The protocol-level reason phrase needs no customization, as it is hidden from the user.

Soft 404s

Soft 404 errors are "Not Found" errors that a web server returns as a standard web page with a 200 OK response code. They are problematic for automated processes that discover broken links, because such tools rely on the status code to tell a missing page from a working one.
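A link checker can only guess at soft 404s. The Python sketch below shows one possible heuristic: a response counts as a suspected soft 404 when its status is 200 but its body reads like an error page. The phrase list is an assumption chosen for illustration and would need tuning for a real site.

```python
# Hypothetical phrases that suggest an error page; adjust for the site being checked.
NOT_FOUND_PHRASES = ("page not found", "file not found", "error 404")


def looks_like_soft_404(status: int, body: str) -> bool:
    """Flag a 200 response whose body reads like a 'Not Found' page."""
    if status != 200:
        return False  # a real 404 (or any other code) is not a *soft* 404
    text = body.lower()
    return any(phrase in text for phrase in NOT_FOUND_PHRASES)


print(looks_like_soft_404(200, "<h1>Page Not Found</h1>"))
print(looks_like_soft_404(404, "<h1>Page Not Found</h1>"))
```

A heuristic like this produces false positives (for example, an article that discusses 404 errors), so its results are best treated as candidates for manual review.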

The BT Group in the UK has a content blocking system, Cleanfeed, that returns a 404 error for requests for content identified as illegal by the Internet Watch Foundation. Similarly, a fake 404 error may be returned when a user tries to access a website censored by a government.

404 Error Percentages

A sample web trends’ summary report by ARCHIVI shows the client error details for 404 Page.

Client Errors

Error | Hits | % of Failed Hits
000 Incomplete / Undefined | 29,164 | 69.62%
404 Page or File Not Found | 12,651 | 30.2%
400 Bad Request | 57 | 0.13%
18745 Incomplete / Undefined | 5 | 0.01%
18747 Incomplete / Undefined | 4 | 0%
401 Unauthorized Access | 4 | 0%
Total | 41,885 | 100%

 

Web statistics generally vary from month to month, and the percentage of 404 errors depends on how active the website is and on the strategy used to eliminate them. Active websites whose content is frequently changed or added to generally experience more Page Not Found errors, yet many large and busy sites achieve zero percent 404 errors over a period. On average, around 7% of visits to any given website result in a 404 error page.
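The percentages in the Client Errors report can be reproduced from the hit counts. The Python sketch below assumes the report truncates each share to two decimal places rather than rounding; that is inferred from the figures, not stated by the report.

```python
# Hit counts copied from the Client Errors report.
failed_hits = {
    "000 Incomplete / Undefined": 29164,
    "404 Page or File Not Found": 12651,
    "400 Bad Request": 57,
    "18745 Incomplete / Undefined": 5,
    "18747 Incomplete / Undefined": 4,
    "401 Unauthorized Access": 4,
}


def share_of_failed_hits(hits: dict[str, int]) -> dict[str, float]:
    """Return each error's share of all failed hits, truncated to two decimals."""
    total = sum(hits.values())
    # Integer arithmetic avoids floating-point surprises before the final division.
    return {error: (10000 * count // total) / 100 for error, count in hits.items()}


for error, percent in share_of_failed_hits(failed_hits).items():
    print(f"{error}: {percent}%")
```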

Tracking and Preventing 404 Errors

  • Log Files - Web server log files help you track 404 errors. These standard log files are plain ASCII text files that record every HTTP transaction, whether completed or not. Most HTTP errors are recorded in the transfer log and the error log. If you have access to your website's log files, look at the HTTP status code field: it shows where 404 errors occur, how often and how consistently, and which referring document led to each one. From this you can identify the broken link on your site or the misspelled URL behind the error, correct it, and prevent further 404 errors.
  • Redirects – If a page consistently returns a 404 error, you can create a redirect in the .htaccess file that automatically takes users from the old page to its newer replacement. Permanent and temporary redirects let you "catch" old referrals from other sites and send visitors to the information they were looking for.
  • Robots File - If a section of your site has pages that change frequently, you can use a robots.txt file to ask search engines not to index them, so that search results do not lead visitors to pages that no longer exist.

Using Log Files

A sample line from a common transfer log file is given below.

revacsystems.com - - [18/June/2008:12:13:03 -0700] GET /download/windows/happiness.zip HTTP/1.0 200 9887 http://www.payoneer.com/ Mozilla/4.7 [en]C-SYMPA (Win95; U)

Field | Value
Address or DNS | revacsystems.com
RFC931 | -
AuthUser | -
TimeStamp | [18/June/2008:12:13:03 -0700]
Access Request | GET /download/windows/happiness.zip HTTP/1.0
Status Code | 200
Transfer Volume | 9887
Referer URL | http://www.payoneer.com/
User Agent | Mozilla/4.7 [en]C-SYMPA (Win95; U)

 

  • Address or DNS – The address of the computer making the HTTP request.
  • RFC931 – The identity of the requestor. If no information is available, the log shows the symbol - in this column.
  • AuthUser - The user name supplied for authentication, if any. It is sent in clear text.
  • TimeStamp – The date, time, and offset from Greenwich Mean Time recorded for each hit (for example, -0700 means seven hours behind GMT). Comparing the time stamps of successive entries gives an estimate of how long a visitor stayed on a given page.
  • Access Request – The HTTP request itself, most commonly one of three types. GET requests a document or program. POST tells the server that data follows. HEAD requests only the headers and is used by link checking programs.
  • Status Code – The status code of the response. Here it is 200, meaning that the transaction was successful. If the requested URL did not exist, the log would show the 404 code instead.
  • Transfer Volume – The number of bytes transferred.
  • Referer URL – The page the visitor was on when making the request.
  • User Agent – Information about the visitor's software, such as the browser, its version, and the operating system.

Using Redirects

Redirects are very useful when used in conjunction with a 404 error page. To redirect a page, simply follow the steps given below.

1. Create a file "notfound404.htm" with a message such as:

"Sorry, this page was not found. In a few seconds, you will be redirected to the Home page."

 

2. Allow 5 seconds for reading the message and then redirect.

3. A sample redirect code is:

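<html>
<head>
<!-- Wait 5 seconds, then load not404.htm -->
<meta http-equiv="refresh" content="5; url=not404.htm">
</head>
<body>
<p>Sorry, this page was not found. In a few seconds, you will be redirected to the Home page.</p>
</body>
</html>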

Note: The number at the start of the CONTENT value is the number of seconds the user has to read the message before being redirected.

 

Using robots.txt File

A robots.txt file is useful when sections of your website change frequently. To use a robots.txt file, simply follow the steps given below.

1. Create a file named "robots.txt" in the root directory of your website.

2. A sample robots.txt code is:

User-agent: *
Disallow: /disappearing/
Disallow: /soontobe404.htm

Note: User-agent: * applies the rules to all search engine crawlers that honor robots.txt. The Disallow directive lets you block complete directories or only the individual files that change.
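One way to confirm that the rules behave as intended is to load them into the robots.txt parser from Python's standard library. In this sketch the crawler name "ExampleBot" and the tested paths are made up for illustration; the rules are the sample rules from this section.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: *
Disallow: /disappearing/
Disallow: /soontobe404.htm
"""

# Parse the rules from text instead of fetching them from a live site.
parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_allowed(path: str, agent: str = "ExampleBot") -> bool:
    """Return True if a well-behaved crawler may fetch the given path."""
    return parser.can_fetch(agent, path)


for path in ("/index.htm", "/disappearing/old.htm", "/soontobe404.htm"):
    print(path, "allowed" if is_allowed(path) else "blocked")
```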

 

Customization of 404 Error Pages

According to a poll of webmasters, only 23% of visitors who encounter a 404 page make a second attempt to find the missing page. The remaining 77% make no further effort. Customizing the error page on your site therefore increases your chances of keeping those visitors. A custom error page is an added feature and an advantage for your website: it shows that you care about your visitors and have made an attempt to help them.

Besides, there are many problems with the standard error page. They are:

  • It doesn’t convey any information to your visitor about why he was shown the error page
  • It fails to match with the other pages of your site
  • It doesn’t help the visitor in any way and the visitor feels lost on your website.

To avoid these problems and provide your visitor with a better user experience, it is always ideal to customize a 404 error page.

Qualities of a Good 404 Error Page

A good 404 error page conveys the right message and leads the visitor to where he intends to go.

  • It has an error message written in plain language so that even non-technical users find it easy to understand. The tone of the message is slightly apologetic.
  • It has an error message that tells what went wrong without admonishing the visitor:
    • Explains to the user why the URL could not be found
    • Lists the most common mistakes in accessing the files on the site
    • Lists the naming conventions used consistently on the site
    • Provides spell check functionality for the failed URLs
  • It allows the visitor to go to other important links of the site:
    • Tempts the visitor with what's popular
    • Links to the main page, help files, or a site FAQ
    • Displays no more than 10 links
  • It has a link to email the webmaster or a form to report broken links.
  • It has a search field linked to the site's search engine and a list of links similar to what was entered.
  • It has no ads displayed.
  • It avoids 301 and 302 redirects.

Note: There is no need for you to do all of these things on a single error page. But following these simple tactics will help your visitor to stay on your site.

Forcing Internet Explorer to Display Your Custom 404

Internet Explorer does not display your customized error page if the page is smaller than 512 bytes; it shows its own built-in error page instead. Make sure your customized error page is larger than 512 bytes. Note that this size does not include graphics. The best way to reach it is to add some filler text in comment lines in the source code of your file.
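The size check and the comment padding described above can be automated. The following Python sketch is one possible approach; the function names and the filler word are arbitrary choices, and only the 512-byte threshold comes from the text.

```python
IE_MIN_BYTES = 512  # threshold named in the text above


def is_large_enough(html: str) -> bool:
    """Return True if the page, encoded as UTF-8, is larger than 512 bytes."""
    return len(html.encode("utf-8")) > IE_MIN_BYTES


def pad_with_comment(html: str) -> str:
    """Append an HTML comment of filler text until the page exceeds 512 bytes."""
    shortfall = IE_MIN_BYTES + 1 - len(html.encode("utf-8"))
    if shortfall <= 0:
        return html  # already large enough, leave untouched
    filler = "padding " * (shortfall // 8 + 1)  # 8 bytes per repetition
    return html + "\n<!-- " + filler + "-->"


short_page = "<html><body><h1>Page not found</h1></body></html>"
padded_page = pad_with_comment(short_page)
print(is_large_enough(short_page), is_large_enough(padded_page))
```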

Internet Explorer versions 5 and above have a friendly built-in error page, but this is no replacement for a customized error page. You can also change this default behavior in Internet Explorer itself.

1. Open the Tools menu in Internet Explorer and click Internet Options.

2. The Internet Options dialog box appears. Click the Advanced tab.

3. Uncheck the "Show friendly HTTP error messages" check box and then click the OK button at the end.

.htaccess File

The Hypertext access (.htaccess) file is Apache's directory-level configuration file, which allows you to change the behavior of the server. Often, the .htaccess file is accompanied by an .htpasswd file that stores valid usernames and their passwords.

The .htaccess file is used to specify the security restrictions of a particular directory. It also lets the server rewrite URLs and control user agent caching, which reduces bandwidth usage, server load, and perceived lag. More commonly, it is used to customize the error messages shown for server errors.

Using .htaccess File

Before using a .htaccess file, note that ".htaccess" is the complete filename and not a file extension. For example, the file cannot be called "error.htaccess"; it is just called ".htaccess".

When you place a ".htaccess" file in a directory, the Apache web server software loads it automatically. The file affects all files in the directory it is placed in, including those in its subdirectories. You can use a text editor such as TextPad, UltraEdit, or Microsoft WordPad to create a .htaccess file.

After you create the .htaccess file, you must upload it using an FTP (File Transfer Protocol) program. Make sure that you upload the file in "ASCII" mode.

Customizing 404 Error Page Using the .htaccess File on Apache Based Servers

You cannot customize your 404 error page unless your web host has enabled this facility for your website.

If your web host offers this facility, it is usually mentioned somewhere in their documentation. In fact, if the documentation says that you can customize a file named ".htaccess", it probably means that you can also customize your 404 File Not Found error page.

The steps in customizing a 404 error page are given below.
Step 1: Create a customized 404 File Not Found Error page.
Step 2: Create a .htaccess file
Step 3: Upload both files.
Step 4: Test the page.

Step 1: Create a Customized 404 File Not Found Error Page

We’ll create a custom 404 error page as an HTML page with an embedded PHP script. The script sends an email notice whenever the error page loads. The notice contains the date and time of the visit, the IP address of the visitor, the attempted URL, the visitor’s browser information, and the bad link that led to the error page.

Here’s the sample code:

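<html>
<head>
<title>404 Error Page</title>
</head>
<body>
<h1 align="center">Error 404</h1>
<p align="center">Page Not Found</p>
<?php
// Details of the failed request, as supplied by the web server.
$ip        = $_SERVER['REMOTE_ADDR'] ?? '';
$requri    = $_SERVER['REQUEST_URI'] ?? '';
$servname  = $_SERVER['SERVER_NAME'] ?? '';
$httpref   = $_SERVER['HTTP_REFERER'] ?? '';
$httpagent = $_SERVER['HTTP_USER_AGENT'] ?? '';
$combine   = $ip . " tried to load " . $servname . $requri;
$today     = date("D M j Y g:i:s a T");
$note      = "You are on the wrong page!";

// Plain-text notice for the webmaster.
$message2 = "$today\n$combine\nUser Agent = $httpagent\n$note\n$httpref";

$to      = "error@yourdomain.com";
$subject = "yourdomain Error Page";
$from    = "From: fake@yourdomain.com\r\n";
mail($to, $subject, $message2, $from);

// The visitor controls the URL, referer and user agent,
// so escape them before echoing them into the page.
function esc($text) {
    return htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
}

echo "<p>" . esc($today) . "<br>\n"
   . esc($combine) . "<br>\n"
   . "User Agent = " . esc($httpagent) . "</p>\n"
   . "<h2>" . esc($note) . "</h2>\n"
   . "<p>" . esc($httpref) . "</p>\n";
?>
<p>Visit our Home Page: <a href="http://yourdomain.com/">yourdomain.com</a></p>
</body>
</html>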

1. Copy this code and paste it in notepad as shown below.

[Screenshot: the sample code pasted into Notepad]

2. Replace "yourdomain.com" with the URL of your website.

3. Save the file as 404NotFound.php.

Step 2: Create a .htaccess file

1. Open a new file in Notepad.

2. Add the following line to the file:

ErrorDocument 404 /errors/404NotFound.php

3. Save the file as .htaccess. The name must be all lowercase, i.e. .htaccess.

Note: When you create a new .htaccess file, your editor may name the resulting file htaccess.txt. If that is the case, remove the extension and rename the file to .htaccess.


Step 3: Upload Both Files

1. Open the FTP tool that you use to upload your website files.

2. Upload the 404NotFound.php file to the folder named in the ErrorDocument line of the .htaccess file (/errors/ in this example).

3. Upload the .htaccess file to the top directory of your website.

Step 4: Test the page.

1. Now test your error page by typing a URL that you know does not exist. Your error page should load up like the one shown below.

[Screenshot: the custom 404 error page displayed in the browser]

Common Errors with a 404 Custom Error Page

The most common error is related to the URL. If you specify the URL of the error page incorrectly in the .htaccess file, the web server is led into a loop when a visitor tries to access a missing file.

Another common error is the insertion of invalid hyperlinks on the 404 Not Found page. The error page can be displayed for a missing URL in any directory of your site, so a relative link may resolve to the wrong location. When you link to other pages from the 404 Not Found page, make sure that the links work and are not relative links. For instance, use

<a href="http://www.sample.com/support.html">Support</a>

instead of

<a href="support.html">Support</a>
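Checking an error page for relative links can be automated. The Python sketch below uses the standard library HTML parser to list every link that lacks a scheme and host; the class name, function name, and sample markup are illustrative assumptions.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self) -> None:
        super().__init__()
        self.hrefs: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def relative_links(html: str) -> list[str]:
    """Return the links that carry neither a scheme nor a host name."""
    collector = LinkCollector()
    collector.feed(html)
    return [
        href for href in collector.hrefs
        if not urlparse(href).scheme and not urlparse(href).netloc
    ]


# Hypothetical error page fragment containing one good and one risky link.
page = (
    '<a href="http://www.sample.com/support.html">Support</a>'
    '<a href="support.html">Support</a>'
)
print(relative_links(page))
```

Any link this function reports is worth rewriting as a full URL before the error page goes live.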