Meta Tags

Meta tags contain descriptive information (meta-information) about a page and are used in the head section of a page. Meta tags are visible when you read the page source, but are not visible when the page is viewed normally.

There are two types of <meta> tags: ones that imitate HTTP headers (information that the web server usually sends to the browser), and ones that have nothing to do with HTTP headers.

I'm just going to cover a few of the more useful ones of each type here. For information about several other meta tags see the Meta Tag Dictionary.

<meta http-equiv="some header name" content="some value for header" />

Meta tag page Refresh

<meta http-equiv="Refresh" content="10;URL=http://www.someurl.com/" />
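
The content value packs two things into one string: a delay in seconds and, after the semicolon, the address to load next. So the tag above asks the browser to wait 10 seconds and then go to http://www.someurl.com/. As a rough illustration only, here is a Python sketch that pulls the value apart. The helper name parse_refresh is made up for this example, and real browsers are more lenient about spacing and quoting than this is.

```python
def parse_refresh(content):
    """Split a Refresh content value into (delay_seconds, url_or_None).

    Hypothetical helper for illustration only; it handles the simple
    "seconds;URL=address" form, not every variant browsers accept.
    """
    delay_part, _, rest = content.partition(";")
    delay = int(delay_part.strip())
    rest = rest.strip()
    if not rest:
        # No address given: the browser just reloads the current page.
        return delay, None
    # Drop the leading "URL=" label (case-insensitive) if present.
    if rest[:4].lower() == "url=":
        rest = rest[4:].strip()
    return delay, rest


print(parse_refresh("10;URL=http://www.someurl.com/"))
```

Leaving out the URL part (for example content="30") makes the page reload itself after the delay, which the sketch reports as None.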

Setting Character Set for page

<meta http-equiv="Content-Type" content="text/html; charset=Shift_JIS" />
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
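
Use only one of these on a page: the one that matches how the file was actually saved. The charset matters because the same bytes turn into different characters depending on which character set the browser assumes. A small Python sketch (the sample text is my own choice, not from any real page) shows what happens when Shift_JIS bytes are read as ISO-8859-1:

```python
# Sample Japanese text, chosen only for illustration.
text = "日本語"

# The bytes a server would send for a page saved as Shift_JIS.
raw = text.encode("shift_jis")

# Decoded with the charset the page declares: the text comes back intact.
right = raw.decode("shift_jis")

# Decoded with the wrong charset: each byte becomes an unrelated character.
wrong = raw.decode("iso-8859-1")

print(right == text)  # True
print(wrong == text)  # False
```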

Meta Tag - Content Rating

This is a voluntary rating method where you can specify things like whether your content is "adult", whether you have bulletin boards, or other factors that affect whether the site is suitable for children. Most filtering software makes some use of this information.

Use PICS-Label for the http-equiv. See Platform for Internet Content Selection (PICS) on W3C's site for details of how to use this and discussion of its merits.

<meta name="type of info" content="value of info" />

Just about anything could go in the name and content fields of a meta tag. Many web authoring programs, for example, insert a meta tag that says they were used to make the page. There are only a few that are normally useful, though.

<meta name="keywords" content="important words about site contents">

List the most important words and phrases describing your site here. Use both terms that appear in the text and ones that don't but that someone searching for your site might use. Don't overdo this: avoid gratuitous repetition and unrelated words in meta tags. Some search engines will exclude your site if they think you are trying to trick them.

<meta name="description" content="Describe your site here">

If you use this tag, some search engines will give the words in it more weight than they otherwise would, and some will also use the description as the page summary in their results list.
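
Programs such as search engine spiders read the keywords and description tags straight from the page source. To give a rough idea of what that involves, here is a sketch using Python's standard html.parser module. The class name MetaCollector and the sample page are made up for illustration, and a real search engine does far more than this.

```python
from html.parser import HTMLParser


class MetaCollector(HTMLParser):
    """Collect name/content pairs from <meta> tags (illustrative helper)."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attrs = dict(attrs)
        name = attrs.get("name")
        if name is not None:
            # Store names lowercased so "Keywords" and "keywords" match.
            self.meta[name.lower()] = attrs.get("content", "")


# A made-up page for demonstration.
page = """
<html><head>
<title>Example</title>
<meta name="keywords" content="meta tags, html, tutorial">
<meta name="description" content="A short guide to meta tags.">
</head><body><p>Visible text.</p></body></html>
"""

collector = MetaCollector()
collector.feed(page)
print(collector.meta["keywords"])
print(collector.meta["description"])
```

Tags that use http-equiv instead of name are skipped here, since they have no name attribute.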

Robot meta tags

These can be used on any page of a site, even if you don't have your own domain, but fewer spiders know how to use them than know how to use the robots.txt file described further down.

<meta name="ROBOTS" content="NOINDEX, NOFOLLOW">

This tells the spider not to index (save information about) this page and not to follow any links on this page. You could also tell it to index the page but not follow the links, or to not index this page but follow the links and perhaps index those pages. A couple of other possibilities honored by some search engine spiders: NOIMAGEINDEX (index text on page, but not images), NOIMAGECLICK (don't link directly to an image), NOARCHIVE (don't save a copy of this page). More details at HTML Author's Guide to the Robots meta tag.
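
To make the index/follow combinations concrete, here is a small Python sketch of how a spider might read the content value. The function name robots_permissions is hypothetical, and the sketch assumes that anything not explicitly forbidden is allowed; it only looks at NOINDEX and NOFOLLOW and ignores the other values.

```python
def robots_permissions(content):
    """Return (may_index, may_follow) for a ROBOTS meta content value.

    Illustrative helper only. Assumes a spider may index and follow
    unless NOINDEX or NOFOLLOW says otherwise.
    """
    directives = {part.strip().upper() for part in content.split(",")}
    may_index = "NOINDEX" not in directives
    may_follow = "NOFOLLOW" not in directives
    return may_index, may_follow


for value in ("NOINDEX, NOFOLLOW", "INDEX, NOFOLLOW", "NOINDEX, FOLLOW"):
    print(value, "->", robots_permissions(value))
```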

There is another way to tell spiders what you want them to (not) do on your site: use a robots.txt file. It works with more spiders than the meta tags do, but it is still up to the people who wrote or configured the spider whether they honor it. However, most major search engines will pay attention to your requests.

robots.txt only works if you have your own domain (e.g. www.me.com), because spiders only look for it at the top level of a site. In the top-level directory of your website (the one where your home page should be), put a file called "robots.txt". If it is empty, spiders will know that they have free rein on your site. To limit spiders, put lines like these in the robots.txt file:

User-agent: *
Disallow: /
This tells all spiders that you don't want them on your site at all.

To keep them out of only certain parts of your site, change the Disallow line to list those parts, always starting the directory name with a slash.

For example:

Disallow: /cgi-bin/

You can exclude other directories by listing them on more Disallow lines below the first one. To exclude a particular spider, put that spider's name on the User-agent: line instead of the *.
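
If you want to check how a set of rules will be read before you upload them, Python's standard urllib.robotparser module can evaluate them for you. The sketch below is only an illustration: the rules, the /private/ directory, and the spider names "BadBot" and "AnySpider" are made up, and an individual spider may interpret edge cases differently.

```python
from urllib.robotparser import RobotFileParser

# Example rules: keep every spider out of /cgi-bin/ and /private/,
# and keep one spider (the made-up name "BadBot") out of everything.
rules = """\
User-agent: BadBot
Disallow: /

User-agent: *
Disallow: /cgi-bin/
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# An ordinary spider may fetch the home page...
print(parser.can_fetch("AnySpider", "http://www.me.com/index.html"))
# ...but not anything under /cgi-bin/.
print(parser.can_fetch("AnySpider", "http://www.me.com/cgi-bin/form.pl"))
# The spider named on its own User-agent line is shut out entirely.
print(parser.can_fetch("BadBot", "http://www.me.com/index.html"))
```

Each group of rules starts with its own User-agent line and is separated from the next group by a blank line.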

More details at Web Server Administrator's Guide to the Robots Exclusion Protocol.

If you want to be certain that a page will not end up in a search engine, put it in a password-protected directory. If you can't do that, try to keep anyone from linking to it, or don't put it on the web.