I built a little system for keeping track of visitors to this site. Here's how it works - by way of disclosure, and in case you care... and also in case you have suggestions for improvement!
Many if not most sites track visitors. Why?
The general information available from a visitor comes from three sources:
Keep in mind that HTTP is stateless: a webserver cannot remember anything from one request to the next. The only way to maintain state across requests is by saving information via the user's browser.
For many website purposes it is important to maintain information within a session. (Session is not a technical term, but it means roughly "same visit by same user from same computer".) Temporary cookies or form variables can be used for this purpose.
For other website purposes it is important or desirable to maintain information across sessions. Essentially this means "different visit by same user from same computer". Persistent cookies are the only way to accomplish this.
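To make the temporary/persistent distinction concrete, here is a minimal Python sketch (an illustration only, not the code this site runs). The only difference between the two kinds of cookie is whether the server attaches a lifetime. The function name and the cookie names "session-demo" and "visitor-demo" are made up for the example.

```python
from http.cookies import SimpleCookie


def make_cookie_header(name, value, max_age_seconds=None):
    """Build the value of a Set-Cookie header.

    With no max_age_seconds the cookie is temporary: the browser drops it
    when the session ends. With a lifetime it becomes persistent.
    """
    cookie = SimpleCookie()
    cookie[name] = value
    cookie[name]["path"] = "/"
    if max_age_seconds is not None:
        cookie[name]["max-age"] = max_age_seconds
    return cookie[name].OutputString()


# Temporary cookie: good for state within a single session.
session_header = make_cookie_header("session-demo", "abc123")

# Persistent cookie: kept for a year, so it survives across sessions.
persistent_header = make_cookie_header(
    "visitor-demo", "42", max_age_seconds=365 * 24 * 3600
)
```

A CGI would print each of these after "Set-Cookie: " in its response headers.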
When passing information from one request to the next via form variables or cookies, there are two basic techniques. One is to pass the information itself through the browser. This has the advantage that the server need not store anything, but it has several disadvantages:
The other technique is to store the information on the server, and pass a pointer to the information through the browser. This is preferred.
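The "pointer" technique can be sketched in a few lines of Python. This is a hypothetical illustration: the class name, the in-memory dictionary, and the use of a random token as the pointer are all assumptions made for the example. A real site would keep the records in a file or database so they survive between requests.

```python
import secrets


class ServerSideStore:
    """Keep the real data on the server; hand the browser only a token."""

    def __init__(self):
        # token -> stored data; a stand-in for a file or database
        self._records = {}

    def save(self, data):
        """Store a copy of the data and return the pointer to send to the browser."""
        token = secrets.token_hex(16)  # hard to guess, reveals nothing
        self._records[token] = dict(data)
        return token

    def load(self, token):
        """Look up the data for a token that came back from the browser."""
        return self._records.get(token)


store = ServerSideStore()
token = store.save({"name": "example visitor", "visits": 3})
# Only `token` travels in the cookie or form variable.
restored = store.load(token)
```

Because the browser never sees the data itself, the user cannot read or alter it. The most they can do is send back a token the server does not recognise, in which case load() returns None.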
I decided to assign each visitor a unique number, and store it in a persistent cookie. (The cookie is named w-uh, if you'd like to check...) This number is used as an index to a small database. In the database I keep track of each new visit, with "a visit" defined as "each time I see the cookie after at least three hours have elapsed since the last visit". I'm currently storing date, time, IP address, and domain, along with a count of visits. I also have another little database where I store referring URLs and their corresponding targets.
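Here is a rough Python sketch of that visit-counting rule, under stated assumptions. It is not the site's actual code: the class and field names are invented, the "database" is a plain dictionary, and the three-hour gap is measured from the start of the last recorded visit, which is one reading of the definition above.

```python
from datetime import datetime, timedelta

VISIT_GAP = timedelta(hours=3)  # minimum time between two counted visits


class VisitorLog:
    """In-memory stand-in for the small visitor database (hypothetical)."""

    def __init__(self):
        self._next_id = 1
        self._visitors = {}  # visitor number -> record

    def record_request(self, visitor_id, ip_address, now):
        """Handle one request; return (visitor_id, counted_as_new_visit).

        visitor_id is the number read from the cookie, or None if the
        browser sent no cookie.
        """
        record = self._visitors.get(visitor_id)
        if record is None:
            # No cookie (or an unknown number): assign a fresh one.
            visitor_id = self._next_id
            self._next_id += 1
            self._visitors[visitor_id] = {
                "visits": 1, "last_visit": now, "ip": ip_address,
            }
            return visitor_id, True
        if now - record["last_visit"] >= VISIT_GAP:
            # Enough time has passed: this counts as a new visit.
            record["visits"] += 1
            record["last_visit"] = now
            record["ip"] = ip_address
            return visitor_id, True
        return visitor_id, False  # same visit, nothing to update

    def visit_count(self, visitor_id):
        return self._visitors[visitor_id]["visits"]


log = VisitorLog()
start = datetime(2024, 1, 1, 12, 0)
vid, first = log.record_request(None, "192.0.2.1", start)
_, second = log.record_request(vid, "192.0.2.1", start + timedelta(hours=1))
_, third = log.record_request(vid, "192.0.2.1", start + timedelta(hours=3))
```

In this sketch the first and third requests count as visits and the second does not, since only one hour has elapsed. The number returned on the first request is what would go into the persistent cookie.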
Because I have the date and time of a user's last visit, I can highlight things which are new since they last visited. I'm currently thinking about the most useful way to do this. Possibilities:
Stay tuned - I'll let you know what I decide...
Along with mere traffic information, I'd like to have a visitor's email address. That way I can communicate with them to tell them about site updates, ask their opinion, etc. The only way to get a visitor's email address is to ask them for it, and naturally you don't want to pester them. So I decided on the following logic:
The implementation was simple because my site has two entry point CGIs, index.cgi and noframes.cgi, as described in the frames article. These CGIs simply call a common subroutine to manage the cookie stuff.