Brain Candy #7 - Metasearch Engines

My first thought in creating a column for the Braegen was to write articles that would direct readers to places that they would find fun and interesting. I consciously planned to avoid any discussion of the mechanics of the Web, or of how to search it.

After discussing the evolution of the column with Catherine, I've had a change of heart. Each Mensan has a unique concept of fun and interesting. Helping people to find their own sites of interest is a valuable goal, too. A new class of search tool, the metasearch engine, has developed that makes searching more powerful. A metasearch engine searches search engines (nice sentence, eh?). With a metasearch engine, you submit a query in a simple syntax. The metasearch engine then farms out the search work to a number of search engines, takes what is returned, sorts it by some quality criterion and compiles the results. If these steps are done well, a metasearch engine can be a very easy yet powerful way to gather good information on topics of interest.
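For the curious, the farm-out, collect, sort and compile steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `engine_a` and `engine_b` are made-up stand-ins for real search engines, the URLs are placeholders, and adding up relevance scores is just one possible quality criterion. Nothing here reflects how MetaCrawler or ProFusion work internally.

```python
from collections import defaultdict

# Hypothetical stand-ins for real search engines: each takes a query and
# returns (url, relevance) pairs, with relevance between 0 and 1.
def engine_a(query):
    return [("http://example.com/wine", 0.9), ("http://example.com/ohio", 0.4)]

def engine_b(query):
    return [("http://example.com/wine", 0.7), ("http://example.org/grapes", 0.6)]

def metasearch(query, engines, per_engine_limit=10):
    """Farm the query out, merge duplicate URLs, and rank the combined hits."""
    combined = defaultdict(lambda: {"score": 0.0, "found_by": []})
    for engine in engines:
        # Use only the first few results from each engine.
        for url, relevance in engine(query)[:per_engine_limit]:
            combined[url]["score"] += relevance  # assumed quality criterion
            combined[url]["found_by"].append(engine.__name__)
    # Highest combined score first; the URL breaks ties so the order is stable.
    return sorted(
        ((url, info["score"], info["found_by"]) for url, info in combined.items()),
        key=lambda hit: (-hit[1], hit[0]),
    )

results = metasearch("ohio wine", [engine_a, engine_b])
for url, score, found_by in results:
    print(f"{score:.2f}  {url}  (found by {', '.join(found_by)})")
```

A page returned by more than one engine ends up with a higher combined score, which is one simple way a merged list can be more to the point than any single engine's list.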

There are two metasearch engines that I've used extensively: MetaCrawler at www.metacrawler.com and ProFusion at www.designlab.ukans.edu/profusion/. In doing a bit of research, I found another site, AskJeeves at www.askjeeves.com, that I've used only a few times but feel deserves comment.

MetaCrawler is the first metasearch engine that I used regularly. It is fairly simple: you type what you want to look for, tell it where you want to look, and it goes off and searches. You can limit where it looks (by continent, or US sites only), how long it waits for each search engine to report back and how many results it uses from each search engine. For multiple-word requests, you have the option to look for any word, all words, or all words as a phrase. MetaCrawler will then submit your query to the search engines Yahoo!, Alta Vista, Excite, WebCrawler, Lycos and InfoSeek. It waits for a short time and then collects what it has received and shows it to you. A multi-site search that would have taken you fifteen minutes or more is completed in less than a minute. You do get fewer hits than you would have received by going to each site individually, but the hits you get are usually more to the point. The title and first few lines of each site come back, as well as the URL and the search engine that found it. The hits are ranked according to a relevance scale. You can click directly on any of the URLs and move to the site.
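The difference between "any word", "all words" and "all words as a phrase" is easy to show with a small Python sketch. The function name `matches` and the mode names are made up for this example, and it splits text on spaces only (no handling of punctuation), so treat it as an illustration of the idea rather than of MetaCrawler's own matching.

```python
def matches(text, query, mode="any"):
    """Return True if text satisfies the query under the given mode."""
    words = query.lower().split()
    text_words = text.lower().split()
    if mode == "any":
        # At least one query word appears somewhere in the text.
        return any(w in text_words for w in words)
    if mode == "all":
        # Every query word appears, in any order.
        return all(w in text_words for w in words)
    if mode == "phrase":
        # The query words appear next to each other, in order.
        n = len(words)
        return any(text_words[i:i + n] == words
                   for i in range(len(text_words) - n + 1))
    raise ValueError(f"unknown mode: {mode}")

title = "Ohio wine country tours"
for mode in ("any", "all", "phrase"):
    print(mode, matches(title, "wine tours", mode))
```

Each mode is stricter than the one before it, which is why a phrase search usually returns the fewest hits.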

The other metasearch engine that I use frequently is ProFusion. This site is slower than MetaCrawler, but usually complements MetaCrawler well. It allows the user more control of the process and includes more search engines (OpenText, Magellan and HotBot, in addition to those that MetaCrawler accesses). It allows the user to search the fastest three sites, the "best" three sites, all the sites, or a hand-picked selection of sites. It allows Boolean search logic (e.g., a AND (b OR c) AND NOT d) to be used and specifies which sites can process Boolean search logic. It also offers a personal information filtering service, which periodically resubmits a search and e-mails you when the results change.
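If Boolean logic is unfamiliar, the example expression a AND (b OR c) AND NOT d can be read as a simple test applied to each page. The Python sketch below is only an illustration: the function name `boolean_match` and the sample pages are invented, and it says nothing about how ProFusion or the underlying search engines evaluate such queries.

```python
def boolean_match(page_words, a, b, c, d):
    """True when the page satisfies: a AND (b OR c) AND NOT d."""
    return (a in page_words
            and (b in page_words or c in page_words)
            and d not in page_words)

# Made-up pages, each reduced to the set of words it contains.
pages = {
    "page1": {"wine", "ohio", "vineyard"},
    "page2": {"wine", "michigan", "beer"},
    "page3": {"ohio", "vineyard"},
}

# Query: wine AND (ohio OR michigan) AND NOT beer
hits = [name for name, words in pages.items()
        if boolean_match(words, "wine", "ohio", "michigan", "beer")]
print(hits)
```

Here page2 is rejected because it mentions beer, and page3 because it never mentions wine.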

Both MetaCrawler and ProFusion usually lead to very focused searches if you tailor your queries carefully. I'm including AskJeeves in this article because it seemed to offer something a bit different for the casual web surfer. AskJeeves parses English sentences and makes its best guess as to what you want. It gives you back a wide variety of things, including a lot that you didn't ask for but may be interested in. For example, with the coming of baseball season, I asked it, "Where can I find out about the Cleveland Indians baseball team?" I considered this search a success because I got a lot of information about what I was after. Also, I could easily access the same information for other baseball teams without submitting another query. Another question, "Where can I find out about Ohio Wines?", was less directly successful, but provided me with other sites connected with Ohio (but not wine) or wine (but not Ohio) that I might want to check out anyway. It did find most of the Ohio wine resources I was familiar with (including our own CATBAR home page). Perhaps there just isn't anything else on the WWW about Ohio wine yet. I don't think I would use AskJeeves if I were in a hurry or had a very focused search in mind, but if you're fond of 'this isn't what I wanted, but it looks neat' results, it certainly seems to have a lot to offer.

One more search tip comes from Catherine: most sites, especially commercial ones, desperately want you to find them. They try to have a no-brainer URL, so that you can easily find them and remember them. So if you're looking for a company, for instance, try www.{companyname}.com. I was recently looking for some technical details for Sennheiser headphones; the above trick worked. Before I used the search engines, I tried www.indians.com and got the official web site of the Cleveland Indians. However, the official site of the Kansas City Royals is www.kcroyals.com. You might guess it, you might not. Comedy Central is www.comcentral.com, CD Now is www.cdnow.com, National City Bank is www.national-city.com, and Molecular Design, Ltd. is www.mdli.com. Some are hard to guess, and some are easy. If you guess wrong, you'll get either an error message or some unrelated site of no interest to you. But a good guess might save you a few minutes. Some people try random words in the middle position, like www.peewee.com, just to see what they'll get. It can be a total waste of time in some cases (how can some web sites be so mind-bogglingly boring?) but sometimes yields a pearl. Have fun!

CATBAR - Brain Candy 7 - Metasearch Engines / Brian Rock / Mar 22 1998