VICUG-L Archives

Visually Impaired Computer Users' Group List

VICUG-L@LISTSERV.ICORS.ORG

From: Kelly Pierce
Date: Fri, 12 May 2000 22:23:45 -0500

   The Chronicle of Higher Education

                                    From the issue dated January 21, 2000

   http://chronicle.com/weekly/v46/i20/20b00601.htm


  Searching for the Right Search Engine

   By ROBERT BERKMAN

   Researchers now have it all on the World Wide Web: facts on virtually
   any topic, available from the far corners of the globe, unfiltered by
   reporters, editors, or publishers, and usually free. But sometimes we
   feel that we have too much information -- often way too much -- and
   that it may not be correct.
   Despite the latest flurry of prime-time ads by search-engine vendors
   boasting that they can find anything you want online, search engines
   can't distinguish among Web pages based on their contents. The only
   way researchers can pinpoint information on the Web is if they learn
   how to do efficient Web searches, and which engines are best for which
   purposes.
   One important lesson is to understand the range of search tools now
   available. Many researchers don't realize that they can use
   hierarchical indexes, standard search engines, alternative search
   engines, meta search engines, and databases -- and that those tools
   are not all the same.
   In a hierarchical index -- probably the best known is Yahoo
   (http://www.yahoo.com) -- people trained to categorize information,
   such as librarians and indexers, examine Web sites and put them in
   categories and subcategories. Thus, when you do a search on a
   hierarchical index, it is much more likely that what you find will be
   relevant to what you are looking for.
   The drawback to hierarchical indexes is that they are extremely
   selective. Because they are created by human beings rather than by
   computers, they can include only a tiny portion of what is available
   on the Web. Of course, in these days of abundant information, that may
   not be such a bad thing.
   Yahoo uses a standard search engine as well. For that reason, the
   results of a search on Yahoo are split into several sections.
   "Category matches" inform you if your topic matches one of Yahoo's
   existing categories. "Site matches" are the sites that have been
   indexed and categorized. "Web pages" provide links to pages located by
   the search engine. Yahoo also groups results into two other sections:
   "related news," for any news item it locates on your subject, and "Net
   events," which are mostly chat sites.
   Yahoo is by no means the only hierarchical index, and some of the many
   others are aimed specifically at academic users. The latter group
   includes AlphaSearch (http://www.calvin.edu/library/as), BUBL Link
   (http://www.bubl.ac.uk/link), and Infomine (http://infomine.ucr.edu).
   Then there are the standard search engines. Popular ones include
   AltaVista (http://www.altavista.com), Excite (http://www.excite.com),
   Go Network (http://infoseek.go.com), and HotBot
   (http://hotbot.lycos.com). Unlike hierarchical indexes, standard
   search engines send out software "robots" or "spiders" to search the
   Web and index the pages in each site they encounter. The engines then
   calculate mathematically how relevant the pages are to your search
   terms; each engine uses its own algorithm to rank pages. Factors in
   the calculation include the frequency and placement of your keywords
   on a page, and their occurrence in the descriptions that owners write
   of their pages, which are invisible to users. The search engine puts
   the pages that get the highest score at the top of the list of
   results.
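
   The ranking idea described above can be sketched in a few lines of
   Python. This is only an illustration, not any real engine's
   algorithm: the weights (3 for the title, 2 for the owner-written
   description, 1 for the body) and the names score_page and rank_pages
   are assumptions made for the example.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score_page(query, title, body, description=""):
    """Score one page by keyword frequency and placement.

    The weights are made up for illustration: a hit in the title counts
    more than a hit in the hidden description, which counts more than a
    hit in the body text.
    """
    title_words = tokenize(title)
    desc_words = tokenize(description)
    body_words = tokenize(body)
    score = 0.0
    for term in tokenize(query):
        score += 3.0 * title_words.count(term)
        score += 2.0 * desc_words.count(term)
        score += 1.0 * body_words.count(term)
    return score


def rank_pages(query, pages):
    """Return the pages sorted from highest score to lowest."""
    return sorted(
        pages,
        key=lambda p: score_page(
            query, p["title"], p["body"], p.get("description", "")
        ),
        reverse=True,
    )
```

   Each page here is a dictionary with "title" and "body" keys and an
   optional "description" key. Every real engine uses its own formula,
   which is why the same query returns different results on different
   engines.
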
   Savvy researchers will avoid standard search engines when they have a
   very broad subject. Instead, they will use a hierarchical index to
   find just a few relevant, well-cataloged sites.
   Alternative search engines, which take various approaches to ranking
   and sorting the pages that they find, are often more helpful than
   standard engines. Northern Light (http://www.northernlight.com), for
   instance, ranks Web pages as a standard search engine does. But
   instead of displaying all of its results in a single listing, it sorts
   pages into categories and groups the results into folders. As an
   example, a search for "alternative energy" creates folders with labels
   such as "solar power," "air pollution," and "National Technical
   Information Service," which includes documents from that agency. And
   the folders contain subfolders. Within the solar-power folder, for
   instance, are folders for "photovoltaic systems" and "government
   sites." That arrangement of material can help you determine which
   groups of pages are most likely to be relevant to your needs.
   Ask Jeeves (http://www.askjeeves.com) takes an altogether different
   approach. You don't enter keywords, but type a question in plain
   English -- perhaps "Is there evidence of life on Mars?" Ask Jeeves has
   recorded millions of questions that users have asked it, and has found
   Web sites that answer those questions.
   The first thing that Ask Jeeves does after getting your query is to
   scan its database of questions and answers. It then gives you a list
   of questions that it "thinks" you want the answer to. If you select
   one of them, it lists sites that contain the answers. Ask Jeeves
   doesn't always work, but it can save you time, and it is fun to use.
   Google (http://www.google.com) takes yet another tack. Like other
   search engines, it first matches up your keywords to the pages it has
   collected in its index. Then, however, it ranks each page based on how
   many other pages link to it -- and how many link to those pages in
   turn. The pages you see at the top of your list of results are those
   with the highest number of links from other pages. The idea is that such
   popularity is meaningful, just as a diner that has many trucks parked
   in front probably serves better food than the diner whose parking lot
   is empty. The approach works. After several years of being a loyal
   AltaVista user, I am now a "googler."
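
   For readers curious how link-based ranking might work, here is a
   minimal Python sketch. It is a simplified textbook-style iteration,
   not Google's actual code: the function name link_rank, the damping
   value of 0.85, and the fixed 50 rounds are all assumptions made for
   the example.

```python
def link_rank(links, damping=0.85, iterations=50):
    """Estimate page importance from the link structure alone.

    `links` maps each page to the list of pages it links to. A page
    scores highly when many pages link to it, and more so when those
    linking pages score highly themselves.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)

    # Start with every page equally important.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page keeps a small baseline score.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its score on, split among its links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score evenly.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank
```

   In the diner analogy, each link is a parked truck, and a truck
   counts for more when its driver is known to eat well.
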
   Oingo (http://www.oingo.com) has an even more radical approach. The
   site's slogan is "We know what you mean," and Oingo conducts a
   "conceptual search" to make sure that it understands your request. Ask
   it to search for "china," for example, and it will ask you to choose
   "porcelain" or any of the various geographical Chinas. Once you make a
   selection, Oingo will display "directory hits" and "Web hits." The
   site combines a hierarchical index and a search engine (it uses
   AltaVista), although the conceptual search applies only to its
   directory results.
   Search engines that search other engines are called meta search
   engines. Among the popular ones are Dogpile (http://www.dogpile.com),
   Inference Find (http://www.inferencefind.com), and MetaCrawler
   (http://www.metacrawler.com). The concept here is that because no
   single search engine indexes the entire Web, using a meta search
   engine allows a researcher to scan more sites. The downside is that
   such an engine needs to use a "lowest common denominator" search
   statement, so that all of the search engines that it searches
   understand the request. Therefore, meta search engines are not a very
   good choice for complex searches involving, say, Boolean logic.
   (Dogpile does include some Boolean-search capabilities.)
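
   The "lowest common denominator" trade-off can be illustrated with a
   short Python sketch. Everything in it is hypothetical: the engines
   are stand-in functions that take a query string and return a ranked
   list of URLs, and the merging rule (adding 1/position for each
   engine that returns a page) is just one simple choice, not how any
   of the services named above actually combines results.

```python
# Operators that some engines understand and others do not.
BOOLEAN_OPERATORS = {"AND", "OR", "NOT", "NEAR"}


def simplify_query(query):
    """Reduce a query to plain keywords that any engine can accept.

    Dropping the operators, parentheses, and quotation marks is what
    loses the precision of a Boolean search.
    """
    for symbol in '()"':
        query = query.replace(symbol, " ")
    words = [w for w in query.split() if w not in BOOLEAN_OPERATORS]
    return " ".join(words)


def meta_search(query, engines):
    """Send the simplified query to every engine and merge the results.

    Each engine is a function returning a ranked list of URLs. A URL
    earns 1/position from every engine that returns it, so pages found
    by several engines rise to the top of the merged list.
    """
    simple = simplify_query(query)
    scores = {}
    for engine in engines:
        for position, url in enumerate(engine(simple), start=1):
            scores[url] = scores.get(url, 0.0) + 1.0 / position
    # Highest score first; ties are broken alphabetically by URL.
    return sorted(scores, key=lambda url: (-scores[url], url))
```

   Note that simplify_query silently discards NOT, so a search meant to
   exclude a term would instead include it. That is the kind of loss
   the paragraph above warns about.
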
   A completely different strategy is to search a database on the Web.
   Hundreds of databases originally searchable on CD-ROM or through
   proprietary online dial-up services are now available on the Web, and
   new databases are continually being born there as well. That makes it
   possible to search rich databases with a standard Web browser,
   although in many cases, the researcher must pay a fee or be affiliated
   with a university that subscribes to the database. The fee-based sites
   typically filter the data they contain, increasing the likelihood that
   the results will be relevant to a search; many also offer superior
   search capabilities, so requests can be more precise.
   The many new, free databases on the Web can also be helpful. A site
   that does an excellent job of identifying and sorting free databases
   is The BigHub (http://www.thebighub.com). Through its "specialty
   search categories," it allows you to search more than 1,500 databases
   on the Web, many of which are oriented toward academics.
   What new tools for searching the Web are on the horizon? At a recent
   conference, I heard about "vortals," vertical portals that provide
   information from only a designated slice of the Web. For example, a
   vortal might search only those sites and pages that have to do with
   health care. VerticalNet (http://www.verticalnet.com) offers portals
   to industries including communications and advanced technologies.
   Although the concept is a good one, the jury is still out on vortals'
   usefulness.
   Farther down the road are visual representations of search results.
   Those search tools display their results graphically, allowing you to
   see at a glance which items are the most relevant. A service called
   NewsMaps (http://www.newsmaps.com), for example, displays the results
   of your search as a thematic map. Topographical markers indicate
   clusters of similar documents -- the most similar ones are piled up
   into little hills. According to Cartia, the company behind the
   technology, the maps are created automatically by an algorithm that
   "reads documents, extracts the content, and organizes the collection
   into a map." You can view some sample maps at the site.
   No matter which search tool you choose, you will get the best results
   if you know what information you need, know the advantages and
   disadvantages of the various ways to search the Web, and regularly
   practice doing research online. Despite technological innovation, the
   best research tool remains the human brain.
   Robert Berkman is a member of the faculty of the graduate
   media-studies program at the New School University, and conducts
   workshops on searching the Internet. He is the author of Find it Fast:
   How to Uncover Expert Information on Any Subject



(end of article)


VICUG-L is the Visually Impaired Computer User Group List.
To join or leave the list, send a message to
[log in to unmask]  In the body of the message, simply type
"subscribe vicug-l" or "unsubscribe vicug-l" without the quotations.
 VICUG-L is archived on the World Wide Web at
http://maelstrom.stjohns.edu/archives/vicug-l.html

