Power Search Techniques (Boolean and Field Searching)
This week we are going to explore Power
Search Techniques. There are many excellent guides describing Boolean
logic and charts showing the features available at the "major" search engines,
which you should consult for more details (please see the section For
More Information). Here, our purpose is a brief overview of possible
techniques. Please note that while most of the major search engines we
have been using offer some advanced search capabilities, each one presents
and implements them differently.
And of course, we'll be going on another
Info Quest.
Why "Power Search?"
We've all had the experience where we query a
search engine and come back with thousands, maybe hundreds of thousands
of "hits" or "matches" for our search terms. Unfortunately, it is time-consuming
to sift through the web sites, and usually not very profitable. A search
that casts a wide net and returns a large number of "hits" is said to have
high recall: it captures most of the relevant documents, but usually many
irrelevant ones as well. While this might be the goal in some cases (for
example, if you are working on a topic that is relatively new and you want
everything published on it, and the number of "hits" is only a couple of
hundred), for the most part people searching the web are interested
in high precision. High precision means that the retrieved documents
are highly relevant to your subject; it is achieved by fine-tuning your
search so that it accurately describes your topic and its unique aspects.
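To make the two terms concrete, here is a small Python sketch. It assumes the usual information-retrieval definitions (recall is the share of all relevant pages that the search retrieved; precision is the share of retrieved pages that are relevant). The page names and the relevance judgments are invented for illustration.

```python
def precision(retrieved, relevant):
    """Share of the retrieved pages that are actually relevant."""
    if not retrieved:
        return 0.0
    return len(retrieved & relevant) / len(retrieved)


def recall(retrieved, relevant):
    """Share of all relevant pages that the search managed to retrieve."""
    if not relevant:
        return 0.0
    return len(retrieved & relevant) / len(relevant)


# Hypothetical example: only 4 pages are truly relevant to our topic.
relevant = {"page1", "page2", "page3", "page4"}

# A broad search: 10 hits, including all 4 relevant pages.
broad = relevant | {f"junk{i}" for i in range(6)}

# A fine-tuned search: 3 hits, all of them relevant.
narrow = {"page1", "page2", "page3"}

print(recall(broad, relevant), precision(broad, relevant))    # 1.0 0.4
print(recall(narrow, relevant), precision(narrow, relevant))  # 0.75 1.0
```

The broad search finds everything but makes you sift through junk; the fine-tuned search misses one page but wastes none of your time.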
Many advanced web searching techniques are
old friends of folks used to searching more traditional databases, such
as those containing bibliographic citations or references to journal articles.
Some techniques are unique to the web because of its media and structure.
Important Note: not all advanced techniques are enabled by all search
engines! Consult one
or two of the charts in For More Information and/or read the Help documentation
of the search facility you are using.
Another Note: for the most part, search
engines at directory sites do not offer advanced searching features.
What is Boolean Searching?
Boolean searching is an implementation of Boolean
logic and set theory. Boolean operators, such as AND, OR
and NOT, are used to combine search sets in a variety of ways and
appear within Internet search engines in a range of disguises. A very
brief overview:
Search phrase: cats and dogs
means find web pages in which both terms occur
Search phrase: cats or dogs
means find web pages in which either term (or both) occurs
Search phrase: cats not dogs
means find web pages in which the term cats appears but the term dogs
does not
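Since Boolean searching is set theory in disguise, the three operators can be sketched with Python's built-in set type. This is only an illustration: the tiny index and page names below are made up, and a real search engine's index is far larger and more elaborate.

```python
# Hypothetical mini-index: for each term, the set of pages containing it.
index = {
    "cats": {"a.html", "b.html", "c.html"},
    "dogs": {"b.html", "c.html", "d.html"},
}

cats, dogs = index["cats"], index["dogs"]

cats_and_dogs = cats & dogs   # AND: intersection, pages containing both terms
cats_or_dogs = cats | dogs    # OR: union, pages containing either term
cats_not_dogs = cats - dogs   # NOT: difference, cats pages that lack dogs

print(sorted(cats_and_dogs))  # ['b.html', 'c.html']
print(sorted(cats_or_dogs))   # ['a.html', 'b.html', 'c.html', 'd.html']
print(sorted(cats_not_dogs))  # ['a.html']
```

Notice that AND and NOT can only shrink the result set, while OR can only grow it.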
Most web search engines have the capability
to implement these basic Boolean operators but may present them in a different
way. You will almost always need to go to an "Advanced" search function
to use true Boolean operators; however, you may be able to search using
implied Boolean, with the symbols + (must include) or - (exclude),
from the "Basic" search interface.
Examples of usage:
AND
Use this operator to search for documents
where you'd like both terms to appear, narrowing a search.
Dalmatians AND feeding
OR
Use this operator to include synonyms,
particularly where there are several terms or names used for a topic, or
you would like to broaden a search.
Dalmatians OR "spotted dogs"
NOT
Use this operator to exclude terms, particularly
when your search terms have more than one meaning.
Blues NOT depression
Special Note: these Boolean operators are
often presented as options like "include all the words," (AND operator)
"include any of the words," (OR operator) and "exclude" (NOT
operator).
Another special note: while you might expect search engines to default to
an implied AND (which means that if you enter 2 search terms, it returns
documents in which they BOTH occur), in fact this is not always the case.
Some search engines default to the initially unhelpful OR (returning
documents in which EITHER term occurs).
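To see how implied Boolean and the default operator interact, here is a rough Python sketch of a "Basic" search box. The function name, the `default` parameter and the sample pages are all hypothetical; real search engines parse queries in their own ways, so treat this as a toy model only.

```python
def search(query, pages, default="AND"):
    """Return names of pages matching plain, +required and -excluded words.

    `pages` maps a page name to its text. `default` says how plain words
    (those without + or -) are combined: "AND" (all must occur) or
    "OR" (any one may occur).
    """
    required, excluded, plain = [], [], []
    for token in query.lower().split():
        if token.startswith("+") and len(token) > 1:
            required.append(token[1:])
        elif token.startswith("-") and len(token) > 1:
            excluded.append(token[1:])
        else:
            plain.append(token)

    hits = []
    for name, text in pages.items():
        words = set(text.lower().split())
        if any(term not in words for term in required):
            continue  # a +term is missing
        if any(term in words for term in excluded):
            continue  # a -term is present
        if plain:
            combine = all if default == "AND" else any
            if not combine(term in words for term in plain):
                continue
        hits.append(name)
    return sorted(hits)


# Invented sample pages (no punctuation, to keep the word splitting simple).
pages = {
    "music.html": "delta blues guitar legends",
    "health.html": "coping with the blues and depression",
    "guitar.html": "guitar lessons for beginners",
}

print(search("blues -depression", pages))            # ['music.html']
print(search("blues guitar", pages, default="AND"))  # ['music.html']
print(search("blues guitar", pages, default="OR"))   # ['guitar.html', 'health.html', 'music.html']
```

The same two-word query returns one page under an implied AND and all three under an implied OR, which is why it pays to check which default your search engine uses.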
What are "Proximity Operators"?
Usually grouped with the Boolean operators, proximity operators
such as NEAR or ADJ are used to control how closely the terms
occur in the web document that is retrieved. For example, NEAR/3 means
that the terms must occur within 3 words of each other. Proximity operators
help ensure that your terms are closely related to one another.
Examples of usage:
NEAR/x
Use this operator to search for documents
where you'd like both terms to appear within a specified distance of each
other, narrowing a search
Dalmatians NEAR/3 feeding (web documents
will be returned that have the term Dalmatians occurring within
3 words of the term feeding)
ADJ
Use this operator to search for documents
where you'd like both terms to appear next to each other, narrowing a search
Dalmatians ADJ feeding
Special Note: the ADJ Boolean operator
is often disguised as the option "exact phrase."
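The two proximity operators can be sketched in a few lines of Python. Some assumptions are baked in: "within 3 words" is taken to mean that the word positions differ by at most 3, NEAR ignores word order, and ADJ requires the second term to follow the first directly. Search engines differ on all of these points, and the function names and sample sentences are made up for the example.

```python
def positions(words, term):
    """Indexes at which `term` occurs in the list of words."""
    return [i for i, word in enumerate(words) if word == term]


def near(text, first, second, distance):
    """True if the two terms occur within `distance` words of each other,
    in either order."""
    words = text.lower().split()
    return any(
        abs(i - j) <= distance
        for i in positions(words, first.lower())
        for j in positions(words, second.lower())
    )


def adj(text, first, second):
    """True if `second` comes immediately after `first` (an exact phrase)."""
    words = text.lower().split()
    second = second.lower()
    return any(
        i + 1 < len(words) and words[i + 1] == second
        for i in positions(words, first.lower())
    )


# Invented sample texts (no punctuation, to keep the word splitting simple).
close = "Dalmatians need regular feeding and exercise"
far = "Dalmatians are energetic dogs so plan their daily feeding carefully"
phrase = "a guide to Dalmatians feeding schedules"

print(near(close, "Dalmatians", "feeding", 3))  # True
print(near(far, "Dalmatians", "feeding", 3))    # False
print(adj(close, "Dalmatians", "feeding"))      # False
print(adj(phrase, "Dalmatians", "feeding"))     # True
```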
What is Field Searching?
Remember that a web search engine is only as
good as its database and indexes. Databases are collections of records
organized in a similar manner; simply put, this means they are divided
into fields that contain the same information in each record. If data is
entered into a separate field you can retrieve it using its field label.
This means that if you want to search by title, the search engine looks
in a special title index (or searches notations that indicate that the
term occurs in the title field) where it has collected data from the field
with the label title.
Field searching is so wonderful because you
can specify where to look in the web document; for example, in the title
only, or in the URL field. Field searching allows you to be very specific
about where you want your terms to occur and hence is a very powerful tool.
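Here is a rough Python sketch of the idea: every record has the same fields, each field gets its own index, and a search is pointed at one index or another. The records, URLs and function names are invented for illustration, and real search engines build their indexes quite differently.

```python
import re
from urllib.parse import urlparse

# Hypothetical records: each web page is stored with the same set of fields.
records = [
    {"url": "http://www.example.edu/fish/trout.html",
     "title": "Rainbow trout biology",
     "body": "Notes on the life cycle of rainbow trout."},
    {"url": "http://www.example.com/recipes.html",
     "title": "Weeknight dinners",
     "body": "Grilled rainbow trout with lemon."},
]


def build_field_index(records, field):
    """Map each word in one field to the set of record numbers containing it."""
    index = {}
    for number, record in enumerate(records):
        for word in re.findall(r"[a-z0-9]+", record[field].lower()):
            index.setdefault(word, set()).add(number)
    return index


def field_search(index, *terms):
    """Record numbers whose indexed field contains all of the terms."""
    sets = [index.get(term.lower(), set()) for term in terms]
    return set.intersection(*sets) if sets else set()


def in_domain(record, suffix):
    """True if the host name in the record's url field ends with `suffix`."""
    host = urlparse(record["url"]).hostname or ""
    return host.endswith(suffix)


title_index = build_field_index(records, "title")
body_index = build_field_index(records, "body")

title_hits = sorted(field_search(title_index, "rainbow", "trout"))
body_hits = sorted(field_search(body_index, "rainbow", "trout"))
edu_hits = [n for n in body_hits if in_domain(records[n], ".edu")]

print(title_hits)  # [0]
print(body_hits)   # [0, 1]
print(edu_hits)    # [0]
```

Both pages mention rainbow trout somewhere, but restricting the search to the title field or to the .edu domain leaves only the first one.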
Using Search templates:
I'm a big fan of advanced search templates
such as the ones used by Hotbot (http://www.hotbot.com)
and Snap (http://www.snap.com). These
templates let you use many Boolean and field searching techniques without
having to learn the syntax of yet another search engine.
For More Information, take another online course!
Recommended:
Diane Kovacs, a well-known Internet trainer and writer, teaches Web- or
email-based classes on a variety of topics, including a class in advanced
searching techniques in the Spring.
For more information and registration details, go to http://www.kovacs.com/online.html.
The class, called "Finding Real Information on the Web: Advanced Web
Searching Tips and Techniques," may also be taken any time in email
format.
For More Information:
- Boolean Searching on the Internet by Laura Cohen
  (http://www.albany.edu/library/internet/boolean.html)
  Excellent explanation of Boolean logic with diagrams.
- Choose the Best Search Engine for Your Information Needs by Debbie Abilock
  (http://www.nueva.pvt.k12.ca.us/~debbie/library/research/adviceengine.html)
  Contains a comprehensive chart describing information needs and possible
  Web-based resources to use.
- Finding It Online by Debbie Flanagan
  (http://home.sprintmail.com/~debflanagan/main.html)
  Contains general information on how to search; also information on
  selected search engines and a neat "practice" facility.
- Guide to Effective Searching on the Internet by The Web Tools Company
  (http://www.thewebtools.com/tutorial/tutorial.htm)
  Series of 12 tutorials focusing on 48 topics, containing lots of tips on
  "Power Searching."
- How to Choose a Search Engine or Research Database by Laura Cohen
  (http://www.albany.edu/library/internet/choose.html)
  Contains in-depth charts describing information needs and which search
  facilities to use. Ms. Cohen's work is always up to date, fresh and
  comprehensive! Excellent.
- NetLearn, the directory of Internet Learning Resources (Robert Gordon
  University, Aberdeen)
  (http://www.rgu.ac.uk/~sim/research/netlearn/callist.htm)
  "NetLearn is a directory of resources for learning and teaching Internet
  skills, including resources for WWW, email and other formats. Links with
  descriptive and evaluative annotations are provided, covering: learning,
  teaching, navigating and providing information on the Internet; learning
  HTML; demographics; special needs and foreign language resources."
  Well organized and comprehensive.
- Search Tools Chart by the InfoPeople
  (http://www.infopeople.org/search/chart.html)
  This table contains useful information about major search engines such as
  database sizes, Boolean and other search options and miscellaneous
  special features.
- Search Engine Math by Danny Sullivan
  (http://searchenginewatch.com/facts/math.html)
  How to maximize searches by learning a few simple applications of + and -
  (plus more!). Highly readable, and the techniques are effective.
Assignments:
Please use search engines only (that is, no directories) to find your
answers so you can practice some of the advanced searching techniques.
1. Please answer the following questions, giving the site address and the
actual answer.
- Where can I find a picture of a dragon in the Western Sahara?
- Why was the Taj Mahal built? Use a web site that includes a picture.
- Where can I find out about recalls on children's toys?
- Find out why crime doesn't pay in Brazil.
- What happened to Mr. Earnshaw on p. 39 of Wuthering Heights?
- Listen to Martin Luther King discussing defense expenditures. What did
  he think we should be spending more money on?
- Which native of Queensland, invisible to the naked eye, can you take a
  tour of?
- What class of people came after the Pharaoh in the social pyramid?
2. Suggest 2 implementations (search queries which demonstrate the use)
of each of the following techniques:
- Boolean Searching
- Proximity Searching
- Nesting
- Field Searching
3. Try the following searches in 2 different search engines, noting the
differences. Do you get the results you expect (for example, if you think
you are narrowing the search, does this in fact happen)? Use a search
template for the field searches, or try out some of the field searching
techniques.
- rainbow trout
- rainbow and trout
- "rainbow trout"
- rainbow +trout
- rainbow -trout
- rainbow trout but not farm
- rainbow trout and not farm
- (rainbow trout) and (fly or flyfishing)
- rainbow trout in title field
- rainbow trout in top level field
- rainbow trout in .edu domain
- rainbow trout in .gov domain
- rainbow trout in .com domain