WIR #10: Mastering Online Research

Shaw, Maura D. Mastering Online Research: A Comprehensive Guide to Effective and Efficient Search Strategies. Cincinnati, Ohio: Writer's Digest Books, 2007.

This is going to be a lengthy post. Given the nature of this book, which I just finished reading, it has to be. Shaw provided lots and lots of URLs throughout this book, and I was diligent to visit every last damned one of them. Given that this book was published in 2007, it is inevitable that some of the URLs have either changed or are no longer valid. Fortunately, the number of links no longer valid was small. I took, as the old cliché says, "copious notes" so that you would be aware of the discrepancies. (NOTE: For convenience' sake, I've noted above the copyright information for this book, and when I quote directly from it, I will cite only the page from which the quote comes.)

On the whole, this book is fabulous. Its value is not diminished at all by any of the problems noted above. In fact, I found it to be quite inspiring, especially once I got past the most basic material. Below is a brief rundown of the book's chapters:

  1. The Internet Defined
  2. Conducting Basic Searches
  3. Conducting Advanced Searches
  4. Evaluating Websites
  5. The World of Hyperlinks
  6. The Best Websites to Begin Your Research
  7. Searching for People and Places
  8. Accessing Special Search Areas
  9. Searching for Image, Audio, and Video Files
  10. Research Skills for Writers
  11. Permissions and Copyright Issues
  12. Concluding Your Research

The chapters I enjoyed the most were chapters 6, 8, 10, 11, and 12. The list of websites in chapter 6 is excellent. Although not exhaustive, it provides a great starting point for any online research project. Chapter 8 has a section discussing the "deep web," that is, websites and web pages that are normally hidden from view. Some of these, such as many web pages at a corporation's web site, are not accessible at all, for privacy or proprietary reasons, and some might be personal web pages that have been marked with code to tell a webcrawler to ignore them so that they don't get cataloged by search engines like Yahoo! or Google. Since I'm a writer, it should be obvious that chapter 10 would be of interest to me. There is in this chapter a list of extremely useful web sites. Chapter 11 contains one of the best and most thorough, yet brief, discussions of copyright I've ever read (I cover this in some detail later in this review). Finally, chapter 12 provides a good list of questions to ask yourself to determine whether you've done enough research.
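
For the curious, one common way a page gets "marked" so that crawlers skip it is a robots.txt file at the root of the site. I'm assuming that is the kind of marking meant here; a meta tag inside the page is another way. The sketch below uses Python's standard urllib.robotparser to ask whether a well-behaved crawler would be allowed to fetch a page. The robots.txt text, the example.com URLs, and the function name is_crawlable are all made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; a real site serves this file at /robots.txt.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_crawlable(robots_txt: str, url: str, user_agent: str = "ExampleBot") -> bool:
    """Return True if the given robots.txt rules let user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for page in ("http://example.com/index.html",
                 "http://example.com/private/notes.html"):
        print(page, is_crawlable(ROBOTS_TXT, page))
```

Keep in mind that robots.txt is only a request. Polite crawlers honor it, but it does not actually lock anything away.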

Shaw's style is conversational, never condescending. For me, since I'm hardly a novice, the opening chapters were a chore to read through. If you're unfamiliar with the basics, if you're the sort of person who, when you visit Google to conduct a search, will type in something like, "When did Col. George Custer's Last Stand take place?" then you need to pay attention to the chapters on the basics.

Now, to cover the discrepancies:

Chapter 3: Conducting Advanced Searches

  • This chapter focuses mainly on the Google "Advanced Search" page, but also covers the same at Yahoo!, even if only briefly. Unfortunately, Google's page has changed significantly since this book was published, as noted below:
    • Find Results – The appearance of the "Find Results" section is different on Google's current "Advanced Search" page.
    • Language – This still appears.
    • File Format – This still appears, but is now listed at Google as "File Type."
    • Date – This still appears.
    • Numeric Range – This still appears.
    • Occurrences – This feature appears to have been dropped from the Google search engine. Per Shaw, this feature noted the number of times your search phrase appeared in each item listed in the results. In addition, it also allowed you to search anywhere on the page, anywhere in the title of the page, anywhere in the text of the page, anywhere in the URL, and anywhere in links to the page. Even Shaw in her book questioned the usefulness of this feature. Perhaps Google came to the same conclusion and therefore eliminated the option.
    • Domain – This appears to have been dropped.
    • Usage Rights – This still appears.
    • Safesearch – This still appears.
    • Page-Specific Search – This still appears.
    • Topic-Specific Search – This still appears.
    • Country – Here, Shaw refers to the Yahoo! advanced search, but Google has this, too. It appears under "Region."
    • Subscriptions – Again, Shaw is referring to a feature at Yahoo! which at the time of her writing was in beta. This does not appear on the Google "Advanced Search" page.
  • Other Search Engines – Below is a list of links to sites recommended by Shaw. Any discrepancies in the book are noted, but this list is not necessarily a list of discrepancies. Although Shaw focuses largely on the Google and Yahoo! search engines, she does offer brief sections on others, as follows:
    • ask.com
    • gigablast.com
    • wisenut.com — this takes you to a search engine whose default is in Korean; you can change the language, but for your reference, the link for this site in English is http://en.wisenut.com/.
    • Windows Live Search – Shaw's discussion of this points to www.life.com, a URL which takes you to the Life Magazine web site. (NOTE: When I originally tried that link, I was redirected to www.bing.com, a Microsoft-owned search engine.)
    • looksmart.com — a syndicated pay-per-click search network.
  • Metasearch Engines
    • dogpile.com – does a simultaneous search in the Google, Yahoo!, bing.com, and ask.com search engines.
    • clusty.com – returns search results in topic clusters (from which it gets its name).
    • mamma.com – billed as "the Mother of All Search Engines," mamma.com was initially created as a graduate student's thesis. It distinguishes itself from other metasearch engines by its ability to search "deep content sites." Other names for the Deep Web, according to Shaw, are "the invisible Web" or "the hidden Web."
    • webcrawler.com – similar to dogpile.com.
    • metacrawler.com – similar to dogpile.com.
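
Several of the "Advanced Search" options discussed in this chapter can also be typed straight into the search box as operators. The sketch below assumes the commonly used operators site: (restrict to a domain), intitle: (word in the page title), and filetype: (file format), and it builds a plain Google search URL from them. The function names and the sample query are my own, not anything from Shaw's book, and the operators a search engine honors can change over time, so treat this as illustrative only.

```python
from urllib.parse import urlencode


def build_query(terms: str, site: str = "", in_title: str = "", file_type: str = "") -> str:
    """Combine plain search terms with optional search operators."""
    parts = [terms]
    if site:
        parts.append(f"site:{site}")          # restrict results to one domain
    if in_title:
        parts.append(f"intitle:{in_title}")   # word must appear in the page title
    if file_type:
        parts.append(f"filetype:{file_type}") # restrict results to one file format
    return " ".join(parts)


def search_url(query: str) -> str:
    """Return a search URL with the query safely percent-encoded."""
    return "https://www.google.com/search?" + urlencode({"q": query})


if __name__ == "__main__":
    query = build_query("Little Bighorn", site="loc.gov", file_type="pdf")
    print(query)
    print(search_url(query))
```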

Chapter 5: The World of Hyperlinks

  • Where Shaw discusses Yahoo!'s "Directory" link (p. 120), it does not appear where she says to find it, because the web site's organization has changed in the intervening time since the book was published. You must now go to Yahoo!'s home page — www.yahoo.com — then click on "View Yahoo! Sites," then click on the "Find Info" tab. Here, included in the list of categories and subcategories, you will find the "Directory" link.
    • Google has the same feature: go to their home page (www.google.com), then click on the "more" drop down link at the top, then click on "even more." The list that comes up will include "Directory." Google's directory doesn't appear to be as comprehensive as Yahoo!'s, but it is likely that their category listing is simpler than Yahoo!'s. This is a quick, convenient way to find sites on history, for example, but it may not yield web sites pertinent to the historical period that you might be researching.
    • A fairly comprehensive directory can also be found at about.com. Simply click the "All Topics" link at the bottom of their home page.
    • The Open Directory Project — http://www.dmoz.org/ — is another good directory, one that is incredibly comprehensive (it's been around since 1998 and as of the writing of Shaw's book had compiled a database of over four million web sites in 590,000 subject categories).

Chapter 6: The Best Websites to Begin Your Research

  • The address provided for the Library of Congress catalog (www.catalog.loc.gov) is incorrect. It is: catalog.loc.gov. It may seem like a minor thing to not include the "www," but if you do include it, you won't get to your intended destination.
  • The URL for the University of Virginia Hypertext Collection has changed from http://etext.lib.virginia.edu/ to http://www2.lib.virginia.edu/scholarslab/.
  • Trying to access http://encarta.msn.com/ will give you a page with the message, "The MSN Encarta page you are trying to visit has been discontinued."
  • Of course, Shaw provides a link to Wikipedia (www.wikipedia.org), but adds the caveat: "Not all the content is written by experts or reviewed before posting online—an interesting concept, but it makes me nervous when I've read a 'fact' in a Wikipedia article that I personally know to be incorrect. Double-check information from Wikipedia with another online source, just to be sure. Students should be aware that some university and college departments do not accept (...) Wikipedia articles as legitimate research." (p. 143) You can add to this the interesting tidbit of a story given by Neil Gaiman at his web site, where a friend of his, a movie writer, said, as Gaiman put it, "the Wiki articles on anything he actually had personal knowledge of were mostly only about 60% accurate." Gaiman told him that he should be able to change them, since that's how Wikipedia worked. The movie writer said he had changed them, but then everything had been changed back. So, yeah, Wikipedia, while very convenient, should always be viewed with suspicion as a reference source.
  • The URL for "Foreign Government Resources on the Web," found at the University of Michigan (www.lib.umich.edu/govdocs/foreign.html), is no longer correct. Such information can still be found, but the University of Michigan has since reorganized its web site. Googling on "Foreign Government Resources University of Michigan" gave me this URL: http://www.lib.umich.edu/government-documents-center/explore/browse/foreign-governments/255/search/.
  • www.mammahealth.com no longer exists, unless the URL was mistyped into Shaw's book. There is a web site with the following URL: http://www.mamashealth.com/. I don't know whether this one is related to the database that Shaw references, however.

Chapter 7: Searching for People and Places

  • The URL www.nationwidespeakers.com is no longer valid.
  • The URL www.feedster.com is no longer valid.

Chapter 9: Searching for Image, Audio, and Video Files

  • The URL http://yotophoto.com/ does not point to a photography site. It currently points to an Apache 2 test page. I waited a few days to see if this test page would disappear since it could've been the result of maintenance, but it still points to the test page as of this writing.
  • The URL http://www.lcweb2.loc.gov/pp/pphome.html is no longer valid.

Chapter 11: Permissions and Copyright Issues

  • Shaw provides an excellent discussion on this particular topic. One thing rarely covered in any online discussion (that I've ever seen) on this subject is the doctrine of "fair use." Shaw, in a demonstration of the doctrine, quotes from the 15th edition of the Chicago Manual of Style, saying,

    In an exercise of the fair use doctrine advocated by the University of Chicago Press, I am quoting here a couple of sentences from its Manual of Style (section 4.75), to be sure that you understand the limits clearly. According to the Manual of Style, the doctrine of fair use "allows authors to quote from other authors' work or to reproduce small amounts of graphic or pictorial material for purposes of review or criticism or to illustrate or buttress their own points. Authors invoking fair use should transcribe accurately and give credit to their sources. They should not quote out of context, making the author of the quoted passage seem to be saying something opposite to, or different from, what was intended." (p. 287)

    Another important facet of "fair use" pointed out by Shaw is that "you cannot use such a large portion of a copyrighted work that its intrinsic value is endangered" (p. 288). What does this mean? She explains:

    If you reproduce an entire five-line poem, for instance, you've taken the whole piece away from the author without payment. If the poem is eighty lines and you quote only five of them in an article or book or web page that discusses such poetry, it could be considered fair use — you have not diminished the worth of the poem as it appears in the poet's published volume, and you may even have contributed to gaining new readers for the poet. Of course, you need to give accurate citation to the source of the poem, but you do not need to request written permission to use such a small portion of the work. (p. 288)

    Of course, if the work is used for commercial purposes, as when, say, an automotive manufacturer uses the lyrics and music of a musician to sell their car, then permission must not only be obtained from the creator of the piece, but payment must also be made. Some people think this is too restrictive, but were it their work that was being abused they'd complain to high heaven and sue for just recompense, too. Remember that the next time you quote someone else's words, use their ideas, or purloin their images off the Internet. It could be your work that's being stolen. Using the image of a book cover to discuss a book you've read, as I'm doing here, is fair use (the same is true for the short bits I've quoted; I've credited the source and I'm not claiming to be the author). Fair use crosses into plagiarism, and possible copyright infringement (they are two different things), when you represent someone else's work as your own.

  • Shaw also discusses a well-known example that went to court. She writes:

    [Nancy Stouffer of Pennsylvania — Mechanicsburg, Pennsylvania, I believe] claimed that J. K. Rowling had stolen the term "Muggles" and the physical appearance of a character named Larry Potter from works published in 1983 and later. U.S. Courts in 2002 found in favor of Rowling and assessed fines against [Stouffer] for lying and doctoring evidence.

    I knew about this case when it first became news because I lived in Mechanicsburg, Pennsylvania, at the time. (I did not know that Stouffer had been fined, however.) I had even visited Stouffer's web site and examined much of the work that was available on the Internet. Her claims were so ludicrous that I knew her case would be dismissed. (This is a link to a website created later by Stouffer and this is an incredibly interesting blog post written by a professional editor on the subject.) As for the word "Muggles," subsequent research by others showed that the term went back at least 300 or 400 years. The only way Stouffer could've had a case in that particular example would've been if she had been marketing her material under the term "Muggles," and then Rowling tried to do the same. In that event, however, it would've been an infringement of trademark, not one of copyright. Further, U.S. copyright law clearly states that titles and names cannot be copyrighted. Thus, I could easily write a work of fiction in which there is a character named Harry Potter and as long as I make sure that my "Harry Potter" does not resemble Rowling's, Rowling would not be able to sue me.

    As for obtaining permission to use quoted material, it's not that difficult. I've done it for articles I've written in the past. I probably didn't need to, given that my quotes were not of significant length, but I did so anyway. Permission was gladly granted, and I followed up by sending the copyright owner a copy of the article I had written. In fact, I've done this on a couple of occasions.

    Per Shaw, the guideline generally accepted by publishers for fair use is that quotes may be no more than 300-350 words, total. Above, I've quoted a total of 351 words, so I can't quote any more; I need to paraphrase anything else I want to reference from this book. The total words quoted should not exceed those numbers, or if they do, they shouldn't exceed them by much. It's not a matter of 300-350 from chapter one, and then 300-350 words from chapter seven. That's not how it works. It's in the general range of 300-350 total! Period. (My going over by just 1 word isn't going to be quibbled over, I'm sure. If you discount the sentences Shaw quoted from the Chicago Manual of Style, since those aren't Shaw's words, then I've quoted only 279 words from Shaw and 72 words from the Chicago Manual of Style.) For poetry, it's 4-5 lines, unless the poem is that short, in which case you'd be obligated to get permission from the poet. For short stories, about 100 words. The total quoted is proportional to the length of the work quoted. Quotations need to be accurate, too, and any missing text should be replaced with an ellipsis (...). You can do as I did above, where I replaced a short portion of text with [Stouffer], but you should do it in a fashion similar to how I've done so that readers know unmistakably that you've replaced some text, and you need to be accurate with your replacement so that you've not misquoted the author.

    Shaw points out that song lyrics are a different beast altogether. Trying to use even a single line from a song without permission can result in Legal Dragons coming out of their musical caves seeking to devour you without mercy. Best to obtain permission, or to let the beast sleep unquoted.

  • Shaw's coverage of copyright is incredibly thorough, but there is one point that has changed since her book was published. She notes that works created on or after 1 January 1978 are copyrighted for the life of the author plus 50 years. However, the U.S. Copyright Office's web site states that currently "for works created after January 1, 1978, copyright protection lasts for the life of the author plus an additional 70 years." There are other factors that can come into play, however, so it's best to investigate the information found at the Copyright Office's web site to make sure which rules apply to any work of your own creation. To give just two examples, a "work for hire" scenario would have anything you've written belonging to the person who hired you. If you write for the Federal Government, then your work falls in the public domain.

    Every year older works enter into the public domain, but one cannot assume that just because something is old it is not protected by copyright. It is safe to assume, however, that works prior to 1923 are in the public domain. This means you may quote from them freely, that the doctrine of "fair use" doesn't apply (since you would not need to seek permission of the work's creator), but that does not mean you are exempt from giving due credit to your sources. To not give credit, regardless of a given work's age, is plagiarism.

    What does this mean? Well, as an example, the works of Shakespeare are all in the public domain, as is the Bible. However, with the Bible you will find copyright notices in every modern translation. What is copyrighted in this case is the translation, not the original manuscripts. The publishers of each translation have web sites where you can learn more about their policies regarding the use of their translations. When it comes to Shakespeare, you will also find copyright notices, but what is copyrighted is the introductory material included by the publisher, historical notes, etc. Basically, anything not written by Shakespeare himself. So, even with older works, you not only have to take care what you quote, but you also have to give credit lest the slings and arrows of plagiarism be levied at you (quoted a little Shakespeare there, for ya :P That's the "slings and arrows" bit, if you didn't know).

  • The URL www.memory.loc.gov/learn/start/cite/index.html in this chapter is redirected to http://www.loc.gov/teachers/usingprimarysources/. A good URL to use for learning how to cite electronic sources (which is Shaw's purpose behind providing the now defunct URL) is http://www.loc.gov/teachers/usingprimarysources/citing.html. Two others are http://www.loc.gov/teachers/usingprimarysources/chicago.html, which shows the formats given in the Chicago Manual of Style, and http://www.loc.gov/teachers/usingprimarysources/mla.html, which shows the formats given in the MLA Handbook (MLA is the Modern Language Association).
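
Going back to the 300-350 word guideline discussed earlier in this chapter: if you quote from a book in several places, it can help to keep a running tally. Here is a rough sketch of how that might be done. The function names and the 350-word default are my own choices for illustration. Splitting on whitespace is only an approximation of how a publisher would count words, and the guideline is a rule of thumb rather than a legal threshold.

```python
from typing import Iterable


def quoted_total(quotes: Iterable[str]) -> int:
    """Count the words across all quoted passages (whitespace-separated)."""
    return sum(len(quote.split()) for quote in quotes)


def within_guideline(total_words: int, limit: int = 350) -> bool:
    """Return True if the running total stays at or under the upper guideline."""
    return total_words <= limit


if __name__ == "__main__":
    # The tallies from this review: 279 words from Shaw, 72 from the Chicago Manual.
    total = 279 + 72
    print(total, within_guideline(total))
```

With the numbers from this review, the total comes to 351, one word over the upper end of the range.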

It seems to me that a book of this sort ought to have a web site associated with it for online updates of the information contained within. Many computer books do this. Web sites come and go, and the more reliable and stable ones are often subject to reorganization. That alone would justify this feature.
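
In the meantime, checking a list of URLs by hand, as I did, is tedious, and it is the kind of chore a short script can help with. Below is a minimal sketch, assuming a simple HEAD request is enough to tell a live page from a dead one (some servers refuse HEAD requests, so a real checker would need a fallback). The names fetch_status and check_links are made up for this example, and the URLs must include the http:// or https:// prefix.

```python
from typing import Callable, Dict, Iterable
from urllib.error import HTTPError
from urllib.request import Request, urlopen


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code for url, or 0 if it cannot be reached."""
    try:
        request = Request(url, method="HEAD")
        with urlopen(request, timeout=timeout) as response:
            return response.status
    except HTTPError as err:
        # The server answered, but with an error such as 404 or 410.
        return err.code
    except (OSError, ValueError):
        # DNS failure, timeout, refused connection, or a malformed URL.
        return 0


def check_links(urls: Iterable[str],
                fetch: Callable[[str], int] = fetch_status) -> Dict[str, str]:
    """Label each URL "ok" or "broken" based on its status code."""
    report = {}
    for url in urls:
        status = fetch(url)
        report[url] = "ok" if 200 <= status < 400 else "broken"
    return report


if __name__ == "__main__":
    for url, verdict in check_links(["http://catalog.loc.gov/"]).items():
        print(verdict, url)
```

A script like this won't catch a page that still loads but now shows something else, like the Apache test page noted above, so a quick look with your own eyes is still worth it.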
