WCopyfind 2.1 Instructions
Purpose:
WCopyfind 2.1 compares text or word processor documents with one another to determine if they share words in phrases.
Overview:
Step-by-step Instructions:
Step 1: Start WCopyfind 2.1.
This number is the minimum phrase length, in words, that WCopyfind 2.1 will consider to be a match. For example, when this parameter is set to 6, WCopyfind 2.1 will ignore matching phrases that are 5 words long or shorter. I recommend leaving this parameter at 6 (words).
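As a rough illustration of how a minimum phrase length filters matches, here is a small Python sketch. It is not WCopyfind's own code: the function name `shared_phrases` and the brute-force search are assumptions made for this example only.

```python
def shared_phrases(words_a, words_b, min_words=6):
    """Return runs of consecutive words common to both lists
    that are at least min_words long (brute-force illustration)."""
    found = []
    for i in range(len(words_a)):
        for j in range(len(words_b)):
            # Only start counting at the beginning of a shared run.
            if i > 0 and j > 0 and words_a[i - 1] == words_b[j - 1]:
                continue
            k = 0
            while (i + k < len(words_a) and j + k < len(words_b)
                   and words_a[i + k] == words_b[j + k]):
                k += 1
            # Runs shorter than the threshold are ignored.
            if k >= min_words:
                found.append(" ".join(words_a[i:i + k]))
    return found


a = "the quick brown fox jumps over the lazy dog today".split()
b = "yesterday the quick brown fox jumps over a fence".split()
print(shared_phrases(a, b, min_words=6))  # the shared 6-word run is kept
print(shared_phrases(a, b, min_words=7))  # the same run is now too short
```

With the threshold at 6 the shared six-word run is reported; raising it to 7 leaves nothing to report.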
This number is the fewest matching words in a pair of documents that will cause WCopyfind 2.1 to report a document match in its “Compare Documents” window and generate a pair of underlined comparison documents in the Report Files Folder. There is no recommended value for this parameter.
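The reporting threshold can be pictured as a simple check on the total number of matched words in a document pair. The sketch below is only an assumption about how such a check might look; `should_report` is a hypothetical name, and the threshold has no default because no value is recommended.

```python
def should_report(matched_phrases, fewest_matches):
    """Decide whether a document pair has enough matching words
    to be reported. matched_phrases is a list of phrase strings."""
    total_words = sum(len(phrase.split()) for phrase in matched_phrases)
    return total_words >= fewest_matches


phrases = ["a b c d e f", "g h i j k l m"]  # 6 + 7 = 13 matching words
print(should_report(phrases, fewest_matches=13))
print(should_report(phrases, fewest_matches=14))
```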
This number is the shortest sequence of printable characters that WCopyfind 2.1 will extract from a word processor or other document to use in its comparisons. Decreasing it will allow WCopyfind to extract shorter snippets of text, but may cause WCopyfind to include some non-text portions of word processor documents in the comparisons. I recommend leaving this parameter at 100 (characters).
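To see why a shorter minimum lets more non-text material through, consider this Python sketch of extracting runs of printable characters from raw file bytes. It assumes "printable" means ASCII space through tilde plus tab, which may not be WCopyfind's exact definition, and `extract_text_runs` is a hypothetical name.

```python
import re


def extract_text_runs(data: bytes, min_chars: int = 100):
    """Return every run of printable ASCII characters in data
    that is at least min_chars long."""
    # Assumption: printable = tab, or space (0x20) through tilde (0x7e).
    pattern = re.compile(rb"[\t\x20-\x7e]{%d,}" % min_chars)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]


# Binary bytes around one real sentence and one short stray fragment.
data = b"\x00\x01Hello world\x00\x02ab\x00"
print(extract_text_runs(data, min_chars=5))  # only the sentence survives
print(extract_text_runs(data, min_chars=2))  # the stray fragment is included too
```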
This number is the maximum number of non-matches that WCopyfind 2.1 will allow between perfectly matching portions of a phrase. For example, if this value is set to 2, then WCopyfind 2.1 will bridge its way across two non-matching words to connect pieces of perfectly matching prose. A value of 0 will limit WCopyfind 2.1 to finding only perfect matches, while a value of 1 to 9 will allow WCopyfind 2.1 to find imperfectly matching phrases (matches that contain flaws). Increasing this value slows the program down. I recommend a value of 0 (if speed or absolute matching is your main requirement) or 2 (if you want to find matches despite minor editing).
This number is the minimum percentage of perfect matches that a phrase can contain and still be considered a match. Setting this value at 100 limits WCopyfind 2.1 to finding only perfect matches. I recommend a value of 100 (if speed or absolute matching is your main requirement) or 80 (if you want to find matches despite minor editing).
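The interplay between the allowed imperfections and the minimum percentage can be sketched in Python as follows. This is a simplified illustration, not WCopyfind's algorithm: it assumes a flaw is a substituted word at the same position in both documents (no insertions or deletions), that the caller starts at a pair of matching words, and the name `extend_match` is hypothetical.

```python
def extend_match(words_a, words_b, i, j, max_flaws=2, min_percent=80):
    """Extend a phrase starting at words_a[i] / words_b[j].
    Returns (length, percent) for the longest phrase that ends on a
    matching word and meets min_percent, or (0, 0.0) if none does."""
    matched = 0    # perfectly matching words so far
    flaw_run = 0   # consecutive non-matches since the last match
    best = (0, 0.0)
    k = 0
    while i + k < len(words_a) and j + k < len(words_b):
        if words_a[i + k] == words_b[j + k]:
            matched += 1
            flaw_run = 0
            length = k + 1
            percent = 100.0 * matched / length
            if percent >= min_percent:
                best = (length, percent)
        else:
            flaw_run += 1
            if flaw_run > max_flaws:
                break  # gap too wide to bridge
        k += 1
    return best


a = "the cat sat on the red mat".split()
b = "the cat sat on the blue mat".split()
print(extend_match(a, b, 0, 0, max_flaws=0, min_percent=100))
print(extend_match(a, b, 0, 0, max_flaws=2, min_percent=80))
```

With no flaws allowed, the phrase stops after the five words before the changed one. Allowing flaws lets the phrase bridge the single changed word and cover all seven words, six of which match perfectly.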
When checked, this parameter causes WCopyfind 2.1 to ignore all punctuation characters when it is performing its comparisons. While punctuation will continue to appear in the reports that WCopyfind 2.1 generates, it will not affect the phrase matching. The matching will normally increase when punctuation is ignored. I recommend against checking this box unless you really want to ignore all punctuation.
When checked, this parameter causes WCopyfind 2.1 to ignore any punctuation characters that appear to the left or right of a word when it is performing its comparisons. For example, the quoted sentence: “The box, which I found, is broken.” will be treated as though it were simply: The box which I found is broken (with no final period). While this “outer punctuation” will continue to appear in the reports that WCopyfind 2.1 generates, it will not affect the phrase matching. The matching will normally increase when outer punctuation is ignored. I recommend leaving this box unchecked if you want absolute matching, and checking it if you want to find matches despite minor editing.
When checked, this parameter causes WCopyfind 2.1 to ignore any number characters when it is performing its comparisons. For example, the words 8-fold and 10-fold will match if this parameter is checked. While numbers will continue to appear in the reports that WCopyfind 2.1 generates, they will not affect the phrase matching. The matching will normally increase when numbers are ignored. I recommend leaving this box unchecked if you want absolute matching, and checking it if you want to find matches despite minor editing.
When checked, this parameter causes WCopyfind 2.1 to ignore capitalization of letters when it is performing its comparisons. For example, the words Whenever and whenever will match if this parameter is checked. While capital letters will continue to appear in the reports that WCopyfind 2.1 generates, they will not affect the phrase matching. The matching will normally increase when capitalization is ignored. I recommend leaving this box unchecked if you want absolute matching, and checking it if you want to find matches despite minor editing.
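The four "ignore" checkboxes described above can be thought of as normalization steps applied to each word before comparison, while the original word is kept for the report. The Python sketch below illustrates that idea under stated assumptions: the function and argument names are hypothetical, and punctuation means only the ASCII characters in `string.punctuation`.

```python
import string


def normalize(word, ignore_all_punct=False, ignore_outer_punct=False,
              ignore_numbers=False, ignore_case=False):
    """Return the form of word used for comparison only."""
    if ignore_all_punct:
        # Drop punctuation anywhere in the word.
        word = "".join(ch for ch in word if ch not in string.punctuation)
    elif ignore_outer_punct:
        # Drop punctuation only at the left and right edges.
        word = word.strip(string.punctuation)
    if ignore_numbers:
        word = "".join(ch for ch in word if not ch.isdigit())
    if ignore_case:
        word = word.lower()
    return word


sentence = "The box, which I found, is broken."
print([normalize(w, ignore_outer_punct=True) for w in sentence.split()])
print(normalize("8-fold", ignore_numbers=True) ==
      normalize("10-fold", ignore_numbers=True))
print(normalize("Whenever", ignore_case=True) ==
      normalize("whenever", ignore_case=True))
```

Note the difference between the two punctuation options: stripping outer punctuation leaves "don't" intact, while ignoring all punctuation turns it into "dont".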
When checked, this parameter causes WCopyfind 2.1 to completely skip words that contain any characters other than letters, except for internal hyphens and apostrophes. The non-words will neither be used in matching, nor will they appear in the reports that WCopyfind 2.1 generates. If you check this box, I suggest also checking “Ignore Outer Punctuation,” so that words that begin or end with punctuation aren’t skipped over (including plural possessives). I recommend leaving this box unchecked if you want absolute matching, and checking it if the documents you are comparing contain many non-textual items, including filenames, URLs, and other word-processor junk.
When checked, this parameter causes WCopyfind 2.1 to completely skip words that are longer than the number of characters you select. The overlong words will neither be used in matching, nor will they appear in the reports that WCopyfind 2.1 generates. I recommend checking this box and setting the number of characters at 20, unless your documents really do contain words longer than that. This choice will allow WCopyfind 2.1 to skip over many non-textual items, including filenames, URLs, image data, and other word-processor junk.
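The two skip filters can be illustrated with a short Python sketch. It assumes "letters" means ASCII letters and uses a hypothetical helper named `keep_word`; WCopyfind's actual rules may differ in detail.

```python
import re

# Letters only, with hyphens or apostrophes allowed strictly inside the word.
WORD_RE = re.compile(r"[A-Za-z]+(?:[-'][A-Za-z]+)*")


def keep_word(word, skip_non_words=True, max_length=20):
    """Return True if word should be kept for matching and reporting."""
    if skip_non_words and not WORD_RE.fullmatch(word):
        return False
    # max_length=None disables the long-word filter.
    if max_length is not None and len(word) > max_length:
        return False
    return True


for w in ["well-known", "don't", "report.doc", "found,", "students'"]:
    print(w, keep_word(w))
```

In this sketch "found," and "students'" are rejected because of their trailing punctuation, which is why stripping outer punctuation first is suggested when skipping non-words.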
When checked, this parameter causes WCopyfind 2.1 to load and use a word map (a generalized thesaurus) of your choice. Once the map has been loaded, WCopyfind 2.1 will examine each word it reads to see if there is a substitute in the word map. It will then perform that substitution prior to doing any comparisons for matching phrases. For example, if the word map indicates that the word “excellent” should be replaced by the word “good”, then WCopyfind 2.1 will consider “excellent” and “good” to be matching words. Any number of word substitutions is allowed. The original words, rather than the substituted ones, will appear in any reports generated by WCopyfind 2.1. If you check this box, I suggest also checking “Ignore Outer Punctuation” and “Ignore Letter Case” because the word map requires perfect matching: it considers “Excellent” and “excellent” to be different words. Checking this box will slow the loading and hashing of the documents, but not the comparisons themselves. The format of a word map is described below. I recommend against checking this box unless you know what you are doing, have a good word map file prepared, and want to be able to find matches despite the presence of synonyms.
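The substitution step can be pictured with a small Python sketch. It uses an in-memory dictionary as a stand-in for a loaded word map and does not reproduce the word map file format; the name `apply_word_map` is hypothetical.

```python
def apply_word_map(words, word_map):
    """Return the comparison form of each word: the mapped substitute
    if one exists, otherwise the word itself. Lookup is exact, so
    case and attached punctuation matter."""
    return [word_map.get(word, word) for word in words]


word_map = {"excellent": "good"}  # hypothetical one-entry map

original = "one excellent idea".split()
compared = apply_word_map(original, word_map)
print(compared == "one good idea".split())      # substituted forms match
print(apply_word_map(["Excellent"], word_map))  # capitalized form is not mapped
print(original)                                 # originals are kept for reports
```

Because the lookup is exact, normalizing case and outer punctuation before the lookup gives the map a better chance of finding each word.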
Copyright 1997-2006 © Louis A. Bloomfield, All Rights Reserved
Page Last Updated: May 9, 2002 3:45 PM