Welcome to WFindStrDotNet!

Download WFindStrDotNet from CNET Download.com.

WFindStrDotNet is the 3rd member of my text search family, which comprises WFindStr version 1.32 (originally written 2002), WIndexService (written 2006/03) and now WFindStrDotNet (also written 2006/03). Essentially WFindStrDotNet is a superior version of WFindStr, since it is written in C# rather than Visual Basic 6 and has better functionality. The only feature of WFindStr that is missing is deduplicating and sorting email addresses (or indeed any text list); for that reason alone, some people may still find WFindStr version 1.32 useful.

Having said that, the 2 main products are WIndexService and WFindStrDotNet. WFindStrDotNet can be useful if you don't have Microsoft Indexing Service (MIS), if you want to search CDs, DVDs or external hard drives which are not indexed by MIS, or for any other reason. WIndexService is so fast that it has many advantages. Maybe in the future I will combine the 2 products, and the application will decide which technology to use for which search...

Feedback of all kinds, positive and negative, is welcome and useful.

Most questions can be answered by reading the tooltip text in the application.

I leave below the original help file for WFindstr in case that helps to show some of the background to the system...

WFindStrDotNet uses the same kinds of commands as WFindStr, piggy-backing on the Microsoft utility findstr.exe.



STOP PRESS 2006/03 - check my new product WIndexService at http://www.cd3wd.com/WIndexService

STOP PRESS 2006/02 - download the wfindstr.exe installation from CNET Download.com.



STOP PRESS 2006/03 - there is a bug in WFindstr 1.32: the results are not shown in the display window, and the hourglass cursor does not revert to the normal cursor, although the search has already finished. My sincere apologies for this fault - I will be rewriting WFindstr in DotNet in the very near future and re-issuing it on www.download.com. MEANWHILE, use this workaround: open a browser, enter the address C:\findstr.htm and press Go - the browser will almost certainly give you a 404 "file not found" error message. Start up WFindstr, fill out the search criteria, start the search, then refresh the web page C:\findstr.htm every few seconds - after some refreshes the search results will appear. You can create a shortcut to C:\findstr.htm on your desktop to make the process simpler, and shrink both the WFindstr and the browser windows so that you can see both on your screen at the same time. Thanks for your patience.

Alex Weir

Welcome to WFindStr.exe – Alex Weir 2002 – All Rights Reserved – Conditional Shareware – 2002.01.31

 

Overview

 

WFindstr.exe is a text search utility for rapid search of a combination of keywords in .htm (.html), .doc (MS Word), .pdf (Adobe Acrobat), .txt (text), or .* (any format) files on local hard drives and especially on CD-ROM and/or DVD drives.  It should also work on mapped network drives (e.g. D: through Z:) but I have not checked that.  The intention is that these documents or pages are accessed directly (the same way as by clicking on the file from Windows Explorer) and not through a local web-server or intranet.

 

It is designed for the rapid search and viewing of .htm, .doc, .pdf and .txt files, and is therefore ideal for any text database of, for example, CVs (resumes, biodata).  There is a refinement built in which enables you to do a search and then automatically create a list of the email addresses corresponding to the file matches.  This list is in a format which allows you to copy and paste it into the c.c. box in Microsoft Outlook and similar email packages.

The secret is in the naming of the .doc or .htm files - each should have the email address as its name, with .doc or .htm at the end.  Because some people want to have more than one version of their CV, the processing system in WFindStr.exe will also accept __01, __shortversion and other entries of that kind between the email address and the .htm or .doc at the end of the filename.  Note that this convention uses a DOUBLE underscore (not a single underscore!).  Therefore an email address alexweir1949@yahoo.com can have a CV filename alexweir1949@yahoo.com.doc or alexweir1949@yahoo.com__02.doc or alexweir1949@yahoo.com__short.htm.

Press the LIST EMAIL button after a SEARCH is done to get this email address listing.  You can then select it, press CTRL-C to copy, and paste into Microsoft Outlook or any other email package.  Thus if consultants email their CV as an attachment using this naming convention, the consulting company can simply drop those CVs into any directory (or directory-and-sub-directories system) and do rapid searching when required to create short (or long) email address lists, to which job requirements are then emailed out.
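To make the naming convention concrete, here is a minimal Python sketch of how an email address could be recovered from such a filename. It is not the code WFindStr uses; the function names, the crude address check and the "; " separator are assumptions made for illustration only.

```python
import re
from pathlib import PureWindowsPath
from typing import Iterable, Optional

# Extensions named in the text for the email convention (.html added as an assumption).
CV_EXTENSIONS = {"doc", "htm", "html"}


def email_from_filename(filename: str) -> Optional[str]:
    """Return the email address encoded in a CV filename, or None.

    Hypothetical helper: everything from the first double underscore up to
    the final extension is treated as a version tag and discarded.
    """
    name = PureWindowsPath(filename).name
    stem, dot, ext = name.rpartition(".")
    if not dot or ext.lower() not in CV_EXTENSIONS:
        return None
    email = stem.split("__", 1)[0]
    # Crude sanity check only; this is not full address validation.
    if re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", email):
        return email
    return None


def list_emails(filenames: Iterable[str], separator: str = "; ") -> str:
    """Join the distinct addresses found in filenames (separator is an assumption)."""
    found = {e for e in map(email_from_filename, filenames) if e}
    return separator.join(sorted(found))
```

With the three example filenames above, `list_emails` would return the single address alexweir1949@yahoo.com, since all three files belong to the same person.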

 

It works only under Windows (sorry) - all 32-bit flavors I think, i.e. Win95, Win98, Win2000, NT etc.

 

Note that this product is Freeware (i.e. free to use) for Charitable Organizations, NGOs, for my personal friends, for any and all users in the Third World, and for use in Schools, Colleges and Universities globally.  For others I will fix a Shareware Price, which will be very reasonable, but which will depend on the organization.  Typically this will be a one-off fee of US$5.00 for individuals, and more for organizations.  I also reserve the right to change any and all of the above - my website will hold any news on that.  Interested organizations please email me at alexweir1949@yahoo.com

 

It uses the Windows FINDSTR.EXE command to do its work, shelling out to it from a Visual Basic 6 GUI.
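For readers who have not used findstr.exe directly, the sketch below builds the kind of command line a wrapper might pass to it. The switches shown (/S, /I, /M, /C:) are documented findstr switches, but the exact command WFindStr issues is not given here, so the function name and the choice of switches are assumptions.

```python
from pathlib import PureWindowsPath
from typing import List


def build_findstr_command(phrase: str, root: str, pattern: str = "*.htm") -> List[str]:
    """Build an argument list for one findstr search (hypothetical helper).

    /S   search the given directory and all its subdirectories
    /I   ignore case
    /M   print only the names of files that contain a match
    /C:  treat the text that follows as one literal search string
    """
    target = str(PureWindowsPath(root) / pattern)
    return ["findstr", "/S", "/I", "/M", f"/C:{phrase}", target]
```

On Windows the resulting list could be handed to `subprocess.run`; the quoting of multi-word phrases should be checked on the target system before relying on it.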

 

The installation is 3 megabytes in total and is available by email from alexweir1949@yahoo.com in 3 chunks. There are 3 files - WfindS2.cab, WfindS3.cab, and WFIunzip.zip. Save all 3 into any directory, unzip WFIunzip.zip into the same directory, then double-click on the setup.exe file.

 

It can search on htm, doc, pdf, txt or .* files,  or on any of the 15 combinations of these 4 types.  You can set the type(s) of files which you are searching for, and store that as a setting which will also become default value when you next start the application.

 

It can search on up to 8 “AND” conditions – so it is quite powerful.  Windows search utility can only really search on 1 condition at a time….  Put each condition in any one of the 8  text boxes – press the HELP button to get an example, then press the SEARCH button to do the search.   To maximize search speed if you are using more than one condition, then put what you expect to be the rarer keywords or key-phrases in the left-hand box or boxes.
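The advice to put the rarest keyword first makes sense if each condition is only checked against the files that survived the previous one. The following Python sketch illustrates that idea on plain-text files; it is an assumption about the approach, not WFindStr's actual code, and it does not read .doc or .pdf content.

```python
import os
from typing import List, Sequence, Tuple


def _file_contains(path: str, needle: str) -> bool:
    """True if any single line of the file contains needle (already lower-cased)."""
    with open(path, encoding="utf-8", errors="ignore") as handle:
        return any(needle in line.lower() for line in handle)


def files_matching_all(
    root: str,
    phrases: Sequence[str],
    extensions: Tuple[str, ...] = (".htm", ".html", ".txt"),
) -> List[str]:
    """Return files under root that contain every phrase (a hypothetical helper).

    Matching is case-insensitive and line-based, so a phrase that is split
    across two lines in the file is not found.
    """
    # Start with every candidate file, subdirectories included.
    candidates = [
        os.path.join(folder, name)
        for folder, _dirs, names in os.walk(root)
        for name in names
        if name.lower().endswith(extensions)
    ]
    # Each phrase only scans the survivors of the previous phrase, which is
    # why a rare first phrase makes the later passes cheap.
    for phrase in phrases:
        needle = phrase.lower()
        candidates = [path for path in candidates if _file_contains(path, needle)]
        if not candidates:
            break
    return sorted(candidates)
```

For example, `files_matching_all("D:\\docs", ["biogas digester", "water"])` would scan every file for the first phrase but only the matching files for the second.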

 

If you get zero results for a normal search while using 2 or more keywords or key phrases, then you can press the ADV SEARCH button to get a file count for each of the keywords or keyphrases you have entered.  That information will then help you to modify your search so as to get some matches (e.g. by eliminating any keywords which have a zero filecount).  Better still – do a FUZZY SEARCH – this will drill down up to 4 levels to find partial matches for the keywords and keyphrases you entered – note that this FUZZY SEARCH can save you literally hours of work re-entering some of the keywords….  You can also enter (in the textbox next to the Fuzzy Search button) the minimum number of matches or hits which is acceptable to you on a fuzzy search.  The default value for that is 1 but you may wish to set the cut-off point at 5 or even 20 or more…  Of course on the result page, the matches closer to the top usually match more keywords – check that out – the explanation on the results page is good…
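The two ideas in this paragraph can be sketched in Python as follows. The exact fuzzy algorithm is not documented here, so this is one plausible reading only: "levels" is taken to mean the number of keywords allowed to be missing, and the minimum-hits value is taken to mean the number of result files wanted before the search stops relaxing. The function names and the in-memory `documents` mapping are illustrative assumptions.

```python
from typing import Dict, List, Sequence, Tuple


def keyword_file_counts(documents: Dict[str, str], phrases: Sequence[str]) -> Dict[str, int]:
    """Count, for each phrase, how many documents contain it (case-insensitive)."""
    lowered = [text.lower() for text in documents.values()]
    return {p: sum(p.lower() in text for text in lowered) for p in phrases}


def fuzzy_search(
    documents: Dict[str, str],
    phrases: Sequence[str],
    max_levels: int = 4,
    min_hits: int = 1,
) -> List[Tuple[str, int]]:
    """Return (path, phrases_matched) pairs, best matches first.

    Level 0 requires every phrase; each further level allows one more
    phrase to be missing, until min_hits files are found or max_levels
    is reached. This interpretation is an assumption.
    """
    needles = [p.lower() for p in phrases]
    scored = [
        (sum(n in text.lower() for n in needles), path)
        for path, text in documents.items()
    ]
    results: List[Tuple[int, str]] = []
    for level in range(max_levels + 1):
        required = len(needles) - level
        if required < 1:
            break
        results = sorted(
            (item for item in scored if item[0] >= required),
            key=lambda item: (-item[0], item[1]),
        )
        if len(results) >= min_hits:
            break
    return [(path, score) for score, path in results]
```

A phrase with a count of zero in `keyword_file_counts` is the one to drop or respell before searching again.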

 

Each condition can be a single word or several words (but note that if the htm code containing those several words is split across more than one line, the search will miss the phrase).

 

The time taken by each search is shown above the search results. It is typically 1 second on a CD in a 12x DVD drive searching through 13 megabytes of htm files (185 files) for up to 8 keywords or keyphrases, and obviously faster on a hard drive.  Searching through 160 megabytes in 12,000 files on a CD in a 12x DVD drive with up to 8 keywords took 170 seconds.  Check out your own speeds in your own environment.

 

You can link directly to each and every .doc, .htm, .pdf or .txt file in the search results.  This linking is done by a normal pop-up browser window, so that you can use the Find On This Page command in Internet Explorer to locate exactly any or all of the keywords you used.

 

You can easily change the drive and directory under which you search.  All searches automatically search subdirectories.  On this version you can only specify one drive and major directory; if there is demand to be able to search multiple drives and directories then this could be built into a future version of this Utility – email me at alexweir1949@yahoo.com.   You can store the changed drive and directory info so that on restarting the application, that new drive and directory become your default – press the “Store Drive + Directory” Button after changing the drive and/or directory to be searched.

 

The htm code for each and every search result can be copied by pressing the “Copy Htm Source” button - then you can make your own index htm page(s) with several commonly used searches if you wish (a bit like using FAQs).  These “pre-searches” also mean that CDs or DVDs designed for use on Windows, Mac and Linux can all benefit from WFindStr.

 

If there is a demand I can improve this product to allow 2 or 3 “OR” searches (each with up to 8 AND conditions)  – email me at alexweir1949@yahoo.com

 

Similarly I could do a proximity search if there is demand – i.e. several keywords on the same or nearby lines.

 

If or when this product starts getting used on any scale and I find that there are some FAQ’s (frequently asked questions), then I will deal with them on my website – http://www.cd3wd.com/wfindstr/

 

The Program deals nicely with non-standard ASCII characters such as accents, umlauts, graves, acutes, etc. as are found in French, German, Spanish, Nordic languages etc.  DOC, PDF and TXT files are handled automatically by Findstr.exe; for HTM files the Program substitutes, for example, &#228 for an a-umlaut before doing the search.  All file types are therefore handled well except if you choose the ALL_Files option - in that case the “umlaut to 228” substitution is not applied to HTM files; all other options run with substitution.  There is a constant in the WfindStr.ini file called AsciConv, which by default is set to 1.  If the feature becomes inconvenient for any reason, manually reset the value to 0 using Notepad or some other text editor.
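The “umlaut to 228” substitution can be illustrated with a few lines of Python. This is a sketch of the idea rather than the program's own routine; the function name and the `terminator` parameter are assumptions, and it only covers numeric entities (pages that use named entities such as &auml; would need a different mapping).

```python
def to_numeric_entities(phrase: str, terminator: str = "") -> str:
    """Replace each non-ASCII character with its decimal HTML entity.

    The text above writes the entity without a trailing semicolon, so the
    default terminator is empty; pass ";" to emit the full form.
    """
    return "".join(
        ch if ord(ch) < 128 else f"&#{ord(ch)}{terminator}"
        for ch in phrase
    )
```

Searching for the form without the semicolon has the side benefit that it also matches pages where the entity is written in full.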

 

The searches are not case-sensitive – they do not differentiate between capital and non-capital letters.

 

There is a Beep Facility – the default threshold is 20 seconds – any searches longer than 20 seconds will beep when the search is complete – you can change this value in the text box above the STORE button to anything you wish.  A value of 0 means that beeping never takes place.  This is convenient if any searches take really long – you can minimize the window and do something else useful like make coffee.

 

The HELP button also loads 8 specimen keywords and key-phrases into the 8 text boxes; then press the SEARCH button to run the search.

 

There is a nice DEDUP EMAIL button which helps with both the LIST EMAILS and the COPY HTM SOURCE options.  It enables you to deduplicate a list of email addresses which may have been generated from several OR searches or from a FUZZY SEARCH.  You can also deduplicate HTM SOURCE lines which may have been generated from several OR searches or from a FUZZY SEARCH, and use the SAVE AS HTM AND VIEW suboption.  And DEDUP EMAIL can be used to sort and deduplicate any list of any kind from one or multiple text list files - that can be very useful.
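Sorting and deduplicating a list is simple to sketch in Python. The version below is an illustration only: the function name is made up, and treating entries that differ only in letter case as duplicates is an assumption that suits email addresses but may not suit every list.

```python
from typing import Dict, Iterable, List


def dedup_sorted(lines: Iterable[str]) -> List[str]:
    """Return the distinct non-empty lines, sorted case-insensitively.

    The first spelling seen for each entry is the one that is kept.
    """
    first_seen: Dict[str, str] = {}
    for line in lines:
        item = line.strip()
        if item:
            first_seen.setdefault(item.lower(), item)
    return sorted(first_seen.values(), key=str.lower)
```

To combine several list files, their lines can simply be concatenated before calling the function.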

 

Note the installation of the package is in French – please just use your intuition if you don’t speak the language.  There is sometimes an IGNORE and then a YES at the end of the install routine.

 

Alex Weir  ,  2002.01.31

TroubleShooting

 

  1. The main problem I foresee is that Win95 and Win98 do not include findstr.exe in normal installations (but it works perfectly well with them, if of course the findstr.exe file sits in the path).  I have not been able yet to test my install with Win95, therefore it is possible that the file findstr.exe which the install puts on your hard disc is not in the path. If so, then copy the file into c:\windows or c:\windows\system.  You will be able to tell there is a problem if WfindStr never finds any results and always brings up a Web Page not Found message even after a search (a web page not found message is quite normal when the program is started).
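A quick way to check whether findstr.exe can be found on the path is sketched below in Python, using the standard library's `shutil.which`. The function name is illustrative only.

```python
import shutil
from typing import Optional


def locate_tool(name: str = "findstr") -> Optional[str]:
    """Return the full path of an executable found on the PATH, or None."""
    return shutil.which(name)
```

If `locate_tool()` returns None on a Windows machine, copying findstr.exe into a directory on the path, as described above, should resolve it.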

 

 

 

 

 

 

 

The following is an overview/spec which I wrote 15.01.2002 - some of the above re-appears below.

 

 

String Search Utility

 

Need

 

  1. Windows find.exe and findstr.exe are fast and efficient, but do not allow searching for multiple words inside a document or htm page.

  2. Index Server is limited to certain platforms (usually more expensive operating systems such as NT or Windows 2000 Enterprise).  And Index Server is dubious with CD Roms (?)

  3. When issuing info on CD Rom, use of a packager like Greenstone Library is useful and allows good and rapid searching, but it encapsulates the contents and makes modular usage and copying off problematic.

  4. If a CD Rom is issued only in htm format, then searching on multiple words becomes a problem (as in (1) above), although modular usage is simple.

  5. Some (many) organizations keep databases of people and/or CVs (resumes) - it is often useful to them to be able to do free text search, either as the main search medium, or as a backup search technique (e.g. if a new, non-classified search word or phrase becomes necessary).  Note that if free text search is the main search medium then there is no need for admin work to pre-categorise CVs - they can be dropped into a directory or directories as .doc or .htm files.

 

Present products

 

  1. AltaVista used to have a free product called Discover or Discovery (or something like that).  This has now been replaced by expensive chargeable products.

  2. Freeware and shareware sites (see http://www.cd3wd.com/ShareWareCD for a listing of some of these sites) have some products which I last reviewed, downloaded and tested in August 2001, and found quite far from the requirements as I saw them.

  3. Ask Sam is possibly the most famous text search utility outside Microsoft Index Server.

 

 

Specification

 

  1. Something which allows AND and OR command searching.

  2. The results are presented as a web page with links, from which the docs or htm's can be immediately accessed.

  3. That access is either by existing browser window or by pop-up window, at least so that the Find On This Page command can be used to locate exactly one or several of the search keywords.

  4. Something which allows up to at least 3 and maybe up to 8 AND combinations of keywords.

  5. Operation must be rapid.

  6. Results should also if possible be able to be copied and pasted into other more permanent and/or more usable index pages.

  7. The product should be freeware or shareware.

  8. It is possible that the search on multiple keywords should operate on a proximity basis - e.g. be within a range of +/- 2 lines of each other in the doc or htm.  This should be a selectable / toggleable feature, and the number of lines should also be specifiable.

  9. The drive and directory to be searched should be saveable in a config or .ini file, but of course be modifiable (and re-saveable) on each search.

  10. Possibly up to 3 drive+directory combinations should be searchable on each search.

  11. One should be able to search on .doc or .htm or (.doc and .htm).

  12. Possibly additional user-specified formats should be included, such as .txt files etc.

  13. Ideally UNC addresses as well as normal local and/or mapped drives should be searchable.

  14. Possibly a database of word frequencies should be included, so that the fastest searching can take place and so that a zero-return search can be indicated without any search having to take place.
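The proximity idea in the specification (keywords within +/- N lines of each other) could be sketched in Python as below. This is an illustration of a proposed feature, not something the product is stated to implement; the function name and the precise definition of "near" (all keywords inside some block of 2 x window + 1 consecutive lines) are assumptions.

```python
from typing import Sequence


def within_proximity(lines: Sequence[str], phrases: Sequence[str], window: int = 2) -> bool:
    """True if some block of lines centred on one line holds every phrase.

    The block runs from `window` lines before the centre line to `window`
    lines after it; matching is case-insensitive.
    """
    needles = [p.lower() for p in phrases]
    lowered = [line.lower() for line in lines]
    for centre in range(len(lowered)):
        block = lowered[max(0, centre - window): centre + window + 1]
        if all(any(n in line for line in block) for n in needles):
            return True
    return False
```

Making `window` a parameter corresponds to the requirement that the number of lines be specifiable.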

 

 

Proposed Methodology

 

  1. Use of the standard windows FINDSTR.EXE command with a Visual Basic GUI wrapper.

 

Potential Users

 

  1. Academics doing self-publishing of teaching and other materials on CD Rom and/or DVD, computer recruitment agencies, other recruitment agencies, and large companies for personnel records and application letters/CVs.  Production and marketing companies issuing product CD Roms or DVDs.  Etc.

 

Alex Weir

 

Acknowledgements:

 

Martin Parkes (currently on assignment in China), VITA USA (http://www.vita.org) and Michael Loots of Humaninfo Belgium (http://www.humaninfo.org) have all been instrumental (sometimes without knowing) in convincing me that there is a need for something like WFindStr.  Klaus Stelzl from Munich, Germany provided the VB Shelling code - thanks.  Matthias Heuer of Cambridge Technology Partners, Frankfurt, Germany drew my attention to language-specific issues - thanks.  Verner Jensen of Danagro Copenhagen drew my attention to AltaVista Discover - thanks.  Ian Mitchell suggested I try some fuzzy logic - thanks Ian.  Thanks to http://www.anytimenow.com for running a free download site which seems to meet the requirements of programmers like me - there seem to be a lot of possible file download sites and free sites out there, but most have difficult-to-discover and apparently stupid restrictions which make them unusable.

 

PS - I am a commercial programmer, specializing in Visual Basic (VB), VB.Net, Visual Studio .NET, SQL Server, Oracle, Access etc. - Client Server and Web Database solutions.  I freelance and work for anyone anywhere - email me if interested, and/or view my Resume at http://www.cd3wd.com/resume/resume.htm.  I am also interested in building other Systems which are useful to Mankind - so little IT work has any socially beneficial impact - and such Systems I am willing to do free of charge - contact me.