SWISH++ Changes
===============

*******************************************************************************
6.1.5
*******************************************************************************

BUG FIXES
---------

* mod/html/elements.c
  1. s/int const/long const/ for 64-bit systems.

* version.h
  1. Upped version to 6.1.5.

*******************************************************************************
6.1.4
*******************************************************************************

BUG FIXES
---------

* The definition of MAKEDEPEND was wrong. This broke dependency-file
  generation when not using g++.
  (This bug fix shall be known as bug fix MD1.)

* Indexing of some ID3 tags was broken.
  (This bug fix shall be known as bug fix ID31.)

CHANGES, file-by-file
---------------------

* config/config.mk
  1. For MAKEDEPEND, s/:=/=/ for bug fix MD1.

* mod/id3/id3v2.c
  1. In parse_int(), added cast to unsigned char for bug fix ID31.

* version.h
  1. Upped version to 6.1.4.

*******************************************************************************
6.1.3
*******************************************************************************

BUG FIXES
---------

* The search(1) -d option didn't work.
  (This bug fix shall be known as bug fix SEARCHd.)

* Fixed a mistake in the httpindex.1 manual page that showed incorrect use of
  multiple -e options.
  (This bug fix shall be known as bug fix HTTP1.)

CHANGES, file-by-file
---------------------

* man/man1/httpindex.1
  1. Fixed the aforementioned mistake.

* classic_formatter.c
* classic_formatter.h
* encoded_char.c
* encoded_char.h
* indexer.c
* indexer.h
* results_formatter.c
* results_formatter.h
* xml_formatter.c
* xml_formatter.h
  1. Added virtual destructors.

* search.c
  1. In main(), added check of opt.dump_word_index_opt for bug fix SEARCHd.

* version.h
  1. Upped version to 6.1.3.
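
The "Added virtual destructors" change in 6.1.3 above is not explained further. As a general illustration of why a base class that is deleted through a base pointer needs one, here is a minimal sketch. The class names formatter_base and xml_like_formatter are hypothetical stand-ins, not the actual SWISH++ classes.

```cpp
// Hypothetical stand-ins; these are NOT the real SWISH++ formatter classes.
struct formatter_base {
    virtual ~formatter_base() { }          // virtual: derived destructor runs too
    virtual char const* name() const = 0;
};

struct xml_like_formatter : formatter_base {
    explicit xml_like_formatter( bool &flag ) : destroyed_( flag ) { }
    ~xml_like_formatter() { destroyed_ = true; }
    char const* name() const { return "xml"; }
private:
    bool &destroyed_;                      // set when the derived destructor runs
};

// Destroys a derived object through a base-class pointer and reports whether
// the derived destructor was actually called.
bool derived_destructor_runs() {
    bool destroyed = false;
    formatter_base *f = new xml_like_formatter( destroyed );
    delete f;                              // well-defined only because ~formatter_base is virtual
    return destroyed;
}
```

Without the virtual keyword on the base destructor, the delete above would have undefined behavior, and in practice the derived destructor would typically be skipped.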
*******************************************************************************
6.1.2
*******************************************************************************

BUG FIXES
---------

* A LaTeX file ending in a '%' (with no newline) would result in a core dump.
  (This bug fix shall be known as bug fix LATEX1.)

CHANGES, file-by-file
---------------------

* mod/latex/mod_latex.c
  1. Reworked handling of '%' in index_words() for bug fix LATEX1.

* version.h
  1. Upped version to 6.1.2.

*******************************************************************************
6.1.1
*******************************************************************************

BUG FIXES
---------

* Fixed compilation on non-MacOSX systems.
  (This bug fix shall be known as bug fix MAC1.)

CHANGES, file-by-file
---------------------

* search.c
  1. Moved ';' in usage message for bug fix MAC1.

* version.h
  1. Upped version to 6.1.1.

*******************************************************************************
6.1
*******************************************************************************

NEW FEATURES
------------

* Made search(1) cooperate with Mac OS X's launchd(8).
  (This feature shall be known as feature LAUNCHD.)

CHANGES, file-by-file
---------------------

* conf_var.h
  1. Added "launchdcooperation" for feature LAUNCHD.

* FollowLinks.h
  1. Added missing:
         #ifndef PJL_NO_SYMBOLIC_LINKS

* INSTALL.unix
  1. Added separate daemon section.

* man/man1/search.1
  1. Added mention of new -l/--launchd options for feature LAUNCHD.

* man/man4/swish++.conf.4
  1. Added mention of new LaunchdCooperation configuration variable for
     feature LAUNCHD.

* Group.h
* SearchBackground.h
  1. Added missing:
         #ifdef SEARCH_DAEMON

* LaunchdCooperation.h
  1. New file for feature LAUNCHD.

* search.c
  1. Added launchd_cooperation global variable.
  2. Made specification of launchd cooperation force not going into the
     background and not setting resource limits.
  3. Added -l case.
  4. Added -l/--launchd to usage message.

* search.h
  1. Added launchd_opt.

* search_daemon.c
  1. Added checking of launchd_cooperation variable.

* search_options.c
  1. Added "launchd".

* swish++.conf
  1. Added new LaunchdCooperation configuration variable for feature LAUNCHD.

* User.h
  1. Added missing:
         #ifdef SEARCH_DAEMON
  2. Added comment about -U.

* version.h
  1. Upped version to 6.1.

*******************************************************************************
6.0.6
*******************************************************************************

BUG FIXES
---------

* The "fix" in 6.0.4 for queries containing meta-names broke another form of
  using meta-names, e.g.:

      some_meta=(word1 word2)

  that *should* be equivalent to:

      some_meta=word1 some_meta=word2

  but wasn't.
  (This bug fix shall be known as bug fix MN2.)

* The installation and removal of SYSV start/stop script symlinks was totally
  wrong.
  (This bug fix shall be known as bug fix SYSV.)

* Some words with hyphens in manual pages weren't indexed correctly.
  (This bug fix shall be known as bug fix MHY.)

CHANGES, file-by-file
---------------------

* config/src/explicit.c
* config/src/namespaces.c
* fake_ansi.h
  1. Removed since they're no longer needed.

* GNUmakefile
  1. Reworked creation of SYSV start/stop symlinks for bug fix SYSV.

* mod/mail/mod_mail.c
* mod/mail/multipart.c
* mod/mail/vcard.c
  1. Split handling of multipart and vCard mail into its own files.

* mod/man/mod_man.c
  1. Changed to using "normal" way of using iso8859_1_to_ascii() for bug fix
     MHY.

* query.c
  1. Completely reworked the way query parameters are passed for bug fix MN2.

* version.h
  1. Upped version to 6.0.6.

*******************************************************************************
6.0.5
*******************************************************************************

BUG FIXES
---------

* Decoding of mail attachments didn't work right if the attachments contained
  bytes whose value > 127.
  (This bug fix shall be known as bug fix ATB.)
CHANGES, file-by-file
---------------------

* conf_var.c
* do_file.c
* mmap_file.[ch]
  1. s/normal/bt_normal/
  2. s/random/bt_random/
  3. s/sequential/bt_sequential/

* charsets/utf7.c
* encodings/base64.c
  1. Removed call to iso8859_1_to_ascii() for bug fix ATB.

* indexer.c
  1. s/No_Meta_ID/Meta_ID_None/
  2. Removed new_file().
  3. Added call to iso8859_1_to_ascii() for bug fix ATB.

* indexer.h
  1. s/No_Meta_ID/Meta_ID_None/
  2. Removed new_file().

* index_segment.h
* search.c
  1. s/word_index/isi_word/
  2. s/stop_word_index/isi_stop_word/
  3. s/dir_index/isi_dir/
  4. s/file_index/isi_file/
  5. s/meta_name_index/isi_meta_name/

* mod/html/elements.h
  1. s/forbidden/et_forbidden/
  2. s/optional/et_optional/
  3. s/required/et_required/

* mod/html/mod_html.c
  1. s/No_Meta_ID/Meta_ID_None/
  2. Moved element_stack_ here.
  3. Reworked new_file().

* mod/html/mod_html.h
  1. s/No_Meta_ID/Meta_ID_None/
  2. Moved element_stack_ to .c file.

* mod/id3/id3v2.h
  1. s/Failure/hr_failure/
  2. s/Success/hr_success/
  3. s/End_of_Frames/hr_end_of_frames/

* mod/id3/id3v2.c
* mod/id3/mod_id3.c
* mod/id3/mod_id3.h
  1. s/No_Meta_ID/Meta_ID_None/

* mod/mail/mod_mail.c
  1. s/No_Meta_ID/Meta_ID_None/
  2. Moved stack_type, boundary_stack, and did_last_header here.
  3. Renamed content_type enum values.
  4. Reworked new_file().

* mod/mail/mod_mail.h
  1. s/No_Meta_ID/Meta_ID_None/
  2. Moved stack_type, boundary_stack, and did_last_header to .c file.
  3. Renamed content_type enum values.

* mod/latex/mod_latex.c
* mod/man/mod_man.c
* mod/rtf/mod_rtf.c
  1. Added call to iso8859_1_to_ascii() for bug fix ATB.
  2. s/No_Meta_ID/Meta_ID_None/

* query.c
  1. s/no_token/tt_none/
  2. s/and_token/tt_and/
  3. s/equal_token/tt_equal/
  4. s/lparen_token/tt_lparen/
  5. s/near_token/tt_near/
  6. s/not_near_token/tt_not_near/
  7. s/not_token/tt_not/
  8. s/or_token/tt_or/
  9. s/rparen_token/tt_rparen/
  A. s/word_star_token/tt_word_star/
  B. s/word_token/tt_word/
  C. s/No_Meta_ID/Meta_ID_None/

* query_node.c
* mod/man/mod_man.h
* mod/rtf/mod_rtf.h
  1. s/No_Meta_ID/Meta_ID_None/

* stem_word.c
  1. s/initial/st_initial/
  2. s/vowel/st_vowel/
  3. s/consonant/st_consonant/

* stop_words.c
  1. s/stop_word_index/isi_stop_word/

* token.[ch]
  1. s/no_token/tt_none/
  2. s/and_token/tt_and/
  3. s/equal_token/tt_equal/
  4. s/lparen_token/tt_lparen/
  5. s/near_token/tt_near/
  6. s/not_near_token/tt_not_near/
  7. s/not_token/tt_not/
  8. s/or_token/tt_or/
  9. s/rparen_token/tt_rparen/
  A. s/word_star_token/tt_word_star/
  B. s/word_token/tt_word/

* word_info.h
  1. s/No_Meta_ID/Meta_ID_None/

* version.h
  1. Upped version.

*******************************************************************************
6.0.4
*******************************************************************************

BUG FIXES
---------

* Queries containing meta-names didn't work right if the meta-name was given
  in any position other than last.
  (This bug fix shall be known as bug fix MN1.)

CHANGES, file-by-file
---------------------

* query.c
  1. In parse_meta(), added: "args.meta_id = No_Meta_ID" for bug fix MN1.

* version.h
  1. Changed to "6.0.4".

*******************************************************************************
6.0.3
*******************************************************************************

BUG FIXES
---------

* The calculation of word deltas was wrong.
  (This bug fix shall be known as bug fix CWD.)

CHANGES, file-by-file
---------------------

* index.c
  1. Removed no-class and dump-html options. (They should have been removed a
     long time ago because module-specific options were moved to the modules
     themselves.)

* indexer.h
  1. In resume_indexing(), added "if" statement.

* INSTALL.unix
  1. Changed minimum supported g++ compiler to the 3.x series, i.e., 2.95.x
     and earlier are no longer supported.

* mod/html/mod_html.c
  1. In parse_html_tag(), now additionally skipping XML processing
     instructions.

* word_info.c
* word_info.h
  1. Added last_absolute_word_pos_ for bug fix CWD.

* version.h
  1. Changed to "6.0.3".
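
Bug fix CWD in 6.0.3 above says only that word deltas were miscalculated and that last_absolute_word_pos_ was added. The sketch below illustrates the general delta-encoding idea under the assumption that each word position is stored as its difference from the previous absolute position. The functions to_deltas() and from_deltas() are hypothetical, and the actual SWISH++ index encoding is not described in this file.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: turns ascending absolute word positions into deltas.
// Assumption: each delta is relative to the previous ABSOLUTE position.
std::vector<int> to_deltas( std::vector<int> const &positions ) {
    std::vector<int> deltas;
    int last_absolute_pos = 0;
    for ( std::size_t i = 0; i < positions.size(); ++i ) {
        deltas.push_back( positions[i] - last_absolute_pos );
        last_absolute_pos = positions[i];  // remember the absolute value, not the delta
    }
    return deltas;
}

// Hypothetical inverse: rebuilds absolute positions by accumulating deltas.
std::vector<int> from_deltas( std::vector<int> const &deltas ) {
    std::vector<int> positions;
    int pos = 0;
    for ( std::size_t i = 0; i < deltas.size(); ++i ) {
        pos += deltas[i];
        positions.push_back( pos );
    }
    return positions;
}
```

Tracking the last absolute position separately matters because subtracting from the previous delta instead would give wrong values from the third position onward.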
*******************************************************************************
6.0.2
*******************************************************************************

BUG FIXES
---------

* error_string() in util.h failed to compile using g++ 3.4.1.
  (This bug fix shall be known as bug fix G341.)

CHANGES, file-by-file
---------------------

* Group.c
* SocketAddress.c
* WordThreshold.c
  1. Removed config variable name from error message.

* conf_var.c
* search.c
* stop_words.c
* thread_pool.c
  1. Switched to using error_string.

* index.c
  1. Switched to using error_string.
  2. Added: max_out_limit( RLIMIT_FSIZE );

* man/man4/swish++.conf.4
  1. Added missing ID3 and LaTeX modules.

* mod/html/mod_html.c
  1. Made entity_to_ascii() and find_attribute() static.

* util.h
  1. Reworked error_string() for bug fix G341.

* version.h
  1. Changed to "6.0.2".

*******************************************************************************
6.0.1
*******************************************************************************

BUG FIXES
---------

* Changes to make it compile with g++ 3.4.0 which purports to be much more
  standards-conforming.
  (This bug fix shall be known as bug fix G34.)

CHANGES, file-by-file
---------------------

* my_set.h
  1. s/end()/this->end()/ for bug fix G34.

* pattern_map.h
  1. s/begin()/this->begin()/, s/end()/this->end()/ for bug fix G34.
  2. Made __SUNPRO_CC section the only one for bug fix G34.

* version.h
  1. Changed to "6.0.1".

*******************************************************************************
6.0
*******************************************************************************

NEW FEATURES
------------

* Added the ability to search using "near." The downside is that
  word-position data must be stored for every word. This approximately
  doubles the size of the generated indices.
  (This feature shall be known as feature NS.)

BUG FIXES
---------

* file_list::const_iterator didn't work right for copy construction or
  assignment.
  (This bug fix shall be known as bug fix FLCI.)

OTHER CHANGES
-------------

* Reworked the thread_pool/thread code to use thread-local data. It makes the
  code much simpler.

* Moved scripts (*.in files) to new scripts directory.

* Added a makedepend.pl script to make dependencies when g++ is not being
  used, e.g. CC on Solaris.

* Made lots of other changes to get SWISH++ to compile using Sun's CC.

CHANGES, file-by-file
---------------------

* charsets/GNUmakefile
  1. s/$(AR) rv/$(AR)/ for Sun's CC.
  2. Added: $(TEMPLATE_REPOSITORY) for Sun's CC.

* config/config.mk
  1. Added FEATURE_LIST, FEATURE_DEFS, and word_pos for feature NS.
  2. Now no longer doing -fno-rtti when the word_pos feature is being
     compiled in for feature NS.
  3. Added MAKEDEPEND variable.
  4. Reworked OS definition; eliminated $(OS).
  5. Added AR definition.
  6. Added TEMPLATE_REPOSITORY for Sun's CC.

* config/makedepend.pl
  1. New file.

* config/mod.mk
* charsets/GNUmakefile
* encodings/GNUmakefile
  1. s/include/-include/ to silence "no such file or directory" messages.

* config.h
  1. Made comments for ShellFilenameDelimChars and ShellFilenameEscapeChars
     clear that they are for FILE (not path) names.
  2. Added WordsNear_Default for feature NS.
  3. s/WIN32/__CYGWIN__/

* conf_var.c
  1. Removed \n from internal_error.
  2. In conf_var::map_ref(), added storewordpositions and wordsnear for
     feature NS.

* directory.c
  1. s/WIN32/__CYGWIN__/

* do_file.c
  1. Added code to reset word_pos for feature NS.

* encoded_char.h
  1. On line 106: s/value_type const/encoded_char_range::value_type const/ to
     make Sun's CC happy.

* encodings/GNUmakefile
  1. s/$(AR) rv/$(AR)/ for Sun's CC.
  2. Added: $(TEMPLATE_REPOSITORY) for Sun's CC.

* exit_codes.h
  1. Added Exit_No_Create_Thread_Key.
  2. Added Exit_No_Word_Pos_Data for feature NS.

* extract.c
  1. s/ctime/time.h/ for Sun's CC.

* file_list.c
  1. Added definition of file_list::const_iterator::end_value for bug fix
     FLCI.
  2. In operator++(), now using end_value for bug fix FLCI.
  3. Added code to accumulate word position data for feature NS.

* file_list.h
  1. Made temp object in operator++(int) const.
  2. Added static end_value for bug fix FLCI.

* filter.c
  1. Got rid of newlines in error messages.

* GNUmakefile
  1. Added: DEBUG_eval_query
  2. Added query_node.c to S_SOURCES for feature NS.
  3. s/include/-include/ to silence "no such file or directory" messages.
  4. Added: $(TEMPLATE_REPOSITORY) for Sun's CC.

* IncludeMeta.c
  1. Added const_cast() around strchr() for Sun's broken CC strchr()
     declaration.
  2. s/char const *const m/char *const m/ for Sun's CC compiler.

* index.c
  1. Added #include "StoreWordPositions.h" for feature NS.
  2. s/word_file_max/word_files_max/
  3. Added store_word_positions and word_pos for feature NS.
  4. Added -P option for feature NS.
  5. In merge_indicies() and write_word_index(), added call to
     write_word_pos() for feature NS.
  6. s/ctime/time.h/ for Sun's CC.

* indexer.c
  1. Added #include "StoreWordPositions.h" for feature NS.
  2. Added "extern int word_pos" for feature NS.
  3. Removed \n from internal_error.
  4. In indexer::index_word(), added "++word_pos" for feature NS.
  5. In indexer::index_word(), added code to store word position data for
     feature NS.

* INSTALL.unix
* INSTALL.win32
  1. Removed note about "no such file or directory" warnings.

* man/man1/index.1
  1. In the description of the Mail module, item #7, added mention of -A
     option.
  2. For the -A option, elaborated description.
  3. Added description of the -P option for feature NS.
  4. Added mention of StoreWordPositions variable for feature NS.
  5. Added mention of WordsNear variable for feature NS.

* man/man1/search.1
  1. Added description of "near" for feature NS.
  2. Reworked EXAMPLES section.
  3. Added: Could not create thread key.
  4. Added: Attempted "near" search without word-position data.

* man/man4/swish++.conf.4
  1. Added StoreWordPositions for feature NS.
  2. Added WordsNear for feature NS.

* man/man4/swish++.index.4
  1. Added Word-position list for feature NS.

* mmap_file.c
  1. s/MacOSX/__APPLE__/
  2. s/ctime/time.h/ for Sun's CC.

* mmap_file.h
  1. Removed trailing ',' from behavior_type.
  2. Added "#ifndef __SUNPRO_CC" for problem with Sun's CC compiler.

* mod/mod_id3/mod_id3.h
  1. Removed trailing ',' from enums.

* mod/mod_mail/mod_mail.h
  1. s/Multipart,/Multipart/
  2. Added:
         struct message_type;
         friend struct message_type;

* pattern_map.h
  1. s/WIN32/__CYGWIN__/
  2. Added #ifdef for Sun's CC compiler.

* query.c
  1. s/find_result/word_range/
  2. Added #include "query_node.h" for feature NS.
  3. Added parse_args struct for feature NS.
  4. Reworked parse functions to take parse_args for feature NS.
  5. Moved is_too_frequent() to query.h.
  6. Reworked parse functions to build query_nodes.
  7. Moved the code for perform_and() to query_node.c.
  8. Added assert_index_has_word_pos_data() for feature NS.

* query.h
  1. s/find_result/word_range/
  2. Moved is_too_frequent() here.

* query_node.[ch]
  1. New files for feature NS.

* search.c
  1. Added #include "WordsNear.h" for feature NS.
  2. s/word_file_max_arg/word_files_max_arg/
  3. Added -n option for feature NS.
  4. s/ctime/time.h/ for Sun's CC.
  5. Added #include "vector_adapter.h" for Sun's CC.
  6. s/search_result_type/search_result/
  7. In search(), added #ifdef __SUNPRO_CC since Sun's CC compiler and/or
     their STL implementation seems pretty broken.

* search.h
  1. s/word_file_max_arg/word_files_max_arg/
  2. Added words_near_arg for feature NS.

* search_daemon.c
  1. s/ctime/time.h/ for Sun's CC.

* search_options.c
  1. Added "-n" for feature NS.

* search_thread.c
  1. s/ctime/time.h/ for Sun's CC.

* simple_pool.h
* StoreWordPositions.h
  1. New file for feature NS.

* swish++.conf
  1. Removed -f and -p options from search(1).
  2. Added StoreWordPositions for feature NS.
  3. Added WordsNear for feature NS.

* thread_pool.c
  1. Added thread_pool::thread::thread_obj_key_.
  2. s/thread_pool_thread_cleanup/thread_pool_thread_data_cleanup/
  3. Added code to deal with thread-specific data.
  4. Added thread_pool_thread_once().
  5. s/destructing_/in_cleanup_/
  6. s/ctime/time.h/ for Sun's CC.
  7. s/start_function_type/thread_start_function_type/ for Sun's CC.

* thread_pool.h
  1. Removed thread_pool_thread_cleanup().
  2. Added thread_pool_thread_data_cleanup().
  3. Added thread_pool_thread_once().
  4. Removed thread_pool::thread::operator delete().
  5. s/destructing_/in_cleanup_/
  6. Removed thread_pool::thread::thread_.
  7. Added thread_pool::thread::thread_obj_key_.
  8. s/start_function_type/thread_start_function_type/ for Sun's CC.

* token.c
  1. Added "near" for feature NS.
  2. Got rid of #include for Sun's CC.
  3. s/transform/to_lower/ for Sun's CC.

* token.h
  1. Added near_token and not_near_token for feature NS.

* util.c
  1. Added: to_lower( char*, char const* )

* util.h
  1. Added: pjl_abs() for feature NS.
  2. Removed \n from internal_error.
  3. Added FOR_EACH_IN_PAIR.
  4. Added: to_lower( char*, char const* )
  5. s/ctime/time.h/ for Sun's CC.

* vector_adapter.h
  1. New file for Sun's CC.

* version.h
  1. Changed to "6.0".

* word_info.c
  1. Moved word_info::file constructors here.
  2. Added write_word_pos() for feature NS.

* word_info.h
  1. Added file::has_meta_id().
  2. Added word position data for feature NS.

* word_markers.h
  1. Added Word_Pos_List_Marker for feature NS.

* WordsNear.h
  1. New file for feature NS.

*******************************************************************************
5.15.4
*******************************************************************************

BUG FIXES
---------

* extract(1) using stdin was broken.
  (This bug fix shall be known as bug fix ESI2.)

CHANGES, file-by-file
---------------------

* extract.c
  1. In main() in the code for processing stdin, s/*argv/file_name/ for bug
     fix ESI2.

* version.h
  1. Changed to "5.15.4".

*******************************************************************************
5.15.3
*******************************************************************************

BUG FIXES
---------

* Fixed a bug in the code that merges partial indices.
  (Hopefully, this was the last bug introduced as a result of the new index
  file format.)
  (This bug fix shall be known as bug fix NMF.)

CHANGES, file-by-file
---------------------

* index.c
  1. In merge_indicies(), moved declaration of "continues" out of the loop
     for bug fix NMF.

* version.h
  1. Changed to "5.15.3".

*******************************************************************************
5.15.2
*******************************************************************************

BUG FIXES
---------

* Search results didn't include the last one, i.e., if there were N results,
  only N-1 were returned.
  (This bug fix shall be known as bug fix RM1.)

CHANGES, file-by-file
---------------------

* file_list.h
  1. Added: typedef unsigned char byte;
  2. s/unsigned char/byte/

* file_list.c
  1. s/unsigned char/byte/
  2. In calc_size(), fixed list-skipping code.
  3. In operator++(), added "sentinel" code for bug fix RM1.

* SearchResults.xsd
  1. s/homepage.mac.com/www.pauljlucas.org/

* version.h
  1. Changed to "5.15.2".

* xml_formatter.c
  1. Changed xmlns location.

*******************************************************************************
5.15.1
*******************************************************************************

BUG FIXES
---------

* Fixed a bug in new file-format decoder.
  (This bug fix shall be known as bug fix FFD1.)

CHANGES, file-by-file
---------------------

* file_list.c
  1. In calc_size(), removed incorrect ++p for bug fix FFD1.

* version.h
  1. Changed to "5.15.1".

*******************************************************************************
5.15
*******************************************************************************

NEW FEATURES
------------

* Numbers stored in the generated index file are now more highly compressed
  resulting in an overall average savings of approximately 6% in index size.
  (This feature shall be known as feature HCI.)

BUG FIXES
---------

* The call to change the behavior of mmap(2) was in the wrong place.
  (This bug fix shall be known as bug fix MMB2.)

CHANGES, file-by-file
---------------------

* bcd.h
* bcd.c
  1. Replaced by enc_int.[hc], respectively, for feature HCI.

* conf_var.c
  1. Moved call to change behavior of mmap(2) for bug fix MMB2.

* file_info.c
  1. s/bcd.h/enc_int.h/ for feature HCI.
  2. s/parse_bcd/dec_int/ for feature HCI.

* file_list.c
  1. s/bcd.c/enc_int.c/ for feature HCI.
  2. Reworked calc_size() and operator++() for new index file format for
     feature HCI.

* GNUmakefile
  1. s/bcd.c/enc_int.c/ for feature HCI.

* index.c
  1. s/bcd.c/enc_int.c/ for feature HCI.
  2. Added: #include "word_markers.h" for feature HCI.
  3. s/parse_bcd/dec_int/ for feature HCI.
  4. Reworked merge_indicies() and write_word_index() for new file format for
     feature HCI.

* query.c
  1. s/bcd.h/enc_int.h/ for feature HCI.
  2. s/parse_bcd/dec_int/ for feature HCI.

* search.c
  1. Removed unneeded #include "bcd.h"

* word_info.c
  1. s/bcd.h/enc_int.h/ for feature HCI.
  2. Added: #include "word_markers.h" for feature HCI.
  3. Now using Meta_Name_List_Marker and Stop_Marker for feature HCI.

* word_markers.h
  1. New file for feature HCI.

* version.h
  1. Changed to "5.15".

*******************************************************************************
5.14.2
*******************************************************************************

BUG FIXES
---------

* When filenames containing shell meta-characters were passed to the shell
  for filtering, they weren't escaped.
  (This bug fix shall be known as bug fix SMC.)

* For extract(1), added check to ensure that the extracted file name's length
  does not exceed PATH_MAX.
  (This bug fix shall be known as bug fix EEL.)

* The call to change the behavior of mmap(2) was in the wrong place.
  (This bug fix shall be known as bug fix MMB.)

CHANGES, file-by-file
---------------------

* config.h
  1. Updated Word_Max_Consec_Consonants to 7.
  2. Updated Word_Max_Consec_Vowels to 5.
  3. Added ShellFilenameDelimChars and ShellFilenameEscapeChars for bug fix
     SMC.

* conf_string.h
  1. Added length() and size().

* do_file.c
  1. Added check against PATH_MAX for bug fix EEL.
  2. Moved call to change behavior of mmap(2) for bug fix MMB.

* filter.c
  1. Added escape_filename() and unescape_filename() for bug fix SMC.
  2. Changed substitute() to use escaped filename for bug fix SMC.

* man/man1/index.1
  1. s/subdiretories/subdirectories/

* man/man1/search.1
  1. Added missing mention of exit codes 68 and 69.

* searchc.in
  1. Added -F option.

* version.h
  1. Changed to "5.14.2".

*******************************************************************************
5.14.1
*******************************************************************************

BUG FIXES
---------

* Fixed a small error when compiling under Solaris.
  (This bug fix shall be known as bug fix MADV1.)

* The version string in 5.14 wasn't updated.

CHANGES, file-by-file
---------------------

* mmap_file.c
  1. Added cast to caddr_t in madvise() call for bug fix MADV1.

* version.h
  1. Changed to "5.14.1".

*******************************************************************************
5.14
*******************************************************************************

NEW FEATURES
------------

* The searchd script now supports chkconfig.
  (This feature shall be known as feature CHK.)

BUG FIXES
---------

* Use of an iterator in the rank_full_index() function was improper.
  Apparently it doesn't matter under GCC/HP implementations of STL, but does
  under the .NET implementation.
  (This bug fix shall be known as bug fix INVIT.)

* The is_xxxxx() functions in util.h were doing sign-extension during
  char-to-int conversion (apparently only) under the .NET compiler. This has
  been fixed by using a proper cast.
  (This bug fix shall be known as bug fix CHCAST.)

CLARIFICATIONS
--------------

* The undocumented behavior of index(1) skipping all files that started with
  '.' has been changed to skip only the directory entries '.' and ".."; this
  has also been documented.
  (This clarification shall be known as clarification DOTS.)
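
Bug fix CHCAST above describes sign-extension when a char is converted to int before a character-classification call. A minimal sketch of the problem and the cast that avoids it follows. The function names are hypothetical and this is not the actual util.h code, only an illustration of the general technique.

```cpp
#include <cctype>

// Hypothetical wrapper, NOT the actual util.h is_xxxxx() code.
// The <cctype> functions require an argument representable as unsigned char
// (or EOF), so the char is converted before it is widened to int.
inline bool is_alnum_safe( char c ) {
    return std::isalnum( static_cast<unsigned char>( c ) ) != 0;
}

// The value that reaches the classification function WITHOUT the cast: where
// plain char is signed, bytes above 127 become negative ints.
inline int widen_without_cast( char c ) {
    return c;
}

// The value that reaches it WITH the cast: always in the range 0..255.
inline int widen_with_cast( char c ) {
    return static_cast<unsigned char>( c );
}
```

Whether plain char is signed is implementation-defined, which is consistent with the note above that the problem showed up under only one compiler.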
CHANGES, file-by-file
---------------------

* directory.c
  1. In do_directory(), changed code for DOTS.

* index.c
  1. In rank_full_index(), fixed the handling of the 'w' iterator for bug fix
     INVIT.

* INSTALL.unix
  1. Updated gcc information.

* man/man1/index.1
  1. For the ID3 module description, fixed references to "header": they
     should be "field."

* searchd.in
  1. Added chkconfig information for feature CHK.
  2. s!.echotmp!/tmp/.echotmp!

* util.h
  1. Added static_cast<>()s to the is_xxxxx() functions for bug fix CHCAST.

* version.h
  1. Changed to "5.14".

*******************************************************************************
5.13.5
*******************************************************************************

BUG FIXES
---------

* The top-level GNUmakefile didn't make the "etc" directory (I_ETC) if it
  didn't exist. (This was supposedly fixed in 5.8, but apparently not.)
  (This bug fix shall be known as bug fix I_ETC2.)

CHANGES, file-by-file
---------------------

* GNUmakefile
  1. Added $(I_ETC) as a target of the $(MKDIR) $@ line for bug fix I_ETC2.

* version.h
  1. Changed to "5.13.5".

*******************************************************************************
5.13.4
*******************************************************************************

BUG FIXES
---------

* For search(1), the --max-results option was missing the "max" part.
  (This bug fix shall be known as bug fix MRO.)

CHANGES, file-by-file
---------------------

* search_options.c
  1. s/results/max-results/ for bug fix MRO.

* version.h
  1. Changed to "5.13.4".

*******************************************************************************
5.13.3
*******************************************************************************

BUG FIXES
---------

* Fixed occasional segmentation fault in the Manual page indexing module.
  (This bug fix shall be known as bug fix MMC1.)

CHANGES, file-by-file
---------------------

* BUGS
  1. New file containing bug-reporting instructions.

* INSTALL.win32
  1. s!sources.redhat.com/cygwin!cygwin.com!
  2. Removed mention of Windows 95 -- it's dead.
  3. Added mention of Windows XP.

* man/man4/swish++.index.4
  1. Added (missing) mention of MP3 file titles.

* mod/man/mod_man.c
  1. In index_words(), s/while ( true )/while ( c != e.end_pos() )/ for bug
     fix MMC1.

* version.h
  1. Changed to "5.13.3".

*******************************************************************************
5.13.2
*******************************************************************************

BUG FIXES
---------

* Fixed threads linking problem.
  (This bug fix shall be known as bug fix TL1.)

CHANGES, file-by-file
---------------------

* GNUmakefile
  1. s/PTHREAD_LIB/PTHREAD_LINK/ for bug fix TL1.
  2. Added $(PTHREAD_LINK) to E_LINK for bug fix TL1.

* version.h
  1. Changed to "5.13.2".

*******************************************************************************
5.13.1
*******************************************************************************

BUG FIXES
---------

* Yet more BSD compilation fixes. (Why the hell can't BSD headers #include
  everything they need themselves?)
  (This bug fix shall be known as bug fix BSD6.)

CHANGES, file-by-file
---------------------

* util.h
  1. Added: #include for bug fix BSD6.

* version.h
  1. Changed to "5.13.1".

*******************************************************************************
5.13
*******************************************************************************

NEW FEATURES
------------

* Added an indexing module for ID3 tags (typically found inside MP3 files).
  ID3v1.x and ID3v2.x through 2.4.0 are supported (with the exception of
  encrypted frames).
  (This feature shall be known as feature ID3.)

* Since it was needed by feature ID3, decoding of UTF-16 text (both big- and
  little-endian) was added.
  (This feature shall be known as feature UTF16.)

BUG FIXES
---------

* If the Mail module was compiled without Base64 encoding compiled in,
  Base64-encoded content was indexed as plain text (which is wrong).
  It should be treated as Binary and not indexed at all.
  (This bug fix shall be known as bug fix MN64.)

* If the Mail module was compiled without the UTF-7 charset compiled in,
  UTF-7 text was indexed as plain text (which is wrong). It should be treated
  as Binary and not indexed at all.
  (This bug fix shall be known as bug fix MNU7.)

* httpindex didn't accept some index(1) options that it should have.
  (This bug fix shall be known as bug fix HIO.)

* httpindex could block if index(1) was generating partial indices.
  (This bug fix shall be known as bug fix HPI.)

CHANGES, file-by-file
---------------------

* charsets/charsets.h
  1. Added UNKNOWN_CHARSET for feature ID3.
  2. Added charset_utf16be() and charset_utf16le() for feature UTF16.

* charsets/utf16.c
  1. New file for feature UTF16.

* config/config-sh
  1. Added a target.mk argument.
  2. Now generating a target.mk file.
  3. Added DATE.
  4. s/TARGET/TARGET_H/
  5. Changed define() to emit for both targets.

* config/config.mk
  1. Added "id3" to MOD_LIST for feature ID3.
  2. Added "utf16" to CHARSET_LIST for feature ID3.
  3. Removed CHARSET_LIST and ENCODING_LIST from inside MOD_mail.
  4. Added "DECODING:= -DIMPLEMENT_DECODING" to simplify encoded_char.[ch].
  5. s/PTHREAD_LIB/PTHREAD_LINK/
  6. s/SOCKET_LIB/SOCKET_LINK/
  7. Added ZLIB_LINK for feature ID3.
  8. Added "$(DECODING)" to CCFLAGS.
  9. Added platform.mk to dependency line.

* config/GNUmakefile
  1. Added platform.mk to $(TARGET).
  2. s/$@/$(TARGET)/
  3. Removed .*.d

* config/src/zlib.c
  1. New file for feature ID3.

* do_file.c
  1. s/#ifdef MOD_mail/#ifdef IMPLEMENT_DECODING/ for feature ID3.
  2. Switched to using indexer::text_indexer().

* encoded_char.c
* encoded_char.h
  1. s/#ifdef MOD_mail/#ifdef IMPLEMENT_DECODING/ for feature ID3.

* encodings/encodings.h
  1. s/-1/~0/

* GNUmakefile
  1. Added include of platform.mk
  2. Added -DDEBUG_id3v2 for feature ID3.
  3. Changed the way CFLAGS was assigned.
  4. Added config to SUBDIRS.
  5. Added ifndef HAVE_ZLIB
  6. Added $(ZLIB_LINK) to I_LINK for feature ID3.
7. s/PTHREAD_LIB/PTHREAD_LINK/ 8. s/SOCKET_LIB/SOCKET_LINK/ 9. Removed platform.h from disclean. * httpindex.in 1. Added missing index(1) options for bug fix HIO. 2. Added code to read extra lines from index(1) for bug fix HPI. * indexer.c 1. Added text_indexer_. 2. In map_ref(), added assignment to text_indexer_. * indexer.h 1. Made find_meta() public. 2. Made index_word() public. 3. Added text_indexer(). 4. Added text_indexer_. * man/man1/index.1 1. Added description of ID3 module for feature ID3. * mod/id3/GNUmakefile * mod/id3/id3v1.c * mod/id3/id3v1.h * mod/id3/id3v2.c * mod/id3/id3v2.h * mod/id3/mod_id3.c * mod/id3/mod_id3.h 1. New files for feature ID3. * mod/html/mod_html.c * mod/latex/mod_latex.c * mod/mail/mod_mail.c * mod/man/mod_man.c 1. Switched to using move_if_match(). * mod/mail/mod_mail.c 1. Made indexing treat Base64 as Binary if its encoding wasn't compiled in for bug fix MN64. 2. Made indexing skip UTF-7 and UTF-8 if their respective character set code wasn't compiled in for bug fix MNU7. 3. In index_headers(), Removed module #ifdef's since they weren't really needed. * mod/mail/mod_mail.h 1. Removed module #ifdef's since they weren't really needed. * README 1. Aded blurb about ID3 for feature ID3. * swish++.conf 1. Added IncludeMeta's for ID3v2 tag fields for feature ID3. 2. Added "IncludeFule ID3 *.mp3" for feature ID3. * util.h 1. Added NUM_ELEMENTS(). * word_util.c * word_util.h 1. Added move_if_match(). * www_example 1. Added "--" at end of search options to close security hole. * version.h 1. Changed to "5.13." ******************************************************************************* 5.12.1 ******************************************************************************* BUG FIXES --------- * Some ranks returned were negative. (This became broken in 5.11.) (This bug fix shall be known as bug fix FNR.) * The version number for 5.12 wasn't updated in the code from 5.11.1. CHANGES, file-by-file --------------------- * word_info.h 1. 
Changed occurrences_ and rank_ from short to int for bug fix FNR. * version.h 1. Changed to "5.12.1." ******************************************************************************* 5.12 ******************************************************************************* NEW FEATURES ------------ * WordThreshold may now be set either in a config. file or on the command line (as opposed to being only a compiled-in constant). However, only the super-user may specify a value larger than the default. (This feature shall be known as feature SWT.) CHANGES, file-by-file --------------------- * config.h 1. s/Word_Threshold/WordThreshold_Default/ for feature SWT. * conf_var.c 1. Added wordthreshold for feature SWT. * do_file.c 1. s/Word_Threshold/word_threshold/ for feature SWT. * exit_codes.h 1. Added Exit_Not_Root for feature SWT. * GNUmakefile 1. Added WordThreshold.c to I_SOURCES for feature SWT. * index.c 1. Added: #include "WordThreshold.h" for feature SWT. 2. Added word_threshold for feature SWT. 3. In main(), added "word-threshold" and -W options for feature SWT. 4. In usage(), added -W and --word-threshold options for feature SWT. * man/man1/index.1 1. Added -W, --word-threshold, and WordThreshold for feature SWT. 2. Added exit code 13 for feature SWT. * man/man4/swish++.conf.4 1. Added WordThreshold for feature SWT. * mod/html/html.c 1. Moved over line of ':' to be in-line with those in index.c for feature SWT. * swish++.conf 1. Added WordThreshold for feature SWT. * WordThreshold.c * WordThreshold.h 1. New files for feature SWT. * version.h 1. Upped version to "5.12". ******************************************************************************* 5.11.1 ******************************************************************************* BUG FIXES --------- * Invalid UTF-8 could send the indexer into an infinite loop. (This bug fix shall be known as bug fix IU8.) CHANGES, file-by-file --------------------- * charsets/utf8.c 1. 
Changed syncing to skip forward rather than back for bug fix IU8. 2. Added check for FE and FF bytes. * version.h 1. Upped version to "5.11.1". ******************************************************************************* 5.11 ******************************************************************************* NEW FEATURES ------------ * The ranking result numbers (1-100) have been significantly improved: they are much less striated now. (Simply increasing the scaling factor by 3 orders of magnitude did the trick.) (This feature shall be known as feature IRF.) CHANGES, file-by-file --------------------- * index.c 1. Added Rank_Factor constant. 2. Changed rank factor from 10000 to 10000000 for feature IRF. * man/man1/extract.1 1. Changed wording for -e to agree with index(1). * man/man1/index.1 1. Added (missing) documentation that -e and --pattern options can take multiple patterns separated by commas. * swish++.conf 1. Added *.png to ExcludeFile. 2. Changed Word filter to wvText and added URL for www.wvware.com. 3. Added (missing) LaTeX and Man IncludeFile lines. * version.h 1. Upped version to "5.11". ******************************************************************************* 5.10 ******************************************************************************* NEW FEATURES ------------ * Some XHTML 2.0 elements have been added to the HTML module. (This feature shall be known as feature XHTML2.) * Theoretically improved indexing performance by adding calls to madvise(2). (This feature shall be known as feature MADV.) BUG FIXES --------- * Specifying filename patterns for extract(1) was broken. (This bug fix shall be known as bug fix EFP.) * extract(1) was broken altogether since 6/16/2000. Apparently, very few people use it since nobody pointed it out. (This bug fix shall be known as bug fix ETB.) CHANGES, file-by-file --------------------- * config/config.mk 1. Added -D_BSD_SOURCE to LINUX definition. 2. Added MAC_OS_X as a separate OS. * config/src/madvise.c 1. 
New file for feature MADV. * conf_var.c 1. In parse_file(), added call to mmap_file::behavior(). * do_file.c 1. In do_file(), added call to mmap_file::behavior() for feature MADV. 2. In do_file(), added "out = &extracted_file;" for bug fix ETB. * extract.c 1. In main(), s/.insert( pat )/.insert( pat, 0 )/ for bug fix EFP. * ExtractFile.c 1. s/insert( new_strdup( s ) )/insert( new_strdup( s ), 0 )/ for bug fix EFP. * ExtractFile.h 1. Made derived from conf_var and pattern_map for bug fix EFP. * mmap_file.c * mmap_file.h 1. Added behavior() for feature MADV. * mod/html/elements.c 1. Added the h, line, name, nl, quote, and section elements for feature XHTML2. * search.c 1. In main(), added call to mmap_file::behavior(). * stop_words.c 1. Added call to mmap_file::behavior(). * version.h 1. Upped version to "5.10". ******************************************************************************* 5.9.6 ******************************************************************************* BUG FIXES --------- * Non-space whitespace characters were mistakenly turned into spaces. This has been this way since version 5.8 and, in theory, broke meta-names in e-mail files (although the old code seems to have worked, but I don't know how). (This bug fix shall be known as bug fix NSW.) * Recognition of LaTeX commands seems to have been completely broken. (This bug fix shall be known as bug fix LAC.) * Unknown LaTeX commands are now ignored (as they should have been) rather than indexed. (This bug fix shall be known as bug fix LAS.) CHANGES, file-by-file --------------------- * iso8859-1.c 1. Changed ' ' to '\t', '\n', '\v', '\f', '\r' for bug fix NSW. * mod/latex/mod_latex.c 1. Added find_left(). 2. In index_words(), added check for '\r'. 3. In parse_latex_command(), removed local scope. 4. In parse_latex_command(), s/!is_alnum/is_alnum/ for bug fix LAC. 5. In parse_latex_command(), added call to find_left(). 6. 
In parse_latex_command(), added code such that if a command is not found, skip it (for bug fix LAS). * version.h 1. Upped version to "5.9.6". ******************************************************************************* 5.9.5 ******************************************************************************* BUG FIXES --------- * For document sets that contain a lot of words (more than 2^31), the number of total words reported was negative due to signed integer overflow. The fix was to make the counter unsigned. (This bug fix shall be known as bug fix ULN.) CHANGES, file-by-file --------------------- * index.c * indexer.c 1. Made num_indexed_words, num_total_words, and num_unique_words unsigned long rather than just long for bug fix ULN. * version.h 1. Upped version to "5.9.5". ******************************************************************************* 5.9.4 ******************************************************************************* BUG FIXES --------- * There was a problem whereby search(1) would core-dump under FreeBSD. This problem surfaced a while ago, then disappeared, and now it's back again. This problem has finally been fixed (apparently). (This bug fix shall be known as bug fix END.) CHANGES, file-by-file --------------------- * GNUmakefile 1. Added ifdefs for CHARSET_LIST and ENCODING_LIST. * extract.c * index.c * mod/html/mod_html.c * mod/mail/mod_mail.c * mod/man/mod_man.c * query.c * search.c * search_daemon.c * search_thread.c 1. Made whatever can be declared static actually static. * stem_word.c 1. Made "end" static for bug fix END. 2. Made whatever else can be declared static actually static. * version.h 1. Upped version to "5.9.4". ******************************************************************************* 5.9.3 ******************************************************************************* BUG FIXES --------- * Under Windows, the printing of file names was slightly messed up. (This bug fix shall be known as bug fix WDS.) 
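Bug fix WDS comes down to not hard-coding '/' when a directory and a file name are joined for printing. A minimal sketch of the idea follows. The join_path() helper is hypothetical and the definition of Dir_Sep_Char is an assumption made for illustration (only the constant's name comes from this change log); neither is taken from the SWISH++ source.

```cpp
#include <string>

// Assumed definition, for illustration only: backslash on Windows,
// forward slash everywhere else.
#ifdef _WIN32
char const Dir_Sep_Char = '\\';
#else
char const Dir_Sep_Char = '/';
#endif

// Hypothetical helper (not part of SWISH++): joins a directory and a file
// name with the platform separator instead of a hard-coded '/'.
std::string join_path( std::string const &dir, std::string const &file ) {
    if ( dir.empty() )
        return file;                        // nothing to join
    if ( dir[ dir.size() - 1 ] == Dir_Sep_Char )
        return dir + file;                  // separator already present
    return dir + Dir_Sep_Char + file;
}
```

With a single constant, code that builds or prints paths produces the platform's own separator rather than a mix of '/' and '\'.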
CHANGES, file-by-file --------------------- * directory.c 1. In do_check_add_file(), s!'/'!Dir_Sep_Char! for bug fix WDS. * version.h 1. Upped version to "5.9.3". ******************************************************************************* 5.9.2 ******************************************************************************* BUG FIXES --------- * If the "mail" module wasn't selected for compilation, then overall compilation failed due to a missing #ifdef. (This bug fix shall be known as bug fix MMI.) CHANGES, file-by-file --------------------- * encoded_char.h 1. In encoded_char_range::const_iterator::const_iterator(), added missing "#ifdef MOD_mail" for bug fix MMI. * version.h 1. Upped version to "5.9.2". ******************************************************************************* 5.9.1 ******************************************************************************* BUG FIXES --------- * The feature of being able to do "not foo = bar" introduced in 5.8 was broken: right intent, wrong line of code. D'oh! (This bug fix shall be known as bug fix NMN1.) CHANGES, file-by-file --------------------- * query.c 1. In parse_primary() in lparen_token case, s/parse_meta/parse_query2/ for bug fix NMN1. 2. In parse_primary() in not_token case, s/parse_primary/parse_meta/ for bug fix NMN1. * version.h 1. Upped version to "5.9.1". ******************************************************************************* 5.9 ******************************************************************************* NEW FEATURES ------------ * Added XML schema information in search results XML output. (This feature shall be known as feature XMLS.) CHANGES, file-by-file --------------------- * man/man1/search.1 1. Fixed formatting for grammar. 2. Added LaTeX to set of files that can have titles. 3. Added XML schema information for feature XMLS. * man/man4/swish++.index.4 1. Added missing mention of LaTeX titles. * version.h 1. Upped version to "5.9". * SearchResults.xsd 1. New file for feature XMLS. 
* xml_formatter.c 1. Added SWISHPP_URI, SEARCH_RESULTS_DTD, SEARCH_RESULTS_NS_URI, SEARCH_RESULTS_XSD, and XML_SCHEMA_INSTANCE_URI for feature XMLS. 2. In pre(), put " 2. Added: #include "iso8859-1.h" 3. s/decoder_type/encoding_type/ 4. Added: encoded_char_range::charset_type for feature UTF. 5. Added charset_type to encoded_char_range and encoded_char_range::const_iterator constructors. 6. s/decoder_/encoding_/ 7. Added encoded_char_range::charset_ for feature UTF. 8. Added encoded_char_range::decoder class. 9. In encoded_char_range::const_iterator::decode(), added code to call charset decoder for feature UTF. A. In encoded_char_range::const_iterator::operator*(), added call to iso8859_1_to_ascii(). * encodings/GNUmakefile * encodings/base64.c * encodings/encodings.h * encodings/quoted_printable.c * encodings/README 1. New files. * extract.c 1. In extract_words(), removed call to iso8859_to_ascii(). * fdbuf.h 1. Added #ifdef PJL_GCC_2xx for streambuf vs. streambuf.h for bug fix GCC31. * GNUmakefile 1. Added $(I_ETC) as a target of the $(MKDIR) $@ line for bug fix I_ETC. 2. Added CHARSET_* and ENCODING_* variables for feature UTF. 3. Added CHARSET_LIB and ENCODING_LIB targets for feature UTF. 4. Added "charsets" and "encodings" to MAKE_SUBDIRS for feature UTF. 5. Added iso8859-1.c to I_SOURCES, S_SOURCES, and E_SOURCES. 6. Added $(CHARSET_LINK) and $(ENCODING_LINK) to I_LINK for feature UTF. 7. Added $(PTHREAD_LIB) to I_LINK for bug fix SOL_THR. 8. Added $(CHARSET_LIB) and $(ENCODING_LIB) to index for feature UTF. * indexer.c 1. In index_words(), removed call to iso8859_to_ascii(). * init_modules-sh * init_mod_vars-sh 1. Added -e 's/mod_/MOD_/' to sed lines. * INSTALL.unix 1. Added mention of minimum GNU make version for bug fix MAKE_VER. * iso8859-1.c * iso8859-1.h 1. Moved from word_util.[ch] * man/man1/index.1 1. For the Mail module, added mention of UTF-7 and UTF-8 for feature UTF. 2. 
Added a caveat about an e-mail message having a simultaneous encoding and character set. 3. Added Unicode references. * man/man1/search.1 1. Changed query grammar for feature NMN. * mmap_file.h 1. Replaced deprecated declarations for reverse_iterator and const_reverse_iterator for bug fix GCC31. * mod/html/mod_html.c 1. Added: #include "charsets/unicode.h" 2. In entity_to_ascii(), now using unicode_to_ascii(). 3. In index_words(), removed call to iso8859_to_ascii(). * mod/mail/mod_mail.c 1. Added: #include "encoded_char.h" for feature UTF. 2. Removed: #include "word_util.h" 3. Replaced decoder_ with charset_ and encoding_ for feature UTF. 4. In index_headers(), added code to extract the charset for feature UTF. 5. In index_multipart(), added call to encoded_char_range::decoder::reset_all(). 6. s/while ( 1 )/while ( true )/ * mod/mail/mod_mail.h 1. Added: #include "charsets/charsets.h" 2. Added: #include "encodings/encodings.h" 3. Replaced decoder_ with charset_ and encoding_ * mod/man/mod_man.c * mod/rtf/mod_rtf.c 1. In index_words(), removed call to iso8859_to_ascii(). * option_stream.c 1. Added #include for bug fix GCC31. * query.c 1. In parse_query2(), s/parse_query2/parse_meta/ for feature NMN. * README 1. Added missing item mentioning ability to index LaTeX and RTF documents. 2. Moved "Index non-text files such as Microsoft Office documents" up one. 3. Added mention of UTF-7 and UTF-8 character sets for feature UTF. * stop_words.c * token.c 1. Added: #include "iso8859-1.h" 2. s/iso8859_to_ascii/iso8859_1_to_ascii/ * token.h 1. Added #ifdef PJL_GCC_2xx for strstream vs. sstream for bug fix GCC31. 2. Added #ifdef PJL_GCC_2xx for istrstream vs. istringstream for bug fix GCC31. * version.h 1. Upped version to "5.8". * word_util.c * word_util.h 1. 
Moved iso8859 stuff to iso8859-1.[ch] ******************************************************************************* 5.7.1 ******************************************************************************* BUG FIXES --------- * Some "and" query results were slightly messed up due to an iterator being invalidated. (This bug fix shall be known as bug fix PAI.) * Ranks could be printed as zero (wrong!). (This bug fix shall be known as bug fix RANK0.) CHANGES, file-by-file --------------------- * auto_vec.h 1. Made const/non-const versions of accessors. * conf_enum.c * IncludeMeta.c 1. In parse_value(), made "lower" non-const to work with updated auto_vec. * config/GNUmakefile * GNUmakefile 1. Removed "dist" target. * query.c 1. In perform_and(), added a temporary iterator for bug fix PAI. * search.c 1. In search(), added code to ensure rank is not zero for bug fix RANK0. 2. In search(), added const to highest_rank. * thread_pool.h 1. Made various thread_pool data members "volatile" because this should be done for variables that are accessed by multiple threads. * version.h 1. Upped version to "5.7.1". ******************************************************************************* 5.7 ******************************************************************************* NEW FEATURES ------------ * LaTeX files can now be indexed directly. (This feature shall be known as feature LATEX.) * Document titles now have multiple whitespace characters squeezed into single whitespace characters. (This feature shall be known as feature STWS.) * index(1) will now use the value of the environment variable TMPDIR if it's set as the default temporary directory. However, the value is still superseded by one of -T, --temp-dir, or TempDirectory if given. (This feature shall be known as feature TMPDIR.) BUG FIXES --------- * HTML comment parsing was broken in that it allowed "->" in addition to "-->" to terminate a comment. (This bug fix shall be known as bug fix HTC.) 
* Yet more bugs in the thread-pooling code. (This bug fix shall be known as bug fix TPB.) CHANGES, file-by-file --------------------- * config/config.mk 1. Added "latex" to MOD_LIST for feature LATEX. 2. Added I_ETC. * config.h 1. Changed value for SocketQueueSize_Default to 511. * GNUmakefile 1. Added TempDirectory.c for feature TMPDIR. 2. Changed installation of swish++.conf to $(I_ETC) * indexer.c 1. In tidy_title(), added code to squeeze multiple whitespace characters for feature STWS. * man/man1/index.1 1. Added TMPDIR for feature TMPDIR. 2. Added LaTeX section for feature LATEX. 3. Added Leslie Lamport reference for feature LATEX. 4. s/SCCS/CVS/ since nobody uses SCCS any more. * man/man1/search.1 1. Changed default value for -q to 511. 2. Added missing "encoding" to XML example. * man/man4/swish++.conf.4 1. Added missing TempDirectory. * mod/html/mod_html.c 1. In is_html_comment(), reworked code for bug fix HTC. 2. In entity_to_ascii(), added static reference to char_entity_map::instance(). 3. In parse_html_tag() and post_options(), made "elements" reference static. * mod/latex/GNUmakefile * mod/latex/commands.c * mod/latex/commands.h * mod/latex/latex_config.h * mod/latex/mod_latex.c * mod/latex/mod_latex.h 1. New files for feature LATEX. * README.Solaris 1. Removed "ephemeral ports" since that wasn't right. * search_daemon.c 1. Removed: accept_failed(). 2. Added: handle_accept() and reset_socket(). 3. Moved thread_pool object inside of handle_accept(). * search_thread.c 1. Factored out reset_socket(). * swish++.conf 1. Changed value for SocketQueueSize to 511. * TempDirectory.c 1. New file for feature TMPDIR. * TempDirectory.h 1. Added default_value() for feature TMPDIR. 2. Moved #include "config.h" to new .c file for feature TMPDIR. * thread_pool.c 1. s/thread_pool_thread_destroy/thread_pool_thread_cleanup/ 2. Added: thread_pool_decrement_busy(). 3. In thread_pool_thread_main(), changed code so that pool_.t_idle_ is always signalled when idle. 4. 
In thread_pool_thread_main(), added: pthread_cleanup_push( thread_pool_decrement_busy, t ) to ensure that t->pool_.t_busy_ gets decremented even if the thread is killed. 5. In thread_pool_thread_main(), added DEFER_CANCEL/RESTORE_CANCEL around code that removes a task from the queue. 6. In ~thread(), added DEFER_CANCEL/RESTORE_CANCEL. 7. In thread_pool::thread_pool(), added t_busy_( 0 ) for bug fix TPB. 8. In thread_pool_thread_main(), made signaling of idle independent of the size of the thread pool. 9. Made new_task() take and return a bool argument and queue the task only if it will queue it. * thread_pool.h 1. s/thread_pool_thread_destroy/thread_pool_thread_cleanup/ 2. Added: thread_pool_decrement_busy(). 3. Made new_task() take and return a bool argument. * version.h 1. Upped version to "5.7". ******************************************************************************* 5.6 ******************************************************************************* NEW FEATURES ------------ * The text/enriched attachment indexer that was part of the Mail module was split off into its own RTF (Rich Text Format) module so stand-alone RTF files can be indexed. (This feature shall be known as feature RTF.) * For search(1) running as a daemon, added code to reset the TCP connection for bad requests. The reason for doing this is so we don't potentially have a socket lingering in TIME-WAIT from a client that was too dumb to give us a valid request in the first place. This helps alleviate denial-of-service attacks (if that's what's going on). This came about due to the way Solaris handles TIME-WAIT. Read the new README.Solaris file for details. This change has no effect in Linux 2.2.x kernels since sending a reset on close by setting SO_LINGER wasn't implemented. (This feature shall be known as feature RST.) BUG FIXES --------- * The files Group.c and SocketAddress.c didn't compile under FreeBSD. (This bug fix shall be known as bug fix BSD5.) 
CHANGES, file-by-file --------------------- * config/config.mk 1. Added "rtf" to MOD_LIST. 2. Added explanation about module dependencies. * Group.c * SocketAddress.c 1. Added #include for bug fix BSD5. * INSTALL.unix 1. Added mention of README.Solaris file for feature RST. 2. s/www.objectspace.com/www.stlport.org/ 3. Moved module wording in step 2 of building to config/config.mk. * man/man1/index.1 1. Added description of RTF module. * man/man4/swish++.conf.4 1. Added mention of RTF module. 2. Added mention that text/enriched attachments can be indexed only if the RTF module is compiled into index(1). 3. Added mention that text/html attachments can be indexed only if the HTML module is compiled into index(1). 4. Fixed RFC 1563 attribution. * mod/mail/mod_mail.c 1. Removed: #include "platform.h" 2. Added: #include "mod/rtf/mod_rtf.h" 3. Removed index_enriched(). 4. In index_headers(), made "text/enriched" #ifdef'd on mod_rtf. 5. In index_words(), switched to using RTF module. * mod/mail/mod_mail.h 1. Fixed RFC attributions. 2. Made Text_Enriched #ifdef'd on mod_rtf. 3. Removed index_enriched(). * mod/rtf/GNUmakefile * mod/rtf/mod_rtf.c * mod/rtf/mod_rtf.h 1. New files for feature RTF. * README.Solaris 1. New file for feature RST. * search.c 1. Made return-type of search() and service_request() bool for feature RST. * search.h 1. Made return-type of service_request() bool for feature RST. * searchc.in 1. Added call to shutdown() after sending query. * search_thread.c 1. In search_thread::main(), added: out << flush; 2. In search_thread::main(), removed EINTR guard (not needed). * swish++.conf 1. Added: IncludeFile RTF *.rtf * thread_pool.c 1. s/thread_main/thread_pool_thread_main/ 2. s/thread_destroy/thread_pool_thread_destroy/ 3. Replaced q_lock class by simple mutex again. 4. Changed state_ back to destructing_. 5. Added DEFER_CANCEL, RESTORE_CANCEL, MUTEX_LOCK, MUTEX_UNLOCK macros. 6. In thread_pool_thread_destroy(), removed unlocking of q_lock. 7. 
In thread_pool_thread_main(), moved unlock of run_lock_ here. 8. In thread_pool_thread_main(), reworked mutex locking. 9. In ~thread(), removed mutex_lock of t_lock_. A. In thread_pool(), removed ERRORCHECK attribute. B. In ~thread_pool(), added DEFER_CANCEL(). C. In new_task(), reworked mutex locking. D. In new_task(), added DEFER_CANCEL(). * thread_pool.h 1. s/thread_main/thread_pool_thread_main/ 2. s/thread_destroy/thread_pool_thread_destroy/ 3. Replaced q_lock class by simple mutex again. 4. Added private default constructor to argument_type. 5. Changed state_ back to destructing_. * version.h 1. Upped version to "5.6". ******************************************************************************* 5.5.3 ******************************************************************************* NEW FEATURES ------------ * A sample Procmail recipe has been included that can be used to split incoming mail messages into individual files for indexing. (This feature shall be known as feature SIM.) * The indexing word-determination rules have been relaxed somewhat; the following rules have been eliminated: 1. Starts with a capital letter, is of mixed case, and contains more than a third capital letters. This enables words like FedEx to be indexed. 2. Contains a capital letter other than the first. This enables words like iMac to be indexed. (This feature shall be known as feature RWD.) BUG FIXES --------- * When running as a server, search(1) had a memory leak. (This bug fix shall be known as bug fix SML.) * When running as a server, search(1) didn't make the sockets reusable. (This bug fix shall be known as bug fix RSA.) CHANGES, file-by-file --------------------- * GNUmakefile 1. For INITD_DIR and LEVEL_DIR, redirected error output to /dev/null. * INSTALL.unix 1. Added mention of Procmail for feature SIM. * man/man1/index.1 1. Removed mention of removed word-determination rules for feature RWD. * procmailrc 1. New file for feature SIM. * searchd.in 1. 
Added: KILL=`which kill` 2. Added "|| exit 1" in a few places. 3. Added "sleep 3" in restart case. * search.c 1. In search(), added "delete format" for bug fix SML. * search_daemon.c 1. Added BIND_SOCKET() for bug fix RSA. * searchmonitor.in 1. Added: KILL=`which kill` * version.h 1. Upped version to 5.5.3. * word_util.c 1. In is_ok_word(), removed rules for feature RWD. * www_example/sample.html 1. Converted to XHTML. ******************************************************************************* 5.5.2 ******************************************************************************* BUG FIXES --------- * Indexing attachments has been broken since version 5.2. Major d'oh. (This bug fix shall be known as bug fix IAB.) CHANGES, file-by-file --------------------- * mod/mail/mod_mail.c 1. In index_headers(), put a missing "else" back for bug fix IAB. * version.h 1. Upped version to 5.5.2. ******************************************************************************* 5.5.1 ******************************************************************************* BUG FIXES --------- * Automatic thread-pool size reduction had a race condition where too many threads could be destroyed. (This bug fix shall be known as bug fix TCD.) CHANGES, file-by-file --------------------- * thread_pool.c 1. Changed thread::destructing_ to thread::state_ for bug fix TCD. 2. In thread_main(), set thread state to expired before calling delete on it for bug fix TCD. * thread_pool.h 1. Changed thread::destructing_ to thread::state_ for bug fix TCD. * version.h 1. Upped version to 5.5.1. ******************************************************************************* 5.5 ******************************************************************************* NEW FEATURES ------------ * search(1) can now be run as a daemon without it automatically putting itself into the background. This is useful in order to wrap a start script around it and automatically restart it if it dies for any reason. 
Correspondingly, there are 2 new utility scripts: searchmonitor (a process monitor for search) and searchd (a start/stop script for SysV-like systems). (This feature shall be known as feature NOB.) * search(1), when run as a daemon, can give away its root privileges if it started with them. There are now new command-line options of -U, --user, -G, and --group as well as new configuration variables User and Group. (This feature shall be known as feature GAR.) BUG FIXES --------- * When search(1) was running as a daemon, it ignored -F and --format options specified via the socket. (This bug fix shall be known as bug fix SDF.) * For very large document sets when many partial indices were generated, if the number of partial indices exceeded the maximum number of file descriptors a process could have open, merging would fail. (This bug fix shall be known as bug fix MFD.) CHANGES, file-by-file --------------------- * conf_enum.c * conf_enum.h 1. Added the is_legal() function for bug fix SDF. * config.h 1. Added Group_Default and UserDefault for feature GAR. * conf_var.c 1. In map_ref(), added "user" and "group" for feature GAR. 2. In map_ref(), added "searchbackground" for feature NOB. * GNUmakefile 1. Added Group.c and User.c to S_SOURCES for feature GAR. 2. Removed WIN32 PERL_TARGET conditional since WIN32 isn't set at that point. 3. s/PERL_TARGET/OTHER_TARGET/ 4. Added searchmonitor to OTHER_TARGET for feature NOB. 5. Added INITD_TARGET for feature NOB. 6. Added BIN_TARGET since other targets get installed places other than in a bin directory. 7. Added INITD_DIR and LEVEL_DIR to figure out a SysV system's run level directories for feature NOB. 8. Added installation of /etc/swish++.conf for feature NOB. 9. Added install_sysv target for feature NOB. A. Added uninstallation of start/stop scripts to uninstall target for feature NOB. * Group.c * Group.h 1. New files for feature GAR. * exit_codes.h 1. Added Exit_No_User and Exit_No_Group for feature GAR. 2. 
Changed Exit_Internal_Error from 255 to 127. * index.c 1. In main(), added maxing out of number of file descriptors to enable more partial indices to be generated for bug fix MFD. * INSTALL.unix 1. Added step 5 regarding installing the searchd start/stop script for feature NOB. * man/man1/index.1 1. Added missing error codes 40 and 127. * man/man1/search.1 1. Added description of -B and --no-background options and the SearchBackground variable for feature NOB. 2. Added description of -U, --user, -G, and --group options and the User and Group variables for feature GAR. 3. Added subsections to Daemon section. 4. Added mention of giving away root privileges for feature GAR. 5. Added mention of searchmonitor(8) for feature NOB. * man/man4/swish++.conf.4 1. Added mention of SearchBackground variable for feature NOB. 2. Added mention of Group and User variables for feature GAR. * man/man8/GNUmakefile * man/man8/searchd.8 * man/man8/searchmonitor.8 * SearchBackground.h 1. New files for feature NOB. * search.c 1. Added #include "SearchBackground.h" for feature NOB. 2. Added global search_background variable for feature NOB. 3. In main(), added check of search_background_opt for feature NOB. 4. In search_options::search_options(), added initialization of search_background_opt for feature NOB. 5. In search_options::search_options(), added case for 'B' for feature NOB. 6. In usage(), added usage for -B and --no-background for feature NOB. 7. In search(), added results_format parameter for bug fix SDF. 8. In search_options::search_options(), added code to check legality of argument to -F option for bug fix SDF. 9. In service_request(), added opt.results_format_arg to call to search() for bug fix SDF. A. Added #include "User.h" and "Group.h" for feature GAR. B. Added global user and group variables for feature GAR. C. In main(), added check of group_arg and user_arg for feature GAR. D. In search(), added static_cast to get rid of float->int conversion warning. E. 
In search_options::search_options(), added initialization of user_arg and group_arg for feature GAR. F. In search_options::search_options(), added cases for 'G' and 'U' for feature GAR. G. In usage(), added description of -G and -U for feature GAR. * search.h 1. Added search_background_opt for feature NOB. 2. Added user_arg and group_arg for feature GAR. * searchd.in * searchmonitor.in 1. New files for feature NOB. * search_daemon.c 1. Added #include "SearchBackground.h" for feature NOB. 2. In become_daemon(), added tests of search_background for feature NOB. 3. Added #include "User.h" and "Group.h" for feature GAR. 4. In become_daemon(), added code to change UID/GID for feature GAR. * search_options.c 1. Added no-background option for feature NOB. 2. Added user and group options for feature GAR. * swish++.conf 1. Added SearchBackground for feature NOB. 2. s!/tmp/search.pid!/var/run/search.pid! * User.c * User.h 1. New files for feature GAR. * util.h 1. In max_out_limit(), set limit to infinity if running as root for bug fix MFD. * version.h 1. Updated version to "5.5". ******************************************************************************* 5.4.6 ******************************************************************************* BUG FIXES --------- * On systems (such as Solaris) where /bin/sh is still really Bourne shell (as opposed to bash in disguise), -e tests don't work. (This bug fix shall be known as bug fix DEF.) CHANGES, file-by-file --------------------- * GNUmakefile * init_mod_vars-sh 1. s/-e/-f/ for bug fix DEF. * version.h 1. Updated version to "5.4.6". ******************************************************************************* 5.4.5 ******************************************************************************* BUG FIXES --------- * If AssociateMeta, IncludeFile, IncludeMeta, ExcludeFile, or ExcludeMeta were not given in a configuration file, values given via the command line were discarded. (This bug fix shall be known as bug fix CRA.) 
* On some systems, the auto-building of dependencies got into an infinite loop since the "dep" directory's timestamp was updated for every dependency file and thus everything that depended on it was always out of date. Why this doesn't happen on all systems isn't clear. (This bug fix shall be known as bug fix DTS.) CHANGES, file-by-file --------------------- * config/config.mk 1. Removed "dep" for bug fix DTS. * config/GNUmakefile 1. s/dep/.*.d/ for bug fix DTS. * conf_var.c 1. In parse_file(), removed call to reset_all() for bug fix CRA. * conf_var.h 1. Made reset_all() public. * GNUmakefile * mod.mk 1. Changed "dep/%.d" (back) to ".%.d" for bug fix DTS. 2. In distclean rule, s/dep/.*.d/ for bug fix DTS. * INSTALL.win32 * INSTALL.unix 1. s/dep/.*.d/ for bug fix DTS. * version.h 1. Updated version to "5.4.5". ******************************************************************************* 5.4.4 ******************************************************************************* BUG FIXES --------- * In index(1), the config-file option wasn't recognized because it was spelled as just "config" in the source code. D'oh! (This bug fix shall be known as bug fix LCO.) * Configuration file variables in modules were somehow being corrupted so some weren't being recognized any longer. I really don't know what was going on. But, module-specific variables weren't recognized at all in search(1). Oops. (This bug fix shall be known as bug fix XCV.) CHANGES, file-by-file --------------------- * conf_var.c 1. In map_ref(), added call to init_mod_vars() for bug fix XCV. * conf_var.h 1. Added init_mod_vars() for bug fix XCV. * GNUmakefile 1. Added init_mod_vars.c to I_SOURCES, S_SOURCES, and E_SOURCES for bug fix XCV. 2. Added rule to make init_mod_vars.c for bug fix XCV. * index.c 1. In main(), s/config/config-file/ for bug fix LCO. * init_mod_vars-sh 1. New file for bug fix XCV. * mod/html/mod_html.c * mod/html/mod_html.h * mod/mail/mod_mail.c * mod/mail/mod_mail.h 1. 
Moved constructor to .h file and removed register_var() for bug fix XCV. * mod/html/vars * mod/mail/vars 1. New files for bug fix XCV. * version.h 1. Updated version to "5.4.4". ******************************************************************************* 5.4.3 ******************************************************************************* BUG FIXES --------- * When compiling without the search daemon, search(1) wouldn't link because it needs conf_enum.o and it wasn't compiled. (This bug fix shall be known as bug fix CEO.) * The file thread_pool.c didn't compile under FreeBSD. (This bug fix shall be known as bug fix BSD4.) CHANGES, file-by-file --------------------- * config/config.mk 1. In "OS selection" section, added comment for Mac OS X. * GNUmakefile 1. Moved conf_enum.c so that it's always compiled for bug fix CEO. * INSTALL.unix 1. Added fact that g++ 2.95.2 works. 2. Added note about g++ 2.96. * thread_pool.c 1. Added "#ifndef FreeBSD" around use of PTHREAD_MUTEX_ERRORCHECK for bug fix BSD4. * version.h 1. Updated version to "5.4.3". ******************************************************************************* 5.4.2 ******************************************************************************* BUG FIXES --------- * The "classic" results formatting was broken in that the result separator wasn't output in all the places it should be. How I didn't catch this isn't clear. (This bug fix shall be known as bug fix CFS.) CHANGES, file-by-file --------------------- * classic_formatter.c 1. In result(), added missing "results_separator" for bug fix CFS. * version.h 1. Updated version to "5.4.2". ******************************************************************************* 5.4.1 ******************************************************************************* BUG FIXES --------- * The command-line option spec. building introduced in version 5.4 was broken. (This bug fix shall be known as bug fix COS.) CHANGES, file-by-file --------------------- * indexer.c 1. 
In indexer::all_mods_options(), s/++option_count/*c++ = *s/ for bug fix COS. * version.h 1. Updated version to "5.4.1". ******************************************************************************* 5.4 ******************************************************************************* NEW FEATURES ------------ * Search results can now optionally be output in XML. (This feature shall be known as feature XML.) * The modular indexing rearchitecture is now complete. CHANGES, file-by-file --------------------- * classic_formatter.c * classic_formatter.h 1. New files for feature XML. * conf_var.c 1. In map_ref(), removed ExcludeClass and FilterAttachment. 2. Added: register_var() 3. In map_ref(), added ResultsFormat for feature XML. * conf_var.h 1. Added: register_var() * file_info.c 1. Reordered mem-initializers to match new order in declaration for feature XML. 2. Added file_info( unsigned char const* ) for feature XML. * file_info.h 1. Added file_info( unsigned char const* ) for feature XML. 2. Reordered data members to facilitate new constructor for feature XML. * GNUmakefile 1. Added file_info.c, classic_formatter.c, ResultsFormat.c, results_formatter.c, and xml_formatter.c to S_SOURCES for feature XML. * index.c 1. Removed #include of mod_html .h files. 2. Removed mod_html command-line options. 3. In main(), added code to gather all module options. 4. In main(), moved code to dump HTML elements into mod_html. 5. In usage(), removed mod_html usage. 6. In usage(), added call to: indexer::all_mods_usage(). * indexer.c * indexer.h 1. Added any_mod_claims_option(), all_mods_options(), all_mods_post_options(), all_mods_usage(), claims_option(), option_spec(), post_options(), and usage(). * INSTALL.unix 1. Updated Unix prerequisites. * man/man1/search.1 1. Added XML results description for feature XML. 2. Added -F, --format, and ResultsFormat for feature XML. 3. Corrected wording regarding titles. 4. For -P and --pid-file, added mention of default being none. 5. 
For -u and --socket-file, added mention of default being /tmp/search.socket. 6. In meta data query examples, removed mention of "HTML or XHTML" since other document types can have meta information. 7. Added XML output caveat for feature XML. 8. Added reference to XML specification for feature XML. * man/man4/swish++.conf.4 1. Added ResultsFormat for feature XML. * mod/html/mod_html.c 1. Moved constructor definition here. 2. In constructor, added call to register_var( "excludeclass" ); 3. Moved global dump_html_elements_opt definition here. 4. Added claims_option(), option_spec(), post_options(), and usage(). * mod/html/mod_html.h 1. Moved constructor definition to mod_html.c. 2. Added claims_option(), option_spec(), post_options(), and usage(). * mod/mail/mod_mail.c 1. Moved constructor definition here. 2. Added call to register_var( "filterattachment" ). * mod/mail/mod_mail.h 1. Moved constructor definition to mod_mail.c. * README 1. Added "XML search results" for feature XML. * ResultsFormat.c * ResultsFormat.h * results_formatter.c * results_formatter.h 1. New files for feature XML. * search.c 1. Added #include of classic_formatter.h, file_info.h, ResultsFormat.h, results_formatter.h, ResultsMax.h, and xml_formatter.h for feature XML. 2. Added global results_format for feature XML. 3. In main(), added test of opt.results_format_arg for feature XML. 4. In search(), replaced result output with new result formatter classes for feature XML. 5. In search_options::search_options(), added initialization of results_format_arg for feature XML. 6. In search_options::search_options(), added case 'F' for feature XML. 7. Rewrote write_file_info() using a file_info. 8. In usage(), added usage message for -F option for feature XML. * search.h 1. Added results_format_arg for feature XML. * search_options.c 1. Added "format" for feature XML. * SearchResults.dtd 1. New file for feature XML. * swish++.conf 1. Added ResultsFormat for feature XML. * version.h 1. 
Upped version to "5.4." * xml_formatter.c * xml_formatter.h 1. New files for feature XML. ******************************************************************************* 5.3.6 ******************************************************************************* NEW FEATURES ------------ * When compiling using g++, added additional compiler options to reduce code size and slightly improve performance. (This feature shall be known as feature GPPO.) BUG FIXES --------- * Indexing files via standard input where the order of the directories wasn't "monotonically increasing," didn't work: files ended up in the wrong directory. As a beneficial consequence, the -D and -G options and the DirectoriesGrow and DirectoriesReserve variables are no longer needed. (This bug fix shall be known as bug fix ISI2.) * Destroying a thread_pool's threads didn't work properly in that the clean-up function for all threads didn't get called. (This didn't matter for SWISH++ since search(1) never destroys its thread_pool.) (This bug fix shall be known as bug fix TPD.) CHANGES, file-by-file --------------------- * config/config.mk 1. Added -D_XOPEN_SOURCE=500 for compiling search daemon under Linux for bug fix TPD. 2. Added DEBUG variable. 3. Added -fno-rtti to CCFLAGS for feature GPPO. 4. Added -fomit-frame-pointer to OPTIM for feature GPPO. * config/mod.mk 1. s/DEBUG/DEBUGFLAGS/ * config.h 1. Removed DirectoriesGrow_Default and DirectoriesReserve_Default as part of bug fix ISI2. * conf_var.c 1. In parse_file(), added reset_all(). 2. In reset_all(), added check for null pointer. 3. Removed DirectoriesGrow and DirectoriesReserve as part of bug fix ISI2. * DirectoriesGrow.h * DirectoriesReserve.h 1. Removed these files as part of bug fix ISI2. * directory.c 1. Changed do_file() to take a second dir_index argument for bug fix ISI2. 2. Changed return type of check_add_directory() to return a directory index for bug fix ISI2. 3. 
In check_add_directory(), changed from using a set to a map where the value is the directory index for bug fix ISI2. 4. In do_check_add_file() and do_directory(), made it get and pass dir_index to do_file() for bug fix ISI2. 5. Removed directories_reserve as part of bug fix ISI2. * directory.h 1. s/dir_list/dir_set/ * do_file.c 1. When compiled for index(1), made do_file() take a second argument of dir_index for bug fix ISI2. 2. Added dir_index to call to file_info() constructor for bug fix ISI2. * exit_codes.h 1. s/Exit_No_Init_Condition/Exit_No_Init_Thread_Condition/ 2. s/Exit_No_Init_Mutex/Exit_No_Init_Thread_Mutex/ * file_info.c 1. Made first constructor take dir_index argument for bug fix ISI2. 2. Removed second constructor for bug fix ISI2. 3. Integrated construct() into lone constructor. * file_info.h 1. Removed constructor not taking dir_index for bug fix ISI2. 2. Removed construct(). * GNUmakefile 1. Added DEBUGFLAGS. 2. For $(MOD_LIBS) rule, s/DEBUG/DEBUGFLAGS/ * index.c 1. In load_old_index(), removed new_strdup()'s since file_info is now doing them for bug fix ISI2. 2. Removed DirectoriesGrow and DirectoriesReserve as part of bug fix ISI2. 3. Removed -D and -G options as part of bug fix ISI2. 4. In write_dir_index(), added code to order the directories for bug fix ISI2. 5. Moved definition of exlude_class_names to mod_html.c. * index_header.c 1. s/dir_list/dir_set/ for bug fix ISI2. * man/man1/httpindex.1 1. Added -e's to example. * man/man1/index.1 1. Removed -D, --dirs-reserve, -G, --dirs-grow, DirectoriesGrow, and DirectoriesReserve as part of bug fix ISI2. * man/man4/swish++.conf.4 1. Removed DirectoriesGrow and DirectoriesReserve as part of bug fix ISI2. * mod/html/mod_html.c 1. Moved definition of exlude_class_names here. * search_daemon.c 1. In set_signal_handlers(), removed SA_RESTART. * swish++.conf 1. Removed DirectoriesGrow and DirectoriesReserve as part of bug fix ISI2. * thread_pool.c 1. 
In thread_destroy(), added code to unlock q_lock for bug fix TPD. 2. In thread_destroy(), added code to decrement q_lock's reference count for bug fix TPD. 3. In thread_destroy(), added code to deallocate thread object storage for bug fix TPD. 4. In thread_main(), added code to increment q_lock's reference count for bug fix TPD. 5. s/Exit_No_Init_Condition/Exit_No_Init_Thread_Condition/ 6. s/Exit_No_Init_Mutex/Exit_No_Init_Thread_Mutex/ 7. In ~thread(), removed destructing_lock_ since I don't think it's needed. 8. In ~thread(), added optimization for a thread committing suicide. 9. Added q_lock::dec_ref() and q_lock::inc_ref() functions for bug fix TPD. A. In thread_pool::thread_pool(), created q_lock with the PTHREAD_MUTEX_ERRORCHECK attribute for bug fix TPD. B. In ~thread_pool(), removed destructing_lock_ since I don't think it's needed. C. In ~thread_pool(), added code to decrement q_lock's reference count for bug fix TPD. * thread_pool.h 1. Overrode thread::operator delete() to do nothing for bug fix TPD. 2. Changed q_lock_ to a reference-counted object for bug fix TPD. 3. Made max/min threads and thread timeout settable after thread_pool creation. * version.h 1. Upped version to "5.3.6." ******************************************************************************* 5.3.5 ******************************************************************************* NEW FEATURES ------------ * The code for modules has been reorganized into subdirectories that build libraries with the goal of having a completely modular indexing architecture similar to the way Apache has modules. This is a work-in-progress. BUG FIXES --------- * Version 5.1 broke indexing file names via standard input. D'oh! (This bug fix shall be known as bug fix FSI.) * Version 5.1 also added unnecessary work for extract(1). (This bug fix shall be known as bug fix ESI.) * The thread::~thread() destructor mistakenly killed the calling thread rather than itself. Oops. 
This didn't actually matter for SWISH++ since it's never called. (This bug fix shall be known as bug fix TPTD.) CHANGES, file-by-file --------------------- * config.h 1. Moved MOD_HTML parameters to mod/html/html_config.h. * config/config.mk 1. Changed format of MOD_LIST. 2. Added RANLIB. 3. Moved auto-dependency generation here. * config/GNUmakefile 1. Moved TARGET definition before include. 2. Added removal of accidental dep subdirectory. * config/mod.mk 1. New file for modularization. * conf_var.c 1. s/MOD_HTML/mod_html/ 2. s/MOD_MAIL/mod_mail/ * directory.c 1. Made this file #include'd by index.c and extract.c for bug fix FSI. 2. Added: #include "fake_ansi.h" 3. Added "#ifdef INDEX" in various places for bug fix ESI. 4. Added do_check_add_file() for bug fix FSI. * directory.h 1. Removed local #include's and follow_symbolic_links and function declarations for bug fix FSI. * elements.c * elements.h * entities.c * entities.h * ExcludeClass.h * mod_html.c * mod_html.h 1. Moved to mod/html subdirectory. * encoded_char.c * encoded_char.h 1. s/MOD_MAIL/mod_mail/ * extract.c 1. Moved #include of platform.h first so PJL_NO_SYMBOLIC_LINKS would be defined at the right time. 2. Added: #include "FollowLinks.h" for bug fix ESI. 3. Added: #include "directory.c" for bug fix ESI. 4. s/::strdup()/new_strdup()/ * ExcludeFile.c * ExtractFile.c 1. s/::strdup()/new_strdup()/ * file_info.c 1. s/::strdup()/new_strdup()/ * FollowLinks.h 1. s/follow_links/follow_symbolic_links/ * FilterAttachment.h * mod_mail.c * mod_mail.h 1. Moved to mod/mail subdirectory. * GNUmakefile 1. Moved target definition before include. 2. s/C_TARGET/CPP_TARGET/ 3. Added MOD_LIBS, MOD_LIB_PATHS, MOD_LINK. 4. s/I_SRCS/I_SOURCES/, s/I_OBJS/I_OBJECTS/, s/S_SRCS/S_SOURCES/, s/S_OBJS/S_OBJECTS/, s/E_SRCS/E_SOURCES/, s/E_OBJS/E_OBJECTS/ 5. Removed module-specific .c files. 6. Removed $(CCLINK) -- not used. 7. Made use of $(MOD_LINK) 8. Removed entities.c from E_SOURCES -- not used. 9. 
Added $(MOD_LIBS) to index dependencies. A. Added rule to build init_modules.c B. Added rule to build module libraries. C. Moved auto-dependency generation to config/config.mk. D. Added MAKE_SUBDIRS function and made use of it in clean, distclean. E. Removed directory.c from I_SOURCES and E_SOURCES for bug fix ESI. * IncludeFile.c * IncludeMeta.c 1. s/::strdup()/new_strdup()/ * index.c 1. Changed includes to use mod/html form. 2. s/MOD_HTML/mod_html/ 3. Moved #include of platform.h first so PJL_NO_SYMBOLIC_LINKS would be defined at the right time. 4. Added: #include "FollowLinks.h" for bug fix FSI. 5. Added: #include "directory.c" for bug fix FSI. 6. In main(), s/do_file()/do_check_add_file()/ 7. s/::strdup()/new_strdup()/ * indexer.c 1. s/::strdup()/new_strdup()/ * init_modules.c 1. Removed since it's now automatically generated. * init_modules-sh 1. Added to generate init_modules.c automatically. * mod_man.c * mod_man.h 1. Moved to mod/man subdirectory. * mod/html/html_config.h 1. Moved MOD_HTML-specific configuration parameters here. * mod/html/ExcludeClass.h * mod/html/GNUmakefile * mod/html/elements.c * mod/html/elements.h * mod/html/entities.c * mod/html/entities.h * mod/html/mod_html.c * mod/html/mod_html.h * mod/mail/FilterAttachment.h * mod/mail/GNUmakefile * mod/mail/encoded_char.c * mod/mail/mod_mail.c * mod/mail/mod_mail.h * mod/man/GNUmakefile * mod/man/mod_man.c * mod/man/mod_man.h 1. Moved from top-level directory. * stem_word.c * stop_words.c 1. s/::strdup()/new_strdup()/ * thread_pool.c 1. Fixed thread::~thread() for bug fix TPTD. * util.c * util.h 1. Removed unneeded #include's. * version.h 1. Changed version to "5.3.5". ******************************************************************************* 5.3.4 ******************************************************************************* BUG FIXES --------- * File titles turned to garbage when indexing files incrementally. (This bug fix shall be known as bug fix IIT.) 
* option_stream's test main() was incorrectly defined inside the PJL namespace. (This bug fix shall be known as bug fix OSM.) * option_stream didn't report an error for an option that required an argument when no argument was given when said option was the last thing on the command line. (This bug fix shall be known as bug fix OSN.) CHANGES, file-by-file --------------------- * index.c 1. Merged parse_file_info() function into load_old_index(). 2. In (what is now) load_old_index(), added a strdup() for the file's title for bug fix IIT. 3. In write_dir_index(), switched to using FOR_EACH(). * option_stream.c 1. Got rid of option_stream::option::copy(). 2. Moved the test main() outside of the PJL namespace for bug fix OSM. 3. s/c_/short_name_/, s/index_/argi_/. 4. Added was_short_option_ variable for bug fix OSN. 5. Replaced some duplicated option argument code with a goto. 6. Reworked argument processing for bug fix OSN. * option_stream.h 1. Got rid of option_stream::option::copy() and destructor. 2. Made copy constructor and assignment operator private. 3. Renamed the following: s/c/short_name/, s/index_/argi_/ * version.h 1. Changed version to "5.3.4". ******************************************************************************* 5.3.3 ******************************************************************************* BUG FIXES --------- * SIGPIPE wasn't handled at all so a search client that disconnected unexpectedly could crash the server. Now that this has been fixed, the server also needs to check the state of the outgoing stream during writes for an error: if an error occurs, assume the client disconnected from the socket and stop sending output. (This bug fix shall be known as bug fix PIPE.) * On Linux systems, multiple reads from the search daemon timed out sooner than requested because Linux modifies the timeval struct passed to select() to reflect the amount of time not slept. (This bug fix shall be known as bug fix LMR.) 
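To illustrate bug fix LMR: the sketch below shows the general technique of re-initializing the timeval before every select() call so that a value overwritten by Linux is never reused. It is not the actual SWISH++ code; wait_readable() and its parameters are hypothetical names invented for this example, and retrying with a fresh timeout after EINTR is a simplifying assumption.

```cpp
#include <cassert>
#include <cerrno>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

// Hypothetical helper, not part of SWISH++: waits until fd is readable or
// timeout_seconds have elapsed.  Returns true if fd is readable, false on
// timeout or on any error other than an interrupted call.
bool wait_readable( int fd, long timeout_seconds ) {
    assert( fd >= 0 && fd < FD_SETSIZE && timeout_seconds >= 0 );
    for (;;) {
        fd_set read_fds;
        FD_ZERO( &read_fds );
        FD_SET( fd, &read_fds );

        // Set the timeout afresh on every iteration: Linux's select() may
        // overwrite the struct with the amount of time not slept.
        struct timeval tv;
        tv.tv_sec  = timeout_seconds;
        tv.tv_usec = 0;

        int const n = ::select( fd + 1, &read_fds, 0, 0, &tv );
        if ( n > 0 )
            return true;            // fd has data (or EOF) to read
        if ( n == 0 )
            return false;           // timed out
        if ( errno != EINTR )
            return false;           // real error
        // Interrupted by a signal: loop and try again with a fresh timeout.
    }
}
```

Because the timeval is a fresh local on each pass, calling wait_readable() repeatedly for successive reads always waits up to the full requested time.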
CHANGES, file-by-file --------------------- * search.c 1. In main(), s/search_options opt/search_options const opt/ 2. In dump_single_word(), dump_word_window(), several places in search() and service_request, added a check for the state of the "out" stream for bug fix PIPE. * search_daemon.c 1. Added set_signal_handlers() for bug fix PIPE. * search_thread.c 1. In search_thread::main(): s/search_options opt/search_options const opt/ 2. In timed_read_line(), reworked the timeout such that the timeval struct is always initialized properly for every loop iteration for bug fix LMR. * version.h 1. Changed version to "5.3.3". ******************************************************************************* 5.3.2 ******************************************************************************* BUG FIXES --------- * On some platforms, index(1) would index "0 words" for every file. (This bug fix shall be known as bug fix IZW.) * There was a race condition in the search daemon thread pool code whereby the prototype thread could begin executing before its owning thread_pool was fully constructed. (This bug fix shall be known as bug fix TPR.) CHANGES, file-by-file --------------------- * encoded_char.h 1. Made encoded_char_range::const_iterator's ch_ and decode() compile in only when MOD_MAIL is defined for bug fix IZW. 2. In encoded_char_range::const_iterator::operator*(), made it return *pos_ when MOD_MAIL wasn't defined for bug fix IZW. * exit_codes.h 1. Added Exit_No_Init_Condition and Exit_No_Init_Mutex. * man/man1/search.1 1. Added exit codes 66 and 67. * thread_pool.c 1. In thread_main(), added "::pthread_mutex_lock( &t->run_lock_ );" for bug fix TPR. 2. In thread_pool::thread::thread(), added initialization and locking of run_lock_ for bug fix TPR. 3. In thread_pool::thread::~thread(), added destruction of run_lock_ for bug fix TPR. 4. Changed thread_pool::thread_pool() to take a pointer to non-const thread for bug fix TPR. 5. 
In thread_pool::thread_pool(), made use of Exit_No_Init_Condition and Exit_No_Init_Mutex. 6. In thread_pool::thread_pool(), added prototype thread to pool and now creating thread_min - 1 additional threads. 7. In thread_pool::thread_pool(), s/create/create_and_run()/ for bug fix TPR. 8. In thread_pool::new_task(), s/create/create_and_run()/ for bug fix TPR. * thread_pool.h 1. Added thread_pool::thread::run_lock_ for bug fix TPR. 2. Added thread_pool::thread::run() and create_and_run() for bug fix TPR. 3. Changed thread_pool::thread_pool() to take a pointer to non-const thread for bug fix TPR. * version.h 1. Changed version to "5.3.2". ******************************************************************************* 5.3.1 ******************************************************************************* BUG FIXES --------- * Searching with more than two "and" terms caused a core dump. This bug was a result of the "enhancement" to doing multiple "and" terms in 5.3. (This bug fix shall be known as bug fix MAT.) * Compiling with all but the text module produced a syntax error. (This bug fix shall be known as bug fix NMS.) CHANGES, file-by-file --------------------- * conf_bool.h * conf_int.h 1. Removed extraneous backslash. * init_modules.c 1. Added #include "indexer" for bug fix NMS. * query.c 1. In perform_and(), added needed "break" for bug fix MAT. * version.h 1. Changed version to "5.3.1". ******************************************************************************* 5.3 ******************************************************************************* BUG FIXES --------- * The weighting of multiple "and" terms has been fixed. Previously, the query: mouse and computer and keyboard was parsed and treated as: (mouse and computer) and keyboard 25% 25% 50% The problem was that the last term always got 50% of the weighting and the rest got 50% divided by the number of terms minus 1. 
In order to weight all the terms equally, the "and" results for each term are now saved in a list and then and'ed together at the end. (This bug fix shall be known as bug fix MAW.) * The index(1) manual page didn't explicitly state that words are converted to lower case prior to indexing. CHANGES, file-by-file --------------------- * conf_var.c 1. Changed from abort() to internal_error. * exit_codes.h 1. Added: Exit_Internal_Error * filter.c * indexer.c * option_stream.c * thread_pool.c 1. Changed from abort() to internal_error. * man/man1/index.1 1. Added paragraph at the end of the "Word Determination" subsection addressing conversion to lower case prior to indexing. * query.c 1. Changed from abort() to internal_error. 2. Moved declarations for get_meta_id(), parse_meta(), parse_primary(), and parse_optional_relop() here from query.h. 3. In parse_meta() and parse_primary(), got rid of unused default value. 4. Changed what was parse_query() to parse_query2(). 5. Added a new parse_query(). 6. Added and_results_type argument to parsing functions for bug fix MAW. 7. In parse_query2(), deferred and'ing of results for bug fix MAW. 8. Added perform_and() function for bug fix MAW. * query.h 1. Moved declarations for get_meta_id(), parse_meta(), parse_primary(), and parse_optional_relop() to query.c. 2. Added: stop_word_set 3. s/set< string >/stop_word_set/ 4. For parse_query(), got rid of unneeded bool& and int arguments. 5. s/search_results_type/search_results/ 6. s/find_results_type/find_results/ * search.c 1. s/set< string >/stop_word_set/ 2. In search(), got rid of unused "ignore" variable. 3. s/search_results_type/search_results/ 4. s/find_results_type/find_results/ * util.h 1. Added internal_error and report_error(). * version.h 1. Upped version. 
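The weighting arithmetic described for bug fix MAW in this release can be checked with a small sketch. It only restates the percentages given in the description above (last term 50%, the rest sharing the other 50%, versus equal shares after the fix); it is not the code in query.c, and old_and_weights() and new_and_weights() are hypothetical names invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical illustration only; neither function exists in SWISH++.

// Fraction of the total weight each of n "and" terms received before the
// fix, per the description: the last term got 50% and the remaining terms
// split the other 50% evenly.
std::vector<double> old_and_weights( std::size_t n ) {
    assert( n >= 1 );
    if ( n == 1 )
        return std::vector<double>( 1, 1.0 );   // a lone term gets everything
    std::vector<double> w( n, 0.5 / ( n - 1 ) );
    w[ n - 1 ] = 0.5;
    return w;
}

// Fraction of the total weight each of n "and" terms receives once all
// terms are weighted equally.
std::vector<double> new_and_weights( std::size_t n ) {
    assert( n >= 1 );
    return std::vector<double>( n, 1.0 / n );
}
```

For the three-term query "mouse and computer and keyboard", old_and_weights( 3 ) yields 0.25, 0.25, 0.5, matching the percentages above, while new_and_weights( 3 ) gives each term one third.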
******************************************************************************* 5.2 ******************************************************************************* NEW FEATURES ------------ * E-mail attachments can now be filtered by external programs. (This feature shall be known as feature AFP.) CHANGES, file-by-file --------------------- * conf_filter.c * conf_filter.h 1. Replaced FilterFile.c and made generic for feature AFP. * conf_var.c 1. Added filterattachment for feature AFP. * do_file.c * extract.c 1. s/filters/file_filters/ for feature AFP. * filter.h 1. Added: substitute( std::string const &file_name ); * FilterAttachment.h 1. Added this file for feature AFP. * FilterFile.c 1. Replaced by conf_filter.c * FilterFile.h 1. Made FilterFile derived from conf_filter for feature AFP. 2. s/filters/file_filters/ for feature AFP. * GNUmakefile 1. Added conf_filter.c for feature AFP. 2. Removed FilterFile.c for feature AFP. * index.c 1. s/filters/file_filters/ for feature AFP. * man/man1/extract.1 1. Added FilterAttachment for feature AFP. * man/man1/index.1 1. Added mention of FilterAttachment for feature AFP. 2. s/-D/-G/ * man/man4/swish++.conf.4 1. Added "Filter variables" section. 2. Added information on filtering attachments for feature AFP. 3. Added more references. * mod_mail.c 1. Added #include's for , , "FilterAttachment.h", and "Verbosity.h" for feature AFP. 2. Added "attachment_filters" declaration for feature AFP. 3. Added index_via_filter() for feature AFP. 4. In index_headers(), added code for filters for feature AFP. 5. In index_words(), added case for External_Filter for feature AFP. * mod_mail.h 1. Added "External_Filter" for feature AFP. 2. Changed message_type from a pair<> to a struct for feature AFP. * README 1. Added mention of filtering attachments for feature AFP. * swish++.conf 1. Added FilterAttachment section for feature AFP. 2. Added: FilterFile *.ps pstotext %f > @%F.txt 3. Added: FilterFile *.bz2 bunzip2 -c %f > @%F * version.h 1. 
Upped version. ******************************************************************************* 5.1 ******************************************************************************* NEW FEATURES ------------ * Reduced index storage size by recording directory names once. Note that the old -G option for index(1) has changed to -g and that there is a new -G option. (This feature shall be known as feature DIR1.) BUG FIXES --------- * The swish++.conf(4) manual page was missing FilesReserve and ResultsMax. (This bug fix shall be known as bug fix MFR.) CHANGES, file-by-file --------------------- * bcd.h 1. Added: #include "fake_ansi.h" * config.h 1. Added DirectoriesGrow_Default and DirectoriesReserve_Default for feature DIR1. * config/config.mk 1. Added g++ 3.0-specific warnings to CCFLAGS for development purposes. * conf_bool.h * conf_enum.h * conf_int.h * conf_set.h * conf_string.h 1. Added: #include "fake_ansi.h" * conf_percent.c * conf_percent.h * DirectoriesGrow.h * DirectoriesReserve.h 1. Added these files for feature DIR1. * conf_var.c 1. Added DirectoriesGrow and DirectoriesReserve configuration variables for feature DIR1. * directory.c 1. Added dir_list for feature DIR1. 2. Added directories_reserve. 3. Added check_add_directory() for feature DIR1. 4. s/queue< string >/queue< char const* >/ 5. Switched from using std::string to create the current path to using a simpler char buffer. 6. Made sure the directory that is passed to do_directory() recursively has been strdup()'d. * directory.h 1. Added dir_list for feature DIR1. 2. Added check_add_directory() for feature DIR1. 3. Added: #include "fake_ansi.h" * elements.h * entities.h 1. Added: #include "fake_ansi.h" * ExcludeFile.h 1. Added extern declaration. * extract.c 1. s/do_directory( file_name )/do_directory( ::strdup( file_name ) )/ * ExtractFile.h * ExtractFilter.h 1. Added extern declaration. * ExtractExtension.h 1. Added extern declaration. 2. Added: #include "fake_ansi.h" * fake_ansi.h 1. 
Removed __STL_NO_NAMESPACES and __STL_USE_NAMESPACES. This stopped working and I can't figure out why. * FilesGrow.c 1. This functionality was replaced by conf_percent.c for feature DIR1. * FilesGrow.h 1. Changed to be derived from conf_percent. 2. Added extern declaration. * file_info.c 1. Added result_separator. 2. Redid the constructor mem-initializers. 3. Moved common constructor code to construct(). 4. Added a second constructor used for reconstituting instances during incremental indexing. 5. Moved code for parse() to index.c. * file_info.h 1. Removed: #include 2. Added: #include "fake_ansi.h" 3. Added a second constructor used for reconstituting instances during incremental indexing. 4. Added dir_index() and dir_index_ for feature DIR1. 5. Added: construct() 6. Made all data members private and added accessor functions. 7. s/struct/class/ 8. Added: const_iterator, begin(), end(), ith_info(), and num_files(). * FilterFile.h 1. Added extern declaration. 2. Removed: #include * fnmatch.h 1. Removed unused #ifndef's. 2. Added #undef's. 3. Removed FNM_ERROR since it's not used. * FollowLinks.h 1. Added extern declaration. * GNUmakefile 1. Added conf_percent.c for feature DIR1. 2. Removed FilesGrow.c for feature DIR1. 3. Removed file_info.c from S_SRCS since file_info::out() has been moved to write_file_info() in search.c for feature DIR1. 4. Added query.c to S_SRCS. * IncludeFile.h 1. Added extern declaration. * IncludeMeta.h 1. Added: #include "fake_ansi.h" * Incremental.h 1. Added extern declaration. * index.c 1. Added DirectoriesGrow and DirectoriesReserve for feature DIR1. 2. Added my_write() since ostream::write() now apparently requires a char* rather than a void* and I'm lazy about having to cast the pointers. 3. Added dirs-reserve and dirs-grow command-line options for feature DIR1. 4. Added #ifdef PJL_GCC_295. 5. In load_old_index(), added loading of directory index for feature DIR1. 6. Moved index-file header-writing code to index_header.c. 7. 
Added write_dir_index() for feature DIR1. 8. Added new options to usage message for feature DIR1. 9. In usage(), s//title/. A. In main(), added: check_add_directory( "." ); B. s/file_info::parse/parse_file_info/ C. Added parse_file_info(). D. s/do_directory( file_name )/do_directory( ::strdup( file_name ) )/ * IndexFile.h 1. Added: #include "fake_ansi.h" * indexer.h 1. Added: #include "fake_ansi.h" * index_header.c 1. Added this file to have index-file header-writing code only once. * index_segment.h 1. Added dir_index for feature DIR1. * man/man1/index.1 1. Added -D, --dirs-reserve options for feature DIR1. 2. Changed old -G option to -g for feature DIR1. 3. Added new -G, --dirs-grow options for feature DIR1. 4. Added missing FilesGrow variable for bug fix MFR. 5. Added DirectoriesGrow and DirectoriesReserve variables for feature DIR1. * man/man4/swish++.conf.4 1. Added missing FilesReserve variable for bug fix MFR. 2. Added DirectoriesGrow and DirectoriesReserve variables for feature DIR1. 3. Added "Percentage variables" section. * man/man4/swish++.index.4 1. Added directory index description. 2. Added other module cases describing a file's title. 3. Made separate BCD subsection. * meta_map.h * mmap_file.h * mod_html.h * mod_mail.h 1. Added: #include "fake_ansi.h" * mod_man.c 1. In index_words(), s/register char const* c/char const* c/ since its address is taken. * my_set.h 1. Moved declaration of #include "fake_ansi.h". * omanip.h 1. Added: #define PJL /* nothing */ 2. Added: #include "fake_ansi.h" * option_stream.h * pattern_map.h * PidFile.h 1. Added: #include "fake_ansi.h" * query.c * query.h 1. Split out the query-parsing code from search.c to here. * ResultsMax.h 1. Added extern declaration. * ResultSeparator.h 1. Added extern declaration. 2. Added: #include "fake_ansi.h" * search.c 1. Added "directories" index_segment global variable for feature DIR1. 2. Moved file_info::out() to write_file_info() for feature DIR1. 3. 
Moved result_separator definition here for feature DIR1. 4. Moved query-parsing code to query.c. * search.h * SocketAddress.h * SocketFile.h 1. Added: #include "fake_ansi.h" * StemWords.h 1. Added extern declaration. * StopWordFile.h 1. Added: #include "fake_ansi.h" * swish++.conf 1. Added DirectoriesGrow and DirectoriesReserve for feature DIR1. * TempDirectory.h 1. Added: #include "fake_ansi.h" * thread_pool.c 1. Added start_function_type to thread() constructor. * thread_pool.h 1. Added start_function_type to thread() constructor. 2. Added: #define PJL /* nothing */ 3. Added: #include "fake_ansi.h" * util.h 1. Added: #include "fake_ansi.h" * version.h 1. Updated version to "5.1". * WordFilesMax.h * WordPercentMax.h 1. Added extern declaration. ******************************************************************************* 5.0.1 ******************************************************************************* BUG FIXES --------- * This release fixes a lot of compile issues (mostly namespaces) with g++ 3.0. (This bug fix shall be known as bug fix GCC3.) * The changes to fix the above have apparently caused bugs in (at least) g++ 2.95.3 to manifest themselves: 1. In some cases, the compiler "forgets" that operator<<( ostream&, string const& ) has been defined. The hack workaround is to use operator<<( ostream&, char const* ) and use string::c_str(). 2. The compiler "forgets" that stream manipulators have been defined. The workaround is not to use them. :-( (This fix shall be known as fix OOS.) CHANGES, file-by-file --------------------- * bcd.h 1. Switched to using local omanip since depending on the underlying C++ implementation is not portable. This was done for GCC3. * config/config-sh 1. Added PJL_GCC_295 since it's used in multiple places. This was done for OOS. * config/config.mk 1. Made OPTIM = -O2 for g++ also since the optimizer under 3.0 takes ridiculously long and uses most of the CPU and memory. 2. 
s/($(CC),g++)/($(findstring g++,$(CC)),g++)/

* conf_var.h
  1. s/cerr/std::cerr/ for OOS.

* do_file.c
  1. s/basename/pjl_basename/ due to name collision.

* fake_ansi.h
  1. Replaced __GNUC__, et al, with PJL_GCC_295 for OOS.

* fdbuf.c
* fdbuf.h
  1. Added these files since the ability to attach an fstream to a Unix file
     descriptor has been removed from ANSI C++. This was done for OOS.

* filter.c
  1. s/basename/pjl_basename/ due to name collision.

* filter.h
  1. s/std::unlink/::unlink/ for OOS.

* GNUmakefile
  1. Added fdbuf.c to S_SRCS for OOS.

* index.c
  1. Added my_write() since ostream::write() now apparently requires a char*
     rather than a void* and I'm lazy about having to cast the pointers. This
     was done for OOS.
  2. s/o.write( /my_write( o, / for OOS.
  3. Added #ifdef PJL_GCC_295 for fix OOS.

* index_segment.h
  1. s/random_access_iterator_tag/std::random_access_iterator_tag/ for OOS.

* less.h
  1. Added needed "namespace std { ... }" for OOS.

* mmap_file.c
  1. s/ios::open_mode/ios::openmode/ for OOS.

* mmap_file.h
  1. Added missing #include <fstream> for OOS.
  2. s/ios::open_mode/std::ios::openmode/ for OOS.
  3. s/reverse_bidirectional_iterator/std::reverse_bidirectional_iterator/
     for OOS.

* omanip.h
  1. Added this file to roll own ostream manipulator since depending on the
     underlying C++ implementation is not portable. This was done for OOS.

* option_stream.h
  1. s/cerr/std::cerr/ for OOS.

* pattern_map.h
  1. Removed PJL_LOCAL_FNMATCH since it's not needed.
  2. s/unary_function/std::unary_function/ for OOS.
  3. s/std::fnmatch/::fnmatch/ for OOS.

* search.c
  1. s/#include <iomanip>/#include "omanip"/ for OOS.
  2. Added #ifdef PJL_GCC_295 for fix OOS.

* search.h
  1. s/cerr/std::cerr/ for OOS.
  2. s/cout/std::cout/ for OOS.

* search_thread.c
  1. Removed #include <fstream> for OOS.
  2. Added #include "fdbuf.h" for OOS.
  3. Switched to using fdbuf since the ability to attach an fstream to a Unix
     file descriptor has been removed from ANSI C++.

* stem_word.h
  1. s/less/std::less/

* util.h
  1.
Added missing #include <iostream>
  2. s/basename/pjl_basename/ due to name collision.
  3. s/std::stat/::stat/ for OOS.
  4. s/std::lstat/::lstat/ for OOS.
  5. s/cerr/std::cerr/ for OOS.
  6. s/endl/std::endl/ for OOS.

* version.h
  1. Updated version to "5.0.1".

* word_info.c
  1. Added missing "using namespace std;" for OOS.

*******************************************************************************
5.0
*******************************************************************************

NEW FEATURES
------------

* The indexing code has been rearchitected to be modular allowing for new file
  formats to be indexed directly (without filters). Consequently, the
  indexing of HTML files has been turned into a module. The -e option and
  IncludeFile variable are now INCOMPATIBLE with previous releases. Read the
  updated documentation.
  (This feature shall be known as feature MOD.)

* A filter module for mail (and news) files has been added.
  (This feature shall be known as feature MAIL.)

* A filter module for manual page files has been added.
  (This feature shall be known as feature MAN.)

* For index, a new -A or --no-assoc-meta option and AssociateMeta
  configuration variable have been added.
  (This feature shall be known as feature AMN.)

* There is a new %E (second-to-last filename extension) substitution.
  (This feature shall be known as feature 22L.)

* FilterFile configuration lines are now different and INCOMPATIBLE with
  previous releases. The @ character no longer does substitutions but merely
  marks the target filename. This was done to enable filtering to files
  having a fixed name to be able to handle filenames with spaces better.
  (This feature shall be known as feature SM2.)

* The search daemon can now answer queries via TCP sockets in addition to
  Unix domain sockets.
  (This feature shall be known as feature TCP.)

* You can now specify the separator character in search results.
  (This feature shall be known as feature SRS.)

* Added parsing of XHTML 1.1 ruby elements.
  (This feature shall be known as feature RUBY.)

BUG FIXES
---------

* The index(1) -T option was ignored.
  (This bug fix shall be known as bug fix ITO.)

* A configuration file that did not end in a newline would cause a segfault.
  (I think: I never tried it, but it looked like a bug to me.)
  (This bug fix shall be known as bug fix CNL.)

* Configuration error messages output "(null)" (or seg-faulted) for the
  variable name. I don't see how the compiler didn't catch this since the
  name_ data member is const and therefore must be initialized in the
  constructor.
  (This bug fix shall be known as bug fix NVR.)

* Setting the SearchDaemon config. variable to Y did not allow search to be
  run with no command-line arguments.
  (This bug fix shall be known as bug fix DCL.)

* Filter substitution incorrectly rescanned substituted text.
  (This bug fix shall be known as bug fix FSR.)

* Several tweaks were made to make SWISH++ compile under FreeBSD.
  (This bug fix shall be known as bug fix BSD3.)

* Added -lnsl for compiling the search daemon under Solaris.
  (This bug fix shall be known as bug fix SOL2.)

* Removed some more buffer-overflow bugs.
  (This bug fix shall be known as bug fix BOB.)

* Filename patterns didn't match if the wildcard wasn't first, e.g., foo*
  (This bug fix shall be known as bug fix WWF.)

CHANGES, file-by-file
---------------------

* AssociateMeta.h
  1. Added this file for feature AMN.

* auto_vec.h
  1. Removed #ifdef SEARCH_DAEMON since it's now used by code not in the
     search daemon for bug fix BOB.
  2. Added "explicit" to constructor.
  3. Added: auto_vec<T>& operator=( T *p )
  4. s/T *const p_/T *p_/
  5. Added PJL namespace.

* bcd.c
  1. s/fake_ansi.h/platform.h/
  2. s/STATIC_CAST(...)/static_cast<...>/

* config/config.mk
  1. Added MOD_* definitions for feature MOD.
  2. Added MOD_LIST to CCFLAGS for feature MOD.
  3. Removed definition of MAKE.
  4. Added separate "OS selection" section since there's now FreeBSD and
     Solaris also.
  5. Added "PTHREAD_LIB= -pthread" for bug fix BSD3.
  6.
Added "SOCKET_LIB+= -lnsl" for bug fix SOL2.
  7. Wrapped thread and socket stuff inside "ifdef SEARCH_DAEMON".
  8. Added OS variable.
  9. Added OPTIM variable since -O3 in the cygwin environment causes a
     segfault due to an optimizer bug, presumably.
  A. Added: -DMOD_MAN for feature MAN.
  B. If g++, added: -fno-exceptions to reduce code size.

* config/src/mutable.c
  1. Removed this file since all C++ compilers should now support "mutable".

* config/src/new_casts.c
  1. Removed this file since compilers should be implementing new casts by
     now.

* config/src/socklen_1_socklen_t.c
* config/src/socklen_2_int.c
* config/src/socklen_2_unsigned.c
  1. Added "#include <sys/types.h>" for bug fix BSD3.

* config.h
  1. Added "#ifdef MOD_HTML" around HTML and XHTML options for feature MOD.
  2. Moved Title_Max_Size and TitleLines_Default down to Miscellaneous
     section for feature MAIL.
  3. Added SocketPort_Default for feature TCP.

* conf_bool.h
* conf_int.c
* conf_int.h
* conf_set.c
* conf_set.h
* conf_string.c
* conf_string.h
* ExcludeFile.h
* FilesGrow.c
* FilesGrow.h
* FilterFile.c
* FilterFile.h
  1. Removed "var_name" parameter from parse_value().

* conf_bool.c
  1. Removed "var_name" parameter from parse_value().
  2. Added "using namespace std;" since it should have been there all along.
  3. Switched to using auto_vec<char> and to_lower_r() for bug fix BOB.
  4. Added PJL namespace.

* conf_enum.c
* conf_enum.h
  1. Added for feature TCP.

* conf_int.c
  1. Switched to using auto_vec<char> and to_lower_r() for bug fix BOB.
  2. s/cerr << error/error()/
  3. Added PJL namespace.

* conf_set.h
  1. Added PJL namespace.

* conf_string.c
  1. Added (missing) include of platform.h and namespace stuff.
  2. Added code to strip leading/trailing quotes for feature SRS.
  3. s/cerr << error/error()/

* conf_string.h
  1. Added == and != operators.

* conf_var.c
  1. Removed HTMLFile for feature MOD.
  2. Added "#ifdef MOD_HTML" around ExcludeClass for feature MOD.
  3. Removed "var_name" parameter from parse_value().
  4.
Replaced alias_name() by constructor.
  5. Added ExtractFile for feature MOD.
  6. In parse_file(), redid the finding of a newline for bug fix CNL.
  7. In parse_file(), made use of find_newline().
  8. In map_ref(), added "SocketAddress" for feature TCP.
  9. In conf_var::conf_var(), added initialization of name_ for bug fix NVR.
  A. In conf_var::conf_var():
     s/map_ref()[ name_ ]/map_ref()[ to_lower( name_ ) ]/
     so the case for variable names is irrelevant.
  B. In conf_var::map_ref(), made all variable names lower case.
  C. In conf_var::parse_line(), added "to_lower( line )" so the case for
     variable names is irrelevant.
  D. In conf_var::parse_line(): s/ in config. file//
  E. In conf_var::map_ref(), changed to doing initialization via a table.
     (This had the side-effect of making "search" work under FreeBSD.)
  F. Added ResultSeparator variable for feature SRS.
  G. s/cerr << warning/warning()/
  H. Added PJL namespace.
  I. s/isspace/is_space/
  J. Added "associatemeta" for feature AMN.

* conf_var.h
  1. Added default argument of "cerr" to error() and warning().

* directory.c
  1. Moved configuration variable extern declarations to .h files.

* do_file.c
  1. Added "#ifdef INDEX" around declaration of orig_file_size and
     orig_file_name. (It should have been there all along.)
  2. Reworked calling of the indexer for feature MOD.
  3. Added ExtractFile::const_iterator for feature MOD.
  4. s/name_set_.contains()/seen_file()/
  5. s/file_info::current_file().num_words_/fi->num_words()/
  6. Removed "filter_list.reserve( 5 )" so as not to waste time and thereby
     penalize the performance for files that are not filtered.
  7. Recalculated basename for bug fix WWF.

* elements.c
  1. Added "#ifdef MOD_HTML" for feature MOD.
  2. s/REINTERPRET_CAST(...)/reinterpret_cast<...>/
  3. Added ruby elements for feature RUBY.

* elements.h
  1. Added "#ifdef MOD_HTML" for feature MOD.
  2. Added PJL namespace.

* entities.h
* entities.c
  1. Added "#ifdef MOD_HTML" for feature MOD.

* encoded_char.c
* encoded_char.h
  1. Added for feature MAIL.
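
Note on conf_var.c items A through C above: those items lower-case variable names both when a variable is registered and when a configuration line is parsed, so that case is irrelevant. A minimal sketch of that idea follows. It is not the SWISH++ code: var_registry and lower_copy() are hypothetical names used only for illustration, and the real conf_var class presumably stores more than a string value.

```cpp
#include <cctype>
#include <map>
#include <string>

// Hypothetical helper (not SWISH++'s to_lower()): returns a lower-cased
// copy of an ASCII string.
std::string lower_copy( std::string s ) {
    for ( std::string::size_type i = 0; i < s.size(); ++i )
        s[i] = static_cast<char>(
            std::tolower( static_cast<unsigned char>( s[i] ) )
        );
    return s;
}

// Hypothetical registry of configuration variables keyed by lower-cased
// name, so lookups succeed regardless of the case used in the config file.
class var_registry {
public:
    // Registers (or overwrites) a variable; the stored key is lower case.
    void define( std::string const &name, std::string const &value ) {
        vars_[ lower_copy( name ) ] = value;
    }

    // Looks up a variable ignoring case. Returns true and sets value only
    // if the variable was previously defined.
    bool lookup( std::string const &name, std::string &value ) const {
        std::map<std::string,std::string>::const_iterator const it =
            vars_.find( lower_copy( name ) );
        if ( it == vars_.end() )
            return false;
        value = it->second;
        return true;
    }

private:
    std::map<std::string,std::string> vars_;
};
```

Under these assumptions, calling define( "ResultSeparator", "|" ) and then lookup( "RESULTSEPARATOR", v ) succeeds and sets v to "|", because both sides of the comparison are lower-cased before the map is consulted.
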
* ExcludeClass.h
  1. Added "extern ExcludeClass exclude_class_names;".

* ExcludeFile.c
  1. Added "using namespace std;".
  2. Removed "var_name" parameter from parse_value().

* ExcludeMeta.h
  1. Added "extern ExcludeMeta exclude_meta_names;".

* exit_codes.h
  1. Created TCP and Unix versions of the search daemon exit codes.

* extract.c
  1. s/IncludeFile/ExtractFile/ for feature MOD since extraction doesn't use
     modules.
  2. Reworked -e and -E options to allow multiple, comma-separated patterns
     just like for index(1).
  3. Added PJL namespace.
  4. In extract_words(), removed "buf" and now using "word" exclusively.

* ExtractFile.c
* ExtractFile.h
  1. Added for feature MOD.

* fake_ansi.h
  1. Got rid of faking "mutable" since all C++ compilers should now support
     this.
  2. Removed new casts section since compilers should be implementing them by
     now.
  3. Added hack to fix g++/STL/iterator bug.

* file_info.c
  1. s/fake_ansi.h/platform.h/
  2. s/REINTERPRET_CAST(...)/reinterpret_cast<...>/
  3. Added: #ifndef PJL_NO_NAMESPACES (it should have been there all along).
  4. Added definition of result_separator variable for feature SRS.
  5. s/' '/result_separator/ for feature SRS.

* file_info.h
  1. s/ostream/std::ostream/
  2. Made all but list_ data members private.
  3. Added public accessor functions for now-private data members.
  4. Added inc_words() and seen_file().
  5. Added PJL namespace.

* file_list.c
  1. Removed: #include "fake_ansi.h"
  2. Removed PJL_NO_MUTABLE section.
  3. s/THIS->//

* file_list.h
  1. Removed: #include "fake_ansi.h".
  2. s/REINTERPRET_CAST(...)/reinterpret_cast<...>/
  3. Removed pointer and reference type.
  4. Made const_iterator derived from std::iterator.

* file_vector.c
* file_vector.h
  1. Replaced by mmap_file.[ch]

* FilesReserve.h
  1. Added "extern FilesReserve files_reserve;".

* filter.c
  1. Removed all WIN32 special cases.
  2. Added (missing) #include "platform.h"
  3. s/find()/rfind()/
  4. Added code to increment pos past substituted text for bug fix FSR.
  5.
Added code for %E for feature 22L.
  6. Changed handling of @ for feature SM2.
  7. Made use of basename() added to util.h.

* filter.h
  1. s/::unlink/std::unlink/

* FilterFile.c
  1. Changed handling of @ for feature SM2.
  2. Consequently, now require only 1 substitution.
  3. Added %E as a valid substitution for feature 22L.

* fnmatch.c
  1. Added: #include "platform.h"
  2. Added: #ifndef PJL_NO_NAMESPACES

* GNUmakefile
  1. Reorganized HTML sources for feature MOD.
  2. Added MOD_MAIL sources for feature MAIL.
  3. Added conf_enum.c, SearchDaemon.c, and SocketAddress.c for feature TCP.
  4. Added IncludeMeta.c for feature MAIL.
  5. Added splitmail target for feature MAIL.
  6. s/ifndef WIN32/ifdef SEARCH_DAEMON/
  7. Added fnmatch.c conditionally for WIN32 to E_SRCS.
  8. Removed WIN32 special case for platform.h.
  9. s/=/:=/
  A. Reworded C++ compiler section.
  B. Added MOD_MAN sources for feature MAN.

* html.c
* html.h
  1. Replaced by mod_html.c and mod_html.h, respectively, for feature MOD.

* IncludeFile.h
  1. Removed "var_name" parameter from parse_value().
  2. Removed the alias for HTML_File for feature MOD.
  3. s/pattern_map< bool >/pattern_map< indexer* >/ for feature MOD.

* IncludeFile.c
  1. Removed "var_name" parameter from parse_value().
  2. Changed form of line to include indexer for feature MOD.
  3. Added "using namespace std;".

* IncludeMeta.c
  1. Added this file for feature MAIL.
  2. Added PJL namespace.

* IncludeMeta.h
  1. Added "extern IncludeMeta include_meta_names;".
  2. Changed base class from conf_set to conf_var and map for feature MAIL.

* index.c
  1. Added "#ifdef MOD_HTML" for feature MOD.
  2. Performed following substitutions for feature MOD.
       s/html.h/mod_html.h/
       s/index.h/indexer.h/
  3. Changed the syntax for -e for feature MOD.
  4. Removed the -h option for feature MOD.
  5. Moved index_word() to indexer.c for feature MOD.
  6. Allowed multiple patterns to be specified via -E option.
  7. In main(), performed following substitution:
       s/TempDirectory_Default/0/
     for bug fix ITO.
  8.
Updated the usage message for feature MOD.
  9. Moved configuration variable extern declarations to .h files.
  A. In main() for case 'm', performed following substitution:
       s/include_meta_names.insert( to_lower( opt.arg() ) )
        /include_meta_names.parse_value( opt.arg() )/
  B. Added "#include <sys/time.h>" for bug fix BSD3.
  C. s/REINTERPRET_CAST(...)/reinterpret_cast<...>/
  D. s/remove_temp_files()/remove_temp_files( void )/ for picky HP-UX
     compiler.
  E. In rank(), s/num_words_/num_words()/
  F. In write_file_index(), made use of new file_info member functions.
  G. Removed all WIN32 special cases.
  H. Added PJL namespace.
  I. Added associate_meta global variable for feature AMN.
  J. Added "no-assoc-meta" and 'A' command-line options for feature AMN.

* index.h
  1. Replaced by indexer.h for feature MOD.

* indexer.c
* indexer.h
  1. Added for feature MOD.
  2. Added PJL namespace.

* index_segment.c
  1. Removed: #include "fake_ansi.h"
  2. s/REINTERPRET_CAST(...)/reinterpret_cast<...>/
  3. Added PJL namespace.

* index_segment.h
  1. Made index_segment::const_iterator derived from std::iterator.
  2. Added PJL namespace.

* init_modules.c
  1. Added for features MOD and MAIL.

* INSTALL.win32
  1. Changed from mingw to cygwin.
  2. Removed note about extract(1).
  3. Changed build instructions to match Unix version.

* itoa.c
  1. s/fake_ansi.h/platform.h/
  2. Added PJL namespace.

* itoa.h
  1. Added PJL namespace.

* less.h
  1. s/binary_function/std::binary_function/

* man/man1/extract.1
  1. Added description for multiple patterns for -e, --pattern, -E, and
     --no-pattern options.
  2. s/pjl@best.com/pauljlucas@mac.com/

* man/man1/httpindex.1
* man/man4/swish++.index.4
  1. s/pjl@best.com/pauljlucas@mac.com/

* man/man1/index.1
  1. s/pjl@best.com/pauljlucas@mac.com/
  2. Added description of modules and mod_mail for feature MAIL.
  3. Removed -h, --html-pattern, and HTMLFile.
  4. Reworked description of -m and --meta.
  5. Added references for feature MAIL.
  6. Added references for feature MAN.
  7.
Added -A, --no-assoc-meta, and AssociateMeta for feature AMN.
  8. Added mention of and reference for Ruby elements for feature RUBY.
  9. Made -T option no longer refer to <TITLE> element.

* man/man1/search.1
  1. Added -R, --separator, ResultSeparator for feature SRS.
  2. Made "select" in daemon example more concise.

* man/man1/splitmail.1
  1. Added for feature MAIL.

* man/man3/WWW.3
  1. s/pjl@best.com/pauljlucas@mac.com/
  2. Redid formatting of references.
  3. Removed trim_whitespace(), url_decode(), and url_encode() since they are
     no longer used now that the search.cgi example uses CGI.pm.

* man/man4/swish++.conf.4
  1. Added section for enumeration variables and SearchDaemon for feature
     TCP.
  2. Changed IncludeFile from a set variable to an other variable for feature
     MOD.
  3. Added SocketAddress for feature TCP.
  4. Added section for IncludeMeta for feature MAIL.
  5. s/pjl@best.com/pauljlucas@mac.com/
  6. Added: "For variables_names, case is irrelevant."
  7. Added note about preserving whitespace in string values.
  8. Added ResultSeparator for feature SRS.
  9. Added more IP address detail for SocketAddress.
  A. Added "# WRONG!" comment to filter example.
  B. Added "AssociateMeta" for feature AMN.
  C. Added missing FollowLinks.
  D. Added "Man" module for feature MAN.

* mmap_file.c
  1. Added "#include <sys/time.h>" for bug fix BSD3.
  2. s/REINTERPRET_CAST( caddr_t )( -1 )/MAP_FAILED/
  3. Removed all WIN32 code.
  4. s/fake_ansi.h/platform.h/
  5. Added PJL namespace.

* mmap_file.h
  1. Removed all WIN32 code.
  2. Added PJL namespace.

* mod_html.c
  1. Replaced html.c for feature MOD.
  2. Reworked everything to use encoded_char_ranges.
  3. Moved configuration variable extern declarations to .h files.
  4. In parse_html_tag(), s/tag/name/
  5. Added PJL namespace.
  6. Removed "buf" and now using "word" by itself.
  7. s/isxdigit/is_xdigit/
  8. s/isdigit/is_digit/
  9. s/isalpha/is_alpha/
  A. s/isspace/is_space/
  B. Reworked how meta names are handled for feature AMN.
  C.
In tag_cmp(), "fixed" increment and end-of-string test.

* mod_html.h
  1. Replaced html.h for feature MOD.
  2. Made find_title(), index_words(), and new_file() public so they could be
     accessed from mod_mail.c.
  3. Made index_words() and parse_html_tag() take an encoded_char_range or
     encoded_char_range::const_iterator argument so they could parse HTML
     that is encoded.
  4. Moved configuration variable extern declarations to .h files.
  5. Added PJL namespace.

* mod_mail.c
* mod_mail.h
  1. Added for feature MAIL.

* mod_man.c
* mod_man.h
  1. Added for feature MAN.

* my_set.h
  1. Added PJL namespace.

* option_stream.c
* option_stream.h
  1. Added PJL namespace.

* pattern_map.h
  1. s/::find_if/std::find_if/
  2. Added: #ifdef PJL_LOCAL_FNMATCH
  3. Added "typename" to declaration of map_type.
  4. s/value_type const&/argument_type/

* PidFile.h
  1. Added "extern PidFile pid_file_name;".

* platform.h.win32
  1. Removed since no longer needed under cygwin.

* postscript.h
  1. Added PJL namespace.

* README
  1. Added new feature descriptions.
  2. s!www.best.com/~pjl!homepage.mac.com/pauljlucas!
  3. Added mention of Christoph Conrad.

* RecurseSubdirs.h
  1. Added "extern RecurseSubdirs recurse_subdirs;".

* ResultSeparator.h
  1. Added this file for feature SRS.

* search.c
  1. s/html.h/indexer.h/ for feature MOD.
  2. Added #include "SocketAddress.h" for feature TCP.
  3. s/am_daemon/daemon_type/ and s/daemon_opt/type_type_arg/ for feature
     TCP.
  4. Made the daemon configuration variables global and become_daemon() take
     no arguments because the argument list was getting way too long.
  5. In search_options::search_options(), added socket_address_arg for
     feature TCP.
  6. In search_options::search_options(), added -a option for feature TCP.
  7. In usage(), updated message for feature TCP.
  8. Added "#include <sys/time.h>" for bug fix BSD3.
  9. In main(), moved check of number of command-line arguments after
     conf_var::parse_file() and command-line override code for bug fix DCL.
  A.
Switched to using auto_vec<char> and to_lower_r() all the time for bug fix BOB.
  B. Removed all WIN32 special cases.
  C. Added: #include "ResultsSeparator.h" for feature SRS.
  D. In main(), added code for result_separator for feature SRS.
  E. In dump_single_word(), search(), and service_request():
     s/' '/result_separator/ for feature SRS.
  F. In search_options::search_options(), added 'R' case for feature SRS.
  G. In usage(), added line for -R for feature SRS.
  H. Added PJL namespace.

* search.h
  1. s/bool daemon_opt/char const* daemon_opt_arg/ for feature TCP.
  2. Added socket_address_arg for feature TCP.
  3. s/ostream/std::ostream/ (it should have been that way all along).
  4. Added result_separator_arg for feature SRS.
  5. Added PJL namespace.

* searchc.in
  1. Added stuff to connect via a TCP socket to the search daemon for feature
     TCP.
  2. Updated Perl book references for 3rd ed.

* search_daemon.c
  1. Moved configuration variable extern declarations to .h files.
  2. Added "#include <sys/time.h>" for bug fix BSD3.
  3. In accept_failed(), added "#ifdef EPROTO" for bug fix BSD3.
  4. Partitioned the code into smaller functions.
  5. Added PJL namespace.

* SearchDaemon.c
  1. Added a bunch of #include's and extern declarations since
     become_daemon() now uses globals rather than parameters. This was done
     for feature TCP.
  2. Added accept_failed() function for feature TCP.
  3. In become_daemon(), added code for TCP sockets for feature TCP.

* SearchDaemon.h
  1. Changed to be derived from conf_enum for feature TCP.

* SearchDaemon.c
  1. Added this file for feature TCP.

* search_options.c
  1. Added "separator", 'R' option for feature SRS.

* search_thread.c
  1. s/fake_ansi.h/platform.h/
  2. Added PJL namespace.
  3. s/isspace/is_space/

* search_thread.h
  1. Added PJL namespace.

* SocketAddress.h
* SocketAddress.c
  1. Added these files for feature TCP.

* SocketFile.h
  1. Added "extern SocketFile socket_file_name;".

* socket_options.c
  1. Added "socket-address" for feature TCP.
  2.
s/daemon/daemon-type/ and made it take an argument for feature TCP.

* SocketQueueSize.h
  1. Added "extern SocketQueueSize socket_queue_size;".

* SocketTimeout.h
  1. Added "extern SocketTimeout socket_timeout;".

* splitmail.in
  1. Added this utility for feature MAIL.

* stem_word.c
  1. Performed following substitution:
       s/replace_suffix( char *word, rule_list* )
        /replace_suffix( char *word, rule_list const* )/
     It should have been that way all along.

* stop_words.c
  1. Added "shall", "you'll", "you're".
  2. Added PJL namespace.
  3. s/word_buf/word/
  4. s/word_len/len/

* stop_words.c
  1. Added PJL namespace.

* swish++.conf
  1. Added ExtractFile for feature MOD.
  2. Removed HTMLFile for feature MOD.
  3. Added module name to IncludeFile for feature MOD.
  4. Changed SearchDaemon for feature TCP.
  5. Added SocketAddress for feature TCP.
  6. Added missing variables and sorted alphabetically properly.
  7. Added IncludeMeta values for mail/news.
  8. Added ResultSeparator variable for feature SRS.
  9. Changed @ in FilterFile lines for feature SM2.
  A. Added "AssociateMeta" for feature AMN.

* swish++.conf.4
  1. Added note about preserving whitespace in string values.
  2. Added ResultSeparator for feature SRS.
  3. Added more IP address detail for SocketAddress.
  4. Added "# WRONG!" comment to filter example.
  5. Added "For variables_names, case is irrelevant."
  6. Sorted Other variables.
  7. Added section for IncludeMeta.
  8. Removed HTMLFile.
  9. Added ExtractFile.
  A. Added section for enumeration variables and SearchDaemon.
  B. Changed IncludeFile from a set variable to an other variable.
  C. Added SocketAddress.
  D. s/pjl@best.com/pauljlucas@mac.com/
  E. Ad