Document:   fsc-00xx
 Version:    0.6
 Date        Aug 30, 1995
 Title:      Implementation and Usage of FileFind Utilities
 Authors:    Robert Williamson FidoNet#1:167/104.0


    A portion of the document is derived from information in
      AllFix.DOC by Harald Harms @ 2:281/910
    with  additional  sections  from
      FQuery.DOC by Robert Williamson @ 1:167/104

        The  MSdos program ALLFIX by Harald Harms first introduced the idea
    of searching for files via echomail.  The term applied to this function
    is  'FileFind'.   A FileFind system allows sysops, points and BBS users
    to  search  for  files  by  placing  a  message  to 'ALLFIX' in an echo
    designated  for  the purpose of finding files.  All FTN sites running a
    FileFind  processor which is configured to scan that echo will reply to
    that  user if there any files matching his query.  This system provides
    a  method  for  searching  many  FTN sites throughout the world, with a
    single message.

        FileFind  programs  work by either scanning through defined message
    bases or scanning packets for defined AREA tagnames for messages to the
    default  name  ALLFIX.   All FileFind programs MUST respond to the name
    ALLFIX,  but  may also respond to the name FILEFIND and the name of the
    particular  FileFind  program  in  use  or  defined  for the echo.  The
    FileFind  program  will  process  these messages, examining the Subject
    field  for  search  queries.  If any valid query is found, the FileFind
    program  will  search  the  sites files database for files matching the
    users's query.

        If the FileFind program finds any matches, it will generate a reply
    containing  a list of the files found, and some basic information ABOUT
    the  system  posting  the reply.  When the user who initially wrote the
    request  reads  the reply, he will then be able to decide if any of the
    reported  files  meet  his  needs,  and  from the ABOUT included in the
    reply, learn where and how he may get those files.

  FileFind Query Message Structure

    To: name_of_FileFind program

    The  message  must be addressed to ALLFIX so that all FileFind programs
    can  respond.   To  use  features  specific  to  a  particular FileFind
    program,  or  to  limit  the  responses  to  a particular platform, the
    message  should  be  addressed  to  that program's name.  Some FileFind
    programs  will  respond  to more than two names.

    A  space-separated  list  of  file  specifications,  keywords or quoted

    keyword     - single word preceeded by a '/' with no intervening spaces,
                  must be at least 3 characters, not including the '/'.
                  a keyword search is in actually a substring search of the
                  site's filelist.

    description - string enclosed in double-quotes,
                  if a single word, must be more than 3 characters.

    filespec    - single word, no spaces, no double-quotes or preceding /,
                  must be at least 3 characters, not including any wildcard
                  or pattern matching charcaters, such as '*'.
                  Messages addressed to ALLFIX must not have any embedded
                  pattern matching characters.

        The  minimum  number  of  characters  for  description, keyword and
    filespec  queries  is an implementation detail of the FileFind program.
    These  values  should  be configurable, but should never be settable to
    values of less than 3.

        Each  implementation  should  allow  the  operator  the  ability to
    configure a list of disallowed keywords.

  NetMail Queries

        Some  FileFind  programs  may also have the ability to process file
    search  queries  received  as  netmail and addressed to the name of the
    particular  FileFInd  program  with this capability.  In this case, all
    replies are via netmail also.

  NetMail Commands
        FileFind   Netmail   commands  are  identifed  by  a  leading  '%'.
    Implementation  of  netmail  commands  is  optional.   If  implemented,
    compliant  FileFind  utilities  should be able to process the following
    minimum NetMail command set.

    %HELP       - netmail only, returns an extended help text for the
                  FileFind program, the ABOUT of the the site and a list
                  of MAGIC freqable names.
    %ABOUT      - netmail only, returns the ABOUT of the site and a full
 or %MAGIC        list of MAGIC names.

    %NEWFILES   - netmail only, returns the NEWFILES list of the site
 or %NEW          via netmail.

    Extended NetMail Commands:
        Implementation  of  the  following netmail commands is optional and
    not required for compliance with the FileFind NetMail Command set.

    %REPORT <tagname>
                - sends a configuration report for echo <tagname>
                  this allows an echo moderator to check if a site running
                  a  FileFind  utility  is  compliant with the rules of the
                  filefind echo.

    %REQUEST <filename>
                - if found, will place requested file on hold for remote

    %UUREQUEST <filename>
                - if  found,  and  the filesize after uuencoding is less
                  than 60K, it will be sent as multiple netmail messages

  The Site ABOUT

        Obviously,  a  system that neither accepts file requests nor allows
    users  to  download  on  their  first  call should not be responding to
    FileFind  messages.   If  there  are  any limitations for the caller to
    acquire  any  of  the  files  that  the  site  has  advertised as being
    available  in  it's FileFind response, these limitations MUST be listed
    in  the  reply.   This information should be included in the ABOUT file
    that the FileFind program user creates.

        The  site  ABOUT  should  contain  the  following information.  The
    FileFind  program  implementor  should  instruct  his  users  on  these

      - sitename
      - site operator's name
      - complete phonenumber
      - baud rate
      - hours during which filerequests are accepted, if at all
      - hours during which users can download
      - conditions for file requests and user downloads
      NOTE: the above information should be within the first 14 lines.
      - a list a MAGIC names
      - an indication if magic names are also available to terminal users.

  Searching for Files and Creating Replies

        The  method  used by the FileFind program to search for requests is
    up  to  the  implementor.   However,  if searching a list, the FileFind
    program should confirm the actual existance of all files that match the
    query specification.

        The  FileFind  program  should  only  process  description strings,
    filespecs  or  keywords  that  contain more than 3 valid characters and
    should  have configuration options to define greater minimum lengths on
    a per-echo basis.

        For  filespecs,  the  wildcard  character '*' IS considered a valid
    specification  as  well  as the '?' wildcard, but only the '?' is to be
    counted  as  a  character  when  determining the length of query.  File
    extensions  are  not necessary and any characters AFTER a '*' are to be
    ignored.   The  FileFind  program should be configurable so as to allow
    replacement  all  of the file extensions with '.*' or '#?' dependant on
    platform.   This  results  in  queries being independant of the various
    archivers in use.


        Replies  created  by  FileFind  utilities  are  expected  to  be in
    compliance with the following FTN specifications:
        FTS-0001    -  packed message format
        FTS-0009    -  MSGID/REPLY
        FSC-0046    -  PID and tear line

        In  addition, a FileFind utility may use the FID:  control line for
    any  information needed that cannot be put in a PID:  without violating
    that specification.

        ^AFID: ascii text CR

    Must be less than 80 characters including ^A and terminating CR.

    There  are three ways in which the FileFind program can create replies:
        - write the replies in the echo in which the query appeared.
        - write the replies in an echo that has been specifically
          designated for that purpose in the particular FTN or for
          a gorup of echos in that FTN.
        - reply via routed netmail.

        Since each FTN site connected to a particular FileFind program area
    is  capable  of creating an information reply, there is much concern as
    to  the  amount  of  traffic  that  can  be generated, FileFind program
    developers  must  be sensitive to these concerns by providing the means
    to  their users to limit the traffic on a per-echo basis.  For example,
    various  FileFind  echos  have  rules  limiting  the  size or number of
    replies,  or  the length of the system information that may be included
    in a reply.

  Limiting replies

    It is strongly suggested that some default limitations be built-in.

    Limiting Site Header (ABOUT):

        If the site's ABOUT, (the text that has been configured in order to
    add  the  system's  information  and Magic names list to the reply), is
    greater  than  14  lines,  the  remainder should NOT be posted.  A line
    should  be  added  to  the response indicated this, and the user may be
    invited to either Freq or download the MAGIC name's ABOUT or MAGIC, for
    a  full  list of magic names.  The FileFind program may optionally send
    the full system information and magic name list via routed netmail.

    Limiting Match List due to ambiguity of query:

        If  the list of matches (note:  not the size of the message itself)
    is  greater than 32K, the FileFind program should post a message to the
    user to indicate that his query may have been too ambiguous and perhaps
    invite him to freq or download the MAGIC name FILES for a full list.

    Splitting Match List into Multiple Messages:

        If the list of matches is greater than 10K, it should be split into
    multiple  messages  of  no more than 8K.  Although the backbone permits
    messages  up  to  16K  in length, 8K is a more readable size.  Only the
    first  split  message  may  contain  the ABOUT information of the site.
    Each  message must be given both a unique Subject field (eg:  prepended
    by  "Part n/n") and a unique MSGID:.  This because some tossers may use
    either or both for dupe detection.

    Limiting Number of Split Messages:

        If  the  number of messages is greater than the preset limit of the
    echo,  and  the FileFind Program does not have an option to forward the
    replies  via  netmail,  the  replies  should  be discarded and the user
    informed that his request may have been too amibiguous.

    NetMail Reply:

        The  FileFind program may have an option to forward all replies via
    routed netmail, or to do so under certain conditions as outlined above.
    Obviously, if the FileFind program can process netmail queries, it MUST
    respond via netmail.

    User NetMail Reply Request:

        Alternativly the user can request a netmail reply for his echomail
    query by preceeding the query with either "%" or "!".
        Subject:  % /fsc /fts

        If  the  FileFind  program  does  not support this feature, it must
    ignore  any  echomail  query message that has a "%" or "!" as the first
    WORD of the Subject field.

    Second Reply or Extended Response Request:

        The  FileFind  site  indicates  availablility  of  Second  Reply by
    placing the string 'program_name 4d_address' in the From:  field of the
        eg: FROM: FQUERY 1:167/104.0

        When a user replies to a FileFind reply, the message will be to the
    FileFind  program  @  {network  address}.  When processing the FileFind
    conferences, the FileFind program will treat any message to itself that
    includes the site address as a Second Reply Request.

        If  this feature is available, the FileFind program will include up
    to  a maximum of 15 files (maximum 12K match list) in it's replies.  If
    the  user  wants  a  more  detailed  listing,  he simply replies to the
    FileFind  program's  reply.   Only  the system that posted the original
    reply  will  respond to that new request.  This second, specific reply,
    will  contain  up  to  50  files (32K of matchlist) either including or
    SKIPPING the first 15.  These numbers may be replaced by byte limits in
    some implementations.

    No Second Reply in Designated Reply Echo:

        The Designated Reply Echo method does not allow replies to be made,
    because  the FileFind program may not be permitted to scan a Designated
    Reply  Echo.  The FileFind program should automatically report up to 50
    files  for any requests.  Therefore, the traffic limitaion features may
    be  disabled for networks that require the FileFind program to reply in
    a Designated Reply Echo, and disallow Second Reply in that echo.

    Disable Local Messages:

        The  FileFind  program must be able to to disable the processing of
    local  messages.  What this means is that the FileFind program will not
    process  any messages generated on that FTN site, including messages by
    the  sysop  using  an  offline  reader,  or by a site's BBS or off-line
    reader users.  This should NOT exclude messages from a site's points.

    Limit by Age:

        The  FileFind program must be configurable so that the operator can
    limit  the  age  of an query message that is acceptable for processing.
    This  should  be  in  number  of  days.   The  FileFind  program may be
    configured  to  process all the FileFind requests regardless of how old
    they are.  Age should never be greater than 365 days.

    LinkMGR Support:
        Implmentors  may choose to support the LinkMGR proposal for netmail
    queries  and  commands.   In this proposal, the queries and commands do
    not  appear  in  the  subject  field but rather, in the the BODY of the
    message.  The subject field wil contain the LinkMGR password.
        Use of the LinkMGR method allows the user to send multiple commands
    to the fIleFind program.
BackGo Back