NAME
    BW Whois -- A whois client by Bill Weinman

SYNOPSIS
    whois [options] request[@host[:port]] [ ... ]

VERSION
    This documents BW Whois version 5.0

DESCRIPTION
    BW Whois was originally designed to work with the new "Shared
    Registration System" whois introduced 1 December 1999. This new system
    has proved to be remarkably disorganized and inconsistent, resulting in
    tremendous confusion for those of us who need to find the ownership of a
    domain now and then.

    This program mitigates most of that confusion by referring to a table of
    TLDs (Top-Level Domains) and associated registrars in the tld.conf file.

    Over the past few years this program has evolved into the most
    full-featured whois client available, providing features like a
    self-detecting CGI mode and SQL database caching for those who need
    such features, while still maintaining a simple command-line interface
    for those who just need that.

    The CGI mode can be secured against abuse by a number of different
    methods including "Referer:" headers, IP addresses, and a system of
    128-bit hashed cookies. These security options can be tailored to suit
    the demands of a given installation using the whois.conf configuration
    file.

    There are features to support a web-based whois service, including
    support for Apache-style server-side includes, and support for a
    distinct initial page and a "domain not found" page.

    An optional caching capability is provided, using an SQL database
    (currently MySQL is supported). When configured for caching, requests
    are forwarded to the corresponding whois server only if the cache does
    not contain a result for the given request/server combination. Cached
    values are expired after a configurable amount of time.

OPERATION
    When given a request, the program first checks the requested domain
    against the tld.conf file for an associated whois server. If not found,
    the program will then submit the request to the "root" whois server
    (currently whois.crsnic.net) and wait for a referral to a registrar's
    whois server.

    If given a referral, the program will then submit the request a second
    time to the referred whois server.

    The request can be a domain name (e.g. whois bw.org) or any other
    entity that the given host can resolve (e.g. whois
    !ww104@whois.networksolutions.com).

    If request is an IP address (or part thereof), the ARIN whois server
    will be used as a root server (whois.arin.net).

    If host is specified, the request will be sent literally to the
    specified host.

    If both host and port are specified, the request will be sent to that
    host using the specified port instead of the normal whois port (43).
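
    As an illustration only, the request[@host[:port]] form from the
    SYNOPSIS could be split as in the following Python sketch. The names
    parse_target and Target are hypothetical and are not part of BW Whois;
    the program's actual parsing may differ (for instance in how it treats
    a request that itself contains "@").

```python
from typing import NamedTuple, Optional

WHOIS_PORT = 43  # the normal whois port named in the text above


class Target(NamedTuple):
    request: str
    host: Optional[str]  # None means "let the program choose a server"
    port: int


def parse_target(arg: str) -> Target:
    """Split 'request[@host[:port]]' into its three parts."""
    # Assumption: split on the LAST "@" so the request passes through as-is.
    request, at, rest = arg.rpartition("@")
    if not at:
        # No "@": the whole argument is the request, with no explicit host.
        return Target(arg, None, WHOIS_PORT)
    host, colon, port = rest.partition(":")
    return Target(request, host, int(port) if colon else WHOIS_PORT)
```

    With this sketch, parse_target("!ww104@whois.networksolutions.com")
    yields the request "!ww104", the host "whois.networksolutions.com",
    and port 43.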

    Multiple requests on a single command line are supported.

  Self-detecting CGI Support
    BW Whois detects CGI operation by looking for the standard "SCRIPT_NAME"
    environment variable. This behavior can be overridden by using the
    --nocgi switch.
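
    The detection rule above is simple enough to sketch in a few lines of
    Python. This is not the program's own code; the function name cgi_mode
    is hypothetical, and whether BW Whois tests for the presence of
    "SCRIPT_NAME" or for a non-empty value is an assumption here.

```python
import os
from typing import Mapping


def cgi_mode(environ: Mapping[str, str] = os.environ, nocgi: bool = False) -> bool:
    """Return True when the process appears to be running as a CGI script."""
    # --nocgi always wins; otherwise the presence of SCRIPT_NAME decides.
    return not nocgi and "SCRIPT_NAME" in environ
```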

    In CGI mode the program attempts to make intelligent links out of IP
    addresses, domain names, and handles. It doesn't always get it right,
    but it tries real hard!

    You can also specify an optional whois.html file to create your own
    look. The HTML file will need a few simple "placeholders" in it. The
    placeholders are replaced at runtime with the various values which make
    this work. These placeholders are represented by text enclosed in '$'
    signs like this: "$PLACEHOLDER$"

    Separate HTML files may be specified for an initial page and a "not
    found" page, if desired.

    The placeholders are described here:

    $SELF$
        The URI path of the program on your web server, taken from the value
        of the "SCRIPT_NAME" environment variable.

    $DOMAIN$
        The domain that was last looked up, if any.

    $RESULT$
        The result of the whois query from BW Whois.

    You can get an example file from the program with:

        whois --makehtml > whois.html
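
    To show how the placeholders behave, here is a minimal Python sketch of
    the substitution step. It is an illustration under assumptions, not the
    program's implementation: the function name fill_template is
    hypothetical, values are inserted as given (no escaping), and unknown
    $NAME$ tokens are left untouched.

```python
import re

# Only the three placeholders documented above are recognized.
PLACEHOLDER = re.compile(r"\$(SELF|DOMAIN|RESULT)\$")


def fill_template(template, values):
    """Replace $SELF$, $DOMAIN$ and $RESULT$ with the supplied values.

    A placeholder with no supplied value is replaced by an empty string.
    """
    return PLACEHOLDER.sub(lambda m: values.get(m.group(1), ""), template)
```

    A replacement function is used rather than a replacement string so that
    backslashes in a whois result are not interpreted as regex escapes.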

  Optional Apache SSI Support
    If you need to include other files into your HTML file dynamically,
    experimental support for Apache-style SSI (server-side includes) is
    provided with the bwInclude.pm module. This currently works only for
    "include virtual" and "echo var" directives.

    Simply place the bwInclude.pm file with your other Perl module files, or
    specify the directory that contains the module in the "use lib" line in
    the source code.

  Optional TLD Table Support
    Because of the unfortunate design of the Shared Registration System, only
    the .COM, .NET, and .ORG Top-Level Domains (TLDs) are referred by the
    "root" domain servers at whois.crsnic.net and whois.internic.net. If you
    want results for other TLDs you must know where to find them, and there
    is no central repository for current whois server referrals.

    The optional tld.conf file includes whois servers for all known TLDs,
    and some second-level domains that are registered separately (e.g.
    .net.au, .uk.com, etc.).

    The format of the tld.conf file is as follows:

       Lines that begin with "#" are ignored.

       Token lines are like:

            token  token  optional comments

       The first token is the TLD; the leading dot (".") is required.

       The second token is the fully-qualified domain name for the whois
        server that responds to requests for the given TLD.

       The two tokens can be separated by spaces and/or tabs.

       Anything on the line after the second token is ignored.

       A leading "#" for in-line comments is not required, but may be in
        the future.

       The file is searched sequentially, so it's important to have
        2nd-level domains earlier in the file than corresponding top-level
        domains. (e.g. .net.au before .au).
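
    The rules above, including the reason that ordering matters, can be
    sketched in Python as follows. This is an illustration only; the
    function names parse_tld_conf and find_server are hypothetical, and
    matching by case-insensitive suffix is an assumption about how the
    program compares names.

```python
from typing import List, Optional, Tuple


def parse_tld_conf(text: str) -> List[Tuple[str, str]]:
    """Return (tld, server) pairs in file order."""
    table = []
    for line in text.splitlines():
        if line.startswith("#"):
            continue  # comment line
        tokens = line.split()  # tokens are separated by spaces and/or tabs
        if len(tokens) >= 2 and tokens[0].startswith("."):
            # Anything after the second token is an optional comment.
            table.append((tokens[0].lower(), tokens[1]))
    return table


def find_server(domain: str, table: List[Tuple[str, str]]) -> Optional[str]:
    """Sequential search: the first entry whose suffix matches wins."""
    name = "." + domain.lower().strip(".")
    for tld, server in table:
        if name.endswith(tld):
            return server
    return None  # caller falls back to the "root" whois server
```

    Because the search stops at the first match, a table that listed .au
    before .net.au would never reach the .net.au entry.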

  Optional Support for Stripping Disclaimers
    Most whois servers deliver a disclaimer along with their whois results.
    The disclaimer generally says something like "By submitting the request
    that you already submitted before you saw this agreement you have agreed
    to this binding contract. Haha!"

    Many people who are not otherwise lawyers are annoyed by this. The
    stripdisclaimer option will remove the disclaimers before you see them.

    This feature requires the sd.conf file.

    The format of the sd.conf file is:

        server "first line" "last line"

       server is the DNS name of the whois server

       "first line" and "last line" are regular expressions that match
        the first and last line (respectively) of the disclaimer to be
        stripped. The quotes are required.
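
    As a rough sketch of how such a file might be applied, the Python below
    parses sd.conf lines and removes the matching block from a result. It
    is not the program's code: the names parse_sd_conf and strip_disclaimer
    are hypothetical, the patterns are treated as Python regular
    expressions (the program itself uses Perl's), and a pattern is assumed
    not to contain a double quote.

```python
import re

# server "first line" "last line"
SD_LINE = re.compile(r'^(\S+)\s+"([^"]*)"\s+"([^"]*)"\s*$')


def parse_sd_conf(text):
    """Map each server name to its compiled (first, last) patterns."""
    rules = {}
    for line in text.splitlines():
        m = SD_LINE.match(line.strip())
        if m:
            server, first, last = m.groups()
            rules[server.lower()] = (re.compile(first), re.compile(last))
    return rules


def strip_disclaimer(lines, first, last):
    """Drop each block that starts at a line matching `first` and ends at
    the next line matching `last`; keep every other line."""
    out, skipping = [], False
    for line in lines:
        if not skipping and first.search(line):
            skipping = True
        if not skipping:
            out.append(line)
        elif last.search(line):
            skipping = False  # the closing line itself is also dropped
    return out
```

    Note that in this sketch a "last line" pattern that never matches
    causes everything after the first matching line to be dropped, so the
    patterns need to be chosen with care.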

  Netblock Referrals
    This program attempts to find netblock requests. If a request is
    entirely numeric (e.g. 123.234), the program first checks with
    whois.arin.net (ARIN). If an ARIN record contains a referral to another
    whois system (e.g. RIPE or APNIC), the program will attempt to detect
    that and snatch the record from the referenced whois system. Note: ARIN's
    records are very inconsistent in their formatting, so this may not
    always do something intelligent.

  Packed IP addresses
    If the request is a string of numbers without any other characters, the
    program will treat it as a 32-bit (packed) IP address. It will first
    unpack it into dotted-quad notation and then submit it to the ARIN whois
    server.

    Packed IP addresses are often used by spammers in an attempt to confuse
    those who might try to report their abuse. This feature makes it easy
    for you to decipher those addresses and find the owner of the netblock
    all in one step.

    IP addresses are actually 32-bit integers (until we get IPv6 -- but
    that's another story). The common notation represents the address as
    four separate 8-bit integers, like this: 192.149.252.21 (actually one of
    ARIN's servers). That's called "dotted-quad" notation. If you were to
    represent that address as one big 32-bit integer it would look like
    this: 3231054869. I call that a "packed" IP address.

    Sometimes a spammer will use a packed IP address in a URL like this:

        http://3231054869/index.html

    That address will work in a web browser, but it's hard to look up. This
    program will accept a packed IP address like this:

        whois 3231054869

    The program will unpack it into dotted-quad notation, and submit it to
    the ARIN whois server just like a normal IP address.
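
    The arithmetic behind the unpacking step is shown in the Python sketch
    below. The function names unpack_ip and pack_ip are hypothetical and
    this is not the program's own code; it simply demonstrates the
    conversion described above.

```python
import ipaddress


def unpack_ip(request: str) -> str:
    """Convert a packed (32-bit integer) IPv4 address to dotted-quad."""
    value = int(request)
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("not a 32-bit value: " + request)
    # Shift each 8-bit field out of the integer, most significant first.
    octets = [(value >> shift) & 0xFF for shift in (24, 16, 8, 0)]
    return ".".join(str(octet) for octet in octets)


def pack_ip(dotted: str) -> int:
    """The inverse operation, using the standard library."""
    return int(ipaddress.IPv4Address(dotted))
```

    Here unpack_ip("3231054869") returns "192.149.252.21", since
    192 * 16777216 + 149 * 65536 + 252 * 256 + 21 = 3231054869.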

COMMAND LINE SWITCHES
    --help
        Print a usage message.

    --version
        Print the version information and exit.

    --config=path
        Full path to the configuration file. Default: /etc/whois/whois.conf

    --refresh, -r
        Refresh the cache for this query. Forces the request to go to the
        whois server even if the result is cached. (Only valid if caching is
        configured.)

    --tld=path
        Full path/file name for tld.conf file. Default: /etc/whois/tld.conf

    --host=host, -h host
        Specify a specific host.

    --port=port, -p port
        Specify an alternate port.

    --timeout=seconds
        Set the timeout to a number of seconds. The default is 60 seconds if
        this is not specified.

    --quiet, -q
        Be wery, wery quiet. I'm hunting wabbits. (--quiet overrides
        --verbose)

    --verbose, -v
        Show details of every step. (--quiet overrides --verbose)

    --stripdisclaimer, -s
        Sets the stripdisclaimer mode. The program makes an attempt to strip
        off those inane disclaimers that so many registries are starting to
        include with their whois records. This feature requires the sd.conf
        file.

    --makehtml
        Writes a sample HTML file (for CGI mode use) to standard out.

    --nocgi
        Prevent CGI mode. This is useful if you have a script that used a
        legacy character-mode whois program.

    --html
        Create HTML links of handles, IP addresses, and domains without
        using HTML in the rest of the output. Useful with --nocgi for using
        an external wrapper CGI program.

    --jpokay
        Allow Japanese output from nic.ad.jp.

CONFIGURATION FILE
    A sample whois.conf file is included with the BW Whois distribution. It
    is not necessary to use the whois.conf file to use the program.

    If you want to use advanced features, such as caching or optional CGI
    security features, you will need to install the whois.conf file and
    configure it to reflect your preferences.

    The standard location for whois.conf is in the /etc/whois directory. If
    you do not have access to that directory, or are running on a non-UNIX
    operating system that does not use the /etc directory, you may specify
    another location by setting the "WHOIS_CONF" environment variable or by
    editing the source code.

    If you need to edit the source code, be sure you are using a plain
    text editor (not a word processor!) and that you save the file with
    appropriate line-endings for your system. If you do not understand those
    distinctions I highly recommend that you find a friend or hire a
    consultant who knows about such things. (The author is occasionally
    available for such small consulting tasks -- feel free to contact him if
    you need help.)

  Format of the Config File
    The config file format is very simple.

    Lines that begin with "#" are considered comments and are ignored.

    Anything after a "#" to the end of a line is considered a comment and
    ignored.

    The format of each non-comment line is:

     option value

    For logical values, "1" or "true" (without the quotes) are considered
    true. Anything else is considered false.

    For options that take a list of values, the list is separated by colons
    (":") without spaces. Spaces are not currently supported in any value.
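
    For illustration, the format rules above could be read by a parser
    along the lines of this Python sketch. The names parse_whois_conf,
    as_bool, and as_list are hypothetical, and treating "true" as
    case-sensitive is an assumption; the program's own parser may differ.

```python
def parse_whois_conf(text):
    """Return a dict mapping each option name to its raw string value."""
    options = {}
    for line in text.splitlines():
        # Anything from "#" to the end of the line is a comment.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        parts = line.split(None, 1)  # option, then the rest as the value
        options[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
    return options


def as_bool(value):
    """Only "1" and "true" count as true; anything else is false."""
    return value in ("1", "true")


def as_list(value):
    """Split a colon-separated list value into its items."""
    return value.split(":") if value else []
```

    One consequence of the comment rule is that a value cannot contain a
    "#" character.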

    See the SECURITY section of this man page for more information about
    security features.

    The following options are supported:

    stripdisclaimer true|false
        Strip off the disclaimer/header from the results returned by many
        registrars. This feature requires the sd.conf file.

    tld_conf filepath
        Alternate location for the tld.conf file. Default:
        /etc/whois/tld.conf

    sd_conf filepath
        Alternate location for the sd.conf file. Default: /etc/whois/sd.conf

    timeout number
        The number of seconds to timeout if a result is not returned by a
        whois server. Default: 60 seconds.

    default_host hostname
        A hostname to use as a default whois server if the TLD is not found
        in the tld.conf file. Default: whois.crsnic.net

    htmlfile filepath
        An HTML file to use for queries and results. Default: internal

    htmlfirst filepath
        An HTML file to use for the initial page. This is the page displayed
        when no query is submitted. Default: htmlfile or internal

    htmlnotfound filepath
        An HTML file to use for results that are not found. This is the page
        displayed when a query returns a negative response. It may be used
        to display a page indicating that a domain may be available for
        registration. Default: htmlfile or internal

    htmlfound filepath
        An HTML file to use for results that are found. This is the page
        displayed when a query returns a positive response. It may be used
        to display a page indicating that a domain is not available for
        registration. Default: htmlfile or internal

    error_403 filepath
        An HTML file to use for error 403 (Forbidden) results. Default:
        internal

    error_408 filepath
        An HTML file to use for error 408 (Expired Session) results.
        Default: internal

    logfile filepath
        This option enables logging and provides a path and filename for the
        log. Log entries look like this:

          2002-12-11 20:06:00 [12745] (192.168.0.30) whois.cgi: cgi domain: bw.org (1)

        Items are, from left to right:

           Date and time (UTC) of the log entry.

           The process ID, enclosed in square brackets.

           The IP address of the CGI client, enclosed in parentheses. This
            item only appears in CGI mode.

           The process name, or the log_name (see below), followed by a
            colon.

           The text of the log entry (in this case, "cgi domain: bw.org").

           A log-level for this item. The log-level only appears if
            log_level (see below) is provided in the config file.
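
        If you need to process the log with a script, the fields listed
        above can be picked apart with a regular expression. The Python
        sketch below is an assumption based only on the sample entry shown
        here; the name parse_log_line is hypothetical and real log entries
        may contain variations this pattern does not cover.

```python
import re

LOG_LINE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "  # date and time (UTC)
    r"\[(?P<pid>\d+)\] "                                     # process ID
    r"(?:\((?P<client>[^)]+)\) )?"                           # client IP, CGI mode only
    r"(?P<name>[^:]+): "                                     # process name or log_name
    r"(?P<text>.*?)"                                         # text of the entry
    r"(?: \((?P<level>\d)\))?$"                              # optional log-level
)


def parse_log_line(line):
    """Return a dict of fields, or None if the line does not match."""
    m = LOG_LINE.match(line.rstrip("\n"))
    if not m:
        return None
    entry = m.groupdict()
    entry["pid"] = int(entry["pid"])
    if entry["level"] is not None:
        entry["level"] = int(entry["level"])
    return entry
```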

        Make sure the user-ID that owns the whois process has permission to
        write the log file. This option is usually used when running in CGI
        mode. In that case, you need to ensure that the user-ID of the web
        server has permission to write to the log file.

    log_level level
        level can be a number from 1-9.

        This item specifies what level of logging you want. Without this
        item, events with log-levels higher than 1 will not be logged. For
        most purposes, that will be fine. The higher the number, the more
        events get logged.

    log_name name
        This option provides a specific name for log entries. This will be
        used instead of the process-name in log entries.

    database token
        This option enables database operations. The token can be mysql or
        pgsql, corresponding to the database system you are using.

    connect connect string
        This option is required if database is used. It specifies the
        connection parameters used to access the database. The format is:

         database:host:port:user:pass

        For example, if your database were named "whois" on the local
        machine, on the standard port (3306) and the user was "web" and the
        password was "foo.bar" you could use:

         connect whois:localhost:3306:web:foo.bar
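
        As a sketch of how the five fields break down, the Python below
        splits a connect value. The names ConnectInfo and parse_connect
        are hypothetical. Limiting the split to four colons, so that a
        password could itself contain a colon, is an assumption of this
        sketch and not documented behavior.

```python
from typing import NamedTuple


class ConnectInfo(NamedTuple):
    database: str
    host: str
    port: int
    user: str
    password: str


def parse_connect(value: str) -> ConnectInfo:
    """Split 'database:host:port:user:pass' into named fields."""
    parts = value.split(":", 4)  # at most five fields
    if len(parts) != 5:
        raise ValueError("expected database:host:port:user:pass")
    database, host, port, user, password = parts
    return ConnectInfo(database, host, int(port), user, password)
```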

    cache_table table_name
        The name of the database table to use for the results cache. This
        also serves to enable results caching.

    cache_expire seconds
        The number of seconds to hold a result before it is considered
        stale. Stale results will be refreshed when requested again.
        Default: 432000 seconds (five days).
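
        The caching behavior described in this document (results keyed by
        request/server combination, expired after cache_expire seconds, and
        bypassed by --refresh) can be modelled with the Python sketch
        below. It uses an in-memory dict as a stand-in for the SQL table;
        the names ResultCache and lookup are hypothetical and the real
        table layout is not shown here.

```python
import time

CACHE_EXPIRE = 432000  # default from the text: five days, in seconds


class ResultCache:
    """In-memory stand-in for the SQL cache table."""

    def __init__(self, expire=CACHE_EXPIRE, clock=time.time):
        self.expire = expire
        self.clock = clock
        self.rows = {}  # (request, server) -> (stored_at, result)

    def get(self, request, server):
        row = self.rows.get((request.lower(), server.lower()))
        if row is None:
            return None
        stored_at, result = row
        if self.clock() - stored_at > self.expire:
            return None  # stale: the caller should query the server again
        return result

    def put(self, request, server, result):
        self.rows[(request.lower(), server.lower())] = (self.clock(), result)


def lookup(cache, request, server, fetch, refresh=False):
    """Return a fresh cached result, or call fetch() and store its result."""
    result = None if refresh else cache.get(request, server)
    if result is None:
        result = fetch(request, server)
        cache.put(request, server, result)
    return result
```

        The clock parameter exists only so that expiry can be exercised in
        a test without waiting.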

    control_table table_name
        The table name to use for security control records. This is required
        to enable security control features.

    cookie_name cookie_name
        The name to use for control cookies. This also serves to enable the
        cookie control feature.

    cookie_expire seconds
        How many seconds a cookie is valid for. Default: 3600 seconds (one
        hour).

    ip_control number
        The number of hits allowed from one IP address within the ip_expire
        time. This also serves to enable the IP control feature.

    ip_expire seconds
        The number of seconds required between hits from one IP address
        before that address is expired from the control table.

    allow_referer list:of:domains
        A list of valid hostnames to allow in the "Referer:" header. Use a
        value of  to turn off referer checking entirely. Default: The
        hostname in the HTTP "Host:" header.

    direct_link number
        Allow links to a whois record without a cookie or a referer. This is
        useful for providing a link in an email message. The number is how
        many seconds apart to allow linked hits from the same IP address.
        This requires control_table and ip_control.

    outgoing_ip list:of:ip:addresses
        A list of IP addresses that can be used for the outgoing connection.
        BW Whois will select an address from this list at random and bind to
        that for your outgoing connection. This will help with some whois
        servers that block based on number of connections from a given IP
        address. These are IP addresses ON YOUR SYSTEM. You must have these
        IP addresses configured in order for them to work.

ENVIRONMENT
    The environment variable "WHOIS_CONF" may be used to specify an
    alternate path to the whois.conf file.

    The environment variable "BW_WHOIS" is no longer supported.

SECURITY
    This version of BW Whois contains features to help secure a
    web-accessible installation from abuse.

    Over the past few months many users of BW Whois have sustained attacks
    from automated web clients (bad robots) that would rapidly request whois
    results, presumably for illicit purposes. My own server was attacked and
    queries from my server became disallowed by Verisign (ne Network
    Solutions).

    When I first detected these attacks on my own site, I quickly
    implemented a simple control that kept a flat-file list of IP addresses
    and refused connections from an IP address after it was represented more
    than a given number of times in that file.

    A few weeks later the attack started up again from a number of IP
    addresses too large to control in this manner. I was amazed, to say the
    least. My server was blocked again by NSI. This was a coordinated attack
    from a large number of hosts on a large number of disparate networks.

    This time I buckled down and devised a set of controls that would
    require a lot more sophistication to subvert. So far these controls have
    been very successful on my server.

  Three Types of Controls
    There are three distinct types of controls. They can be used separately,
    but personally, I use all three and I recommend you do the same.

  Referer Controls
    The referer controls are enabled by default and do not require that a
    database be installed.

    If a request is received that does not provide an HTTP "Referer:"
    header, or provides a referer that does not match the hostname in the
    "Host:" header, the request is denied and a 403 (Forbidden) result code
    is returned.
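
    The check described above might look something like this Python
    sketch. It is an illustration, not the program's code: the function
    name referer_allowed is hypothetical, headers are assumed to arrive as
    a dict with these exact key spellings, and ignoring any port in the
    "Host:" header is an assumption.

```python
from urllib.parse import urlsplit


def referer_allowed(headers, allow_referer=None):
    """Return True if the request passes; False means respond with 403."""
    referer = headers.get("Referer", "")
    referer_host = (urlsplit(referer).hostname or "").lower()
    if not referer_host:
        return False  # missing or unparsable Referer header
    if allow_referer:
        # Explicit list, as given by the allow_referer option.
        allowed = {host.lower() for host in allow_referer}
    else:
        # Default: only the host named in the Host: header (port ignored).
        allowed = {headers.get("Host", "").split(":")[0].lower()}
    return referer_host in allowed
```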

    So far the robots do not provide an HTTP "Referer:" header, but I expect
    they will soon if people rely on this control without the others. It
    would be a trivial addition to their code.

  IP Controls
    The IP control requires an SQL database. Currently only MySQL is
    supported (by far the most popular database on the net). Support for
    others will come later.

    Whenever a request comes in from a web client, the database is queried
    to see if that IP address has visited recently. If not found, the
    request is allowed and a record is created.

    If the IP address is found in the database, a counter is updated to
    reflect how many hits have arrived from that address. If the count is
    above the limit, the request is denied and a 403 (Forbidden) result code
    is returned. If more than "ip_expire" seconds have passed since the last
    hit from that IP address, the count is reset and the request is allowed.
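
    The counting logic above can be modelled in a few lines of Python. The
    sketch below keeps its records in a dict rather than an SQL table, and
    the class name IPControl is hypothetical. Updating the last-hit time
    even for denied requests is an assumption, chosen so that a client
    which keeps hammering the server stays blocked.

```python
import time


class IPControl:
    """In-memory stand-in for the control table."""

    def __init__(self, limit, expire, clock=time.time):
        self.limit = limit    # corresponds to ip_control
        self.expire = expire  # corresponds to ip_expire
        self.clock = clock
        self.rows = {}        # ip -> [count, last_hit]

    def allow(self, ip):
        """Return True to serve the request, False to answer 403."""
        now = self.clock()
        row = self.rows.get(ip)
        if row is None or now - row[1] > self.expire:
            self.rows[ip] = [1, now]  # new or expired: start counting again
            return True
        row[0] += 1
        row[1] = now
        return row[0] <= self.limit
```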

    This control will be difficult to subvert. The problem is that the count
    must be high enough to permit hits from clients behind proxy servers,
    such as AOL and Earthlink users.

  Cookie Controls
    The cookie controls also require an SQL database. Currently only MySQL
    is supported (by far the most popular database on the net). Support for
    others will come later.

    When a first request comes in from a web client (e.g., a request for a
    web form, but not for data), a unique cookie is generated with a 128-bit
    pseudo-random hash, and given to the browser. The cookie is then stored
    in the database with a timestamp showing when it was generated.

    When a web client makes a request that requires a data response, a
    registered cookie is required. If no cookie is provided a 403
    (Forbidden) result code is returned. If an expired cookie is provided a
    408 (Expired Session) result code is returned.

    A new cookie is generated on each connection from each client.

    In order to subvert this control, a robot would have to process and
    store actual cookies. So far, they don't do that.
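
    A minimal Python model of this scheme is sketched below. It is not the
    program's implementation: the class name CookieControl is hypothetical,
    the cookie value comes from Python's secrets module rather than the
    program's own hash, records live in a dict instead of the SQL table,
    and treating an unknown cookie the same as a missing one (403) is an
    assumption.

```python
import secrets
import time

COOKIE_EXPIRE = 3600  # default from the text: one hour, in seconds


class CookieControl:
    """In-memory stand-in for the cookie records in the control table."""

    def __init__(self, expire=COOKIE_EXPIRE, clock=time.time):
        self.expire = expire
        self.clock = clock
        self.issued = {}  # cookie value -> time issued

    def issue(self):
        """Create and register a new 128-bit cookie value."""
        value = secrets.token_hex(16)  # 16 random bytes = 128 bits
        self.issued[value] = self.clock()
        return value

    def check(self, value):
        """Return an HTTP status: 200, 403 (no or unknown cookie), or
        408 (expired cookie)."""
        issued_at = self.issued.get(value)
        if issued_at is None:
            return 403
        if self.clock() - issued_at > self.expire:
            return 408
        return 200
```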

  Direct Links
    Some users have requested a way to provide links to individual whois
    records to their clients in email messages. A facility is provided to
    allow this practice without significant compromise to the system.

    When the direct_link option is set in the whois.conf file, links are
    allowed with neither a cookie nor a referer, but not if that IP address
    has been used within the number of seconds provided in the option line.

    This has the same problem as the IP controls with proxy clients, but it
    should work under most circumstances.

CAVEATS
    Not all whois servers comply with RFC 954. Unfortunately that lack of
    compliance is so inconsistent that the same commands can produce wildly
    different results from server to server.

    This client deals with the situation by sending fully-qualified requests
    only to NSI's servers, and the simplest form of request to other
    servers. This tactic is not entirely reliable.

SEE ALSO
    RFC 954: NICNAME/WHOIS
        http://www.ietf.org/rfc/rfc0954.txt

FILES
    /etc/whois/tld.conf
        An optional table of TLDs and associated whois servers.

    /etc/whois/whois.conf
        A configuration file for optional flags and other configurable
        values.

    /etc/whois/sd.conf
        A configuration file for optional stripdisclaimer feature.

NOTA BENE
    The format of the tld.conf file changed in version 2.7. Please be sure
    your file has leading dots (e.g. .au) if you are using a current version
    of BW Whois.

    The tld.conf file for versions 3.0 and above includes servers for the
    .COM, .NET, and .ORG domains. Older versions of the program did not
    support tld.conf file lookups for these domains.

    The default location for all the configuration files was changed to
    /etc/whois/ in version 3.1.

    The stripheader feature was changed to stripdisclaimer in version 3.1.
    This feature now requires the sd.conf configuration file.

HISTORY
    The whois command first appeared in 4.3BSD. The BW Whois command first
    appeared 2 December 1999.

    See the HISTORY file for more detail about the history of BW Whois.

AUTHOR
    Bill Weinman <http://bw.org/>

    You can find the latest version of BW Whois at <http://whois.bw.org/>.

    You can send email to Bill Weinman using the web form at
    <http://bw.org/contact/>.

COPYRIGHT
    Copyright 1999-2006 William E. Weinman

    This program is free software. You may modify and distribute it under
    the same terms as Perl itself.

