[HPADM] SUMMARY Slow netstat command

From: Christopher H Vann (vannc@dteenergy.com)
Date: Wed Mar 26 2003 - 11:52:26 EST


Hi,

Here's what I believe is my issue.
When doing a netstat -a or -r it does a lookup on all the IPs.
My isolated DNS server can answer most of them.
If it can't, it tries to talk to the forwarders, which are not
accessible.
So, it times out on each IP it can't find locally.
For some reason my HP box is trying to resolve 0.0.0.0 over and over.
That was the only IP not in my zones.
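
A rough way to see which address is costing the time is to do the reverse
lookups one at a time and clock each. The sketch below is only an
illustration, not what netstat does internally. The function name
time_reverse_lookups and its resolver parameter are made up for this
example; the default resolver is the standard library's
socket.gethostbyaddr, which uses the system's configured lookup order.

```python
import socket
import time


def time_reverse_lookups(addresses, resolver=socket.gethostbyaddr):
    """Reverse-resolve each address and record how long each lookup took.

    Returns a list of (address, name_or_None, seconds) tuples.
    `resolver` is injectable so the function can be tried without a network.
    """
    results = []
    for addr in addresses:
        start = time.monotonic()
        try:
            # gethostbyaddr returns (hostname, aliases, addresses)
            name = resolver(addr)[0]
        except OSError:
            # socket.herror and socket.gaierror are both OSError subclasses
            name = None
        elapsed = time.monotonic() - start
        results.append((addr, name, elapsed))
    return results
```

Calling time_reverse_lookups(["127.0.0.1", "0.0.0.0"]) on an affected box
should show whether one address accounts for most of the wait.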

I don't know why my boxes would be looking up 0.0.0.0.
So, I looked at the whois and that address is reserved.
I looked up RFC 3330 and here's what it says.

   0.0.0.0/8 - Addresses in this block refer to source hosts on "this"
   network. Address 0.0.0.0/32 may be used as a source address for this
   host on this network; other addresses within 0.0.0.0/8 may be used to
   refer to specified hosts on this network [RFC1700, page 4].
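
As a small illustration of the RFC text quoted above, the sketch below
sorts an address into the three cases it describes. It uses Python's
standard ipaddress module; the function name describe and the returned
strings are made up for this example.

```python
import ipaddress

# The "this network" block from the RFC 3330 text quoted above
THIS_NETWORK = ipaddress.ip_network("0.0.0.0/8")


def describe(addr_text):
    """Classify an IPv4 address relative to the 0.0.0.0/8 block."""
    addr = ipaddress.ip_address(addr_text)
    if addr == ipaddress.ip_address("0.0.0.0"):
        return "this host on this network"
    if addr in THIS_NETWORK:
        return "specified host on this network"
    return "outside 0.0.0.0/8"
```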

I don't have a window to test again for a while (and the first one
closed on me before I could do more testing).
I figure I'll take out the forwarders entry and hope it fails the
lookups rather than timing out on them.
I don't know why my old UX 10.20 box running BIND 4.9.7 works when my
UX 11 and Solaris boxes, which use BIND 9.2, don't.

If my test fails and I get to the bottom of it, I'll repost.

See responses below original posts:

Christopher H Vann wrote:

> Hi,
>
> I've gotten several responses.
> I'll post later.
>
> I did some more digging. The output of netstat -a only shows local
> boxes.
> So the resolution should never have tried to go to its forwarders.
> (which are unreachable)
> But I really think the unreachable forwarders are part of my problem.
>
> I ran rndc querylog on the dns slave (which was the master when we
> were isolated).
> I did a netstat -a on one of our HP boxes.
> I reran rndc querylog and looked thru syslog.
>
> The box is querying 0.0.0.0
> Any idea why?
>
> Chris
>
> Christopher H Vann wrote:
>
>> Hi,
>>
>> A problem raised its head during a Disaster Recovery (DR) test.
>>
>> We have a remote site with HPUX, Solaris, NT and W2K boxes.
>> We isolated it to simulate the loss of our main center.
>> During this test the netstat -a and netstat -r commands took forever
>> to complete.
>> The netstat -an and netstat -rn commands worked fine. (no name
>> resolution)
>>
>> So, I'm thinking DNS.
>> However, all nslookups return quickly. I resolved a ton of names and
>> they came back very quickly.
>> I resolved some IPs back to names. That comes back quickly.
>> I ran nslookup in TCP and UDP modes. They both work fine.
>>
>> Here's our layout.
>> When going into DR mode we take the DNS slave at the site and flip
>> it to become a master.
>> Our clients use that as their resolver.
>> We resolve our domains and use forwarders to resolve the Internet
>> names. (W2K forwards to me)
>> I could not find any names in the netstat output that were not in
>> our domains. (so no need to forward)
>> We are running BIND 9.2
>>
>> We also have a leftover box at the remote site running BIND 4.9
>> that is a slave to our old master.
>> We could do a netstat on that box just fine. It also cannot contact
>> its forwarders.
>>
>> I removed dns from the nsswitch.conf file on a client and it runs
>> netstat just fine.
>> So, it looks like DNS, but nslookup works fine.
>> What am I missing?
>>
>> I dumped stats on the DNS box and it's not overworked.
>> I ran top on it too. That looks fine.
>>
>> We flipped the new master back into a slave.
>> netstat was still slow.
>> We reconnected the network and now netstat runs fine on all boxes.
>>
>> Chris Vann
>
Responses: (with my comments)
=================================================================

0.0.0.0 is the address for the entire (inter-) network.

The network address of your local network is the bitwise AND of any IP
address in your network and your netmask. For example, if your IP
address is 10.20.30.40 and the netmask is 255.255.0.0, then the network
address is 10.20.0.0.

So if the netmask is set to 0.0.0.0, the network address becomes
0.0.0.0.
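
The AND described above can be checked with a few lines of Python. This
is just a sketch of the arithmetic using the standard ipaddress module;
the function name network_address is made up for this example.

```python
import ipaddress


def network_address(ip_text, netmask_text):
    """Return the network address: the bitwise AND of the IP and the netmask."""
    ip = int(ipaddress.IPv4Address(ip_text))
    mask = int(ipaddress.IPv4Address(netmask_text))
    return str(ipaddress.IPv4Address(ip & mask))
```

With the numbers from the example, network_address("10.20.30.40",
"255.255.0.0") gives "10.20.0.0", and a netmask of "0.0.0.0" gives
"0.0.0.0" for any IP address.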

CBee

=================================================================

How about an incorrect setting for the default gateway? (not my issue)

Or, if you are convinced it is DNS, maybe you don't have any valid root
hints? (I don't have any root hints; it is a forwarder only)
The DNS book recommends that if you are going to be "off" the internet
for several days, you should set your master up as a "false root", so
that it can (fail to) resolve . .com .uk etc. itself instead of waiting
for the internet to time out. The same applies to reverse lookups
(in-addr.arpa).

(This is where I got the idea to delete my forwarders and hope the
lookups fail rather than time out)

--
Mike
=================================================================
How many dns-servers have you configured? (In our DR site - 1 server
running)
What is the order? (resolv.conf: I have 3 nameserver entries with the
local server being the 1st. Making it the only one did not help)
I had a problem where there was just one dns server configured, the old
one. New dns servers should already have been configured but somehow the
machine was skipped. When the old dns went down and the host was
recycled for new usage, the `netstat -a` was also dead slow. After
adding new dns servers (in second and third place) I got results like
yours: some dead-slow apps, some functioning properly (well, no heavy
testing done here).
After removing the old dns from the first position, everything worked
better.
I somehow lost track of your story. Here is what I'd do for the remote
site:
If you expect frequent lost connections between the 2 sites, don't
bother with a single dns server at one site and dns-caching servers at
the satellite sites. Better to use a full dns at both sites. They can do
some peer-level synchronisation to avoid external traffic and for local
network configuration, but not master-slave configurations.
Only if the network connection is expected to be always on can a
caching dns server or master-slave dns servers be handy.
(We have a master and several slaves. One of them is at the DR site. It
gets turned into a master upon disaster so the data never expires. We
don't expect the circuit to get broken except for a real disaster :( or
a test (rare). All clients at the DR site are resolvers only. When we go
into DR mode we flip several DNS records to point applications at the
new DR boxes. This must be propagated out quickly. Since not all the
clients are running a version of BIND that allows for notifies, setting
them up as resolvers was a good fix. I don't think caching would have
fixed the resolution of 0.0.0.0)
This is slow because the slave dns server cannot do its dns lookup at
the master's site.
CBee
======================================================================
 nsswitch.conf
  Actually, nslookup is telling you the reason too. If you were to
  use nslookup with a DNS server, you'd see the same problem.  The issue
  is indeed DNS, but the behavior is controlled by nsswitch.conf as well
  as resolv.conf.  For reliability, always specify host files first,
  then DNS as the order for name resolution. Then put ALL important
  systems into /etc/hosts so you don't have to depend on DNS for
  production LAN communication.
(good idea, to have redundancy in hosts files. Just a lot of upkeep.)
  DNS is a CRITICAL service and should have a backup solution for
  production servers. resolv.conf works but there is a 10-20 sec delay
  between dead systems and the change to another DNS server. That's why
  using hosts first is a requirement for reliability.
(1st nameserver is the one that works for all my internal names & IPs)
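
To make the "hosts files first, then DNS" order concrete, here is a
minimal sketch of that lookup order in Python. It is not how the system
resolver is implemented. The names parse_hosts and resolve and the
dns_lookup parameter are made up for this example; the default fallback
is the standard library's socket.gethostbyname.

```python
import socket


def parse_hosts(text):
    """Map each name and alias in hosts-file text to its IP address."""
    table = {}
    for line in text.splitlines():
        # Drop comments and surrounding whitespace
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        ip, names = fields[0], fields[1:]
        for name in names:
            # First entry wins if a name appears more than once
            table.setdefault(name.lower(), ip)
    return table


def resolve(name, hosts_table, dns_lookup=socket.gethostbyname):
    """Try the hosts table first; only fall back to DNS on a miss."""
    ip = hosts_table.get(name.lower())
    if ip is not None:
        return ip, "files"
    return dns_lookup(name), "dns"
```

A name found in the hosts table never reaches dns_lookup, so an
unreachable name server cannot slow it down.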
--
Best regards,
Bill Hassell
=================================================================
Just a hunch - are you aware that nslookup on hp-ux works with
/etc/hosts?  On Sun, that is not the case.   That may be your nslookup
"issue", but it still doesn't fix your DNS issue...
(hosts that were resolving were not in the hosts files, so I know it's
using DNS. nslookup tells you where it's resolving, too)
Alex Vinson
