SUMMARY: Why is FTP to localhost slower on Solaris2.9 than 2.8?

From: Alek O. Komarnitsky (N-CSC) (alek@ast.lmco.com)
Date: Mon May 12 2003 - 13:37:48 EDT


I asked last week why FTPs to localhost on Solaris2.9
machines were noticeably slower than on Solaris2.8 machines with
similar hardware/configuration. This was actually part of a bigger
problem of slow/erratic throughput over Gigabit links during testing.

I actually did not hear back from anyone ... but I figured out
the problem myself so here's my summary back to the list - in a
nutshell, the Solaris2.9 in.ftpd appears to be much slower than
the 2.8 version - possibly because it is now based on wu-ftpd (?)

I also came across what I think is a minor bug in the ce drivers;
will send a separate Email about that.

I'll see if someone here can formally submit both of these to
Sun as a bug report, but I can't promise that will happen,
so if someone else on the list wants to submit to Sun, that
would be great since these should be fixed.

Thanx,
alek
http://www.komar.org/

[Report below is sanitized ... I'll use beer names for the 2.9 machines]

Subject: GigE is fine ... it's the *&^%$#@! Solaris2.9 in.ftpd ...

OK ... after *ALL* that work, I'm fairly certain the problem
is with the Solaris2.9 in.ftpd and the data clearly supports this.

Recall that (for the same hardware), 2.9->2.9 GigE ftp *PUTS* were very
erratic in performance (often slower than 100baseT). Also, recall
that ftp's to localhost were noticeably slower on 2.9 than 2.8.
And the summary of what I "learned" from spending time with Dave
on Friday is that the network gear looked clean, but the netstat -k
output seemed to indicate a window size reduction going on.
And rcp data seemed pretty decent all around.

So I search SunSolve (for the bazillionth time) and start looking
for stuff like "in.ftpd", "tcp", and other stuff (since I was thinking
it's not the network driver itself, but something in the network stack).
Basically zippo that seems to relate to this problem.

But I keep thinking about that darn difference between FTP's to localhost;
and finally hit on the idea of trying the Solaris2.8 ftp & in.ftpd on
a Solaris2.9 host - BINGO!!!!!

ftp (the client side program) made no difference ... but when I put
the Solaris2.8 /usr/sbin/in.ftpd in place on the 2.9 hops machine - BAMM;
everything ROCKED!!!

Attached are some results that clearly indicate that 2.9 in.ftpd appears
to be the culprit ... and only for ftp PUTS ... GETS are mostly OK.

Since we rarely use ftp in production use, I don't see a need to patch
our systems (eventually Sun will figure this out and we'll pick up on
the next recommended patch set) so I'd say based on this, we are SOLID
with GigE. I realize we spent a bit of time on this, but we did learn
some other "good/applicable" things such as the speed/duplex autoneg,
pause stuff, and infinite_burst needing to be set to 1 to get on both
interfaces (I'm convinced this is a Solaris bug).

I also like "proving" a "good" benchmark to test/debug other
GigE implementations as I'm sure we'll have more of that in
the future.

alek

First, let's review the machines/setup with GigE connections:
    HOST        Private IP
    SUN-2.8     192.168.1.22
    HP-11.11    192.168.1.20
    barley      192.168.1.23
    hops        192.168.1.24
    malt        192.168.1.25
    h2o         NONE
The PRIVATE network is a Copper GigE SWITCH with some other machines
connected not shown here. Probably "some" traffic on this from other
machines, but probably little impact.

HP-11.11 is a "big" HP running HP-UX11.11
barley, SUN-2.8, hops, & malt are Sun280R's that are fairly "quiet" right now.
SUN-2.8 is running Solaris2.8 and has 2 Gbytes of RAM.
Barley, hops, & malt are running Solaris2.9 and have 4 GBytes of RAM.
All are dual-CPU. As alluded to above, latest GigE patches have been
recently applied. Only tweak to the driver (see /kernel/drv/ce.conf)
is to turn off auto-negotiation/all speeds/duplex and force 1000FULL.

For my tests, I did the following:
   - Create 100 MByte file by cat'ing netscape executable - /tmp/alek-100
   - Fire up FTP from source to target machine - use binary mode
   - Started from /tmp on source, do a PUT/GET to /dev/null - i.e.:
        $ cd /tmp
        $ ftp TARGET (login)
          ftp> binary (ensure binary mode)
             PUT ftp> lcd /tmp
             PUT ftp> cd /dev
             PUT ftp> put alek-100 null
             GET ftp> lcd /dev
             GET ftp> cd /tmp
             GET ftp> get alek-100 null
    - Repeat MULTIPLE TIMES (>5) looking for repeatability and also
      to make sure that the 100 MByte file is truly RAM resident. I'm
      pretty certain that FTP itself does not do any caching (whereas
      NFS does). I threw out the slowest and fastest times as outliers.
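
The "throw out the slowest and fastest, report the range" step above can be
sketched in a few lines of Python. This is only an illustration of the
arithmetic, not the procedure actually used for the tests; the function name
trimmed_range and the sample readings are made up for the example.

```python
def trimmed_range(rates):
    """Drop the single slowest and single fastest run; return (low, high) of the rest."""
    if len(rates) < 3:
        raise ValueError("need at least 3 runs to drop the slowest and fastest")
    kept = sorted(rates)[1:-1]  # discard one value from each end
    return kept[0], kept[-1]


# Hypothetical throughput readings in MBytes/second (NOT from the tests in this post).
runs = [66.1, 70.3, 67.0, 77.2, 12.5, 71.8]
low, high = trimmed_range(runs)
print(f"{low:.0f}-{high:.0f} MBytes/second")
```

The low-high pair is the same shape as the ranges reported in the results table.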

Since /dev/null is a "bit bucket", this should be a reasonable test
of fastest possible performance, since no writes are required on
the target side, and (assuming the file is RAM resident on the source),
the limits are the ability of the network/wire and machine/IP stacks.

Here is a table showing the results I got. The "TARGET" column shows
what the FTP connection went *TO* (or where the file was PUT), with
the source machine being where the FTP originated from. The numbers
show throughput (as reported by FTP) in MBytes/second. Absolute
theoretical max would be on the order of 100 MBytes/second.
Note that this implies almost a *ONE* second transfer time for the
100 MByte file, so for further testing, I could use larger files
if need be - fast machines/networks require BIG BIG files!!!!
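
For anyone who wants to redo the arithmetic, here is a rough sketch. It
assumes decimal units (1 MByte = 1,000,000 bytes) and the nominal 1 Gbit/second
signalling rate of GigE; the helper names are invented for the example and
protocol overhead is ignored, which is why the raw figure comes out above the
"order of 100 MBytes/second" quoted above.

```python
LINE_RATE_BITS_PER_SEC = 1_000_000_000  # nominal Gigabit Ethernet rate (assumption: decimal giga)


def raw_mbytes_per_sec(bits_per_sec):
    """Convert a bit rate to MBytes/second, ignoring all protocol overhead."""
    return bits_per_sec / 8 / 1_000_000


def transfer_seconds(file_mbytes, rate_mbytes_per_sec):
    """Time to move a file of the given size at a steady rate."""
    return file_mbytes / rate_mbytes_per_sec


def file_mbytes_for(seconds, rate_mbytes_per_sec):
    """File size needed for a transfer to last the given number of seconds."""
    return seconds * rate_mbytes_per_sec


print(raw_mbytes_per_sec(LINE_RATE_BITS_PER_SEC))  # raw wire ceiling in MBytes/second
print(transfer_seconds(100, 100))                  # the 100 MByte file at ~100 MBytes/second
print(file_mbytes_for(10, 100))                    # size needed for a ~10 second run
```

A longer run makes the throughput that FTP reports less sensitive to startup
effects, which is the reason for wanting bigger files on fast links.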

Barley: "stock" in.ftpd from build CD's (I think 2002/12?)
Hops: Solaris2.8 in.ftpd
Malt: Solaris2.9 in.ftpd patch 114564-01 (security related)

                                    SOURCE MACHINE
TARGET                SUN-2.8    HP-11.11   Barley     Hops       Malt
PUT-LOOPBACK          133-135*2  331-336    67-77      146-148*2  67-76
GET-LOOPBACK          122-123*2  360-364    147-150*1  132-133    142-147*1

PUT-PRIVATE-SUN-2.8   N/A        67-68      56-57      57-58      57-59
GET-PRIVATE-SUN-2.8   N/A        59-60      57-58      57-58      56-58
PUT-PRIVATE-HP-11.11  62-63      N/A        70-70      70-70      70-70
GET-PRIVATE-HP-11.11  71-72      N/A        52-58      54-57      53-57
PUT-PRIVATE-barley    52-56      51-53      N/A        4-52*3     7-54*3
GET-PRIVATE-barley    59-60      67-68      N/A        63-67*1    65-67*1
PUT-PRIVATE-hops      44-47*4    68-69      51-56*2    N/A        63-66*2
GET-PRIVATE-hops      59-60      68-68      61-66*2    N/A        65-66*2
PUT-PRIVATE-malt      58-61      52-54*4    8-52*3     6-56*3     N/A
GET-PRIVATE-malt      59-60      67-68      65-67*1    65-67*1    N/A

*1: Note that GETS are noticeably quicker than PUTS on 2.9 in.ftpd

*2: Solaris 2.8 in.ftpd is noticeably quicker for PUTS than 2.9!

*3: Example of Solaris2.9 in.ftpd PUTS showing erratic PUT performance.

*4: These data points aren't terrible, but a little odd ...
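
One way to make "erratic" a bit more mechanical is to flag any cell whose
high end is more than some multiple of its low end. The sketch below does
that for the PUT-PRIVATE rows to the three 2.9 machines, using the ranges
from the table above. The 2x threshold and the function names are arbitrary
choices for illustration, not part of the original testing.

```python
def parse_range(cell):
    """Turn a 'low-high' table cell (footnote marker already removed) into two ints."""
    low, high = cell.split("-")
    return int(low), int(high)


def is_erratic(cell, max_ratio=2.0):
    """True when the fastest run is more than max_ratio times the slowest (assumed threshold)."""
    low, high = parse_range(cell)
    return high / low > max_ratio


# PUT results (MBytes/second) copied from the table: target row -> source machine -> range.
puts = {
    "PUT-PRIVATE-barley": {"SUN-2.8": "52-56", "HP-11.11": "51-53", "Hops": "4-52", "Malt": "7-54"},
    "PUT-PRIVATE-hops": {"SUN-2.8": "44-47", "HP-11.11": "68-69", "Barley": "51-56", "Malt": "63-66"},
    "PUT-PRIVATE-malt": {"SUN-2.8": "58-61", "HP-11.11": "52-54", "Barley": "8-52", "Hops": "6-56"},
}

erratic = [
    (target, source)
    for target, row in puts.items()
    for source, cell in row.items()
    if is_erratic(cell)
]
for target, source in erratic:
    print(f"{target} from {source}: {puts[target][source]}")
```

With this threshold the flagged cells are the same four that carry the *3
marker, and none of the PUTS to hops (running the 2.8 in.ftpd) are flagged.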

Note that with the Solaris2.8 in.ftpd, performance is rock-solid
consistent/good/fast on the PRIVATE network.
_______________________________________________
sunmanagers mailing list
sunmanagers@sunmanagers.org
http://www.sunmanagers.org/mailman/listinfo/sunmanagers


