How I collect network
statistics during benchmarks.
by
the hardworking DBA…
as
of 2/17/2013
I
am benchmarking an Oracle stored procedure these days. We have a 1 second SLA on this procedure. It returns a rather large payload, sometimes
exceeding 10,000 records and 5MB of data.
I have challenged the justification for that, and I am reassured that is
what the application requires.
I
was especially curious how many network packets it takes to move that much
data. The ping command told me the average
network round-trip speed between the server host and the client host is
0.14s. That might be OK for a small
payload, but if it takes 250,000 packets per call, at 0.14ms per packet, that
would threaten my SLA.
If
I wanted to check the impact of this stored procedure on the network, how could
I do that?
Linux
has a command, “netstat –s”, that produces the report shown in Listing 1 below. It reports a lot of statistics about
network usage. I could run that before
and after, and compare the results, right?
The
challenge is that the statistics are relative to when the machine last
started. I only care about statistics
during a short period of time. I cannot
easily compare two 10-digit numbers that show total packets received, and tell
if the difference is significant. And
a lot of the statistics reported don’t change at all. I only care about the measures that
changed.
The
solution is to
1) take a “netstat –s” snapshot
before the benchmark.
2) run the benchmark.
3) take a second “netstat –s”
snapshot after the benchmark, and
4) report any differences in a
meaningful, human-readable way.
The
result is a report such as the one shown in Listing 2. While I was developing the script, I followed the netstat-differential report
with some other statistics from sar and a list of the machines connected to my
node. Those things have helped me talk
to few other issues. It doesn’t hurt to
know such things.
The
script that produces Listing 2 is shown in Listing 3.
Check
it out. Maybe that will give you an
idea?
Best
wishes to always have good network throughput!
~
the hard-working DBA
Listing 1 – Typical
output from the netstat –s command.
> netstat -s
Ip:
1346387939 total
packets received
2943839 with invalid
addresses
0 forwarded
0 incoming packets
discarded
1282689297 incoming
packets delivered
3791124361 requests
sent out
19013 outgoing packets
dropped
6 fragments dropped
after timeout
128 reassemblies
required
26 packets reassembled
ok
6 packet reassembles
failed
136 fragments received
ok
19012 fragments failed
422 fragments created
Icmp:
74159 ICMP messages
received
255 input ICMP message
failed.
ICMP input histogram:
destination
unreachable: 24427
timeout in transit: 7
echo requests:
3238
echo replies:
46462
timestamp request:
17
address mask
request: 8
93967 ICMP messages
sent
0 ICMP messages failed
ICMP output histogram:
destination
unreachable: 25376
echo request:
65521
echo replies: 3053
timestamp replies:
17
IcmpMsg:
InType0: 46462
InType3: 24427
InType8: 3238
InType11: 7
InType13: 17
InType17: 8
OutType0: 3053
OutType3: 25376
OutType8: 65521
OutType14: 17
Tcp:
18330381 active
connections openings
10578845 passive
connection openings
31217 failed
connection attempts
804128 connection
resets received
150 connections
established
1104945674 segments
received
3666265534 segments
send out
2514600 segments
retransmited
31 bad segments
received.
242582 resets sent
Udp:
122170701 packets
received
16624 packets to
unknown port received.
0 packet receive
errors
122252674 packets sent
TcpExt:
89760 invalid SYN
cookies received
2411 resets received
for embryonic SYN_RECV sockets
65 packets pruned from
receive queue because of socket buffer overrun
211 ICMP packets dropped
because they were out-of-window
14535660 TCP sockets
finished time wait in fast timer
181501 time wait
sockets recycled by time stamp
1370 packets rejects
in established connections because of timestamp
126238132 delayed acks
sent
43908 delayed acks
further delayed because of locked socket
Quick ack mode was
activated 1727250 times
372 times the listen
queue of a socket overflowed
372 SYNs to LISTEN
sockets ignored
521838861 packets
directly queued to recvmsg prequeue.
3963876152 packets
directly received from backlog
3457491267 packets
directly received from prequeue
293190 packets dropped
from prequeue
3307290762 packets
header predicted
3884415728 packets
header predicted and directly queued to user
886041005
acknowledgments not containing data received
1618307249 predicted
acknowledgments
1 times recovered from
packet loss due to fast retransmit
549898 times recovered
from packet loss due to SACK data
3 bad SACKs received
Detected reordering
163 times using FACK
Detected reordering
3012 times using SACK
Detected reordering
613 times using time stamp
616 congestion windows
fully recovered
7910 congestion
windows partially recovered using Hoe heuristic
TCPDSACKUndo: 459
10024 congestion
windows recovered after partial ack
2549350 TCP data loss
events
TCPLostRetransmit: 28
1 timeouts after reno
fast retransmit
5440 timeouts after
SACK recovery
1123 timeouts in loss
state
2193866 fast
retransmits
93844 forward
retransmits
99062 retransmits in
slow start
71073 other TCP
timeouts
TCPRenoRecoveryFail: 1
12967 sack retransmits
failed
265222 times receiver
scheduled too late for direct processing
17038 packets
collapsed in receive queue due to low socket buffer
1721513 DSACKs sent
for old packets
49 DSACKs sent for out
of order packets
10316 DSACKs received
550 DSACKs for out of
order packets received
29 connections reset
due to unexpected SYN
770680 connections
reset due to unexpected data
53290 connections
reset due to early user close
4355 connections
aborted due to timeout
IpExt:
InMcastPkts: 862471
InBcastPkts: 54622317
Listing 2 – An
example of the output from the script.
================================================================================
Netstat Differential Report.
Ip:
total
packets received: 8210
with invalid addresses: 1
incoming packets delivered: 8173
requests sent out: 8670
Icmp:
Icmp_input:
Icmp_output:
IcmpMsg:
Tcp:
active connections openings: 68
passive connection openings: 31
connection resets received: 2
segments received: 8012
segments send out:
8522
Udp:
packets received: 148
packets sent: 148
TcpExt:
TCP
sockets finished time wait in fast timer: 53
delayed acks sent: 211
packets
directly queued to recvmsg prequeue.: 4746
packets directly received from prequeue: 2240426
packets
header predicted: 1601
packets header
predicted and directly queued to user: 4731
acknowledgments not containing data received: 622
predicted
acknowledgments: 5877
connections reset due to unexpected data: 2
IpExt:
InBcastPkts: 13
End of Netstat Differential Report.
================================================================================
Recent Network stats reported by SAR.
12:00:01 AM IFACE rxpck/s
txpck/s rxbyt/s txbyt/s
rxcmp/s txcmp/s rxmcst/s
07:50:01 PM lo 15.97
15.97 49533.85 49533.85
0.00 0.00 0.00
07:50:01 PM eth0 139.20
179.11 90875.86 131371.47 0.00
0.00 0.44
07:50:01 PM eth1 34.23
35.50 6752.12 18519.29
0.00 0.00 0.44
07:50:01 PM eth2 1.75
0.01 159.03 0.65
0.00 0.00 0.44
07:50:01 PM eth3 1.99
0.00 183.95 0.11
0.00 0.00 0.44
07:50:01 PM sit0 0.00
0.00 0.00 0.00
0.00 0.00 0.00
================================================================================
Connections are seen to the following hosts.
g3db2v.sys.cigna.com.
gvoz0858.internal.cigna.com.
localhost.
lsh1010a.sys.cigna.com.
lshnap001a.internal.cigna.com.
lshslx001a.sys.cigna.com.
lshslx002a.sys.cigna.com.
lshslx003a.sys.cigna.com.
lshslx014a.sys.cigna.com.
lshslx015a.sys.cigna.com.
psystcp1.sys.cigna.com.
Listing 3 –
Korn-shell script that produces the netstat differential report shown above.
> cat netstat_differential_report.ksh
#!/bin/ksh
# This script simplifies taking a netstat snapshot before a
benchmark, taking
# another netstat snapshot after a benchmark, and then reporting
a
# comparison of just the network statistics that changed during
the benchmark.
#
# See the show_usage function for more explanation.
#
# History
# Feb 2013 SWC Wrote it.
#==============================================================================
export NET_FILE_1
export NET_FILE_2
export NET_STRING_FILE
#==============================================================================
show_usage ()
{
cat <<-EOF
Usage:
$0 -n SNAPSHOT_NUMBER
[-h|-?]
where
-n enables
specification of a snapshot number.
When you say
"-n 1", the script will run netstat -s and save the
output for
future reference.
When you say
"-n 2", the script will run netstat -s, and display a
report of
the differences in network statistics since
the first
snapshot was saved.
The temporary files used by this script are saved in a
subdirectory called logs.
Best wishes.
EOF
exit 1
}
#==============================================================================
# Parse the command line parameters.
if [ $# -eq 0 ]; then
show_usage
fi
while getopts n: option
do
case $option in
n)
SNAPSHOT_NUM=$OPTARG
;;
h)
show_usage
;;
\?)
show_usage
;;
esac
done
#==============================================================================
if [ "$SNAPSHOT_NUM." = "." ]; then
echo "Error:
Please specify which snapshot_number to process."
show_usage
exit 1
fi
DIR=`dirname $0`
cd $DIR
DIR=`pwd`
BASE=`basename $0 .ksh`
if [ ! -d $DIR/logs ]; then
mkdir $DIR/logs
fi
#==============================================================================
# Use this version when the netstat statistic is at the start of
the line.
report_netstat_diff_v1 ()
{
S=$1
grep
"$S$" $NET_FILE_1 | awk '{print $1}' | read V1
grep "$S$" $NET_FILE_2 | awk '{print
$1}' | read V2
let V=V2-V1
if [ $V -ne 0 ];
then
printf
"%60.60s: %d\n" "$S" $V
echo $S
>> $NET_STRING_FILE
fi
}
#==============================================================================
# Use this version when the netstat statistic follows a colon.
report_netstat_diff_v2 ()
{
S=$1
grep
"$S" $NET_FILE_1 | cut -f2 -d':' | read V1
grep
"$S" $NET_FILE_2 | cut -f2 -d':' | read V2
let V=V2-V1
if [ $V -ne 0 ];
then
printf
"%60.60s: %d\n" "$S" $V
echo $S
>> $NET_STRING_FILE
fi
}
#==============================================================================
# Use this version when the netstat statistic is in the middle
of the line.
report_netstat_diff_v3 ()
{
S1=$1
FIELD_NUM=$2
S2=$3
grep
"$S1" $NET_FILE_1 | grep "$S2" | cut -f$FIELD_NUM -d' ' |
read V1
grep
"$S1" $NET_FILE_2 | grep "$S2" | cut -f$FIELD_NUM -d' ' |
read V2
let V=V2-V1
if [ $V -ne 0 ];
then
printf
"%60.60s _ %s: %d\n" "$S1" "$S2" $V
echo $S
>> $NET_STRING_FILE
fi
}
#==============================================================================
if [ $SNAPSHOT_NUM -eq 1 ]; then
rm -f
$DIR/logs/netstat.*
NET_FILE=$DIR/logs/netstat.1.log
netstat -s >
$NET_FILE
if [ ! -f
$NET_FILE ]; then
print
"Error: Failed to save snapshot 1."
exit 1
fi
print
"`date`: Saved netstat snapshot number 1."
elif [ $SNAPSHOT_NUM -eq 2 ]; then
NET_FILE=$DIR/logs/netstat.2.log
netstat -s >
$NET_FILE
print "`date`:
Saved netstat snapshot number 2."
# Separate the
file contents by section
for N in 1 2; do
cat
$DIR/logs/netstat.$N.log |
while read
LINE; do
if
[ "$LINE" = "Ip:" ]; then
OUT_FILE=$DIR/logs/netstat.Ip.$N
elif [ "$LINE" = "Icmp:" ]; then
OUT_FILE=$DIR/logs/netstat.Icmp.$N
elif [ "$LINE" = "ICMP input histogram:" ]; then
OUT_FILE=$DIR/logs/netstat.Icmp_input.$N
elif [ "$LINE" = "ICMP output histogram:" ]; then
OUT_FILE=$DIR/logs/netstat.Icmp_output.$N
elif [ `echo "$LINE" | grep -c "ICMP messages sent"`
-gt 0 ]; then
# Continue putting lines back in the regular Icmp file
OUT_FILE=$DIR/logs/netstat.Icmp.$N
elif [ "$LINE" = "IcmpMsg:" ]; then
OUT_FILE=$DIR/logs/netstat.IcmpMsg.$N
elif [ "$LINE" = "Tcp:" ]; then
OUT_FILE=$DIR/logs/netstat.Tcp.$N
elif [ "$LINE" = "Udp:" ]; then
OUT_FILE=$DIR/logs/netstat.Udp.$N
elif [ "$LINE" = "TcpExt:" ]; then
OUT_FILE=$DIR/logs/netstat.TcpExt.$N
elif [ "$LINE" = "IpExt:" ]; then
OUT_FILE=$DIR/logs/netstat.IpExt.$N
fi
echo $LINE >> $OUT_FILE
done
done
printf
"%80s\n" " " | tr ' ' '='
print "Netstat Differential Report."
print
# Now compare the
different sections one at a time.
for TYPE in Ip
Icmp Icmp_input Icmp_output IcmpMsg Tcp Udp TcpExt IpExt; do
NET_FILE_1=$DIR/logs/netstat.$TYPE.1
NET_FILE_2=$DIR/logs/netstat.$TYPE.2
NET_STRING_FILE=$DIR/logs/netstat.strings.txt
rm -f
$NET_STRING_FILE
touch
$NET_STRING_FILE
print
"$TYPE:"
case $TYPE
in
Ip)
report_netstat_diff_v1 'total packets received'
report_netstat_diff_v1 'with invalid addresses'
report_netstat_diff_v1 'forwarded'
report_netstat_diff_v1
'incoming packets discarded'
report_netstat_diff_v1 'incoming packets delivered'
report_netstat_diff_v1 'requests sent out'
report_netstat_diff_v1 'outgoing packets dropped'
report_netstat_diff_v1 'fragments dropped after timeout'
report_netstat_diff_v1 'reassemblies required'
report_netstat_diff_v1 'packets reassembled ok'
report_netstat_diff_v1 'packet reassembles failed'
report_netstat_diff_v1 'packet reassembles failed'
report_netstat_diff_v1 'fragments received ok'
report_netstat_diff_v1 'fragments failed'
report_netstat_diff_v1 'fragments created'
;;
Icmp)
report_netstat_diff_v1 'ICMP
messages received'
report_netstat_diff_v1 'input ICMP message failed.'
report_netstat_diff_v1 'ICMP messages sent'
report_netstat_diff_v1 'ICMP messages failed'
;;
Icmp_input)
report_netstat_diff_v2 'destination unreachable'
report_netstat_diff_v2 'timeout in transit'
report_netstat_diff_v2 'echo requests'
report_netstat_diff_v2 'echo replies'
report_netstat_diff_v2 'timestamp request'
report_netstat_diff_v2 'address mask request'
;;
Icmp_output)
report_netstat_diff_v2 'destination unreachable'
report_netstat_diff_v2 'echo request'
report_netstat_diff_v2 'echo replies'
report_netstat_diff_v2 'timestamp replies'
;;
IcmpMsg)
report_netstat_diff_v2 'InType0'
report_netstat_diff_v2 'InType3'
report_netstat_diff_v2 'InType8'
report_netstat_diff_v2 'InType11'
report_netstat_diff_v2 'InType13'
report_netstat_diff_v2 'InType17'
report_netstat_diff_v2 'OutType3'
report_netstat_diff_v2
'OutType8'
report_netstat_diff_v2 'OutType14'
;;
Tcp)
report_netstat_diff_v1 'active connections openings'
report_netstat_diff_v1 'passive connection openings'
report_netstat_diff_v1 'failed connection attempts'
report_netstat_diff_v1 'connection resets received'
report_netstat_diff_v1 'connections established'
report_netstat_diff_v1 'segments received'
report_netstat_diff_v1
'segments send out'
report_netstat_diff_v1 'segments retransmited'
report_netstat_diff_v1 'bad segments received.'
report_netstat_diff_v1 'resets sent'
;;
Udp)
report_netstat_diff_v1 'packets received'
report_netstat_diff_v1 'packets sent'
report_netstat_diff_v1 'packet
receive errors'
report_netstat_diff_v1 'packets to unknown port received.'
;;
TcpExt)
report_netstat_diff_v1 'invalid SYN cookies received'
report_netstat_diff_v1 'resets received for embryonic SYN_RECV sockets'
report_netstat_diff_v1 'packets pruned from receive queue because of socket
buffer overrun'
report_netstat_diff_v1 'ICMP packets dropped because they were
out-of-window'
report_netstat_diff_v1 'TCP sockets finished time wait in fast timer'
report_netstat_diff_v1 'time wait
sockets recycled by time stamp'
report_netstat_diff_v1 'packets rejects in established connections
because of timestamp'
report_netstat_diff_v1 'delayed acks sent'
report_netstat_diff_v1 'delayed acks further delayed because of locked
socket'
report_netstat_diff_v3 'Quick ack mode was activated' 6 'times'
report_netstat_diff_v1 'times
the listen queue of a socket overflowed'
report_netstat_diff_v1 'SYNs to LISTEN sockets ignored'
report_netstat_diff_v1 'packets directly queued to recvmsg prequeue.'
report_netstat_diff_v1 'packets directly received from backlog'
report_netstat_diff_v1 'packets directly received from prequeue'
report_netstat_diff_v1 'packets dropped from prequeue'
report_netstat_diff_v1 'packets header predicted'
report_netstat_diff_v1 'packets header predicted and directly queued to
user'
report_netstat_diff_v1
'acknowledgments not containing data received'
report_netstat_diff_v1 'predicted acknowledgments'
report_netstat_diff_v1 'times recovered from packet loss due to fast
retransmit'
report_netstat_diff_v1 'times recovered from packet loss due to SACK
data'
report_netstat_diff_v1 'bad SACKs received'
report_netstat_diff_v3 'Detected reordering' 3 'times using FACK'
report_netstat_diff_v3 'Detected reordering' 3 'times using SACK'
report_netstat_diff_v3 'Detected reordering' 3 'times using time stamp'
report_netstat_diff_v1
'congestion windows fully recovered'
report_netstat_diff_v1 'congestion windows partially recovered using Hoe
heuristic'
report_netstat_diff_v2 'TCPDSACKUndo'
report_netstat_diff_v1 'congestion windows recovered after partial ack'
report_netstat_diff_v1 'TCP data loss events'
report_netstat_diff_v2 'TCPLostRetransmit'
report_netstat_diff_v1 'timeouts after reno fast retransmit'
report_netstat_diff_v1 'timeouts after SACK recovery'
report_netstat_diff_v1 'timeouts in loss state'
report_netstat_diff_v1 'fast retransmits'
report_netstat_diff_v1 'forward retransmits'
report_netstat_diff_v1 'retransmits in slow start'
report_netstat_diff_v1
'other TCP timeouts'
report_netstat_diff_v2 'TCPRenoRecoveryFail'
report_netstat_diff_v1 'sack retransmits failed'
report_netstat_diff_v1 'times receiver scheduled too late for direct
processing'
report_netstat_diff_v1 'packets collapsed in receive queue due to low
socket buffer'
report_netstat_diff_v1 'DSACKs sent for
old packets'
report_netstat_diff_v1 'connections reset due to unexpected SYN'
report_netstat_diff_v1 'connections reset due to unexpected data'
report_netstat_diff_v1
'connections reset due to early user close'
report_netstat_diff_v1 'connections aborted due to timeout'
;;
IpExt)
report_netstat_diff_v2 'InMcastPkts'
report_netstat_diff_v2 'InBcastPkts'
;;
esac
# Show any
strings that might have been missed in the case-statement above.
diff
$NET_FILE_1 $NET_FILE_2 |
grep -v '\-\-\-' |
grep -v '[0-9],[0-9]' |
grep -v '[0-9]c[0-9]' |
grep -v -f $NET_STRING_FILE
done
print
print "End of Netstat Differential
Report."
printf
"%80s\n" " " | tr ' ' '='
print "Recent Network stats reported by
SAR."
sar -n DEV | grep
IFACE | head -1
sar -n DEV | grep
-v Average | tail -6
printf
"%80s\n" " " | tr ' ' '='
print "Connections are seen to the following
hosts."
IP_FILE=$DIR/logs/$$.ip
netstat -nat | awk
'{ print $5}' | cut -d: -f1 | sed -e '/^$/d' | sort -u > $IP_FILE
NS_FILE=$DIR/logs/$$.ns
grep '\.' $IP_FILE
|
while read IP; do
nslookup
$IP 2>&1 | grep name | cut -f2 -d'=' >> $NS_FILE
done
sort -u $NS_FILE
# Cleanup
rm $NS_FILE
rm $IP_FILE
rm
$DIR/logs/netstat.*
fi
…End of another blog by the hardworking DBA!