Saturday, March 26, 2011

tcpdump wrapper for network and systems troubleshooters

UPDATE:
Another post describes a more elegant way to do the same job: tshark in network troubleshooting

Original post:

One of the tools a network engineer relies on every day is a network sniffer. One of the best known, I believe, is tcpdump.

Very often when you troubleshoot a problem you run it many times to verify the traffic on the wire. Let's say that at some point you see where the problem may be and you need to send an email with your analysis to another person.

Documenting your results can be very time consuming. To save time and improve the quality of the report we would of course like to attach the dump files we reviewed ourselves. Unfortunately it is annoying to repeat the whole troubleshooting session only to save the dumps to disk this time. Often sending just the analyzed text output from tcpdump is not enough either.

The small tcpdump wrapper below can save you a lot of time: it saves the tcpdump data to a file and still lets you follow the data on the screen during live troubleshooting.

For a couple of examples of how to run it, please scroll down.

The file with the source code can be found here: mytcpdump.sh

# you can define the filter and options in your bash variables
# example:
# T_FILTER='arp or icmp or (net 10.0.0.0/8)'
# T_OPTIONS='-s0 -l -nn -w - -i eth0'
# note: the options must keep '-w -', otherwise tee does not receive raw pcap data

# ------------------------------------------

# arg1 - capture filter passed to tcpdump
# arg2 - options passed to tcpdump
# (the messages below say 'wireshark', but both values are handed to tcpdump)
mytcpdump () {
 # parse args
 
 DEFAULT_OPT='-s0 -l -nn -w - -i any'
 
 if [ 'x-h' = x"$1" ] ; then 
  echo 
  echo "usage: mytcpdump [arg1] [arg2]"
  echo " arg1 - wireshark network filter, by example: 'arp and (net 10/8)'"
  echo " arg2 - wireshark options, default: '$DEFAULT_OPT'"
  echo ""
  echo " example:"
  echo "   mytcpdump"
  echo "   mytcpdump '(net 10.0.0.0/8 and not net 11.0.0.0/8) and port 22'"
  echo "   mytcpdump '(net 10.0.0.0/8 and not net 11.0.0.0/8) and port 22' '-s0 -l -nn -i eth0 -w -' "
  echo
  
  return 
 fi
 
 # filters
 if [ '1' != 1"$1" ] ; then 
  filter="$1"
 elif [ '2' != 2"$T_FILTER" ]; then
  filter=$T_FILTER
 else
  filter=""
 fi

 # options
 if [ '1' != 1"$2" ] ; then 
  opts="$2"
 elif [ '2' != 2"$T_OPTIONS" ]; then
  opts=$T_OPTIONS
 else
  opts="$DEFAULT_OPT"
 fi 
 
 t=`date +%s`;
 echo "[$t]: timestamp is $t" 
 echo "[$t]: wireshark optoins are <$opts>"
 echo "[$t]: wireshark filter is <$filter>"

 cmd="tcpdump $opts $filter"
 echo "[$t]: tcpdump cmd is <$cmd>"
 
 f="/var/tmp/tcpdump.$t.pcap"
 echo "[$t]: tcpdump pcap file <$f>"
 
 chain="$cmd | tee $f | tcpdump -r- -nn"
 echo "[$t]: running the bash command chains <$chain>" 
 
 $cmd | tee $f | tcpdump -r- -nn
}

alias myt='mytcpdump'
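
The heart of the wrapper is the tee in the middle of the pipeline: it writes a raw copy to disk while passing the same bytes on to a second reader. The pattern can be tried without tcpdump or root privileges. This is only a minimal sketch; the function name tee_demo and the temporary file are illustrative assumptions and not part of mytcpdump.sh.

```shell
# Hypothetical demo of the "save and still watch" pattern used by mytcpdump.
# printf stands in for the capturing tcpdump, tr for the second tcpdump that
# decodes the stream for the screen.
tee_demo () {
    local copy="$1"                     # where the raw copy is saved
    printf 'pkt1\npkt2\n' | tee "$copy" | tr 'a-z' 'A-Z'
}

f=$(mktemp)
tee_demo "$f"          # the screen shows the processed stream
cat "$f"               # the file holds the unmodified data
rm -f "$f"
```

The saved file is untouched by whatever the last command in the pipeline does, which is why the pcap written by the wrapper can later be opened with any other tool.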

Usage help

# myt -h

usage: mytcpdump [arg1] [arg2]
 arg1 - wireshark network filter, by example: 'arp and (net 10/8)'
 arg2 - wireshark options, default: '-s0 -l -nn -w - -i any'

 example:
   mytcpdump
   mytcpdump '(net 10.0.0.0/8 and not net 11.0.0.0/8) and port 22'
   mytcpdump '(net 10.0.0.0/8 and not net 11.0.0.0/8) and port 22' '-s0 -l -nn -i eth0 -w -'



Examples:

The 2 examples below show how to use this small wrapper. Each time we can monitor the live traffic on the console and at the same time be sure that a copy of the raw tcpdump data is written to disk.

The file you may want to copy later is shown at the beginning of the output, after the '[timestamp]' prefix. In our examples the file names are:

/var/tmp/tcpdump.1301166859.pcap
/var/tmp/tcpdump.1301166869.pcap

# myt 
[1301166859]: timestamp is 1301166859
[1301166859]: wireshark optoins are <-s0 -l -nn -w - -i any>
[1301166859]: wireshark filter is <>
[1301166859]: tcpdump cmd is <tcpdump -s0 -l -nn -w - -i any >
[1301166859]: tcpdump pcap file </var/tmp/tcpdump.1301166859.pcap>
[1301166859]: running the bash command chains <tcpdump -s0 -l -nn -w - -i any  | tee /var/tmp/tcpdump.1301166859.pcap | tcpdump -r- -nn>
tcpdump: listening on any, link-type LINUX_SLL (Linux cooked), capture size 65535 bytes
reading from file -, link-type LINUX_SLL (Linux cooked)
19:14:19.837101 IP 192.168.43.111 > 212.77.100.101: ICMP echo request, id 42106, seq 5396, length 64
^Ctcpdump: pcap_loop: error reading dump file: Interrupted system call
4 packets captured
6 packets received by filter
0 packets dropped by kernel

# myt 'icmp or arp'
[1301166869]: timestamp is 1301166869
[1301166869]: wireshark optoins are <-s0 -l -nn -w - -i any>
[1301166869]: wireshark filter is <icmp or arp>
[1301166869]: tcpdump cmd is <tcpdump -s0 -l -nn -w - -i any icmp or arp>
[1301166869]: tcpdump pcap file </var/tmp/tcpdump.1301166869.pcap>
[1301166869]: running the bash command chains <tcpdump -s0 -l -nn -w - -i any icmp or arp | tee /var/tmp/tcpdump.1301166869.pcap | tcpdump -r- -nn>
tcpdump: listening on any, link-type LINUX_SLL (Linux cooked), capture size 65535 bytes
reading from file -, link-type LINUX_SLL (Linux cooked)
19:14:29.847649 IP 192.168.43.111 > 212.77.100.101: ICMP echo request, id 42106, seq 5406, length 64
^C2 packets captured
2 packets received by filter
0 packets dropped by kernel
tcpdump: pcap_loop: error reading dump file: Interrupted system call

Saturday, March 12, 2011

Funny way to print lines in reverse order

This is definitely something you need to see at least once in your life to know what you should never do ;).

The complete code examples below are taken from the original post at Print lines in reverse order

The recommended way:
$ tac file

The funnier ways to achieve the same result ;)
#example1
$ cat -n file | sort -k1,1rn | cut -f 2-

#example2
$ perl -e 'print reverse <>' file

#example3; sorry, this one is not a one line command
$ i=0
$ while IFS= read -r arr[i]; do ((i++)); done < file
$ for((j=i-1;j>=0;j--)); do printf "%s\n" "${arr[$j]}"; done
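
To convince yourself that a funny variant really prints the same thing as tac, the two outputs can be compared. A small sketch, assuming GNU tac is installed and using a throw-away temporary file; the three sample lines are made up for illustration.

```shell
# Hypothetical check: the cat/sort/cut pipeline should print the same lines as tac.
f=$(mktemp)
printf 'one\ntwo\nthree\n' > "$f"

expected=$(tac "$f")
actual=$(cat -n "$f" | sort -k1,1rn | cut -f 2-)

if [ "$expected" = "$actual" ]; then
    echo "same output"
else
    echo "outputs differ"
fi
rm -f "$f"
```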

Friday, March 11, 2011

LVM is reporting I/O errors after creating a snapshot and modifying it

Once you create a snapshot of one of your logical volumes you can mount it in rw mode and use it as a normal file system/block device. Unfortunately, after some time you may get errors like the one below.

# lvs
  /dev/group1/snap: read failed after 0 of 4096 at 0: Input/output error
  Volume group group2 is exported
  LV         VG     Attr   LSize  Origin Snap%  Move Log             Copy%  Convert
  l1         group1 owi-ao 12.00m                                                  
  mirror.new group1 mwi-a- 12.00m                    mirror.new_mlog 100.00        
  snap       group1 Swi-I-  4.00m l1     100.00                                    
  snap2      group1 swi-a- 20.00m l1       0.23                                    
  snap3      group1 swi-ao  4.00m l1       1.17                                    
  snap4      group1 swi-a-  4.00m l1       1.17        

The key to understanding this is the description of the attribute field (lv_attr) in the lvs man page.

$ man lvs | less -R  +2/lv_attr -X
              1  Volume type: (m)irrored, (M)irrored without initial sync, (o)rigin, (p)vmove, (s)napshot, invalid (S)napshot,
                 (v)irtual, mirror (i)mage, mirror (I)mage out-of-sync, under (c)onversion

              2  Permissions: (w)riteable, (r)ead-only

              3  Allocation policy: (c)ontiguous, c(l)ing, (n)ormal, (a)nywhere, (i)nherited This is capitalised if the volume is
                 currently locked against allocation changes, for example during pvmove (8).

              4  fixed (m)inor

              5  State: (a)ctive, (s)uspended, (I)nvalid snapshot, invalid (S)uspended snapshot, mapped (d)evice present without
                 tables, mapped device present with (i)nactive table

              6  device (o)pen

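In the lvs output the attributes of /dev/group1/snap are Swi-I-: the capital S in the first position and the I in the fifth both mark an invalid snapshot, and the Snap% column shows 100.00. It means that our snapshot device /dev/group1/snap has run out of free space. The only way to fix it is to remove the snapshot, because unfortunately we can't trust its content any longer. Next time make sure the space allocated to the snapshot logical volume is big enough when you plan to change a lot of data on it.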

# lvremove /dev/group1/snap
# lvcreate -n snap -s -L <size> group1/l1
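
To notice a filling snapshot before it becomes invalid, the Snap% value can be checked from time to time. Below is a minimal sketch, not a finished monitoring tool: the helper name snap_usage_warn and the threshold of 80 are assumptions made for illustration, and the lvs call in the comment assumes the lv_name and snap_percent report fields of LVM2.

```shell
# Hypothetical helper: print snapshots whose usage is at or above a threshold.
# Input: lines of "<lv name> <snap percent>"; lines without a percentage
# (volumes that are not snapshots) are skipped.
snap_usage_warn () {
    awk -v limit="$1" 'NF == 2 && $2 + 0 >= limit + 0 { print $1, $2 }'
}

# Sample values taken from the lvs output above
printf '%s\n' 'l1' 'snap 100.00' 'snap2 0.23' 'snap3 1.17' | snap_usage_warn 80

# On a real system the input could come from something like:
#   lvs --noheadings -o lv_name,snap_percent group1 | snap_usage_warn 80
```

With the sample values only the line for snap is printed. A snapshot that is getting full but is still valid can usually be grown with lvextend before it runs out of space.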

Reference:
3.8. Snapshots
LVM is reporting I/O errors, but the disk reports no problems

Which storage solution is better – quick comparison between LVM mirroring and software RAID1 in Linux

With the introduction of LVM2 we now have the ability to create mirrors in LVM. It is relatively easy to set up and works similarly to RAID1.

At some point when playing with it you start asking yourself what the actual technical difference is between the two in Linux:
  • LVM mirroring
  • RAID1
To the end user it looks as if both provide the same functionality of creating and maintaining a redundant block device. But are they very different from each other? Before I give you a couple of ideas to think about, let me briefly recall how these 2 technologies are defined.

1. What is RAID (taken from the Red Hat docs)
13.1. What is RAID
...
RAID allows information to be spread across several disks. RAID uses techniques such as disk striping (RAID Level 0), disk mirroring (RAID Level 1), and disk striping with parity (RAID Level 5) to achieve redundancy, lower latency, increased bandwidth, and maximized ability to recover from hard disk crashes.

RAID distributes data across each drive in the array by breaking it down into consistently-sized chunks (commonly 256K or 512k, although other values are acceptable). Each chunk is then written to a hard drive in the RAID array according to the RAID level employed. When the data is read, the process is reversed, giving the illusion that the multiple drives in the array are actually one large drive.
....

2. Logical Volumes (taken from the Red Hat docs)
1.2. Logical Volumes
...
Volume management creates a layer of abstraction over physical storage, allowing you to create logical storage volumes. This provides much greater flexibility in a number of ways than using physical storage directly. With a logical volume, you are not restricted to physical disk sizes. In addition, the hardware storage configuration is hidden from the software so it can be resized and moved without stopping applications or unmounting file systems.
...

As we see, the descriptions are quite different. So where are the differences going to be visible? Well, I was trying to find a good technical summary of how these 2 techniques work but found only many small articles scattered around Google and various docs.

Below is only a quick summary of what could be good to know when trying to answer my question. Note as well that the list is not meant to be complete and reflects only my personal state of knowledge.
  • How does the cache size in modern hard drives impact the write and read operations of my block device? ( Storage Admin guide )
  • Is the read/write performance the same for LVM and RAID block devices? It looks like it is not, and RAID can achieve a faster read throughput when accessing the data ( Linux md vs. LVM performance )
  • In RAID1 both disks are used for read operations by default. LVM is the opposite: by default only the ‘primary’ disk is used.
  • With LVM mirroring you can create more than one mirror. A command like the one below creates 2 mirror copies in addition to the original. Linux software RAID1 can also be built with more than two active members, but a two-disk mirror with any extra devices acting as spares is the usual setup.
# lvcreate -m 2 ...

  • You can detach the mirror from the ‘primary’ logical volume and start using it for whatever you want (backup, testing, ...). Doing this immediately gives you a copy of your primary logical volume, but of course removes the redundancy, at least temporarily. If you want the redundancy back you have to enable it again with a command like:
# lvconvert -m ...
  • LVM supports snapshots of logical volumes. At the time of writing a snapshot can’t be taken of a mirrored volume, but you can always detach the mirror and start using it as a normal logical volume. As soon as it becomes a normal logical volume you can start creating snapshots.
# lvconvert --splitmirrors ...
# lvconvert -m1 ...
  • As long as you have free space in your volume group you can have more than one snapshot.
  • By default LVM mirroring needs more devices: +1 for the mirror and +1 for the log, so a logical volume with a single mirror needs 3 devices in total. This allocation policy, especially for the log, can be changed and is controlled with the following options:
    • --mirrorlog (lvcreate)
    • --alloc anywhere (vgcreate)
  • Having a mirror in place doesn't limit your use of other LVM features. It means you should be able to extend your logical volume by extending the physical volume it is based on. Once the logical volume grows, the file system on this block device can grow as well. All this can be done online, without taking the server offline or down.
  • Can you extend the file system so easily on the RAID1 device?
  • Depending on which mount point you are trying to add to your system, you may have some trouble during the boot phase with the default initrd (it often shows up when you play with the /boot and / file systems, as described here: rhelv5-list - LVM mirroring )

Monday, March 7, 2011

Searching trick for the Linux less tool

Even after so many years of working on the command line I constantly find new tricks ;). This is one of them, for the less tool.

With the syntax below less does a normal search, like the '/' command, but looks for the 2nd occurrence of our regular expression and, once found, jumps straight to it.

$ man tune2fs | less -R  +2/-U -X
      -U UUID
              Set   the   universally   unique   identifier   (UUID)   of   the  filesystem  to  UUID.   The  format  of  the  UUID  is  a  series  of  hex  digits  separated  by  hyphens,  like  this:
              "c1b9d5a2-f162-11cf-9ece-0020afc76f16".  The UUID parameter may also be one of the following:

                   clear  clear the filesystem UUID

                   random generate a new randomly-generated UUID

                   time   generate a new time-based UUID

              The UUID may be used by mount(8), fsck(8), and /etc/fstab(5) (and possibly others) by specifying UUID=uuid instead of a block special device name like /dev/hda1.

              See uuidgen(8) for more information.  If the system does not have a good random number generator such as /dev/random or /dev/urandom, tune2fs will  automatically  use  a  time-based  UUID
              instead of a randomly-generated UUID.

The source of this trick can be found on the FAQ page of less itself: Less FAQ
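
The same idea, start reading at the N-th match, can also be used non-interactively, for example in a script. The following is only a sketch of a counterpart built on awk; the function name from_nth_match is an assumption made for this example and has nothing to do with less itself.

```shell
# Hypothetical helper: print the input starting at the N-th line matching REGEX,
# a non-interactive counterpart of "less +N/REGEX".
from_nth_match () {
    awk -v n="$1" -v pat="$2" '
        !found && $0 ~ pat && ++count == n { found = 1 }
        found
    '
}

# Prints "c -U" and "d": the output starts at the second line containing -U
printf '%s\n' 'a -U' 'b' 'c -U' 'd' | from_nth_match 2 '-U'
```

Used on the man page from the example it could look like: man tune2fs | from_nth_match 2 '-U' | head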

How to sort data on more than one column with the Linux sort command

You always catch yourself looking for old recipes you wrote years ago. It is time to document this once and for all.

This produces a list of the services maintained by the upstart init daemon on a Linux system, sorted:
  • 1st alphabetically, based on the status description
  • 2nd numerically, based on the pid value
  • 3rd alphabetically, based on the service name

$ initctl  list | sed 's/ (/(/g'  | sort -k2,2d -k4,4n -k1,1d | column -t  
bridge-network-interface(eth0)    start/running
bridge-network-interface(lo)      start/running
bridge-network-interface(pan0)    start/running
bridge-network-interface(virbr0)  start/running
bridge-network-interface(wlan0)   start/running
network-interface(eth0)           start/running
network-interface(lo)             start/running
network-interface(pan0)           start/running
network-interface-security        start/running
network-interface(virbr0)         start/running
network-interface(wlan0)          start/running
ufw                               start/running
upstart-udev-bridge               start/running,  process  358
udev                              start/running,  process  366
smbd                              start/running,  process  870
rsyslog                           start/running,  process  889
dbus                              start/running,  process  891
ssh                               start/running,  process  897
gdm                               start/running,  process  904
network-manager                   start/running,  process  905
avahi-daemon                      start/running,  process  912
tty4                              start/running,  process  1046
tty5                              start/running,  process  1051
tty2                              start/running,  process  1060
tty3                              start/running,  process  1062
tty6                              start/running,  process  1065
acpid                             start/running,  process  1070
cron                              start/running,  process  1084
atd                               start/running,  process  1085
libvirt-bin                       start/running,  process  1096
tty1                              start/running,  process  1627
nmbd                              start/running,  process  2156
alsa-mixer-save                   stop/waiting
anacron                           stop/waiting
apport                            stop/waiting
console-setup                     stop/waiting
control-alt-delete                stop/waiting
dmesg                             stop/waiting
failsafe-x                        stop/waiting
hostname                          stop/waiting
hwclock                           stop/waiting
hwclock-save                      stop/waiting
irqbalance                        stop/waiting
module-init-tools                 stop/waiting
mountall                          stop/waiting
mountall-net                      stop/waiting
mountall-reboot                   stop/waiting
mountall-shell                    stop/waiting
mounted-dev                       stop/waiting
mounted-tmp                       stop/waiting
mounted-varrun                    stop/waiting
networking                        stop/waiting
plymouth                          stop/waiting
plymouth-log                      stop/waiting
plymouth-splash                   stop/waiting
plymouth-stop                     stop/waiting
procps                            stop/waiting
rc                                stop/waiting
rcS                               stop/waiting
rc-sysinit                        stop/waiting
screen-cleanup                    stop/waiting
udev-finish                       stop/waiting
udevmonitor                       stop/waiting
udevtrigger                       stop/waiting
ureadahead                        stop/waiting
ureadahead-other                  stop/waiting
usplash                           stop/waiting
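
The effect of the three keys is easier to see on a handful of lines. A small sketch with made-up sample input in the same format as above; the wrapper name sort_services is an assumption, and LC_ALL=C is set only to keep the ordering independent of the local language settings.

```shell
# Hypothetical wrapper around the sort keys used above
sort_services () {
    # key 2: status (dictionary order), key 4: pid (numeric), key 1: name
    LC_ALL=C sort -k2,2d -k4,4n -k1,1d
}

sort_services <<'EOF'
ssh start/running, process 897
cron start/running, process 1084
ufw start/running
atd stop/waiting
anacron stop/waiting
udev start/running, process 366
EOF
```

The running services come first, with ufw at the top because a missing pid is treated as 0 by the numeric key, followed by udev, ssh and cron in pid order. The stopped services have no pid at all, so the third key decides and anacron is listed before atd.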

Google and the system documentation are your friends.

$ man sort
$ info coreutils 'sort invocation'

Sorting based on Multiple columns

Sort Files Like A Master With The Linux Sort Command

How to highlight source text on your blog

There are many situations when you would like to publish a piece of code with special formatting for the reader.

This post describes one possible solution, based on the JavaScript library SyntaxHighlighter.

To integrate it with your blog follow the description here:

Adding a Syntax Highlighter to your Blogger blog

The list of brushes (JavaScript files) that you have to copy into your blog design template depends on your needs. If you want to use, for example, the 'plain' and 'bash' brushes you have to extend the list.

<script src='http://alexgorbatchev.com/pub/sh/current/scripts/shBrushPerl.js' type='text/javascript'/>

<script src='http://alexgorbatchev.com/pub/sh/current/scripts/shBrushPlain.js' type='text/javascript'/>
<script src='http://alexgorbatchev.com/pub/sh/current/scripts/shBrushBash.js' type='text/javascript'/>

<script language='javascript'>

The list of all brushes can be found at Bundled Brushes.

To convert 'raw' HTML into 'escaped' HTML code you can use this convert tool.
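
If you prefer the command line, the escaping can also be done with sed. This is only a sketch that covers the three characters that matter inside a pre block; the function name html_escape is an assumption made for this example, not part of SyntaxHighlighter.

```shell
# Hypothetical helper: escape the characters that HTML treats specially.
# The ampersand is replaced first so the entities added later stay intact.
html_escape () {
    sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g'
}

# Prints: &lt;h2&gt;This is an example.&lt;/h2&gt;
printf '%s\n' '<h2>This is an example.</h2>' | html_escape
```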

A basic example of use in a blog post can look like this:

<pre class="brush:text; highlight: [1];">
<h2>This is an example.</h2>
</pre>

Pretty printing for bash command output

Some commands print their output in a way that is hard to read. Try the 'column' tool, which turns unstructured output into a nicely aligned table.

$ mount | column -t
/dev/sda2         on  /                         type  ext3                   (rw,relatime,errors=remount-ro)
proc              on  /proc                     type  proc                   (rw)
none              on  /sys                      type  sysfs                  (rw,noexec,nosuid,nodev)
none              on  /sys/fs/fuse/connections  type  fusectl                (rw)
none              on  /sys/kernel/debug         type  debugfs                (rw)
none              on  /sys/kernel/security      type  securityfs             (rw)
...