Thursday, December 30, 2010

ISP Report Tool v2.0

Important Update: See new ispReport version 3.0 for the latest version of this tool.


There was quite a bit of interest in the first version of my ISP report tool, along with requests for a graphical report.  Toward that end, I updated the tool to create and automatically update a web report.  You can see a sample manufactured report below.


The ispReport daemon runs the report generator every night just after midnight to update the report.
To manually create the report, just run "ispReport mkwww".  The web report is stored in ~/.ispReport/www/index.html.  You can copy this file to a web server to view it remotely, or you can edit the script to specify where the web report should be stored.

I also incorporated an automated monthly log rotation in order to keep from creating a single mammoth log file.
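
As a rough illustration of what a monthly rotation can look like (this is a minimal sketch, not the code from ispReport; the function name rotate_log and the ".YYYY-MM" suffix are assumptions made for the example):

```shell
#!/bin/sh
# Hypothetical helper, not taken from ispReport: move the current log aside
# under a month suffix and start a fresh, empty log in its place.
rotate_log() {
    log=$1
    # Second argument is the suffix; default to the current year and month.
    stamp=${2:-$(date +%Y-%m)}
    # Nothing to do if the log is missing or empty.
    [ -s "$log" ] || return 0
    mv "$log" "$log.$stamp" && : > "$log"
}

# Example: rotate_log "$HOME/isp.log" 2010-12
```

Calling this once at the start of each month keeps any single log file from growing without bound.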

You can download this latest, and most likely final, version (ispReport-v2.0) here.




Have a blessed day!

Brad
PS: As always, the sample scripts provided are for reference and are not supported in any way.

Wednesday, December 29, 2010

ZM: 2010 Analytics

It is good to look back over the year to gain an appreciation of progress and to plan for the future.  This blog post is devoted to looking over the stats of TheZoneManager.com for 2010.  Not including this post, I submitted 30 blog posts this year.  Most were technical and hopefully valuable.

As you can see from the map below, I had over 15,700 visitors from 131 countries around the globe.

One quarter of visitors came from direct traffic.  Another quarter came from referring sites such as blogs.sun.com, opensolaris.org, and braddiggs.com.  The remaining half came from search engines, primarily Google.  Interestingly, the most common Google search was for the dsbulkloader.


The most popular blog post by almost a 2 to 1 margin was the 2009 blog post on Filesystem Cache Optimization Strategies.

The blog post with the most comments (7) was on ZFS Cache Improvements.

I also kept track of all of the script downloads from dl.thezonemanager.com.  In 2010, there were over 5,000 downloads of the 15 scripts.  Unfortunately, I don't have a breakdown by script.  That has been added for next year's recap.

Next year, I hope to publish much more valuable content to make 2011 better than ever.

Blessings to you and yours!


Brad

Tuesday, December 28, 2010

ISP Report Tool

Important Update: See new ispReport version 3.0 for the latest version of this tool.

My Internet Service Provider (ISP) is usually very reliable.  However, whenever problems arise, the ISP typically takes weeks to reach a final resolution.  I usually start to notice Internet speed drops due to packet loss well before my Internet access stops working completely.  It would be nice to have a summary report and detailed logs that I could send to the ISP's support team to show exactly when the problems started, how long they have been going on, and what degree of degradation I have experienced over that time.  I looked for a tool that could provide this sort of information and couldn't find anything that really met my needs.  So, I wrote a script called ispReport for this purpose.  The rest of this blog post details the script's capabilities and usage.

What ispReport Does
This ispReport script pings yahoo.com and abc.com every 5 minutes.  If there is any packet loss or ping can't reach the target hosts, it stores the ping results in a log file.  You can run "ispReport showlog" to view the contents of the log and "ispReport report" to show a summary report of logged data.
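
The core check can be sketched in a few lines of shell.  This is only an illustration, not the code from ispReport: the function name packet_loss is made up for the example, and it assumes that the ping summary contains a phrase of the form "N% packet loss", as it does on Linux and Mac OS X.

```shell
#!/bin/sh
# Hypothetical helper, not taken from ispReport: read ping output on stdin
# and print the integer part of the reported packet loss percentage.
packet_loss() {
    # Matches e.g. "4 packets transmitted, 3 received, 25% packet loss"
    # as well as "0.0% packet loss".  Prints nothing if no summary is found.
    sed -n 's/.* \([0-9][0-9]*\)\(\.[0-9]*\)\{0,1\}% packet loss.*/\1/p'
}

# Example (needs network access; -c sets the number of packets to send):
#   ping -c 10 yahoo.com 2>&1 | packet_loss
```

A wrapper could then log the result whenever the value is greater than zero or the function prints nothing at all.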

Download
You can download ispReport-v1.0 here.  It works on Solaris, Linux, and Mac OS X.

Usage
ispReport has the following 9 subcommands.
  start - Start the script
  stop - Stop the script
  status - Check the status to see if it is running or not
  report - Create a report from the log file
  showlog - Examine the contents of the log file
  install - Install the script
  uninstall - Uninstall the script
  init - Archive the existing log file and create a new log file.
  usage - See the usage of the script

Sample Log
The following is a sample log as seen by running "ispReport showlog".

2010/12/28 10:29:47|yahoo.com|0|ping: cannot resolve yahoo.com: Unknown host
2010/12/28 10:29:47|abc.com|0|ping: cannot resolve abc.com: Unknown host
2010/12/28 11:15:47|yahoo.com|12|47.776/51.426/63.937/2.684
2010/12/28 11:28:09|abc.com|26|45.276/53.851/65.548/4.271

The pipe (|) delimited columns when the host is reachable are as follows:
  Column 1: Date and time stamp
  Column 2: Host being pinged
  Column 3: ISP State in terms of percentage up
  Column 4: Number of packets sent
  Column 5: Round-Trip Time (rtt) statistics - min/avg/max/mdev

The pipe (|) delimited columns when the host is NOT reachable are as follows:
  Column 1: Date and time stamp
  Column 2: Host being pinged
  Column 3: ISP State in terms of percentage up
  Column 4: Error message

Sample Report
One of the most important things that I need to provide to support is a quantification of the outage incurred.  The report function provides a summary of the outage data for all dates logged in the log file.
Below is a sample output from running "ispReport report".

This report summarizes the number of times per day 
that packet loss was detected over each of four 
packet loss percentage ranges.  For example, the 
number in the second column represents the number 
of times that a packet loss of 1-24% was detected.

Date           1-24%    25-49%    50-74%   75-100%
2010/12/27         4         0         0         8
2010/12/28         2         0         0         2
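
For readers curious how such a summary can be produced, here is a minimal sketch of the bucketing logic.  It is not the ispReport implementation: the function name loss_report and the simplified input format (one "DATE LOSS_PERCENT" pair per line) are assumptions made for the example.

```shell
#!/bin/sh
# Hypothetical sketch, not the ispReport code: count, per day, how many
# samples fall into each of the four packet loss ranges.
# Input: lines of the form "YYYY/MM/DD LOSS", where LOSS is an integer 0-100.
loss_report() {
    awk '
        $2 >= 1 {
            # Remember the order in which days first appear.
            if (!($1 in seen)) { seen[$1] = 1; order[++n] = $1 }
            if      ($2 < 25) a[$1]++
            else if ($2 < 50) b[$1]++
            else if ($2 < 75) c[$1]++
            else              d[$1]++
        }
        END {
            printf "%-10s %9s %9s %9s %9s\n", "Date", "1-24%", "25-49%", "50-74%", "75-100%"
            for (i = 1; i <= n; i++) {
                day = order[i]
                printf "%-10s %9d %9d %9d %9d\n", day, a[day], b[day], c[day], d[day]
            }
        }
    ' "$@"
}

# Example: printf '%s\n' '2010/12/27 10' '2010/12/27 100' | loss_report
```

Samples with 0% loss are skipped, so a day only appears in the table once some loss has been recorded for it.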

Enjoy!



Brad
PS: As always, the sample scripts provided are for reference and are not supported in any way.

Thursday, December 23, 2010

Linux Memory Information

When doing application performance analysis, I like to study the operating system view of memory from three perspectives.  These three views include the hardware view, the usage view, and the filesystem cache view.  This post will look at the tools that I use for each of these views for both my reference and your benefit.  On to the views...

Hardware View
From a performance analysis perspective, the maximum possible performance is typically achieved by filling all DIMM slots in a server.  It is also important to ensure that all of the DIMMs are clocked at the same, and highest possible, clock rate.  Both of these can be checked with the hardware lister (a.k.a. lshw).  To list just memory, run "lshw -class memory".  In the following example output, you can see that two 4GiB SODIMMs are installed and that each runs at a clock rate of 1067MHz.

# lshw -class memory

  *-memory
       description: System Memory
       physical id: 19
       slot: System board or motherboard
       size: 8GiB
     *-bank:0
          description: SODIMM Synchronous 1067 MHz (0.9 ns)
          product: HMT351S6BFR8C-H9
          vendor: 80AD
          physical id: 0
          serial: 2A21EBFC
          slot: DIMM_A
          size: 4GiB
          width: 64 bits
          clock: 1067MHz (0.9ns)
     *-bank:1
          description: SODIMM Synchronous 1067 MHz (0.9 ns)
          product: EBJ41UF8BAS0-DJ-F
          vendor: 02FE
          physical id: 1
          serial: 5210B456
          slot: DIMM_B
          size: 4GiB
          width: 64 bits
          clock: 1067MHz (0.9ns)









lshw RPMs for RedHat and Oracle Enterprise Linux are available at http://packages.sw.be/lshw.
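
On servers with many slots, the full lshw listing gets long.  The following sketch condenses it to one line per populated bank.  The function name dimm_summary is made up for this example, and the parsing assumes the field layout shown in the sample output above (slot, size, and clock lines under each *-bank entry).

```shell
#!/bin/sh
# Hypothetical helper: condense "lshw -class memory" output to one
# "slot size clock" line for each bank that reports a clock rate.
# Reads the named files, or standard input when none are given.
dimm_summary() {
    awk '
        /\*-bank/                { inbank = 1; slot = ""; size = "" }
        inbank && $1 == "slot:"  { slot = $2 }
        inbank && $1 == "size:"  { size = $2 }
        inbank && $1 == "clock:" { print slot, size, $2 }
    ' "$@"
}

# Example (lshw usually needs root to see DIMM details):
#   lshw -class memory | dimm_summary
```

Mismatched sizes or clock rates then stand out at a glance.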

Usage View
The second view looks at how memory is allocated in the kernel and by apps.  For this perspective, there are several commands.  However, the top two commands that I use are "cat /proc/meminfo" and top.  Meminfo does a good job of showing how much memory is installed, in use and its allocations to the kernel.  Here is a sample output from meminfo.

# cat /proc/meminfo
MemTotal:        8185600 kB
MemFree:         7836324 kB
Buffers:          115772 kB
Cached:           115612 kB
SwapCached:            0 kB
Active:           143820 kB
Inactive:         127628 kB
Active(anon):      40528 kB
Inactive(anon):    37804 kB
Active(file):     103292 kB
Inactive(file):    89824 kB
Unevictable:           0 kB
Mlocked:               0 kB
HighTotal:       7395716 kB
HighFree:        7229432 kB
LowTotal:         789884 kB
LowFree:          606892 kB
SwapTotal:       8193144 kB
SwapFree:        8193144 kB
Dirty:                 0 kB
Writeback:             0 kB
AnonPages:         40088 kB
Mapped:            29328 kB
Shmem:             38272 kB
Slab:              21672 kB
SReclaimable:       9472 kB
SUnreclaim:        12200 kB
KernelStack:        1992 kB
PageTables:         1564 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:    12285944 kB
Committed_AS:     312748 kB
VmallocTotal:     122880 kB
VmallocUsed:       38272 kB
VmallocChunk:      79820 kB
HardwareCorrupted:     0 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
DirectMap4k:       10232 kB
DirectMap2M:      903168 kB










The top command can help you find the top consumers of memory on a system.  To sort the list of processes by memory consumption, run top and then press Shift-M.  Below is a sample output from my home server.  As you can see, Xorg is currently the top memory consumer at 0.2% of the available 8GB of RAM.




Filesystem Cache View
The third and final perspective is the filesystem cache allocation.  This is a very important perspective to consider since it is not necessarily obvious to most people.  The filesystem cache is intended to offload disk read operations by keeping filesystem data in memory.  For example, running the following command twice consecutively will result in the second run taking much less time than the first.

# time find /  > /dev/null

This is because the first iteration reads the inode data for all of the files listed from disk and stores that information in the filesystem cache.  The second iteration reads the same information from the filesystem cache.  In my case, the first run took 27.7 seconds and the second run took only 0.3 seconds.

Some useful tools for determining how much memory the filesystem cache occupies include the following:

# free -m





# grep "^Cached" /proc/meminfo
Cached:          7671700 kB
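
To put the Cached figure in context, it can help to express it as a share of total memory.  This is a small sketch, with the function name cache_percent made up for the example; it reads the MemTotal and Cached lines in the /proc/meminfo format shown earlier.

```shell
#!/bin/sh
# Hypothetical helper: print the Cached value as a percentage of MemTotal.
# Reads /proc/meminfo by default, or the file given as the first argument.
cache_percent() {
    awk '
        $1 == "MemTotal:" { total = $2 }
        $1 == "Cached:"   { cached = $2 }
        END { if (total > 0) printf "%.1f\n", 100 * cached / total }
    ' "${1:-/proc/meminfo}"
}

# Example: cache_percent
```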

More info on the contents of /proc/meminfo is available in this article.

The following illustrates the filesystem cache (i.e. the cache column) filling up as I dumped large files into the filesystem cache.

# vmstat -nS m 10











For Linux distributions with 2.6 and later kernels, vm.swappiness and vm.vfs_cache_pressure largely govern how much data can fit into the filesystem cache and how quickly data will be swapped out from memory to swap space (i.e. to disk).  RedHat and Oracle Enterprise Linux also support capping the upper boundary of the filesystem cache with vm.pagecache.  Additional information on each of these kernel properties is listed below.


vm.swappiness determines how aggressively the kernel swaps data out from memory to swap space (i.e. disk).  A higher value means that the kernel will swap more readily.  A value of 0 tells the kernel to avoid swapping for as long as it can; it does not disable swap outright.  It is safest to go no lower than 1.  The default value is 60.  You can set this property with the following command:

# sysctl -w vm.swappiness=1

To make the change permanent, you need to add the following line to /etc/sysctl.conf.

vm.swappiness = 1


vm.vfs_cache_pressure controls the tendency of the kernel to reclaim the memory used for caching directory and inode objects.  The default value is 100.  Lower values make the kernel prefer to retain these caches, and higher values make it reclaim them more readily.  You can set this property with the following command:


# sysctl -w vm.vfs_cache_pressure=50

To make the change permanent, you need to add the following line to /etc/sysctl.conf.
vm.vfs_cache_pressure = 50


vm.pagecache has the following three values.
 1. Minimum percent of memory used for page cache.  Default is 1%.
 2. Initial percent of memory allocated for cache.  Default is 15%.
 3. Maximum percent of memory used for page cache.  Default is 100%.


You can set this property with the following command:

# sysctl -w vm.pagecache="50 70 80"

To make the change permanent, you need to add the following line to /etc/sysctl.conf.
vm.pagecache = 50 70 80
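
Editing /etc/sysctl.conf by hand works fine, but if you script these changes it is worth making the edit repeatable so that running it twice does not leave duplicate lines.  The following is a minimal sketch under that assumption; the function name persist_sysctl is made up for this example and is not a standard tool.

```shell
#!/bin/sh
# Hypothetical helper: set "key = value" in a sysctl configuration file,
# replacing any existing assignment of the same key.
# Usage: persist_sysctl KEY VALUE [FILE]   (FILE defaults to /etc/sysctl.conf)
persist_sysctl() {
    key=$1
    value=$2
    conf=${3:-/etc/sysctl.conf}
    tmp=$(mktemp) || return 1
    # Copy every line except existing assignments of this key.
    if [ -f "$conf" ]; then
        awk -v key="$key" -F'=' '
            { k = $1; gsub(/[ \t]/, "", k); if (k != key) print }
        ' "$conf" > "$tmp"
    fi
    # Append the new assignment, then write the result back in place.
    printf '%s = %s\n' "$key" "$value" >> "$tmp"
    cat "$tmp" > "$conf" && rm -f "$tmp"
}

# Example (as root): persist_sysctl vm.swappiness 1
```

After updating the file, "sysctl -p" loads the settings without a reboot.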


See the proc man page for further details on these kernel properties.

That is it for this post.  Have a great day and a blessed Christmas!





Brad


PS: As always, the sample scripts provided are for reference and are not supported in any way.