How to Check for Network Packet Loss Between Host and Filer
[root@userhost ~]# ping testfiler-nas
PING testfiler-nas.us.oracle.com (192.168.0.2) 56(84) bytes of data.
64 bytes from testfiler-nas.us.oracle.com (192.168.0.2): icmp_seq=1 ttl=254 time=0.654 ms
64 bytes from testfiler-nas.us.oracle.com (192.168.0.2): icmp_seq=3 ttl=254 time=1.11 ms
64 bytes from testfiler-nas.us.oracle.com (192.168.0.2): icmp_seq=5 ttl=254 time=0.522 ms
64 bytes from testfiler-nas.us.oracle.com (192.168.0.2): icmp_seq=7 ttl=254 time=0.745 ms
64 bytes from testfiler-nas.us.oracle.com (192.168.0.2): icmp_seq=8 ttl=254 time=0.441 ms
64 bytes from testfiler-nas.us.oracle.com (192.168.0.2): icmp_seq=10 ttl=254 time=1.31 ms
--- testfiler-nas.us.oracle.com ping statistics ---
11 packets transmitted, 6 received, 45% packet loss, time 10002ms
rtt min/avg/max/mdev = 0.441/0.798/1.312/0.315 ms
[root@userhost ~]#
Solution: Check the network and switch configuration.
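The ping transcript above skips several icmp_seq values, which is what packet loss looks like on the wire. As a minimal sketch (not part of the original notes), the following Python lists the missing sequence numbers and recomputes the loss percentage from captured ping text. The function names `missing_sequences` and `loss_percent` are hypothetical, and the parsing assumes Linux-style ping output as shown above.

```python
import re

def missing_sequences(ping_output):
    """Return icmp_seq values with no reply, up to the highest sequence seen."""
    seen = {int(m) for m in re.findall(r"icmp_seq=(\d+)", ping_output)}
    if not seen:
        return []
    return [n for n in range(1, max(seen) + 1) if n not in seen]

def loss_percent(ping_output):
    """Recompute loss from the 'N packets transmitted, M received' summary line."""
    m = re.search(r"(\d+) packets transmitted, (\d+) received", ping_output)
    if m is None:
        return None
    sent, received = int(m.group(1)), int(m.group(2))
    return 100.0 * (sent - received) / sent if sent else None

# Sample text copied from the transcript above (reply lines shortened to the host part).
sample = """\
64 bytes from testfiler-nas.us.oracle.com (192.168.0.2): icmp_seq=1 ttl=254 time=0.654 ms
64 bytes from testfiler-nas.us.oracle.com (192.168.0.2): icmp_seq=3 ttl=254 time=1.11 ms
64 bytes from testfiler-nas.us.oracle.com (192.168.0.2): icmp_seq=5 ttl=254 time=0.522 ms
64 bytes from testfiler-nas.us.oracle.com (192.168.0.2): icmp_seq=7 ttl=254 time=0.745 ms
64 bytes from testfiler-nas.us.oracle.com (192.168.0.2): icmp_seq=8 ttl=254 time=0.441 ms
64 bytes from testfiler-nas.us.oracle.com (192.168.0.2): icmp_seq=10 ttl=254 time=1.31 ms
11 packets transmitted, 6 received, 45% packet loss, time 10002ms
"""

if __name__ == "__main__":
    print(missing_sequences(sample))  # [2, 4, 6, 9]
    print(loss_percent(sample))
```

Note that a lost final packet (here icmp_seq=11) does not appear in the list, because the sketch only looks up to the highest sequence number that was answered.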
See the following URL to learn how reallocate works:
http://www.theselights.com/2010/03/understanding-netapp-volume-and.html
WAFL / Reallocate
priv set advanced
To measure fragmentation:
wafl scan measure_layout VOLUME
Output: 1-100, where 1 is the least fragmented and 100 is fully fragmented.
The value should stay below 10. If it is above 10, reallocate and measure again:
wafl scan reallocate VOLUME
wafl scan measure_layout VOLUME
vol wafliron - checks the volume at the WAFL level
WAFL_check - use when inconsistencies occur, for example when a volume
suddenly becomes restricted. To correct an inconsistent volume:
1. Press Ctrl-C during boot
2. At the boot menu "Selection?" prompt, enter WAFL_check -z
For slow access, backup, or performance issues
Filer> wafl scan measure_layout vol0
Filer> wafl scan measure_layout /vol/vol0/filename
Filer> wafl scan status [vol|file]     ( to view scan status )
priv set "advanced;wafl scan status" | wc -l
Scheduling any job on the filer
From a Windows host ( the admin host ), enable rsh ( Windows 2003 box )
C:\> rsh filer -l root -n sysconfig -r     ( returned the command output; sim is the filer )
Filer HTTP access
1. license http
2. options httpd.enable on
3. options httpd.rootdir xxxx     ( a location such as /vol/vol0/<share path or qtree> )
Volume performance optimization
vol options volname minra on     ( minimal read-ahead )
Passwords
To change the admin host administrator's password
Filer> passwd
Filer> login administrator
Filer> new password:…..
To change the root password
1. Attach to the console ( direct console connection )
2. Press Ctrl-C while booting
3. On the menu, choose option 3 ( password change ) for root
Ctrl-C boot menu options
1. Password reset ( root )
2. Disk initialize: destroys the data and sets up a new filer
New filer setup
software get url -f filename
software install url
Environment
environment status all
Previous ONTAP on flash
priv set diag
version -b     ( shows the contents of flash )
Previous firmware upgrade of disks
priv set advanced
Filer*> disk_fw_update
Quotas
Lines in /etc/quotas
/vol/vol0/testftp tree 10m
NFS General
/etc/exports
/vol/test -rw,root=sun1
/vol/vol1 -rw,root=sun1
# mkdir /mnt/filer
# mount filer1:/vol/vol1 /mnt/filer
/etc/mnttab - maintains the mount points
/etc/hosts - name and IP address
/etc/nsswitch.conf - resolution order file
Filer> exportfs
Filer> rdfile /etc/exports
Filer> exportfs -a
Filer> exportfs -i -o rw=<ip address>,root=<ip address>
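Since a malformed /etc/exports line is an easy mistake to make, here is a minimal Python sketch (not from the original notes) that splits an exports line into its path and options and rejects an option list that lacks the leading dash. It assumes the simple "path -opt,opt=value" layout used in the examples above; `parse_export_line` is a hypothetical helper name, and real exports files may contain forms this sketch does not handle.

```python
def parse_export_line(line):
    """Split an /etc/exports line into (path, options dict).

    Assumes the 'path -opt1,opt2=value' layout shown in these notes.
    """
    path, _, opts = line.strip().partition(" ")
    opts = opts.strip()
    if opts and not opts.startswith("-"):
        raise ValueError("option list must start with '-': %r" % opts)
    options = {}
    for item in opts.lstrip("-").split(","):
        if not item:
            continue
        key, _, value = item.partition("=")
        # Options without a value (such as rw) are stored as True.
        options[key] = value or True
    return path, options

if __name__ == "__main__":
    print(parse_export_line("/vol/test -rw,root=sun1"))
```

Running a check like this over the output of `rdfile /etc/exports` before `exportfs -a` is one way to catch a missing dash early.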
NFS troubleshooting
wcc -u <unix user>     ( shows the UNIX credential )
> exportfs -c host pathname ro|rw|root     # checks the access cache for host permission
> exportfs -s pathname     # verifies the path to which a volume is exported
> exportfs -f     # flushes access cache entries and reloads
> exportfs -r     # ensures only persistent exports are loaded
NFS error 70 - stale file handle
> vol read_fsid
# mount     ( on the UNIX host; displays which protocol is used for mounting )
# mount -o tcp < >
Qtree security
portmap -d
rpcinfo -p <filer ip>
Lock Manager Release
priv set advanced
sm_mon -u <NFS_client_hostname>
Error while changing the file mode
chmod 4710 oidldapd
chmod: changing permissions of `oidldapd': Input/output error
/var/log/messages shows the following error:
Mar 30 19:44:59 bilbo kernel: nfs_refresh_inode: inode number mismatch
Mar 30 19:44:59 bilbo kernel: expected (0x950485c3/0x9b7609), got (0x950485c3/0x7d0b11)
Resolution: the customer removed nosuid from the exports file, which solved the issue.
Permission Denied: File handle
67000000 6ad77710 20000000 107754a 99f750f 84ce0064 67000000 6ad77700
First two numbers: FSID
Next three: FID, inode, FID
Next three: FID of the export point
The inode differs from volume to volume. Identify the volume with:
priv set diag
vol read_fsid vol0
=> This gives a hex number. It should match one of the numbers above, which indicates the volume whose file has the problem. The hex number can also be converted to decimal.
On the UNIX side
# find -inum <decimal value>
# find /mnt/clearcase -inum _________
( checking the FID for the above mount point )
# /etc/mnttab
( look here to find that number as well )
# ls -li     ( prints inode numbers in decimal; convert to hex )
# find . -inum <number> -print
( Sometimes the vol FSID number must be reversed to find the exact location of the inode )
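The hex-to-decimal conversion and the "reversed" lookup mentioned above can be sketched in Python. This is only an illustration, not part of the original notes: it assumes each word in the file handle is a 32-bit hex value and that "reversed" means byte-order reversal, which the notes do not state explicitly. The helper names are hypothetical.

```python
def hex_to_decimal(word):
    """Convert one hex word from the file handle to decimal (for find -inum)."""
    return int(word, 16)

def byte_reverse(word):
    """Reverse the byte order of a hex word, assuming it is a 32-bit value.

    Example: '6ad77710' becomes '1077d76a'.
    """
    padded = word.zfill(8)  # left-pad short words such as '107754a' to 8 hex digits
    return bytes.fromhex(padded)[::-1].hex()

# File handle copied from the notes above.
handle = "67000000 6ad77710 20000000 107754a 99f750f 84ce0064 67000000 6ad77700".split()

if __name__ == "__main__":
    for word in handle:
        print(word, hex_to_decimal(word), byte_reverse(word))
```

Comparing both the plain and the byte-reversed form of each word against the `vol read_fsid` output covers either byte order.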
General Permission Problems
Check the export permissions
Check the local UNIX system: file-level and owner-level permissions, and also qtree security
( Sometimes filer permissions take precedence over the local permissions on the UNIX box, so the local ones are hidden and cannot be seen )
To find, use
# chmod
# chown
To read UNIX files
# cat
# more
# vi
NFS Performance
pktt start e5a, pktt dump e5a, pktt stop e5a ( all three, from start to end )
sysstat
nfsstat -d ( displays cache statistics )
        -z ( zeroes out the statistics )
        -m ( mount point statistics )
perfstat -b -f filename > perfstat.begin
perfstat -e -f filename > perfstat.end
# time mkfile 10m test ( time it takes )
# time cp test
Windows host > sio_ntap_sol 100 100 4096 100m 10 2 a.file b.file -noflock
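On hosts without mkfile, the `time mkfile 10m test` check above can be approximated with a short script. The following Python is a rough sketch under stated assumptions, not a benchmark tool: `timed_write` is a hypothetical helper, the 64 KB block size is arbitrary, and the result includes client-side caching effects except for the final fsync.

```python
import os
import tempfile
import time

def timed_write(path, size_mb=10, block_kb=64):
    """Write size_mb of zeros to path and return (seconds, MB per second)."""
    block = b"\0" * (block_kb * 1024)
    blocks = (size_mb * 1024) // block_kb
    start = time.perf_counter()
    with open(path, "wb") as f:
        for _ in range(blocks):
            f.write(block)
        f.flush()
        os.fsync(f.fileno())  # push the data out of the client cache before stopping the clock
    elapsed = time.perf_counter() - start
    rate = (size_mb / elapsed) if elapsed > 0 else float("inf")
    return elapsed, rate

if __name__ == "__main__":
    # Replace the temporary directory with the NFS mount point, for example /mnt/filer.
    with tempfile.TemporaryDirectory() as d:
        seconds, rate = timed_write(os.path.join(d, "test"))
        print("wrote 10 MB in %.3f s (%.1f MB/s)" % (seconds, rate))
```

Running it once against a local disk and once against the NFS mount gives a quick relative comparison.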
CPU utilization 100 percent
The customer needs to collect and send:
perfstat -f <file name> -t 5 > perfstat.out
More detailed perfstat
---------------------
perfstat -t 2 -f nasx > text.txt
perfstat -t 2 -f nasx -P flat > text.txt
-P domains ( SMP )
~ flat
~ kahuna
~ network
~ raid
~ storage
Other NFS options
options wafl.root_only_chown on
options cifs.nfs_root.ignore_acl on
Common NFS error messages
nfs mount: /remote_file_system_name: Stale NFS file handle
This error message means that an opened file or directory has been destroyed or recreated.
Volume priority
[root@adminhost ~]# rsh myfiler priority set volume test level=medium
[root@adminhost ~]# rsh myfiler priority show volume test -v
Volume            Priority  Relative  Sys Priority
                  Service   Priority  (vs User)
pkinci_36b_code   on        Medium    VeryLow
[root@adminhost ~]# rsh myfiler priority set volume test level=medium
[root@adminhost ~]#
Filer> priority show
Priority scheduler is stopped.
Priority scheduler system settings:
io_concurrency: 8
Filer> priority on
Priority scheduler starting.
Filer> Wed Mar 17 14:46:38 EDT [Filer: wafl.priority.enable:info]: Priority scheduling is being enabled
Valid level and system options include:
1. VeryHigh
2. High
3. Medium
4. Low
5. VeryLow
Filer> priority set volume vol1 level=High
Filer> priority show volume -v vol1
Volume: vol1
Enabled: on
Level: High
System: Medium
Cache: n/a
Filer> priority delete volume vol2
Filer> priority show volume vol2
Unable to find priority scheduling information for 'vol2'
Below is a sample output of the FlexShare counters:
NetApp1*> stats show prisched
prisched:prisched:queued:0
prisched:prisched:queued_max:5
NetApp1*> stats show priorityqueue
priorityqueue:vol1:weight:76
priorityqueue:vol1:usr_weight:78
[root@adminhost ~]# rsh <filer> sysstat -m
ANY AVG CPU0 CPU1 CPU2 CPU3
100% 38% 18% 17% 20% 97%
98% 37% 18% 17% 17% 96%
100% 41% 23% 21% 22% 97%
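In the per-CPU output above, one processor sits near 100% while the others stay around 20%, which the ANY and AVG columns alone can hide. As a minimal sketch (not part of the original notes), the following Python picks out CPUs that stay above a threshold in every sample. It assumes the whitespace-separated column layout shown above; `hot_cpus` and the 90% threshold are hypothetical choices.

```python
def hot_cpus(sysstat_text, threshold=90):
    """Return the CPU column names whose utilization is >= threshold in every sample."""
    lines = [ln.split() for ln in sysstat_text.strip().splitlines() if ln.strip()]
    if len(lines) < 2:
        return []
    header, rows = lines[0], lines[1:]
    hot = []
    for i, name in enumerate(header):
        if not name.startswith("CPU"):
            continue  # skip the ANY and AVG summary columns
        values = [int(row[i].rstrip("%")) for row in rows]
        if min(values) >= threshold:
            hot.append(name)
    return hot

# Sample copied from the sysstat output above.
sample = """\
ANY AVG CPU0 CPU1 CPU2 CPU3
100% 38% 18% 17% 20% 97%
98% 37% 18% 17% 17% 96%
100% 41% 23% 21% 22% 97%
"""

if __name__ == "__main__":
    print(hot_cpus(sample))  # ['CPU3']
```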
[root@adminhost ~]# rsh <filer> priv set "diag;sysstat -M 1"
or rsh <filer> sysstat -x 1
[root@adminhost ~]# rsh <filer> reallocate status
Reallocation scans are on
No reallocation status.
[root@adminhost ~]#
Monitoring processes that consume the most CPU
[root@adminhost ~]# rsh myfiler priv set "advanced;ps -c 4"
Warning: These advanced commands are potentially dangerous; use
them only when directed to do so by Network Appliance
personnel.
Process statistics over 43686235.339 seconds...
ID State Domain %CPU StackUsed %StackUsed Name
1 RR i 75% 1080 26% idle_thread0
2 RR i 73% 560 13% idle_thread1
3 RR i 70% 560 13% idle_thread2
4 RR i 53% 560 13% idle_thread3
[root@adminhost ~]# rsh myfiler priv set "advanced;ps -c 4"
Warning: These advanced commands are potentially dangerous; use
them only when directed to do so by Network Appliance
personnel.
Process statistics over 47818127.225 seconds...
ID State Domain %CPU StackUsed %StackUsed Name
1 RR i 73% 968 23% idle_thread0
2 RR i 69% 512 12% idle_thread1
3 RR i 63% 512 12% idle_thread2
4 RR i 60% 512 12% idle_thread3
75 BR s 5% 3176 38% isp2400_intrd
91 BR 1 9% 2024 6% GbE/e1a
152 BG 0 4% 4440 6% NwkThd_00
153 BG 1 17% 6488 9% NwkThd_01
154 BG 2 14% 6408 9% NwkThd_02
205 RR r 11% 5928 36% raidio_thread
1415 BR w 5% 5816 17% wafl_exempt_0
1416 BR w 8% 5816 17% wafl_exempt_1
1417 RR w 7% 5832 17% wafl_exempt_2
1418 BR w 8% 5832 17% wafl_exempt_3
1420 BR k 8% 13664 41% wafl_lopri
2001 BR k 4% 13680 41% gr_scheduler
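To pick the busiest non-idle processes out of a long `ps -c` listing, the output can be filtered and sorted. The following Python is a minimal sketch, not part of the original notes; it assumes the seven-column layout shown above (ID, State, Domain, %CPU, StackUsed, %StackUsed, Name), and `top_consumers` is a hypothetical helper name.

```python
def top_consumers(ps_text, limit=3):
    """Return the (cpu_percent, name) pairs with the highest %CPU, ignoring idle threads."""
    procs = []
    for line in ps_text.splitlines():
        fields = line.split()
        # Data rows have exactly seven columns and start with a numeric process ID.
        if len(fields) != 7 or not fields[0].isdigit():
            continue
        name = fields[6]
        if name.startswith("idle_thread"):
            continue
        procs.append((int(fields[3].rstrip("%")), name))
    procs.sort(key=lambda p: p[0], reverse=True)
    return procs[:limit]

# A few rows copied from the ps output above.
sample = """\
ID State Domain %CPU StackUsed %StackUsed Name
1 RR i 73% 968 23% idle_thread0
91 BR 1 9% 2024 6% GbE/e1a
153 BG 1 17% 6488 9% NwkThd_01
154 BG 2 14% 6408 9% NwkThd_02
205 RR r 11% 5928 36% raidio_thread
"""

if __name__ == "__main__":
    for cpu, name in top_consumers(sample):
        print("%3d%%  %s" % (cpu, name))
```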
NFS lock clear
rsh myfiler "priv set advanced; sm_mon -l testhost"
To see disk speed
[root@adminhost etc]# rsh myfiler storage show disk -a | less
or
[root@adminhost etc]# rsh myfiler vol status -r
Disk: 0a.16
Shelf: 1
Bay: 0
Serial: ZBHM63PH
Vendor: NETAPP
Model: X267_HKURO500SSX
Rev: AB0A
RPM: 7200
Check for busy disks
[root@adminhost etc]# rsh myfiler stats show disk:*:disk_busy
Check for failed disks
rsh myfiler storage show disk -p
To show all disks on the system: use "storage show disk -T"
To get shelf details of the filer: use "storage show shelf <shelf-id>"
To check the number of hosts connected to the filer and the number of NFS ops
[root@adminhost etc]# rsh myfiler nfsstat -l