Thursday, 17 November 2011

LUN Resizing on NetApp Filer & Mount on Windows Host

 1. First take the LUN offline.
 2. Then run the command: lun resize <lun path> <new size>
 3. The new size is the total size, not the increment: if the existing size is 10g and you want to add 5g, give 15g as the new size.
 4. Bring the LUN back online, then run diskmgmt.msc on the Windows host.
 5. It will show the new space as unallocated space.

 6. Run the command diskpart from the Run prompt (on the Windows host).
 7. list volume
 8. select volume <volume number>
 9. extend
         That's it.
10. Verify in Disk Management that the new size has been picked up and that you are able to open that partition.
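
Since lun resize takes the total size rather than the increment, the arithmetic in step 3 can be sketched as below. This is only an illustration: the helper names and the LUN path /vol/vol1/lun0 are made up for the example, and the k/m/g/t suffixes are assumed to be binary multiples.

```python
import re

# Multipliers for the size suffixes (assumed to be binary units in this sketch).
_UNITS = {"k": 1024, "m": 1024**2, "g": 1024**3, "t": 1024**4}


def parse_size(text):
    """Convert a size such as '10g' into a number of bytes."""
    match = re.fullmatch(r"(\d+)([kmgt])", text.strip().lower())
    if not match:
        raise ValueError(f"unrecognised size: {text!r}")
    return int(match.group(1)) * _UNITS[match.group(2)]


def format_size(num_bytes):
    """Express a byte count with the largest suffix that divides it evenly."""
    for suffix in ("t", "g", "m", "k"):
        if num_bytes % _UNITS[suffix] == 0:
            return f"{num_bytes // _UNITS[suffix]}{suffix}"
    raise ValueError("size is not a whole number of kilobytes")


def new_lun_size(current, increment):
    """Return the total size to give to 'lun resize' (current + increment)."""
    return format_size(parse_size(current) + parse_size(increment))


def resize_command(lun_path, current, increment):
    """Build the resize command line for a (hypothetical) LUN path."""
    return f"lun resize {lun_path} {new_lun_size(current, increment)}"


if __name__ == "__main__":
    # Existing size 10g, adding 5g: the command must ask for 15g.
    print(resize_command("/vol/vol1/lun0", "10g", "5g"))
```

The command string is only built here, not sent to the filer; it would still be run by hand (or over rsh) after taking the LUN offline.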
-------------------------------------------------------------------------------------------

Saturday, 12 November 2011

How to Run Unix Commands on a NetApp Filer?

[root@adminhost ~]# rsh testfiler

testfiler> java netapp.cmds.jsh
java netapp.cmds.jsh
IOException
jsh>

jsh> ?
?
Java Shell commands:
        cd [directory]
        pwd
        ls [-l]
        cat file
        rm file [file2 ...]
        cp src dest
        mv src dest
        ps [-l]
        kill <-1|-9> threadName
        gc
        classpath [pathname]
        syspath [pathname]
        Debug on|off
        threads
        monitors
        heap
        version
        syncdb
        du [-sk] [files or directories]
        java_class [&]
        ONTAP_cmd
jsh>

Thursday, 10 November 2011

Issue on Netapp filer

/testfiler/etc
Thu Nov 10 08:17:56 CST [testfiler: export.host.resolve.timeout:warning]: Trial 3 for the nameservice lookup request timed out.
Thu Nov 10 08:18:01 CST [testfiler: export.host.resolve.timeout:warning]: Trial 2 for the nameservice lookup request timed out.
Thu Nov 10 08:18:01 CST [testfiler: export.host.resolve.timeout:warning]: Trial 3 for the nameservice lookup request timed out.
Thu Nov 10 08:18:09 CST [testfiler: export.host.resolve.timeout:warning]: Trial 1 for the nameservice lookup request timed out.
Thu Nov 10 08:18:21 CST [testfiler: export.host.resolve.timeout:warning]: Trial 1 for the nameservice lookup request timed out.

Solution: 1. Check for routing issues.
          2. If the hosts can be resolved only through the DNS server, use the FQDN for those hosts.
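
To see how often the lookups are timing out, the warnings can be counted per trial number. A minimal sketch, assuming the messages have been copied off the filer as plain text lines in the format shown above; the function name is made up for the example.

```python
import re
from collections import Counter

# Matches the export.host.resolve.timeout warnings shown in the log excerpt.
TIMEOUT_RE = re.compile(
    r"\[(?P<filer>[^:\]]+): export\.host\.resolve\.timeout:warning\]: "
    r"Trial (?P<trial>\d+) for the nameservice lookup request timed out"
)


def count_timeouts(lines):
    """Return a Counter mapping trial number -> number of timeout warnings."""
    counts = Counter()
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            counts[int(match.group("trial"))] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "Thu Nov 10 08:17:56 CST [testfiler: export.host.resolve.timeout:warning]: "
        "Trial 3 for the nameservice lookup request timed out.",
        "Thu Nov 10 08:18:09 CST [testfiler: export.host.resolve.timeout:warning]: "
        "Trial 1 for the nameservice lookup request timed out.",
    ]
    for trial, hits in sorted(count_timeouts(sample).items()):
        print(f"trial {trial}: {hits} timeout(s)")
```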

List of Failed Electronics Modules: "1" on Netapp filer

It's a bug in 7.3.1.1P2 and 7.3.2P2. The bug ID is 383376. It is fixed in later versions.


Solution: check for any actual failure with the command: environment status shelf


List of Failed Electronics Modules: "1"


Tuesday, 8 November 2011

Netapp Filer Bottleneck & Performance Tips

How to Check for Network Packet Loss from Host to Filer

[root@userhost ~]# ping testfiler-nas
PING testfiler-nas.us.oracle.com (192.168.0.2) 56(84) bytes of data.
64 bytes from testfiler-nas.us.oracle.com (192.168.0.2): icmp_seq=1 ttl=254 time=0.654 ms
64 bytes from testfiler-nas.us.oracle.com (192.168.0.2): icmp_seq=3 ttl=254 time=1.11 ms
64 bytes from testfiler-nas.us.oracle.com (192.168.0.2): icmp_seq=5 ttl=254 time=0.522 ms
64 bytes from testfiler-nas.us.oracle.com (192.168.0.2): icmp_seq=7 ttl=254 time=0.745 ms
64 bytes from testfiler-nas.us.oracle.com (192.168.0.2): icmp_seq=8 ttl=254 time=0.441 ms
64 bytes from testfiler-nas.us.oracle.com (192.168.0.2): icmp_seq=10 ttl=254 time=1.31 ms

--- testfiler-nas.us.oracle.com ping statistics ---
11 packets transmitted, 6 received, 45% packet loss, time 10002ms
rtt min/avg/max/mdev = 0.441/0.798/1.312/0.315 ms
[root@userhost ~]#


Solution: Packet loss like this points to the network, so check the network/switch configuration.
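
The loss figure and the missing sequence numbers can also be pulled out of saved ping output. A minimal sketch, assuming Linux-style ping output as shown above, with icmp_seq starting at 1; the function names are made up for the example.

```python
import re

# Summary line, e.g. "11 packets transmitted, 6 received, 45% packet loss, ..."
SUMMARY_RE = re.compile(
    r"(\d+) packets transmitted, (\d+) received, (\d+(?:\.\d+)?)% packet loss"
)


def parse_ping_summary(output):
    """Return (transmitted, received, loss_percent) from ping output."""
    match = SUMMARY_RE.search(output)
    if match is None:
        raise ValueError("no ping summary line found")
    return int(match.group(1)), int(match.group(2)), float(match.group(3))


def missing_sequences(output):
    """List the icmp_seq numbers that never got a reply (assumes seq starts at 1)."""
    transmitted, _, _ = parse_ping_summary(output)
    seen = {int(n) for n in re.findall(r"icmp_seq=(\d+)", output)}
    return [n for n in range(1, transmitted + 1) if n not in seen]
```

Regularly spaced gaps in the sequence numbers, as opposed to one long outage, can be a useful hint when talking to the network team.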
 
See the URL below to learn how reallocate works:

http://www.theselights.com/2010/03/understanding-netapp-volume-and.html


WAFL / Reallocate

priv set advanced
Measure fragmentation:
wafl scan measure_layout VOLUME
Output: 1-100, where 1 is the least fragmentation and 100 is full fragmentation.
            It should stay below 10.
            If it is above 10, then run:
                wafl scan reallocate VOLUME
                wafl scan measure_layout VOLUME
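
The rule of thumb above can be written down as a small helper. This is just a sketch of the decision, assuming a measure_layout score has already been read off by hand; the function name is made up, and a score of exactly 10 is treated here as still acceptable.

```python
# Threshold taken from the note above: reallocate when the score is above 10.
REALLOCATE_THRESHOLD = 10


def next_commands(volume, layout_score):
    """Return the commands to run next for a given measure_layout score."""
    if not 1 <= layout_score <= 100:
        raise ValueError("layout score is expected to be between 1 and 100")
    if layout_score > REALLOCATE_THRESHOLD:
        # Reallocate, then measure again to confirm the layout improved.
        return [
            f"wafl scan reallocate {volume}",
            f"wafl scan measure_layout {volume}",
        ]
    return []
```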


vol wafliron   - checks the volume at the WAFL level

WAFL_check  (for when inconsistencies occur, e.g. when a volume

                       suddenly becomes restricted)

                To correct an inconsistent volume:

                1. Press Ctrl-C during boot

                2. Options: at the "Selection ?" prompt, enter WAFL_check -z

For slow access or backup or performance issues


Filer> wafl scan measure_layout vol0

Filer> wafl scan measure_layout /vol/vol0/filename

Filer> wafl scan status [vol|file]    ---- to view the scan status

priv set "advanced;wafl scan status" |wc -l


Scheduling any job on the filer


From a Windows host (the admin host), enable rsh (Windows 2003 box)

C:\> rsh filer -l root -n sysconfig -r    gave the command output (sim is the filer)


Filer HTTP access

1. License HTTP

2. options httpd.enable on

3. options httpd.rootdir xxxx ---- a location like /vol/vol0/<share path or qtree>

Volume Performance Optimization

vol options volname minra on     (minimal read-ahead)

Passwords

To change the admin host administrator's password

Filer> passwd

Login: administrator

New password: …..

To change the root password

1. Attach to the console (direct console connection)

2. Press Ctrl-C while booting

3. On the menu, choose option 3 (password change) for root

Ctrl-C  - boot menu options

1. Password reset   ---  root

2. Disk initialize: destroys the data and sets up a new filer


New filer setup

software get url -f filename

software install url

Environment

environment status all

Previous ONTAP on flash

priv set diag

version -b   --- shows the contents of the flash

Previous firmware upgrade of disks

priv set advanced

Filer*> disk_fw_update


Quotas


Lines in /etc/quotas

/vol/vol0/testftp     tree     10m


NFS General

/etc/exports
/vol/test    -rw,root=sun1

/vol/vol1    -rw,root=sun1

# mkdir /mnt/filer

# mount filer1:/vol/vol1 /mnt/filer

/etc/mnttab  - maintains the mount points

/etc/hosts   - name and IP address

/etc/nsswitch.conf  - name resolution order

Filer> exportfs

Filer> rdfile /etc/exports

Filer> exportfs -a

Filer> exportfs -i -o rw=<ip address>,root=<ip address> <pathname>
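
When several volumes need the same kind of entry, the /etc/exports lines can be generated rather than typed. A minimal sketch, assuming the "path -options" layout shown above with multiple hosts separated by colons; the function name and the host and volume names are only examples.

```python
def export_line(path, rw_hosts=None, root_hosts=None):
    """Build one /etc/exports entry such as '/vol/vol1<TAB>-rw,root=sun1'."""
    options = []
    if rw_hosts:
        # Restrict read-write access to the listed hosts.
        options.append("rw=" + ":".join(rw_hosts))
    else:
        # Plain 'rw' with no host list, as in the example entries above.
        options.append("rw")
    if root_hosts:
        options.append("root=" + ":".join(root_hosts))
    return f"{path}\t-{','.join(options)}"


if __name__ == "__main__":
    for volume in ("/vol/test", "/vol/vol1"):
        print(export_line(volume, root_hosts=["sun1"]))
```

The generated lines would still need to be written to /etc/exports and loaded with exportfs -a.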

NFS troubleshooting


wcc -u <unix user>   ---------- shows the UNIX credentials

> exportfs -c host pathname ro|rw|root   # checks the access cache for host permissions

> exportfs -s pathname   # verifies the path to which a volume is exported

> exportfs -f   # flushes the access cache entries and reloads them

> exportfs -r   # ensures only persistent exports are loaded

NFS error 70 - stale file handle

> vol read_fsid

# mount   --- displays which protocol is being used for each mount (on the UNIX host)

# mount -o tcp < >

Qtree security


portmap -d

 rpcinfo -p <filer ip>

Lock Manager Release

priv set advanced

sm_mon -u <NFS_client_hostname>


While changing the mode:

chmod 4710 oidldapd

chmod: changing permissions of `oidldapd': Input/output error

If I look in /var/log/messages I see the following error:

Mar 30 19:44:59 bilbo kernel: nfs_refresh_inode: inode number mismatch

Mar 30 19:44:59 bilbo kernel: expected (0x950485c3/0x9b7609), got (0x950485c3/0x7d0b11)


Told the customer to remove nosuid from the exports file, and that solved the issue.

Permission Denied: File handle

67000000   6ad77710  20000000  107754a 99f750f  84ce0064  67000000   6ad77700

First two numbers: FSID

Next three: FID, inode, FID

Next three: FID of the export point


Now, the FSID is different for each volume

It is found with

priv set diag

vol read_fsid vol0

 => This gives a hex number, which should match one of the numbers above; that indicates which volume the problem file belongs to. The hex number can also be converted to a decimal value.

On the UNIX side

# find -inum <decimal value>

# find /mnt/cleearcase -inum  _________

  (checking the FID for the above mount point)


# /etc/mnttab

  (look here to find that number as well)


# ls -li   prints inode numbers in decimal; convert them to hex

# find . -inum <number> -print


(Sometimes the vol FSID number found must be reversed to get the exact place of the inode)
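
The hex/decimal conversions and the "reversed" form can be done with a few lines of Python instead of by hand. A minimal sketch: the helper names are made up, and it assumes that "reversed" means reversing the byte order of a 32-bit word, which the notes above do not spell out.

```python
# File handle words from the example above.
HANDLE = "67000000 6ad77710 20000000 107754a 99f750f 84ce0064 67000000 6ad77700"


def split_handle(handle):
    """Split a file handle dump into its hex words."""
    return handle.split()


def to_decimal(word):
    """Convert one hex word to decimal, e.g. for use with 'find -inum'."""
    return int(word, 16)


def to_hex(number):
    """Convert a decimal inode number (as printed by 'ls -li') to hex."""
    return format(number, "x")


def byte_swap(word):
    """Reverse the byte order of a 32-bit hex word (assumed meaning of 'reversed')."""
    padded = word.zfill(8)  # pad short words such as '107754a' to 4 bytes
    return bytes.fromhex(padded)[::-1].hex()


if __name__ == "__main__":
    words = split_handle(HANDLE)
    print("FSID words:", words[:2])
    for word in words:
        print(word, to_decimal(word), byte_swap(word))
```

Comparing both the plain and the byte-swapped form against the vol read_fsid output covers either byte order.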

General Permission Problems

Check the export permissions.

Check the local UNIX system: file-level and owner-level permissions, and also the qtree security.

(Sometimes the filer permissions sit on top of the local permissions on the UNIX box, so that the local ones cannot be seen; they become hidden.)

To find use

# chmod

#chown


Read unix files

# cat

# more

# vi

NFS Performance

pktt start e5a, pktt dump e5a, pktt stop e5a   (all three, start to end)

sysstat

nfsstat -d   (displays cache statistics)

        -z   (zeroes out the statistics)

        -m   (mount point statistics)

perfstat -b -f filename > perfstat.begin

perfstat -e -f filename > perfstat.end

# time mkfile 10m test   (time it takes)

# time cp test

Windows host > sio_ntap_sol 100 100 4096 100m 10 2 a.file b.file -noflock

CPU utilization 100 percent


The customer needs to collect and send:

perfstat -f <file name> -t 5 > perfstat.out

More detailed perfstat
---------------------
perfstat -t 2 -f nasx > text.txt

perfstat -t 2 -f nasx -P flat > text.txt

    -P  domains (SMP):
          ~ flat
          ~ kahuna
          ~ network
          ~ raid
          ~ storage

Other NFS options

options wafl.root_only_chown on

options cifs.nfs_root.ignore_acl on

Common NFS error messages

nfs mount: /remote_file_system_name: Stale NFS file handle

      This error message means that an opened file or directory has been destroyed or recreated.

Volume priority


[root@adminhost ~]# rsh myfiler priority set volume test level=medium
[root@adminhost ~]# rsh myfiler priority show volume test -v
        Volume Priority Relative Sys Priority
                Service Priority    (vs User)
pkinci_36b_code       on   Medium      VeryLow
[root@adminhost ~]# rsh myfiler priority set volume test level=medium
[root@adminhost ~]#

Filer> priority show
Priority scheduler is stopped.

Priority scheduler system settings:
io_concurrency: 8

Filer> priority on
Priority scheduler starting.
Filer> Wed Mar 17 14:46:38 EDT [Filer: wafl.priority.enable:info]: Priority scheduling is being enabled

Valid level and system options include:

   1. VeryHigh
   2. High
   3. Medium
   4. Low
   5. VeryLow

Filer> priority set volume vol1 level=High
Filer> priority show volume -v vol1
Volume: vol1
Enabled: on
Level: High
System: Medium
Cache: n/a

Filer> priority delete volume vol2
Filer> priority show volume vol2
Unable to find priority scheduling information for 'vol2'

Below is a sample output of the FlexShare counters:

NetApp1*> stats show prisched
prisched:prisched:queued:0
prisched:prisched:queued_max:5

NetApp1*> stats show priorityqueue
priorityqueue:vol1:weight:76
priorityqueue:vol1:usr_weight:78

[root@adminhost ~]# rsh <filer> sysstat -m
 ANY  AVG  CPU0 CPU1 CPU2 CPU3
100%  38%   18%  17%  20%  97%
 98%  37%   18%  17%  17%  96%
100%  41%   23%  21%  22%  97%
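
In the output above the average looks moderate while one CPU is pinned, which is easy to miss by eye. A minimal sketch for picking out the busiest CPU, assuming the per-CPU output has been captured as text with the header row shown above; the function names are made up for the example.

```python
def parse_sysstat_m(output):
    """Parse per-CPU sysstat output into a list of {column: percent} dicts."""
    lines = [line for line in output.splitlines() if line.strip()]
    header = lines[0].split()
    rows = []
    for line in lines[1:]:
        try:
            values = [int(value.rstrip("%")) for value in line.split()]
        except ValueError:
            continue  # skip anything that is not a row of percentages
        if len(values) == len(header):
            rows.append(dict(zip(header, values)))
    return rows


def busiest_cpu(rows):
    """Return (cpu_name, mean_percent) for the CPU with the highest mean load."""
    if not rows:
        raise ValueError("no samples to analyse")
    cpus = [name for name in rows[0] if name.startswith("CPU")]
    means = {cpu: sum(row[cpu] for row in rows) / len(rows) for cpu in cpus}
    name = max(means, key=means.get)
    return name, means[name]
```

A single CPU near 100% with the others mostly idle usually means the AVG column understates the problem, so the per-domain view (sysstat -M) is the next thing to look at.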


[root@adminhost ~]# rsh <filer> priv set "diag;sysstat -M 1"


or rsh <filer> sysstat -x 1


[root@adminhost ~]# rsh <filer> reallocate status
Reallocation scans are on
No reallocation status.
[root@adminhost ~]#


Monitoring Processes Consuming More CPU

[root@adminhost ~]# rsh myfiler priv set "advanced;ps -c 4"
Warning: These advanced commands are potentially dangerous; use
         them only when directed to do so by Network Appliance
         personnel.
Process statistics over 43686235.339 seconds...
   ID State Domain %CPU StackUsed %StackUsed Name
    1 RR    i       75%      1080        26% idle_thread0
    2 RR    i       73%       560        13% idle_thread1
    3 RR    i       70%       560        13% idle_thread2
    4 RR    i       53%       560        13% idle_thread3

[root@adminhost ~]# rsh myfiler priv set "advanced;ps -c 4"
Warning: These advanced commands are potentially dangerous; use
         them only when directed to do so by Network Appliance
         personnel.
Process statistics over 47818127.225 seconds...
   ID State Domain %CPU StackUsed %StackUsed Name
    1 RR    i       73%       968        23% idle_thread0
    2 RR    i       69%       512        12% idle_thread1
    3 RR    i       63%       512        12% idle_thread2
    4 RR    i       60%       512        12% idle_thread3
   75 BR    s        5%      3176        38% isp2400_intrd
   91 BR    1        9%      2024         6% GbE/e1a
  152 BG    0        4%      4440         6% NwkThd_00
  153 BG    1       17%      6488         9% NwkThd_01
  154 BG    2       14%      6408         9% NwkThd_02
  205 RR    r       11%      5928        36% raidio_thread
 1415 BR    w        5%      5816        17% wafl_exempt_0
 1416 BR    w        8%      5816        17% wafl_exempt_1
 1417 RR    w        7%      5832        17% wafl_exempt_2
 1418 BR    w        8%      5832        17% wafl_exempt_3
 1420 BR    k        8%     13664        41% wafl_lopri
 2001 BR    k        4%     13680        41% gr_scheduler
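
With a long process list it helps to drop the idle threads and sort what is left by %CPU. A minimal sketch, assuming the ps output has been saved as text in the column layout shown above; the function names are made up for the example.

```python
import re

# ID State Domain %CPU StackUsed %StackUsed Name
PS_LINE_RE = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+(\S+)\s+(\d+)%\s+(\d+)\s+(\d+)%\s+(\S+)\s*$"
)


def parse_ps(output):
    """Return a list of (name, cpu_percent) for every process line found."""
    processes = []
    for line in output.splitlines():
        match = PS_LINE_RE.match(line)
        if match:
            processes.append((match.group(7), int(match.group(4))))
    return processes


def top_threads(output, limit=5):
    """Return the busiest non-idle threads, highest %CPU first."""
    busy = [p for p in parse_ps(output) if not p[0].startswith("idle_thread")]
    return sorted(busy, key=lambda p: p[1], reverse=True)[:limit]
```

For the second listing above this puts the network threads (NwkThd_01, NwkThd_02) and raidio_thread at the top.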


NFS lock clear


 rsh myfiler "priv set advanced; sm_mon -l  testhost"

To see disk speed

[root@adminhost etc]# rsh myfiler storage show disk -a | less     (alternatively: rsh myfiler vol status -r)
Disk:             0a.16
Shelf:            1
Bay:              0
Serial:           ZBHM63PH
Vendor:           NETAPP
Model:            X267_HKURO500SSX
Rev:              AB0A
RPM:              7200

Check for busy disk

[root@adminhost etc]# rsh myfiler stats show disk:*:disk_busy

Check for failed disks

rsh myfiler storage show disk -p

To show all disks on the system : Use "storage show disk -T"


To get shelf details of filer : Use "storage show shelf <shelf-id>"


To check the number of hosts connected to the filer and the number of NFS ops

[root@adminhost etc]# rsh myfiler nfsstat -l

Troubleshooting NFS

Common NFS Errors

"No such host" - Name of the server is specified incorrectly

"No such file or directory" - Either...