NetApp: OnTAP VOL Move and Relocation Procedures – CIFS Volumes – VOL MOVE METHOD

I recently upgraded some production controllers to NetApp OnTAP v8.1.3.  The primary benefit of this version for us is the ability to deal transparently with 32-bit and 64-bit objects on NetApp.  Prior to v8.1, objects (eg: Vol/LUN/etc) on a 32-bit aggregate could not be relocated, moved, or Volume SnapMirrored (block based) to a 64-bit aggregate – whether local or remote.  Because of this, we could not migrate data between some aggregates locally, and we were limited in our choice of Sources and Destinations for VSM relationships.  With v8.1+, this limitation is gone.  As such, we are now relocating data to better place it on the system, as non-disruptively as possible.

There are many ways to migrate the data, but by far the easiest is the built-in “vol move” command.  This command automates the following high level tasks:

  • Creates a new temporary volume matching the source (locally)
  • Marks the volume ‘restricted’ so that it can be a VSM destination
  • Creates the VSM relationship to the temporary volume and performs the initial transfer (eg: 100GB)
  • After the initial transfer, it performs a catch-up sync of any changes that occurred during the first sync (eg: 10GB)
  • It then performs incremental catch-up syncs, each quicker than the last, to approach a zero-delta point (eg: 1GB, then 100MB, then 1MB, etc)
  • It then performs a “cutover”: it does the last sync, renames the old volume, renames the new volume to the old name, deletes the old volume, and cleans up the VSM relationship it used.
  • To the system, the new volume is in the new location with the existing name, all snapshots, etc.
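
Under the hood, this whole sequence is driven by a single command that takes the source volume and the destination aggregate (the full worked example and options are covered in the steps below):

vol move start VOL_AVW_TEST1 aggr1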

However, there is work to be done to make this happen, along with some caveats:

  • VOL MOVE cannot be used on NFS or CIFS exported volumes.  These require a manual procedure of a VSM and manual cutover.
    • The default on our NetApps is TO CREATE NFS and CIFS auto-export/auto-shares.  This is frustrating in that it makes this process a little more difficult.  LUN-based VOLs, and VSM/QSM/SV destination volumes, likely should not be shared off.  If they are, they are for us to use administratively (eg: to quickly view the share and see what is there), so their CIFS shares can be removed temporarily.  Long term, this option should be disabled.
    • The VOL MOVE process CAN be used IF it is acceptable to stop the CIFS or NFS sharing on the volume for the period of the migration.  This would be a DISRUPTIVE migration, or at least a much longer duration than the method that will be described here.
  • A controller can only perform a single VOL MOVE at any time (a quick way to check for a move already in progress is shown below).
  • This process CAN also be done with volumes protected by SnapVault, but I will cover that separately (that document also covers OSSV).
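
Before starting, you can check whether a move is already running – “vol move status” (used again later in this document) lists any move in progress:

CONTROLLERA> vol move status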

There are two methods to be used here:
Option 1 – Use VOL MOVE, just as you would for any non-CIFS volume.  This is a largely automated move, but still has manual steps.  The CIFS share will be OFFLINE for the entire duration of the work.
Option 2 – Manually perform a Volume SnapMirror (VSM).  This process is 100% manual, but the CIFS share can stay online except for a brief (~2-5 minute) period during cutover, the timing of which you can control.

And there are two scenarios:
Scenario 1 – CIFS Share, volume IS NOT replicated to another NetApp
Scenario 2 – CIFS Share, volume IS replicated to another NetApp (eg: VSM/QSM/SV)

As these are complicated procedures that put data at risk and/or require an outage, I will be creating multiple documents, one for each type of movement.  This avoids documentation full of “if…”, “sometimes…” and other conditional wording.  You will be able to choose the appropriate HOWTO and follow it through from start to finish.

IF YOU HAVE NOT PERFORMED THIS TASK ON TEST VOLUMES, WITH REPLICATION, YOU SHOULD NOT PERFORM THIS TASK IN PRODUCTION.  THERE ARE MANY MOVING PARTS THAT REQUIRE ATTENTION TO DETAIL AND EXPERIENCE.

OPTION 1 – DISRUPTIVE METHOD, USING VOL MOVE TO AUTOMATE MIGRATION, WITH OR WITHOUT A VOLUME SNAPMIRROR RELATIONSHIP

  1. Identify a destination AGGREGATE with sufficient space for your volume to move to.

    On the FILER> run “aggr show_space <aggr>”

    CONTROLLERA> aggr show_space aggr1

    Aggregate ‘aggr1’

        Total space    WAFL reserve    Snap reserve    Usable space       BSR NVLOG           A-SIS          Smtape

      10831120000KB    1083112000KB             0KB    9748008000KB             0KB      19893840KB             0KB

    Space allocated to volumes in the aggregate

    Volume                          Allocated            Used       Guarantee

    vol0                          264199884KB      82748964KB          volume

    VOL_CIFS_03                  1755527744KB    1731532708KB            none

    SM_VOL_SERVER_DB4_SMSQL        21665848KB      21139136KB            none

    SM_VOL_SERVER_DB1_SMSQL       379584568KB      88907220KB          volume

    SM_VOL_SERVER_DB4_MDF         562387320KB     555229264KB            none

    SM_VOL_SERVER_DB1_SMSQLMOUNT     227750764KB     196552504KB          volume

    SM_VOL_SERVER_DB1_SP2010_MDF     168704284KB     119878260KB          volume

    SM_VOL_SERVER_INFDB2_SMSQL      60088504KB      59479840KB            none

    ndm_dstvol_1376588545           8081740KB       4306004KB    (restricted)

    Aggregate                       Allocated            Used           Avail

    Total space                  3447990656KB    2859773900KB    6,279,819,496KB

    Snap reserve                          0KB             0KB             0KB

    WAFL reserve                 1083112000KB     117340548KB     965771452KB

    The “Avail” column of the “Total space” row shows the available space on the destination aggregate.  I have added “,” thousands separators to that figure for readability.  We can see that “aggr1” has roughly 6.2TB free.
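
    If you just need a quick sanity check of aggregate free space, “df -A” gives the same totals in a more compact form.  A minimal example follows; the “-h” human-readable flag is assumed to be available on your release:

    CONTROLLERA> df -A -h aggr1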

  2. Identify the size of the volume to be moved, by performing the same task on the source aggregate – “aggr4” in this example:

    CONTROLLERA> aggr show_space aggr4

    Aggregate ‘aggr4’

        Total space    WAFL reserve    Snap reserve    Usable space       BSR NVLOG           A-SIS          Smtape

      12150265344KB    1215026532KB     546761940KB   10388476872KB             0KB           384KB             0KB

    Space allocated to volumes in the aggregate

    Volume                          Allocated            Used       Guarantee

    SM_VOL_SERVER_DB1_MDF        1028051932KB     279,358,528KB          volume

    SM_VOL_SERVER_INFDB2_MDF      305358308KB     302,311,992KB            none

    Aggregate                       Allocated            Used           Avail

    Total space                  1333410240KB     581670520KB    8930637892KB

    Snap reserve                  546761940KB             0KB     546761940KB

    WAFL reserve                 1215026532KB     255693876KB     959332656KB

    Both of the volumes listed are around 300GB.  These will easily fit into the 6.2TB free on aggr1.
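
    You can also cross-check an individual volume directly – “vol size” reports the configured size and “df” the actual usage.  A minimal sketch, using one of the volumes from the listing above:

    CONTROLLERA> vol size SM_VOL_SERVER_INFDB2_MDF
    CONTROLLERA> df -h /vol/SM_VOL_SERVER_INFDB2_MDF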

  3. If the VOL in question is a VOLUME SNAPMIRROR (VSM) Primary volume, we must break the relationship to the secondary while we perform our tasks:

    On the SECONDARY filer:

    snapmirror update <secondary_filer>:<secondary_vol>

    snapmirror quiesce <secondary_filer>:<secondary_vol>

    snapmirror break <secondary_filer>:<secondary_vol>

    Example:

    snapmirror update CONTROLLERA:SM_VOL_AVW_TEST1

    snapmirror quiesce CONTROLLERA:SM_VOL_AVW_TEST1

    snapmirror break CONTROLLERA:SM_VOL_AVW_TEST1
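
    To confirm the break actually took effect before proceeding, check the relationship state on the secondary – the State column should now show “Broken-off”:

    CONTROLLERA> snapmirror status SM_VOL_AVW_TEST1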

  4. We need to obtain the CIFS information for the VOLUME.  We can obtain this by searching through the \\<FILER>\C$\ETC\CIFSCONFIG_SHARE.CFG file:

    Q:\etc>find /i “VOL_AVW_TEST1” \\CONTROLLERA\c$\etc\cifsconfig_share.cfg

    ———- \\CONTROLLERA\C$\ETC\CIFSCONFIG_SHARE.CFG

    cifs shares -add “VOL_AVW_TEST1$” “/vol/VOL_AVW_TEST1” -comment “Purge Sept 1”

    cifs access “VOL_AVW_TEST1$” everyone Full Control

    cifs access “VOL_AVW_TEST1$” S-1-5-XX-544 Read

    cifs access “VOL_AVW_TEST1$” S-1-5-XX-xxxxxxxx-118882789-1848903544-53248 Full Control

    However – note that there are both “cifs shares -add” commands, which contain the volume name “VOL_AVW_TEST1”, and “cifs access” commands, which contain the share name.  In the above example the CIFS share name matches the volume name – this will NOT always be the case.  If a share has a different name, or there are QTrees or other subdirectories shared, a search on the volume name may not show all of the details:

    Q:\etc>find /i “VOL_AVW_TEST1” \\CONTROLLERA\c$\etc\cifsconfig_share.cfg

    ———- \\CONTROLLERA\C$\ETC\CIFSCONFIG_SHARE.CFG

    cifs shares -add “VOL_AVW_TEST1$” “/vol/VOL_AVW_TEST1” -comment “Purge Sept 1”

    cifs access “VOL_AVW_TEST1$” everyone Full Control

    cifs access “VOL_AVW_TEST1$” S-1-5-32-544 Read

    cifs access “VOL_AVW_TEST1$” S-1-5-21-xxxxxxxx-118882789-1848903544-53248 Full Control

    cifs shares -add “TEST_AVW_ADMIN1” “/vol/VOL_AVW_TEST1/Admin” -comment “Purge Sept 1”

Here you can see that a second share DOES exist, but you do not see the “cifs access” commands for it, because those reference the share name rather than the volume name.  So the recommendation is:

  1. Search for the “VOL_NAME” in the file, filtering for ONLY the “cifs shares -add” commands:

    Q:\etc>find /i “VOL_AVW_TEST1” \\CONTROLLERA\c$\etc\cifsconfig_share.cfg | find /i “cifs shares -add”

    cifs shares -add “VOL_AVW_TEST1$” “/vol/VOL_AVW_TEST1” -comment “Purge Sept 1”

    cifs shares -add “TEST_AVW_ADMIN1” “/vol/VOL_AVW_TEST1/Admin” -comment “Purge Sept 1”

  2. Then search for each “SHARE NAME” to get its “cifs access” commands:

    Q:\etc>find /i “VOL_AVW_TEST1$” \\CONTROLLERA\c$\etc\cifsconfig_share.cfg | find /i “cifs access”

    cifs access “VOL_AVW_TEST1$” everyone Full Control

    cifs access “VOL_AVW_TEST1$” S-1-5-32-544 Read

    cifs access “VOL_AVW_TEST1$” S-1-5-21-xxxxxxxx-118882789-1848903544-53248 Full Control

    Q:\etc>find /i “TEST_AVW_ADMIN1” \\CONTROLLERA\c$\etc\cifsconfig_share.cfg | find /i “cifs access”

    cifs access “TEST_AVW_ADMIN1” everyone Full Control

    cifs access “TEST_AVW_ADMIN1” S-1-5-21-xxxxxxxx-118882789-1848903544-53248 Full Control

You now have ALL of the CIFS commands and configuration for the volume in question.  Each “cifs access” line contains one user or group permission.  With these details we can recreate just the relevant portions of the CIFS configuration:

cifs shares -add “VOL_AVW_TEST1$” “/vol/VOL_AVW_TEST1” -comment “Purge Sept 1”
cifs access “VOL_AVW_TEST1$” everyone Full Control
cifs access “VOL_AVW_TEST1$” S-1-5-32-544 Read
cifs access “VOL_AVW_TEST1$” S-1-5-21-xxxxxxxx-118882789-1848903544-53248 Full Control
cifs shares -add “TEST_AVW_ADMIN1” “/vol/VOL_AVW_TEST1/Admin” -comment “Purge Sept 1”
cifs access “TEST_AVW_ADMIN1” everyone Full Control
cifs access “TEST_AVW_ADMIN1” S-1-5-21-xxxxxxxx-118882789-1848903544-53248 Full Control
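
As a sanity check before deleting anything, you can also ask the filer directly for the live definition of each share and compare it against what was pulled from the config file (the console output is formatted differently from the .cfg syntax, but the path and access entries should match):

CONTROLLERA> cifs shares “VOL_AVW_TEST1$”
CONTROLLERA> cifs shares “TEST_AVW_ADMIN1”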

  5. In order for VOL MOVE to work, the VOL must not be a CIFS share or NFS export.  So we will need to REMOVE the shares – which is why we captured the commands above.

    To remove the shares we run “cifs shares -delete <SHARE_NAME>”.  Using the above example:

    cifs shares -delete “VOL_AVW_TEST1$”
    cifs shares -delete “TEST_AVW_ADMIN1”

The CIFS shares for the folder are now removed, and CIFS access to the shares is terminated.  USERS ARE NOW OFFLINE.

The volume may also be exported via NFS – as this is a default on our systems.  Since NFS access is not required here, we will remove the export as well:

exportfs -z /vol/VOL_AVW_TEST1
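
Before starting the move, it is worth confirming that nothing is still publishing the volume – both of the following simply list the current shares and exports, and the volume should no longer appear in either:

CONTROLLERA> cifs shares
CONTROLLERA> exportfs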

  6. We then run the VOL MOVE:

    # We perform the actual “vol move” – this will vary in time based on how much data is being moved and
    # how busy the controller is.  Because it returns control to the console, we cannot perform the next
    # steps via script until the process is done.
    vol move start VOL_AVW_TEST1 aggr1

    Progress of the “vol move” can be checked via “vol move status” or “snapmirror status”.  Samples for another volume are shown below:

    CONTROLLERA> vol move status

    Source                    Destination                     CO Attempts    CO Time     State

    VOL_AVW_TEST1             aggr1                           3              60          setup

    CONTROLLERA> snapmirror status VOL_AVW_TEST1

    Snapmirror is on.

    Source                              Destination                           State          Lag        Status

    127.0.0.1:VOL_AVW_TEST1  CONTROLLERA:ndm_dstvol_1376588545     Uninitialized  –          Transferring  (23 GB done)

  7. Immediately after the vol move reports success:

    Fri Aug 16 11:16:57 MDT [CONTROLLERA:vol.move.End:info]: Successfully completed move of volume VOL_AVW_TEST1 to aggr aggr1.

    Run the CIFS commands we obtained in step 4 – this will recreate the Shares and Share Security.

    cifs shares -add “VOL_AVW_TEST1$” “/vol/VOL_AVW_TEST1” -comment “Purge Sept 1”
    cifs access “VOL_AVW_TEST1$” everyone Full Control
    cifs access “VOL_AVW_TEST1$” S-1-5-32-544 Read
    cifs access “VOL_AVW_TEST1$” S-1-5-21-xxxxxxxx-118882789-1848903544-53248 Full Control
    cifs shares -add “TEST_AVW_ADMIN1” “/vol/VOL_AVW_TEST1/Admin” -comment “Purge Sept 1”
    cifs access “TEST_AVW_ADMIN1” everyone Full Control
    cifs access “TEST_AVW_ADMIN1” S-1-5-21-xxxxxxxx-118882789-1848903544-53248 Full Control

As only the volume’s location changes, not its name, these commands carry over unchanged.

The CIFS shares for the folder are now re-added, and CIFS access to the shares is available.  USERS ARE NOW ONLINE, but have been offline for the duration of the VOL MOVE command.
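
A quick client-side spot check that the shares are visible again (an illustrative example from any Windows host; adjust the controller and share names to your environment):

C:\> net view \\CONTROLLERA
C:\> dir \\CONTROLLERA\VOL_AVW_TEST1$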

  8. Resync the original VSM relationship from the Primary:

    On the SECONDARY filer:

    snapmirror resync -S <primary_filer>:<primary_vol> <secondary_filer>:<secondary_vol>

    Example:

    snapmirror resync -S CONTROLLERB:VOL_AVW_TEST1 SM_VOL_AVW_TEST1

    You may be prompted to delete changed snapshots.  Review the list; what you will typically find is that the snapshots on the Destination are older than those on the Source, and the source has likely aged some out during your migration window.  If this is acceptable, continue.
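
    Once the resync completes, confirm the relationship has returned to a mirrored state on the secondary – State should read “Snapmirrored” and the Lag should begin tracking again:

    CONTROLLERA> snapmirror status SM_VOL_AVW_TEST1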

  9. Troubleshooting steps:

    The only real issue I’ve seen is the “cutover” attempt timing out – by default it tries 3 times with a 60 second cutover window.  If the task is performed at night, when the controller(s) are busier with background tasks (eg: DeDupe, Replication, Backups, etc), some processes take longer and 60 seconds may not be enough.  If the VOL being moved were a LOCAL LUN (eg: for an SQL server), then >60 seconds might be a concern to do live, and we would have to make it work.  But for a VSM Secondary Destination volume, we don’t care if it takes 300 seconds to complete, as the VSM is “broken-off” until it is done anyway.  In theory there should be no net-new data coming in, so the cutover should be able to complete.

    In the event that this occurs try:

    # VOL MOVE with 5 cutover retries and a 300 second cutover window.
    vol move start VOL_AVW_TEST1 aggr1 -r 5 -w 300
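
    If a move needs to be stopped mid-flight, this release of “vol move” also provides pause/resume/abort subcommands – verify the exact syntax on your version before relying on them, but the basic form is:

    # Pause, resume, or abandon an in-progress move of VOL_AVW_TEST1
    vol move pause VOL_AVW_TEST1
    vol move resume VOL_AVW_TEST1
    vol move abort VOL_AVW_TEST1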

  10. Logging:

    If you are looking for logs, the easiest way is to map a drive to the C$ share of the controller.  Then, you can browse the “messages” log and search for “vol.move”.  Example:

    Q:\etc\log>find /i “vol.move” \\CONTROLLERA\c$\etc\log\messages | find /i “SP2010”

    Thu Aug 15 08:32:27 MDT [CONTROLLERA:vol.move.Start:info]: Move of volume SM_VOL_SERVER_DB1_SP2010_MDF to aggr aggr1 started

    Thu Aug 15 08:33:16 MDT [CONTROLLERA:vol.move.transferStart:info]: Baseline transfer from volume SM_VOL_SERVER_DB1_SP2010_MDF to ndm_dstvol_1376577147 started.

    Thu Aug 15 08:50:37 MDT [CONTROLLERA:vol.move.transferStatus:info]: Baseline transfer from volume SM_VOL_SERVER_DB1_SP2010_MDF to ndm_dstvol_1376577147 took 1032 secs and transferred 80961556 KB data.

    Thu Aug 15 08:50:39 MDT [CONTROLLERA:vol.move.transferStart:info]: Update from volume SM_VOL_SERVER_DB1_SP2010_MDF to ndm_dstvol_1376577147 started.

    Thu Aug 15 08:51:05 MDT [CONTROLLERA:vol.move.transferStatus:info]: Update from volume SM_VOL_SERVER_DB1_SP2010_MDF to ndm_dstvol_1376577147 took 19 secs and transferred 6920 KB data.

    Thu Aug 15 08:51:12 MDT [CONTROLLERA:vol.move.transferStart:info]: Update from volume SM_VOL_SERVER_DB1_SP2010_MDF to ndm_dstvol_1376577147 started.

    Thu Aug 15 08:51:39 MDT [CONTROLLERA:vol.move.transferStatus:info]: Update from volume SM_VOL_SERVER_DB1_SP2010_MDF to ndm_dstvol_1376577147 took 20 secs and transferred 216 KB data.

    Thu Aug 15 08:51:42 MDT [CONTROLLERA:vol.move.updateTimePrediction:info]: Expected time for next update from volume SM_VOL_SERVER_DB1_SP2010_MDF to ndm_dstvol_1376577147 is 1 secs to transfer 64 KB data.

    Thu Aug 15 08:51:47 MDT [CONTROLLERA:vol.move.cutoverStart:info]: Cutover started for vol move of volume SM_VOL_SERVER_DB1_SP2010_MDF to aggr aggr1.

    Thu Aug 15 08:52:09 MDT [CONTROLLERA:vol.move.cutoverEnd:info]: Cutover finished for vol move of volume SM_VOL_SERVER_DB1_SP2010_MDF to aggregate aggr1 – time taken 21 secs

    Thu Aug 15 08:52:18 MDT [CONTROLLERA:vol.move.End:info]: Successfully completed move of volume SM_VOL_SERVER_DB1_SP2010_MDF to aggr aggr1.

    These are the same messages that appear on the console screen, interrupting your work.  Viewing them in the messages log lets you filter them and makes them much easier to read.

By changing the filters:

Q:\etc\log>find /i “vol.move” \\CONTROLLERA\c$\etc\log\messages | find /i “Baseline”

Tue Aug 13 17:03:12 MDT [CONTROLLERA:vol.move.transferStart:info]: Baseline transfer from volume VOL_AVW_TEST1 to ndm_dstvol_1376434956 started.
Tue Aug 13 17:04:24 MDT [CONTROLLERA:vol.move.transferStatus:info]: Baseline transfer from volume VOL_AVW_TEST1 to ndm_dstvol_1376434956 took 68 secs and transferred 6318532 KB data.
Tue Aug 13 17:22:19 MDT [CONTROLLERA:vol.move.transferStart:info]: Baseline transfer from volume VOL_AVW_TEST1 to ndm_dstvol_1376436098 started.
Tue Aug 13 17:41:17 MDT [CONTROLLERA:vol.move.transferStatus:info]: Baseline transfer from volume VOL_AVW_TEST1 to ndm_dstvol_1376436098 took 1131 secs and transferred 31618820 KB data.
Thu Aug 15 11:20:25 MDT [CONTROLLERA:vol.move.transferStart:info]: Baseline transfer from volume SM_VOL_SERVER_INFDB2_SMSQL to ndm_dstvol_1376587157 started.
Thu Aug 15 11:32:02 MDT [CONTROLLERA:vol.move.transferStatus:info]: Baseline transfer from volume SM_VOL_SERVER_INFDB2_SMSQL to ndm_dstvol_1376587157 took 685 secs and transferred 59376696 KB data.
Thu Aug 15 11:43:08 MDT [CONTROLLERA:vol.move.transferStart:info]: Baseline transfer from volume SM_VOL_SERVER_INFDB2_MDF to ndm_dstvol_1376588545 started.

You can start to collect other details.  Filtering on “Baseline”, we see each baseline transfer with its duration in seconds and the number of KB transferred – useful statistics for estimating future transfers.  Because both the start and completion times are logged, we can also pin down exactly when each transfer occurred if we need to correlate against other logs.
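
The same filtering approach works for the cutover events, which is handy when tuning the “-r” and “-w” values described in the troubleshooting step above:

Q:\etc\log>find /i "vol.move.cutover" \\CONTROLLERA\c$\etc\log\messages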

  1. ada
    September 24, 2014 at 4:56 PM

    Excellent paper helped me a lot!!

  2. Ranj
    August 11, 2015 at 8:47 AM

    Hi

    We are using Data ONTAP 8.2.3p3 on our FAS8020 in 7-mode and we have 2 aggregates, a SATA and SAS aggregate.

    I want to decommission the SATA aggregate as I want to move that tray to another site. If I have a flexvol containing 3 qtrees CIFS shares can I use data motion (vol copy) to move the flex vol on the same controller but to a different aggregate without major downtime?

    I am aware that possibly there may be a small downtime with the CIFS share terminating but I plan to do this work out of core business hours.

    Is this possible?

    Many Thanks

    • August 11, 2015 at 9:00 AM

      That shouldn’t be a problem. Are your existing VOLs or QTrees SnapMirrored/Vaulted anywhere, or otherwise linked to something else?

      If not, and they’re just shared-off VOLs, then the high level would be:
      * vol copy as you indicate
      * at time of cutover, remove all the CIFS shares from it – I find that pulling the commands from the CIFS config makes this very easy to do in a batch. You’re now OFFLINE
      * do a final vol copy to sync the last bit of data
      * republish the CIFS shares. If access works as expected, you’re back online.
      * Clean up the old VOLs – perhaps in a day or two, to ensure things are good. Take the vol offline in the meantime though, to prevent confusion and potential for error.

      Note that removing a tray, to move to another site, WILL be a downtime of the controller. I’m not aware of any ability to remove a shelf, while hot/operating.

    • September 28, 2015 at 10:54 PM

      That does sound like it would work. You can use my Option 2. This will copy the data in the background, then as you get close to your outage you repeat the tasks to get less and less changed data to sync. Then shutdown CIFS, and finalize the transfer.

  3. Ranj
    August 12, 2015 at 1:49 AM

    Thank you for that quick response! excellent detailed blog btw!

    Yes the flexvol in question does have a few qtrees which are snapmirrored weekly to a destination filer.

    Would I need to break this relationship?

    I am confused and trying to understand your third point on why I would need to do a final vol copy as I would assume the whole volume would copy on the first step?

    Also, as this is only moving the volume from one aggregate to another, it will remain on the same filer, just on different disks, and from a client point of view they will still access it using their mapped drive (e.g. \\netappfiler\sharename), so I am slightly confused about why all those steps need to take place.

    The CIFS shares, if this makes a difference, are only accessed by Windows clients and not Linux, so I assume I don’t need to do anything on the exports side…

    Thanks

