Archive for the ‘Windows2008R2’ Category

HOWTO: Set Logon as a Service Dynamically via GPO

March 16, 2015

I recently ran into a situation where a client has a group per server for Administrators, Remote Desktop Users, and hopefully, Service Accounts. This may or may not be the best way of dealing with this, but it does solve a need by moving user access management into AD instead of configuration on local servers. It’s a little easier to centralize, and it can be managed by administrators who have access to AD but not to the servers themselves (eg: HelpDesk users). The problem, as indicated below, is that the rights for the service accounts/groups have been set manually on the systems as they are built or as needed. This has resulted in inconsistencies, as one might expect. So I found a way to standardize and bring it all “back up to code”, as it were.

 

PROBLEM:

You have a need to set a user or group to have “Log on as a Service” or “Log on as a Batch Job” rights.  This can be done via the Local Security Policy (secpol.msc) or via GPO.  However, there are two obvious issues with this:

1) Using SECPOL.MSC means you’re editing the local security policy. While this may be the only way to accomplish this, it is decentralized and difficult to maintain consistently.

2) Using the GPO method only allows you to set a particular set of user(s) or group(s) to the affected machines

However, if you have a need to set a 1:1 relationship with a dynamic name to the system, GPO’s and the Local Security Policy leave something to be desired. There is no functionality within the GPO to say “Apply GRP-%SERVERNAME%-SVC” to have these rights, and have it apply as needed – at least for the Logon As a Service right. Using other methods you can allocate to existing groups with existing rights, but in THIS GPO location you cannot dynamically specify a group, affect the Local Security Policy, or set the rights for this local group.

REQUIREMENT:

  • Have each server/system have a group such as GRP-SERVER01-SVC identifying its service accounts. This would be a company policy scenario, and would ensure that administration and auditing of local group memberships was ONLY done via Active Directory, and could be done via delegated rights by users who may not have rights to log in to the server.
  • Have the group apply only to the named server.  Eg: GRP-SERVER01-SVC should have rights on SERVER01, but not SERVER02 or SERVER03
  • If possible, one should also be able to add to the local group a GRP-ALLSERVERS-SVC for a service account that might be globally allowed. Eg: DOMAIN\svcAutomation, DOMAIN\svcBackup, etc. 
  • Centrally manageable
  • Automatic, dynamic, updates and standardizes over time. 
  • OPTIONAL – also do similar for the pre-existing local groups of “Administrators” and “Remote Desktop Users” for a corresponding GRP-%COMPUTERNAME%-ADM and GRP-%COMPUTERNAME%-RDP as appropriate.
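
To make the naming requirement concrete, here is a minimal Python sketch of the 1:1 mapping described in the list above. The function name and the dictionary layout are my own invention for illustration; only the GRP-<server>-SVC/-ADM/-RDP pattern comes from the requirements.

```python
def per_server_groups(computer_name, prefix="GRP"):
    """Return the AD group expected to sit inside each local group on one server.

    Hypothetical helper: the GRP-<server>-<suffix> pattern is taken from the
    requirements list, everything else is an assumption for illustration.
    """
    name = computer_name.strip().upper()
    if not name:
        raise ValueError("computer name must not be empty")
    suffixes = {
        "Service Accounts": "SVC",
        "Administrators": "ADM",
        "Remote Desktop Users": "RDP",
    }
    return {local: f"{prefix}-{name}-{suffix}" for local, suffix in suffixes.items()}


if __name__ == "__main__":
    # Each server only ever maps to groups carrying its own name.
    for server in ("SERVER01", "SERVER02"):
        for local_group, ad_group in per_server_groups(server).items():
            print(f"{server}: {local_group} <- {ad_group}")
```

A list like this is handy for pre-creating the AD groups before a server is built, so the GPO has something to resolve on first run.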

PROCESS:

1) Obtain the file “NTRIGHTS.EXE” from the Windows 2003 Resource Kit found at https://www.microsoft.com/en-us/download/details.aspx?displaylang=en&id=17657

Unpack/install the Resource Kit and copy the file where appropriate. 

2) Copy the file centrally to a location that is accessible by the MACHINE account, not a user.  A great example would be to place the file in \\DOMAIN\NETLOGON, as this allows Read/Execute.

3) Create a script that will run in that location that contains the following:

====== SET_LOGONASSERVICE.BAT – BEGIN =====

@echo off 

net localgroup "Service Accounts" /add /Comment:"Used for allowing Service Accounts local rights" >> \\SERVER\INSTALLS\BIN\logs\SET_LOGONASSERVICE.LOG

\\SERVER\INSTALLS\BIN\ntrights +r SeServiceLogonRight -u "Service Accounts" -m \\%COMPUTERNAME% >> \\SERVER\INSTALLS\BIN\logs\SET_LOGONASSERVICE.LOG 

====== SET_LOGONASSERVICE.BAT – END =====

4) If required, this script can be called via PSEXEC and executed against a list of computers:

C:\bin>psexec @SERVER.LST -u DOMAIN\$USER$ -p  -h -d -C -f \\SERVER\SHARE\BIN\SET_LOGONASSERVICE.BAT

This MUST be run with the -u / -p switches to specify the user, together with -h to run with “highest privileges”. The -C switch must also be used to copy the batch file to the local system so it can run.

You will see entries in the log similar to:

Granting SeServiceLogonRight to Service Accounts on \\NW-ADCS1... successful 

Granting SeServiceLogonRight to Service Accounts on \\NW-DC1... successful 

Granting SeServiceLogonRight to Service Accounts on \\NW-DC2... successful 
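
If you run this against a long server list, reading the log by eye gets old quickly. The following is a rough Python sketch that splits the log into servers that reported “successful” and everything else. It assumes the log lines look exactly like the three samples above; I have not catalogued what NTRIGHTS prints on failure, so anything that does not end in “successful” is simply flagged for a manual look.

```python
import re

# Matches lines shaped like the samples above:
#   Granting <right> to <account> on \\<host>... <result>
LOG_LINE = re.compile(
    r"^Granting (?P<right>\S+) to (?P<account>.+?) on "
    r"\\\\(?P<host>[^.\s]+)\.\.\.\s*(?P<result>.*)$"
)


def summarize(log_text):
    """Return (ok_hosts, suspect_hosts) from the text of SET_LOGONASSERVICE.LOG."""
    ok, suspect = [], []
    for raw in log_text.splitlines():
        match = LOG_LINE.match(raw.strip())
        if not match:
            continue  # blank lines, "net localgroup" output, etc.
        if match.group("result").strip().lower() == "successful":
            ok.append(match.group("host"))
        else:
            suspect.append(match.group("host"))
    return ok, suspect


if __name__ == "__main__":
    sample = "\n".join([
        r"Granting SeServiceLogonRight to Service Accounts on \\NW-ADCS1... successful",
        r"Granting SeServiceLogonRight to Service Accounts on \\NW-DC1... successful",
    ])
    ok, suspect = summarize(sample)
    print("OK:", ok)
    print("Check manually:", suspect)
```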

5) We now have a local group called “Service Accounts” and this local group has the “Log on as a service” right.

We can verify this by running “SECPOL.MSC” on one of the servers and checking the rights assignments:

clip_image002

Sure enough, the local “Service Accounts” group is listed.

6) We can now handle the remainder of this via normal GPO’s for Restricted Groups, using DYNAMIC naming. 

Open the GPO editor and create a new GPO and name it something obvious such as “LOCAL_RESTRICTED_GROUPS”, and then edit it.

7) Browse to COMPUTER CONFIGURATION -> PREFERENCES -> CONTROL PANEL SETTINGS -> LOCAL USERS AND GROUPS:

clip_image004

Right click and select NEW -> LOCAL GROUP

8) Now we modify the properties for this group:

clip_image006

We will choose UPDATE for an action, as the group should already exist based on our previous work. 

The group name will be “SERVICE ACCOUNTS”. 

Click ADD to add members

clip_image008

This is where the magic comes in.  If you press the “…” beside the NAME, you can search for the group/user based on a traditional ADUC type search.  But we don’t want that.  Instead, place your cursor in the NAME field.  Press the F3 key:

clip_image010

We get a list of VARIABLES!  We want to use ComputerName so that we can reference the group as GRP-%COMPUTERNAME%-SVC and each computer will get its own group.  Click SELECT.

clip_image012

Note the variable shows %ComputerName% as expected.  Modify that as needed to have the GRP- and -SVC prefix and suffix.

clip_image014

Click OK to close this window.

I’ve chosen to also add -ADM and -RDP groups for Administrators and Remote Desktop Users as this is another use case.

clip_image016

Close and save the GPO

9) Link your GPO appropriately:

clip_image018

Here I have a GROUPS-TEST OU and I have placed my NW-VEEAM01 server in this OU, along with the 3 associated groups.   This will limit impact during testing.

10) On the system in question, check the current group memberships:

clip_image020

11) On the system in question, run a “gpupdate /force”

12) Again on the system in question, confirm the updated group membership:

clip_image022

There you have it. The ADM/RDP groups were easy, as the local groups they map to not only pre-exist, but are pre-defined. The complication really was the “Service Accounts” group, which does not pre-exist, has no special rights by default, and has no built-in direct way of adding them via the command line.

The recommendation would be to run the SET_LOGONASSERVICE.BAT as part of the server build process/scripts, or have it pre-done in your deployment image/WIM/VM Template.  Equally, a PSEXEC run against all servers in the domain could force set this group on a periodic basis to ensure the rights existed.  Additional error checking could be built in to check if the command was successful, check if the domain group exists, create it if required, etc. 

Some post comments:

  • Remember that the local group has a SID. If it is deleted and recreated with the same name, that won’t be enough, as the Log on as a Service right will still be assigned to the old SID.
  • As the batch file creates the group with a description and we didn’t tell the GPO to do so, the GPO will create a new group if required, but with no description. A missing description is your identifier that something is off, and hopefully that helps you troubleshoot.

Windows Patching – What happens when you aren’t paying attention.

November 19, 2014

Yesterday, I posted some details about MS14-068 and MS14-066 (https://vnetwise.wordpress.com/2014/11/19/cve-2014-6324-ms14-068-and-you/) and of course today, have had to do some investigating into a few sites that have a variety of patching systems.  Some are using SCCM, some WSUS, some have policies and procedures, some don’t.  But I noticed a potential ‘perfect storm’(?) of situations that could cause some of them grief – and it was more than just one.

Let me draw you a picture of what is a pretty common environment:

  • WSUS exists for updates, because that’s “the responsible thing to do”
  • WSUS was likely configured some time ago, and no one likes it because it’s not sexy or fancy, so it doesn’t get any love.  Thus, it is probably running on Windows 2008 or 2008 R2.
  • Someone at some point *did* ensure that WSUS was upgraded or installed with WSUS 3.0 SP2

This all sounds pretty good, on the face of it.  Now let’s introduce some real world into this environment….

  • Someone decreed that they shall “only install Critical and Security Updates” – Updates, Update Rollups, Feature Packs, etc, would not be installed.
  • Procedures state that you will install updates that are from the previous month or older – so you’re staying 30 days out, which is reasonable – let someone else go on Day 0.
  • Those same procedures state that you will look at the list, and select the Critical and Security Updates from the last month, and approve them.
  • Nothing is stated for what to do about the current month’s patches – they are left as “unapproved” – but also not “declined”

Alright, so still pretty “common” and at face value, not that bad.  A year or two goes by, and now you introduce Windows 2012 and Windows 2012 R2 to the mix.  This itself is not a problem, but it’s where you start to see the cracks.  Without even having to look at the environment, I know already the things I want to be looking for….

  • Because the current month’s updates are not being “Declined”, they’re showing up in the list as “missing”. If you have 10 updates, and 8 are approved and 2 are not, you will only ever possibly show 80% patched. The remaining two are updates that WSUS/WU knows are “available”, but “I don’t have them”. You want to decline those so they only show up as 8 updates and 100% success. Otherwise, how do you know at a glance if the missing update is the approved one that SHOULD be there, or one from this month? Your reporting is bad. See: https://vnetwise.wordpress.com/2014/03/24/howto-tweaking-wsus-so-it-only-reports-on-updates-you-care-about/

 

  • Because the process counts on someone approving “last months” updates and not “all previous updates”, there’s almost certainly going to be some weird “gap” where there is a period of a few months that isn’t approved and isn’t installed for some reason.  But the “assumption” is that they’re all healthy.  Because the previous point doesn’t “decline” any updates, the reports for completion are untrustworthy – and/or never reviewed anyways.

 

  • Next, Windows 2012+ has been introduced. There’s a KB that is required to be installed on the WSUS server *and* a rebuild of the WSUS package on the client to ensure compatibility. See MS KB2734608 (http://support.microsoft.com/kb/2734608). Because this is an “Update” and neither Critical nor Security, it is not applied to either the WSUS server or the clients.

 

 

  • In order for the Windows 2012/2012R2 WU/WSUS behavior to actually be changed, you need GPO’s that Windows 2012/2012R2 understands.  In order for that to be true, you need 2012+ ADMX files in your GPO environment.  Preferably in your GPO “Central Store” (again – https://vnetwise.wordpress.com/2014/03/20/howto-dealing-with-windows-2012-and-2012-r2-windows-update-behavior-and-the-3-day-delay/).  But because Windows 2012 and 2012 R2 were likely “added to the domain” with no testing, studying, certification, or reading, this wasn’t done.  Equally, even if it WAS done, most likely someone is still editing the GPO’s on the 2008/2008R2 based Domain Controller – which wipes out the ADMX based changes and replaces them with ADM files and the subset of options that they understand.  You’ll never know this happened though, and even if you jump up and down and tell people not to do it, they will.

 

  • No one is ever doing a WSUS cleanup, so Expired, Superseded, etc updates are still present. Which isn’t helping anyone.

 

So to make that detail a little shorter:

  • Choosing Critical and Security Updates only is causing you to miss out on *required* updates.  Stop being “fancy” – just select them all please.
  • Because you’re choosing “date ranges” of updates, you’re missing some from time to time.  Stop being “fancy” – select “from TODAY-## to END”
  • If you introduce a new OS to your environment, you need to ensure your AD and GPO’s support them.

On top of the Updates and Update Rollups above that cause those issues, let’s take a quick look at some of the other things that are NOT considered Critical or Security Updates:

November 2014 update rollup for Windows RT 8.1, Windows 8.1, and Windows Server 2012 R2:

    That’s just ONE Update Rollup.  None of those look like ANYTHING I’d want to happen to my servers.  </Sarcasm> So why WOULDN’T I want to install those?  Yes, there may be features you’re not using.  Perhaps you don’t use DeDuplication or DFS-R.  Won’t it be fun later when you install those Roles/Features, and WSUS scans that server, and says “all good, nothing to update” for you?  Tons of fun!
    So, long story short – please stop being fancy. You’re introducing complexity and gaps into your environment, and actually making things harder. This means more work for you and your staff and co-workers, who likely don’t have enough time and resources as it is.
    Don’t pay technical debt….

HOWTO: Tweaking WSUS so it only reports on updates you care about

March 24, 2014

WSUS is a great built in tool for working with Windows Updates, but sometimes it takes a bit of effort to find the best way to use that tool. Here are a few things to help make the system run smoother.

The following assumptions are made:

  • You deploy updates during a Quarterly Outage, every 3 months – eg: March, June, Sept, Dec month end weekend.
  • You must validate the patches in advance, including a DEV and TEST domain or environment.
  • There isn’t enough time from “Patch Tuesday” to deploy in DEV, test for a week or two, deploy in TEST, test for a week or two, then approve for Production – which might only be two weeks from Patch Tuesday
  • To accommodate the above schedule, you then install “Current Month -1” for all updates. Thus in March, you would deploy and approve Dec/Jan/Feb updates, but NOT Mar.
  • This allows you to install in DEV the week after Feb Patch Tuesday. You can then install in TEST two weeks later, or about the beginning of Mar. TEST can then be run for 2-4 weeks depending on Quarterly Outage window, to validate and be certain of updates in Production.
  • It is acceptable for TEST and PRODUCTION to be out of sync for this period. There needs to be a balance between TEST/PRODUCTION being identical and being able to pre-validate updates.

1) Approving Updates

In the WSUS console, click on SERVER -> UPDATES -> ALL UPDATES, and then click in the main window.

clip_image002

Right click on the headers and choose to show “RELEASE DATE”. Sort by RELEASE DATE.

NOTE: In my example here I’m showing “APPROVAL=DECLINED”. You would be choosing “APPROVAL=UNAPPROVED” but currently I have none to use as an example.

clip_image004

Sort by the RELEASE DATE column. Remember that as we are in MAR of 2014, we do NOT want to select any ##/03/2014’s as they are “too new”. Select ALL PREVIOUS updates from “Month -1” or 02/2014 in this case. Right click and choose APPROVE.
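
The “Month -1” selection rule is really just a date comparison. Here is a small Python sketch of it, assuming you have the release dates available as a list (for example from an export of the update view). The function name and the idea of working from an exported list are my assumptions; WSUS itself is still driven by hand in the console as described.

```python
from datetime import date


def split_by_cutoff(release_dates, today):
    """Split release dates into (approve, hold) using the "Month -1" rule.

    Anything released before the first day of the current month is old
    enough to approve; anything released this month is "too new".
    """
    cutoff = date(today.year, today.month, 1)
    approve = [d for d in release_dates if d < cutoff]
    hold = [d for d in release_dates if d >= cutoff]
    return approve, hold


if __name__ == "__main__":
    # Made-up release dates, evaluated as if today were late March 2014.
    releases = [date(2013, 12, 10), date(2014, 2, 11), date(2014, 3, 11)]
    approve, hold = split_by_cutoff(releases, today=date(2014, 3, 24))
    print("Approve:", approve)
    print("Too new:", hold)
```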

clip_image006

You want to click on the parent level and choose APPROVED (which has already been done here, as indicated by the GREEN CHECK that is shaded out). Repeat this but then choose APPLY TO CHILDREN – if this is appropriate for all of your WSUS Groups. In this environment, WSUS is only used for Windows Server OS groups, and they’re grouped by “Automatic”, “Manual”, and “Primary/Secondary” groupings. As such, they all GET the same updates; the groups only exist to allow different schedules and methods for installation. Click OK. A new dialog will pop up as it sets each update to APPROVED and will take some time to complete.

Until you perform this step, you will see the updates in reports showing computers that require the update, but they’re not allowed to install it. Thus, even if you go and perform a manual Windows Update check, you’ll never see the updates to be able to select them. A sample update report would look like:

clip_image008

The APPROVAL column for the update(s) would say “Not approved”. The STATUS column will know if the system has already downloaded and staged the update.

2) Declining Updates.

The above all seems well enough, except for the non-obvious results. For this month you’ll have Mar/2014 updates not approved, and as the months go by you’ll have downloaded the updates for Apr/May/Jun. Your reports are now going to show that your systems aren’t 100% compliant, even if you install all current updates available. You’ll spin your wheels trying to figure out why WSUS says you have 2 updates outstanding, but the Windows Update client says “no updates found”. This is because WSUS knows about the updates, and will indicate they’re available but not approved. So your system DOES require them, but you haven’t let them off the leash yet. So the report is, in fact, valid. But what it’s really showing you is “next time you do updates, you’ll need to install these updates”. That’s great for the week AFTER quarterly outages, but it does nothing to help you DURING or just after the outage to measure success.
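
The effect on the percentage is simple arithmetic. The sketch below is a rough model of it in Python, not WSUS’s actual calculation: the function and its parameters are hypothetical, and the numbers are made up to show how known-but-unapproved updates drag the figure down.

```python
def compliance_percent(installed, needed_approved, needed_unapproved,
                       count_unapproved=True):
    """Rough model of the % figure for one system (an illustration only).

    installed          updates already on the system
    needed_approved    approved updates still waiting to be installed
    needed_unapproved  updates WSUS knows about but that are not approved
    count_unapproved   whether those unapproved updates count as "needed"
    """
    total = installed + needed_approved
    if count_unapproved:
        total += needed_unapproved
    if total == 0:
        return 100.0
    return round(100.0 * installed / total, 1)


if __name__ == "__main__":
    # Everything approved is installed, but 2 newer updates are known.
    print(compliance_percent(8, 0, 2, count_unapproved=True))
    # Same system once those 2 no longer count against it.
    print(compliance_percent(8, 0, 2, count_unapproved=False))
```

With 8 installed and 2 known-but-unapproved, the first call gives 80.0 and the second 100.0, which is the gap you end up chasing during the outage.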

To fix this issue, what you want to do is DECLINE the updates.

clip_image010

Change the APPROVAL drop down to show “ANY EXCEPT DECLINED”, which will hide all previously declined updates. Sort by the RELEASE DATE column. Remember that as we are in MAR of 2014, we DO want to select ONLY ##/03/2014’s as they are “too new” to be Approved. Select ALL updates after the last Approved date, or 03/2014 in this case. Right click and choose APPROVE (this is counter-intuitive).

clip_image012

Choose “NOT APPROVED” (still not intuitive – you’re going to want to try looking for a “DECLINE” option, and it’s not an option – you need “NOT APPROVED”) from the top level drop down. Then click again and choose APPLY TO CHILDREN. Then click OK.

Now when you pull reports on your system, you’ll actually see 100%:

clip_image014

You now want to keep performing updates on your servers until everything shows 100%. That will then be:

    • All KNOWN updates
    • Including APPROVED, which will actually allow installation of a KNOWN update
    • NOT including DECLINED, which will not show them as “needed” in your reports or % columns.

3) Each month between “now” and “Next Quarterly Update”

This will now make you fine for “Today” assuming today is “March 2014, after Patch Tuesday, but before April 2014, Patch Tuesday”. However, come April/May/June Patch Tuesday, new updates will get downloaded to the WSUS server. For your reports to remain accurate, you’ll need to come into WSUS and set all the new updates to “DECLINED”. Follow the same process you did in Step #2, only of course you’ll see more than just 03/2014 to select. Just select from the first date of ##/03/2014 and go to the bottom and repeat the DECLINE option.

4) NEXT Quarterly Update cycle:

Steps #1 and #2 above assume you have a net-new WSUS installation. If you’ve done this process before, then come Jun/2014 when you need to select Mar/Apr/May months for approval, you’re going to have all of Mar/Apr/May/June of 2014 set as “DECLINED”. You need to now set them to approved, as well as the now downloaded Apr/May.

Similar to Step #3, you’re now going to take all your Mar/Apr/May updates and set them to “APPROVED”. You’ll want to do this immediately following the May Patch Tuesday, as this will then let your reports be accurate to reflect the number of updates and systems required. You can now provide accurate details on how long and how many updates you will need to perform.

5) Just BEFORE NEXT Quarterly Update cycle:

Understandably, you’ll now show accurate reports for May 2014 and you’ll no longer show 100% up to date, as of course you are not. However, as soon as Jun 2014 rolls around, your numbers will be inflated again because of updates that are now known after June’s Patch Tuesday but are not approved. This will, as per Step 2, skew your numbers and prevent you from hitting 100% success in your maintenance window. So ensure you then set all June updates to “DECLINED”.

A general rule of thumb might be that following a Patch Tuesday you should:

  • Go in and APPROVE all previous month updates
  • Go in and DECLINE all current month updates

This would allow non-critical servers that are set to update automatically on some schedule, to keep up to date on a monthly basis vs waiting for quarterly. This provides two benefits:

1) You get the new updates tested (albeit in limited fashion) on existing servers up to 3 months prior to quarterly outages

2) There is far less load and number of systems to be manually or brute force updated during your maintenance window. Less load, means less IOPS on shared storage, which means updates perform quicker, which means you can do more/other maintenance in the same outage window.

HOWTO: Force WSUS Client to Update using PSEXEC

March 21, 2014

WSUS is a great tool for automating and managing Windows Updates to various systems in a domain. However, it’s not really all that granular, which is a problem. While you could say “install all updates at 03:00 on Saturday”, you can’t say “and after rebooting, check again, because you’re still in the maintenance window”. You also can’t specify “do it RIGHT NOW, don’t wait for a random period” and there are some difficulties with “reboot when complete, don’t wait 5-15 minutes, don’t wait 3 days, do it now”.

It turns out there are some undocumented switches for the Windows Update Client (wuauclt.exe). Various lists can be found all over; I’ve found one at: http://kickthatcomputer.wordpress.com/2013/03/06/windows-update-command-line-options/

If you use these methods it might take you a bit of tweaking and fighting to make it work. Specifically if you’re having issues with Windows 2012/2012R2 systems, check: HOWTO: Dealing with Windows 2012 and 2012 R2 Windows Update Behavior and the 3 Day Delay.

This method can be pushed out to all systems via PSexec. Note though that there are some things to watch for:

· The GPO must be set to “4 – Download and Install Updates”. If it is set to “3 – Download and Notify” then all the “wuauclt /UpdateNow” in the world won’t make it do what it’s not allowed

· Except for maybe on Windows 2012/2012R2 systems, where it will think it’s in a maintenance window, and well, you said “UpdateNow”, so let’s do that.

· I’ve found it to be intermittent if the Day/Hour for the option to install in the GPO is not set near the time you’re pushing out. This doesn’t matter so much if you’re doing a scheduled restart such as “Sunday @ 03:00”. But if you have a manual maintenance window where you’re trying to brute force blast out and confirm all the updates that starts at Friday @ 20:00 – you should probably ensure that the GPO is set to the same, especially given that this batch file will refresh the GPupdate.

· As time goes on through that maintenance window, update the GPO time as well. They must go hand in hand.

clip_image002

What you’ll see is that it will schedule the installation for the next day. In the above example, C:\WINDOWS\WINDOWSUPDATE.LOG is showing that on 2014-03-20 at 2:20AM it says it is scheduling the installation to occur at March 21 2014 at 12:00AM. This is because the first line indicates the GPO setting is “Every day” @ “00:00”. So if anything, you’d like that to be “the next hour, of the same or following day”. Watch things like running Friday at 11:45PM and not changing your “Install Day” from Friday to Saturday to accommodate the 00:00 or 01:00 next time.

· There doesn’t seem to be any harm in pushing out the batch file to a system that’s already updating, other than it will restart the Windows Update service. Where possible though, you want to push it to systems that are not otherwise installing. I don’t yet have a method for knowing if a current update process is occurring. Perhaps if you took the “ping” process that is the timer, and made it a “start /wait” with a title, then looked to see if a process was running with that title, don’t run…. But this is as far as I’ve gotten for now.

· Periodically check the WSUS console for “Last Updated” and “Last Reported” to get an idea for what systems need checking. Also look at the % complete column to know which systems are done.
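
For the scheduling point above (keeping the GPO install day/hour at “the next hour, of the same or following day”), here is a minimal Python sketch of how to work out what to set. The function name is hypothetical; it only does the clock arithmetic and does not touch the GPO.

```python
from datetime import datetime, timedelta

# datetime.weekday() returns 0 for Monday through 6 for Sunday.
DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]


def next_install_slot(now):
    """Return (day name, hour) for the next top-of-the-hour after `now`."""
    top_of_next_hour = (now.replace(minute=0, second=0, microsecond=0)
                        + timedelta(hours=1))
    return DAY_NAMES[top_of_next_hour.weekday()], top_of_next_hour.hour


if __name__ == "__main__":
    # Friday 23:45 rolls over to Saturday 00:00, the case warned about above.
    print(next_install_slot(datetime(2014, 3, 21, 23, 45)))
```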

With all that said, the batch file itself:

===== WSUS_FORCE.BAT =====

@echo off

SET WSUSSERVER=FSRVCLOWSUS1

SET WSUSSHARE=WSUSLOGS

SET WSUSLOG=WSUS_FORCED.LOG

REM

REM PSEXEC Usage

REM psexec @SERVER.LST -u svcautomation -H -f -c -D \\FSRVCLOWSUS1\E$\WSUS\bin\WSUS_FORCE.bat

REM

REM

REM Run a GPUPDATE

REM

gpupdate /force

REM

REM Restart services and refresh Windows Update

REM

net stop wuauserv

REG DELETE "HKLM\Software\Microsoft\Windows\CurrentVersion\WindowsUpdate\Auto Update" /v LastWaitTimeout /f

REG DELETE "HKLM\Software\Microsoft\Windows\CurrentVersion\WindowsUpdate\Auto Update" /v DetectionStartTime /f

REG DELETE "HKLM\Software\Microsoft\Windows\CurrentVersion\WindowsUpdate\Auto Update" /v NextDetectionTime /f

net start wuauserv

wuauclt /scannow

wuauclt /resetauthorization /detectnow

echo %COMPUTERNAME% Checking for WSUS Update at %DATE% %TIME% >>\\%WSUSSERVER%\%WSUSSHARE%\%WSUSLOG%

wuauclt /r /ReportNow

echo %COMPUTERNAME% Installing WSUS Update at %DATE% %TIME% >>\\%WSUSSERVER%\%WSUSSHARE%\%WSUSLOG%

wuauclt /UpdateNow

:CHECK_REBOOT_REQUIRED

REM

REM This registry key only exists if WSUS indicates a reboot is required. Thus, keep checking for it to appear, and then reboot

REM

ping 127.0.0.1 -n 61 > nul

reg query "HKLM\Software\Microsoft\Windows\CurrentVersion\WindowsUpdate\Auto Update\RebootRequired" >nul 2>&1

REM ERRORLEVEL 0 = key exists, so a reboot is pending. Anything else = keep waiting.

if %ERRORLEVEL%==0 goto REBOOT

GOTO CHECK_REBOOT_REQUIRED

:REBOOT

echo %COMPUTERNAME% Rebooting after WSUS Update at %DATE% %TIME% >>\\%WSUSSERVER%\%WSUSSHARE%\%WSUSLOG%

shutdown -r -t 0

GOTO END

:END

===== WSUS_FORCE.BAT =====
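
Since every server appends to the same WSUS_FORCED.LOG, that log can tell you which systems are still mid-run. The following Python sketch is one way to read it, assuming the lines look exactly like the three echo statements in the batch file. The function names are my own, and the timestamp is left unparsed because %DATE% and %TIME% vary by locale.

```python
import re

# Matches the three echo lines written by WSUS_FORCE.BAT:
#   <host> Checking for WSUS Update at <date> <time>
#   <host> Installing WSUS Update at <date> <time>
#   <host> Rebooting after WSUS Update at <date> <time>
LOG_LINE = re.compile(
    r"^(?P<host>\S+) (?P<stage>Checking for|Installing|Rebooting after) "
    r"WSUS Update at (?P<when>.+)$"
)


def latest_stage(log_text):
    """Return {host: last stage seen}, later lines overriding earlier ones."""
    stages = {}
    for raw in log_text.splitlines():
        match = LOG_LINE.match(raw.strip())
        if match:
            stages[match.group("host")] = match.group("stage")
    return stages


def still_pending(log_text):
    """Hosts whose most recent entry is not a reboot."""
    return sorted(host for host, stage in latest_stage(log_text).items()
                  if stage != "Rebooting after")


if __name__ == "__main__":
    sample = "\n".join([
        "SRV1 Checking for WSUS Update at Fri 03/21/2014 20:00:01.10",
        "SRV1 Installing WSUS Update at Fri 03/21/2014 20:00:05.42",
        "SRV2 Checking for WSUS Update at Fri 03/21/2014 20:00:07.03",
        "SRV1 Rebooting after WSUS Update at Fri 03/21/2014 20:31:44.87",
    ])
    print(still_pending(sample))
```

Keep in mind a server with nothing to install never writes the reboot line, so “pending” here means “go and look”, not “broken”.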

HOWTO: Troubleshoot high CIFS IOPS on NetApp

October 15, 2013

In my office, we’ve been keeping an eye on some high usage on our EDM based NetApp, particularly due to backup times with NDMP and NetBackup to tape.  In doing so, from time to time, I see other issues.  One that comes up is an unusually high CIFS (Windows File Shares) usage.  In order to troubleshoot this, and get to the root of the issue – much like one might with SFLOW or a sniffer on the network – these are the steps you can take.

1) Obtain the system stats on the NetApp from the console, by running “sysstat -c 12 -x 10”. -c 12 indicates 12 iterations or “counts”, -x selects the extended output format, and the trailing 10 is the interval, so you get one sample “every 10 seconds”.

clip_image001

I’m sure that at first, these results look a little confusing – so I’ll break them down:

· GREEN shows the NFS OPS per Second.  Because VM backups are occurring at night (during this example), it is expected that there will be high NFS load on the system.

· ORANGE shows the NET KB/Sec OUT – the activity that would be used by VM’s on NFS performing READS FROM the SAN – thus, NET OUT traffic.

· PURPLE shows the DISK KB/Sec READ. We expect this to be high both because of NFS OPS, but also because of…..

· GRAY shows the TAPE KB/sec WRITE, which is NDMP traffic OUT, WRITING to the tapes.

So all of this is largely expected and accounted for. Except why is CIFS, shown in RED, so high? CIFS is used by Windows users, not by NFS clients; it is not part of internal SAN activity (eg: “aggr scrub”, “vol move”, “reallocate -p”, SnapMirror/Vault, etc), and it is not part of NDMP or the backups. So – what is the activity?

(Also worth noting – Cache Age is 0s – indicating we are massively overrunning the available system cache, or it would show how many seconds or minutes of cache exists, and 0s is none)

2) From a Windows server where you are logged in with an Administrator account (to make pass-thru authentication to the SAN work better), open Computer Management:

clip_image002

Right click on COMPUTER MANAGEMENT in the MMC and choose “CONNECT TO ANOTHER COMPUTER”.  Enter the name of the NetApp controller – eg: NETAPP1.  Because NetApp has licenced API’s and functionality from Microsoft for their Domain participation and CIFS sharing, much of the CIFS portion of the controller can be managed as if it were a remote Windows Server.

3) Expand SYSTEM TOOLS -> SHARED FOLDERS -> SESSIONS:

clip_image003

Click on the # OPEN FILES header to sort by the highest number.

As you can see, we now know the USERNAME and COMPUTERNAME that has the highest number of files open.  The immediate questions that come to mind here are:

· Why so many users with more than one connection?

· Why so many open files?

· What might the activity be that is causing high CIFS IO? An open file doesn’t cause IO by itself, it simply holds a lock.

4) Expand SYSTEM TOOLS -> SHARED FOLDERS -> OPEN FILES:

clip_image004

If we sort by ACCESSED BY, and look for the highest users by # of open files in the previous step, we know to look for username “saxxx.xxxxx”.  Looking at the files opened, we can reasonably assume that there is something going on with an ArcGIS GDB (GeoDatabase), by the number of GDB files open (had to be obfuscated, sorry).  Likely this user is either a) active or b) running some long-running task overnight.

User “pnxxxx” however, just appears to have open files.

The difficulty here is that none of this tells you WHAT the users are doing.  But it gives you a reasonable place to go and look and investigate.

HOWEVER – something to keep in mind – if a user is say copying a large number of files (eg: robocopy backup, zip archive, etc), the above method MAY NOT find it.  That activity will look like the user opening one file after the other, closing the first, and may never appear as the user having more than one file open at any given time.  These methods above are intended to be a guide, not a solution.

The downside to all of this?  Users who hammer on the system over the weekend, affect the performance of backups.  We try not to do backups during the week so their user experience is good – but the reverse is not always true, when we need backups to get the highest priority and performance.  All of that said – the system is there for the users to perform work for the company, so it is an understandably necessary evil.

5) There IS an option that one can set to get more information though:

NETAPP1> options cifs.per_client_stats.enable off

However, it is recommended by NetApp to leave “per_client_stats” disabled unless needed, as tracking these stats puts more load on the system, and thus can slow it down in a situation where you are already troubleshooting poor performance. It is worth knowing it exists, in case you are asked by NetApp Support to enable it for troubleshooting.

To enable the option, simply run:

NETAPP1> cifs top
The cifs.per_client_stats.enable option must be on to use “cifs top”
NETAPP1> options cifs.per_client_stats.enable on

As you can see, “cifs top” will not provide any useful information until “per_client_stats” are enabled.  You can safely disable them when you’re done troubleshooting.

NETAPP1> cifs top
ops/s  reads(n, KB/s) writes(n, KB/s) suspect/s   IP              Name
   553 |      0     0 |       0     0 |        0 | 172.21.250.45     DOMAIN\svcspaceobserver
    19 |      0     0 |       0     0 |        0 | 172.23.0.67       DOMAIN\jasxxxxx
    10 |      0     0 |       0     0 |        0 | 172.23.0.67       DOMAIN\jasxxxxx
     2 |      1   145 |       0     0 |        0 | 172.21.250.30     DOMAIN\cmxxxx
     0 |      0     0 |       0     0 |        0 | 172.21.1.48       DOMAIN\lawxxxx
     0 |      0     0 |       0     0 |        0 | 172.22.0.65       DOMAIN\saxxxxx
     0 |      0     0 |       0     0 |        0 | 172.22.17.76      DOMAIN\derxxxx
     0 |      0     0 |       0     0 |        0 | 172.21.61.133   DOMAIN\nexexxxxxx
     0 |      0     0 |       0     0 |        0 | 192.168.52.66   DOMAIN\aroxxxx
     0 |      0     0 |       0     0 |        0 | 172.22.17.56      DOMAIN\armaxxxx

When you run “cifs top” you may need to give the system some time to collect those “per_client_stats”, perhaps 60-120 seconds.  But then what you see is shown above.  Clearly, “DOMAIN\svcspaceobserver” is the biggest culprit here – at 553 ops/second.  You can see it is NOT doing a lot of KB/sec read or write, but simply crawling the file system results in a lot of “operations”.  This is one of those situations that would not show up as a high number of open files in Computer Management, as it is massively sequential access, one operation at a time.
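If you capture the “cifs top” output to a text file, ranking the clients can be scripted.  What follows is only a rough Python sketch and not a NetApp tool: the column layout is assumed from the sample output above, and the names parse_cifs_top and busiest are made up for illustration.

```python
import re

# A few rows copied from the "cifs top" output above.
SAMPLE = r"""
   553 |      0     0 |       0     0 |        0 | 172.21.250.45     DOMAIN\svcspaceobserver
    19 |      0     0 |       0     0 |        0 | 172.23.0.67       DOMAIN\jasxxxxx
     2 |      1   145 |       0     0 |        0 | 172.21.250.30     DOMAIN\cmxxxx
     0 |      0     0 |       0     0 |        0 | 172.21.1.48       DOMAIN\lawxxxx
"""

# Assumed row layout: ops/s | reads n KB/s | writes n KB/s | suspect/s | IP  Name
ROW = re.compile(
    r"^\s*(\d+)\s*\|\s*(\d+)\s+(\d+)\s*\|\s*(\d+)\s+(\d+)\s*\|\s*(\d+)\s*\|\s*(\S+)\s+(\S+)\s*$"
)


def parse_cifs_top(text):
    """Return one dict per data row; lines that do not match (header, blanks) are skipped."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line)
        if not m:
            continue
        ops, reads, read_kb, writes, write_kb, suspect, ip, name = m.groups()
        rows.append({
            "ops": int(ops),
            "read_kb": int(read_kb),
            "write_kb": int(write_kb),
            "ip": ip,
            "name": name,
        })
    return rows


def busiest(rows, min_ops=1):
    """Clients with at least min_ops operations per second, highest first."""
    active = [r for r in rows if r["ops"] >= min_ops]
    return sorted(active, key=lambda r: r["ops"], reverse=True)


for row in busiest(parse_cifs_top(SAMPLE)):
    print(f'{row["ops"]:>6} ops/s  {row["ip"]:<15}  {row["name"]}')
```

Sorting on ops/s rather than KB/s is deliberate here, since a file system crawler like the one above moves very little data but generates a lot of operations.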

Don’t forget to DISABLE the “per_client_stats” once you’re done troubleshooting, as there is no point collecting this information if it will not be used.

NETAPP1> options cifs.per_client_stats.enable off

So the short moral of the story?  Don’t run Space Observer on a share on a NetApp during backups – or it will just compound poor backup performance.   Hopefully this information might help you troubleshoot high CIFS activity in the future.

Categories: CIFS, NetApp, Windows2008R2

2008R2_LAB: Configuring DHCP in a Windows 2008 R2 environment

August 5, 2013

Now that the Lab has a working Active Directory Domain Controller (AD DC) and DNS functionality, the next thing we need is to provide DHCP services.  This will allow us to spin up both servers and workstations alike, and provide TCP/IP addresses to them automatically. 

Information you will require to complete this task:

  • User the lab is for – eg: David Lock – we need this for the initials to use
  • The Subnet to use for the LAN interface of the lab – eg: 192.168.5.0/24
  • The IP address to use for the Monowall LAN interface (default gateway) of the lab – eg: 192.168.5.1/24
  • The IP address to use for the XX-DC1 VM – eg: 192.168.5.11/24 – this will be the DNS server address in the DHCP scope
  • Start and End DHCP scope IP’s – typically 192.168.x.101 through 192.168.x.199.
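As a quick sanity check before running the wizard, the values in the list above can be derived from the subnet alone.  This is a minimal Python sketch using the standard ipaddress module; the function name lab_dhcp_plan and its parameters are hypothetical, and the host offsets (.1, .11, .101 through .199) are simply the examples used in this post.

```python
import ipaddress


def lab_dhcp_plan(subnet, gateway_host=1, dc_host=11, first_host=101, last_host=199):
    """Derive the values the DHCP wizard asks for from the lab subnet."""
    net = ipaddress.ip_network(subnet)
    base = net.network_address
    plan = {
        "subnet_mask": str(net.netmask),
        "default_gateway": str(base + gateway_host),  # Monowall LAN interface
        "dns_server": str(base + dc_host),            # the XX-DC1 VM
        "scope_start": str(base + first_host),
        "scope_end": str(base + last_host),
        "scope_size": last_host - first_host + 1,
    }
    # Make sure every derived address really falls inside the subnet.
    for key in ("default_gateway", "dns_server", "scope_start", "scope_end"):
        if ipaddress.ip_address(plan[key]) not in net:
            raise ValueError(f"{key} {plan[key]} is outside {net}")
    return plan


for name, value in lab_dhcp_plan("192.168.5.0/24").items():
    print(f"{name:<16} {value}")
```

Keeping the static addresses (.1, .11) below the start of the scope means the DHCP range never overlaps the gateway or the DC.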

1) Log in to the first domain controller we have created for the lab – eg: NWL1-DC1, DL-DC1, etc.   When Server Manager starts, click on ADD ROLES.

clip_image002

2) Check off DHCP Server and click NEXT:

clip_image004

3) The left-hand side will show a number of screens that we’ll go through.

clip_image006

Read the Introduction and click NEXT.

4) We only have one NIC installed, so choose the only option and click NEXT.

clip_image008

5) Configure the basics of the DHCP scope:

clip_image010

Enter the PARENT DOMAIN name of the lab – eg: NETWISELAB.LOCAL, DLLAB.LOCAL, etc.  Enter the IP address of the DC and click VALIDATE and ensure it turns green and shows VALID.  If a second DNS server exists for redundancy, enter it here.  Click NEXT.

6) On the WINS screen, choose WINS IS NOT REQUIRED and click NEXT:

clip_image012

7) On the SCOPES screen, click ADD:

clip_image014

clip_image016

Fill in the appropriate information.  The SCOPE NAME is purely a label for your own reference, so it can be anything you like.  I recommend a STARTING and ENDING address of <SUBNET>.101 through <SUBNET>.199, which provides 99 addresses, more than enough DHCP space for a lab.  Enter the subnet mask of 255.255.255.0 and enter the default gateway as appropriate.  Click OK.

clip_image018

Click NEXT.

8) On the IPv6 Stateless Mode screen, choose DISABLE and click NEXT.

clip_image020

9) To AUTHORIZE the DHCP scope/server in AD, choose USE CURRENT CREDENTIALS, to use the DOMAIN\Administrator account you are logged in with.  Choose NEXT.

clip_image022

10) On the CONFIRMATION screen, choose INSTALL.

clip_image024

11) The RESULTS screen should show INSTALLATION SUCCEEDED.  Click CLOSE.

clip_image026

At this point, there is now a working DHCP scope for the lab domain.  No further configuration is required for basic DHCP services. 

Some very quick usage information:

  • DHCP Console can be found in the Administrative Tools folder (along with DNS Console):

clip_image028

  • If you expand out the folders in the console, you’ll see what will soon become familiar options:

clip_image030

ADDRESS POOL is the pool you defined, and where you would edit those settings.

clip_image032

ADDRESS LEASES will be where you find devices that have requested and obtained an IP address.

clip_image034

You can RIGHT-CLICK on a lease and choose ADD TO RESERVATION.

clip_image036

Which will then make the lease show up under the RESERVATIONS folder. 

clip_image038

SCOPE OPTIONS will show the settings you entered in the DHCP wizard. 

  • If you are doing this in VMware Workstation, you may need to click on EDIT -> VIRTUAL NETWORK EDITOR:

clip_image040

Here you will need to UNCHECK the “USE LOCAL DHCP SERVICE” so that your server will provide DHCP, and not VMware Workstation.

2008R2_LAB: Creating the first AD DC in a Windows 2008 R2 environment

August 5, 2013

NOTE: For the purposes of this example, a default gateway of 192.168.79.2 is used, provided by VMware Workstation NAT.  It is recommended that you use a firewall VM such as Monowall or pfSense, etc.  By doing so, the entire infrastructure is portable to another VMware Workstation environment, and can be converted to Hyper-V or moved to vSphere.  If you are doing so, either adjust the Default Gateway shown to that of your VM firewall –or- set the IP of the VM firewall to match this address.

Information you will require to complete this task:

  • User the lab is for – eg: David Lock – we need this for the initials to use
  • The Subnet to use for the LAN interface of the lab – eg: 192.168.79.0/24
  • The IP address to use for the Monowall LAN interface (default gateway) of the lab – eg: 192.168.79.2/24
  • The IP address to use for the XX-DC1 VM – eg: 192.168.79.11/24

1) Once the VM is booted and you have logged in locally, start Server Manager

2) In Server Manager:

clip_image002

Click on VIEW NETWORK CONNECTIONS.

3) Right click on the NIC and choose PROPERTIES:

clip_image004

4) Uncheck TCP/IP v6.  Select TCP/IP v4, and choose PROPERTIES:

clip_image006

5) If necessary, open a CMD prompt to find the existing IP address.  This won’t be needed if the IP address is dictated by the environment.  Run IPCONFIG:

clip_image008

Here we can see the lab environment for the example is 192.168.79.159/24.  So our subnet will be 192.168.79.0/24, with a default gateway of 192.168.79.2.  Lab environments using MONOWALL VMs will likely use .1 for the gateway instead.

6) Return to the LOCAL AREA CONNECTION PROPERTIES window and click PROPERTIES.

clip_image010

Enter the TCP/IP address information as follows:

IP ADDRESS                        = <SUBNET>.11

SUBNET MASK                  = 255.255.255.0

DEFAULT GW                     = <SUBNET>.2 (or .1 if that is the MONOWALL config)

PREFERRED DNS               = The IP configured in IP Address (the server itself)

Press OK to close the TCP/IP properties.  The system may automatically test the configuration if you have checked the “VALIDATE SETTINGS UPON EXIT” box.  Doing so will issue a warning that the DNS server is not responding – and it will not be, given that we have yet to configure it.  Press OK.  Then press CLOSE to close the NIC settings.
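The settings entered in step 6 follow directly from the address IPCONFIG reported in step 5.  As an illustration only, here is a small Python sketch of that arithmetic using the standard ipaddress module; the function name dc_static_settings and its parameters are hypothetical, with the defaults taken from this example (.11 for the DC, .2 for the VMware NAT gateway).

```python
import ipaddress


def dc_static_settings(observed, dc_host=11, gateway_host=2):
    """Work out the static TCP/IP settings for the DC from a DHCP-assigned address."""
    iface = ipaddress.ip_interface(observed)  # e.g. "192.168.79.159/24"
    net = iface.network                       # the subnet that address lives in
    dc_ip = net.network_address + dc_host
    return {
        "subnet": str(net),
        "ip_address": str(dc_ip),
        "subnet_mask": str(net.netmask),
        "default_gateway": str(net.network_address + gateway_host),
        "preferred_dns": str(dc_ip),          # the server points at itself for DNS
    }


for name, value in dc_static_settings("192.168.79.159/24").items():
    print(f"{name:<16} {value}")
```

For a Monowall-based lab, passing gateway_host=1 gives the .1 gateway instead.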

7) Return to SERVER MANAGER and choose CHANGE SYSTEM PROPERTIES in the upper right hand corner.

clip_image012

clip_image014

We are going to click CHANGE to change the COMPUTER NAME.

clip_image016

Change the COMPUTER NAME for your DC as appropriate.  Don’t worry about changing the WORKGROUP or DOMAIN NAME at this time.  Press OK.

NOTE:  However, if for some reason your VM was cloned from a Domain Joined system and *IS* a member of a domain, then set the radio button to WORKGROUP and enter the name WORKGROUP (or TEST or whatever) into the WORKGROUP box.

clip_image018

Press OK.  Press CLOSE on the SYSTEM PROPERTIES window and reboot when prompted.

clip_image020

At this point what we have is a VM with a proper COMPUTER NAME and TCP/IP settings. 

8) When the computer restarts, login.  Server Manager will start by default.  Under Option 3, click ADD ROLES:

clip_image022

clip_image024

Click NEXT.  You may wish to check SKIP THIS PAGE BY DEFAULT. 

On the next screen, you can select the roles to install.  You might be tempted to select DHCP, DNS and AD Domain Services all at the same time.  However, if you do:

clip_image026

You are told you cannot.  As per the message, we’ll choose to ONLY install ACTIVE DIRECTORY DOMAIN SERVICES. 

clip_image028

You’ll be told you need to install .NET Framework 3.5.1 Features.  Click ADD REQUIRED FEATURES.  Then click NEXT.

clip_image030

On the ACTIVE DIRECTORY DOMAIN SERVICES screen, click NEXT.

clip_image032

Then click INSTALL.

NOTE: This only INSTALLS the ROLE; it does NOT configure it.  We must still run DCPROMO.EXE later to actually CONFIGURE ADDS. 

clip_image034

When the installation completes, you’ll see INSTALLATION SUCCEEDED and can click CLOSE.

9) Next, we’ll run DCPROMO.EXE.

clip_image036

Which will check if ADDS binaries are installed:

clip_image038

The wizard will launch:

clip_image040

Click NEXT

clip_image042

There is a blurb about compatibility with Windows NT 4.0 type systems.  We can largely ignore that, and click NEXT.

clip_image044

We are going to choose CREATE A NEW DOMAIN IN A NEW FOREST and click NEXT.

clip_image046

Name the domain something appropriate, based on your lab standards.  This might be something like First Initial Last Initial LAB.LOCAL (eg: DLLAB.LOCAL).  In my case, it is NETWISELAB.LOCAL.  Click NEXT.
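If you build labs for several people, the naming convention is easy to script.  The Python sketch below is just an illustration of the “initials + LAB.LOCAL” pattern described above and the XX-DC1 computer naming used earlier; the function name lab_names is hypothetical, and your own lab standards may well differ.

```python
def lab_names(full_name, dc_number=1):
    """Build the lab domain and DC computer name from the lab owner's name."""
    parts = full_name.split()
    if len(parts) < 2:
        raise ValueError("expected at least a first and last name")
    # First initial + last initial, e.g. "David Lock" -> "DL"
    initials = (parts[0][0] + parts[-1][0]).upper()
    return {
        "initials": initials,
        "domain": f"{initials}LAB.LOCAL",
        "dc_name": f"{initials}-DC{dc_number}",
    }


print(lab_names("David Lock"))
```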

clip_image048

Choose a FOREST FUNCTIONAL LEVEL of Windows Server 2008 R2.  Note the details it lists.  Click NEXT.

clip_image050

The wizard will automatically force a selection of configuring the server as an AD Global Catalog Server.  Check the box for DNS Server as well, so it will set up DNS for you.  Click NEXT.

clip_image052

This message will appear.  It’s not something we really need to worry about, as this is the first DNS server installed and there is no delegation.  In a full domain this will often be seen in cases where your DNS server may be non-Windows based, and additional work is required on your part to facilitate Active Directory.  Click YES.

clip_image054

Accept the default locations for the folders, as there is no good reason to NOT use the defaults.  Click NEXT.

clip_image056

Choose and enter a password for the DIRECTORY SERVICES RESTORE MODE ADMINISTRATOR account.  The likelihood you’ll ever need this is slim, but it must be set.  This will become the ADMINISTRATOR password as well.  Enter the password twice and click NEXT.

clip_image058

The summary screen will show.  Click NEXT.

clip_image060

The installation will begin and show updates to status on this window.  Check the REBOOT ON COMPLETION box, and then wait for it to complete.

10) When the computer reboots, press CTRL+ALT+DEL and login as the Administrator. 

clip_image062

Note that you don’t want to log in with the COMPUTERNAME\Administrator account it suggests.  Click on LOGIN AS ANOTHER USER and ensure it shows LOG ON TO: NETWISELAB (your domain name) and use the Administrator login.

clip_image064

Now, when Server Manager starts, things look a little different. 

The following ROLES are now installed:

                DNS Server, Active Directory Domain Services

The following FEATURES are now installed:

                Group Policy Management (GPMC), Remote Server Administration Tools (RSAT), SNMP Services (from my template, not via the DCPROMO) and .NET Framework 3.5.1 Features. 

At this point, you now have a functional AD DC for the lab.   You should consider doing some or all of the following:

  • Disable the Windows Firewall on the Domain Network – only on a LAB network.  This will help simplify basic learning, but at some point, you will need to know how to configure things WITH firewalls enabled.  Thus, where possible, leave the Windows Firewall enabled as a Best Practice.
  • Configure Windows Updates to be automatic.
  • Check for Windows Updates to keep the server up to date.
  • Create user(s).  At the very least, consider creating a couple of “Test” users that are not Domain Administrators, which you can mess around with for server and workstation tasks.  Also, create a copy of the Administrator account for yourself to use so that if you create an issue, you can always get back in using the Administrator account.
  • Enable Remote Desktop.  However, as this lab will be isolated from your network, you will likely be using the VMware Workstation/vSphere Console or the Hyper-V console and NOT RDP.  That said, you may choose to use RDP to connect from your test (isolated) workstations or other servers, so it doesn’t hurt to have it enabled.