
Archive for the ‘Dell’ Category

HOWTO: Dell vSphere 6.0 Integration Bits for Servers

October 5, 2015

When I do a review of a vSphere site, I typically start by looking to see if best practices are being followed – then look to see if any of the 3rd party bits are installed. This post picks on Dell environments a little, but the same general overview holds true of HP, or IBM/Lenovo, or Cisco, or…. Everyone has their own 3rd party integration bits to be aware of. Perhaps this is the part where everyone likes Hyper Converged, because you don’t have to know about this stuff. But as the person administering the environment, you should at least be aware of it, even if you’re not an expert in it.

I’m not going to go into details as to how to install or integrate these components. I just wanted to make a cheat sheet for myself, and maybe remind some folks that regardless of your vendor, make sure you check for the extras – it’s part of why you’re not buying white boxes, so take advantage of it. Most of it is free!

The links:

I’ve picked on a Dell PowerEdge R630 server, but realistically any 13G box would have the same requirements. Even older 11/12G boxes such as an R610 or R620 would. So first we start with the overview page for the R630 – remember to change that OS selection to “VMware ESXi v6.0”
http://www.dell.com/support/home/us/en/04/product-support/product/poweredge-r630/drivers

 

Dell iDRAC Service Module (VIB) for ESXi 6.0, v2.2
http://www.dell.com/support/home/us/en/04/Drivers/DriversDetails?driverId=2XHPY

You’re going to want to be able to talk to and manage the iDRAC from inside of ESXi, so get the VIB that allows you to do so. It installs via VUM incredibly easily.
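
If you’d rather skip VUM, the same VIB can be installed from the ESXi shell. This is a rough sketch, assuming you’ve copied the offline bundle to a datastore – the bundle filename and datastore name below are placeholders, so use whatever your download is actually called:

    # put the host in maintenance mode first
    esxcli system maintenanceMode set --enable true
    # install the iSM offline bundle (full path required; filename is an example only)
    esxcli software vib install -d /vmfs/volumes/datastore1/ISM-Dell-Web-2.2.0-offline_bundle.zip
    # reboot if the install output says a reboot is required

Note that esxcli wants the full path to the bundle, not a relative one.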

 

Dell OpenManage Server Administrator vSphere Installation Bundle (VIB) for ESXi 6.0, v8.2
http://www.dell.com/support/home/us/en/04/Drivers/DriversDetails?driverId=VV2P2

Next, you’ll want to be able to talk to OMSA on the ESXi host itself, to get health, management, inventory, and other features. Again, this installs with VUM.
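
Once both VIBs are pushed out (via VUM or esxcli), a quick sanity check from the ESXi shell confirms they actually landed – the exact VIB names may differ slightly between versions:

    # list installed VIBs and filter for the Dell bits
    esxcli software vib list | grep -i dell
    esxcli software vib list | grep -i openmanage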

 

OpenManage™ Integration for VMware vCenter, v3.0
http://www.dell.com/support/home/us/en/04/Drivers/DriversDetails?driverId=8V0JG

This will let vCenter present you with various tools to manage your Dell infrastructure right from within vCenter. It deploys as an OVF-based virtual appliance, so no separate server is required.
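
If you deploy a few of these, ovftool can push the appliance out instead of clicking through the deploy-OVF wizard. A sketch only – the OVF filename, appliance name, datastore, network, and vCenter inventory path are all placeholders for your environment:

    # deploy the OpenManage Integration appliance with ovftool (all names/paths are examples)
    ovftool --acceptAllEulas --name=OMIVV --datastore=datastore1 \
      --network="VM Network" Dell_OpenManage_Integration.ovf \
      vi://administrator@vcenter.lab.local/Datacenter/host/Cluster/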

 

VMware ESXi 6.0
http://www.dell.com/support/home/us/en/04/Drivers/DriversDetails?driverId=CG9FP 

Your customized ESXi installation ISO. Note the file name – VMware-VMvisor-Installer-6.0.0-2809209.x86_64-Dell_Customized-A02.iso – based on the build number 2809209, the A02 revision, and a quick Google search, you can see that this is v6.0.0b (https://www.vmware.com/support/vsphere6/doc/vsphere-esxi-600b-release-notes.html) rather than v6.0 U1.
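
If you want to double-check what an existing host is actually running, the version and build number are quick to confirm from the ESXi shell (or SSH):

    # both report the ESXi version and build number
    vmware -vl
    esxcli system version get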

 

Dell Systems Management Tools and Documentation DVD ISO, v.8.2
http://www.dell.com/support/home/us/en/04/Drivers/DriversDetails?driverId=4HHMH

You likely will not need this for a smaller installation, but it can help if you need to standardize, by allowing you to configure and export/import things like BIOS/UEFI, firmware, iDRAC, and LCC settings. Can’t hurt to have around.

 

There is no longer a need for the “SUU” – Systems Update Utility – as the Lifecycle Controller built into every iDRAC, even the Express, will allow you to do updates from that device. I recommend doing them via the network, as it is significantly less hassle than going through the Dell Repository Manager, downloading your copies to USB/ISO/DVD media, and doing it that way.

Now, the above covers what you’ll require for vSphere. What is NOT immediately obvious are the tools you may want to use in Windows. Even though you now have management capability on the hosts and can see things in vCenter, you’re still missing the ability to talk to devices and manage them from Windows – which is where I spend all of my actual time. Things like monitoring, control, management, etc., are all done from within Windows. So let’s go ahead and change that OS selection to “Windows Server 2012 R2” and get some additional tools:

 

Dell Lifecycle Controller Integration 3.1 for Microsoft System Center Configuration Manager 2012, 2012 SP1 and 2012 R2
http://www.dell.com/support/home/us/en/04/Drivers/DriversDetails?driverId=CKHYR

If you are a SCCM shop, you may very much want to be able to control the LCC via SCCM to handle hardware updates.

 

Dell OpenManage Server Administrator Managed Node(windows – 64 bit) v.8.2
http://www.dell.com/support/home/us/en/04/Drivers/DriversDetails?driverId=6J8T3

Even though you’ve installed the OMSA VIBs on ESXi, there is no actual web server there. So you’ll need to install the OMSA Web Server component somewhere – it could even be your workstation – and use that. You’ll then select “connect to remote node” and specify the target ESXi system and credentials.

 

Dell OpenManage Essentials 2.1.0
http://www.dell.com/support/home/us/en/04/Drivers/DriversDetails?driverId=JW22C

If you’re managing many Dell systems and not just servers, you may want to go with OME if you do not have SCCM or similar. It’s a pretty good 3rd party SNMP/WMI monitoring solution as well, but it will also allow you to handle remote updates of firmware, BIOS, settings, etc., on various systems – network, storage, client, thin client, and so on.

 

Dell OpenManage DRAC Tools, includes Racadm (64bit),v8.2
http://www.dell.com/support/home/us/en/04/Drivers/DriversDetails?driverId=9RMKR 

RACADM is a tool I’ve used before, and I have some links on how to use it remotely. This tool can greatly help you standardize your BIOS/iDRAC settings via a script.
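
As a taste of what remote RACADM looks like – the IPs and credentials below are obviously placeholders, and the exact config groups/objects vary a bit between iDRAC6 and iDRAC7:

    # pull the system inventory from a remote iDRAC
    racadm -r 192.168.10.120 -u root -p calvin getsysinfo
    # dump the iDRAC network settings (legacy group syntax, iDRAC6-era)
    racadm -r 192.168.10.120 -u root -p calvin getconfig -g cfgLanNetworking
    # newer get/set syntax on iDRAC7 and later
    racadm -r 192.168.10.120 -u root -p calvin set iDRAC.NIC.DNSRacName esx01-idrac

Wrap those in a loop over your host list and you’ve got the start of a standardization script.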

 

Dell Repository Manager, v2.1
http://www.dell.com/support/home/us/en/04/Drivers/DriversDetails?driverId=2RXX2

As mentioned above, the Repository Manager is a tool you can use to download only the updates required for your systems. Think of it like WSUS (ish).

 

Dell License Manager
http://www.dell.com/support/home/us/en/04/Drivers/DriversDetails?driverId=68RMC

The iDRAC hardware is the same on every system; it is only the licence that changes. To apply the Enterprise licence, you’ll need the License Manager.
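
Licences can be applied through the License Manager GUI, or – if memory serves – through racadm as well. Treat the below as a sketch (the IP, credentials, filename, and device FQDD are examples) and check “racadm help license” for the exact syntax on your iDRAC generation:

    # show the currently installed iDRAC licence(s)
    racadm -r 192.168.10.120 -u root -p calvin license view
    # import an Enterprise licence file
    racadm -r 192.168.10.120 -u root -p calvin license import -f idrac_enterprise.xml -c idrac.embedded.1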

 

Hopefully this will help someone keep their Dell environment up to date. Note that I have NOT called out any Dell Storage items such as the MD3xxx, Equallogic PSxxxx, or Compellent SCxxxx products. If I tried to, the list would be significantly longer. Also worth noting is that some vendors’ _networking_ products have similar add-ins, so don’t forget to look for those as well.


Modifying the Dell C6100 for 10GbE Mezz Cards

June 11, 2015

In a previous post, Got 10GbE working in the lab – first good results, I talked about getting 10GbE working with my Dell C6100 series.  Recently, a commenter asked me if I had any pictures of the modifications I had to make to the rear panel to make these 10GbE cards work.  As I have another C6100 I recently acquired (yes, I have a problem…) that needs the mods, it seems only prudent to share the steps I took in case it helps someone else.

First a little discussion about what you need:

  • Dell C6100 sled without the removable rear panel plate (i.e. the kind that requires cutting)
  • Dell X53DF/TCK99 2 Port 10GbE Intel 82599 SFP+ Adapter
  • Dell HH4P1 PCI-E Bridge Card

You may find the Mezz card under either part number – it seems that the X53DF replaced the TCK99.  Perhaps one is the P/N and one is the FRU or some such.  But you NEED that little PCI-E bridge card.  It is usually included, but pay special attention to the listing to ensure it’s included.  What you DON’T really need is the mesh back plate on the card – you can get it bare.

2015-06-11 21.18.13 / 2015-06-11 21.17.46

Shown above are the 2pt 10GbE SFP+ card in question, and also the 2pt 40GbE Infiniband card.  Above them both is the small PCI-E bridge card.

2015-06-11 21.19.24

You want to remove the two screws to remove the backing plate on the card.  You won’t be needing it, and you can set it aside.  The screws attach through the card and into the bracket, so once removed, reinsert the screws to the bracket to keep from losing them.

2015-06-11 21.17.14

Here we can see the back panel of the C6100 sled.  Ready to go for cutting.

2015-06-11 21.22.23 / 2015-06-11 21.24.48

You can place the factory backing plate (the one you removed from the card) over the sled’s back panel as a template.  Here you can see where you need to line it up and mark the cuts you’ll be doing.  Note that of course the bracket will sit higher up on the unit, so you’ll have to adjust for your horizontal lines.

2015-06-11 21.23.09 / 2015-06-11 21.22.49

If we look to the left, we can see the source of the problem that causes us to have to do this work.  The back panel here is not removable, and wraps around the left corner of the unit.  In systems with the removable plate, this simply unscrews and the panel attached to the card slots in.  On the right-hand side you can see the two screws that would attach the panel and card in that case.

2015-06-11 21.35.38

Here’s largely what we get once we complete the cuts.  Perhaps you’re better with a Dremel than I am.  Note that the vertical cuts can be tough depending on the size of the cutting disc you have, as there may be interference from the sled release bar.

2015-06-11 21.36.16 / 2015-06-11 21.36.20 / 2015-06-11 21.36.28

You can now attach the PCI-E bridge card to the Mezz card, and slot it in.  I found it easiest to come in at about a 20 degree angle and slot the 2 ports into the cut-outs, then drop the PCI-E bridge into the slot.  When it’s all said and done, you’ll find it pretty secure and good to go.

That’s really about it.  Not a whole lot to it, and if you have it all in hand, you’d figure it out pretty quickly.  This is largely to help show where my cut lines ended up compared to the actual cuts, and where adjustments could be made to make the cuts tighter if you wanted.  Also, if you’re planning to order, but are not sure if it works or is possible, then this should help out quite a bit.

Some potential vendors I’ve had luck with:

http://www.ebay.com/itm/DELL-X53DF-10GbE-DUAL-PORT-MEZZANINE-CARD-TCK99-POWEREDGE-C6100-C6105-C6220-/181751541002? – accepted $60 USD offer.

http://www.ebay.com/itm/DELL-X53DF-DUAL-PORT-10GE-MEZZANINE-TCK99-C6105-C6220-/181751288032?pt=LH_DefaultDomain_0&hash=item2a513890e0 – currently lists for $54 USD, I’m sure you could get them for $50 without too much negotiating.

Categories: C6100, Dell, Hardware, Home Lab

HOWTO: Migrate RAID types on an Equallogic array

October 17, 2014

I’ve run into a situation where I need to change RAID types on an Equallogic PS4100 in order to provide some much needed free space.  Equallogic supports on-the-fly migration as long as you follow a supported migration path:

clip_image002

  • RAID 10 can be changed to RAID50 or RAID6
  • RAID 50 can be changed to RAID6
  • RAID 6 cannot be converted.

By changing from RAID50 to RAID6 on a 12x600GB SAS unit, we can go from 4.1TB to 4.7TB, which will help get some free space and provide some extra life to this environment. 
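
For reference, the same change can also be kicked off from the Group Manager CLI over SSH – roughly like the below, from memory, so double-check with the CLI’s built-in help on your firmware before running it.  The member name is an example.  The GUI walkthrough follows.

    # ssh grpadmin@<group IP>, then:
    GrpName> member select MEMBER01 raid-policy raid6
    GrpName> member select MEMBER01 show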

1) Login to the array, and click on the MEMBER, then MODIFY RAID CONFIGURATION:

clip_image004

Note that the current RAID configuration is shown as “RAID 50” and STATUS=OK.

2) Select the new RAID Policy of RAID6:

clip_image006

Note the change in space – from 4.18TB to 4.69TB, and a net change of 524.38GB, or about 12% extra space.  Click OK.

3) During the conversion, the new space is not available – which should be expected:

clip_image008

After the conversion, the space will be available.  Until then, the array status will show as “expanding”, as indicated.  Click OK.

4) You can watch the status and see that the RAID Status does indeed show “expanding” and a PROGRESS of 0%:

clip_image010

After about 7 hours, we’re at 32% complete.  Obviously this is going to depend on the amount of data, size of disks, load on the array, etc.  But we can safely assume this will take at least 24 hours to complete. 

5) When the process completes, you will see that the RAID Status is OK as well as the MEMBER SPACE area will show free space:

clip_image012

Understandably, you now need to use this space.  It won’t be automatically applied to your existing volumes/LUNs, so you’re left with two obvious choices – grow an existing volume or create a net-new one.  As creating a net-new volume is well understood, I’ll demonstrate how to grow an existing one.

6) On the bottom left of the interface, select VOLUMES:

clip_image014

Then in the upper left, expand the volumes:

clip_image016

Select the volume you wish to grow.  I’ll choose EQVMFS1.

clip_image018

Click MODIFY SETTINGS and then the SPACE tab.  Change the volume size accordingly.  It does indicate what the maximum (1.34TB) can be.  I would highly recommend you reserve at least some small portion of space – if you ever completely fill a volume, you may need to grow it slightly to even be able to mount it.  Even if small, always leave an escape route. 

Click OK.

clip_image020

You are warned to create a snapshot first.  As these volumes are empty, we won’t be needing to do this.  Click NO.

clip_image022

Note the volume size now reports as 1.3TB.
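
For what it’s worth, the same resize from the Group Manager CLI is roughly a one-liner – the volume name and size here are from this example, and expect a confirmation prompt about snapshot reserve:

    # resize an existing volume (run from the group CLI over SSH)
    GrpName> volume select EQVMFS1 size 1300GB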

7) Next, we go to vSphere to grow the volume. 

Right click on the CLUSTER and choose RESCAN FOR DATASTORES:

clip_image024

Next, once that completes (watch the Recent Tasks panel), select a host with the volume mounted and go to the CONFIGURATION -> STORAGE tab.  Right click on the volume and choose PROPERTIES.

clip_image026

8) Click INCREASE on the next window:

clip_image028

Then select the LUN in question:

clip_image030

NOTE that in this example, I’m upgrading a VMFS3 volume.  It will ultimately be blown away and recreated as VMFS5.  But if you are doing this, you will see warnings if you try to grow above 2TB, as it indicates.  Click NEXT.

clip_image032

Here we can see the existing 840GB VMFS as well as the new Free Space of 491GB.  Click NEXT.

clip_image034

Choose the block size, if it allows you.  Again, this is something you won’t see on a VMFS5 datastore.  Click NEXT and then FINISH.

9) As this is a clustered volume, once complete, it will automatically trigger a rescan on all the remaining cluster hosts to pick up the change:

clip_image036

You don’t have to do anything for this to happen. 
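
If you’d rather confirm from the ESXi shell than trust the GUI, a rescan and a capacity check on a host look something like this:

    # rescan all storage adapters on this host
    esxcli storage core adapter rescan --all
    # list VMFS datastores with their size and free space
    esxcli storage filesystem list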

And that’s really about it.  You have now expanded the RAID group on the Equallogic, and added the space to an existing volume.  Some caveats of course to mention at this point:

  • Changing RAID types will likely alter your data protection and performance expectations.  Be sure you have planned for this.
  • As noted before, once you go RAID6 you can’t go anywhere from there without an offload and complete rebuild of the array.
  • If you hit the wall, and got back ~ 10%, this is your breathing room.  You should be evaluating space reclamation tactics, new arrays, etc.  This only gets you out of today’s jam.
Categories: Dell, Equallogic, ISCSI, Storage, vSphere

Design Exercise–Scaling Up–Real World Example

October 13, 2014

My previous post on Design Exercise- Scaling up vs Scaling out appeared to be quite popular. A friend of mine recently told me of an environment, and while I have only rough details of it, it gives me enough to make a practical example of a real world environment – which I figured might be fun. He indicated that while we’d talked about the ideas in my post for years, it wasn’t until this particular environment that it really hit home.

Here are the highlights of the current environment:

  • Various versions of vSphere – v3.5, v4.x, v5.x, multiple vCenters
  • 66 hosts – let’s assume dual six core Intel 55xx/56xx (Nehalem/Westmere) CPUs
  • A quick tally suggests 48GB of RAM per host.
  • These hosts are blades, likely HP. 16 Blades per chassis, so at least 4 chassis. For the sake of argument, let’s SAY it’s 64 hosts, just to keep it nice and easy.
  • Unknown networking, but probably 2x 10GbE, and 2x 4Gbit/FC, with passthru modules

It might be something very much like the listing below – in which case, it might be dual 6 core CPUs, and likely only using 1GbE on the front side. This is probably a reasonable enough assumption for this example, especially since I’m not trying to be exact and want to keep it theoretical.

http://www.ebay.ca/itm/HP-c7000-Blade-Chassis-16x-BL460c-G6-2x-6-C-2-66GHz-48GB-2x-146GB-2x-Gbe2c-2x-FC-/221303055238?pt=COMP_EN_Servers&hash=item3386b0a386

I’ve used the HP Power Advisor (http://www8.hp.com/ca/en/products/servers/solutions.html?compURI=1439951#.VDnvBfldV8E) to determine the power load for a similarly configured system with the following facts:

  • 5300 VA
  • 18,000 BTU
  • 26 Amps
  • 5200 Watts total
  • 2800 Watts idle
  • 6200 Watts circuit sizing
  • 6x 208V/20A C19 power outlets
    clip_image001

We’ll get to that part later on. For now, let’s just talk about the hosts and the sizing.

Next, we need to come up with some assumptions.

  • The hosts are likely running at 90% memory and 30% CPU, based on examples I’ve seen. Somewhere in the realm of 2764GB of RAM and 230 Cores.
  • The hosts are running 2 sockets of vSphere Enterprise Plus, with SnS – so we have 128 sockets of licences. There will be no theoretical savings on net-new licences as they’re already owned – but we might save money on SnS. There is no under-licencing that we’re trying to top up.
  • vSphere Enterprise Plus we’ll assume to be ~ $3500 CAD/socket and 20% for SnS or about $700/year/socket.
  • The hosts are probably not licenced for Windows Data Center, given the density – but who knows. Again, we’re assuming the licences are owned, so no net-new savings but there might be on Software Assurance.
  • We’re using at least 40U of space, or a full rack for the 4 chassis
  • We’re using 20,800 Watts, or roughly 21 kW
  • While the original chassis are likely FC, let’s assume for the moment that it’s 10GbE iSCSI or NFS.

Now, let’s talk about how we can replace this all – and where the money will come from.

I just configured some Dell R630 1U Rack servers. I’ve used two different memory densities to deal with some cost assumptions. The general and common settings are:

  • Dell R630 1U Rack server
  • 2x 750 Watt Power Supply
  • 1x 250GB SATA – just to have “a disk”
  • 10 disk 2.5” chassis – we won’t be using local disks though.
  • 1x PERC H730 – we don’t need it, but we’ll have it in case we add disks later.
  • Dual SD module
  • 4x Emulex 10GbE CNA on board
  • 2x E5-2695 v3 2.3GHz 14C/28T CPU’s

With memory we get the following numbers:

  • 24x 32GB for 768GB total – $39.5K Web Price, assume a 35% discount = $26K
  • 24x 16GB for 384GB total – $23.5K Web Price, assume a 35% discount = $15.5K

The first thing we want to figure out is whether the memory density is cost effective. We know that 2x of the 384GB configs would come to $31K, or about $5K more than a single 768GB server. So even without bothering to factor in licencing costs, we know the denser config is cheaper. If you had to double up on vSphere, Windows Data Center, Veeam, vCOPS, etc., then it gets worse. So very quickly we can make the justification to only include the 768GB configurations. That’s out of the way. However, it also tells us that if we need more density, we do have some wiggle room to spend more on better CPUs with more cores/speed – we can realistically spend roughly $2.5K/CPU more and still come out the same as doubling the hosts with half the RAM.

Now how many will we need? We know from above “somewhere in the realm of 2764GB of RAM and 230 cores”. 230 cores / 28 cores per server means we need at least 8.2 hosts – we’ll assume 9. 2764GB of RAM only requires 3.6 hosts. But we also need to assume we’ll need room for growth. Based on these numbers, let’s work with the understanding we’ll want at least 10 hosts to give us some overhead on the CPUs, and room for growth. If we’re wrong, we have lots of spare room for labs, DEV/TEST, finally building redundancy, expanding poorly performing VMs, etc. No harm in that. This makes the math fairly easy as well:

  • $260K – 10x Dell R630’s with 768GB
  • $0 – licence savings from buying net new

We’ve now cost the company $260K, and so far haven’t shown any savings or justification. Even just based on hardware refresh and lifecycle costs, this is probably a doable number. This is $7.2K/month over 36 months.

What if we could get some of that money back? Let’s find some change in the cushions.

  • Licence SnS savings. We know we only need 20 sockets now to licence 10 hosts, so we can potentially let the other 108 sockets lapse. At $700/socket/year this results in a savings of $75,600 per year, or $227K over 36 months. This is 87% of our purchase cost for the new equipment. We only need to find $33K now
  • Power savings.
    clip_image002
    The Dell Energy Smart Solution Advisor (http://essa.us.dell.com/dellstaronline/Launch.aspx/ESSA?c=us&l=en&s=corp) suggests that each server will require 456 Watts, 2.1 Amps and 1600 BTU of cooling. So our two solutions look like this:
    clip_image003
    I pay $0.085/kWh here, so I’ll use that number. In the co-location facilities I’m familiar with, you’re charged per power whip, not usage. But as this environment is on site, we can assume they’re being charged only for what they use.
    We’ve now saved another $1K/month or $36K over 36 months. We have saved $263K on a $260K purchase. How am I doing so far?

  • Rack space – we’re down from 40U to 10U of space. Probably no cost savings here, but we can reuse the space
  • Operational Maintenance – we are now doing firmware, patching, upgrades, host configuration, etc., across 10 systems vs 64. Regardless of whether that time accounts for 1 or 12 hours per year per server, we are now doing ~ 84% less work. Perhaps now we’ll find the time to actually DO that maintenance.

So based on nothing more than power and licence *maintenance*, we’ve managed to recover all the costs. We also have drastically consolidated our environment, and we can likely “finally” get around to migrating all the VMs into a single vSphere v5.5+ environment and getting rid of the v3.5/v4.x/etc. mixed configuration that likely was left that way due to “lack of time and effort”.

We also need to consider the “other” ancillary things we’re likely forgetting as benefits. Every one of these things that a site of this size might have represents a potential savings – either in net-new or maintenance:

  • vCloud Suite vs vSphere
  • vCOPS
  • Veeam or some other backup product, per socket/host
  • Window Server Data Center
  • SQL Server Enterprise
  • PernixData host based cache acceleration
  • PCIe/2.5” SSD’s for said caching

Maybe the site already has all of these things. Maybe they’re looking at it for next year’s budget. If they have it, they can’t reduce their licences, but could drop their SnS/Maintenance. If they’re planning for it, they now need 84% less licencing. My friends in sales for these vendors won’t like me very much for this, I’m sure, but they’d also be happy to have the solution be sellable and implemented and a success story – which is always easier when you don’t need as many.

I always like to provide more for less. The costs are already a wash, what else could we provide? Perhaps this site doesn’t have a DR site. Here’s an option to make that plausible:

  • $260K – 10x R630’s for the DR site
  • $0K – 20 sockets of vSphere Enterprise – we’ll just reuse some of the surplus licencing. We will need to keep paying SnS though.
  • $15K – 20 sockets of vSphere Enterprise SnS
  • $40K – Pair of Nexus 5548 switches? Been a while since I looked at pricing
    Spend $300K and you have most of a DR environment – at least the big part. You still have no storage, power, racks, etc. But you’re far closer. This is a much better use of the same original dollars. The reason for this part of the example is because of the existing licences and we’re not doing net-new. The question of course from the bean-counters will be “so what are we going to do, just throw them away???”

Oh. Right. I totally forgot. Resale :)

http://www.ebay.ca/itm/HP-C7000-Blade-Enclosure-16xBL460C-G6-Blades-2xSix-Core-2-66GHZ-X5650-64GB-600GB-/271584371114?pt=COMP_EN_Servers&hash=item3f3bb0a1aa

There aren’t many C7000/BL460C listed as “Sold” on eBay, but the above one sold for ~ $20K Canadian. Let’s assume you chose to sell the equipment to a VAR that specializes in refurbishing – they’re likely to provide you with 50% of that value. That’s another $10K/chassis, or $40K for the 4 chassis.

As I do my re-read of the above, I realize something. We need 9 hosts to meet CPU requirements, but we’d end up with 7680GB of RAM where we only really require 2764GB today. Dropping each host to 16x 32GB (512GB) brings the cost down to ~ $31K Web Price, or $20K with the 35% discount. At a savings of $6K/server, we’d end up with 5120GB of RAM – just about double what we use today, so lots of room for scale up. We’ll save another $60K today. In the event that we ever require that capacity, we can easily purchase the 8x 32GB per host at a later date – and likely at a discount, as prices drop over time. However – often the original discount is not applied to parts and accessory pricing for a smaller deal, so consider whether it actually is a savings. How would you like a free SAN? :) Or 10 weeks of training @ $6K each? I assume you have people on your team who could benefit from some training? Better tools? Spend your money BETTER! Better yet, spend the money you’re entrusted to be the steward of better – it’s not your money, treat it with respect.

A re-summary of the numbers:

  • +$200K – 10x R630’s with 512GB today
  • +$0K – net-new licencing for vSphere Enterprise Plus
  • -$227K – 108 sockets of vSphere SnS we can drop, over 3 years.
  • -$36K – Power savings over 3 years
  • -$40K – Resale of the original equipment

Total: $103K to the good.

 

Footnote: I came back thinking about power.  The Co-Location facility I’ve dealt with charges roughly:

  • $2000/month for a pair of 208V/30A circuits
  • $400/month for a pair of 110V/15A circuits
  • $Unknown for a pair of 20A circuits, unfortunately.

I got to thinking about what this environment would need – but also what it has.  In my past, I’ve seen a single IBM Blade Center chassis using 4x 208V/30A circuits, even if it could have been divided up better.  So let’s assume the same inefficiency was done here.  Each HP C-Series chassis at 25.4A would require 3x Pairs, or 12x Pairs for the total configuration – somewhere in the area of $24,000/month in power.  Yikes!  Should it be less?  Absolutely.  But it likely isn’t, based on the horrible things I’ve seen – probably people building as though they’re charged by usage and not by drop.

The 10x rack servers, if I switch them to 110V vs 208V, indicate they need 3.5A each – which is across both circuits.  This I think is at max, but let’s be fair and say you wouldn’t put more than 3x (10.5A) on a 15A circuit.  So you need 4x $400 pairs, for $1600/month in power.  Alternatively, you could put them all on a 208V/30A pair for 21A total, for $2000/month.  If you could, this would be the better option, as it lets you use only one pair of PDUs, and you have surplus for putting in extra growth, top-of-rack switching, etc. 

So potentially, you’re also going to go from $24K to $2K/month in power.  For the sake of argument, let’s assume I’m way wrong on the blades and they’re using half the power, or $12K.  You’re still saving $10K/month – or $360K over 36 months.  Did you want a free SAN for your DR site maybe?  Just don’t also count the earlier usage-based power numbers, or you’re double dipping on your savings. 

(New) Total: $427K to the good – AFTER getting your new equipment. 

Hi.  I just saved you half a million bucks.

Categories: Dell, Design, Hardware, VMware

Design Exercise-Fixing Old or Mismatched Clusters

October 2, 2014

In two previous posts, I talked about some design examples I’ve seen:

Design Exercise – Scaling up vs Scaling out

Design Exercise – DR Reuse

Today I’m going to talk about the “No problem, we’ll just add a host” problem.  But not in the “one more of the same” scenario, instead a “we can’t get those any longer, so we’ll add something COMPLETELY different” scenario.

Regardless of whether the current site is something like previously described with matching systems (eg: 4x Dell PE2950’s) or random systems, often when capacity runs out, budget is likely low, and so the discussion comes up to “just add a host”. But as we know from previous examples, adding additional hosts costs money for not only hardware, but licences. I have two different example sites to talk about:

Example 1:

  • 4x Dell PE2950, 2x 4 Core, 32GB RAM, 4x 1GBE hosts

Example 2:

  • 1x Dell T300, 1x 4 Core, 32GB RAM, 2x 1GbE
  • 1x HP DL380 G6, 2x 4 Core, 64GB RAM, 4x 1GbE
  • 1x Dell R610, 2x 6 Core, 96GB RAM, 4x 1GbE

In both cases, we’ll assume that the licencing won’t change as we’re not going to discuss actually adding any hosts, so all software/port counts remain the same.

As you can see, neither environment is particularly good. They’re both old, but Example 2 is horribly mismatched. DRS is going to have a hell of a time finding proper VM slots to use, the capacity is mismatched, and nothing is uniform. The options to fix this all involve investing good money after bad. But often an environment that is this old or mismatched likely ended up this way due to lack of funds. We can talk about proper planning and budgeting until we’re blue in the face, but what we need to do right now is fix the problem. So let’s assume that even if we could add or replace one of the hosts with something more current, like your $7000 R620 with 2x 6 Core and 128GB, this is not in budget. Certainly, 3-4 of them is not, and certainly not the bigger/better systems at $10K+.

So what if we go used? Ah, I can hear it now, the collective rants of a thousand internet voices. “But we can’t go used, it’s old and it might fail, and it’s past its prime”. Perhaps – but look at what the environments currently are. Plus, if someone had something ‘newer’ that they’d owned for 2 years into a 3-5 year warranty, it would be “used” as well, no? Also, excepting complete and spontaneous host failures, virtualization and redundancy gives us a lot of ways to mitigate actual hardware failures. Failing network ports, power supplies, fans, etc., will all trigger a Host Health alert. This can be used to automatically place the host in Maintenance Mode, have DRS evacuate it, and send you an e-mail. So yes, a part may fail, but we build _expecting_ that to be true.

Now assume that the $7000 option for a new host *IS* in budget. What could we do instead? We certainly don’t want to add a single $7000 host to the equation, for all the reasons noted. Now we look into what we can do with off-lease equipment. This is where being a home-labber has its strengths – we already know what hardware is reliable and plentiful, and still new enough to be good and not quite old enough to be a risk.

What if I told you that for about $1500 CAD landed, you could get an off-lease Dell R610 with 2x 6 Core CPUs and 128GB of RAM?

Example 1 can now, for around $6000 CAD, replace all 4 hosts with something newer, that will have 16 more cores, and 4x the RAM. It’s not going to be anywhere near the solution from the other day with the 384GB hosts – but it’s also not going to be $40K in servers. Oh, plus 8U to 4U, power savings, etc.

Example 2 is able to replace those first 2 hosts and standardize, for around $3000.

In either case, they’re still “older” servers. A Dell R610 is circa 2009-2012, so you’re still looking at a 2-5 year old server at this point – which might be a little long in the tooth. But if the power is enough for you, and you’re just trying to add some capacity and get out of the “scary old” zone, it might not be so bad. Heck, either of these sites is likely going to be very happy with the upgrades. Questions will need to be answered, such as:

  • Lifespan – how long are we expecting these servers to be a solution for? Till the end of next calendar year or about 16-18 months? That’s fine.
  • Budget – are we doing this because we have run out of budget for this year but *NEED* “something”? Has next year’s budget been locked away and this was ‘missed’, but you still need ‘something’?  If we assume these are 18 month solutions, to get us from now (Oct 2014) to “after next budget year” (Jan 2016), then Example 1 is $333/month and Example 2 is $167/month. Money may be tight, but that’s a pretty affordable way of pushing off the reaper.  Heck, I know people with bigger cell phone bills.
  • Warranty – these may or may not come with OEM warranty. Are you okay with that? Maybe what makes the most sense is just to pick up an extra unit for “self-warranty” – it is almost certainly still cheaper than extending the OEM warranty. Remember though, OEM support also helps troubleshoot weird issues and software incompatibilities, etc. Self-warrantying just gets you hard parts that you can swap – if you have time and energy to do so. Check if the secondary market reseller will offer next day parts; that may be sufficient for you. Also, check if the vendor of the hardware you’re choosing will allow you to download software updates (eg: management software, firmware, BIOS, etc.) without a service contract. Dell, at this point, still does, which is why I like them (for customers and my lab).  Oh, an advantage of the extra unit for “self-warranty”?  You can use it for Dev/Test, learning, testing things you want to try, validating hardware configurations, swapping parts for testing suspected issues, etc.
  • Other Priorities – do you need to spend the same money you’d spend on new hosts, elsewhere? Maybe you need a faster SAN today, because you’re out of capacity as well, and you have to make a choice. You can fix it next year, but you can’t fix both at once, regardless of effort or good intentions. Maybe you want to go to 10GbE switches today in preparation. Perhaps you want to spend the same money on training, so that your staff can “do more with less” and have “smarter people” instead of “more thingies, with no one to run them”.

I fully realize that off-lease, eBay, secondary market is going to throw up automatic “no’s” for a lot of people. Also, many management teams will simply say no. Some will have an aversion to “buying from eBay” – fine, call the vendor from their eBay auction, get a custom quote with a PO directly, and buy it just like you would from any other VAR. The point of the matter is, you have options, even if you’re cash strapped.

BTW, if anyone was thinking “why not just get R620’s”, which are newer, you certainly could – http://www.ebay.ca/itm/DELL-POWEREDGE-R620-2-x-SIX-CORE-E5-2620-2-0GHz-128GB-NO-HDD-RAILS-/111402343301?pt=COMP_EN_Servers&hash=item19f018db85. One can get an R620, 2x 6 Core E5-2620, 128GB RAM (16x8GB almost certainly, but 24 DIMM slots), 4x 1GbE, iDRAC, etc., for about $3000. This would give you more room to grow and is newer equipment, but it starts getting much closer to the $7000 configuration direct from Dell with 3 year warranty, 10GbE ports, etc. Still, 4x $3K is much less than 4x $7K, and $16,000 is a lot of money you could spend on something else. Just watch that you’re not paying so close to retail that it’s no longer worth it.

The trick, coming from a home-lab guy, is to be “just old enough to not be worth any money to someone else” but “just new enough to still be really useful, if you know what you’re doing.”

Also, consider these options for the future.  Remember that ROI involves a sale.  Let’s say you purchased the brand new $7000 servers and made it 5 year warranty vs 3 year for… 20% more or about $8500.  You’re almost certainly not going to use it for 5 years.  But in 2.5 years, when you want to put that server on the secondary market, and it still has 2+ years of OEM warranty left – you’re going to find it has significantly more resale value.

This is no different than leasing the ‘right’ car with the ‘right options’, because you know it’ll have a higher resale value at the end of the lease.  If you’re the kind of person that would never “buy new, off the lot” and would always buy a “1-2 year old lease-return, so someone else can pay the depreciation” – this solution is for you.

If in one scenario you haul the unit away to recycling (please, call me, I offer this service for free :) ), and in another you sell the equipment to a VAR for $2000/unit that you can use as credit on your next purchase or services…

Categories: Dell, Design, Hardware, VMware, vSphere

Design Exercise–DR or Dev/Test Re-use

October 1, 2014

In a previous post, I recently discussed some of the benefits of Scaling Up vs Scaling Out (https://vnetwise.wordpress.com/2014/09/28/design-exercise-scaling-up-vs-scaling-out/) and how you can save money by going big. In that example, the site already had 4 existing hosts, wanted 5 new ones, but settled on 3. We can all guess of course what the next thing to get discussed was, I’m sure…

“So let’s reuse the old 4 hosts, because we have them, and use them… for DR or a DEV/TEST environment.” It should be no surprise that “because we have them” is a pretty powerful sell. Let’s talk about how that might actually cost you considerably more money than you should be willing to spend.

Just as quick reminder, a summary of the hardware and configurations in question:

OLD HOSTS: Dell PowerEdge 2950 2U, 2x E5440 2.8GHz 4 Core CPU, 32GB DDR2

NEW HOSTS: Dell PowerEdge R620 1U, 2x E5 2630L 2.4GHz 6 Core CPU, 384GB DDR3

1) Licencing

Our example assumes that the site needed new licencing for the new hardware – either it didn’t have any, it expired, it was the wrong versions, who knows. So if you reutilize those 4 hosts, you’re going to need 4-8 licences for everything. Assuming the same licence types and versions (eg: vSphere Enterprise Plus, Windows Server Data Center, Veeam Enterprise, etc. ) that works out to be:

  • 4x $0 hosts as above = $0
  • 8x Sockets of vSphere Enterprise Plus @ ~ $3500/each with SnS = $28,000
  • 4x Windows Server Data Center licences @ ~ $5000/each = $20,000
  • 8x Sockets of Veeam B&R licences @ ~ $1000/each = $8,000

Total Cost = $56,000

Total Resources = 89GHz CPU, 128GB RAM

That’s a lot of licencing costs, for such little capacity to actually run things.

2) Capacity

That’s only 128GB of RAM to run everything, and 96GB when taking into account N+1 maintenance. Even if it IS DEV/TEST or DR, you’ll still need to do maintenance. These particular servers COULD go to 64GB each, using 8GB DIMM’s, but they’re expensive and not really practical to consider.

3) Connectivity

Let’s assume part of why you were doing this, is to get rid of 1GbE in your racks. Maybe they’re old. Maybe they’re flaky. Maybe you just don’t want to support them. In either event, let’s assume you “need” 10GbE on them, if for no other reason but so that your Dev/Test *actually* looks and behaves like Production. No one wants to figure out how to do things in Dev with 12x1GbE and then try to reproduce it in Prod with 4x10GbE and assume it’s all the same. So you’ll need:

  • 8x 2pt 10GbE PCIe NICs @ $500 each = $4000
  • 8x TwinAx SFP+ cables @ $50 each = $400

We’ve now paid $4400 to upgrade our hosts to be able to use the same 10GbE infrastructure we were using for Prod. For servers that are worth maybe $250 on Kijiji or eBay (http://www.ebay.ca/itm/Dell-Poweredge-III-2950-Server-Dual-Quad-Core-2-83GHz-RAID-8-Cores-64Bit-VT-SAS-/130938124237?pt=COMP_EN_Servers&hash=item1e7c8537cd). Not the best investment.

4) Real Estate / Infrastructure

Re-using these existing hosts means 8U of space, probably 2x the power required, and likely internal RAID and disks that are just burning up power and cooling.

A quick summary shows that we’ve now spent somewhere in the area of $60,000 to “save money” by reusing our old hardware. This will take up 8U of rack space, probably consume 1600W of power, and we’re investing new hardware in very old equipment.

But what if we did something similar to what we did with the primary Prod cluster, and just bought… 2 more of the bigger new hosts?

2x Dev/Test Hosts @ 384GB:

  • 2x $11,500 hosts as above = $23,000
  • 4x Sockets of vSphere Enterprise Plus @ ~ $3500/each with SnS = $14,000
  • 2x Windows Server Data Center licences @ ~ $5000/each = $10,000
  • 4x Sockets of Veeam B&R licences @ ~ $1000/each = $4,000
  • 8x SFP+ TwinAx cables @ ~ $50/each = $400

Total Cost = $51,400

Total Resources = 57.6GHz CPU, 768GB RAM

Compared to:

Total Resources = 89GHz CPU, 128GB RAM

So we’ve now spent only $51,400 vs $60,000, and ended up with 6x the capacity on brand new, in-warranty, modern hardware. The hardware is 100% identical to Prod. If we need or want to do any sort of testing in advance – vSphere patches, firmware upgrades, hardware configuration changes – we can now do so in Dev/Test and 100% validate that it will behave EXACTLY the same way in Prod, as it IS in fact identical. All of your training and product knowledge will also be the same, as you don’t have to consider variances in generations of hardware. We’re also going to use 2U and probably 600W of power vs 8U and 1600W.

If this is all in one site, and being used as Dev/Test you have a couple of ways you could set this up. We’re assuming this is all on the same SAN/storage, so we’re not creating 100% segregated environments. Also, the 10GbE switching will also be shared. So do you make a 3 node Prod and a 2 node Dev/Test cluster? Or do you make a 5 node cluster with a Prod and Dev/Test Resource Pool and use NIOC/SIOC to handle performance issues?

If this is for a second site, to potentially be used as DR, we’ve now saved $30K on the original solution and $8K on the solution we’re discussing now. This is $38K that you could spend on supporting infrastructure for your DR site – eg: 10GbE switching and SAN’s, which we haven’t accounted for at all. Granted $38K doesn’t buy a lot of that equipment – but it SURE is better than starting at $0. You just got handed a $40K coupon.

So, when you feel the urge to ask “but what should we do with this old hardware, can’t we do anything with it?” – the answer is “Yes, we can throw it away”. You’ll save money all day long. Give it to the keeners in your environment who want a home lab and let them learn and explore. If you really have no one interested… drop me a line. I ALWAYS have room in my lab or know someone looking. I’ll put it to use somewhere in the community.

Categories: Dell, Design, Hardware, VMware, vSphere

HOWTO: Fix Dell Lifecycle Controller Update issues

September 29, 2014

Let’s say you’re in the middle of upgrading some Dell 11G hosts. They all have iDRAC 6 and Lifecycle Controllers, and you’ve downloaded the latest SUU DVD for this quarter. Then you want to update everything. You reboot the host, you press F10 to enter the LCC, you tell it to use the Virtual Media mounted SUU DVD ISO that it recognizes, it finds your updates, and you say go… only to get this:

clip_image001

Uh. So who authorizes them? Because this is from a Dell SUU DVD, that’s about as official as it gets.

Turns out, I’m not the first person to have this problem, though it’s an older issue:

http://www.sysarchitects.com/solved-updates-you-are-trying-apply-are-not-dell-authorized-updates

http://en.community.dell.com/support-forums/servers/f/177/t/19475476

http://frednotes.wordpress.com/2012/11/21/the-updates-you-are-trying-to-apply-are-not-dell-authorized-updates/

It looks like the issue is that the LCC is at 1.4.0.586 currently – and needs to be 1.5.2 or better. 1.6.5.12 is current as of my SUU DVD, as you can see above. The other problem is that Dell provides updates “in OS” for Linux and Windows – which doesn’t really help ESXi hosts at all. It seems the solution for this is an “OMSA Live CD”, which I’d never heard of until today. This can be found at: http://linux.dell.com/files/openmanage-contributions/om74-firmware-live/ and really good instructions on its use are at: http://en.community.dell.com/techcenter/b/techcenter/archive/2014/03/20/centos-based-firmware-images-with-om-7-4-with-pxe

Now, the other alternative would have been to mount the SUU ISO as a Virtual Media ISO and boot from it. But for whatever reason, this isn’t working – after selecting it, it just boots the HDD. I’m assuming this is because the firmware on the iDRAC/LCC is too old and having some issues booting the ISO. That’s fine. I didn’t troubleshoot it too much after it failed 3 times in a row. I dislike hardware reboots that take 10 minutes, which is why I like VMs, so I went looking for an alternative solution, and was happy with it.
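
One other avenue worth mentioning, though I didn’t end up needing it: racadm can attach an ISO from a CIFS/NFS share as remote media, which sometimes behaves better than the Java virtual media plugin. The addresses, paths, and credentials below are placeholders, and the exact flags vary by iDRAC generation and firmware, so verify with “racadm help remoteimage” first:

    # attach an ISO from a CIFS share as virtual media (second -u/-p are the share credentials)
    racadm -r 192.168.10.120 -u root -p calvin remoteimage -c -l //fileserver/isos/SUU_x64.iso -u shareuser -p sharepass
    # check the remote image status
    racadm -r 192.168.10.120 -u root -p calvin remoteimage -s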

When the system boots from the OM74 Live CD, it will auto-launch the update GUI:

clip_image002

Right now, you only need to do the Dell Lifecycle Controller. You could of course do more, but the point for me is to get the LCC working, then move back to doing the updates via that interface. So we’ll ONLY do the one update from here.

Click UPDATE FIRMWARE, and then:

clip_image003

Click UPDATE NOW in the upper left corner. You can see the STATUS DESCRIPTION showing it is being updated.

When the update is complete, you can then reboot the system and retry using the Unified Server Configurator/Lifecycle Controller to complete the rest of your updates. (HOWTO: Using Dell iDRAC 7 Lifecycle Controller 2 to update Dell PowerEdge R420, R620, and R720s would be a good place to look.)