
Archive for the ‘Hardware’ Category

Modifying the Dell C6100 for 10GbE Mezz Cards

June 11, 2015

In a previous post, Got 10GbE working in the lab – first good results, I talked about getting 10GbE working with my Dell C6100 series.  Recently, a commenter asked me if I had any pictures of the modifications I had to make to the rear panel to make these 10GbE cards work.  As I recently acquired another C6100 (yes, I have a problem…) that needs the same mods, it seems only prudent to share the steps I took in case it helps someone else.

First a little discussion about what you need:

  • Dell C6100 whose rear panel does not have the removable plate (the variant that needs cutting)
  • Dell X53DF/TCK99 2 Port 10GbE Intel 82599 SFP+ Adapter
  • Dell HH4P1 PCI-E Bridge Card

You may find the Mezz card under either part number – it seems that the X53DF replaced the TCK99.  Perhaps one is the P/N and one is the FRU, or some such.  But you NEED that little PCI-E bridge card.  It is usually included, but pay special attention to the listing to make sure it is.  What you DON'T really need is the mesh backing plate on the card – you can get it bare.

[Photos]

Shown above are the 2pt 10GbE SFP+ card in question, and also the 2pt 40GbE Infiniband card.  Above them both is the small PCI-E bridge card.

[Photo]

You want to remove the two screws to take the backing plate off the card.  You won't be needing the plate, so set it aside.  The screws attach through the card and into the bracket, so once they're out, reinsert them into the bracket to keep from losing them.

[Photo]

Here we can see the rear panel of the C6100 sled, ready for cutting.

[Photos]

You can place the card's factory backing plate against the sled's rear panel and use it as a template.  Here you can see where to line it up and mark the cuts you'll be making.  Note that the bracket will of course sit higher up on the unit, so you'll have to adjust your horizontal lines accordingly.

[Photos]

If we look to the left, we can see the source of the problem that forces us to do this work.  The rear panel here is not removable, and wraps around the left corner of the unit.  In systems with the removable plate, this section simply unscrews and the panel attached to the card slots in.  On the right-hand side you can see the two screws that would attach the panel and card in that case.

[Photo]

Here's largely what we get once the cuts are complete.  Perhaps you're better with a Dremel than I am.  Note that the vertical cuts can be tough depending on the size of your cutting disc, as the sled release bar may get in the way.

[Photos]

You can now attach the PCI-E bridge card to the Mezz card and slot it in.  I found it easiest to come in at about a 20-degree angle, slide the two ports into the cutouts, then drop the PCI-E bridge into its slot.  When it's all said and done, you'll find it pretty secure and good to go.
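
If you want a quick sanity check that the host actually sees the card once everything is back together, something like the following sketch works from a Linux live environment (or anywhere else with Python and lspci available).  The "82599" match string is an assumption based on how these Intel controllers usually identify themselves, so adjust it if yours reports differently.

# quick_check_82599.py - minimal sketch: confirm the Intel 82599 mezz card is visible
# Assumes lspci is on the PATH; the "82599" device string is an assumption.
import subprocess

def find_10gbe_ports():
    out = subprocess.run(["lspci"], capture_output=True, text=True, check=True).stdout
    # The X53DF/TCK99 mezz card is based on the Intel 82599 SFP+ controller.
    return [line for line in out.splitlines() if "82599" in line]

if __name__ == "__main__":
    ports = find_10gbe_ports()
    if ports:
        print(f"Found {len(ports)} 82599 function(s):")
        for p in ports:
            print("  " + p)
    else:
        print("No 82599 devices seen - reseat the bridge card and mezz card.")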

That's really about it.  Not a whole lot to it, and if you have it all in hand, you'd figure it out pretty quickly.  This is largely to show where my cut lines ended up compared to the actual cuts, and where adjustments could be made to make the cuts tighter if you wanted.  Also, if you're planning to order but aren't sure whether it works or is even possible, hopefully this helps out quite a bit.

Some potential vendors I’ve had luck with:

http://www.ebay.com/itm/DELL-X53DF-10GbE-DUAL-PORT-MEZZANINE-CARD-TCK99-POWEREDGE-C6100-C6105-C6220-/181751541002? – accepted $60 USD offer.

http://www.ebay.com/itm/DELL-X53DF-DUAL-PORT-10GE-MEZZANINE-TCK99-C6105-C6220-/181751288032?pt=LH_DefaultDomain_0&hash=item2a513890e0 – currently lists for $54 USD, I’m sure you could get them for $50 without too much negotiating.

Categories: C6100, Dell, Hardware, Home Lab

IBM RackSwitch–40GbE comes to the lab!

May 20, 2015

Last year, I had a post about 10GbE coming to my home lab (https://vnetwise.wordpress.com/2014/09/20/ibm-rackswitch10gbe-comes-to-the-lab/).  This year, 40GbE comes! 

This definitely falls into the traditional "too good to pass up" category.  A company I'm doing work for picked up a couple of these, and there was enough of a supply that I was able to get my hands on a pair for a reasonable price – reasonable, at least, after liquidating the G8124's from last year.  (Drop me a line – they're available for sale!)

Some quick high level on these switches, summarized from the IBM/Lenovo RedBooks (http://www.redbooks.ibm.com/abstracts/tips1272.html?open):

  • 1U, fully Layer 2 and Layer 3 capable
  • 4x 40GbE QSFP+ and 48x 10GbE SFP+
  • 2x power supply, fully redundant
  • 4x fan modules, also hot swappable
  • Mini-USB to serial console cable (dear god, how much I hate this non-standard part – see the quick console sketch after this list)
  • Supports 1GbE Copper Transceiver – no issues with Cisco GLC-T= units so far
  • Supports Cisco Copper TwinAx DAC cabling at 10GbE
  • Supports 40GbE QSFP+ cables from 10GTek
  • Supports virtual stacking, allowing for a single management unit
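
A quick aside on that console cable: once you do track one down, it's just plain serial behind the odd connector.  Here's a minimal sketch of poking the console with pyserial, assuming the usual 9600 8N1 defaults and that the adapter shows up as /dev/ttyUSB0 – both of those are assumptions, so check your own setup.

# console_peek.py - minimal sketch: read the RackSwitch console banner over serial
# Assumptions: 9600 8N1 console settings and /dev/ttyUSB0 - adjust for your cable/OS.
import serial  # pip install pyserial

with serial.Serial("/dev/ttyUSB0", baudrate=9600, bytesize=8,
                   parity=serial.PARITY_NONE, stopbits=1, timeout=2) as console:
    console.write(b"\r\n")             # nudge the console to print a prompt
    banner = console.read(512)         # grab whatever comes back
    print(banner.decode("ascii", errors="replace"))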

[Photo: front panel of the RackSwitch G8264]

Everything else generally falls into line with the G8124.  Where those are listed as “Access” switches, these are listed as “Aggregation” switches.  Truly, I’ll probably NEVER have any need for this many 10GbE ports in my home lab, but I’ll also never run out.  Equally, I now have switches that match production in one of my largest environments, so I can get good and familiar with them.

I'm still on the fence about the value of the stacking.  While these are largely going to be used for ISCSI or NFS based storage, stacking may not even be required.  In fact, there's an argument to be made for keeping them completely segregated other than port-channels between them, so that a bad stack command doesn't take out both.  Also, the Implementing IBM System Networking 10Gb Ethernet Switches guide shows the following limitations:

When in stacking mode, the following stand-alone features are not supported:
Active Multi-Path Protocol (AMP)
BCM rate control
Border Gateway Protocol (BGP)
Converged Enhanced Ethernet (CEE)
Fibre Channel over Ethernet (FCoE)
IGMP Relay and IGMPv3
IPv6
Link Layer Detection Protocol (LLDP)
Loopback Interfaces
MAC address notification
MSTP
OSPF and OSPFv3
Port flood blocking
Protocol-based VLANs
RIP
Router IDs
Route maps
sFlow port monitoring
Static MAC address addition
Static multicast
Uni-Directional Link Detection (UDLD)
Virtual NICs
Virtual Router Redundancy Protocol (VRRP)

That sure seems like a lot of limitations.  At a glance, I’m not sure anything there is end of the world, but it sure is a lot to give up. 

At this point, I’m actually considering filling a number of ports with GLC-T’s and using that for 1GbE.  A ‘waste’, perhaps, but if it means I can recycle my 1GbE switches, that’s an additional savings.  If anyone has a box of them they’ve been meaning to get rid of, I’d be happy to work something out. 

Some questions that will likely get asked, that I’ll tackle in advance:

  • Come on, seriously – they're data center 10/40GbE switches.  YES, they're loud.  They're not, however, unliveable.  They do quiet down a fair bit after warm-up; during POST everything runs at 100% duty cycle.  But make no mistake, you're not going to put one of these under the OfficeJet in your office, hook up your NAS to it, and not shoot yourself.
  • Power is actually not that bad.  These are pretty green, and drop power to unlit ports.  I haven’t hooked up a Kill-a-Watt to them, but will tomorrow.  They’re on par with the G8124’s based on the amp display on the PDU’s I have them on right now. 
  • Yes, there are a couple more.  To give you a ballpark, if you check eBay for a Dell PowerConnect 8024F and think that's doable – then you're probably going to be interested.  You'd lose the 4x 10GBaseT combo ports, but you'd gain 24x 10GbE and 4x 40GbE.
  • I'm not sure yet whether there are any 40GbE-compatible HBAs – I just haven't looked into it.  I'm guessing a Mellanox ConnectX-3 might do it.  Really though, even at 10GbE, you're not saturating that without a ton of disk IO.

More to come as I build out various configurations for these and come up with what seems to be the best option for a couple of C6100 hosts. 

Wish me luck!

Categories: Hardware, Home Lab, IBM, RackSwitch

Design Exercise–Scaling Up–Real World Example

October 13, 2014

My previous post on Design Exercise- Scaling up vs Scaling out appeared to be quite popular. A friend of mine recently told me of an environment, and while I have only rough details of it, it gives me enough to make a practical example of a real world environment – which I figured might be fun. He indicated that while we’d talked about the ideas in my post for years, it wasn’t until this particular environment that it really hit home.

Here are the highlights of the current environment:

  • Various versions of vSphere – v3.5, v4.x, v5.x, multiple vCenters
  • 66 hosts – let's assume dual six-core Intel 55xx/56xx (Nehalem/Westmere) CPU's
  • A quick tally suggests 48GB of RAM per host.
  • These hosts are blades, likely HP. 16 Blades per chassis, so at least 4 chassis. For the sake of argument, let’s SAY it’s 64 hosts, just to keep it nice and easy.
  • Unknown networking, but probably 2x 10GbE, and 2x 4Gbit/FC, with passthru modules

It might be something very much like the listing below – in which case it's dual 6-core CPU's and likely only 1GbE on the front side. This is probably a reasonable enough assumption for this example, especially since I'm not trying to be exact and want to keep it theoretical.

http://www.ebay.ca/itm/HP-c7000-Blade-Chassis-16x-BL460c-G6-2x-6-C-2-66GHz-48GB-2x-146GB-2x-Gbe2c-2x-FC-/221303055238?pt=COMP_EN_Servers&hash=item3386b0a386

I’ve used the HP Power Advisor (http://www8.hp.com/ca/en/products/servers/solutions.html?compURI=1439951#.VDnvBfldV8E) to determine the power load for a similarly configured system with the following facts:

  • 5300 VA
  • 18,000 BTU
  • 26 Amps
  • 5200 Watts total
  • 2800 Watts idle
  • 6200 Watts circuit sizing
  • 6x 208V/20A C19 power outlets

We’ll get to that part later on. For now, let’s just talk about the hosts and the sizing.

Next, we need to come up with some assumptions.

  • The hosts are likely running at 90% memory and 30% CPU, based on examples I've seen. Somewhere in the realm of 2764GB of RAM and 230 cores (see the quick sketch after this list).
  • The hosts are running 2 sockets of vSphere Enterprise Plus, with SnS – so we have 128 sockets of licences. There will be no theoretical savings on net-new licences as they're already owned – but we might save money on SnS. There is no under-licencing that we're trying to top up.
  • vSphere Enterprise Plus we'll assume to be ~ $3500 CAD/socket and 20% for SnS, or about $700/year/socket.
  • The hosts are probably not licenced for Windows Data Center, given the density – but who knows. Again, we're assuming the licences are owned, so no net-new savings, but there might be on Software Assurance.
  • We're using at least 40U of space, or a full rack, for the 4 chassis
  • We're drawing 20,800 watts, or roughly 21 kW
  • While the original chassis are likely FC, let's assume for the moment that it's 10GbE ISCSI or NFS.
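
For clarity, here's the back-of-the-napkin sketch behind those 2764GB and 230 core figures, using nothing but the assumptions above (64 hosts, 48GB and 12 cores each, at 90% memory and 30% CPU utilization – estimates, not measurements):

# workload_estimate.py - rough sizing sketch for the existing blade environment
# All inputs are this exercise's assumptions, not measured values.
HOSTS = 64
RAM_PER_HOST_GB = 48
CORES_PER_HOST = 12           # dual six-core Nehalem/Westmere
MEM_UTIL = 0.90
CPU_UTIL = 0.30

ram_in_use_gb = HOSTS * RAM_PER_HOST_GB * MEM_UTIL    # ~2765 GB
cores_in_use = HOSTS * CORES_PER_HOST * CPU_UTIL      # ~230 cores

print(f"RAM in use:   {ram_in_use_gb:.0f} GB")
print(f"Cores in use: {cores_in_use:.0f}")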

Now, let’s talk about how we can replace this all – and where the money will come from.

I just configured some Dell R630 1U Rack servers. I’ve used two different memory densities to deal with some cost assumptions. The general and common settings are:

  • Dell R630 1U Rack server
  • 2x 750 Watt Power Supply
  • 1x 250GB SATA, just to have "a disk"
  • 10 disk 2.5” chassis – we won’t be using local disks though.
  • 1x PERC H730 – we don’t need it, but we’ll have it in case we add disks later.
  • Dual SD module
  • 4x Emulex 10GbE CNA on board
  • 2x E5-2695 v3 2.3GHz 14C/28T CPU’s

With memory we get the following numbers:

  • 24x 32GB for 768GB total – $39.5K web price; assume a 35% discount = ~$26K
  • 24x 16GB for 384GB total – $23.5K web price; assume a 35% discount = ~$15.5K

The first thing we want to figure out is whether the higher memory density is cost effective. We know that two of the 384GB configs would come to $31K, or about $5K more than a single 768GB server. So even without bothering to factor in licencing costs, we know the denser config is cheaper. If you had to double up on vSphere, Windows Data Center, Veeam, vCOPS, etc., it gets even worse. So very quickly we can justify only including the 768GB configuration; that's out of the way. However, it also tells us that if we need more compute, we have some wiggle room to spend more on better CPU's with more cores/higher speeds – we can realistically spend up to roughly $2.5K/CPU more and still come out the same as doubling the hosts with half the RAM.

Now how many will we need? We know from above we're at "somewhere in the realm of 2764GB of RAM and 230 cores". 230 cores / 28 cores per server means we need at least 8.2 hosts – we'll assume 9. The 2764GB of RAM only requires 3.6 hosts. But we also need to allow room for growth. Based on these numbers, let's work with the understanding that we'll want at least 10 hosts, to give us some overhead on the CPU's and room to grow. If we're wrong, we have lots of spare room for labs, DEV/TEST, finally building redundancy, expanding poorly performing VM's, etc. No harm in that. This makes the math fairly easy as well (the quick sketch after the list below spells out the arithmetic):

  • $260K – 10x Dell R630’s with 768GB
  • $0 – licence savings from buying net new
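
Here's the host-count arithmetic as a quick sketch, using the same assumed per-host numbers (28 cores and 768GB per R630):

# host_count.py - sketch of the host-count math described above
import math

CORES_NEEDED = 230.4           # from the workload estimate
RAM_NEEDED_GB = 2764.8
CORES_PER_R630 = 28            # 2x E5-2695 v3, 14 cores each
RAM_PER_R630_GB = 768          # 24x 32GB

hosts_for_cpu = CORES_NEEDED / CORES_PER_R630      # ~8.2 -> 9 hosts
hosts_for_ram = RAM_NEEDED_GB / RAM_PER_R630_GB    # ~3.6 -> 4 hosts

print(f"Hosts needed for CPU: {hosts_for_cpu:.1f} -> {math.ceil(hosts_for_cpu)}")
print(f"Hosts needed for RAM: {hosts_for_ram:.1f} -> {math.ceil(hosts_for_ram)}")
# Round up to 10 to leave CPU headroom and room for growth.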

We've now cost the company $260K and, so far, haven't shown any savings or justification. Even just based on hardware refresh and lifecycle costs, this is probably a doable number. It works out to $7.2K/month over 36 months.

What if we could get some of that money back? Let’s find some change in the cushions.

  • Licence SnS savings. We know we only need 20 sockets now to licence 10 hosts, so we can potentially let the other 108 sockets lapse. At $700/socket/year this works out to a savings of $75,600 per year, or $227K over 36 months. That is 87% of the purchase cost of the new equipment – we only need to find $33K now.
  • Power savings. The Dell Energy Smart Solution Advisor (http://essa.us.dell.com/dellstaronline/Launch.aspx/ESSA?c=us&l=en&s=corp) suggests that each server will require 456 watts, 2.1 amps and 1600 BTU of cooling. I pay $0.085/kWh here, so I'll use that number. In the co-location facilities I'm familiar with you're charged per power whip, not usage, but as this environment is on site we can assume they're only billed for what they draw. That saves another ~$1K/month, or $36K over 36 months (the sketch after this list shows the math). We have now saved $263K on a $260K purchase. How am I doing so far?
  • Rack space – we're down from 40U to 10U. Probably no cost savings here, but we can reuse the space.
  • Operational maintenance – we are now doing firmware, patching, upgrades, host configuration, etc. across 10 systems instead of 64. Regardless of whether that time works out to 1 or 12 hours per year per server, we are now doing ~84% less work. Perhaps now we'll find the time to actually DO that maintenance.
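
Since that power number goes by quickly, here's the math behind the ~$1K/month, using the figures already quoted above (20,800 watts for the four chassis, 456 watts per R630, and my $0.085/kWh rate – your draw and rate will obviously differ):

# power_savings.py - sketch of the power savings over a 36 month term
OLD_WATTS = 4 * 5200             # four blade chassis at ~5200 W each
NEW_WATTS = 10 * 456             # ten R630's per the Dell ESSA estimate
RATE_PER_KWH = 0.085             # my local rate - adjust for yours
HOURS_PER_MONTH = 24 * 365 / 12

saved_kw = (OLD_WATTS - NEW_WATTS) / 1000.0
monthly = saved_kw * HOURS_PER_MONTH * RATE_PER_KWH
print(f"Saving ~{saved_kw:.1f} kW -> ${monthly:,.0f}/month, ${monthly * 36:,.0f} over 36 months")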

So based on nothing more than power and licence *maintenance*, we've managed to recover all the costs. We have also drastically consolidated the environment, and we can likely "finally" get around to migrating all the VM's into a single vSphere v5.5+ environment and getting rid of the v3.5/v4.x mixed configuration that was probably left that way due to "lack of time and effort".

We also need to consider the "other" ancillary things we're likely forgetting as benefits. Every one of these things that a site of this size might have represents a potential savings – either in net-new or maintenance:

  • vCloud Suite vs vSphere
  • vCOPS
  • Veeam or some other backup product, per socket/host
  • Windows Server Data Center
  • SQL Server Enterprise
  • PernixData host based cache acceleration
  • PCIe/2.5” SSD’s for said caching

Maybe the site already has all of these things. Maybe they’re looking at it for next year’s budget. If they have it, they can’t reduce their licences, but could drop their SnS/Maintenance. If they’re planning for it, they now need 84% less licencing. My friends in sales for these vendors won’t like me very much for this, I’m sure, but they’d also be happy to have the solution be sellable and implemented and a success story – which is always easier when you don’t need as many.

I always like to provide more for less. The costs are already a wash, what else could we provide? Perhaps this site doesn’t have a DR site. Here’s an option to make that plausible:

  • $260K – 10x R630’s for the DR site
  • $0K – 20 sockets of vSphere Enterprise – we’ll just reuse some of the surplus licencing. We will need to keep paying SnS though.
  • $15K – 20 sockets of vSphere Enterprise SnS
  • $40K – Pair of Nexus 5548 switches? Been a while since I looked at pricing
Spend ~$300K and you have most of a DR environment – at least the big part. You still have no storage, power, racks, etc., but you're far closer. This is a much better use of the same original dollars. The reason for this part of the example is that the licences already exist and we're not buying net-new. The question from the bean-counters will of course be "so what are we going to do, just throw them away???"

Oh. Right. I totally forgot. Resale.

http://www.ebay.ca/itm/HP-C7000-Blade-Enclosure-16xBL460C-G6-Blades-2xSix-Core-2-66GHZ-X5650-64GB-600GB-/271584371114?pt=COMP_EN_Servers&hash=item3f3bb0a1aa

There aren't many C7000/BL460c systems listed as "Sold" on eBay, but the one above sold for ~ $20K Canadian.  Let's assume you chose to sell the equipment to a VAR that specializes in refurbishing – they're likely to give you 50% of that value.  That's another $10K/chassis, or $40K for the 4 chassis.

As I do my re-read of the above, I realize something. We need 9 hosts to meet the CPU requirement, but with 10 hosts at 768GB we'd end up with 7680GB of RAM where we only really require 2764GB today. Dropping each host to 512GB brings the cost down to ~$31K web price, or $20K with the 35% discount. At a savings of $6K/server we'd still end up with 5120GB of RAM – just about double what we use today, so lots of room to scale up – and we save another $60K today. If we ever do need that capacity, we can purchase the extra 8x 32GB per host at a later date, likely at a discount as prices drop over time. However, the original discount is often not applied to parts and accessory pricing on a smaller deal, so check whether it actually works out to a savings. How would you like a free SAN? Or 10 weeks of training @ $6K each? I assume you have people on your team who could benefit from some training? Better tools? Spend your money BETTER! Better yet, spend the money you're entrusted to be the steward of better – it's not your money, treat it with respect.

A re-summary of the numbers:

  • +$200K – 10x R630’s with 512GB today
  • +$0K – net-new licencing for vSphere Enterprise Plus
  • -$227K – 108 sockets of vSphere SnS we can drop, over 3 years.
  • -$36K – Power savings over 3 years
  • -$40K – Resale of the original equipment

Total: $103K to the good.

 

Footnote: I came back thinking about power.  The Co-Location facility I’ve dealt with charges roughly:

  • $2000/month for a pair of 208V/30A circuits
  • $400/month for a pair of 110V/15A circuits
  • $Unknown for a pair of 20A circuits, unfortunately.

I got to thinking about what this environment would need – but also what it has.  In my past, I've seen a single IBM BladeCenter chassis using 4x 208V/30A circuits, even if it could have been divided up better.  So let's assume the same inefficiency was done here.  Each HP c-Class chassis at 25.4A would require 3x pairs, or 12x pairs for the total configuration – somewhere in the area of $24,000/month in power.  Yikes!  Should it be less?  Absolutely.  But it likely isn't, based on the horrible things I've seen – probably people building as though they're charged by usage and not by drop.

The 10x rack servers, if I switch them to 110V vs 208V, indicate they need 3.5A each – and that's spread across both circuits.  I think this is at max, but let's be fair and say you wouldn't put more than 3x (10.5A) on a 15A circuit.  So you need 4x $400 pairs, for $1600/month in power.  Alternatively, you could put them all on a single 208V/30A pair at about 21A total, for $2000/month.  If you could, this would be the better option, as it lets you use only one pair of PDU's and leaves surplus for growth, top-of-rack switching, etc.
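
Here's that circuit math as a small sketch, using the per-server amp draws above and the same rule of thumb of loading a 15A circuit to roughly 70%:

# colo_circuits.py - sketch of the circuit-pair math for the 10 rack servers
import math

SERVERS = 10
AMPS_110V = 3.5         # per server at 110V (max, per the Dell numbers above)
AMPS_208V = 2.1         # per server at 208V
PAIR_110V_15A = 400     # $/month for a pair of 110V/15A circuits
PAIR_208V_30A = 2000    # $/month for a pair of 208V/30A circuits

servers_per_15a = int(15 * 0.7 // AMPS_110V)        # ~70% load -> 3 servers per pair
pairs_110 = math.ceil(SERVERS / servers_per_15a)    # 4 pairs
print(f"110V option: {pairs_110} pairs -> ${pairs_110 * PAIR_110V_15A}/month")

total_208_amps = SERVERS * AMPS_208V                # 21A, fits on one 30A pair
print(f"208V option: {total_208_amps:.0f}A on one pair -> ${PAIR_208V_30A}/month")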

So potentially, you're also going to go from $24K to $2K/month in power.  For the sake of argument, let's assume I'm way off on the blades and they're using half that, or $12K/month.  You're still saving $10K/month – or $360K over 36 months.  Did you want a free SAN for your DR site, maybe?  Just remember not to also count the earlier usage-based power savings, or you're double dipping.

(New) Total: $427K to the good – AFTER getting your new equipment. 

Hi.  I just saved you half a million bucks.

Categories: Dell, Design, Hardware, VMware

Design Exercise-Fixing Old or Mismatched Clusters

October 2, 2014

In two previous posts, I talked about some design examples I’ve seen:

Design Exercise – Scaling up vs Scaling out

Design Exercise – DR Reuse

Today I’m going to talk about the “No problem, we’ll just add a host” problem.  But not in the “one more of the same” scenario, instead a “we can’t get those any longer, so we’ll add something COMPLETELY different” scenario.

Regardless of whether the current site is something like the one previously described, with matching systems (eg: 4x Dell PE2950's), or a pile of random systems, when capacity runs out the budget is often low, and so the discussion comes up to "just add a host". But as we know from previous examples, adding hosts costs money for not only hardware but licences. I have two different example sites to talk about:

Example 1:

  • 4x Dell PE2950, 2x 4 Core, 32GB RAM, 4x 1GBE hosts

Example 2:

  • 1x Dell T300, 1x 4 Core, 32GB RAM, 2x 1GbE
  • 1x HP DL380 G6, 2x 4 Core, 64GB RAM, 4x 1GbE
  • 1x Dell R610, 2x 6 Core, 96GB RAM, 4x 1GbE

In both cases, we’ll assume that the licencing won’t change as we’re not going to discuss actually adding any hosts, so all software/port counts remain the same.

As you can see, neither environment is particularly good. They're both old, but Example 2 is horribly mismatched. DRS is going to have a hell of a time finding proper VM slots to use, the capacity is mismatched, and nothing is uniform. The options to fix this all involve investing good money after bad. But an environment that is this old or mismatched likely ended up this way due to lack of funds. We can talk about proper planning and budgeting until we're blue in the face, but what we need to do right now is fix the problem. So let's assume that even if we could add or replace one of the hosts with something more current, like a $7000 R620 with 2x 6-core and 128GB, it is not in budget. Certainly 3-4 of them are not, and certainly not the bigger/better systems at $10K+.

So what if we go used? Ah, I can hear it now, the collective rants of a thousand internet voices: "But we can't go used, it's old and it might fail, and it's past its prime." Perhaps – but look at what the environments currently are. Plus, if someone had something 'newer' that they'd owned for 2 years of a 3-5 year warranty, it would be "used" as well, no? Also, setting aside complete and spontaneous host failures, virtualization and redundancy give us a lot of ways to mitigate actual hardware failures. Failing network ports, power supplies, fans, etc. will all trigger a host health alert, which can be used to automatically place the host in Maintenance Mode, have DRS evacuate it, and send you an e-mail. So yes, a part may fail, but we build _expecting_ that to be true.

Now assume that the $7000 option for a new host *IS* in budget. What could we do instead? We certainly don’t want to add a single $7000 host to the equation, for all the reasons noted. Now we look into what we can do with off-lease equipment. This is where being a home-labber has its strengths – we already know what hardware is reliable and plentiful, and still new enough to be good and not quite old enough to be a risk.

What if I told you that for about $1500 CAD landed, you could get an off-lease Dell R610 with 2x 6-core CPU's, 128GB of RAM, and 4x 1GbE?

Example 1 can now, for around $6000 CAD, replace all 4 hosts with something newer that has 16 more cores and 4x the RAM. It's not going to be anywhere near the solution from the other day with the 384GB hosts – but it's also not going to be $40K in servers. Oh, plus we go from 8U to 4U, power savings, etc.

Example 2 is able to replace those first 2 hosts and standardize, for around $3000.

In either case, they're still "older" servers. A Dell R610 is circa 2009-2012, so you're still looking at a 2-5 year old server at this point – which might be a little long in the tooth. But if the power is enough for you, and you're just trying to add some capacity and get out of the "scary old" zone, it might not be so bad. Heck, either of these sites is likely going to be very happy with the upgrade. Questions will need to be answered, such as:

  • Lifespan – how long are we expecting these servers to be a solution for? Till the end of next calendar year or about 16-18 months? That’s fine.
  • Budget – are we doing this because we have run out of budget for this year but *NEED* "something"? Has next year's budget been locked away and this was 'missed', but you still need 'something'?  If we assume these are 18-month solutions, to get us from now (Oct 2014) to "after next budget year" (Jan 2016), then Example 1 is $333/month and Example 2 is $167/month (see the quick sketch after this list). Money may be tight, but that's a pretty affordable way of pushing off the reaper.  Heck, I know people with bigger cell phone bills.
  • Warranty – these may or may not come with OEM warranty. Are you okay with that? Maybe what makes the most sense is to pick up an extra unit for "self-warranty" – it is almost certainly still cheaper than extending the OEM warranty. Remember though, OEM support also helps troubleshoot weird issues and software incompatibilities; self-warrantying just gets you hard parts that you can swap, if you have the time and energy to do so. Check whether the secondary-market reseller will offer next-day parts – that may be sufficient for you. Also, check whether the vendor of the hardware you're choosing will let you download software updates (eg: management software, firmware, BIOS, etc.) without a service contract. Dell, at this point, still does, which is why I like them (for customers and my lab).  Oh, another advantage of the extra "self-warranty" unit?  You can use it for Dev/Test, learning, testing things you want to try, validating hardware configurations, swapping parts to chase suspected issues, etc.
  • Other Priorities – do you need to spend the same money you’d spend on new hosts, elsewhere? Maybe you need a faster SAN today, because you’re out of capacity as well, and you have to make a choice. You can fix it next year, but you can’t fix both at once, regardless of effort or good intentions. Maybe you want to go to 10GbE switches today in preparation. Perhaps you want to spend the same money on training, so that your staff can “do more with less” and have “smarter people” instead of “more thingies, with no one to run them”.
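
The "pushing off the reaper" math from the budget question is trivial, but here it is as a sketch anyway (the 18-month window is the assumption from above):

# stopgap_cost.py - amortizing the off-lease purchases over the stopgap window
MONTHS = 18                     # Oct 2014 through "after next budget year"
options = {"Example 1 (4 hosts)": 6000, "Example 2 (2 hosts)": 3000}

for name, cost_cad in options.items():
    print(f"{name}: ${cost_cad} / {MONTHS} months = ${cost_cad / MONTHS:,.0f}/month")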

I fully realize that off-lease, eBay, secondary market is going to throw up automatic "no's" for a lot of people. Also, many management teams will simply say no. Some will have an aversion to "buying from eBay" – fine, call the vendor from their eBay auction, get a custom quote with a PO directly, and buy it just like you would from any other VAR. The point of the matter is, you have options, even if you're cash strapped.

BTW, if anyone was thinking "why not just get R620's", which are newer, you certainly could – http://www.ebay.ca/itm/DELL-POWEREDGE-R620-2-x-SIX-CORE-E5-2620-2-0GHz-128GB-NO-HDD-RAILS-/111402343301?pt=COMP_EN_Servers&hash=item19f018db85. One can get an R620, 2x 6-core E5-2620, 128GB RAM (16x 8GB almost certainly, but 24 DIMM slots), 4x 1GbE, iDRAC, etc. for about $3000. This would give you more room to grow and is newer equipment, but it starts getting much closer to the $7000 configuration direct from Dell with a 3-year warranty, 10GbE ports, etc. Still, 4x $3K is much less than 4x $7K, and $16,000 is a lot of money you could spend on something else. Just watch that you're not paying so close to retail that it stops being worth it.

The trick, coming from a home-lab guy, is to be “just old enough to not be worth any money to someone else” but “just new enough to still be really useful, if you know what you’re doing.”

Also, consider these options for the future.  Remember that ROI involves a sale.  Let's say you purchased the brand new $7000 server and made it a 5-year warranty instead of 3 for… 20% more, or about $8500.  You're almost certainly not going to use it for 5 years.  But in 2.5 years, when you want to put that server on the secondary market and it still has 2+ years of OEM warranty left, you're going to find it has significantly more resale value.

This is no different than leasing the ‘right’ car with the ‘right options’, because you know it’ll have a higher resale value at the end of the lease.  If you’re the kind of person that would never “buy new, off the lot” and would always buy a “1-2 year old lease-return, so someone else can pay the depreciation” – this solution is for you.

If in one scenario you haul the unit away to recycling (please, call me, I offer this service for free), and in another you sell the equipment to a VAR for $2000/unit that you can use as credit on your next purchase or services…

Categories: Dell, Design, Hardware, VMware, vSphere

Got 10GbE working in the lab–first good results

October 2, 2014

I’ve done a couple of posts recently on some IBM RackSwitch G8124 10GbE switches I’ve picked up.  While I have a few more to come with the settings I finally got working and how I figured them out, I have had some requests from a few people as to how well it’s all working.   So a very quick summary of where I’m at and some results…

What is configured:

  • 4x ESXi hosts running ESXi v5.5 U2 on a Dell C6100 4 node
  • Each node uses the Dell X53DF dual 10GbE Mezzanine cards (with mounting dremeled in, thanks to a DCS case)
  • 2x IBM RackSwitch G8124 10GbE switches
  • 1x Dell R510 running Windows 2012 R2 and StarWind SAN v8, with both an SSD+HDD volume and a 20GB RAMDisk-based volume, using a BCM57810 2pt 10GbE NIC

Results:

IOMeter against the RAMDisk volume, configured with 4 workers, 64 threads each, 4K 50% read/50% write, 100% random:

[Screenshot: IOMeter results]

StarWind side:

[Screenshot: StarWind performance view]

Shows about 32,000 IOPS
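
For a rough sense of scale, 32,000 IOPS at a 4K block size is only on the order of 1 Gbit/s of payload, which is exactly why small-block random IO alone doesn't show off a 10GbE link.  A quick conversion sketch:

# iops_to_bandwidth.py - rough conversion of IOPS at a block size into line rate
IOPS = 32_000
BLOCK_BYTES = 4 * 1024                       # 4K transfers

bytes_per_sec = IOPS * BLOCK_BYTES
print(f"{bytes_per_sec / 1e6:.0f} MB/s payload "
      f"(~{bytes_per_sec * 8 / 1e9:.2f} Gbit/s of a 10 Gbit link)")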

And an Atto Bench32 run:

[Screenshot: Atto Bench32 results]

Those numbers seem a little high.

I'll post more details once I've had some sleep – I had to get something out; I was excited.

Soon to come are some details on the switches – ISCSI configuration without any LACP other than the inter-switch traffic on the ISL/VLAG ports – as well as a "First time, Quick and Dirty Setup for StarWind v8", as I needed something in the lab that could actually DO 10GbE, and had to use SSD and/or RAM to give it enough 'go' to see whether the 10GbE was working at all.

I wonder what these will look like with some PernixData FVP as well…

UPDATED – 6/10/2015 – I’ve been asked for photos of the work needed to Dremel in the 10GbE Mezz cards on the C6100 server – and have done so!  https://vnetwise.wordpress.com/2015/06/11/modifying-the-dell-c6100-for-10gbe-mezz-cards/

Design Exercise–DR or Dev/Test Re-use

October 1, 2014

In a previous post, I recently discussed some of the benefits of Scaling Up vs Scaling Out (https://vnetwise.wordpress.com/2014/09/28/design-exercise-scaling-up-vs-scaling-out/) and how you can save money by going big. In that example, the site already had 4 existing hosts, wanted 5 new ones, but settled on 3. We can all guess of course what the next thing to get discussed was, I’m sure…

"So let's reuse the old 4 hosts, because we have them, and use them… for DR or a DEV/TEST environment." This should be no surprise – "because we have them" is a pretty powerful sell. Let's talk about how that might actually cost you considerably more money than you should be willing to spend.

Just as quick reminder, a summary of the hardware and configurations in question:

OLD HOSTS: Dell PowerEdge 2950 2U, 2x E5440 2.8GHz 4 Core CPU, 32GB DDR2

NEW HOSTS: Dell PowerEdge R620 1U, 2x E5 2630L 2.4GHz 6 Core CPU, 384GB DDR3

1) Licencing

Our example assumes that the site needed new licencing for the new hardware – either it didn't have any, it expired, it was the wrong version, who knows. So if you reutilize those 4 hosts, you're going to need 4-8 licences of everything. Assuming the same licence types and versions (eg: vSphere Enterprise Plus, Windows Server Data Center, Veeam Enterprise, etc.), that works out to be:

  • 4x $0 hosts as above = $0
  • 8x Sockets of vSphere Enterprise Plus @ ~ $3500/each with SnS = $28,000
  • 4x Windows Server Data Center licences @ ~ $5000/each = $20,000
  • 8x Sockets of Veeam B&R licences @ ~ $1000/each = $8,000

Total Cost = $56,000

Total Resources = 89GHz CPU, 128GB RAM

That’s a lot of licencing costs, for such little capacity to actually run things.

2) Capacity

That’s only 128GB of RAM to run everything, and 96GB when taking into account N+1 maintenance. Even if it IS DEV/TEST or DR, you’ll still need to do maintenance. These particular servers COULD go to 64GB each, using 8GB DIMM’s, but they’re expensive and not really practical to consider.

3) Connectivity

Let's assume part of why you were doing this is to get rid of 1GbE in your racks. Maybe they're old. Maybe they're flaky. Maybe you just don't want to support them. In any event, let's assume you "need" 10GbE on them, if for no other reason than so your Dev/Test *actually* looks and behaves like Production. No one wants to figure out how to do things in Dev with 12x 1GbE and then try to reproduce it in Prod with 4x 10GbE and assume it's all the same. So you'll need:

  • 8x 2pt 10GbE PCIe NICs @ $500 each = $4000
  • 8x TwinAx SFP+ cables @ $50 each = $400

We've now paid $4,400 to upgrade our old hosts to use the same 10GbE infrastructure we were using for Prod – for servers that are worth maybe $250 on Kijiji or eBay (http://www.ebay.ca/itm/Dell-Poweredge-III-2950-Server-Dual-Quad-Core-2-83GHz-RAID-8-Cores-64Bit-VT-SAS-/130938124237?pt=COMP_EN_Servers&hash=item1e7c8537cd). Not the best investment.

4) Real Estate / Infrastructure

Re-using these existing hosts means 8U of space, probably 2x the power draw, and likely internal RAID controllers and disks that are just burning up power and cooling.

A quick summary shows that we've now spent somewhere in the area of $60,000 to "save money" by reusing our old hardware. This will take up 8U of rack space, probably consume 1600W of power, and we're investing money in very old equipment.

But what if we did similar to what we did with the primary cluster for Prod, and just bought… 2 more of the bigger new hosts.

2x Dev/Test Hosts @ 384GB:

  • 2x $11,500 hosts as above = $23,000
  • 4x Sockets of vSphere Enterprise Plus @ ~ $3500/each with SnS = $14,000
  • 2x Windows Server Data Center licences @ ~ $5000/each = $10,000
  • 4x Sockets of Veeam B&R licences @ ~ $1000/each = $4,000
  • 8x SFP+ TwinAx cables @ ~ $50/each = $400

Total Cost = $51,400

Total Resources = 57.6GHz CPU, 768GB RAM

Compared to:

Total Resources = 89GHz CPU, 128GB RAM

So we've now spent only $51,400 vs $60,000, and ended up with 6x the capacity on brand new, in-warranty, modern hardware. The hardware is 100% identical to Prod. If we need or want to do any sort of testing in advance – vSphere patches, firmware upgrades, hardware configuration changes – we can now do so in Dev/Test and validate that it will behave EXACTLY the same way in Prod, because it IS in fact identical. All of your training and product knowledge will also carry over, as you don't have to account for variances in hardware generations. We're also going to use 2U and probably 600W of power vs 8U and 1600W.
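
To put a number on just how lopsided that is, here's a small sketch of cost per GB of usable RAM for the two options, using the totals above (rough figures – the ratio is the point, not the precision):

# reuse_vs_new.py - cost per GB of RAM: reusing old hosts vs buying two new ones
options = {
    "Reuse 4x PE2950": {"cost": 56_000 + 4_400, "ram_gb": 128},   # licences + 10GbE retrofit
    "Buy 2x new 384GB hosts": {"cost": 51_400, "ram_gb": 768},
}

for name, o in options.items():
    print(f"{name}: ${o['cost']:,} for {o['ram_gb']}GB "
          f"-> ${o['cost'] / o['ram_gb']:,.0f}/GB")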

If this is all in one site, and being used as Dev/Test you have a couple of ways you could set this up. We’re assuming this is all on the same SAN/storage, so we’re not creating 100% segregated environments. Also, the 10GbE switching will also be shared. So do you make a 3 node Prod and a 2 node Dev/Test cluster? Or do you make a 5 node cluster with a Prod and Dev/Test Resource Pool and use NIOC/SIOC to handle performance issues?

If this is for a second site, to potentially be used as DR, we’ve now saved $30K on the original solution and $8K on the solution we’re discussing now. This is $38K that you could spend on supporting infrastructure for your DR site – eg: 10GbE switching and SAN’s, which we haven’t accounted for at all. Granted $38K doesn’t buy a lot of that equipment – but it SURE is better than starting at $0. You just got handed a $40K coupon.

So, when you feel the urge to ask "but what should we do with this old hardware, can't we do anything with it?" – the answer is "Yes, we can throw it away." You'll save money all day long. Give it to the keeners in your environment who want a home lab and let them learn and explore. If you really have no one interested… drop me a line. I ALWAYS have room in my lab, or know someone looking. I'll put it to use somewhere in the community.

Categories: Dell, Design, Hardware, VMware, vSphere

HOWTO: Fix Dell Lifecycle Controller Update issues

September 29, 2014

Let’s say you’re in the middle of upgrading some Dell 11g hosts. They all have iDRAC 6 and Lifecycle Controllers, and you’ve downloaded the latest SUU DVD for this quarter. Then you want to update everything. You reboot the host, you press F10 to enter the LCC, you tell it to use the Virtual Media mounted SUU DVD ISO that it recognizes, it finds your updates, and you say go… only to get this:

[Screenshot: "The updates you are trying to apply are not Dell authorized updates" error]

Uh. So who authorizes them? Because this is from a Dell SUU DVD – that's about as authorized as I can get.

Turns out, I’m not the first person to have this problem, though it’s an older issue:

http://www.sysarchitects.com/solved-updates-you-are-trying-apply-are-not-dell-authorized-updates

http://en.community.dell.com/support-forums/servers/f/177/t/19475476

http://frednotes.wordpress.com/2012/11/21/the-updates-you-are-trying-to-apply-are-not-dell-authorized-updates/

It looks like the issue is that the LCC is at 1.4.0.586 currently – and needs to be 1.5.2 or better. 1.6.5.12 is current as of my SUU DVD, as you can see above. The other problem is that Dell provides updates "in OS" for Linux and Windows – which doesn't really help ESXi hosts at all. It seems the solution for this is an "OMSA Live CD", which I'd never heard of until today. This can be found at: http://linux.dell.com/files/openmanage-contributions/om74-firmware-live/ and there are really good instructions on its use at: http://en.community.dell.com/techcenter/b/techcenter/archive/2014/03/20/centos-based-firmware-images-with-om-7-4-with-pxe
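
If you have a pile of 11g hosts and want to flag which ones are below that 1.5.2 threshold before you start, the version check itself is trivial – here's a sketch (the version strings are the ones from this post; how you collect them from each iDRAC is up to you):

# lcc_version_check.py - flag Lifecycle Controllers too old to take SUU updates
MINIMUM = (1, 5, 2)    # per the links above, the LCC needs to be 1.5.2 or better

def parse(version: str) -> tuple:
    return tuple(int(part) for part in version.split("."))

# Example inventory - versions taken from this post; gather yours however you like.
hosts = {"esx01": "1.4.0.586", "esx02": "1.6.5.12"}

for host, version in hosts.items():
    ok = parse(version) >= MINIMUM
    print(f"{host}: LCC {version} - {'OK' if ok else 'needs the OMSA Live CD first'}")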

Now, the other alternative should have been to mount the SUU ISO as Virtual Media and boot from it. But for whatever reason this isn't working – after selecting it, the host just boots from the HDD. I'm assuming the firmware on the iDRAC/LCC is too old and having issues booting the ISO. That's fine; I didn't troubleshoot it much after it failed 3 times in a row. I dislike hardware reboots that take 10 minutes (which is why I like VM's), so I went looking for an alternative solution and was happy with it.

When the system boots from the OM74 Live CD, it will auto-launch the update GUI:

[Screenshot: OM 7.4 Live CD update GUI]

Right now, you only need to do the Dell Lifecycle Controller. You could of course do more, but the point for me is to get the LCC working, then move back to doing the updates via that interface. So we’ll ONLY do the one update from here.

Click UPDATE FIRMWARE, and then:

[Screenshot: Lifecycle Controller firmware update screen]

Click UPDATE NOW in the upper left corner. You can see the STATUS DESCRIPTION showing it is being updated.

When the update is complete, you can then reboot the system and retry using the Unified Server Configurator/Lifecycle Controller to complete the rest of your updates. (HOWTO: Using Dell iDRAC 7 Lifecycle Controller 2 to update Dell PowerEdge R420, R620, and R720s would be a good place to look.)