[aklug] Re: OK, akluggers, riddle me this (FC SAN/switching question)

From: adam bultman <adamb@glaven.org>
Date: Thu May 06 2010 - 13:21:23 AKDT

Arthur Corliss wrote:
> Sorry for the delayed response. I saved the message so I could respond
> later, then promptly forgot about it. :-P
>
> On Tue, 4 May 2010, adam bultman wrote:
>
>
>> OK, akluggers, here's a question for you.
>>
>> I have twin SANs in a datacenter. Currently, I have two FC switches,
>> and connected directly are some servers. Each server has a connection to
>> each switch.
>>
>>
>> Each SAN has a single FC connection to each switch, which gives you a
>> setup that looks like this (in theory - I have only one 'server' in the
>> diagram here):
>>
>> http://www.glaven.org/currentFCsetup.pdf
>>
>> I'm now getting another set of servers and another set of two FC
>> switches, which will be connected to the SAN.
>> Each of the new FC switches will have a connection to the SAN, as you
>> can see in this increasingly confusing diagram:
>>
>> http://www.glaven.org/newFCsetup.pdf
>>
>> The question that I have is, "should I connect the two sets of switches
>> together?"
>>
>> For example, connect: SW1 to SW3 and SW2 to SW4
>> Or: Connect SW1 to SW4, and SW2 to SW3
>>
>> Reasons for: The interconnected switches would give the servers
>> connected more paths to the SAN, and more throughput. Possible recovery
>> from a switch failure.
>> Reasons against: Complicates the setup, requires changing the domain
>> IDs on the FC switches. I also don't know if this will cause problems
>> with FC. Also, I doubt I'm using the full capacity of the FC SAN.
>>
>> I've been "told" that in a two-switch setup, you shouldn't connect the
>> two switches, even though it provides more paths - although I haven't
>> been given a concrete reason why.
>>
>>
>>
>> So, what do you think? I'm using WWPN/WWNN based zoning. I'm just
>> trying to get as much throughput as I can, without unnecessarily
>> complicating the setup or causing hard-to-diagnose problems.
>>
>
> Who told you that you needed two switches per server? That seems to be a
> waste. As long as you have zoning you should be able to put both servers
> into the same switches. And as long as your servers and your SAN support
> multipath I/O you should be able to do load balancing across both fabrics.
> Based on the small size of your FC network it looks like there's no
> practical benefit to the extra switches.
>
>
Well, these servers are HP blade servers, so each one automatically has a
port connected directly to each switch. The 'new' servers I'm getting will
use external switches, which makes things more complicated.
As for the number of switches - it's for redundancy purposes. If I have
all my servers on one switch, and the switch resets, dies, etc - I'm
toast. If I have multipathing, and two switches, I'll suffer a path
loss, but I'll still be OK. (Correct me if I misinterpreted what you
were saying.)
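
Just to be concrete about what I mean by multipathing here: on the Linux
side it's plain dm-multipath, roughly like the snippet below. The WWID and
alias are made up, so treat it as a sketch rather than my actual config.

    defaults {
        user_friendly_names yes
        failback            immediate   # fail back once a dead path returns
    }

    multipaths {
        multipath {
            wwid  360a98000486e2f34     # hypothetical LUN WWID
            alias vmstore0
        }
    }

The LUN shows up on the host once per path (one through each switch), and
dm-multipath glues them into a single device, so losing a switch just
drops a path instead of taking out the storage.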

> Your SAN may place limitations on this, however. On the IBM FAStT series,
> for instance, you have dual controllers, but only one controller is allowed
> to be active on a LUN at a time. You can't load balance across paths using
> different controllers, you can only do fail-over. In addition to that,
> multiple connections from one controller are still hubbed together, so you
> have some practical limitations through the server there, as well.
>
>
Network Appliance's clustering is Active-Active, so all paths are
active. A particular head will 'own' a volume, LUN, etc, but accesses
made to the "wrong" head will go through the cluster interconnect to the
'right' head. Accesses made to the 'wrong' head give you a warning on
the SAN, but Linux, VMware, etc. know how to choose the preferred path
so as to avoid the wrong head unless the "owner" head stops responding. If
the "owner" head dies, the partner head takes over those luns. (Also,
the two cluster nodes mirror each other's NVRAM, so you don't lose any
writes or data.)
(I have an HP MSA array that is 'active/active', but only because they
sandwiched two active/passive controllers together. It doesn't work
well, at all.)
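
The way the hosts stay off the 'wrong' head is just priority grouping in
dm-multipath - something along these lines, though the exact prioritizer
name depends on your multipath-tools version, so take it as a sketch
rather than gospel:

    devices {
        device {
            vendor                "NETAPP"
            product               "LUN"
            path_grouping_policy  group_by_prio  # preferred-head paths in the top group
            prio                  alua           # or the NetApp-specific prioritizer
            failback              immediate
            features              "1 queue_if_no_path"
        }
    }

With group_by_prio, I/O only spills over to the partner-head paths once
every path to the owning head is gone.
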
> In short, depending on your SAN server you may only expect a maximum
> throughput based on a single port rate. The LUN configuration itself also
> plays a big role in performance. What RAID level you're using on the
> backing array, block size, cache settings on the SAN, etc. Because of the
> number of variables, all meant to be tweakable for specific work loads, you
> should never expect raw I/O throughput to get near FC port line rate.
>
>
Oh, I never expected to get my full throughput. I just don't want to
end up hamstrung in the future. The performance I get even through
2Gbit FC exceeds multipathed iSCSI over gigE, so 4Gbit FC is another
step up, and multipathed 4Gbit FC is better yet! Although when I fiddle
with my multipath.conf too much, I cut the throughput of a multipathed
connection to less than that of a single FC connection. Go figure (and go
back to the defaults).
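
For reference, the knobs I was fiddling with are basically the round-robin
ones - roughly this, with values that vary by multipath-tools version, so
don't copy it blindly:

    defaults {
        path_selector  "round-robin 0"   # rotate I/O across the paths in a group
        rr_min_io      100               # I/Os sent down one path before switching
    }

That last knob is the sort of thing that, tuned badly, can make a
multipathed connection slower than a single link.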

What'll be even MORE interesting is when I connect the second ports on
my tape drives to the new switches, which will give me multipathing on
THOSE suckers. That'll be a party to configure with Veritas...

Adam

> --Arthur Corliss
> Live Free or Die

-- 
Adam