Re: Er.... for Sendmail users......enough

From: Christopher E. Brown <cbrown@woods.net>
Date: Thu Sep 07 2006 - 03:20:41 AKDT

On Thu, 31 Aug 2006, Matthew Schumacher wrote:

> Christopher E. Brown wrote:
> > Pairs of Cyrus servers acting as hot and standby for a shared SCSI or
> > iSCSI target. Each server pair handles a subset of the users. Proxy
> > redirectors send the user session to the correct back end server. If a
> > server fails that subset of users is offline till the standby takes over.
> > OR
> > NetApp/other based NFS shares, with a hashed maildir directory tree. The
> > tree can be on one netapp, or spread across multiple units for performance
> > reasons (already assuming netapp == netapp cluster with FC-AL takeover).
> > Since true maildir is lockless and multi-access/NFS safe you can
> > have 10 - 20 Courier servers, all mounting the tree, and any system is
> > able to handle any account. No proxy-redirection required. Same applies
> > on the delivery side, any SMTP host can deliver to any account.
>
> This is an excellent point and something to consider depending on your
> availability and performance requirements. From one perspective the
> lockless nature of maildir/courier really makes clustering simple, but I
> doubt it will be as fast at some operations (not even a netapp can
> open/parse/close 4700 files in .050 seconds), and you still depend on a
> single file system.

Horizontal scaling the whole way. A properly done dir structure (and when
you are doing virtual service, the location of a user's box is entirely up
to you) can easily be spread across multiple storage systems (read
multiple Netapp clusters). And of course you can always spread to
multiple clusters, though this requires domain separation or POP/IMAP
multiplexors.
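
As a rough illustration of a hashed directory layout spread over several
storage systems, here is a minimal Python sketch. The mount points, the
choice of SHA-256, and the two-level fan-out are all assumptions made for
the example, not a description of any particular deployment.

```python
import hashlib
from pathlib import PurePosixPath

# Hypothetical NFS mount points, one per storage system (assumed names).
STORAGE_ROOTS = ["/mail/store0", "/mail/store1", "/mail/store2", "/mail/store3"]


def maildir_path(user, domain, roots=STORAGE_ROOTS):
    """Map user@domain to a maildir location using a stable hash."""
    user, domain = user.lower(), domain.lower()
    digest = hashlib.sha256(f"{user}@{domain}".encode("utf-8")).hexdigest()
    # The first byte of the digest picks the storage system.
    root = roots[int(digest[:2], 16) % len(roots)]
    # The next two hex pairs give a two-level fan-out (256 x 256 directories),
    # which keeps any single directory from growing too large.
    return PurePosixPath(root) / digest[2:4] / digest[4:6] / domain / user / "Maildir"


if __name__ == "__main__":
    print(maildir_path("alice", "example.com"))
```

Since the location of a virtual user's box is entirely up to the operator,
the computed path would more likely be stored in the user database when the
account is created than recomputed on every login. That way adding a
storage system later does not move existing mailboxes.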

And actually even older netapps will easily support 10,000+ fileops/sec.
You are right that one server via NFS cannot though.

> Cyrus has murder, which is an IMAP/POP3 aggregator. This allows you to have
> a cluster of servers and split up the mailboxes between them, but that
> doesn't buy you any fault tolerance. To solve that problem, the new
> version of Cyrus now supports replication so you can have your mail
> sync'd to a hot standby in (near) realtime, but this solution is much
> more complex than the courier setup mentioned above.

Yep, I love having 8, 16, 32, etc identical systems (running the exact
same config files for the mail software). It really beats multiple
configs and multiplexing when it comes time to scale or rebuild systems.

> Some people have deployed two instances of cyrus on two servers, and
> setup half the mailboxes on one server as primary, and the other half on
> the second server as primary. They then have the second instance on
> server A act as secondary to the primary instance on server B and vice
> versa. This is complex, but does buy you complete redundancy without
> buying a netapp.

I am skeptical about the hot/backup server account replication. The
courier crowd has been working on this for (IIRC) over 6 years, and they
always had issues with replication/syncing performance and failover. I
have never seen (outside of small test systems) this work.

And, I would trust a FCAL or SCSI multi shared volume, a NetApp cluster
(FCAL multitarget with service takeover) over a complex software solution.

> Given the pros and cons, I would probably mock up both solutions in my
> lab before making the call. On one hand cyrus is very efficient and
> very quick, but has a bit more complexity when doing replication. On
> the other hand courier looks very simple to cluster, but probably would
> not perform as well when fetching headers or searching.

I have personally built/run Cyrus and Courier systems with 10,000+ users,
and Courier up to almost 50,000.

I am biased, I like simple and hard to break.

Back when both projects started, Cyrus went for indexing and accel, at the
expense of maildir's strong points: multi-access, hard to corrupt,
compatible. Courier tried to add to/improve while maintaining the
strengths of maildir. I prefer the Courier model, especially when we start
talking about dozens of servers and tens of thousands of users (think
corrupted indexes/etc).
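
For readers unfamiliar with why maildir is multi-access and hard to
corrupt, the delivery step can be sketched roughly as below. This is a
simplified Python illustration, not code from Courier or maildrop: the file
naming is a reduced form of the usual time/pid/host scheme, and the
function name `deliver` is made up for the example.

```python
import os
import socket
import time

_counter = 0  # per-process delivery counter, keeps names unique within one second


def deliver(maildir, message):
    """Deliver one message (bytes) into a maildir without taking any lock."""
    global _counter
    _counter += 1
    # Simplification: replace characters that are unsafe in a file name.
    host = socket.gethostname().replace("/", "_").replace(":", "_")
    name = f"{int(time.time())}.P{os.getpid()}Q{_counter}.{host}"
    for sub in ("tmp", "new", "cur"):
        os.makedirs(os.path.join(maildir, sub), exist_ok=True)
    tmp_path = os.path.join(maildir, "tmp", name)
    new_path = os.path.join(maildir, "new", name)
    # O_EXCL makes creation fail rather than overwrite if the name exists.
    fd = os.open(tmp_path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "wb") as f:
        f.write(message)
        f.flush()
        os.fsync(f.fileno())
    # Readers only look in new/ and cur/, so they never see a partial file.
    os.rename(tmp_path, new_path)
    return new_path
```

Each message is its own file with a unique name, so any number of delivery
hosts can write to the same mailbox at once, and there is no shared index
to corrupt if one of them dies mid-write.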

As to performance, last mockup I did was a few years back, with systems
running the same message load on the same hardware/OS (Quad Xeons, 2G
RAM, hardware RAID5 USCSI, Slackware, 2.4.x kernels).

Overall was about the same, with message lookup being very slightly faster
on Cyrus for accounts with < 1000 messages, and moderately faster with
10,000 messages. Message delivery was the reverse, with Courier (maildrop
LDA) running 15 messages/sec faster (35 vs 20) system wide. Both systems
were doing Quota enforcement. And while Courier delivery performance
stayed nearly the same as the message count in each account grew (forced
quota recalc every 15 min slowed a bit), Cyrus dropped to 12 messages/sec
once the average message count per account topped 5,000.
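
To put the delivery figures above side by side, here is a small Python
calculation. It uses only the three rates quoted above; the per-day totals
assume the rate is sustained around the clock, which is an assumption for
illustration and not something that was measured.

```python
# Delivery rates quoted above, in messages/sec, system wide.
COURIER = 35
CYRUS = 20
CYRUS_LARGE = 12  # Cyrus once accounts averaged more than 5,000 messages

SECONDS_PER_DAY = 86_400


def per_day(rate_per_sec):
    """Messages per day if the rate were sustained for 24 hours (assumed)."""
    return rate_per_sec * SECONDS_PER_DAY


if __name__ == "__main__":
    rates = [("Courier", COURIER), ("Cyrus", CYRUS), ("Cyrus, large boxes", CYRUS_LARGE)]
    for label, rate in rates:
        ratio = COURIER / rate
        print(f"{label:20s} {rate:3d}/s  {per_day(rate):>9,}/day  Courier is {ratio:.2f}x")
```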

My summary...

Add 10% IO (ops and throughput) to your backend storage solution, and 10%
CPU to your servers and reap the benefits of horizontal scaling.

I figure the lack of multiplexors alone pays for a couple of extra servers
to keep things peppy.

Course *none* of this really matters until you are talking 10s of
thousands of users with an HA requirement. At lower levels, a quality
external RAID chassis on a quality server will do you just fine. If you
want to get finicky, keep a full warm spare server ready to mount the
array. Just make sure to keep both servers on isolated SCSI buses.

Received on Thu Sep 7 03:21:17 2006
