Best way to test high-volume SMTP email sending code? - c#

I've written a component in a Windows service (C#) which is responsible for sending sometimes large volumes of emails. These emails will go to recipients on many domains – really, any domain. (Yes, the recipients want the email. No, I'm not spamming. Yes, I'm in compliance with CAN-SPAM. Yes, I'm aware sending email from code sucks.) Many of the emails are transactional (generated in response to user actions); some are bulk (mail-merges basically).
I do not want to rely on an external SMTP server. (Among other considerations, the thought of having to check a mailbox for bounce messages and trying to parse them gives me bad feelings.)
My design is fairly simple. Both the transactional and bulk messages are generated and inserted into a DB table. This table contains the email envelope and content, plus an attempt count and retry-after date.
The service runs a few worker threads which grab 20 rows at a time and loop through each. Using the Simple DNS Plus library, I grab the MX record(s) of the recipient's domain and then use System.Net.Mail.SmtpClient to synchronously send the email. If the call to Send() succeeds, I can dequeue the email. If it temporarily fails, I can increment the attempt count and set an appropriate retry-after date. If it permanently fails, I can dequeue and handle the failure.
Obviously, sending thousands of test emails to hundreds of different actual domains is a Very Bad Idea. However, I definitely need to stress-test my multi-threaded send code. I'm also not quite sure what the best way is to simulate the various failure modes of SMTP. Plus, I want to make sure I get past the various spam control methods (graylisting to name the most relevant to the network layer of things).
Even my small-scale testing difficulties are exacerbated by my recent discovery of my ISP blocking connections to port 25 on any server other than my ISP's SMTP server. (In production, this thing will of course be on a proper server where port 25 isn't blocked. That does not help me test from my dev machine.)
So, the two things I'm most curious about:
How should I go about testing my code?
What are the various ways that SmtpClient.Send() can fail? Six exceptions are listed; SmtpException and SmtpFailedRecipientsException seem to be the most relevant.
Update: Marc B's answer points out that I'm basically creating my own SMTP server. He makes the valid point that I'm reinventing the wheel, so here's my rationale for not using an 'actual' one (Postfix, etc) instead:
Emails have different send priorities (though this is unrelated to the envelope's X-Priority). Bulk email is low priority; transactional is high. (And any email or group of emails can be further configured to have an arbitrary priority.) I need to be able to suspend the sending of lower-priority emails so higher-priority emails can be delivered first. (To accomplish this, the worker threads simply pick up the highest priority items from the queue each time they get another 20.)
If I've already submitted several thousand bulk items to an external SMTP server, I have no way of putting those on hold while the items I wish to submit now get sent. A cursory Google search shows Postfix doesn't really support priorities; Sendmail prioritizes on information in the envelope, which does not meet my needs.
I need to be able to display the progress of the send process of a blast (group of bulk emails) to my users. If I've simply handed all of my emails off to an external server, I have no idea how far along in actual delivery it is.
I'm hesitant to parse bounce messages because each MTA's bounce message is different. Sendmail's is different from Exchange's is different from [...]. Also, at what frequency do I check my bounce inbox? What if a bounce message itself isn't delivered?
I'm not too terribly concerned with a blast failing half-way through.
If we're talking catastrophic failure (app-terminating unhandled exception, power failure, whatever): Since the worker threads dequeue each email from the database when it is successfully delivered, I can know who has received the blast and who hasn't. Further, when the service resets after a failure, it simply picks up where it left off in the queue.
If we're talking local failure (a SmtpException, DNS failure, or so forth): I just log the failure, increment the email's attempt counter, and try again later. (Which is basically what the SMTP spec calls for.) After n attempts, I can permanently fail the message (dequeue it) and log the failure for examination later. This way, I can find weird edge cases that my code isn't handling – even if my code isn't 100% perfect the first time. (And let's be honest, it won't be.)
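The retry bookkeeping described here could be sketched roughly as follows. This is Python rather than C#, for brevity, and the backoff schedule, the `MAX_ATTEMPTS` value, and all names are assumptions for illustration; the post only says "n attempts" and "an appropriate retry-after date". The one firm rule it relies on is that SMTP 4xx replies are transient and 5xx replies are permanent.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

MAX_ATTEMPTS = 5  # assumed cap; the post only says "n attempts"


@dataclass
class QueuedEmail:
    attempts: int = 0
    retry_after: Optional[datetime] = None
    state: str = "queued"  # "queued", "sent", or "failed"


def record_result(email: QueuedEmail, smtp_code: int, now: datetime) -> QueuedEmail:
    """Update one queue row from the SMTP reply code of a delivery attempt."""
    if 200 <= smtp_code < 300:
        email.state = "sent"      # delivered: safe to dequeue
    elif 500 <= smtp_code < 600:
        email.state = "failed"    # permanent failure: dequeue and log
    else:
        # 4xx replies (and anything unexpected) are treated as transient.
        email.attempts += 1
        if email.attempts >= MAX_ATTEMPTS:
            email.state = "failed"
        else:
            # Assumed exponential backoff: 5, 10, 20, 40 minutes.
            delay = timedelta(minutes=5 * 2 ** (email.attempts - 1))
            email.retry_after = now + delay
    return email
```

Connection-level failures (DNS errors, refused connections) have no reply code; passing a placeholder such as 0 sends them down the transient branch.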
I'm hoping the roll-my-own route will ultimately allow me to get emails out faster than if I had to rely on an external SMTP server. I'd have to worry about rate-limiting if the server weren't under my control; even if it were, it's still a bottleneck. The multithreaded architecture I've gone with means I'm connecting to multiple remote servers in parallel, decreasing the total amount of time it takes to deliver n messages.

Assume you've got two servers available. One will be the sender, one will be the receiver. You can set up DNS (or even just hosts files) on both with a long series of fake domains. As far as the two servers are concerned, those domains are perfectly valid as the local DNS servers are authoritative for them, but are completely invalid as far as the rest of the net is concerned. Just make sure the resolver checks the hosts file before DNS.
Once that's done, you can have the sending server spam the receiving server to your heart's content, and have the receiver do various things to test your code's reactions: greylisting, TCP delays, hard bounces, ICMP unreachables, ICMP hops exceeded, etc.
Of course, given you have to test all these conditions, you're basically creating your own SMTP server, so why not use an actual one to begin with? I'd guess the effort required to do some basic parsing of bounce messages will be far less than having to come up with code chunks to handle all the failure modes that postfix/sendmail/exim/etc. already handle perfectly well on their own.
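If a second machine isn't handy, a scripted receiver can stand in for it. Below is a minimal sketch of a fake SMTP server in Python (standard library only) that greylists or hard-bounces depending on the recipient's local part. The `bounce@` and `greylist@` conventions, the class names, and the `try_send` helper are all made up for this example, and the server speaks only enough SMTP to satisfy a simple client.

```python
import smtplib
import socketserver
import threading


class FakeSMTPHandler(socketserver.StreamRequestHandler):
    def reply(self, line):
        self.wfile.write(line.encode("ascii") + b"\r\n")

    def handle(self):
        self.reply("220 fake.test ESMTP")
        while True:
            raw = self.rfile.readline()
            if not raw:
                return  # client hung up
            command = raw.decode("ascii", "replace").strip()
            verb = command[:4].upper()
            if verb in ("EHLO", "HELO", "MAIL", "RSET", "NOOP"):
                self.reply("250 OK")
            elif verb == "RCPT":
                self.reply(self.server.rcpt_reply(command))
            elif verb == "DATA":
                self.reply("354 End data with <CR><LF>.<CR><LF>")
                for line in self.rfile:  # discard the body up to the lone dot
                    if line.rstrip(b"\r\n") == b".":
                        break
                with self.server.lock:
                    self.server.accepted += 1
                self.reply("250 OK queued")
            elif verb == "QUIT":
                self.reply("221 Bye")
                return
            else:
                self.reply("502 Command not implemented")


class FakeSMTPServer(socketserver.ThreadingTCPServer):
    allow_reuse_address = True
    daemon_threads = True

    def __init__(self, address):
        super().__init__(address, FakeSMTPHandler)
        self.accepted = 0   # messages acknowledged with 250 after DATA
        self.seen = set()   # RCPT commands already greylisted once
        self.lock = threading.Lock()

    def rcpt_reply(self, command):
        address = command.lower()
        with self.lock:
            if "bounce@" in address:
                return "550 5.1.1 No such user"
            if "greylist@" in address and address not in self.seen:
                self.seen.add(address)
                return "451 4.7.1 Greylisted, try again later"
        return "250 OK"


def try_send(host, port, recipient):
    """Send one message; return 250, or the code the recipient was refused with."""
    with smtplib.SMTP(host, port, local_hostname="client.test", timeout=5) as client:
        try:
            client.sendmail("sender@example.test", [recipient],
                            "Subject: test\r\n\r\nhello\r\n")
        except smtplib.SMTPRecipientsRefused as refused:
            return refused.recipients[recipient][0]
    return 250
```

Starting it with `FakeSMTPServer(("127.0.0.1", 0))` and running `serve_forever` on a background thread binds a free loopback port, which also sidesteps an ISP block on outbound port 25, provided the sending code lets you override the target host and port for tests. The `accepted` counter gives a server-side total to compare against what the sender believes it delivered.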
And this is especially true when you consider your sending code has to be perfect from the get-go. If an email blast fails part-way through and only half the recipient list gets the message, you're in a far bigger hole than if a few hundred or a few thousand messages bounce. Or worse yet, fails in multiple different ways (some servers unreachable, some greylisting you for excessive traffic, etc...). Whereas bounces will happily sit in the incoming queue until you process them manually, or patch up your bounce parser to handle them.

After searching around, I ended up firing up Papercut on several extra machines I had lying around. I then populated my database with test addresses *@[test-machine-*.local].
While this did work well enough, I tested with 25 send threads and it looked like I was overwhelming the four computers running Papercut. Several hundred send attempts experienced TCP connection failures; those messages were properly requeued to be sent later (and ultimately did arrive). However, out of 25,000 test emails, about 500 simply disappeared – adding up the *.eml files in Papercut's folder on each test machine yielded only ~24,500.
Now I'm left wondering whether the missing emails are due to a problem in my code, or if Papercut dropped messages which it reported in SMTP as 250 OK.
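One way to narrow this down is to stamp every test message with a unique ID and compare the IDs the sender recorded as delivered against the IDs that actually landed on disk. A rough sketch in Python, assuming a hypothetical `X-Test-Id` header and assuming the receiver writes custom headers into its .eml files unchanged:

```python
from email import message_from_binary_file
from pathlib import Path


def delivered_ids(folders, header="X-Test-Id"):
    """Collect the test IDs found in every .eml file under the given folders."""
    ids = set()
    for folder in folders:
        for path in Path(folder).glob("*.eml"):
            with open(path, "rb") as handle:
                value = message_from_binary_file(handle)[header]
            if value:
                ids.add(value.strip())
    return ids


def missing_ids(sent_ids, folders):
    """IDs the sender believes were accepted but that never reached disk."""
    return set(sent_ids) - delivered_ids(folders)
```

If the missing IDs were all logged by the sender as acknowledged with 250, the receiver is the more likely culprit; if the sender has no record of a successful send for them, the bug is probably in the queueing code.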

Related

Concerned about methods of backlogging network messages in the event of a LAN outage

I'm working in a Windows 7 Embedded environment, with very few resources left after everything is loaded in.
There are about 8 different message types that the clients (Win 7 Embedded) can send to the server, but only 2 of them are of high importance (they must be sent after any period of network outage). The other 6 messages I have set to retry for about 30 seconds if there is a send failure.
My concern is how I will be holding these messages in memory. My two ideas so far are:
1) having the threads the messages are trying to send on sleep until network connectivity is regained (this could lead to a lot of sleeping threads if, say, the network was out for several days).
2) writing messages that were unable to send to a file, then flushing the file and sending each message when connectivity is regained.
What I'm wondering is if method 1 would cause too much overhead, if say 50 threads were in a 'Sleep' state?
If so, should I go with option 2?
Perhaps there's another, more clever option, I haven't considered yet also.
"2 of them are of high importance"
If this is the case, you should probably not rely on leaving the messages in memory, especially with the consideration
"if, say, the network was out for several days"
If there is a power loss or other restart/failure, the messages would be lost if in RAM.
I suggest serializing the messages to a persistent store (e.g. disk). When you detect that network connectivity is restored, check for serialized messages and send them then. Make sure the message is transmitted before deleting from disk (just in case the network goes down again after you first detected it up, but before you send the message).
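A minimal sketch of that store-and-forward pattern, in Python for brevity. The JSON file format, the file naming, and the `send` callable are assumptions for illustration; the point is the ordering: write durably first, delete only after the send has succeeded.

```python
import json
import os
import time
import uuid
from pathlib import Path


def spool(directory, message):
    """Write one message to disk and return its path."""
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    # A timestamp prefix keeps files roughly in arrival order.
    final = directory / f"{time.time_ns():020d}-{uuid.uuid4().hex}.json"
    temp = final.with_suffix(".tmp")
    with open(temp, "w", encoding="utf-8") as handle:
        json.dump(message, handle)
        handle.flush()
        os.fsync(handle.fileno())  # make sure it survives a power loss
    os.replace(temp, final)        # rename so readers never see a partial file
    return final


def flush_spool(directory, send):
    """Send spooled messages; delete each file only after send() succeeds."""
    sent = 0
    for path in sorted(Path(directory).glob("*.json")):
        with open(path, encoding="utf-8") as handle:
            message = json.load(handle)
        try:
            send(message)
        except OSError:
            break  # network is down again; keep this file and the rest
        path.unlink()
        sent += 1
    return sent
```

This gives at-least-once delivery: if the process dies between `send` and `unlink`, the message goes out again on the next flush, so the server should tolerate duplicates.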

Is there a way to return any unsent messages in zeromq?

I currently have a system in which the apps ping each other to check they're still alive. I have 1 client, which sends requests, 1 router which distributes them and 2 workers which do the work and return the results.
If a worker dies, the router works it out and only sends to the other. This gives me time to see what's gone wrong and act accordingly. If the router dies, then the client knows about it. What I would like to know however is if my client works out that the router's died, (which it can do), then it takes any messages being queued up and simply sends them back.
It's a chat app, so somebody will send a message, this will go to the client dealer port. The dealer will suddenly go "waaah the router's dead!" and send the message back along with an error to say the system is currently down. This way if catastrophe hits, users will know that the system is down and stop sending messages until it comes back up.
Is there a way to do this? Am I right in saying that the high water mark knows internally how many messages have currently not been sent, and maybe I could use some function to get all the queued messages back off zero mq?
Many thanks
Nope, you'll have to handle it in your application. You'll need to keep a copy of the message, if ZMQ pukes and is unable to send it, you can manage that appropriately in your application. Once you have determined it has gone through, then you can discard it.
For those who have a similar thought, I found the best solution was actually to add a timeout on the client side. Cache the last data requests in the user's browser and, if there hasn't been a reply within a suitable time frame, file an error report.
Failing that, the only other options I could find were to either handle the caching yourself or use a different messaging platform (rabbitmq, activemq). Personally, however, the memory requirements alone bothered me too much to consider these.

Why would you send email in a batch instead of sending individually?

We have a Windows Service that checks whether certain conditions are met and, if so, sends an email to the customer. We will have about 50 emails to send every day. My question is, is it better to send emails out individually (i.e. every time the condition is met, it triggers the sendmail function) or queue all the emails and send in a batch? Is it better to send in a batch because of performance reasons? But we only send about 50 emails a day so it doesn't matter too much? How would you queue the emails if the emails should be sent in a batch?
Many Thanks
Generally the reason for batching multiple e-mails into a single e-mail is so as not to irritate the recipient, rather than for performance reasons. 50 e-mails throughout the day can be very annoying and will quickly cause the recipient to "tune out," whereas a single e-mail containing all pertinent notifications may be easier to digest.
As to how to re-queue the e-mails, it would be best if you could modify the service itself to store the outgoing e-mails in a file or buffer and only send the contents of that file or buffer once a certain threshold has been reached - be that a time threshold or a size threshold.
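The threshold idea could be sketched like this, in Python for brevity. The class name, the default limits, and the injectable clock are assumptions for illustration, not something from the original service.

```python
import time


class DigestBuffer:
    """Hold notifications until a size or age threshold is reached."""

    def __init__(self, max_items=20, max_age_seconds=3600, clock=time.monotonic):
        self.max_items = max_items
        self.max_age_seconds = max_age_seconds
        self.clock = clock
        self.items = []
        self.first_added = None

    def add(self, notification):
        """Queue a notification; return a batch to send if a threshold was hit."""
        if not self.items:
            self.first_added = self.clock()  # age is measured from the oldest item
        self.items.append(notification)
        return self.take_if_due()

    def take_if_due(self):
        """Return and clear the buffered batch if it is due, else None."""
        if not self.items:
            return None
        too_many = len(self.items) >= self.max_items
        too_old = self.clock() - self.first_added >= self.max_age_seconds
        if too_many or too_old:
            batch, self.items = self.items, []
            return batch
        return None
```

`take_if_due` also needs to be called from a timer, otherwise a lone notification would wait in the buffer until the next one arrives.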
If you're only sending 50 emails a day, the point is entirely moot.
In terms of the actual server, SMTP doesn't care if you send in batch or individually; it is just working through an email backlog queue to send out.
The only real concern should be whether you need to continuously query your DB for emails to send, or whether you want to query your DB for batches of email to send (to cut down on DB queries).
Ignoring performance, there's another thing you need to keep in mind: does the user need that e-mail as soon as the condition is met? If so, don't even consider queueing the messages unless you're emptying the queue on a nearly constant basis.
Otherwise, it's really up to you. 50 e-mails a day isn't going to break your server, so I wouldn't worry about performance. On the other hand, if it's convenient for you and your users to send the e-mails in batches, go for it.

Test c# email sending speed

I have a c# application that sends an email out to all employees in my database (not XPmail.)
I have over 300 employees and I was told it is a little slow. Is there any way I can test the speed of CC'ing 300 employees and sending it out? I can't timestamp each email since it's all carbon copied after the read loop in the database.
The first thing to check is whether you're sending 300 e-mails to 1 person each or 1 e-mail bcc'd (not to or cc'd, bcc'd) to 300 people. If the former, you really should do the latter. Even better, you should have a distribution list set up on your server for this.
Regardless, the problem is almost certainly at your e-mail (smtp) server. There won't be anything you can change in your code to make it faster, and using a different language or platform won't help — it's all up to the smtp server and the bandwidth available.
Sending a single email with many CC's or BCC's is exactly that - a single email. From that point, it's up to the mail server to dispatch the individual messages. Although you likely have little control over the mail software itself, it should always be faster than queuing up 300 individual messages.

Meaningful interaction with IIS SMTP Server in .Net

Our business sends a newsletter to a vast number of subscribers every week. When the business was very young, before I joined, they used a "free" version of some mass mailer that took six hours to send 5K mails and fell foul of every reverse DNS check on the internet.
I upgraded this to a bespoke .Net widget that ran on the correct server and could send up to about 20k mails in half an hour with full DNS compliance. Unfortunately (or fortunately depending on your standpoint) our mail list has now outgrown this simple tool. In particular, it lacks adequate throttling: it can generate more mails than the server can comfortably send at once. I need to actually monitor how full the IIS SMTP server's available outgoing mail storage allocation is and throttle the load accordingly.
Unfortunately I can find no information on where a mail object goes when (or even if) it is turned into a mail. I can implement a filesystemwatcher if I have a place to watch, currently I don't. If no actual mail file is ever created I guess I will have to create one to implement the functionality but I need to know where to put it. It would also be more reassuring to allow the system to confirm sending somehow but I have no idea how to go about retrieving data from the system that says a mail has been sent.
Extensive Googling has proven vague on these points; so I was wondering if anyone here knew where I could get a guide to these problems, or could otherwise point me in the right direction.
Many thanks.
EDIT: In the end I gave up trying to measure throughput on the IIS SMTP server as a bad job. It just didn't seem to want to play. I'm now carrying out my logging in a separate location and just shunting it through to the SMTP server thereafter. I still don't know of anyone who really bothers trying to keep tabs on the doings of the IIS SMTP server and so this question as of this writing goes unanswered.
Oh well...
Okay so I've been working on this project for ages now and I thought I might share my findings with the world.
The IIS SMTP Server
All mails created using the IIS SMTP server are sent, in the first instance, to the Pickup Directory. If you are sending one mail then you will have to operate in Matrix time to actually ever see it there because it will probably just go, straight away.
On an individual mail's way out of the door it passes through the queue folder in IIS.
If you wanted to watch the Performance Counter to monitor this process you would look at the "Remote Queue Length". (The reason for this is that the "Local Queue Length" monitors mails sent "Locally" within the network. "Remote" in this instance refers to "Outside into the world". The specific definition of "Local" escapes me as we send no local mail but I imagine it means queued to go to mailboxes contained within the specific installation of IIS on the server or any local grouping thereof.)
From an Exchange point of view it seems to be the equivalent of mails sent within the Exchange Domain and those sent out of that domain into the wider world.
Anyhow. The Remote Queue Length doesn't tell the whole story. You also have to look at the Remote Retry Queue, the number of Current Outbound Connections and, for belt and braces sake the actual number of files in the queue directory.
Here's why:
Remote Queue: All messages that have not yet been sent, however many times this has been tried. The number of mails currently assigned to any open connections are not counted as they are in a state of "being tried".
Remote Retry Queue: All messages that have not yet been sent that have, at some point in the past, been assigned to an open connection for delivery. Obviously the delivery must have failed or the message would have been delivered. Any messages currently assigned to an open connection for a retry are not counted.
Current Outbound Connections: Shows when the server is attempting to send queued mails, more than one message can be assigned to an outbound connection. Messages thus assigned are not counted in either the Remote Queue or the Remote Retry queue.
Physical Files in the queue directory: This shows the number of mails still in the Queue directory. This will decrease as mails are successfully delivered.
Example: If you have 0 outbound connections and 50 mails in the Queue directory then the Remote Queue, Retry Queue and Physical files will all read at 50. When a retry flag is raised (this is a setting in IIS) the number of connections increases and the number of mails in the queues decreases. Until a mail is delivered the number of physical files remains the same. However as more than one mail can be sent on a current connection 1 connection may result in Remote Queue and Retry Queue lengths of 47 or lower. If, during the retry event, any mails are successfully delivered the number of physical files in the Queue directory will then decrease. When the connection closes the queue counters should all stabilise again.
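Since the physical file count is the one figure that only drops on successful delivery, a generator can throttle on it directly. A rough sketch in Python, for brevity; the high-water mark and polling interval are arbitrary assumptions, and the queue directory path is whatever your IIS installation uses.

```python
import os
import time


def queue_depth(queue_dir):
    """Number of files currently sitting in the queue directory."""
    with os.scandir(queue_dir) as entries:
        return sum(1 for entry in entries if entry.is_file())


def wait_for_room(queue_dir, high_water=500, poll_seconds=5.0, sleep=time.sleep):
    """Block until the queue drops below high_water; return how many polls it took."""
    waits = 0
    while queue_depth(queue_dir) >= high_water:
        sleep(poll_seconds)
        waits += 1
    return waits
```

Calling `wait_for_room` before handing each batch of mails to the pickup directory keeps the generator from outrunning the server. Counting files in the pickup directory as well would catch mails the server has not yet moved into the queue.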
Logging
It is possible with .Net's Mail library to specify a pickup directory separate from the IIS default. Here you can queue mails and get a bespoke service to occasionally move the mails into the IIS directory where the IIS service will take over and send out queued mails.
To do this you will be looking for the SmtpClient object's "DeliveryMethod" property which should be set to SmtpDeliveryMethod.SpecifiedPickupDirectory.
To actually set the SpecifiedPickupDirectory you should set the SmtpClient's PickupDirectoryLocation property.
When mails are delivered to this location they are stored as .eml files. The filename is a GUID. This means that multiple emails will be despatched in an essentially random order. You could, in theory, write code to address this situation if desired. The .eml file follows a standard format which can be read by opening the .eml in notepad. Parsing this will allow you to extract information for a log.
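Extracting log fields from an .eml file does not need a hand-written parser, since the format is a standard RFC 5322 message. A small sketch using Python's standard `email` package (the set of fields to log is an assumption; pick whichever headers your log needs):

```python
from email import policy
from email.parser import BytesParser


def summarize_eml(path):
    """Read only the headers of an .eml file and return the fields worth logging."""
    with open(path, "rb") as handle:
        message = BytesParser(policy=policy.default).parse(handle, headersonly=True)

    def field(name):
        value = message[name]
        return str(value) if value is not None else None

    return {
        "from": field("From"),
        "to": field("To"),
        "subject": field("Subject"),
        "date": field("Date"),
    }
```

Parsing with `headersonly=True` avoids decoding large bodies and attachments when only the envelope details are being logged.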
I hope this high level overview of the way the SMTP server in IIS works is of some assistance to someone in a similar position to the one I was in in March.
I would use the PerformanceCounter component to read the SMTP Service's Local Queue Length counter. That should keep you in control :-)
If your .net widget is bespoke, why not just throttle its output to some (definable) throughput?
As an alternative you might be able to fiddle with some registry settings for the SMTP server.
http://blog.rednael.com/CommentView,guid,dc20366c-3629-490a-a8ee-7e8f496ef58b.aspx
Apparently there are also some WMI counters (SMTP Server\Remote Queue Length and SMTP Server\Remote Retry Queue Length) that will give you useful information.
http://www.tech-archive.net/Archive/Internet-Server/microsoft.public.inetserver.iis.smtp_nntp/2008-02/msg00011.html
