[TAG] Backup MXes considered harmful

Rick Moen rick at linuxmafia.com
Mon Oct 31 21:20:31 MSK 2005


----- Forwarded message from Rick Moen <rick at linuxmafia.com> -----

Date: Mon, 31 Oct 2005 10:12:08 -0800
From: Rick Moen <rick at linuxmafia.com>
To: TAG <tag at lists.linuxgazette.net>
To: Michael Siladi <msiladi at ix.netcom.com>
Cc: Tony Cratz <cratz at hematite.com>,
	Deirdre Saoirse Moen <deirdre at deirdre.net>,
	Bhroam Mann <bmann at starbug.org>
Subject: Re: Fwd: Mail delivery failed: returning message to sender

Quoting Michael Siladi (msiladi at ix.netcom.com):

> I think that Rick's explanation of "MX" clears-up much for me.

Glad to help.

> However, I'm wondering why the primary domain is still rejecting email 
> after several days.  Is Hematite having issues?

I'd speculate that either primary MX wwind was _briefly_ offline, or
the sending host at Earthlink / Netcom had a transient problem opening
a connection for reasons of its own.  There are lots of transient
connection failures; they're not necessarily anyone's fault.

I say "briefly" because I just manually sent a test mail directly to the 
primary MX, and it worked fine.  Here's how you do it, using nothing
more complex than a telnet client, connecting directly to the
destination MX's SMTP port (TCP port 25):

  ~ $ telnet wwind.hematite.com smtp
  Trying 63.192.0.42...
  Connected to wwind.hematite.com.
  Escape character is '^]'.
  220 hematite.com ESMTP Sendmail 8.12.11/8.12.11;
  mimedefang/SpamAssassin; Mon, 31 Oct 2005 09:46:29 -0800 (PST)
  HELO linuxmafia.com
  250 hematite.com Hello linuxmafia.com [198.144.195.186], pleased to meet you
  MAIL FROM: rick at linuxmafia.com
  250 2.1.0 rick at linuxmafia.com... Sender ok
  RCPT TO: postmaster at westercon60.org
  250 2.1.5 postmaster at westercon60.org... Recipient ok
  DATA
  354 Enter mail, end with "." on a line by itself
  From: rick at linuxmafia.com
  To: postmaster at westercon60.org
  Subject: test message - please ignore

  Hi, Tony.  Just a test message to illustrate SMTP for Michael. 
  Please ignore, and apologies for the intrusion.
  .
  250 2.0.0 j9VHkTHM016627 Message accepted for delivery
  quit
  221 2.0.0 hematite.com closing connection
  Connection closed by foreign host.
  [rick at linuxmafia]
  ~ $ 
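
For anyone who would rather script that check than type it by hand,
here is a minimal Python sketch of the same probe.  It assumes a
connection object offering the helo(), mail(), and rcpt() methods that
the standard library's smtplib.SMTP provides; the function name
probe_recipient is hypothetical.  It stops after RCPT TO and never
sends DATA, so no message is actually delivered.

```python
def probe_recipient(smtp, helo_name, sender, recipient):
    """Walk HELO / MAIL FROM / RCPT TO on an already-open connection.

    `smtp` is assumed to behave like smtplib.SMTP: each method returns
    a (reply_code, message) tuple.  Returns (stage, code, message) for
    the first refused step, or for RCPT TO if every step was accepted.
    """
    steps = [
        ("HELO", lambda: smtp.helo(helo_name)),
        ("MAIL FROM", lambda: smtp.mail(sender)),
        ("RCPT TO", lambda: smtp.rcpt(recipient)),
    ]
    for stage, send in steps:
        code, message = send()
        if not 200 <= code < 300:
            break  # the server refused this step; no point continuing
    return stage, code, message
```

Against a live server, one would pass it something like
smtplib.SMTP("wwind.hematite.com", 25, timeout=30) used as a context
manager, which sends QUIT on exit.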

Please note that "wwind.hematite.com" and "hematite.com" are two names
for the same machine:

  [rick at linuxmafia]
  ~ $ host wwind.hematite.com
  wwind.hematite.com has address 63.192.0.42
  wwind.hematite.com mail is handled by 10 wwind.hematite.com.
  wwind.hematite.com mail is handled by 20 smtp-relay.pbi.net.
  [rick at linuxmafia]
  ~ $ host hematite.com
  hematite.com has address 63.192.0.42
  hematite.com mail is handled by 20 smtp-relay.pbi.net.
  hematite.com mail is handled by 10 wwind.hematite.com.
  [rick at linuxmafia]
  ~ $
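
The numbers in those "mail is handled by" lines are MX preference
values; sending hosts try the lowest number first, so wwind is the
primary.  A small Python sketch that pulls the MX list out of `host`
output and sorts it in that order follows.  The function name parse_mx
is hypothetical, and the sample text is simply the hematite.com output
shown above.

```python
import re

# Matches the MX lines printed by the `host` command.
MX_LINE = re.compile(r"mail is handled by (\d+) (\S+)")


def parse_mx(host_output):
    """Return (preference, hostname) pairs, most preferred first."""
    records = []
    for line in host_output.splitlines():
        match = MX_LINE.search(line)
        if match:
            preference = int(match.group(1))
            hostname = match.group(2).rstrip(".")  # drop the root dot
            records.append((preference, hostname))
    return sorted(records)


SAMPLE = """\
hematite.com has address 63.192.0.42
hematite.com mail is handled by 20 smtp-relay.pbi.net.
hematite.com mail is handled by 10 wwind.hematite.com.
"""

for preference, hostname in parse_mx(SAMPLE):
    print(preference, hostname)
```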


As I was saying, it's ordinary and expected for SMTP hosts to
occasionally be unreachable for short stretches of time from various
remote locations:  The Internet is good, but hardly infallible.  That's
why SMTP senders traditionally queue undeliverable mail and keep
retrying for 4 days before giving up.
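
As a rough illustration of that retry logic (a sketch under stated
assumptions, not how any particular MTA implements its queue): a
sending host keeps retrying after connection failures and 4xx replies
until the timeout expires, but treats a 5xx reply as permanent and
bounces at once.  The function name next_action and the constant
GIVE_UP_AFTER are hypothetical; the four-day value is the figure
mentioned above.

```python
from datetime import datetime, timedelta

# Assumption: the four-day figure from the text; real MTAs make this
# configurable.
GIVE_UP_AFTER = timedelta(days=4)


def next_action(reply_code, first_attempt, now, give_up_after=GIVE_UP_AFTER):
    """Decide what a sending host does after one delivery attempt.

    reply_code is the SMTP reply as an int, or None when no connection
    could be made at all.
    """
    if reply_code is not None:
        if 200 <= reply_code < 300:
            return "delivered"
        if 500 <= reply_code < 600:
            return "bounce"  # permanent failure: no retries
    # Transient failure (no connection, 4xx, or anything unexpected).
    if now - first_attempt >= give_up_after:
        return "bounce"  # retried long enough; give up
    return "retry"
```

Note the asymmetry: an unreachable host costs the sender nothing but
time, whereas a host that answers with a 5xx reply causes an immediate
bounce.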

I therefore saw nothing in the prior e-mail to suggest a particular
problem at wwind:  The problem was (and probably remains) that at least
one of the backup MX hosts that's supposed to be backstopping wwind in
its handling of westercon60.org's mail isn't relaying that domain's
mail.

That has been a recurring problem with backup MX hosts for many years,
which is one reason why I gave up on the concept.


Here's my attempt to do the same manual SMTP session, except this time
talking to the backup MX, instead of the primary:

  [rick at linuxmafia]
  ~ $ telnet starbug.org smtp  
  Trying 66.120.20.121...
  Connected to starbug.org.
  Escape character is '^]'.
  220 choam.starbug.org ESMTP Sendmail 8.12.10/8.12.10; Mon, 31 Oct 2005 09:56:59 -0800
  HELO linuxmafia.com
  250 choam.starbug.org Hello linuxmafia.com [198.144.195.186], pleased to meet you
  MAIL FROM: rick at linuxmafia.com
  250 2.1.0 rick at linuxmafia.com... Sender ok
  RCPT TO: postmaster at westercon60.org
  550 5.7.1 postmaster at westercon60.org... Relaying denied. Proper authentication required.
  quit
  221 2.0.0 choam.starbug.org closing connection
  Connection closed by foreign host.
  [rick at linuxmafia]
  ~ $

As you can see, the backup MX host is misconfigured:  The admin either
was unaware that he had agreed to relay mail for westercon60.org (a
necessary precondition of serving as backup MX), or didn't know how, or
did know how but overlooked that configuration step (e.g., because
he installed a new sendmail version that, like all modern MTAs,
defaults to no relaying).

A somewhat paranoid sysadmin verifies that backup MXes are willing to
relay the domain's mail, at the time he/she creates the DNS "MX" entry
that informs the public they can drop off mail there.  A _truly_ paranoid 
sysadmin knows that a backup MX that isn't screwing up today may start
screwing up tomorrow.
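
That recurring check could be automated along the following lines.
This is only a sketch: check_mx_hosts is a hypothetical name, and the
`probe` argument is an assumed callable that takes a hostname and a
recipient address and returns the SMTP reply code to RCPT TO, raising
OSError if it cannot connect.  A real probe would open an SMTP
connection to the host and issue RCPT TO, as in the manual sessions
above.

```python
def check_mx_hosts(mx_hosts, recipient, probe):
    """Return (host, problem) pairs for every MX that won't take mail.

    `probe(host, recipient)` is an assumed callable returning the
    reply code to RCPT TO; it is expected to raise OSError when the
    host cannot be reached.
    """
    problems = []
    for host in mx_hosts:
        try:
            code = probe(host, recipient)
        except OSError as exc:
            problems.append((host, f"unreachable: {exc}"))
            continue
        if not 200 <= code < 300:
            problems.append((host, f"RCPT TO refused with {code}"))
    return problems
```

Run from cron against every host named in the domain's MX records, an
empty result means all of them accepted the recipient at the time of
the check, which of course says nothing about tomorrow.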

And, in my view, a truly paranoid and _wise_ sysadmin says "You know,
screw it.  I don't need backup MXes anyhow, and any third-party service
that I have to keep checking lest it start enthusiastically rejecting 
my mail, and that can do its worst damage during my own downtime, is 
far too much trouble to keep around."


----- End forwarded message -----




