Google is eating our mail

25.04.2019 20:06

I've been running a small SMTP and IMAP mail server for many years, hosting a handful of individual mailboxes. It's hard to say when exactly I started. whois says I registered the tablix.org domain in 2005 and I remember hosting a mailing list for my colleagues at the university a bit before that, so I think it's safe to say it's been around 15 years.

Although I don't jump right away on every email-related novelty, I've tried to keep the server up to date with well-accepted standards over the years. Some of these came for free with Debian updates. Others needed some manual work. For example, I have SPF records and DKIM message signing set up on the domains I use. The server is hosted on commercial static IP space (with the very same IP address it had when it first went online) and I've made sure with the ISP that correct reverse DNS records are in place.

Homing pigeon

Image by Andreas Trepte CC BY-SA 2.5

From the beginning I've been worried that my server would be used for sending spam. So I always made sure I did not have an open relay and put in place throughput restrictions and monitoring that would alert me about unusual traffic. In any case, the amount of outgoing mail has stayed pretty minimal over the years. Since I'm hosting just a few personal accounts these days, fewer than 1000 messages have been sent to remote servers over SMTP in the last 12 months. I gave up on hosting mailing lists many years ago.

All of this effort paid off and, as far as I'm aware, my server has never been listed on any of the public spam blacklists.

So why am I writing all of this? Unfortunately, email is starting to become synonymous with Google's mail, and Google's machines have decided that mail from my server is simply not worth receiving. Being a good administrator and a well-behaved player on the network is no longer enough:

550-5.7.1 [...] Our system has detected that this
550-5.7.1 message is likely unsolicited mail. To reduce the amount of spam sent
550-5.7.1 to Gmail, this message has been blocked. Please visit
550-5.7.1  https://support.google.com/mail/?p=UnsolicitedMessageError
550 5.7.1  for more information. ... - gsmtp

Since mid-December last year, I've regularly been seeing SMTP errors like these. Sometimes the same message re-sent right away will not bounce again. Sometimes rephrasing the subject will fix it. Sometimes all mail from all accounts gets blocked for weeks on end until some lucky bit flips somewhere and mail mysteriously gets through again. Since many organizations use Gmail for mail hosting, this doesn't happen just for ...@gmail.com addresses. Now every time I write a mail I wonder whether Google's AI will let it through or not. Only when something like this happens do you realize just how impossible it is to talk to someone on the modern internet without having Google somewhere in the middle.
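For readers less familiar with the wire format: the reject quoted above is a single multi-line SMTP reply. Every line repeats the reply code, a hyphen after the code means more lines follow, and a space marks the last line. The 5.7.1 is an enhanced status code. Here is a minimal Python sketch of how such a reply can be picked apart. The names parse_smtp_reply, is_permanent_failure and SAMPLE are my own illustration, not part of any mail server or library, and the parser only handles the simple case shown here.

```python
import re

# One reply line: 3-digit code, "-" (more lines follow) or " " (last line), text.
REPLY_LINE = re.compile(r"^(\d{3})([- ])(.*)$")
# Optional enhanced status code such as "5.7.1" at the start of the text.
ENHANCED = re.compile(r"^([245]\.\d{1,3}\.\d{1,3})\s+(.*)$")

# The reply from the post, with the elided parts left exactly as quoted.
SAMPLE = "\n".join([
    "550-5.7.1 [...] Our system has detected that this",
    "550-5.7.1 message is likely unsolicited mail. To reduce the amount of spam sent",
    "550-5.7.1 to Gmail, this message has been blocked. Please visit",
    "550-5.7.1  https://support.google.com/mail/?p=UnsolicitedMessageError",
    "550 5.7.1  for more information. ... - gsmtp",
])


def parse_smtp_reply(raw):
    """Return (code, enhanced_code, message) for one multi-line SMTP reply."""
    code = None
    enhanced = None
    parts = []
    for line in raw.splitlines():
        match = REPLY_LINE.match(line)
        if match is None:
            raise ValueError(f"not an SMTP reply line: {line!r}")
        code = int(match.group(1))
        text = match.group(3)
        status = ENHANCED.match(text)
        if status:
            # Strip the repeated enhanced status code from the human-readable text.
            enhanced = status.group(1)
            text = status.group(2)
        parts.append(text.strip())
        if match.group(2) == " ":
            break  # a space after the code marks the final line
    if code is None:
        raise ValueError("empty reply")
    return code, enhanced, " ".join(parts)


def is_permanent_failure(code):
    """5xx replies are permanent failures; 4xx would be worth retrying later."""
    return 500 <= code <= 599
```

Calling parse_smtp_reply(SAMPLE) yields the code 550 and the enhanced code 5.7.1. Because 550 is a permanent failure, a sending server will normally bounce the message instead of queueing it for another attempt, which is why these rejects show up immediately.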

Of course, the 550 SMTP error helpfully links to a wholly unhelpful troubleshooting page. It vaguely refers to suspicious-looking text and IP history. It points to the Bulk Sender Guidelines, but I have trouble seeing myself as a bulk sender with 10 messages sent last week in total. It points to the Postmaster Tools which, after making me jump through some hoops to authenticate, tell me I'm too small a fish and have no actual data to show.

Screenshot of Google Postmaster Tools.

So far Google has blocked personal messages to friends and family in multiple languages, as well as business mail. I've stopped guessing what text their algorithms deem suspicious. What kind of intelligence sees a reply, with the original message referenced in the In-Reply-To header and partly quoted, and considers it unsolicited? I don't discount the possibility that there is something misconfigured at my end, but since Google gives no hint and the various third-party tools I've tried don't report anything suspicious, I've run out of ideas where else to look.

My server isn't alone with this problem. At work we use Google's mail hosting and I've seen this trigger-happy filter from the other end. Just recently I overlooked an important mail because it ended up in the spam folder. I guess it was pure luck it didn't get rejected at the SMTP layer. With my work email address I'm subscribed to several mailing lists of open source software projects, and Google regularly decides to block this traffic. I know because Mailman sends me a notification that my address caused excessive bounces. What system decides, after months of watching me read these messages and not once seeing me mark one as spam, that I suddenly don't want to receive them ever again?

Screenshot of the mailing list probe message.

I wonder. Google as a company is famously focused on machine learning through automated analytics and a bare minimum of human contact. What kind of a signal can they possibly use to train these SMTP rejects? Mail gets rejected at the SMTP level without the user's knowledge. There is no way for a recipient to mark it as not-spam since they don't know the message ever existed. In contrast to merely classifying mail into spam/non-spam folders, it's impossible for an unprivileged human to tell the machine it has made a mistake. Only the sender knows the mail got rejected and they don't have any way to report it either. One half of the feedback loop appears to be missing.

I'm sure there is no malicious intent behind this and that there are some very smart people working on spam prevention at Google. However, for a metric-driven company where a majority of messages are only passed within the walled garden, I can see how there's little motivation to work well with mail coming from outside. If all the training data is people marking external mail as spam and there's much less data about false positives, I guess it's easy to arrive at a prior that all external mail is spam, even with the best intentions.

This is my second rant about Google in a short while. I'm mostly indifferent to their search index policies, but this mail problem is much more frustrating. I can switch search engines, but I can't tell other people to go off Gmail. Email used to work, from its 7-bit days onward. It was the one standard thing that you could rely on in the ever-changing mess of messaging web apps and proprietary lock-ins. And now it's increasingly broken. I hope people realize that if they don't get a reply, perhaps it's because some machine somewhere decided for them that they don't need to know about it.

Posted by Tomaž | Categories: Life

When the terminal is not enough

19.04.2019 9:26

Sometimes I'm surprised by utilities that run fine in a terminal window on a graphical desktop but fail to do so in a remote SSH connection. Unfortunately, this seems to be a side-effect of software on desktop Linux getting more tightly integrated. These days, it's more and more common to see a command-line tool pop up a graphical dialog. It looks fancy, and there might be security benefits in the case of passphrase entries, but it also means that often the remote use case with no access to the local desktop gets overlooked.

GNOME keyring management is the offender I bump into the most. It tries to handle all entry of sensitive credentials on a system, like passwords and PINs, and integrates with the SSH and GPG agents. I remember that it used to interfere with private SSH key passphrase entry when jumping from one SSH connection to another, but that seems to be fixed in Debian Stretch.

On the other hand, running GPG in an SSH session still doesn't work by default (this might also pop up when, for example, signing git tags):

$ gpg -s gpg_test
gpg: using "0A822E7A" as default secret key for signing
gpg: signing failed: Operation cancelled
gpg: signing failed: Operation cancelled

This happens when that same user is also logged in to the graphical desktop, but the graphical session is locked. I'm not sure exactly what happens in the background, but something somewhere seems to cancel the passphrase entry request.

The solution is to set the GPG agent to use the text-based pin entry tool. Install the pinentry-tty package and put the following into ~/.gnupg/gpg-agent.conf:

pinentry-program /usr/bin/pinentry-tty

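After this, the passphrase can be entered through the SSH session (a gpg-agent that is already running has to be restarted first to pick up the change, for example with gpgconf --kill gpg-agent):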

$ gpg -s gpg_test
gpg: using "0A822E7A" as default secret key for signing
Please enter the passphrase to unlock the OpenPGP secret key:
"Tomaž Šolc (Avian) <tomaz.solc@tablix.org>"
4096-bit RSA key, ID 059A0D2C0A822E7A,
created 2013-01-13.

Passphrase:

Update: Note, however, that with this change in place, graphical programs that run GPG without a terminal, such as Thunderbird's Enigmail extension, will not work.

Other offenders are PulseAudio- and systemd-related tools. For example, inspecting the state of the sound system over SSH fails with:

$ pactl list
Connection failure: Connection refused
pa_context_connect() failed: Connection refused

Here, the error message is a bit misleading. The problem is that the XDG environment variables aren't set up properly. Specifically for PulseAudio, the XDG_RUNTIME_DIR should be set to something like /run/user/1000.
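As a quick workaround, or just to confirm that this is the cause, the variable can be filled in by hand before running the tool. Here is a minimal Python sketch of that idea. It assumes the usual /run/user/<uid> layout and that the directory already exists because the same user has a session open elsewhere. The function names are my own, not part of PulseAudio or systemd.

```python
import os
import subprocess


def runtime_dir_for(uid):
    """Conventional per-user runtime directory, e.g. /run/user/1000."""
    return f"/run/user/{uid}"


def env_with_runtime_dir(environ, uid):
    """Return a copy of environ with XDG_RUNTIME_DIR added if it is missing.

    An existing value is left alone, so a correctly set up session is not changed.
    """
    env = dict(environ)
    env.setdefault("XDG_RUNTIME_DIR", runtime_dir_for(uid))
    return env


def pactl_list():
    """Run `pactl list` with XDG_RUNTIME_DIR filled in for the current user."""
    env = env_with_runtime_dir(os.environ, os.getuid())
    return subprocess.run(["pactl", "list"], env=env, check=False)
```

Calling pactl_list() from a script started over SSH should then reach the PulseAudio socket of the running desktop session, provided that session exists. If /run/user/<uid> is not there at all, no amount of environment fiddling will help.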

This environment is usually set up by the pam_systemd.so PAM module. However, some overzealous system administrators disable the use of PAM for SSH connections, so it might not be set in an SSH session. To have these variables set up automatically, you should have the following in your /etc/ssh/sshd_config:

UsePAM yes
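
To check what an existing configuration file says without reading it by eye, something like the following Python sketch can help. It is a simplification under stated assumptions: it ignores Include directives, stops at the first Match block, uses the first occurrence of the keyword, and assumes "no" as the default when the keyword is absent. The function name effective_usepam is hypothetical.

```python
def effective_usepam(config_text):
    """Return the UsePAM value ("yes" or "no") found in a simple sshd_config."""
    for raw_line in config_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and comments carry no settings
        # Keyword and value may be separated by whitespace or by a single "=".
        parts = line.replace("=", " ", 1).split()
        keyword = parts[0].lower()  # keywords are case-insensitive
        if keyword == "match":
            break  # only global settings before the first Match block are considered
        if keyword == "usepam" and len(parts) > 1:
            return parts[1].lower()  # the first occurrence wins
    return "no"  # assumed default when the keyword is not set


def read_usepam(path="/etc/ssh/sshd_config"):
    """Convenience wrapper that reads the file from disk."""
    with open(path, encoding="utf-8") as handle:
        return effective_usepam(handle.read())
```

For an authoritative answer, running sshd -T as root prints the effective configuration as the daemon itself sees it. The sketch above is only meant for a quick look when that is not possible.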
Posted by Tomaž | Categories: Code