I spent a good part of the last few days trying to debug a very weird problem involving postfix and opendkim, so I thought it was a good idea to write the entire experience down for anybody who might be encountering the same (or a similar) problem. This was probably the weirdest misbehaviour I managed to trigger without involving any real bug…
On a system I control, I installed opendkim for signing only and configured postfix to interact with it: installation was smooth as usual and everything was deployed in an hour or two. The emails sent by the system are partly anonymized and some headers are therefore stripped before the mail goes out. For this reason, DKIM was configured to sign only some of the headers (and not all of them) or the signatures would fail to validate for remote users.
I sent a few test emails to Gmail and everything seemed to be fine: mail signed, signature header as expected, Google verifying the signature correctly and so on. So I told other users of the same server that the feature was enabled and to poke me in case something was wrong. Immediately one user wrote back saying he wasn’t seeing any signature at all in the emails he was sending.
I checked the logs for his email and found this:
Mar 24 12:22:56 myserver opendkim[32082]: 0CA2F2F1: can’t determine message sender; accepting
The presence of a “From” or “Sender” header within the email is mandatory for DKIM, otherwise the mail can’t be signed; this message was saying that the mail had none and was therefore refusing to sign it.
Easy.
We repeated the experiment a few times and observed that in fact some of the emails sent by this user had a signature applied, but most had none. We quickly realized a weird aspect, however: every email, when inspected on the receiving side, had all its headers in place, From included, so the problem could not be caused by the MUA, that was my first, obvious, candidate as a culprit for the misbehaviour…
Perhaps it was opendkim’s fault: failing to properly parse a From header that was actually there… But then why nobody was reporting anything about such a major problem?!?
Fortunately, on the same machine I also run another milter, something I wrote myself that I had no problem modifying, so I used it to log every header it received from postfix through the milter interface and compare these logs with the emails opendkim was refusing to sign. Result? No doubt, opendkim was right: no From header was being passed!
WTF?
As a next step, I added a rule in my header_checks, like this:
/^\S+:/ WARN HEADER_DEBUG
The meaning is: trigger a warning for every header, so I can see in my logs every header postfix sees, like the following:
Mar 24 16:45:02 myserver postfix/cleanup[22026]: A473C429D2: warning: header Subject: Test 1234 from unknown[1.1.1.1]; from=<dkimsender@skullkrusher.net> to=<dkimrecipient@skullkrusher.net> proto=ESMTP helo=<[1.1.1.1]>: HEADER_DEBUG
This way I could compare the headers seen by postfix with the headers passed to the milters and found a slightly different correlation: it wasn’t the From header only. Where the From header was present, another header was missing. Actually, in every single email there was a missing header and after a few more tests I also figured out how the missing header was always the first one.
This reminded me something.
Milters do not receive all the headers from postfix; one is actually excluded: the Received header added by postfix itself. This happens to guarantee compatibility with milters written for sendmail, as sendmail doesn’t pass that header either and milters therefore do not expect the header to be there.
After a short search I managed to find out a thread I read a while back:
https://groups.google.com/forum/#!topic/mailing.postfix.users/O5arYSqexis
The interesting part is this:
I wrote a small milter to archive incoming emails. While testing I
noticed, that I never got the Received line inserted by the local
Postfix itself. A tcpdump of the milter traffic doesn’t show the
Received line too. Out of curiosity I prepended a header line with
smtpd_sender_restrictions/./ PREPEND X-MTA: Local
and checked again, whether I could see this X-MTA line in the milter
header callback. I did not, but to my surprise the previously missing
Received line was there. Removing the X-MTA PREPEND made the Received
line dissappear again.So it seems to me, that the milter client eats the first header line
and the milter does not see it.
…and I assume you see the light too, now.
Amongst the other anonymization practices I had in place, there was also the removal of the locally-added Received header, through a proper IGNORE entry in my header_checks.
As the HEADER_DEBUG said above, this was taking place before the milter came into play; then, when the mail was passed to the milters, the first header was skipped (even if it wasn’t a Received header), causing the apparent loss of a random header (apparent, ’cause the header was still there, although not seen by the milters). For MUAs that tend to write the From header as the first one (like the one used by my poor user) the missing header was the From, causing opendkim to fail.
Once this was figured out, finding a solution has obviously been trivial: instead of applying an IGNORE to remove the locally added Received, I’m now using a REPLACE rule to strip away sensitive informations without removing the entire header.
This way that header remains in place, it can be again set apart by the milter interface, but all the other headers are passed to applications as expected.
So opendkim is finally happy, and I’m am too…
Just had exactly the same problem. Thank you for pointing out the solution.
How did you do this replace rule?
With something like
/^Received: .*mail\.bofhland\.org.*((?:Mon|Tue|Wed|Thu|Fri|Sat|Sun), .*)$/ REPLACE Received: from mua.bofhland.org (mua.bofhland.org [192.168.1.1]) by mail.bofhland.org; $1
in header_checks.
This rule will strip the original IP, rDNS and HELO for the MUA, replacing them with fixed values while leaving the rest of the information (queue-ID and timestamp) unaltered.
Note that in my case the hostname “mail.bofhland.org” for my local mailserver will only appear in the headers when the mail came in through the MSA smtpd instance, as the default one, listening on port 25, is associated with another hostname.
IOW, the hostname defined in main.cf is something else, while the “submission” entry in master.cf is passed the parameter
-o myhostname=mail.bofhland.org
If this weren’t the case, the rule above would replace the locally-added Received header for any incoming mail too…
Thank you very much indeed for posting this! A good analysis with helpful solutions. I was also using IGNORE and changed it to REPLACE which I actually prefer.