After analyzing millions of DMARC reports, I came to the disappointing conclusion that only a fraction of them comply with the DMARC IETF RFC guidelines. Most of them lack mandatory elements or hold incorrect element values. I think domain owners deserve better from organizations like Google, Yahoo!, and Mail.ru.
DMARC aggregate reports are a valuable source of information when it comes to monitoring email deliverability and DKIM and SPF policy alignment. They provide feedback to domain owners so they can monitor authentication and judge threats. Reports can also be used to set up and harden SPF and DKIM policies, as this will minimize the risk of third parties sending spam or phishing emails on behalf of the domain owner.
The DMARC IETF RFC also includes an XML schema with guidelines for the report (Appendix C). Unfortunately, some of these guidelines are not that well written, which is possibly why many parties have trouble complying with them.
We at URIports receive thousands of DMARC reports every day and have noticed that many of them do not validate. In most cases information is missing, which leads to inconsistencies and guesswork. The more accurate and complete these reports are, the more aware domain owners are of the deliverability of their emails.
The table below shows a real-time list of the organizations from which we've received the most (non-compliant) reports in the last three days.
If an organization has 0% compliant reports, every single report contains a validation error. Either an element was missing or had an incorrect or empty value.
The DMARC aggregate report has an element with the name policy_published. This name would indicate that the elements within contain the domain's published policy. The RFC explains this element as PolicyPublishedType:
The schema comments describe the policy that was applied, which conflicts with the name of the element (policy_published). Some organizations that send aggregate reports do not send failure reports and therefore never "apply" the fo (failure reporting options) element. The comment on this particular element, "failure reporting options in effect," also implies that it is optional. On the other hand, the element has the default minOccurs value of 1, so it should not be omitted.
The comments are to blame, and that's why so many organizations have a different implementation. I think the element policy_published should be just that: the published policy. When an optional policy tag (adkim, aspf, sp, pct, or fo) is omitted from the published record, the report should contain that tag's default value.
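To illustrate what I mean by using the default value, here is a minimal Python sketch. The function name published_policy and the example record are my own for illustration and are not taken from any report generator. It assumes the defaults from RFC 7489 (adkim=r, aspf=r, pct=100, fo=0, and sp falling back to the value of p) and does no validation of the record.

```python
# Defaults for optional DMARC tags as defined in RFC 7489.
# "sp" has no fixed default: it falls back to the value of "p".
DEFAULTS = {"adkim": "r", "aspf": "r", "pct": "100", "fo": "0"}


def published_policy(record: str) -> dict:
    """Parse a DMARC TXT record and fill in defaults for omitted optional tags.

    Hypothetical helper for illustration only; it does not validate the record.
    """
    tags = {}
    for part in record.split(";"):
        if "=" not in part:
            continue  # skip empty segments such as a trailing ";"
        name, value = part.split("=", 1)
        tags[name.strip().lower()] = value.strip()

    # Published tags win over defaults.
    policy = {**DEFAULTS, **tags}
    if "p" in policy:
        policy.setdefault("sp", policy["p"])
    return policy


policy = published_policy("v=DMARC1; p=reject; rua=mailto:dmarc@example.com")
print(policy["sp"], policy["adkim"], policy["aspf"], policy["pct"], policy["fo"])
```

With this approach, a report generator would always have a value for every policy_published element, whether or not the domain owner published the tag.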
A typical missing value in the identifiers element is envelope_from. It should contain the RFC5321.MailFrom domain, but some organizations leave it out, even though it has a minOccurs value of 1 and could contain valuable information. Fastmail and SpamTitan would have a near-perfect score were it not for the few percent of their reports that lack the envelope_from element. I've reached out to ask why only some reports lack this element, but they haven't responded yet.
UPDATE 2019-07-31: After some analysis, I've discovered that the records with a missing envelope_from element have a helo SPF scope. Both parties use the same Mail::DMARC generator, so I've filed an issue on their GitHub page.
UPDATE 2019-08-06: Alexander Brotman from Comcast has brought to my attention that some messages (non-delivery notifications, delivery status notifications, and message disposition notifications) have a null reverse-path. The question now is whether these messages should be included in DMARC aggregate reports. I have submitted this question to the DMARC IETF working group and removed the penalty for organizations that omit envelope_from in fewer than 5% of their reports or that specify the helo SPF scope.
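The kind of check behind these updates can be sketched in a few lines of Python. This is not the code we run at URIports. The function name and the sample report are hypothetical, the sketch assumes the report does not use an XML namespace, and it treats an empty envelope_from the same as a missing one.

```python
import xml.etree.ElementTree as ET


def records_missing_envelope_from(report_xml: str) -> list:
    """Return (source_ip, spf_scopes) for every record without envelope_from.

    Hypothetical helper; assumes element names without an XML namespace.
    """
    root = ET.fromstring(report_xml)
    missing = []
    for record in root.iter("record"):
        value = record.findtext("identifiers/envelope_from")
        if value is None or not value.strip():
            source_ip = record.findtext("row/source_ip")
            scopes = [
                spf.findtext("scope")
                for spf in record.iterfind("auth_results/spf")
            ]
            missing.append((source_ip, scopes))
    return missing


# Made-up sample report with one helo-scoped record and one complete record.
SAMPLE = """<feedback>
  <record>
    <row><source_ip>192.0.2.1</source_ip></row>
    <identifiers><header_from>example.com</header_from></identifiers>
    <auth_results>
      <spf><domain>mail.example.com</domain><scope>helo</scope><result>pass</result></spf>
    </auth_results>
  </record>
  <record>
    <row><source_ip>192.0.2.2</source_ip></row>
    <identifiers>
      <envelope_from>example.com</envelope_from>
      <header_from>example.com</header_from>
    </identifiers>
    <auth_results>
      <spf><domain>example.com</domain><scope>mfrom</scope><result>pass</result></spf>
    </auth_results>
  </record>
</feedback>"""

print(records_missing_envelope_from(SAMPLE))
```

Listing the SPF scope next to each hit makes it easy to see whether the missing envelope_from coincides with the helo scope, as it did in the reports I analyzed.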
Some inconsistencies are easily fixed, like the missing or empty DKIM selector and SPF scope, and the empty extra_contact_info that should just be omitted instead of having an empty value.
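As a rough sketch of how these easy fixes could be detected, the Python below flags the three cases I listed. The function names and messages are made up for illustration, the sketch again assumes a report without an XML namespace, and it mirrors my list above rather than the full XML schema.

```python
import xml.etree.ElementTree as ET


def _blank(element) -> bool:
    """True when the element is absent or has no text content."""
    return element is None or not (element.text or "").strip()


def find_easy_fixes(report_xml: str) -> list:
    """List the easily fixed inconsistencies found in an aggregate report.

    Hypothetical helper; assumes element names without an XML namespace.
    """
    root = ET.fromstring(report_xml)
    problems = []

    # Only look at auth_results: policy_evaluated also has dkim/spf children.
    for dkim in root.iterfind(".//auth_results/dkim"):
        if _blank(dkim.find("selector")):
            problems.append("dkim selector missing or empty")

    for spf in root.iterfind(".//auth_results/spf"):
        if _blank(spf.find("scope")):
            problems.append("spf scope missing or empty")

    # An absent extra_contact_info is fine; an empty one should be omitted.
    extra = root.find("report_metadata/extra_contact_info")
    if extra is not None and _blank(extra):
        problems.append("extra_contact_info present but empty")

    return problems
```

The function takes the report XML as a string and returns a list of messages, so an empty list means none of these three issues were found.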
Drafts vs. RFC
The XML elements envelope_from (IdentifierType), scope (SPFAuthResultType), selector (DKIMAuthResultType), and the feedback version were all added to the DMARC (pre-IETF) draft in January 2013. It looks like most organizations (even founding contributors like Google, Yahoo!, and LinkedIn) based their code on the pre-2013 drafts and haven't touched that code since. This explains why these XML elements are missing from their reports. The PolicyPublishedType fo element was added to draft-kucherawy-dmarc-base-02 in December 2013. You would expect organizations to have updated to the final XML schema as soon as the RFC was published in March 2015; unfortunately, most didn't.
As more and more domains adopt DKIM, SPF, and DMARC, I'm hoping more aggregate report sending parties will make the reports as complete as possible so we at URIports can help domain owners to make the internet a better and safer place.
Any comments? Do you (dis)agree with my interpretations of the RFC guidelines? Please find me on Twitter @freddieleeman.