After analyzing millions of DMARC reports, I came to the disappointing conclusion that only a fraction of them comply with the DMARC IETF RFC guidelines. Most of them lack mandatory elements or hold incorrect element values. I think domain owners deserve better from organizations like Google, Yahoo! and Mail.ru.
DMARC aggregate reports are a valuable source of information when it comes to monitoring email deliverability and DKIM and SPF policy alignment. They provide feedback to domain owners so they can monitor authentication and judge threats. Reports can also be used to set up and harden SPF and DKIM policies, as this will minimize the risk of third parties sending spam or phishing emails on behalf of the domain owner.
The DMARC IETF RFC also includes an XML schema with guidelines on what the report should look like (Appendix C). Some of these guidelines are not that well written, which is possibly why a lot of parties have trouble complying with them.
We at URIports receive thousands of DMARC reports every day and have noticed that many reports do not validate. In most cases, missing information leads to inconsistencies and guesswork. The more accurate and complete these reports are, the better informed domain owners are about their email deliverability.
In the table below we show a real-time list of the organizations that we've received the most (non-compliant) reports from in the last 7 days.
If an organization has 0% compliant reports, that means that every single received report contained a validation error: either an element was missing, or it had an incorrect or empty value.
The DMARC aggregate report has an element with the name "policy_published". This name would indicate that the elements within it contain the domain's published policy. The RFC explains this element as "PolicyPublishedType":
The comments mention "applied" which is in contrast to the name of the element ("policy_published"), as some organizations that send aggregate reports do not send failure reports and thus do not "apply" the "fo" (Failure reporting options) element. This particular element's comment also implies that it is optional: "failure reporting options in effect". On the other hand, this element has a default "minOccurs" value of 1, so it should not be omitted.
If you ask me, the comments are to blame, and that's why so many organizations have a different implementation. I think the element 'policy_published' should be just that: 'the published policy'. When a policy tag is omitted because it is optional (adkim, aspf, sp, pct and fo), the tag's default value should be used.
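To illustrate what I mean by using the default value, here is a minimal Python sketch that fills in the optional tags of a published DMARC record. The function names `parse_dmarc_record` and `published_policy` are my own and purely illustrative. The defaults are the ones the RFC specifies: relaxed alignment for adkim and aspf, 100 for pct, "0" for fo, and the value of 'p' for sp. It assumes the record contains a 'p' tag and does no further validation.

```python
def parse_dmarc_record(record: str) -> dict:
    """Split a DMARC TXT record such as 'v=DMARC1; p=reject' into a tag dict."""
    tags = {}
    for part in record.split(";"):
        if "=" in part:
            name, value = part.split("=", 1)
            tags[name.strip().lower()] = value.strip()
    return tags


def published_policy(record: str) -> dict:
    """Return the published policy with defaults filled in for omitted tags.

    Assumes the record has a 'p' tag; a KeyError is raised otherwise.
    """
    tags = parse_dmarc_record(record)
    return {
        "p": tags["p"],
        # Without 'sp', subdomains inherit the policy given in 'p'.
        "sp": tags.get("sp", tags["p"]),
        # Alignment modes default to relaxed.
        "adkim": tags.get("adkim", "r"),
        "aspf": tags.get("aspf", "r"),
        # The policy applies to all messages unless 'pct' says otherwise.
        "pct": int(tags.get("pct", "100")),
        # Default failure reporting option.
        "fo": tags.get("fo", "0"),
    }


if __name__ == "__main__":
    print(published_policy("v=DMARC1; p=reject; rua=mailto:dmarc@example.com"))
```

A report generator that works this way would always be able to fill every 'policy_published' element, whether or not it sends failure reports itself.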
A common missing value in the 'identifiers' element is 'envelope_from'. It should contain the RFC5321.MailFrom domain, but is left out by some organizations, even though it has a minOccurs value of 1 and could contain valuable information. Fastmail and SpamTitan have a near-perfect score; only a few percent of their reports lack the 'envelope_from' element. I've reached out to ask why only some reports lack this element, but they haven't responded yet.
UPDATE 2019-07-31: After some analysis I've discovered that the records with a missing 'envelope_from' element have a 'helo' SPF scope. Both parties use the same Mail::DMARC generator, so I've filed an issue on their GitHub page.
UPDATE 2019-08-06: Alexander Brotman from Comcast has brought to my attention that some messages (non-delivery notifications, delivery status notifications and message disposition notifications) have a 'null reverse-path'. The question now is whether these messages should be included in DMARC aggregate reports. I have submitted this question to the DMARC IETF working group and removed the penalty for organizations that omit the 'envelope_from' element in fewer than 5% of their reports.
Some inconsistencies are easily fixed, like the missing or empty DKIM selector and SPF scope, and the empty 'extra_contact_info' that should just be omitted instead of having an empty value.
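The checks described above are simple to automate. The following Python sketch looks for the problems named in this article: a missing 'envelope_from', a missing or empty DKIM selector, a missing or empty SPF scope, and an empty 'extra_contact_info'. It is not the validator we run at URIports and it is not a full schema validation. The function name `find_problems` and the sample report are made up for illustration, and the sketch assumes the report XML does not declare a namespace.

```python
import xml.etree.ElementTree as ET

# Made-up sample report that contains all four problems.
SAMPLE_REPORT = """<feedback>
  <report_metadata>
    <org_name>example.net</org_name>
    <extra_contact_info></extra_contact_info>
  </report_metadata>
  <record>
    <identifiers>
      <header_from>example.com</header_from>
    </identifiers>
    <auth_results>
      <dkim>
        <domain>example.com</domain>
        <selector></selector>
        <result>pass</result>
      </dkim>
      <spf>
        <domain>example.com</domain>
        <result>pass</result>
      </spf>
    </auth_results>
  </record>
</feedback>"""


def is_blank(element) -> bool:
    """True when the element is absent or holds only whitespace."""
    return element is None or not (element.text or "").strip()


def find_problems(report_xml: str) -> list:
    """Return a list of human-readable problems found in an aggregate report."""
    root = ET.fromstring(report_xml)
    problems = []

    # An optional element should be omitted rather than sent empty.
    extra = root.find("report_metadata/extra_contact_info")
    if extra is not None and is_blank(extra):
        problems.append("empty extra_contact_info")

    for index, record in enumerate(root.findall("record"), start=1):
        if record.find("identifiers/envelope_from") is None:
            problems.append(f"record {index}: missing envelope_from")
        for dkim in record.findall("auth_results/dkim"):
            if is_blank(dkim.find("selector")):
                problems.append(f"record {index}: missing or empty DKIM selector")
        for spf in record.findall("auth_results/spf"):
            if is_blank(spf.find("scope")):
                problems.append(f"record {index}: missing or empty SPF scope")
    return problems


if __name__ == "__main__":
    for problem in find_problems(SAMPLE_REPORT):
        print(problem)
```

Note that a missing 'envelope_from' is flagged unconditionally here; the null reverse-path case from the update above would need separate handling.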
Drafts VS RFC
The XML elements IdentifierType 'envelope_from', SPFAuthResultType 'scope', DKIMAuthResultType 'selector', and feedback 'version' were all added to the DMARC (pre-IETF) draft in January 2013. It looks like most organizations (even founding contributors like Google, Yahoo! and LinkedIn) based their code on the pre-2013 drafts and haven't touched that code since. This explains why these XML elements are missing from their reports. The PolicyPublishedType 'fo' element was added to draft-kucherawy-dmarc-base-02 in December 2013. You would expect organizations to have updated to the final XML schema as soon as the RFC was published in March 2015; unfortunately, most didn't.
As more and more domains adopt DKIM, SPF and DMARC, I'm hoping more aggregate report sending parties will make the reports as complete as possible so we at URIports can help domain owners in making the internet a better and safer place.
Have any comments? Do you (dis)agree with my interpretations of the RFC guidelines? Please, let me know. Find me on Twitter @freddieleeman.