After the paper was published, we have received many feedback from the DANE community; especially, the two authors of RFC7671, Viktor Dukhovni and Wes Hardaker, helped us a lot improve the quality of the paper. This page is for our response to their valuable feedback and we would like to discuss what we have updated to the paper and what we have not, along with our rationale.
1. What we have updated because either we were wrong or the text on the paper was not clear.
The text on “Unsuitable Usages” in Section 5.5 of the paper is a muddle.
We have removed the subsection, Unsuitable Usage, as our previous claims (written below) were wrong: “If the domain owner has a certificate issued by a CA, but serves a TLSA record with the DANE-EE or DANE-TA usage, they do not benefit fully from the security measures that DANE provides (instead, they should use the PKIX-EE or PKIX-TA Certificate Usage) … Therefore, a domain name owner should avoid setting their TLSA records with the DANE-EE or DANE-TA usage when they serve a certificate issued by a CA … Consequently, these records could have used PKIX-EE or PKIX-TA Certificate Usages, thus having the additional benefit of certificate validation through two independent mechanisms (DANE and PKIX)”
“buggy TLS applications do not bother to check the validity of certificates“: This is wrong. The authors of the paper failed to note that in SMTP with DANE-EE(3), per RFC7672 both the expiration date and all names in the certificate MUST be ignored.
We acknowledged that DANE-EE must ignore the expiration date and names, and we actually verified the cert per RFC7672; however, for the clarification, we have updated the paper.
DANE-TA(2) description is not correct; These are not necessarily root CAs in the usual sense of top-level certification authority self-signed certificates. Nor are the leaf certificates necessarily directly signed by the DANE-TA(2) trust-anchor. The verification of a certificate chain via a DANE-TA(2) trust anchor is still PKIX validation, it just employs the DANE-supplied trust anchor, rather than one of the (typically CA/B Forum WebPKI) CAs pre-configured on the client.
The reason why it looked misleading was that we didn’t put ultimately on the text. Thus, we have updated the paper as following: `a trust anchor (i.e., a root certificate), thus permitting any leaf certificates as long as they are ultimately signed by the trust anchor (DANE-TA)’
Because the SMTP protocol can use three possible port numbers (25, 465, and 587), we send three TLSA record requests for each MX record This is a mistake. The SMTP protocol, two different services, SMTP (MTA-to-MTA) and SUBMIT (MUA-to-MTA). MX records are only defined for SMTP, and NOT for SUBMIT. There is no expectation that the same hosts are both the inbound SMTP servers and the outbound submission servers for a given domain. MUAs do not perform MX lookups to find their submission servers. And since no MUAs are known to support DANE, it makes no sense to probe for working or non-working TLSA records on the submission ports (587 and 465).
We have updated the results with datasets collected through port 25 and clarified the text.
It appears that no measurements were performed for domains without MX records. But, in SMTP a domain without MX records is implicitly its own MX host, and a non-trivial fraction of DNSSEC-signed domains have no MX RRs. There is also no mention of checking the security status of the A and AAAA records of the explicit or implicit MX hosts before proceeding to TLSA lookups. Therefore, the reported number of TLSA lookups each hour is both much too low, likely because domains with no MX RRs are not included, and perhaps also includes some MX hosts that should be excluded, because their address records are unsigned.
We understand the limitation. Thus, we have added a footnote: It is possible that working SMTP servers may not have MX records [RFC5321], which are excluded from our measurement.
Finally, we observe that 9 out of 29 mail service providers use DNS resolvers outside their own network, which makes them vulnerable to man-in-the-middle attacks As mentioned above, this conclusion is not necessarily valid, a validating forwarding resolver avoids the issue, while taking advantage of an upstream cache. This is likely to become more common as more resolvers add support for DoH, and more users (wisely or otherwise) configure their resolvers to use it.
We have added a footnote: “However, if they use DNS-over-Encryption protocols such as DNS- over-HTTPS or DNS-over-TLS, such attacks can be mitigated.”
“If email service providers wish to support DANE, the software of their DNS servers and DNS resolvers must be able to understand TLSA records and to support DNSSEC to validate DNS responses.” This is not correct, just DNSSEC support is sufficient. Neither authoritative servers, nor especially iterative resolvers need any special knowledge of TLSA records. Only the application consuming the DNS response needs to understand TLSA records. The primary authoritative server for a zone needs to provide some mechanism to insert TLSA records to its database, but this too can often be done generically,
We have updated the sentence: “the software of their DNS servers must be able to support DNSSEC to validate DNS responses” and added the following footnote: “Only the application that uses TLSA records needs to understand TLSA records; for example, the authoritative DNS server can publish a TLSA record as a generic record (with TYPE52 instead of TLSA) ”
the DANE operational practice recommends to avoid using PKIX-EE and PKIX-TA More to the point, RFC7672 specifically designates these as not defined in SMTP.
We added more clarification on the paper.
2. What we have NOT updated with reasonings
36% of TLSA records cannot be validated due to missing or incorrect DNSSEC records: This claim is substantially misleading. The 36% is mostly composed of domains which, though internally signed, lack a signed delegation (DS records) in the parent zone, or in other words the domain is effectively unsigned from the perspective of most resolvers. The typical situation is an unsigned domain whose MX hosts are in a signed zone and have TLSA records. This is common because some DNS providers (e.g. Cloudflare) routinely sign hosted zones, even when they are not yet in a position to provision the corresponding DS records. Signing a zone that is not yet delegated security is NOT mismanagement. Nor is it mismanagement for an unsigned domain to use a DANE-enabled email hosting provider with signed MX hosts that have TLSA records.
This is a very good comment; however, we still believe the domains without DS records are misconfigured and mismanaged as there is no way to protect themselves from MITM attacks. It might be impossible for them to configure DNSSEC correctly; for example, their registries simply might not support DNSSEC (thus, unsigned). However, given that all of the TLDs we studied are signed, we believe they should have configured their DNSSEC chains correctly. We acknoweledge that some registrars do not allow their customers to upload DS records, but according to our recent study, 15 of the 20 popular registrars that are in charge of more than 50% of .com, .net, .org domains now support DNSSEC, thus we believe they should have configured the DNSSEC chains correctly by uploading DS records matched with their DNSKEYs.
It is claimed that 14% of TLSA records do not match the presented certificates, but this figure is not plausible in light of our own measurements. We find that ~3.4% percent of TLSA RRsets fail to match the server certificate chain. Perhaps the authors failed to take into account that in a TLSA RRset it is sufficient for any one TLSA record to match the certificate chain. It is in fact normal to also find additional TLSA records that do not match the ceritificate chain, especially when key rollover is handled by pre-publishing TLSA records for future keys. As outbound deployment grows making any misconfiguration more readily apparent to the domain owner, and as more DANE-aware tools for automating certificate updates become available, we expect that the error rate will decline. But even the current rate is 10 times smaller than reported in the paper, and its impact is much lower still.
We acknowledged that having at least one TLSA record matched with certificates is enough; we will look into it.
“14.17% of them (TLSA records) are inconsistent with their certificates”: This is misleading, because it fails to weight the TLSA RRsets by the number of affected domains2. A problem with the TLSA records of an MX host for a vanity domain hosting the email of single hobbyist is less significant than a problem with a professionally operated MX host that supports thousands of domains or millions of user mailboxes. We also take issue with the number, in a more comprehensive survey, covering all signed TLDs, we find 301 TLSA RRsets not consistent with their certificates (including cases where the MX host does not consistently offer STARTTLS) out of 8370 SMTP servers with TLSA RRs. This shows that only 3.58% of SMTP servers with TLSA records have the reported issues, but as mentioned above, this is not the most relevant metric. The number of affected domains is 462, as compared with 1.84 million domains with DANE-enabled SMTP servers. This shows that operator error is largely confined to a small number of SOHO domains.
We partially agreed; however, we clearly specified that we consider only “TLSA records” in the methodology section.
“only four email service providers support DANE for both outgoing and incoming emails, but two of them have drawbacks of not checking the Certificate Usage in TLSA records”: Of course more than just the four reported providers support outbound DANE. Some of the early adopters not mentioned are posteo.de, mailbox.org, kabelmail.de and udmedia.de. In addition to web.de, 1&1 also operate gmx.de (with millions of email users between them), both domains are DANE-enabled inbound and outbound. There are many more, though admittedly most and especially the largest providers such as gmail.com, outlook.com and yahoo.com are not yet on board.
It is always hard to make a “clear” and “objective” list. As it is practically impossible to estimate the market share of the email service providers, we referred to Adobe’s leaked user email database from 2013 to rank the email domains based on popularity and choose the top 25 providers.
“ …the complexity of DANE leads to many opportunities for mismanagement… TLSA records may have DNSSEC errors such as expired signatures…: True, but actually very rare. When automatic zone resigning fails, it rarely fails for just the TLSA records. More typically the DNSKEY RRset signature would also expire, and the entire zone becomes “bogus”. DNSSEC failure is not DANE-specific. All DNS resolution for the domain fails for users behind validating DNS resolvers, such as the popular public resolvers at 18.104.22.168, 22.214.171.124, 126.96.36.199, etc. Out of ~10.8 million domains in the DANE/DNSSEC survey operated by the authors of this document, ~60 thousand have DNSSEC resolution issues, with the majority of those being long-standing breakage for parked domains where nobody cares whether they’re working or not, and so the problems are ignored. Working DNS (including DNSSEC) is required for DANE, and though DNS servvice outages do happen for some domains some of the time, this is not a significant problem.
“… or the certificates may be inconsistent with published TLSA records“: As already explained, this is not a significant problem in practice, when properly weighted by impact. Also, since email is store-and-forward, a transient mismatch that is promptly corrected generally does not result in loss of email.
We are not arguing that DANE is mis-designed nor intrinsically vulnerable. In fact, there are many potential opportunities that a single mistake can make DANE validation failure, which we observed.
“ … On the client side, DNS resolvers may not validate TLSA records properly”: This is speculative. No evidence of this being an actual problem for sites implementing outbound DANE is presented.
We found that mail.com and tutanota.com did not check the usage field correctly on Table 3.
“In this paper, we present a comprehensive study of the entire DANE ecosystem for SMTP“: But in fact only the top three (by DANE domain count) TLDs nl, se and com, plus numbers 12 and 15 (net and org).
Here, the “entire” ecosystem does not mean we studied “all domains on the earth”. We focused on all possible angles (DNS, SMTP, DANE) that contribute to the DANE ecosystem and we believe that this is comprehensive.
“we tested four popular MTA … implementations to see if email providers can easily support DANE; we find that two popular MTAs correctly support DANE for both incoming and outgoing emails: The list is far from comprehensive. MTAs that support DANE include not only Postfix and Exim, but also Halon, PowerMTA, CloudMark, Cisco ESA, MeTA1, indimail-mta, …
We have to use open source MTA programs to confirm they do support DANE not only by running them but also by checking source code as spcified on the paper: “To obtain a list of popular open source MTA programs, we refer to a prior study that showed four popular MTAs (Exim, Postfix,Sendmail, Exchange13), together had a 61% market share in 2015.
” we see how many signed TLSA records do not have corresponding DS records”: As explained previously, signed zones that are not securely delegated are common and are not an operational error. They do represent known obstacles to getting DS RRs published, but are not a DANE-specific issue. That said, a claimed 18.5% of TLSA RRsets being in zones that are not delegated signed is much higher than expected. Most likely this is instead counting unsigned domains whose MX hosts are in signed domains with TLSA records (reported as 19% earlier in the paper).
We still think it is an operational error regardless if it is intended or not as having DS records are not optional specs to deploy DANE correctly.
“However, we observe that mail.com, tutanota.com do not check whether the Certificate Usage value of the TLSA record is consistent with the certificate. Given that self-signed certificates can never be PKIX valid, they should have rejected the invalid certificates during the TLS handshake.”: As explained above, this is normal expected behaviour from Postfix 2.11–3.1. The PKIX-EE(1) usage is treated as though it were DANE-EE(1), and no harm is done. Since the benefit of this work-around proved miniscule, it was withdrawn in Postfix 3.2. So with some confidence we can guess that these providers were using Postfix in that version range.
We explained this behavior: “they might (1) ignore a TLSA record whose Certificate Usage is either PKIX-TA or PKIX-EE (as these usages are not recommended ), or (2) skip the PKIX certificate validation except for checking the Certificate Association Data.”