Monday, October 4, 2010

BoB (Belkin F1PI243EGau) DNS is still broken

Update Oct 19, 2010: Contacting Belkin sales and customer feedback, and pointing out my increasing efforts to show in public how poorly they've performed, finally got a response from Belkin. Admittedly the response is to say they've put the issue through to an "overseas engineer" ... but maybe something will happen. More likely, it'll stay trapped in another layer of disinterest and poor management, this time one I can't apply direct pressure to.

Update Oct 6, 2010: An indirect approach, by bothering a friend who works at iiNet, got this issue through the support wall and to people who can deal with it. Belkin has been notified at a higher level too. It's a real shame that a clearly demonstrable issue like this got stuck behind support people at both companies, to the point where I had to bother a friend who shouldn't have to deal with this stuff just to get the issue looked into. Sometimes tech support acts as a barrier that prevents a company from finding out about real problems, an issue I've seen not only with iiNet and Belkin but with endless other companies. Anyway, hopefully Belkin will be getting onto this now.


A year ago, I reported to iiNet that AAAA lookups in the BoB (F1PI243EGau) DNS forwarder were always timing out, rather than returning SERVFAIL or correctly forwarding the query to the upstream server. This is the cause of the slow browsing issues reported for the BoB.

A couple of months ago I got hold of a pre-release firmware that fixed this, and about a month ago the firmware was finally put up for public use. This fixes the AAAA issues, so browsers like Firefox and Safari on IPv6-capable operating systems like Mac OS X and Windows 7 don't take an eternity to resolve every DNS query.

Unfortunately, Belkin didn't take this as a hint to properly test their DNS forwarder. They fixed AAAA lookup, but didn't fix it to return SERVFAIL when it encountered something it didn't understand, and failed to test other record types like TXT and SRV.

Sure enough, TXT and SRV lookups have the same problem. This is currently causing problems with Google Talk (using Pidgin), which requires configuring a fallback connect server to bypass the SRV record lookup.
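
To make concrete what the forwarder ought to do with a query it can't handle, here's a minimal Python sketch of building a SERVFAIL reply. This is not Belkin's code: the function name servfail_reply is made up for illustration, and it assumes the query carries nothing after the question section (no EDNS records). The header layout and the SERVFAIL code (2) are as defined in RFC 1035.

```python
import struct

RCODE_SERVFAIL = 2  # RFC 1035 response code for "server failure"


def servfail_reply(query: bytes) -> bytes:
    """Build a minimal SERVFAIL reply to a raw DNS query packet.

    Hypothetical helper for illustration only; assumes the packet holds
    just a header and question section (no EDNS/additional records).
    """
    if len(query) < 12:
        raise ValueError("a DNS header is 12 bytes; packet too short")
    query_id, flags, qdcount = struct.unpack("!HHH", query[:6])
    # Keep the opcode (0x7800) and RD bit (0x0100) from the query, then
    # set QR (0x8000, "this is a response"), RA (0x0080) and the rcode.
    reply_flags = 0x8000 | (flags & 0x7900) | 0x0080 | RCODE_SERVFAIL
    # Same ID and question count; zero answer/authority/additional records.
    header = struct.pack("!HHHHHH", query_id, reply_flags, qdcount, 0, 0, 0)
    # Echo the question section so the client can match the reply.
    return header + query[12:]
```

A client that gets this reply fails the lookup immediately and can fall back to something else, instead of sitting through repeated timeouts.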

Belkin support do not understand the problem. The low-level support folks at iiNet don't seem to get it either. Neither is passing the problem on to somebody with the experience and knowledge to understand it, and neither seems to have access to suitable hardware - or the inclination to use it if they do - to verify the issue.

Here's the explanation I sent to them.


The BoB (Belkin F1PI243EGau) doesn't handle DNS queries for TXT and SRV records correctly. Instead of forwarding these queries correctly or even returning SERVFAIL, it fails to respond, causing a client-side timeout.

Before the latest firmware update it had the same problem with AAAA (IPv6 DNS) lookups.

This issue causes problems with Google Talk (why I noticed the problem), Active Directory, SPF mail lookups, etc. It's particularly obvious when using Pidgin for Google Talk, as it relies on correct SRV record handling, where the official Google Talk client falls back to "brain dead router" mode hard-coded defaults if SRV lookups fail.

I won't write a whole document on what TXT and SRV records are and why they're important here; that's what Google is for. What I will do is provide you with instructions on how to demonstrate that the BoB's DNS resolver doesn't handle them correctly while other DNS resolvers do.

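I'm going to assume you're on Windows. If you're on Mac or Linux, replace "nslookup -q=txt" with "dig +short -t txt" (and "nslookup -q=srv" with "dig +short -t srv"), and put an "@" in front of the server IP, e.g. "@10.1.1.1"; otherwise the commands are the same.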

In a command prompt window (Start > Run > cmd.exe) on a machine that has access to a BoB (F1PI243EGau) with IP 10.1.1.1, run:

nslookup -q=srv talk.google.com 10.1.1.1
nslookup -q=txt gmail.com 10.1.1.1

Both are queries for valid, existing records. You will find that those requests both time out:

C:\Users\Craig>nslookup -q=txt gmail.com
Server: bob.iad
Address: 10.1.1.1
DNS request timed out.
timeout was 2 seconds.
DNS request timed out.
timeout was 2 seconds.
*** Request to bob.iad timed-out

C:\Users\Craig>nslookup -q=srv talk.google.com
Server: bob.iad
Address: 10.1.1.1
DNS request timed out.
timeout was 2 seconds.
DNS request timed out.
timeout was 2 seconds.
*** Request to bob.iad timed-out

Now run the same commands again, but this time use a real DNS server directly, bypassing the BoB. Because I have no way of determining what your local DNS server IPs are, I'll give you an example that uses Google's public DNS directly, but you'll find it works with any DNS server other than the BoB:

nslookup -q=srv talk.google.com 8.8.8.8
nslookup -q=txt gmail.com 8.8.8.8

(8.8.8.8 and 8.8.4.4 are Google's public DNS servers.)

Note that they succeed, because we're bypassing the BoB's broken forwarding DNS resolver:

C:\Users\Craig>nslookup -q=txt gmail.com 8.8.8.8
Server: google-public-dns-a.google.com
Address: 8.8.8.8
Non-authoritative answer:
gmail.com text =
"v=spf1 redirect=_spf.google.com"

C:\Users\Craig>nslookup -q=srv talk.google.com 8.8.8.8
Server:  google-public-dns-a.google.com
Address:  8.8.8.8
Non-authoritative answer:
talk.google.com canonical name = talk.l.google.com
l.google.com
        primary name server = ns3.google.com
        responsible mail addr = dns-admin.google.com
        serial  = 1429061
        refresh = 900 (15 mins)
        retry   = 900 (15 mins)
        expire  = 1800 (30 mins)
        default TTL = 60 (1 min)

As you can see, the BoB's forwarding DNS resolver isn't correctly handling TXT and SRV record lookups. Note that this persists even if I override the ISP-configured DNS servers and use Google's instead, so it's not a problem with my ISP DNS. In any case, queries directly to my ISP DNS work fine. TXT and SRV queries fail from my Windows 7 desktop, my Linux laptop, and a spare Mac OS X machine I borrowed. This is not a problem specific to one computer or operating system; it's a firmware issue.
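
If you'd rather script the comparison than type the nslookup commands by hand, something like the following Python sketch should do it. It uses only the standard library. The function names (build_query, probe, compare) are mine, 10.1.1.1 is assumed to be the BoB as above, and 8.8.8.8 stands in for any working resolver. It only reports whether a reply arrived and what its response code was; it doesn't parse the answer records.

```python
import socket
import struct

QTYPES = {"A": 1, "TXT": 16, "AAAA": 28, "SRV": 33}
RCODES = {0: "NOERROR", 2: "SERVFAIL", 3: "NXDOMAIN"}


def build_query(name: str, qtype: str, query_id: int = 0x1234) -> bytes:
    """Encode a single-question DNS query with the RD (recursion) bit set."""
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    labels = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    )
    # Name is terminated by a zero-length label, then QTYPE and QCLASS=IN.
    return header + labels + b"\x00" + struct.pack("!HH", QTYPES[qtype], 1)


def probe(server: str, name: str, qtype: str, timeout: float = 2.0) -> str:
    """Send one UDP query; return the rcode name, or TIMEOUT if no reply."""
    query = build_query(name, qtype)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(query, (server, 53))
            reply, _ = sock.recvfrom(4096)
        except socket.timeout:
            return "TIMEOUT"
        except OSError as exc:  # e.g. network unreachable
            return f"ERROR ({exc})"
    if len(reply) < 12 or reply[:2] != query[:2]:
        return "MALFORMED"
    rcode = struct.unpack("!H", reply[2:4])[0] & 0x000F
    return RCODES.get(rcode, f"RCODE {rcode}")


def compare(servers=("10.1.1.1", "8.8.8.8")) -> None:
    """Run the two queries from this post against each server."""
    for server in servers:
        for name, qtype in (("talk.google.com", "SRV"), ("gmail.com", "TXT")):
            print(f"{server:<10} {qtype:<4} {name:<16} {probe(server, name, qtype)}")
```

Calling compare() prints one line per server and query. A resolver that answers at all, even with SERVFAIL, shows a response code; one that stays silent shows TIMEOUT, which is the behaviour described above for the BoB.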


For bonus points, here are the brilliant replies from the Belkin support staff:

Dear Belkin User,

Thank you for contacting Belkin Technical Support.

We understand your concern and We will be happy to assist you with your queries.

In order to isolate the issue we suggest you to update the firmware on the router.

Please find the link below where you can update the firmware for the router F1PI243EGau. http://en-au-support.belkin.com/app/answers/detail/a_id/2703/session/L3NpZC8xUHpIcm9iaw%3D%3D/kw/f1p1241egau/r_id/166/sno/0

(Editorial note: this link refers to the firmware for the wrong router, the F1PI241EGau, so clearly even Belkin can't keep track of their model number scheme)

Please do feel free to write back if you have any further queries,we will be more than happy to assist you.

When I replied to ask if they really meant that firmware, and pointed out that my original report showed that I was already running the latest firmware for my router, the F1PI243EGau, they replied with:

Dear Belkin User,

Thank you for contacting Belkin Technical Support.

We understand your concern and We will be happy to assist you with your queries.

(Look like another template answer to you? Me too.)

We would appreciate if you please get back to us with the following information to isolate the issue:

  1. Are you trying to do a NAT loop back?
  2. Are you trying to use DDNS?

Please do feel free to write back, and we will be more than happy to assist you. Please take a moment to review our Knowledge Base at http://belkin.com/support or let us know.

After this brilliant utterance, they went silent and stopped responding to further requests. Belkin's support for their product as demonstrated here is an abject, total failure.
