ping and tracert networking question

*Need some help from network gurus.

My internet access at home is via WISP.

When it is working properly, I can ping and tracert the tower with very little latency.

Occasionally it will stop working, and I’ve discovered that when that happens I can ping the tower but tracert does not work (or takes an inordinately long time).

What would be some possible explanations that would fit that set of facts?

This happens on my home network as well. The situation isn’t exactly the same, since I don’t have a WISP. Instead, I have sort of a LAN within a LAN that I use for testing purposes (I work as a researcher for a distributed systems and security lab). For me, the problem is old wifi hardware that is probably nearing the end of its lifespan, which I am also too lazy to fix until it dies completely :smiley:

Can you give more information on what hardware you are using? That could be the problem. Also, do you know anyone else who uses the same WISP who may be having similar issues? If that is the case, it’s probably an issue on your provider’s end, in which case you should contact them and see if they can isolate the issue.

*Ubiquiti AirOS PowerStation2

Hmm, unless you bought it used, I doubt that’s where the problem is. When it is not working properly, are you still able to navigate to a web page (i.e., make HTTP requests) successfully?

No. For purposes of this discussion, not being able to access web pages is what I mean by “not working properly”.

Just checking :slight_smile:

How frequently does this occur? Multiple times per day? Per week? Also, I assume you’ve taken a look at your PowerStation during these times to make sure nothing was obviously out of place?

How many hops in your traceroute when it is working?
Are you going to a DNS entry or an IP?

Couple of times a week. Random.

Actually, it has a Murphy sensor. It happens at the worst possible times. Like when I’m about to submit my tax return, or in the middle of a purchase or a bank transaction.

> Also, I assume you’ve taken a look at your PowerStation during these times to make sure nothing was obviously out of place?

It’s on the roof (2 stories). At night I can sort of see the signal strength LEDs with binoculars.

Even a tracert to an IP 3 hops away doesn’t work (or takes forever).

I’m trying to figure out whether you are pinging your local gateway, which will respond even without the tower, but tracerouting to a website, which times out over ICMP because the tower is not connected (you can traceroute over UDP as well). That is why I asked how many hops you see when it is working: I assumed you were tracerouting to a fixed place in the network before the Internet.

It could be DNS resolution. Most implementations of tracert will attempt to resolve each IP along the way, and this can take a long time with a flaky DNS server.

A couple suggestions:

a) Try running tracert with the -d flag (Windows) or -n (Linux).

b) Try using an open DNS service, like OpenDNS (208.67.222.222/208.67.220.220) or Google Public DNS (8.8.8.8/8.8.4.4).
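If you want to script suggestion (a), here is a minimal Python sketch. The function names `build_trace_command` and `run_trace` are just made up for illustration; the only real things it relies on are the `-d` flag of Windows `tracert` and the `-n` flag of `traceroute` on Linux.

```python
import platform
import subprocess


def build_trace_command(target, system=None):
    """Return a traceroute command that skips per-hop name lookups.

    `system` defaults to the current OS name as reported by platform.system().
    """
    system = system or platform.system()
    if system == "Windows":
        # -d: do not resolve hop addresses to host names
        return ["tracert", "-d", target]
    # Assumption: anything else has a traceroute that accepts -n (numeric output)
    return ["traceroute", "-n", target]


def run_trace(target, timeout=120):
    """Run the trace and return its text output (may raise on timeout)."""
    result = subprocess.run(
        build_trace_command(target),
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return result.stdout
```

Calling `print(run_trace("8.8.8.8"))` during an outage and again when things are fine would give you two outputs to compare.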

I thought I answered that:

Maybe I’m not understanding what you’re asking.

OK, this is where I have some confusion. I thought DNS was to resolve a name into an IP address. Why is DNS required if I am already providing the IP address?

A couple suggestions:

a) Try running tracert with the -d flag.

b) Try using an open DNS service, like OpenDNS (208.67.222.222/208.67.220.220) or Google Public DNS (8.8.8.8/8.8.4.4).

Thanks. I will try that next time it misbehaves.

By default, tracert will still use DNS to try to look up a name for each hop so it can display it as part of the output. Use “-d” (Windows) or “-n” (Linux) to tell it to skip that part.
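To see how much time those per-hop name lookups can cost, you could time a reverse lookup yourself. This is only a rough sketch: `timed_reverse_lookup` is a name I made up, and it uses Python's standard `socket.gethostbyaddr`, which goes through whatever DNS servers your machine is configured with.

```python
import socket
import time


def timed_reverse_lookup(ip, resolver=socket.gethostbyaddr):
    """Return (host name or None, seconds spent) for a reverse lookup of `ip`.

    `resolver` can be swapped out for testing; by default it is the
    system resolver, so a flaky DNS server shows up as a long elapsed time.
    """
    start = time.monotonic()
    try:
        name = resolver(ip)[0]  # first element is the primary host name
    except OSError:
        # socket.herror / socket.gaierror are OSError subclasses: no name found
        name = None
    return name, time.monotonic() - start
```

If `timed_reverse_lookup("8.8.8.8")` takes several seconds while a plain ping to the same address comes back quickly, that points at name resolution rather than the link itself.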

I went ahead and changed my DNS as you suggested. I’ll know in about a week if that fixes the problem.

The IP 3 hops away is probably at the network level near that ISP’s perimeter.
So if the DNS doesn’t turn out to be the issue you probably have lost your connection to them over that link.

If you have the information to access the equipment on your side you can probably query the status of the connectivity to the tower from the web interface. I think that unit supports SNMP as well so you could query the status of the link from that. They appear to have an MIB for that purpose available. Then you could work it out such that you can monitor the signal level and data transmission rates.

Just out of curiosity, what addresses are you pinging and what addresses are you tracerting to?

This question leads to an iceberg answer, in that there’s a lot to say. Here’s the tip, so to speak:

> I can ping the tower but tracert does not work (or takes an inordinately long time).

The short answer is that you are most likely not pinging the tower, but something more local. The DNS servers are not likely to be local either, so DNS name resolution isn’t likely to work either when you are having connectivity issues.

The analogy here is that to reach an endpoint, you have to traverse a network of roads. The endpoint is specified by IP address, possibly after DNS resolves a textual name to that address, and that lookup is an independent network operation that itself requires a healthy connection. You can think of each intersection on this network of roads as having its own IP address. To get almost anywhere, the first few intersections are going to be the same: they take you from your home, out of your neighborhood, and onto a major road. Once you’re on this road, chances are you are going to get where you are going, unless there is a problem toward the other end of the journey, when you are getting back onto local roads in the destination neighborhood. Or there may be no one home at the address on the other end (the site or server is out but you can get to everything else).

There is a span of road that is sometimes out in your case, most likely the over-the-air link. If this happens, you can reach intersections or even end addresses that are local and do not require traversing the span where there’s an outage, but can’t get further.

Take a tracert when things are working, and let the IP addresses be resolved back into textual names (don’t turn off DNS resolution when running tracert on a working connection). These names may give you clues. Keep the list of names and IP addresses to some well-known site handy for comparison when things are out.

You will likely find that when you run a tracert that stops at some point, you can ping the addresses that it could reach, you just can’t ping ones that are further away on the network. The first place you can reach when things are working but cannot when they are out is the far end of the link that is giving you problems. If you want, you can post details here and we can comment further.
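A rough Python sketch of that comparison, in case it helps. Everything named here (`ping_once`, `first_unreachable`) is hypothetical, and the ping flags assume Windows `ping` or Linux iputils `ping`; other systems use different options. The hop list is whatever you saved from a tracert on a good day.

```python
import platform
import subprocess


def ping_once(ip, timeout_s=2):
    """Send one ping and report whether the ping command exited successfully.

    Caveat: on Windows the exit code can be 0 even when the reply is
    "Destination host unreachable", so treat this as a rough check only.
    """
    if platform.system() == "Windows":
        cmd = ["ping", "-n", "1", "-w", str(timeout_s * 1000), ip]
    else:
        # Assumption: Linux iputils ping, where -W is a timeout in seconds
        cmd = ["ping", "-c", "1", "-W", str(timeout_s), ip]
    done = subprocess.run(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return done.returncode == 0


def first_unreachable(baseline_hops, is_reachable=ping_once):
    """Walk the saved hop list in order; return (hop number, ip) of the first
    hop that does not answer, or None if every hop answers."""
    for index, ip in enumerate(baseline_hops, start=1):
        if not is_reachable(ip):
            return index, ip
    return None
```

Keep in mind that some routers forward traffic but ignore pings, so a hop that never answered even on a good day should be left out of the saved list.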

Hope this helps!

The tower is local. The AirOS PowerStation2 radio/antenna on my roof (mentioned in an earlier post) is aimed directly at the top of the tower, 1.2 miles away, line-of-sight.

3 hops: PC -> router -> radio -> tower.

BTW, I have ruled out the router. If I bypass the router and connect the PC directly to the radio when the problem occurs, it does not fix the problem. I do not have access to the radio, other than to cold-boot it by cycling the power (PoE), which I have tried, to no avail.

When the problem occurs, I can successfully ping the tower, but tracert to it times out (or takes an inordinately long time to complete).

This setup had been working almost flawlessly for 3 years. Problems started occurring occasionally about a year ago, and have become more frequent and longer in duration.

I wouldn’t classify a 1.2 mile over the air hop as local, particularly the over the air part – but see below.

This is the part where I’d be inclined to measure twice. When you supply an IP address to ping, how do you know it is the address of something at the tower? It is possible the radio has its own address (cable modems do, for instance). If you have a list of IP addresses from a tracert taken when things are working, you can try pinging each of these addresses and should see very similar results as when running tracert, in either the case where things are working or not. In essence, tracert is automatically sending pings out in succession, one hop further each time.
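To make the "one hop further each time" idea concrete: tracert limits how far each probe can travel by setting its TTL (time to live) to 1, then 2, and so on, and whichever router drops the expired probe reports back. The sketch below only builds the equivalent one-off ping commands. The function names are made up, and the flags assume Windows `ping` (`-i` sets the TTL) or Linux iputils `ping` (`-t` sets the TTL); other systems differ.

```python
import platform


def ttl_ping_command(target, ttl, system=None):
    """Build a single-packet ping command whose TTL is limited to `ttl` hops."""
    system = system or platform.system()
    if system == "Windows":
        return ["ping", "-n", "1", "-i", str(ttl), target]
    # Assumption: Linux iputils ping, where -t sets the IP TTL
    return ["ping", "-c", "1", "-t", str(ttl), target]


def probe_sequence(target, max_hops=5, system=None):
    """Return the list of commands a traceroute-like tool would issue,
    one per hop, with the TTL growing by one each time."""
    return [ttl_ping_command(target, ttl, system) for ttl in range(1, max_hops + 1)]
```

Running the commands from `probe_sequence` by hand, one at a time, shows which hop is the last to report back.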

Another thing you might try is opening a browser and typing the IP address you have for the tower and see if you can get to an admin interface. You should certainly be able to do this with the IP address of the router, to see how this works. Apologies if you’ve already tried this – it is hard to find the right level of detail without being insulting and/or writing a whole lot here.

Do you have a list of names that you obtained when things were working (traceroute output but not using the option to suppress DNS translation)? It might be interesting to match this against the list of hops you supplied above, and knowing just where things stop when they are not working would be very helpful. It could be further into the network on the other side of the radio link, but I’d suspect this link until I knew for sure it was good.

Are things worse when it is raining, at certain times of day, or any other pattern you’ve noticed?

Sorry for so many questions and so little actual help!

BTW, if you can get to the admin interface for the radio, you should check to be sure your firmware is current. Same goes for the router, but this isn’t likely to be as important in this case.