Strange mDNS behavior

On my machine, I often get into this state when connected to the robot via wireless:

  1. I can ping roborio-488-FRC.local
  2. I can ping roborio-488-FRC.lan

However, the FRC Driver Station has no comms. When I go to the diagnostics tab, the “Robot Radio” is green, but all other lights are not lit.

Pressing the arrow next to our team number in the Setup tab gives “None Found”, even when shift-clicking.

The nimDNSResponder service is running.

If I plug in directly using USB, I have a connection. Strangely enough, if I then pull out the USB cable I get “disconnected” for about 5-10 seconds, and then it connects (via wireless). This is a decent enough workaround, but still troubling.

Any suggestions? How can I debug this further?

I have had similar issues with mDNS. I have found that giving the roboRIO a static IP address of 10.te.am.2 seems to fix most issues when mDNS doesn’t work. You can configure the roboRIO to use a static IP address through the roboRIO web dashboard: http://wpilib.screenstepslive.com/s/4485/m/24166/l/262266?data-resolve-url=true&data-manual-id=24166#NetworkConfiguration
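
For anyone unsure how 10.te.am.2 maps to a team number, here is a minimal sketch. The function name `team_ip` is made up for illustration, and it assumes a team number of at most four digits, as in this thread (488 gives 10.4.88.2, 3928 gives 10.39.28.2).

```python
def team_ip(team: int, host: int = 2) -> str:
    """Return the 10.TE.AM.x address for a team number (hypothetical helper).

    Assumes a team number between 1 and 9999; the last octet defaults
    to 2, the roboRIO address mentioned in this thread.
    """
    if not 1 <= team <= 9999:
        raise ValueError("team number must be between 1 and 9999")
    if not 0 <= host <= 255:
        raise ValueError("host octet must be between 0 and 255")
    # TE is the team number without its last two digits, AM is the last two.
    return f"10.{team // 100}.{team % 100}.{host}"


if __name__ == "__main__":
    for team in (488, 1519, 3928):
        print(team, team_ip(team))
```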

We have also had this issue this year, and we are having issues dropping code while testing. Last night we tried to tether the bot, but still had the issue.

Will try the static ip trick on Saturday.

The m in mDNS stands for multi-cast, and from logs, there are times when the Windows multi-cast service seems to be using cached results and the DS is not able to resolve – until something causes the multi-cast stuff to rebuild its tables, and then everything works again.

I have not seen this at all this year on our test system or on the robots, but I saw it occasionally last year. Other mDNS implementation gotchas were fixed, but this one didn’t have an obvious workaround that we wanted to incorporate into the DS. Bringing an interface up or down seems like the sort of thing that might poke the multi-cast to rebuild. Disconnecting and reconnecting an ethernet cable would often do it when that was the interface to the robot.

I understand that static IPs are a solution of sorts, but they introduce other issues, especially when that robot shows up at an event. The robot won’t connect to the field, so the CSA or FTA discovers and changes the roboRIO and DS laptop settings. But that may break the programming laptop connectivity, so the team may change it back again, …

So if you go the static route, please report the issues that caused you to go static. Don’t go static just because the cool kids are doing it. And please set your stuff back to use DHCP when you go to an event.

Thanks from a guy wearing an orange hat.
Greg McKaskle

Greg,

Thank you for that description of the mDNS caching issue. You mentioned that you could see this from the logs. Can you share where those logs are and what to look for? We have seen connection issues, and a solution was to disconnect and reconnect the Ethernet cable on the robot, which is what you mentioned. Next time this happens I will try to investigate what is happening in more detail. Are there any Windoze commands to expose the multi-cast services information?

-Hugh

The logs I was referring to were PCAP trace logs taken using Wireshark or tcpdump. They are not easy to pore through, and thus it is not easy to draw definitive conclusions from them.

I looked briefly for command line tools that we might use to diagnose or work around the issue. I didn’t find anything. I’d love to hear from others if they know of something.

Greg McKaskle
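
In the absence of a ready-made command line tool, one thing that might help is a small script that sends a one-shot mDNS question and lists which hosts answer. This is only a sketch under assumptions: the names `build_query` and `query` are made up here, it asks for an A record only, it does not decode the answers, and on a laptop with several active interfaces the operating system decides which interface the multicast packet leaves on. The multicast group 224.0.0.251 and UDP port 5353 are the standard mDNS values.

```python
import socket
import struct
from typing import List, Tuple

MDNS_GROUP = "224.0.0.251"  # standard mDNS IPv4 multicast group
MDNS_PORT = 5353            # standard mDNS UDP port


def build_query(name: str) -> bytes:
    """Build a DNS question packet asking for the A record of `name`."""
    # Header: id=0, flags=0, one question, no answer/authority/additional.
    header = struct.pack("!6H", 0, 0, 1, 0, 0, 0)
    qname = b""
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        if not 1 <= len(raw) <= 63:
            raise ValueError("each label must be 1 to 63 bytes long")
        qname += bytes([len(raw)]) + raw
    qname += b"\x00"  # root label terminates the name
    # QTYPE=1 (A record), QCLASS=1 (IN).
    return header + qname + struct.pack("!2H", 1, 1)


def query(name: str, timeout: float = 2.0) -> List[Tuple[str, int]]:
    """Send one mDNS question; return (responder address, reply size) pairs."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.settimeout(timeout)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, 255)
    replies: List[Tuple[str, int]] = []
    try:
        sock.sendto(build_query(name), (MDNS_GROUP, MDNS_PORT))
        while True:
            data, (addr, _port) = sock.recvfrom(9000)
            replies.append((addr, len(data)))
    except socket.timeout:
        pass  # no further replies arrived within the timeout
    except OSError as exc:
        print("could not send or receive:", exc)
    finally:
        sock.close()
    return replies
```

Calling `query("roboRIO-488-FRC.local")` (substitute your own team number) while the Driver Station has no comms would show whether the roboRIO still answers on the wire. If it does, the problem is more likely the cached results on the laptop than the robot.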

We are constantly having issues this year.

LabVIEW: almost every time we make a change and deploy, we lose connection the first time and have to send the code a second time to enable.

Many times the Driver Station does not detect comms. We are actually running code from the same laptop, and the Driver Station on that laptop does not see the roboRIO.

I have also had the webpage up and working and the driver station not have comms.

Here is a list of issues that my team was having that went away when we gave the robot a static ip address.

  • Driver Station sometimes does not connect to the robot after a Java code deploy (Communications and Robot Code status are both red in the Driver Station). Rebooting the roboRIO temporarily resolves this.
  • A Java deploy from Eclipse on Windows sometimes cannot find the robot at roboRIO-3928-FRC.local. Again, rebooting the roboRIO temporarily resolves this.
  • Riolog in Eclipse is empty, it does not show any printouts or messages from the roboRIO.

My team went to multiple competitions last year with a static IP and had no issues. There are multiple people on my team who know how to change the roboRIO’s network settings if problems do arise.

Before 2015 the cRIO had a static IP address of 10.te.am.2 by default. Does anyone know why this was changed? I cannot think of any case where two robots with the same team number would be on the same network.

Since you mention the RIOLog issue, because of the new radio, one thing you may want to try is to move the ethernet connection to the roboRIO to the other radio port. I believe you want it connected to the one next to the power connector.

While it is totally fine to list issues on CD, that isn’t the same as a bug report. With the new website, I’m not certain where the official reporting location is, but perhaps the ScreenStepsLive bugs would be appropriate.

As for static IP addresses, they were never ideal, and the old system used them because it was too difficult to get DHCP and all of the other modern services up and going in all configurations. The benefit of static IPs is that it is a manually controlled, rigid, simple system. The downside is that things are manual and must adhere to the system at all times. Taking a programming laptop online to submit code or look up a manual, then back to the robot, was error prone and tedious. DNS and mDNS let the computer and devices do these tasks for you, but they aren’t bulletproof. Static IPs aren’t bulletproof either. The roboRIO and DS fall back to static IPs, mostly as a safety net, so they should continue to work, but they are not the recommended setup, and for example, I don’t know that they get much testing, beta or otherwise.

Feel free to do whatever meets your needs at your shop, but when you come to an event, it is best to follow the current FRC guidelines for configuring your radio, computers, firmware versions, etc.

Greg McKaskle

On the Radio Configuration Utility, it says the Ethernet cable should be plugged in to the port farthest away from the power cable.

I agree that the photos on ScreenSteps show that you are to use the port away from power for configuring the radio. I didn’t find anything specifying which port should be used to connect to the roboRIO. Can you point it out?

My previous post was conveying what I was told was the recommended configuration on the robot. Honestly, I haven’t seen this issue, so anything I tell you is second hand. Since there are only two ports, as soon as I see it, I will try both and have a stronger recommendation.

Greg McKaskle

See Team Update 8, the bolded Control System Note at the top.

Still having the issue. Spent 1 hour of out-of-bag time today trying to get it to communicate with the Driver Station: reflashed firmware on router, reconfigured router, rebooted computer after router was up, hard reset roboRIO, gave up on wan, plugged in hard line to router, still no comms…

tried 3 laptops with driver station, new release.

put the old router on bot, everything works fine.

Not looking forward to playing week 1 with this router…

Are you sure your Windows firewall is completely disabled?

I think I fixed a Driver Station communication weirdness at our shop today. We’ve been having problems establishing communication with the robot right after turning everything on. Enough messing around with the network settings gets it working.

The firewall was off, but only for “private” networks. The laptop apparently makes a distinction between the two different routers even with the same SSID, and it seems to have decided that the new router has established a “public” network. Maybe turning off the other half of the firewall fixed it…or maybe doing that reached the necessary amount of messing around with the network settings, and it’ll be broken again next time we try.

We have the exact same issue as you this year. Last year as well as this year, we have run using a static IP on the roboRIO. We had no trouble last year, but for some reason this year we are having trouble deploying code every now and then. Ultimately we found that pinging team-xxxx-FRC.local would take a while to connect, while pinging 10.xx.xx.2 would connect instantly. We attempted to switch our deployment target to our static IP, and it fixed our problems with deploying, but if you choose this route, you cannot tether directly via USB because it will assign the roboRIO a different IP.

Another note, anytime we have this deployment issue, we always have a red or blank comm light. For now we have been resetting the roboRIO to get a green comm light. This is just our fix for now, and it will work for the time being. We are going to work more on it this week and will keep you posted. I’m guessing your team is having a similar problem to us, and I hope I helped to identify the problem more.
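
The name-versus-address comparison described above can be repeated with a short script that times the lookup step on its own. This is a rough sketch: `time_lookup` and `compare` are made-up names, and it assumes the operating system's resolver handles .local names (for example through the NI mDNS responder mentioned earlier in the thread). It measures name resolution only, not whether the robot is reachable.

```python
import socket
import time
from typing import List, Tuple


def time_lookup(name: str, port: int = 80) -> Tuple[float, List[str]]:
    """Return (seconds spent resolving, sorted addresses found) for `name`."""
    start = time.perf_counter()
    try:
        infos = socket.getaddrinfo(name, port, proto=socket.IPPROTO_TCP)
        addrs = sorted({info[4][0] for info in infos})
    except socket.gaierror:
        addrs = []  # the name could not be resolved at all
    return time.perf_counter() - start, addrs


def compare(targets: List[str]) -> None:
    """Print how long each target takes to resolve and what it resolves to."""
    for target in targets:
        elapsed, addrs = time_lookup(target)
        result = ", ".join(addrs) if addrs else "no result"
        print(f"{target}: {elapsed * 1000:.1f} ms -> {result}")
```

For example, `compare(["roboRIO-488-FRC.local", "10.4.88.2"])`, with your own team number substituted. A numeric address needs no lookup, so a large gap between the two lines points at name resolution rather than at the link to the robot.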

Are there problems with using the correct static IP for the roboRIO (10.te.am.2) or are the problems only when using a non-standard static IP?

We used mDNS dynamic IPs for all of the beta testing season and it worked very well as long as we used that computer only as a driver station, where the only active network interface was connected to an SSID “1519” Cisco router, with the OM5P-AN in bridge mode.

However, if we changed to our preferred situation of having two active network interfaces: (1) a physical ethernet cable to an SSID “1519” Cisco router, and (2) a WiFi DHCP connection to a different WiFi network for Internet access, we would get very unpredictable results.

Switching to a static IP address (the correct 10.te.am.2 address) helps a lot with the issues we’ve been seeing.

For what it’s worth, we competed at the “Week Zero” scrimmage event in NH with the real FIRST field using the correct (10.te.am.2) static IP address and encountered zero problems.

Call me paranoid, but I remain a big believer in using static IP addresses for the roboRIO, the OM5P-AN bridge, any other IP devices on the robot (we use 1 or 2 Axis 206 cameras), the driver station laptop, and (when at our build area) the router which serves the “TEAM” SSID network (e.g. “1519”).