ROMI: Random Disconnect Issues

Hi Everyone,

For the Fall off-season this year, to get the students prepped for the season, Team 2876 developed a “Mini Robot Challenge” using the ROMI First robot kits.

Some teams have been experiencing seemingly random hangs, latency, and disconnects while connected to and simulating the ROMI. Sometimes the ROMIs run for 1-2 hours straight without any issues; other times, they exhibit the issue within 5 minutes.

Here’s the basic setup:

  • ROMI First kit from Pololu
  • Raspberry Pi 4
  • WPILibPi 2023.2.1 Release
  • Fresh NiMH rechargeable batteries (reported voltage > 7 V)

When the issue happens, the ROMI is unresponsive to commands and the gyro/accelerometer readings in the SimGUI freeze:
[image: SimGUI with frozen gyro/accelerometer readings]

At times, the ROMI recovers on its own, but most times, the ROMI requires a full power cycle before it can be controlled again.

  • Happens on multiple laptops
  • Happens on multiple ROMIs
  • The RPi4 appears to still be running. The WebGUI is responsive and still reports the current voltage.
  • The Romi 32U4 board FW appears to still be running. There is still a PWM signal on the EXT3/EXT4 pins (verified with a digital oscilloscope).

To me, it seems as if the I2C communication between the RPi4 and the Romi 32U4 is stuck.

I did manage to capture the console log via the WebGUI. The issue coincided with the “[ROMI] Lost all connections, resetting state” message. In this case, the ROMI managed to recover on its own:

2023-02-19T22:38:37.952Z [ROMI] warn: DS Packet Heartbeat Lost
2023-02-19T22:38:38.042Z [ROMI] info: DS Packet Heartbeat Acquired
Socket closed (1006): 
[ROMI] Lost all connections, resetting state
2023-02-19T22:41:19.770Z [SVC-DS] info: Advertised NT Server address updated to: 
2023-02-19T22:41:19.776Z [NTCORE-CLIENT-V3] info: NT Connection lost
2023-02-19T22:41:20.061Z [ROMI] warn: DS Packet Heartbeat Lost
Successfully connected
2023-02-19T22:43:59.819Z [ROMI] info: New WS Connection from 10.0.0.105
2023-02-19T22:43:59.822Z [SVC-DS] info: Advertised NT Server address updated to: 10.0.0.105
2023-02-19T22:43:59.875Z [ROMI] info: DS Packet Heartbeat Acquired
2023-02-19T22:43:59.876Z [ROMI] info: Robot DISABLED
2023-02-19T22:45:01.347Z [ROMI] info: Robot ENABLED

Does anyone know why/when the “DS Packet Heartbeat Lost” message is printed? Not sure if that’s a clue.
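For anyone who wants to dig through a log like this, one way to spot the dropouts is to look for long silent stretches between timestamped entries. Below is a minimal Python sketch, assuming the WebGUI console output has been saved as plain text. The function names and the 5-second threshold are illustrative choices, not part of WPILibPi. Run against the excerpt above, it flags two silent stretches of roughly 160 seconds each.

```python
from datetime import datetime

# Excerpt copied from the console log above.
LOG = """\
2023-02-19T22:38:37.952Z [ROMI] warn: DS Packet Heartbeat Lost
2023-02-19T22:38:38.042Z [ROMI] info: DS Packet Heartbeat Acquired
Socket closed (1006):
[ROMI] Lost all connections, resetting state
2023-02-19T22:41:19.770Z [SVC-DS] info: Advertised NT Server address updated to:
2023-02-19T22:41:19.776Z [NTCORE-CLIENT-V3] info: NT Connection lost
2023-02-19T22:41:20.061Z [ROMI] warn: DS Packet Heartbeat Lost
Successfully connected
2023-02-19T22:43:59.819Z [ROMI] info: New WS Connection from 10.0.0.105
"""


def parse_timestamp(line):
    """Return the leading UTC timestamp of a log line, or None if it has none."""
    token = line.split(" ", 1)[0]
    try:
        return datetime.strptime(token, "%Y-%m-%dT%H:%M:%S.%fZ")
    except ValueError:
        return None


def find_gaps(log_text, threshold_s=5.0):
    """List (seconds, line_before, line_after) for silences longer than threshold_s."""
    gaps = []
    previous = None  # (timestamp, line) of the last timestamped entry seen
    for line in log_text.splitlines():
        stamp = parse_timestamp(line)
        if stamp is None:
            continue  # untimestamped lines such as "Successfully connected"
        if previous is not None:
            delta = (stamp - previous[0]).total_seconds()
            if delta > threshold_s:
                gaps.append((delta, previous[1], line))
        previous = (stamp, line)
    return gaps


for seconds, before, after in find_gaps(LOG):
    print(f"{seconds:8.3f} s silent between:\n  {before}\n  {after}")
```

Lining up the start of each gap with what was happening in the room (another team powering on a ROMI, a laptop going to sleep) may help narrow down the cause.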

Any thoughts on how to debug would be extremely helpful since our competition is this Monday Dec 18th!

Tnx!

Does this happen only when the student’s code is running? What about if you have a blank/example project running on the romi?

Not sure about all of your disconnect issues, but for the 1-2 hour runs, I would suspect that those end because the batteries are getting too low. I don’t recall any of my NiMH batteries ever lasting more than about an hour of total run time (from 2020/2021), and those were running with Pi 3s. The Pi 4s need more power than the 3s.

Another thing to check: since you are using 0.0 as your team number, I assume this is the same across the board for all of the other ROMIs. Make sure that NONE of the ROMIs’ WiFi APs are set to ‘auto connect’ on the Driver Stations. Otherwise, a laptop could literally be hopping from one ROMI to another, and given that they all have the same 0.0 team number, they would all appear to ‘magically’ reconnect.


Hi Tom,

The code base, Romi, or laptop doesn’t matter. We switched code bases, ROMIs, and Laptops…and the issue seems to happen on all of them eventually. Some more frequently than others.

Hi @tkchan,

The 1-2 hour run was not continuously running the motors. I had the ROMI/laptop connected for 2 hours, but only actuated the motors every 5-10 minutes or so to see if it remained connected. The system lasted until the batteries ran out.

We are using 10.0.0.2 for all of the ROMI IP addresses. I think there’s a lot of merit to the idea of multiple DS trying to connect to the same ROMI. E.g. when mentors test the ROMIs at their respective homes, no one can get them to fail. However, at school last night, it seemed to happen to multiple teams at roughly the same time!

We will try changing IP addresses (and/or the WIFI passwords) for the ROMIs at tonight’s meeting and report back. Thanks for the suggestion!

When we did ours, I kept all the IPs as our regular team’s IP, but I made the WiFi password different for each of them (each Pi’s AP is numbered, and the password is numbered to match). I can’t remember exactly, but I think what I did was take the ‘master’ Pi image, copy it to the uSD card, then open one of the config files (I think it was under /boot/, but I will need to check the docs) and just change the AP’s name and its password.

Good luck!

Hi @tkchan,

Our SSIDs are already different, but all have the same password. I agree with your suspicion that multiple DS have the wifi credentials stored for multiple ROMIs. I’m hoping that changing the password for each ROMI will resolve our immediate issues.

An update:

  • Changing the WIFI passwords did not help
  • Changing the WIFI channel did not help

However, one thing seems to have helped, although it is not conclusive yet:

  • Plugging the laptop into a charger seems to minimize/prevent the disconnects. Perhaps the issue is related to power management on the laptop/WiFi?

How many Romis are you running at once? Last year I helped run the programming portion of FIM’s rookie workshop, and we had issues with 10+ Romis in the same room. It is unlikely to be a battery issue unless they start beeping (they are programmed to do so under 5V, if I remember correctly).

We also found that connecting them to a router in bridge mode is definitely more reliable than connecting directly to the Romi.

I believe having multiple routers does improve the issue with more than 10 but the issue may still exist.

Yes, wifi doesn’t do very well with lots of access points crowded into the same space. The technical reason for this is that APs don’t negotiate with other APs to manage transmission timeslots, so they end up just stepping on each other, leading to high packet loss etc. Running a single AP with many devices in bridge mode doesn’t have this issue.
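To get a feel for why uncoordinated APs scale badly, here is a deliberately crude toy model in Python. It assumes every AP transmits independently during a fraction of the time slots on one shared channel, and that any overlap ruins the transmission. Real 802.11 uses carrier sensing and backoff, so this is not a simulation of actual WiFi, only an illustration of the trend. The function name and the 10% airtime figure are assumptions chosen for the example.

```python
def clear_slot_probability(num_aps, tx_probability):
    """Chance that one AP's transmission overlaps none of the other APs' transmissions.

    Toy model: each of the other (num_aps - 1) APs transmits in a given slot
    independently with probability tx_probability.
    """
    if num_aps < 1:
        raise ValueError("num_aps must be at least 1")
    if not 0.0 <= tx_probability <= 1.0:
        raise ValueError("tx_probability must be between 0 and 1")
    return (1.0 - tx_probability) ** (num_aps - 1)


# Assumption for illustration: every AP is on the air 10% of the time.
for n in (1, 3, 7, 10):
    print(f"{n:2d} APs: {clear_slot_probability(n, 0.1):.3f} chance of a clean slot")
```

In this model the clean-slot chance falls geometrically with every AP added, which is at least consistent with things being fine at home with one ROMI and bad in a room full of them.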

We have around 6-7 ROMIs active at the same time, all using AP mode. We also experience occasional laggy response between the DS and ROMI, which could very well be explained by the high packet loss that @Peter_Johnson described.

We will see if we have time to switch the teams/ROMIs over from AP to bridge mode and assign static IPs for each ROMI. It would involve changing each team’s code from using 10.0.0.2 to their specific IP address, but I think it’s doable.
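For the code change itself, a small script could save some hand editing across teams. The Python sketch below assumes each project still has the stock Romi template line `wpi.sim.envVar("HALSIMWS_HOST", "10.0.0.2")` in its build.gradle. The helper name and the 192.168.1.101 address are hypothetical, so substitute whatever static IPs actually get assigned.

```python
import re

# Matches the host argument of the HALSIMWS_HOST line from the Romi project template.
HOST_LINE = re.compile(r'(wpi\.sim\.envVar\(\s*"HALSIMWS_HOST"\s*,\s*")[^"]*("\s*\))')


def set_romi_host(gradle_text, new_ip):
    """Return gradle_text with the HALSIMWS_HOST value replaced by new_ip."""
    updated, count = HOST_LINE.subn(
        lambda match: match.group(1) + new_ip + match.group(2), gradle_text
    )
    if count == 0:
        raise ValueError("no HALSIMWS_HOST line found; is this a Romi project?")
    return updated


# Example with a hypothetical static IP for one team's ROMI.
sample = 'wpi.sim.envVar("HALSIMWS_HOST", "10.0.0.2")\n'
print(set_romi_host(sample, "192.168.1.101"), end="")
```

Raising an error when the line is missing avoids silently leaving a project pointed at 10.0.0.2.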

Thanks for all the feedback, everyone! The First community has always been super supportive and helpful!

Hi Everyone,

We were not able to switch the ROMIs to bridge mode. When we tried, the ROMI was no longer accessible. I checked the config files, and the /boot/wpa_supplicant_wpilibpi.conf file was empty. Seems like a possible bug?

I noticed that the new XRP supports multiple bridge networks with a fallback to AP mode. I’m guessing this should also be possible for the ROMIs, too, since it is also running Linux. @zeequeue any thoughts?

The XRP does not run Linux; it’s an RP2040 microcontroller, so it’s more like an Arduino. They’re very different platforms. I’m currently working on upgrading WPILibPi to the latest Bookworm Pi image, which means the networking stuff needs to change to use NetworkManager anyway.

Good to hear! I’m looking forward to trying the updated Romi image.

Hi All, I just wanted to follow up to say that we did finally figure out the ROMI disconnect issue. It was indeed caused by having too many ROMIs broadcasting as Access Points in a confined space. Once we switched all ROMIs to connect in bridge mode to our WIFI network, the lag, delay, and disconnect issues were gone.

For whatever reason, though, the ROMI WebGUI zeroes out the WIFI config file when changing to bridge mode, so I had to manually configure the settings. In the process, I figured out a way to add support for multiple client WIFI networks with automatic fallback to AP mode. I used WPILibPi 2023.2.1 for this.
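As a rough illustration of the multiple-network part only, here is a Python sketch that renders a standard wpa_supplicant client config with several networks ranked by priority. It is not the procedure from the attached PDF, it does not cover the AP fallback, and it assumes /boot/wpa_supplicant_wpilibpi.conf accepts the usual wpa_supplicant syntax. The SSIDs and passphrases in the usage example are placeholders.

```python
def network_block(ssid, psk, priority):
    """Render one wpa_supplicant network stanza; higher priority is preferred."""
    for value in (ssid, psk):
        if '"' in value or "\n" in value:
            raise ValueError("quotes and newlines are not supported in this sketch")
    if not 8 <= len(psk) <= 63:
        raise ValueError("a WPA passphrase must be 8 to 63 characters long")
    return (
        "network={\n"
        f'    ssid="{ssid}"\n'
        f'    psk="{psk}"\n'
        f"    priority={priority}\n"
        "}\n"
    )


def render_config(networks, country="US"):
    """Render a client config; networks is a list of (ssid, psk), most preferred first."""
    header = (
        "ctrl_interface=DIR=/var/run/wpa_supplicant GROUP=netdev\n"
        "update_config=1\n"
        f"country={country}\n"
    )
    blocks = [
        network_block(ssid, psk, priority=len(networks) - index)
        for index, (ssid, psk) in enumerate(networks)
    ]
    return header + "\n" + "\n".join(blocks)


# Placeholder credentials, for illustration only.
print(render_config([("SchoolWiFi", "password-one"), ("HomeWiFi", "password-two")]))
```

Generating the file from a list makes it easier to keep every ROMI’s network list identical, and to regenerate it if the WebGUI blanks the file again.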

I know it’s likely the next WPILibPi release will have support for bridge + AP mode fallback built-in, but I wanted to share this interim solution with the community!

2024-02-03 ROMI WIFI Client with AP Fallback.pdf (122.7 KB)


Our team had similar problems with ROMIs dropping WiFi connections after being connected for only a few minutes. We had at most 3 ROMIs running at a time, but it didn’t matter if we only had one powered on. I suspect the school’s WiFi routers were to blame. Each classroom has its own high-powered Cisco WiFi AP. We were able to get a modest improvement by moving out into the stairwell (as far from an AP as possible), but not enough to make the ROMIs usable on campus.

The same ROMIs worked just fine when students took them home.

