Consistent out-of-memory errors when connecting multiple NT clients

Our team is building a robot for an offseason competition, and we’ve run into repeated out-of-memory errors when testing code on the robot. Based on some tests I’ve run, I believe I’ve narrowed the issue down to the number of NT clients connected.

The issues started when we added a Limelight 2 (v2023.6.0) for AprilTag vision and connected it to the roboRIO through a network switch. When we turned on the robot, it would work fine until I connected either AdvantageScope (v3.0.0-beta-5) or Shuffleboard (v2024.1.1-beta-3). The robot would then keep running for about 2-4 seconds before throwing a memory error. The robot code would restart and throw another memory error after ~2 seconds of running.

I tried disconnecting the Limelight (and removing references to it in the code) and was able to connect two instances of both AS and SB to the robot without issue for several minutes. With AS connected, I reconnected the Limelight, and it errored again after a couple of seconds. I repeated this test with SB connected instead of AS and got the same result. We had a Limelight 1 available, so I also tried connecting it alongside the Limelight 2 with nothing else connected to NT, and yet again it threw a memory error.

I don’t really know how to solve this besides not connecting AS or SB to the robot, but that would be a huge detriment to the software team and the drivers. I know other teams run a lot more cameras than we do and still connect AS and SB to their robot without problems, so I’m very curious what the actual issue with our robot is. One of our programmers also pointed out that the Limelight 2 connects with NT3 instead of NT4, which might be an issue if NT for some reason doesn’t like serving NT3 and NT4 clients at the same time. I’ll have more info on the memory errors later today when I recreate the issue and record what gets printed to the console.

Here’s our robot code repo, in case we’re doing something very wrong in our code that’s causing the memory errors. We’re using WPILib v2024.1.1-beta-3.

It might be better to take this conversation to the beta forums so we can track it better. It would also help me debug if you could capture everything that is getting published (e.g. a copy of the datalog if you’re saving one). One thing that’s unique to NT3 clients is that they effectively subscribe to everything in NT, but that’s also true of Shuffleboard currently. It may also be that the Limelight is sending a large amount of data, which causes a memory allocation spike as we replicate it to other clients. Do you see a problem with just the Limelight connected (without AS and/or Shuffleboard)?

Thanks for responding!

  1. Are the beta forums you refer to the allwpilib GitHub Issues/Discussions? If not, where can I find them?
  2. I should be able to retrieve a datalog later today.
  3. When the robot is running with just the Limelight it doesn’t throw any memory errors.

For the beta there is a dedicated issues/discussions area: the wpilibsuite/2024Beta Discussions on GitHub.

My guess is that lots of data being sent by the Limelight, combined with AS subscribing with options to send the full history of things, may be causing a lot of memory usage (preserving history with less frequent periodic sends means the server has to save it until it can send it). Does Limelight+SB work without AS connected?
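To make that guess a bit more concrete, here is a rough back-of-the-envelope sketch in Java. It is not how the NT server is actually implemented; it only models a per-client queue that either keeps every update until the next periodic send or keeps just the latest value. The class name, method name, and all the numbers in `main` are made up for illustration and are not measurements from the Limelight, AS, or SB.

```java
/** Toy model of how much data a server must hold per client between periodic sends. */
public class BacklogEstimate {
    /**
     * @param updatesPerSecond how often the publisher changes the value (hypothetical)
     * @param bytesPerUpdate   size of one update in bytes (hypothetical)
     * @param sendPeriodMillis how often the server flushes to this subscriber
     * @param keepAll          true if every intermediate value must be sent,
     *                         false if only the latest value is sent
     * @return worst-case bytes buffered for one client just before a flush
     */
    static long peakBacklogBytes(int updatesPerSecond, int bytesPerUpdate,
                                 int sendPeriodMillis, boolean keepAll) {
        if (!keepAll) {
            // Only the newest value is kept, so older updates can be dropped.
            return bytesPerUpdate;
        }
        // Number of updates that arrive during one send period, rounded up.
        long updates = ((long) updatesPerSecond * sendPeriodMillis + 999) / 1000;
        return Math.max(1, updates) * bytesPerUpdate;
    }

    public static void main(String[] args) {
        // Made-up example: 50 updates/s of 8000 bytes each, flushed every 100 ms.
        System.out.println("keep all:    " + peakBacklogBytes(50, 8000, 100, true) + " bytes");
        System.out.println("latest only: " + peakBacklogBytes(50, 8000, 100, false) + " bytes");
    }
}
```

The point of the model is only that the buffered amount grows with both the publish rate and the send period when history is preserved, and that this cost is paid once per connected client.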

Limelight + SB still throws a memory error, even without AS connected.

I made this spreadsheet while performing the tests to try to narrow down the issue (note that in the spreadsheet “Limelight 1” is the physical Limelight 2 and “Limelight 2” is the physical Limelight 1).

I got 8 .wpilogs for you (if that’s what you were looking for)

https://drive.google.com/drive/folders/1hIwOF4e5stuOtcupPMGb-eL-owuXXTCK?usp=sharing

General notes:

  • Limelight IP: 10.6.86.11
  • Log notes generated by reviewing the console that is logged by AdvantageKit in AdvantageScope
  • The memory error is consistently:
    OpenJDK Client VM warning: INFO: os::commit_memory(0xb439f000, 32768, 1) failed; error='Not enough space' (errno=12)
    #
    # There is insufficient memory for the Java Runtime Environment to continue.
    # Native memory allocation (mmap) failed to map 32768 bytes for committing reserved memory.
    # An error report file with more information is saved as:
    # /tmp/hs_err_pid2652.log
    
    (We’ve gotten other types of memory errors so I thought I’d clarify)
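For comparing several crashes, it may help to pull the numbers out of that warning line automatically. Below is a minimal sketch in Java using only the standard library. The class name and the regular expression are my own, written against the exact line quoted above; other JVM versions may format the message differently, so treat it as a starting point.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Extracts address, size, and errno from a JVM os::commit_memory failure line. */
public class CommitMemoryLine {
    // Written against: os::commit_memory(0xb439f000, 32768, 1) failed;
    //                  error='Not enough space' (errno=12)
    private static final Pattern FAILURE = Pattern.compile(
        "os::commit_memory\\((0x[0-9a-fA-F]+), (\\d+), (\\d+)\\) failed; "
        + "error='([^']*)' \\(errno=(\\d+)\\)");

    /** Parsed fields of one failure line. */
    static final class Failure {
        final long address;  // address the JVM tried to commit
        final long bytes;    // number of bytes requested
        final String error;  // error text reported by the JVM
        final int errno;     // OS error code (12 is ENOMEM on Linux)

        Failure(long address, long bytes, String error, int errno) {
            this.address = address;
            this.bytes = bytes;
            this.error = error;
            this.errno = errno;
        }
    }

    /** Returns the parsed failure, or empty if the line does not match. */
    static Optional<Failure> parse(String line) {
        Matcher m = FAILURE.matcher(line);
        if (!m.find()) {
            return Optional.empty();
        }
        return Optional.of(new Failure(
            Long.parseLong(m.group(1).substring(2), 16), // strip the "0x" prefix
            Long.parseLong(m.group(2)),
            m.group(4),
            Integer.parseInt(m.group(5))));
    }

    public static void main(String[] args) {
        String line = "OpenJDK Client VM warning: INFO: os::commit_memory(0xb439f000, 32768, 1) "
            + "failed; error='Not enough space' (errno=12)";
        parse(line).ifPresent(f ->
            System.out.println(f.bytes + " bytes requested, errno=" + f.errno));
    }
}
```

Note that the failing request here is only 32768 bytes, which suggests the system as a whole was out of memory at that moment rather than the JVM asking for one unusually large allocation.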

Log notes:

  • 0_InitialBootup
    • Log after powering on the robot
    • Crashes 20 seconds after Shuffleboard is connected
  • 4_Disconnected_SB
    • Disconnected Shuffleboard shortly after robotInit()
    • Crashes 0.167 seconds (!!!) after AdvantageScope is connected
  • 6_LongCrash
    • Somehow runs for a full minute before crashing
  • 7_Disconnected_LL
    • Disconnected the Limelight
    • Constant lljson error prints because of disconnected Limelight
    • Unknown crash cause, probably because I started downloading the logs at that point

I have created a GitHub discussion in the 2024Beta repo
