Severe (>100%?) packet loss

Can you post some of the logs (both dslog and dsevents)?

For #9 is this with an ethernet cable or USB?

It depends on whether the different buildings are really independent. For example if they were all from the same school with the same wifi mesh network setup, that wouldn’t necessarily be a valid test.

You’re right. It was a different build space shared with another team that was not seeing these issues as I understand it.

ds logs.zip (3.3 MB)

Somewhat unorganized, sorry. Thursday’s logs have a whole lot, but they’re there on Monday too. 9 is with Ethernet.

It’s happened

  1. In our shop at our school, which technically is in range of access points, but weak enough signal such that nobody actually has a functional internet connection. Tiny concrete box.
  2. In a mentor’s garage in a very residential area with not that many APs around.
  3. In a different school 10 ft away from another team’s robot which was not having said issues.

We can try wifi analyzer tomorrow.

Maybe log the traffic with Wireshark?

Do you have any ds logs from when you ran with a blank project? A few observations from the logs you sent

  1. You have a lot of text warnings/errors/logs being sent. I’ve read in a few places e.g. 1, 2, 3, that string concatenation is slower this year in Java or something along those lines. It may just be confirmation bias, but I definitely feel like it is worse this year for whatever reason.
  2. Your free memory on the RIO is quite low in all your data.
  3. The USB camera messages keep being printed, is this expected?
  4. 8:31:44 on 02_23_21_30_40 shows DS laptop CPU at 98%. This data is not at all granular enough for any kind of correlation, but may be something to peek at?

No idea if any of the above are related to your issues, but I have seen funky things with network traffic and either CPU usage or lots of things being printed to the DS. It’s likely worth cleaning up the various errors either way, and it may reduce some of the issues you’re seeing as a happy side effect?
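As a rough sketch of one way to cut down repeated console output: rate-limit each message so it is printed at most once per interval. The class name `ThrottledLog`, the one-second interval, and the simulated 20 ms loop are all made up for illustration; none of this is a WPILib API.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper: lets each distinct message through at most once per interval. */
final class ThrottledLog {
    private final long intervalMs;
    private final Map<String, Long> lastPrinted = new HashMap<>();

    ThrottledLog(long intervalMs) {
        this.intervalMs = intervalMs;
    }

    /** Returns true if the message should be printed at time nowMs. */
    boolean shouldLog(String message, long nowMs) {
        Long last = lastPrinted.get(message);
        if (last != null && nowMs - last < intervalMs) {
            return false; // printed too recently, drop it
        }
        lastPrinted.put(message, nowMs);
        return true;
    }

    public static void main(String[] args) {
        ThrottledLog log = new ThrottledLog(1000);
        // Simulate a 20 ms robot loop running for 3 seconds (150 iterations).
        int printed = 0;
        for (long t = 0; t < 3000; t += 20) {
            if (log.shouldLog("stale CAN frame", t)) {
                System.out.println(t + " ms: stale CAN frame");
                printed++;
            }
        }
        System.out.println("printed " + printed + " of 150 messages");
    }
}
```

Passing the timestamp in as a parameter keeps the helper easy to test; in robot code you would feed it whatever clock you already use.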

When you took the switch out of the equation, did you disconnect power to the switch? Wondering if you have too much on the VRM, or maybe the VRM is not functioning correctly.

I don’t have logs for an empty project but I can get them tomorrow and post.

  1. There are some stale CAN messages that we induced in the later logs; they are there because we lowered some CAN frame timings but are still logging more than we need to, so accessing stale data (in order to log it) prints the error (or at least that’s my understanding). We plan on cleaning that up soon, but it was happening before that anyhow. We do a decent amount of string concatenation but… I don’t know. I feel like concatenating Strings shouldn’t be causing issues unless you’re running 1950s hardware.

  2. I don’t actually know where to see the RAM in the logs, or DS CPU. I see rio cpu, but not DS.

  3. The USB camera thing is silly, I forgot we had that line in there, we can remove it, we don’t have a camera aside from Limelight. Leftover code from last year I guess.

We can try a new VRM on Monday. We left the power to the switch on.

It gets logged periodically to the event list like this:

image

But during a test just look at task manager to get better data.

Yes, it’s different. Java 17 removed the workaround we used to use to speed this up (-Djava.lang.invoke.stringConcat=BC_SB). We added a compiler option (-XDstringConcat=inline) to try to get back to baseline performance, but it may not work exactly the same in all situations, or it may make some situations worse than last year in order to avoid the worst case.

Unfortunately, without the above option, you can get hundreds of ms of delay for the first call to long string concatenations (later calls don’t have the slowdown).
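A rough way to look for the first-call delay on your own setup is to time the same concatenation twice. This is only a sketch: the method `describe` and its arguments are made up, and a single `System.nanoTime()` measurement is noisy, so treat the numbers as a hint and not a benchmark.

```java
final class ConcatTiming {
    /** Hypothetical log line built with a long '+' concatenation. */
    static String describe(int id, double volts, double amps, String state) {
        return "motor " + id + " volts=" + volts + " amps=" + amps
                + " state=" + state + " ok=" + (amps < 40.0);
    }

    /** Times one call to describe() and returns the elapsed nanoseconds. */
    static long timeOnceNanos() {
        long start = System.nanoTime();
        String s = describe(3, 12.1, 7.5, "RUNNING");
        long elapsed = System.nanoTime() - start;
        if (s.isEmpty()) {
            throw new IllegalStateException(); // keeps the result in use
        }
        return elapsed;
    }

    public static void main(String[] args) {
        long first = timeOnceNanos();   // includes any one-time setup cost
        long second = timeOnceNanos();  // same call site, already warmed up
        System.out.println("first call:  " + first / 1000 + " us");
        System.out.println("second call: " + second / 1000 + " us");
    }
}
```

Running it on the roboRIO itself, with and without the compiler option mentioned above, would be more telling than running it on a laptop.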

Unfortunately free memory is not what we (or the DS) should be looking at / logging. It should be looking at available memory. Free memory treats memory consumed for caching as “used”, so is a pretty useless metric, whereas available memory is the true measure of how much memory applications can still allocate (because the caches can be discarded). I forgot to ask NI to change it this year, hopefully we’ll remember to change this for 2024.
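To make the free vs. available distinction concrete, here is a small sketch that pulls both numbers out of text in the Linux `/proc/meminfo` layout. The sample values are made up (not from a real roboRIO), and it assumes the target kernel reports a `MemAvailable` line.

```java
import java.util.HashMap;
import java.util.Map;

final class MemInfo {
    /** Parses "Key:   12345 kB" lines (the /proc/meminfo layout) into a map of kB values. */
    static Map<String, Long> parse(String text) {
        Map<String, Long> out = new HashMap<>();
        for (String line : text.split("\n")) {
            String[] parts = line.trim().split("\\s+");
            if (parts.length >= 2 && parts[0].endsWith(":")) {
                String key = parts[0].substring(0, parts[0].length() - 1);
                out.put(key, Long.parseLong(parts[1]));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Made-up sample values for illustration only.
        String sample = "MemTotal:     250000 kB\n"
                + "MemFree:       20000 kB\n"
                + "MemAvailable: 110000 kB\n"
                + "Cached:        95000 kB\n";
        Map<String, Long> m = parse(sample);
        // MemFree counts cache as used; MemAvailable estimates what apps can still allocate.
        System.out.println("free:      " + m.get("MemFree") + " kB");
        System.out.println("available: " + m.get("MemAvailable") + " kB");
    }
}
```

On a real Linux target you could feed `parse` the contents of `/proc/meminfo` (for example via `Files.readString`) instead of the sample string.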

I have seen DS laptop slowdown cause packet loss and control lag. Try quitting all other apps (including shuffleboard) to see how that affects things.

Something I don’t see on your list of things tried: switching the radio to the other frequency band. Our school shop had terrible packet loss on 2.4GHz but is not too bad on 5GHz. The loss pattern was quite different from what you’re seeing though – the loss rate was constantly high rather than peaking so dramatically.
But it’s one more fairly simple test that you can do and would probably point you away from radio interference as the most likely cause of the problem if it didn’t make a difference.

I’m once again asking for a consolidated place to provide feedback on the DS that is taken seriously. The NI forums aren’t it.

This may be a long shot, but did you try swapping out the Rio?

Assuming you didn’t make any modifications before deploying this code to the Rio I think this eliminates your code as the major contributing factor.

I would probably treat this as a hardware/firmware issue as my number one troubleshooting priority. You said you swapped out cables, radio, and switch. You didn’t mention whether you re-flashed/swapped out the roboRIO.

The common intersection across all your tests, as I read them, is that it is the same roboRIO.

We haven’t yet, but it’s on the to-do list for today.

Is your robot radio password protected?

You mention having a limelight. Are you by chance streaming a high bit rate video stream to your dashboard AND have the 3Mbps data cap enabled on your radio? If your video stream is > 3Mbps the radio will drop a lot of packets to keep you under the limit.
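As a back-of-the-envelope sketch of why a stream over the cap hurts: if the offered traffic exceeds the limit, the excess share has to be dropped somewhere. The stream and control-traffic bitrates below are hypothetical, the cap is the figure quoted above, and this says nothing about which packets the radio actually drops.

```java
final class BandwidthCheck {
    /** Fraction of offered traffic that cannot fit under the cap (0 when under the cap). */
    static double excessFraction(double offeredMbps, double capMbps) {
        if (offeredMbps <= capMbps) {
            return 0.0;
        }
        return 1.0 - capMbps / offeredMbps;
    }

    public static void main(String[] args) {
        double capMbps = 3.0;      // cap figure quoted in the post above
        double streamMbps = 5.0;   // hypothetical camera stream bitrate
        double controlMbps = 0.1;  // hypothetical allowance for DS and dashboard traffic
        double excess = excessFraction(streamMbps + controlMbps, capMbps);
        System.out.printf("about %.0f%% of traffic would not fit under the cap%n",
                excess * 100);
    }
}
```

The point is only that control packets share the same limit as the video, so a stream well over the cap can show up as packet loss on the DS even when the robot code is fine.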

Wait, what? That doesn’t affect anything, does it?