Our teleop is taking too long, around 23 ms. I have been through the code and I cannot find any issues; I was hoping someone would see something we are not.
We found this year that using SmartDashboard.put* in the periodic() method of a subsystem greatly increased the scheduler loop's run time. You could try commenting those out and running a simple timing function on Scheduler.getInstance().run() to see if it has any effect on loop times.
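For reference, a minimal sketch of such a timing check in the Robot class could look like this (the 50-loop print interval is an arbitrary choice, just to keep the printing itself from adding overhead):

import edu.wpi.first.wpilibj.Timer;
import edu.wpi.first.wpilibj.command.Scheduler;

private int loopCount = 0;

@Override
public void robotPeriodic() {
    double start = Timer.getFPGATimestamp(); // seconds since FPGA boot
    Scheduler.getInstance().run();
    double elapsedMs = (Timer.getFPGATimestamp() - start) * 1000.0;
    if (++loopCount % 50 == 0) { // print only every 50th loop
        System.out.printf("Scheduler.run() took %.2f ms%n", elapsedMs);
    }
}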
We attached visualvm, a free profiler, to the running roboRIO code to see where it's spending its time. Interestingly, updates of the LiveWindow data showed up quite high. Drilling down further, LiveWindow.updateValues() ends up querying, for example, our Solenoids, which was surprisingly slow and unfortunate because we never actually look at the LiveWindow data. The fix there was to call LiveWindow.disableAllTelemetry() to suppress it.
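For anyone wanting to do the same, a single call in robotInit() is enough; a minimal sketch:

import edu.wpi.first.wpilibj.livewindow.LiveWindow;

@Override
public void robotInit() {
    // Keep LiveWindow from polling every sensor/actuator each loop;
    // as discussed below, it will then only run in Test mode.
    LiveWindow.disableAllTelemetry();
}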
We also throttled some SmartDashboard updates, only sending data every Nth call to the periodic methods, but we don't want to suppress them entirely because the data is actually useful to have on the dashboard.
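In code, that throttling can look like this sketch (the counter, the interval of 10, and getPosition() are illustrative, not our actual names):

import edu.wpi.first.wpilibj.smartdashboard.SmartDashboard;

private int dashboardCounter = 0;

@Override
public void periodic() {
    if (++dashboardCounter >= 10) { // every 10th call, i.e. ~200 ms at 20 ms loops
        dashboardCounter = 0;
        SmartDashboard.putNumber("Arm position", getPosition()); // hypothetical value
    }
}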
My main point: try to measure where your code is really spending its time, for example via visualvm.
I am concerned about LiveWindow on our robot; we appear to be having trouble with NT updates. But we are not using Test mode and thus LiveWindow should not be active. See https://wpilib.screenstepslive.com/s/currentCS/m/smartdashboard/l/255405-enabling-test-mode-livewindow
Do you (or anyone else) know why LiveWindow would be running when Test mode is not enabled?
LiveWindow telemetry is always active unless you disable it (e.g. with LiveWindow.disableAllTelemetry()). Test mode only enables NT updates to actually change things like motor values.
It looks like you've managed to hit this issue (hence confirming that more than one team is definitely running into this), which I suspect is related to the performance issues raised by CTRE: CTRE Phoenix 5.14.1 performance improvements?
It would be great if you could show us more detailed timing. It'd be interesting to see what parts of the Solenoid code actually end up being slow, and how long it takes on average. (I suspect most of the time is spent in HAL.getSolenoid?)
Umm, should that not be announced loudly? Certainly, it is not obvious from the documentation. That can be quite a load on the Rio and on the network. We have been experiencing what looks like overloading of NetworkTables, and this is now my #1 suspect.
No, NetworkTables does not consume much network bandwidth or CPU. A few hundred bytes at a 10 Hz rate is not much data. In terms of CPU load, the NetworkTable set() calls are non-blocking; it's basically grabbing a lock (that doesn't have much contention on it) and setting a value in a hash table. This takes a dozen microseconds at most per value, so even a hundred values would be less than 1 ms.
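If you want to convince yourself, a crude micro-benchmark along these lines would show the per-set cost (the table and key names are made up):

import edu.wpi.first.networktables.NetworkTableEntry;
import edu.wpi.first.networktables.NetworkTableInstance;
import edu.wpi.first.wpilibj.Timer;

NetworkTableEntry entry = NetworkTableInstance.getDefault()
        .getTable("timing").getEntry("value");
double start = Timer.getFPGATimestamp();
for (int i = 0; i < 100; i++) {
    entry.setDouble(i); // non-blocking: grabs a lock and stores the value
}
double elapsedUs = (Timer.getFPGATimestamp() - start) * 1e6;
System.out.printf("100 NT sets: %.0f us total%n", elapsedUs);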
However, if you have a lot of CAN devices, those CAN calls are indeed turning out to be much higher overhead than we had anticipated, and indeed don't need to be polled every loop (since the data is actually sent at a much slower rate).
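One way to act on that, sketched here with a Phoenix TalonSRX (the CAN ID, the interval of 5, and the method names are illustrative), is to cache the CAN read and only refresh it every Nth call:

import com.ctre.phoenix.motorcontrol.can.TalonSRX;

private final TalonSRX talon = new TalonSRX(1); // CAN ID 1, illustrative
private double cachedPosition = 0.0;
private int canPollCounter = 0;

public double getPosition() {
    if (++canPollCounter >= 5) { // refresh every 5th call (~100 ms at 20 ms loops)
        canPollCounter = 0;
        cachedPosition = talon.getSelectedSensorPosition(); // actual CAN-backed read
    }
    return cachedPosition;
}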
I believe we have around 12 CAN devices (lots of motors), so yes that could be a problem.
In our case LiveWindow.updateValues() was mostly stuck in getting the solenoid states via SolenoidJNI.getSolenoid(). The Java profiler doesn't show detail below the JNI call, but I assume it's CAN bus access.
A recent CTRE blog post suggests that the CTRE lib optimizes CAN bus access, i.e. only the first call out to a CAN bus device in each update period fetches data, and other calls within the same period get cached data. But without LiveWindow we would have no solenoid access at all unless somebody pushes buttons which actuate them, so disabling it is even better.
Looking at the LiveWindow code, I was thinking that it's only active in Test mode. It took a while to notice that it's actually always enabled unless you specifically disable it (and then it would only run in Test mode).
For what it's worth, being able to run visualvm has been extremely helpful to us, but it's a little hard to enable because you need to add options like these to the java command that starts the robot:
-Dcom.sun.management.jmxremote=true -Dcom.sun.management.jmxremote.port=1099 -Dcom.sun.management.jmxremote.local.only=false -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false -Djava.rmi.server.hostname=172.22.11.2
That's doable for somebody familiar with Linux: ssh to the robot, use vi to edit the robotCommand. But then you need to re-do that each time you deploy code to the robot, since deploying restores the robotCommand.
It would be great to add that to the gradle setup, but I have no idea how to do that.
You can put those args in the jvmArgs array in the frcJavaArtifact section of your build.gradle to accomplish that.
Just tonight I found what seemed like a major delaying factor in Java:
Improbable Java slow down in combining doubles and strings
This seems to explain the problems we've been seeing, and I could see it causing ripple problems throughout the code stack if my analysis is correct.
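If you want to check whether you are affected, a crude micro-benchmark of that pattern might look like this (the iteration count is arbitrary, and accumulating the lengths just keeps the JIT from discarding the work):

import edu.wpi.first.wpilibj.Timer;

double value = 123.456;
int totalLength = 0;
double start = Timer.getFPGATimestamp();
for (int i = 0; i < 1000; i++) {
    totalLength += ("value: " + value).length(); // each pass converts the double to a String
}
double elapsedMs = (Timer.getFPGATimestamp() - start) * 1000.0;
System.out.printf("1000 double-to-String concatenations: %.2f ms (len %d)%n",
        elapsedMs, totalLength);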
Did you have to do anything special to get visualvm to read the data from the RIO? I had no luck (but I am far from a Java expert) when I tried.
Our code doesn't seem to have a jvmArgs option in the default-generated frcJavaArtifact section.
Can you provide an example or point to a link that shows how to add JVM arguments properly to the build.gradle file? Would it just look like this?
frcJavaArtifact('frcJava') {
    targets << "roborio"
    // Debug can be overridden by command line, for use with VSCode
    debug = frc.getDebugOrDefault(false)
    jvmArgs = [
        "-Dcom.sun.management.jmxremote=true",
        "-Dcom.sun.management.jmxremote.port=1099",
        "-Dcom.sun.management.jmxremote.local.only=false",
        "-Dcom.sun.management.jmxremote.ssl=false",
        "-Dcom.sun.management.jmxremote.authenticate=false",
        "-Djava.rmi.server.hostname=172.22.11.2"
    ]
}
Here's what I did:
Get visualvm from https://visualvm.github.io
When you launch it, you'll see locally running Java programs.
The hard part is instrumenting the roborio so that you can connect to the java instance running inside the robot.
For now we do that by connecting to the robot and editing the file over there that starts the robot code.
We installed "git" support, and when you open the "git bash" shell, you can connect to the robot via
# Use 10.TE.AM.2 when using radio, otherwise the 'USB' IP address
ssh lvuser@172.22.11.2
If ssh issues "WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED!" because you had used it with a different RoboRIO:
# Remove old SSH info, then try to connect again
rm .ssh/known_hosts
ssh lvuser@172.22.11.2
There is no password required.
Edit the file robotCommand. By default it will look like this:
/usr/local/frc/JRE/bin/java -Djava.library.path=/usr/local/frc/third-party/lib -jar "/home/lvuser/NameOfYourProject.jar"
Add these settings:
-Dcom.sun.management.jmxremote=true -Dcom.sun.management.jmxremote.port=1099 -Dcom.sun.management.jmxremote.local.only=false -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false -Djava.rmi.server.hostname=172.22.11.2
The settings need to be added after the .../bin/java, and use the correct IP for the hostname setting. To do that with the vi editor (the resulting line is shown after these steps):
vi robotCommand
# Move cursor to the end of .../bin/java
# Press 'i' for insert
# Paste the text for the added settings
# Press 'ESC' to exit insert mode
# Type ':wq' and press 'Enter' to write and quit
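The robotCommand should then be a single line looking like this:
/usr/local/frc/JRE/bin/java -Dcom.sun.management.jmxremote=true -Dcom.sun.management.jmxremote.port=1099 -Dcom.sun.management.jmxremote.local.only=false -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false -Djava.rmi.server.hostname=172.22.11.2 -Djava.library.path=/usr/local/frc/third-party/lib -jar "/home/lvuser/NameOfYourProject.jar"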
Stop the running Java instance:
/usr/local/frc/bin/frcKillRobot.sh
After a few seconds you should see that a new one is automatically started, this time using the additional JMX settings:
ps | grep java
Back on the laptop, start visualvm
and connect to the program running on the robot.
- File, Add JMX Connection
- "Connection:" 172.22.11.2:1099 (respectively 10.TE.AM.2:1099)
- Check "Do not require SSL connection"
- A new entry with a "pid" should appear under the "Remote" list. Double-click, then check "Monitor", "Sample... CPU" etc.
As mentioned in the discussion above, it should be possible to add those -Dcom.sun.m... settings to the build.gradle file, and then you simply deploy your code as usual, with no need to ssh into the robot and edit files over there after each deployment.
Just tried the build.gradle route. Adding this worked for me:
artifacts {
    frcJavaArtifact('frcJava') {
        targets << "roborio"
        // Debug can be overridden by command line, for use with VSCode
        debug = frc.getDebugOrDefault(false)
        // ADDED SECTION TO ENABLE visualvm
        jvmArgs = [ "-Dcom.sun.management.jmxremote=true",
                    "-Dcom.sun.management.jmxremote.port=1099",
                    "-Dcom.sun.management.jmxremote.local.only=false",
                    "-Dcom.sun.management.jmxremote.ssl=false",
                    "-Dcom.sun.management.jmxremote.authenticate=false",
                    "-Djava.rmi.server.hostname=172.22.11.2"
        ]
        // END OF ADDED SECTION
    }
}
Now simply deploy, then on the laptop start visualvm: File, Add JMX Connection, "Connection:" 172.22.11.2:1099, ...
Ideally, since gradle is uploading to the roborio and thus knows its IP address, it should be able to automagically place that IP address in the "-Djava.rmi.server.hostname=<The IP Address>" setting, but I have no idea how to do that.
This is our CPU usage. Not sure what waitForNotifierAlarm is, but it seems to be using up the entire CPU.
I would call that a false positive. That function is waiting for the next time step in TimedRobot. So it would make sense that most of the time is being spent there (e.g. if your code spends 10 ms doing something, and the loop time is 50 ms, it will spend 40 ms in that function).