I am trying to deploy code to a test bench setup. I click “Deploy Robot Code” in VS Code and it says “build successful”, but nothing appears in the RioLog, the Driver Station still says “No Robot Code”, and the COMM light on the roboRIO is still red. I have restarted my computer and the rio repeatedly. I have uninstalled and reinstalled the FRC Game Tools and VS Code. I have reformatted the rio repeatedly. I have reconfigured the radio repeatedly. I have replaced the rio, the radio, and the cables connecting them. I have been struggling with this for 3 days now, please help!!
Are you:
A: Using the 2022.3.1 release of WPILib?
B: Using the new v4.0 image on your roboRIO? (You have to run the RIO imaging tool as an administrator for it to work.)
I had the same issue, but with Phoenix Tuner not getting the CAN IDs.
Yes, I am using 2022.3.1 and v4.0. My Phoenix Tuner is working fine, btw.
Can you give us a screenshot of the whole Driver Station window with the Diagnostics tab open?
We encountered the same problem this week, however it was resolved by reformatting (it was already on the latest image). I’m not sure if the root cause is the same, but you may not be alone here.
Have you tried deploying and/or reformatting from another machine?
We’ve also experienced the same thing on two separate roboRIOs. Both were updated with the 2022 v4.0 image. Both times we were able to reformat the roboRIO, and the problem went away for a while until it came back.
I’ll try to see if there are any logs that I can pull off tomorrow.
So you can reproduce this problem? Can you define exactly what steps you are taking to reproduce it? Can you simplify any of those steps from custom code to examples or generic code and still reproduce it?
It seems to happen during the normal course of development and testing. We will be deploying code just fine for a period of time, then a deploy will show as successful but the robot code will never load. One of our roboRIOs hasn’t been reimaged yet after experiencing the problem. I will deploy an empty command-based project to it and report back.
I have created an empty command template and deployed it, and it worked! I’m going to copy my code over one section at a time to see what the problem is.
I copied over all of the code and it works fine! I don’t know what the problem was, but it’s fixed now.
So here is a list of things we tried, in order, today with the affected roboRIO:
- Tried the same code, with no change from yesterday (the thought being that a good night's sleep and a power cycle were what the robot needed). Did not work.
- Tried the same code with deploy-dirty, which should have ignored caching and copied the jar over. Did not work.
- Tried using a freshly generated template project. Did not work.
- Verified that the jar from the template did copy and that the robot command looked correct.
- Manually started the robot code from a shell. This did work.
- Tried the deploy again. Did not work.
- Tried to disable the auto startup flag via the roborio flash tool.
- Tried the deploy again. Did not work.
- Reimaged. The deploy copied a bunch more (even more than deploy-dirty) and did work.
I was working with the 2158 roborio last night ( with @AndrewI ) and looking back through my notes today.
When the deploy worked after reimaging, the pstree showed:
|-lvrt-daemon(1673)---su(3514)---MainAppThread(3515)---sh(3551)---bash(3555)-+-FRC_ConsoleTee(3567)
|                                                                            |-java(3566)
|                                                                            `-tee(3568)
Which are the following processes:
3514 admin 0:00 /bin/su -- lvuser -l -c /etc/init.d/lvrt-wrapper CRASHED_AND_RESTART /var/run/lvrt_wrapper.pid
3515 lvuser 0:03 {MainAppThread} ./lvrt
3551 lvuser 0:00 /bin/sh -c /usr/local/frc/bin/frcRunRobot.sh
3555 lvuser 0:00 /bin/bash -l -c /usr/local/frc/JRE/bin/java -XX:+UseConcMarkSweepGC -XX:+AlwaysPreTouch -Djava.lang.invoke.stringConcat=BC_SB -Djava.library.path=/usr/lo
3566 lvuser 0:45 /usr/local/frc/JRE/bin/java -XX:+UseConcMarkSweepGC -XX:+AlwaysPreTouch -Djava.lang.invoke.stringConcat=BC_SB -Djava.library.path=/usr/local/frc/third-pa
3567 lvuser 0:00 FRC_ConsoleTee
3568 lvuser 0:00 {tee} /bin/busybox.nosuid /usr/bin/tee /var/local/natinst/log/FRC_UserProgram.log
If I take a look at the open files for the lvrt process:
3515 /usr/local/natinst/labview/lvrt /dev/null
3515 /usr/local/natinst/labview/lvrt /dev/tty0
3515 /usr/local/natinst/labview/lvrt /dev/tty0
3515 /usr/local/natinst/labview/lvrt /dev/null
3515 /usr/local/natinst/labview/lvrt /dev/tty0
3515 /usr/local/natinst/labview/lvrt socket:[28416]
3515 /usr/local/natinst/labview/lvrt socket:[28418]
3515 /usr/local/natinst/labview/lvrt socket:[28433]
3515 /usr/local/natinst/labview/lvrt /var/local/natinst/log/lvrt_20.0_lvuser_cur.txt
3515 /usr/local/natinst/labview/lvrt /usr/local/natinst/labview/english/rtapp.rsc
3515 /usr/local/natinst/labview/lvrt /home/lvuser/natinst/bin/TBLStartupApp.rtexe
3515 /usr/local/natinst/labview/lvrt pipe:[29014]
3515 /usr/local/natinst/labview/lvrt pipe:[29014]
3515 /usr/local/natinst/labview/lvrt /var/volatile/tmp/natinst/shared/mutex/lvrt_traceengine
3515 /usr/local/natinst/labview/lvrt /var/local/natinst/tracelogs/lvrt_trace.log
3515 /usr/local/natinst/labview/lvrt socket:[28438]
3515 /usr/local/natinst/labview/lvrt /usr/local/natinst/labview/tdtable.tdr
3515 /usr/local/natinst/labview/lvrt /etc/natinst/lvrt.conf.d
3515 /usr/local/natinst/labview/lvrt /var/volatile/tmp/nipal/.initClnLock
3515 /usr/local/natinst/labview/lvrt /var/volatile/tmp/nipal/.proc1Lock
3515 /usr/local/natinst/labview/lvrt /var/volatile/tmp/nipal/NISSPALSharedDataSegment
3515 /usr/local/natinst/labview/lvrt /dev/nipalk
3515 /usr/local/natinst/labview/lvrt socket:[29017]
3515 /usr/local/natinst/labview/lvrt socket:[29018]
3515 /usr/local/natinst/labview/lvrt socket:[28446]
3515 /usr/local/natinst/labview/lvrt socket:[29020]
3515 /usr/local/natinst/labview/lvrt pipe:[28453]
3515 /usr/local/natinst/labview/lvrt pipe:[28454]
I was trying to understand the lvuser_daemon script, and found this command in it:
/usr/local/natinst/bin/nirtcfg -l -f /etc/natinst/share/lvrt.conf
[LVRT]thpolicy_tcrit=fifo
[LVRT]thpri_tcrit=94
[LVRT]thpolicy_vhigh=rr
[LVRT]thpri_vhigh=6
[LVRT]thpolicy_high=rr
[LVRT]thpri_high=5
[LVRT]RTCPULoadMonitoringEnabled=True
[LVRT]RTMode=True
[LVRT]appFont="paratype-pt sans" 12
[LVRT]dialogFont="paratype-pt sans" 12
[LVRT]systemFont="paratype-pt sans" 12
[LVRT]RTSupportsAppEvents=TRUE
[LVRT]SocketSetReuseAddr=TRUE
[LVRT]RTTarget.ApplicationPath=/home/lvuser/natinst/bin/TBLStartupApp.rtexe
[LVRT]RTTarget.LaunchAppAtBoot=True
Based on this NI knowledge base article, TBLStartupApp.rtexe seems to be the code that is actually running and executing the robot command:
https://knowledge.ni.com/KnowledgeArticleDetails?id=kA03q000000YHpnCAG&l=en-US
It also seems to be a LabVIEW program: https://www.reddit.com/r/FRC/comments/2r9yty/starting_c_code_at_boot/
Looking at the lsof/ps output I captured before reimaging, lvrt is running and TBLStartupApp.rtexe is an open file for one process, but the robotCommand and Java command aren’t showing.
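The comparison between the healthy and failing captures can be sketched as a small check over saved ps output. This is only an illustration, assuming the output has the four-column shape shown above (PID, user, time, command) and has been saved to a text file on a laptop. The function names are made up for this example, and the process names it looks for are taken from the healthy capture above, not from any NI or WPILib documentation.

```python
def commands(ps_text):
    """Extract the command column from ps output shaped like the capture above."""
    cmds = []
    for line in ps_text.splitlines():
        parts = line.split(None, 3)  # PID, user, time, rest of the line
        if len(parts) == 4 and parts[0].isdigit():
            cmds.append(parts[3].strip())
    return cmds


def missing_processes(ps_text):
    """Return which links of the robot-code process chain are absent."""
    cmds = commands(ps_text)
    # Checks are based only on the healthy capture in this thread (assumption).
    checks = {
        "lvrt": lambda c: c.endswith("./lvrt"),
        "frcRunRobot.sh": lambda c: "frcRunRobot.sh" in c,
        "java": lambda c: c.startswith("/usr/local/frc/JRE/bin/java"),
        "FRC_ConsoleTee": lambda c: c == "FRC_ConsoleTee",
    }
    return [name for name, ok in checks.items()
            if not any(ok(c) for c in cmds)]
```

On a capture like the failing one described here (lvrt present, nothing below it), this would list frcRunRobot.sh, java and FRC_ConsoleTee as missing, which points at the step where lvrt launches the start script.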
Since this happened more than twice, I am going to assume it will happen again.
When it does, I plan to try and manually trigger a deploy after the jar is copied, and robotCommand is correctly created by following these steps on an ssh terminal:
I may also try to run gradlew --debug deploy from the command line. (I didn’t do it last night, as the team member was running all of the deploys from VS Code, and it is hard to capture output that way.)
If anyone has further ideas for data gathering (how to get more logs from lvrt or TBLStartupApp.rtexe, who I should talk to about either of those, or anything to confirm or add to what I need to do to manually start the robot code), please let me know.
Bumping this thread, we had the same problem show up tonight. We tried switching the code base, the computer, and the connection we deployed over, as well as power cycling. We ended up reimaging the Rio and deploys started working, but that doesn’t seem sustainable for a competition setting.
For the roboRIO 2, perhaps this could be a reason to keep a second SD card on hand. Of course, that may only save you once.
Bumping this, as at least 2 teams that I know of missed a match this weekend alone due to this issue. There is still no known fix besides re-imaging, which takes longer than a timeout.
This is a majorly critical problem, and every time you deploy, there’s a ~1% chance your rio bricks.
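To put that figure in perspective, here is a rough calculation. It takes the ~1% per-deploy estimate at face value (it is an estimate from this post, not a measured rate) and assumes each deploy fails independently of the others, which may not hold if the failure depends on something specific, such as timing during the deploy.

```python
def chance_of_at_least_one_failure(per_deploy, deploys):
    """Probability of one or more failures across independent deploys."""
    return 1.0 - (1.0 - per_deploy) ** deploys


for n in (10, 50, 100):
    p = chance_of_at_least_one_failure(0.01, n)
    print(f"{n} deploys: {p:.1%} chance of hitting the problem at least once")
```

Under those assumptions, a team that deploys 50 times over an event has roughly a 39% chance of hitting it at least once, and about 63% over 100 deploys.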
We need data to figure out what’s causing this. If anyone has a Rio 2 that runs into this problem, please make an image of the SD card (you can use Etcher to do this) and post it somewhere we can grab. Unfortunately I don’t think there’s a good way to make an image of a Rio 1.
Is what you need for imaging a RIO 1.0. It will work to create an image file on most RoboRIOs provided there is enough free space - it requires the space so that it can tar the files on the local system prior to sending them across.
The source is actually available here: https://github.com/ni/rad
I just ran into this over the weekend, and started spelunking around the file system on the Roborio after finding this thread. I discovered that the frcRunRobot.sh script in /usr/local/frc/bin had lost all permissions and was 0 bytes in size, with a timestamp of that day. It was also marked as owned by admin:administ, whereas the other files in that directory are owned by 1000:1000. I copied the script from a working Roborio into place, and matched ownership / permissions on the file, and our deployed program immediately started running!
As a new forum user, I can’t upload a copy of the script, but maybe someone else can. Not sure if this is even the same thing that’s happening to other people, but it’s worth a check before going through the process of re-imaging the rio.
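That check can be sketched in a few lines of Python. This is a minimal sketch, not a supported tool: it assumes Python is available wherever the file can be read (for example, a roboRIO 2 SD card mounted on a laptop), the function name is made up for this example, and the expected 1000:1000 ownership is simply what the post above reports for the other files in that directory. The expectation that the owner's execute bit is set is an assumption as well.

```python
import os
import stat


def script_problems(path, expected_uid=1000, expected_gid=1000):
    """Return a list of problems found with the start script at path."""
    try:
        st = os.stat(path)
    except FileNotFoundError:
        return ["missing"]
    problems = []
    if st.st_size == 0:
        problems.append("empty (0 bytes)")
    if stat.S_IMODE(st.st_mode) == 0:
        problems.append("no permission bits set")
    elif not st.st_mode & stat.S_IXUSR:
        # Assumption: a healthy start script is executable by its owner.
        problems.append("not executable by owner")
    if (st.st_uid, st.st_gid) != (expected_uid, expected_gid):
        problems.append(
            f"owned by {st.st_uid}:{st.st_gid}, "
            f"expected {expected_uid}:{expected_gid}"
        )
    return problems
```

Calling script_problems("/usr/local/frc/bin/frcRunRobot.sh") on an affected system like the one described above would be expected to report the empty file, the missing permission bits, and the changed ownership. An empty list means none of these three symptoms is present, not that the script is necessarily correct.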
Thank you! This is exactly the info we needed, and I think I know what’s happening so we can avoid it for 2023.