We're seeing this issue too, and so far the only solution for us is to bypass the switch. But then no cameras, so it's not a real solution.
Seeing the “internal error” problem, or getting timeouts? What is your full network configuration? Just a switch shouldn’t cause any issues, but what’s connected to the switch might. If you’re streaming video over a slower/congested Wi-Fi connection, that could cause issues. Make sure your radio is set to 5 GHz mode and look to see if there are other networks active on the same channel.
This doesn’t seem to be fixed with the new DS. Now, sometimes enabling and disabling will cause the DS to flicker on whether it can resolve the roboRIO, despite pings still going through.
No video or cameras at this point. We can drive and pick up with no issue, BUT if we shoot, we get three shots and then communication is lost for about 20 seconds, even if we shoot slowly. If we connect the radio directly to the roboRIO, everything works. But with the switch (we tried 3 switches, ALL with the same issue), even when it’s powered by a second 12 V supply, we stop. We have swapped the PoE, Ethernet cables, NEO motors (gen 1 & 2), Ethernet switches, roboRIO 1 & 2, the voltage regulator, and two different laptops, and we unplugged the Ethernet to the Raspberry Pis. We also tried using the second radio port for a switch to the Raspberry Pis, but we could not ping them, so again a no-go. I posted a picture of the errors.
I’ve only had the chance to quickly skim this thread.
I look forward to downloading the NI Tools update; I wish I had seen this thread sooner, like before this weekend.
I see talk in this thread of a web browser being enough to trigger the 44000 error. I personally wondered if something like a PathPlanner hot reload or additional NetworkTables traffic from AdvantageKit logging might be a contributing factor as well.
These are only guesses that may have no footing, but I ask after spending this past weekend on trial and error, trying to find a workaround with the pre-patch tools.
Any thoughts?
For anyone still seeing this issue, upgrade NI Game Tools first. There was at least one bug fixed that should make the disables less likely to occur. If you’re still getting random disables after upgrading, there should be a more useful message in the DS log. Post the error message here along with screenshots of the DS timing window when the problem occurs (along with computer specs, what other software is running, etc) to help NI diagnose.
Here is the error message
Yeah, this is the error I’m seeing.
Can you both click on the gear and choose “View Timing” then post a screenshot of that?
Let me clarify and add some detail to dhrivnak’s post.
We installed the update to game tools and now we get better error messages but the problem persists. Robot randomly disables when shooting.
We’re running the radio at 5 GHz.
Performance Monitor on the driver station shows no issues, and there doesn’t seem to be any misbehaving software.
We’ve replaced all hardware and Ethernet cables with no effect.
Our network consists of the radio, with a cable from its PoE port connected to a 10/100 switch. The switch is connected to the roboRIO, a Raspberry Pi, and an Orange Pi. We completely disconnected the Orange Pi and Raspberry Pi, and the problem persists.
If we remove the switch and connect the radio directly to the roboRIO, the problem goes away.
If we tether with the switch in place the problem goes away.
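One way to line the comms gap up with shots is a quick reachability logger run on the driver station laptop. A minimal sketch (assuming the standard 10.TE.AM.2 roboRIO address, so 10.40.20.2 for us; the class name is just for illustration):

```java
import java.net.InetAddress;
import java.time.LocalTime;

/** Minimal sketch: timestamps every transition between the roboRIO
 *  answering and not answering, so gaps can be matched against shots. */
public class RioPingLog {
  public static void main(String[] args) throws Exception {
    InetAddress rio = InetAddress.getByName("10.40.20.2"); // assumed 10.TE.AM.2 address
    boolean wasUp = true;
    while (true) {
      boolean up = rio.isReachable(250); // 250 ms timeout per probe
      if (up != wasUp) {
        System.out.println("[" + LocalTime.now() + "] roboRIO "
            + (up ? "reachable again" : "UNREACHABLE"));
        wasUp = up;
      }
      Thread.sleep(100); // probe roughly 10x per second
    }
  }
}
```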
We tried installing the 2023 driver station on a PC, and the symptoms changed. There are random occurrences where the software and communications indicators on the driver station turn red and all motors stop, but the driver station remains enabled; then a second or two later the motors restart. On the 2024 driver station the bot is also disabled.
Find below a couple of charts; note that our PC speed and memory should be sufficient.
The chart below shows no significant issues until the disable occurs. The communications and software indicators on the driver station turn red as soon as the first green spike occurs. The robot is disabled when the blue spike occurs.
It’s odd, but we don’t see the random disable occur when driving or running other motors such as the shooter until we actually run a note through the shooter.
Thinking we might have electrical noise causing communication errors, we powered the switch from an independent battery, with no positive effect.
We’ve tried everything that’s been recommended to identify the source of the problem. It seems to be something embedded in the driver station code. We’re at the limit of what we can do.
It seems like you’ve effectively isolated the problem to the switch. Have you tried replacing the switch with a different switch, or a different make/model of switch? What’s the switch make/model?
We’ve replaced the switch, the roboRIO, the radio, and the PoE power supply.
I can check the model of the switch tomorrow, but I believe it is one of those from FIRST Choice…. We have a couple of different models. We’re considering buying a gigabit switch, hoping that it has less latency.
Very strange… this is apropos:
Maybe the computer is doing some sort of periodic ARP probe that confuses the switch and causes it to drop packets for a bit?
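One quick way to test that guess would be to poll the OS ARP table from the driver station laptop and timestamp any changes around a disable. A rough sketch (hypothetical helper; it just shells out to the standard `arp -a` command):

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.time.LocalTime;

/** Rough sketch: prints a timestamped ARP table snapshot whenever it
 *  changes, to see if ARP churn lines up with the disables. */
public class ArpWatch {
  public static void main(String[] args) throws Exception {
    String last = "";
    while (true) {
      Process p = new ProcessBuilder("arp", "-a").redirectErrorStream(true).start();
      StringBuilder sb = new StringBuilder();
      try (BufferedReader r =
          new BufferedReader(new InputStreamReader(p.getInputStream()))) {
        String line;
        while ((line = r.readLine()) != null) {
          sb.append(line).append('\n');
        }
      }
      p.waitFor();
      String snapshot = sb.toString();
      if (!snapshot.equals(last)) {
        System.out.println("[" + LocalTime.now() + "] ARP table changed:\n" + snapshot);
        last = snapshot;
      }
      Thread.sleep(1000); // poll once a second
    }
  }
}
```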
Third member of 4020 chiming in here. The switch used in the test that produced the screenshot from @Brhea is a D-Link GO-SW-5E. It was removed from our 2023 competition bot, having run successfully through 2 regionals, worlds, and an offseason event. We found the problem originally when using a D-Link DES-1005C, which had successfully run on a competition bot from a season before 2023.
An additional note on the computer: the failures are exactly the same with two different computers running as driver stations, one Windows 10 and the other Windows 11. The failure timing is always right after a shot, although it does not happen after every shot. That is the randomness. The failures happen anywhere from after the 1st shot to after the 5th shot, with the most likely shot count before failure split about evenly between 2 and 3. The failure is not random in a general sense: we can drive the robot around indefinitely, intake as much as we would like, and run the conveyor and shooter (without shooting) as much as we would like. We aren’t browning out during/after a shot; the voltage barely dips when shooting. We have stress-tested voltage drops by driving the robot stalled against a wall, and this does not cause a loss of communication and subsequent disable.
If you move the switch “relatively far” away, does the problem persist?
What is your shooter configuration? Anything special?
Strange problem for sure. The only other thing I can think of is trying a Brainboxes or similar switch.
We’ll have to try moving the switch away with some longer cables tomorrow. We have switched out all cables, but have not tried particularly long ones.
Shooter is nothing special: two NEOs/Spark MAXes running low-inertia AM Stealth wheels on separate axles with a 1:1 belt drive. Pretty common 0.5" compression. Failures occur whether running the motors open or closed loop (feedforward+P). Failures seem to be pretty much the same running at 30%, 100%, or anywhere else. I don’t remember trying less than 30%.
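For reference, the feedforward+P velocity loop is nothing exotic; in WPILib terms it is roughly the sketch below (the gains and units are placeholders, not our actual values):

```java
import edu.wpi.first.math.controller.PIDController;
import edu.wpi.first.math.controller.SimpleMotorFeedforward;

/** Sketch of a feedforward+P shooter velocity loop; gains are placeholders. */
public class ShooterLoopSketch {
  private final SimpleMotorFeedforward ff =
      new SimpleMotorFeedforward(0.1, 0.002); // kS, kV (placeholders)
  private final PIDController pid =
      new PIDController(0.0005, 0.0, 0.0); // P-only correction (placeholder)

  /** Returns a motor output in volts for a target wheel velocity in RPM. */
  public double calculateVolts(double targetRpm, double measuredRpm) {
    return ff.calculate(targetRpm) + pid.calculate(measuredRpm, targetRpm);
  }
}
```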
We’ve got a gigabit switch on order to try, just in case the gigabit-capable circuitry might reduce any latency. Not a Brainboxes switch, though. I’m probably going to try running a gigabit router configured as a switch tomorrow.
Paulonis found a gigabit router and configured it as a switch. We swapped this in for our switch, and the disables no longer occur. We’ve ordered a gigabit switch, which we will test in a couple of days.
The chart traces shown above for DS timings are loop rates. The green sawtooth indicates when status packets either aren’t being returned from the robot program or aren’t being processed by the laptop. The other loops are fine. The blue spike is odd, but may simply coincide with the print statement posting from that loop indicating that it is disabling the robot; I’ll have to force this to happen to verify.
My suspicion is that this has nothing to do with the speed of the switch or communications, but with power to the switch: when your robot shoots, either a voltage drop or a mechanical disruption to the cabling is causing your Ethernet switch to reboot. Notice that the status seems to come back quickly.
If you look at your DS logs of that event, I suspect you will see about a 10-second comms loss, which is what I associate with D-Link boot times (10 to 12 seconds).
If you would like to post the logs or a screenshot of the log, we can see whether the voltage drops due to a shot. That brings up the topic of how your switch is powered: if it is simply being powered off the PDH/PDP, that power is not boosted and will droop. If the switch expects a 12 V supply, it may reboot at 9 V or 8 V or ???
So even if your issue seems to go away, think about how your accessories are powered and ensure they get the voltage supply they expect when the robot motors are working hard.
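If you want a robot-side record of the voltage alongside the DS log, a minimal sketch of that telemetry (WPILib; the dashboard keys are arbitrary):

```java
import edu.wpi.first.wpilibj.RobotController;
import edu.wpi.first.wpilibj.TimedRobot;
import edu.wpi.first.wpilibj.smartdashboard.SmartDashboard;

/** Minimal sketch: publishes battery voltage and the rio's brownout flag
 *  every loop, so a shot-time droop is visible next to the comms loss. */
public class Robot extends TimedRobot {
  @Override
  public void robotPeriodic() {
    SmartDashboard.putNumber("Battery Voltage", RobotController.getBatteryVoltage());
    SmartDashboard.putBoolean("Browned Out", RobotController.isBrownedOut());
  }
}
```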
I’d like to add that my team is having the same problem (the 44004 Driver Station etc. etc. one). We’re on the latest patch, nothing open, no switch, just the radio with PoE. It really seems like a problem with the driver station or the RIO image.
edit: Pure swerve, full battery, nothing running but the modules and a Pigeon.
Can you post a picture of the “view timing” window and the error message?