Weird NetworkTables behavior correlated with cs::MjpegServer

My team uses a Raspberry Pi imaged with WPILibPi for vision processing, and NetworkTables is used to control many program runtime options as well as to exchange output data. This summer I have been working to make the system more expandable, and have run into some strange NetworkTables behavior when trying to create and manage multiple output streams (cs::MjpegServer).
In order to control which input each MjpegServer is connected to, we have a class extending cs::MjpegServer and containing a shared pointer to a networktable with all necessary data. Whenever there is just one output stream, everything works as normal, but any more than 2 and approximately half of the streams don’t output anything. At first I thought this was an issue with the streams themselves, but after extensive testing the only common variable seemed to be the presence and use of networktables within the container class. Below is the class definition as well as the constructor, which shows how the networktable values are initialized.

Whenever I remove all calls that involve networktables in the constructor, all the streams work as normal and output video, but if the networktables code is included, then the first 50% of however many streams I want are invalid and display nothing. Below are some test cases showing the behavior.

In the images, the highlighted terminal areas where no name is output correspond to the streams that are not working in a given test. What OutlineViewer shows is even stranger. When I open OutlineViewer for the first time, all of the information shows up as normal for the “Streams” ntables. When I rerun the program with OutlineViewer still open, all tables that correspond to valid streams stay updated, but those that are invalid “toggle” between having missing information and being accurate. For example, if I rerun the program with 8 output streams, the first 4 ntables do not have the “Port” entry. If I rerun again, all the tables are good; this toggles on every run. Since this is the only time I have ever seen this behavior, I’m thinking that it is related to what is happening with the streams, and not just an issue with OutlineViewer.
Going back to the stream objects: they are stored in a std::vector in the larger “VisionServer” class. In the test cases I have provided, the following method is used to emplace back the stream objects.

This style of instantiation is only used to keep the examples simple, and is not the cause of the problem. If I add the streams one by one, the problem still exists, and it persists even if I add waits in between calls. I have also tried flushing ntables within each call, but that did nothing. What makes it even stranger is that I use the same pattern of extending cscore classes and storing networktable shared pointers for other parts of the code (pipelines and cameras), and have never had this issue. At this point I feel like I’ve tried everything, so it might be a deeper WPILib issue that I cannot fix, and that’s why I’m making this post. For the full code, our vision library is on github here: GitHub - FRC3407/VisionServer: A C++ vision processing library designed for raspberry pi deployment on an FRC robot, under the “lib-vs” directory. I probably didn’t clarify everything, so please ask about any additional specifics that may need to be considered. Thanks in advance for any suggestions/fixes!

This line is problematic:
inline static const std::shared_ptr<nt::NetworkTable> streams_table{VisionServer::base_table->GetSubTable("Streams")};

In C++, the relative order of static initialization across translation units is unspecified. So base_table may not be initialized in time, and the overall NT infrastructure may not be up and running yet. Try switching away from the use of statics and see if that helps your problem.
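One common way to sidestep the ordering question is to construct on first use through function-local statics. The following is only a sketch: FakeTable, baseTable(), streamsTable() and the "/Base" path are hypothetical stand-ins so that it compiles without WPILib, and they are not the real nt::NetworkTable API or the names used in VisionServer.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical stand-in for nt::NetworkTable (assumption, not the real API).
struct FakeTable {
    std::string path;
    std::shared_ptr<FakeTable> GetSubTable(const std::string& name) const {
        return std::make_shared<FakeTable>(FakeTable{path + "/" + name});
    }
};

// A function-local static is initialized the first time control reaches it,
// so the base table always exists before anything that is derived from it.
inline const std::shared_ptr<FakeTable>& baseTable() {
    static const std::shared_ptr<FakeTable> table =
        std::make_shared<FakeTable>(FakeTable{"/Base"});
    return table;
}

// Depends on baseTable(); calling it forces the base table to be built first.
inline const std::shared_ptr<FakeTable>& streamsTable() {
    static const std::shared_ptr<FakeTable> table =
        baseTable()->GetSubTable("Streams");
    return table;
}

// Small self-check: the subtable path is built from the base path.
inline bool streamsPathIsUnderBase() {
    return streamsTable()->path == baseTable()->path + "/Streams";
}
```

Callers would write streamsTable() instead of naming a static member, so nothing depends on which translation unit happens to initialize first.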

Alright I moved all the base tables to be initialized in the VisionServer constructor, which is for certain called before any of the streams are… and the same result. I even added a 3 second wait between the NT instance creation and any other calls, but that also did nothing.

The code on GitHub doesn’t seem to exactly match your code snippets above re: OutputStream, but looking at the code on github, the OutputStream constructor is also adding a NT listener with a this capture. If the class is moveable or copyable, that pointer will be invalidated if the instance is moved… so e.g. when you’re building a vector of OutputStream objects, and the vector is resized, the objects get moved, but the previously captured this is now pointing to invalid memory… UB. The right way to fix this is to make the class non-moveable and non-copyable and use unique_ptr instead of using the class by value.
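A minimal sketch of that suggestion, with the listener API mocked out: Registry and Stream are hypothetical names, the port numbers are arbitrary, and the std::function list only stands in for whatever NT listener registration the real code used. The point is that a callback capturing this stays valid because each object lives at a fixed heap address behind a unique_ptr.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <type_traits>
#include <vector>

// Hypothetical listener registry (assumption: stands in for an NT listener API).
using Registry = std::vector<std::function<int()>>;

class Stream {
public:
    Stream(int port, Registry& registry) : port_(port) {
        // Captures `this`: only safe while this object never changes address.
        registry.push_back([this] { return port_; });
    }
    // Non-copyable and non-movable, so the captured pointer cannot go stale.
    Stream(const Stream&) = delete;
    Stream& operator=(const Stream&) = delete;
    Stream(Stream&&) = delete;
    Stream& operator=(Stream&&) = delete;

private:
    int port_;
};

static_assert(!std::is_move_constructible<Stream>::value,
              "Stream must stay at a fixed address");

// Builds `count` streams, letting the vector grow freely, then fires every
// callback while the streams are still alive and returns what they reported.
inline std::vector<int> pollAfterGrowth(int count) {
    Registry registry;  // declared first, so it outlives nothing it calls into
    std::vector<std::unique_ptr<Stream>> streams;
    for (int i = 0; i < count; ++i) {
        streams.push_back(std::make_unique<Stream>(1181 + i, registry));
    }
    std::vector<int> ports;
    for (const auto& callback : registry) {
        ports.push_back(callback());
    }
    return ports;
}
```

When the vector reallocates here, only the unique_ptr values move; the Stream objects they point to stay where they are. A real class would also need to unregister its listener in its destructor.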

Yeah, so I haven’t committed in a while as I still have not fixed this issue, but I have already done away with those listeners and replaced them with a check that runs during each processing frame. What is shown in the original post is all that is left of the OutputStream constructor. I just committed everything that I have changed, so it should be up to date now.
The mention of resizing the vector does make me wonder though, since only the first half of the streams are broken - meaning the vector probably reallocates, and whatever streams were already added are the ones affected? I guess I’m not sure what exactly there is left to be invalidated, could the shared pointer to the networktable somehow be affected?

EDIT: I just recompiled with an added “.reserve(8)” in the VS constructor, and all the streams worked, so this is definitely the issue. I’m still not sure exactly why though?
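To illustrate what reserve changes, here is a small sketch that counts how often elements are move-constructed while a vector is filled. Tracked and movesWhileFilling are hypothetical names for this demonstration only. Without reserve the vector reallocates as it grows and moves the elements it already holds; with enough capacity reserved up front, no element is moved.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical element type that counts its own move constructions.
struct Tracked {
    static int moves;
    Tracked() = default;
    Tracked(Tracked&&) noexcept { ++moves; }
    Tracked(const Tracked&) = delete;
};
int Tracked::moves = 0;

// Fills a vector with `count` elements and reports how many times existing
// elements were move-constructed along the way.
inline int movesWhileFilling(std::size_t count, bool reserveFirst) {
    Tracked::moves = 0;
    std::vector<Tracked> items;
    if (reserveFirst) {
        items.reserve(count);  // one allocation up front, no later reallocation
    }
    for (std::size_t i = 0; i < count; ++i) {
        items.emplace_back();  // constructed in place, never moved by this call
    }
    return Tracked::moves;
}
```

So reserve does not fix a faulty move constructor; it only keeps that constructor from being called during growth, which would be consistent with the earlier streams being the ones affected.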

We’d have to see the full code to see why. Usually issues like this are caused in other places in code, so bits and pieces are not helpful, nor is out of date code.

Your move and copy constructors for OutputStream do not move/copy the base class… so the underlying MjpegServer class is just getting default-constructed (and the moved-from one is eventually getting destroyed). You need to add MjpegServer(std::move(o)) to the move constructor (and similar for the copy constructor).
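The difference can be sketched without cscore. FakeServer below is a hypothetical stand-in for cs::MjpegServer, under the assumption that the real base class owns some handle that a default-constructed instance lacks; BrokenStream and FixedStream are illustrative names and not the actual OutputStream code.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical stand-in for cs::MjpegServer: owns a handle, 0 means invalid.
class FakeServer {
public:
    FakeServer() = default;
    explicit FakeServer(int handle) : handle_(handle) {}
    FakeServer(FakeServer&& other) noexcept : handle_(other.handle_) {
        other.handle_ = 0;  // the moved-from server gives up its handle
    }
    bool valid() const { return handle_ != 0; }

private:
    int handle_ = 0;
};

class BrokenStream : public FakeServer {
public:
    explicit BrokenStream(int handle) : FakeServer(handle) {}
    // Bug: no base initializer, so FakeServer() is default-constructed and
    // the new object ends up with an invalid handle.
    BrokenStream(BrokenStream&& other) noexcept
        : name_(std::move(other.name_)) {}

private:
    std::string name_;
};

class FixedStream : public FakeServer {
public:
    explicit FixedStream(int handle) : FakeServer(handle) {}
    // Fix: forward to the base move constructor before moving the members.
    FixedStream(FixedStream&& other) noexcept
        : FakeServer(std::move(other)), name_(std::move(other.name_)) {}

private:
    std::string name_;
};

// Move-constructs a stream from another and reports whether it is still valid.
template <typename Stream>
bool validAfterMove(int handle) {
    Stream original(handle);
    Stream moved(std::move(original));
    return moved.valid();
}
```

In the broken variant every stream that gets moved during a vector reallocation would come out with a default-constructed base, which matches the symptom of the earlier streams going dark.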

This worked. Thanks for your time!
