Best way to accomplish hosting web server on Jetson (+couple other misc. questions)

Hi CD,

I’ve been working on setting up all of our vision code in the offseason, and I was wondering what the best way to set up an OpenCV control dashboard would be. At the moment, I’m planning on giving our Jetson TX1 a static IP and hosting an Apache web server on it, serving a simple site with some Bootstrap sliders for filter calibration and a preview window. I was thinking of having a JSON file which contains all of the variables that the sliders control, and the C++ OpenCV code would just read its filter parameters from that JSON file.
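To make the JSON idea concrete, here is a minimal sketch of the C++ side, assuming the file is a single flat object of numeric values such as {"h_min": 50, "h_max": 90}. The function name parseFlatJsonNumbers and the key names are hypothetical, and the regex shortcut does not handle nesting, arrays, or string values, so a real JSON library would be the safer choice for anything beyond this.

```cpp
#include <map>
#include <regex>
#include <string>

// Hypothetical helper: pulls every "name": number pair out of a flat JSON
// object. This is a shortcut, not a real JSON parser: it ignores nesting,
// arrays, string values, and exponent notation.
std::map<std::string, double> parseFlatJsonNumbers(const std::string &text) {
  static const std::regex pair_re(
      "\"([^\"]+)\"\\s*:\\s*(-?[0-9]+(?:\\.[0-9]+)?)");
  std::map<std::string, double> values;
  for (std::sregex_iterator it(text.begin(), text.end(), pair_re), end;
       it != end; ++it) {
    // Group 1 is the key, group 2 is the numeric value.
    values[(*it)[1].str()] = std::stod((*it)[2].str());
  }
  return values;
}
```

The vision loop could re-read the file whenever it changes and feed values such as h_min and h_max into its threshold call.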

The only thing I can’t figure out is how to display a preview of OpenCV frames in the hosted HTML5 site… I read somewhere about FFMPEG conversion or something… :confused:
It would be amazing to have a bootstrap toggle of some sort which allows us to preview the OpenCV processed feed and raw camera feed if we have to.

If there’s a better way to accomplish this dashboard project, let me know. I’m aiming to make calibration and previewing camera feed easy. I’d rather not have to play around with gstreamer if I can avoid it.

On a side note, has anyone tried using UDP packets for sending data to the roboRIO? Would you recommend that over NetworkTables or a serial connection?


Edit: Maybe using GStreamer is a good idea? Hardware encoding/decoding is never a bad thing. The only problem is I have no idea how to use GStreamer in conjunction with OpenCV.

I’m going to ignore all the web server stuff… mostly because I don’t have a good answer but I know it can be done.

As for sending UDP packets - yeah you can do that. I would highly recommend looking at ZeroMQ instead of just sending raw UDP frames but that’s me.

What’s the downside to sending raw UDP frames? Just packet loss?

Plus a lot of low-level coding to define a protocol which you get basically for free with a slightly higher level library.

Yeah, what he said.

  1. Web Interface
    In the past I’ve found success with Spark (for Java), or Mongoose (C++). Both of these have websocket libraries you can use to communicate with a JavaScript script running in the client’s browser. Previously I just forwarded the bounding boxes of vision targets over and rendered them on an HTML5 canvas (here), which proved to be very resource efficient both in transport and rendering.

You can use other methods of sending video, but unless you need it, I find sending just the tracked data is enough.

  2. UDP
    Go for it. Making your own protocol isn’t too difficult. For example, you could just send bounding boxes of your targets as 4 integers (x, y, width, height), with one bounding box per packet. You could get more advanced, or more simple; it’s really up to you. You could even use something like Google’s Protobuf or my own Fixed-Width Interchange (shameless plug) if you want something to abstract your protocol even more.
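As a rough sketch of the "4 integers per packet" idea, the box can be packed into a fixed 16-byte buffer in big-endian (network) byte order, so both ends agree regardless of their native endianness. The names BoundingBox, encodeBox, and decodeBox are made up for this example; the resulting buffer is what you would hand to send() and get back from recv().

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

struct BoundingBox {
  int32_t x, y, width, height;
};

constexpr std::size_t kBoxPacketSize = 16;  // 4 fields * 4 bytes each

// Writes one 32-bit value in big-endian order.
void putInt32(uint8_t *out, int32_t value) {
  const uint32_t u = static_cast<uint32_t>(value);
  out[0] = static_cast<uint8_t>(u >> 24);
  out[1] = static_cast<uint8_t>(u >> 16);
  out[2] = static_cast<uint8_t>(u >> 8);
  out[3] = static_cast<uint8_t>(u);
}

// Reads one big-endian 32-bit value.
int32_t getInt32(const uint8_t *in) {
  const uint32_t u = (static_cast<uint32_t>(in[0]) << 24) |
                     (static_cast<uint32_t>(in[1]) << 16) |
                     (static_cast<uint32_t>(in[2]) << 8) |
                     static_cast<uint32_t>(in[3]);
  return static_cast<int32_t>(u);
}

std::array<uint8_t, kBoxPacketSize> encodeBox(const BoundingBox &box) {
  std::array<uint8_t, kBoxPacketSize> packet{};
  putInt32(&packet[0], box.x);
  putInt32(&packet[4], box.y);
  putInt32(&packet[8], box.width);
  putInt32(&packet[12], box.height);
  return packet;
}

BoundingBox decodeBox(const std::array<uint8_t, kBoxPacketSize> &packet) {
  return BoundingBox{getInt32(&packet[0]), getInt32(&packet[4]),
                     getInt32(&packet[8]), getInt32(&packet[12])};
}
```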

For a dashboard you could always use NetworkTables and either SmartDashboard or FRCDashboard.

As for showing a stream, mjpg-streamer is always useful. I also think you can use the cscore MjpegServer object to host a stream. This might also solve your other issue, as I believe the web page it hosts has some exposure and lighting config sliders.

Wait, what does just sending the bounding boxes tell you? How can you debug and calibrate at competitions?

When you run your vision tracking code, you’re usually finding contours. You can draw a bounding box around these contours and it’s simply just a rectangle. These represent your vision tracking targets (i.e. the retroreflective material).

By sending them to the web UI, I could draw a line in the middle of the canvas and calibrate based on the offset between the centre of the box and the centre of the canvas. With the camera’s focal length known, some simple trigonometry gives you the angle offset.
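For reference, the angle calculation can be sketched as below, assuming a simple pinhole camera model and a focal length expressed in pixels. The function and parameter names are illustrative, not from any particular library.

```cpp
#include <cmath>

// Horizontal angle (in degrees) between the camera's optical axis and the
// centre of a bounding box. Positive means the target is right of centre.
// focal_length_px is the focal length in pixels, assumed known from
// calibration or from the camera's field of view.
double horizontalAngleDegrees(double box_x, double box_width,
                              double image_width, double focal_length_px) {
  const double kPi = std::acos(-1.0);
  const double box_center = box_x + box_width / 2.0;
  const double image_center = image_width / 2.0;
  // Pinhole model: tan(angle) = pixel offset / focal length.
  return std::atan2(box_center - image_center, focal_length_px) * 180.0 / kPi;
}
```

A box centred in the image gives 0 degrees, and a pixel offset equal to the focal length gives 45 degrees.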

Oh, yeah you could use them to calculate the angle offset, but the entire point of making a dashboard like this is so you can tune the values in your OpenCV filters. In other words, we are trying to preview our OpenCV processing filters so we can tune the values to track the tape well.

At competition, they always give you a “calibration period” on the first day, where teams can calibrate their robot’s OpenCV filters to the field’s lighting. This project intends to make that (and debugging) easy.

We moved off ZeroMQ on a project because we couldn’t get the retry behavior that we wanted from it, and kept running into use-after-free issues. After watching one of our best engineers spend a month debugging it (full time), I’ve been very hesitant to suggest it again. YMMV obviously. For a UDP protocol, we rolled our own. For a TCP protocol, we’ve been using gRPC to good effect.

I’ve been quite happy with seasocks. Essentially, encode your image as a JPEG or PNG, base64 encode it, and send it over a websocket. Then, drop it in an image tag when it arrives. It works really well. I’ve got live plotting of controls data working with it (1 kHz data), and am able to edit parameters. After watching all the cool things supported by using webpages instead of traditional thick UIs, I’m a believer. (I’ve debugged systems across the country with SSH + port forwarding)
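The base64 step is small enough to sketch without a library. This assumes you already have the encoded JPEG bytes in memory (for example from OpenCV’s imencode); base64Encode and jpegDataUri are names made up for this example, and the websocket send itself is left to whichever server library you use.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Standard base64 (RFC 4648 alphabet, with '=' padding).
std::string base64Encode(const std::vector<uint8_t> &data) {
  static const char kAlphabet[] =
      "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
  std::string out;
  out.reserve(((data.size() + 2) / 3) * 4);
  std::size_t i = 0;
  // Full 3-byte groups become 4 output characters.
  for (; i + 2 < data.size(); i += 3) {
    const uint32_t n = (data[i] << 16) | (data[i + 1] << 8) | data[i + 2];
    out.push_back(kAlphabet[(n >> 18) & 63]);
    out.push_back(kAlphabet[(n >> 12) & 63]);
    out.push_back(kAlphabet[(n >> 6) & 63]);
    out.push_back(kAlphabet[n & 63]);
  }
  if (i + 1 == data.size()) {  // One byte left over: two pad characters.
    const uint32_t n = data[i] << 16;
    out.push_back(kAlphabet[(n >> 18) & 63]);
    out.push_back(kAlphabet[(n >> 12) & 63]);
    out += "==";
  } else if (i + 2 == data.size()) {  // Two bytes left over: one pad.
    const uint32_t n = (data[i] << 16) | (data[i + 1] << 8);
    out.push_back(kAlphabet[(n >> 18) & 63]);
    out.push_back(kAlphabet[(n >> 12) & 63]);
    out.push_back(kAlphabet[(n >> 6) & 63]);
    out.push_back('=');
  }
  return out;
}

// Builds a string the browser can assign directly to an <img> element's src.
std::string jpegDataUri(const std::vector<uint8_t> &jpeg_bytes) {
  return "data:image/jpeg;base64," + base64Encode(jpeg_bytes);
}
```

On the browser side, the websocket message handler can then set the image element’s src to the received string.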

TIL. If this dude says it can be done then I’d trust his word over mine. I’m not into the pain of rolling my own protocol though - I wish I was that hardcore.

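As long as you limit yourself to 1472 bytes per packet (the largest UDP payload that fits in a standard 1500-byte Ethernet MTU once the 20-byte IP header and 8-byte UDP header are subtracted), it’s pretty easy. The following is what we did last year. The packet size is carried in the UDP header, so it’s free from a protocol point of view.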



Header file

#include <arpa/inet.h>
#include <math.h>
#include <sys/socket.h>
#include <unistd.h>
#include <string>
#include <vector>

#include "aos/common/macros.h"
#include "aos/common/scoped_fd.h"

namespace aos {
namespace events {

// Simple wrapper around a transmitting UDP socket.
// LOG(FATAL)s on setup errors. Send reports failures through its return value.
class TXUdpSocket {
 public:
  TXUdpSocket(const std::string &ip_addr, int port);

  // Returns the number of bytes actually sent.
  int Send(const char *data, int size);

 private:
  ScopedFD fd_;
};

// Simple wrapper around a receiving UDP socket.
// LOG(FATAL)s for all errors, including from Recv.
class RXUdpSocket {
 public:
  RXUdpSocket(int port);

  // Returns the number of bytes received.
  int Recv(void *data, int size);

 private:
  ScopedFD fd_;
};

}  // namespace events
}  // namespace aos


C++ file

#include "aos/vision/events/udp.h"

#include <string.h>

#include "aos/common/logging/logging.h"

namespace aos {
namespace events {

TXUdpSocket::TXUdpSocket(const std::string &ip_addr, int port)
    // Create the UDP (datagram) socket; PCHECK aborts if socket() fails.
    : fd_(PCHECK(socket(AF_INET, SOCK_DGRAM, 0))) {
  sockaddr_in destination_in;
  memset(&destination_in, 0, sizeof(destination_in));
  destination_in.sin_family = AF_INET;
  destination_in.sin_port = htons(port);
  if (inet_aton(ip_addr.c_str(), &destination_in.sin_addr) == 0) {
    LOG(FATAL, "invalid IP address %s\n", ip_addr.c_str());
  }

  // connect() on a UDP socket just fixes the default destination for send().
  PCHECK(connect(fd_.get(), reinterpret_cast<sockaddr *>(&destination_in),
                 sizeof(destination_in)));
}

int TXUdpSocket::Send(const char *data, int size) {
  // Don't fail on send. If no one is connected that is fine.
  return send(fd_.get(), data, size, 0);
}

RXUdpSocket::RXUdpSocket(int port)
    : fd_(PCHECK(socket(AF_INET, SOCK_DGRAM, 0))) {
  sockaddr_in bind_address;
  memset(&bind_address, 0, sizeof(bind_address));

  bind_address.sin_family = AF_INET;
  bind_address.sin_port = htons(port);
  // Listen on every local interface.
  bind_address.sin_addr.s_addr = htonl(INADDR_ANY);

  PCHECK(bind(fd_.get(), reinterpret_cast<sockaddr *>(&bind_address),
              sizeof(bind_address)));
}

int RXUdpSocket::Recv(void *data, int size) {
  return PCHECK(recv(fd_.get(), static_cast<char *>(data), size, 0));
}

}  // namespace events
}  // namespace aos


  ::aos::events::RXUdpSocket recv(8080);
  // 65507 bytes is the largest possible UDP payload over IPv4.
  char rawData[65507];

  while (true) {
    VisionData target;  // The protobuf with your data in it.
    int size = recv.Recv(rawData, sizeof(rawData));
    CHECK(target.ParseFromArray(rawData, size));
    // ... use target here ...
  }

Really, not bad at all. I’d be quite tempted to wrap the rx part up in a class which returns (or populates) the protobuf all at once to hide the buffer, but that’s just cleanup work.

TCP is a lot more work. There’s a lot more which can go wrong. I’d just go look at grpc if you want to do that.

I’ve always been one for rolling your own protocol also. If you’re using C/++ and both systems are the same endianness, you can even transmit structs directly assuming all inner types are not pointers.

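#include <stdint.h>

typedef struct {
    uint16_t id;
    int32_t x, y, h, w;
} Rectangle;

// ....

// SOCKET is the Winsock handle type; on POSIX systems it is a plain int.
void sendRect(SOCKET sock, Rectangle myRect) {
    // send() takes a byte pointer and a flags argument (0 = no flags).
    send(sock, (const char *)&myRect, sizeof(myRect), 0);
}

void recvRect(SOCKET sock, Rectangle *myRect) {
    recv(sock, (char *)myRect, sizeof(*myRect), 0);
}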

Don’t forget to verify that your data is, indeed, valid. In the above example, one would check against the “id” integer to verify the packet is what we’re expecting.
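A minimal sketch of that validation step, assuming the same Rectangle layout on both ends. The function parseRectangle and the id value kRectanglePacketId are hypothetical and only here for illustration. It checks the received length first, then the id, then a basic sanity condition on the fields.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

struct Rectangle {
  uint16_t id;
  int32_t x, y, h, w;
};

// Hypothetical packet id; any value both sides agree on would do.
constexpr uint16_t kRectanglePacketId = 0x5243;

// Copies a received buffer into *out only if it looks like a valid Rectangle.
// Returns false (leaving *out untouched) for a wrong size, wrong id, or
// negative dimensions.
bool parseRectangle(const void *buffer, std::size_t length, Rectangle *out) {
  if (length != sizeof(Rectangle)) return false;
  Rectangle candidate;
  std::memcpy(&candidate, buffer, sizeof(candidate));
  if (candidate.id != kRectanglePacketId) return false;
  if (candidate.w < 0 || candidate.h < 0) return false;
  *out = candidate;
  return true;
}
```

Pass it the buffer and the byte count that recv() returned, and drop the packet whenever it returns false.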

If you’re using Java, the ByteBuffer class is a good place to start; however, you can’t serialize data objects directly like in C/++.

EDIT: Also make sure that both your systems represent data types in the same way. Use the inttypes header where possible to avoid issues.

While that works for simple data, you get into compatibility problems pretty fast. There has been a lot of work put into data formats which are endian agnostic and upgradeable. I’ve been using protobufs to good effect for years on a bunch of different projects. I’ve heard good things about flatbuffers and capnproto, but I don’t have direct experience with them. Especially in an FRC environment where you should be learning, it’s worth spending some time reading up on the different message serialization libraries and learning what problems they are trying to solve.

I think we’ve beaten this one to death. Give us a holler if you need more advice.

Have you watched Team 254’s video on Vision Processing? It’s super useful:

Also, regarding tuning, it’s preferable to tune in a way such that you’ll only need to tune once and no matter what type of lighting you’re in, it should still get a good image. This is mainly HSV tuning and also adjusting the brightness of your camera as well as the ISO (see 254 vid). For OpenCV, the easiest way to do this is something like this: (ignore the picamera stuff)

I wouldn’t make a webserver if you can avoid it as it just adds complications to your project and may slow down the processing itself during a match. The easiest way to get around that while tuning is to simply hook up a monitor and do a cv2.imshow() to see the mask vs no mask.