Jetson TK1 from First Choice Keeps Failing

Hello all,

We are having issues with our Jetson TK1, which we received from FIRST Choice.

It is running JetPack 2.0, the latest release.

It starts up fine, and I can ssh into it to load our custom programs. However, after a while the file system remounts itself as read-only.

If I recall correctly, Ubuntu does this automatically to protect the file system when it detects an error that could make the system unstable.

So this led me down the faulty hard disk path.
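Before digging into the disk, the read-only state itself can be confirmed from the mount table. Below is a minimal sketch, assuming a Linux system where /proc/mounts is available; the helper name mount_mode is made up for illustration and is not part of any tool.

```shell
#!/bin/sh
# Hypothetical helper: print "ro" or "rw" for a mount point.
# $1: mount point (default "/")
# $2: mounts table to read (default /proc/mounts)
mount_mode() {
    mp="${1:-/}"
    table="${2:-/proc/mounts}"
    # Fields: device, mount point, fs type, options, dump, pass.
    # The last matching line wins, because later mounts shadow earlier ones.
    awk -v mp="$mp" '
        $2 == mp { opts = $4 }
        END {
            if (opts == "") exit 1          # mount point not found
            n = split(opts, o, ",")
            mode = "rw"
            for (i = 1; i <= n; i++) if (o[i] == "ro") mode = "ro"
            print mode
        }' "$table"
}
```

Running `mount_mode /` after the failure should print `ro` if the kernel has remounted the root file system. On ext4 this behavior usually comes from the `errors=remount-ro` mount option, which would also show up in the options column.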

Running a few commands, I get the following:

dmesg | grep "EXT4-fs error"

[   42.960625] EXT4-fs error (device mmcblk0p1): ext4_mb_generate_buddy:755: group 5, 3697 clusters in bitmap, 3696 in gd
[   87.527366] EXT4-fs error (device mmcblk0p1): __ext4_journal_start_sb:62: Detected aborted journal

and

sudo fsck /dev/mmcblk0p1 -y -c -f -b -l

/dev/mmcblk0p1: ***** FILE SYSTEM WAS MODIFIED *****
/dev/mmcblk0p1: ***** REBOOT LINUX *****
/dev/mmcblk0p1: 166977/917504 files (0.0% non-contiguous), 2920712/3670016 blocks
Error writing block 32768 (Attempt to write block to filesystem resulted in short write). Ignore error? yes

Error writing block 98304 (Attempt to write block to filesystem resulted in short write). Ignore error? yes

Error writing block 163840 (Attempt to write block to filesystem resulted in short write). Ignore error? yes

Error writing block 229376 (Attempt to write block to filesystem resulted in short write). Ignore error? yes

Error writing block 294912 (Attempt to write block to filesystem resulted in short write). Ignore error? yes

Error writing block 819200 (Attempt to write block to filesystem resulted in short write). Ignore error? yes

Error writing block 884736 (Attempt to write block to filesystem resulted in short write). Ignore error? yes

Not sure where to go from here. It looks like the disk has a hardware failure. This report goes on for miles.
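Since the output is long, it may help to condense it before posting. Here is a rough sketch, assuming the dmesg and fsck output have been saved or piped as plain text in the format shown above; the function names ext4_error_counts and short_write_count are invented for this example.

```shell
#!/bin/sh
# Hypothetical helper: read kernel log text on stdin and print
# "<device> <count>" for each device that reported an EXT4-fs error.
ext4_error_counts() {
    sed -n 's/.*EXT4-fs error (device \([^)]*\)).*/\1/p' \
        | sort | uniq -c | awk '{ print $2, $1 }'
}

# Hypothetical helper: read fsck output on stdin and print how many
# "Error writing block" lines it contains.
short_write_count() {
    # grep -c exits non-zero when the count is 0, so mask that status.
    grep -c 'Error writing block' || true
}
```

For example, `dmesg | ext4_error_counts` gives a per-device tally, and piping a saved fsck log into `short_write_count` gives the number of failed block writes.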

Any recommendations for a fix? Also, does anyone know whom (or how) to contact at NVIDIA about an item obtained through FIRST?

Thanks,
Kevin

What happens if you do not load the custom program? Does the Jetson remain stable then?

Thanks for the reply. It continues to fail even if our custom program doesn’t run. Our vision program doesn’t write out to the hard disk.

We are powering it through the provided wall wart power supply which came with the Jetson.

I have re-flashed it 3 times using JetPack 2.0, with the same issue each time. I have tested our program on another Jetson TK1, which does not fail at all and runs for hours without remounting as read-only.

Try running ‘fsck’ on the root filesystem as root.

fsck -f /dev/root

When the error occurs, the fs will already be mounted read-only.
The -f flag should force a check even though it’s an ext4 fs.

We had pretty bad luck with the reliability of the TK1. At one point, we had 2 that flat out refused to boot. I’d be careful if you decide to run this on your robot.

Ouch…

We have not had those issues. Would highly recommend putting them in a case of some sort though. We also ran ours off of the VRM last year if that helps.

The TX1 is proving to be more of a challenge however… It seems very prone to power fluctuations.

Do you have CAD for a case or links to one?

This is what we used last year:

I threw this together for the TX1 this year and we are using a modified version of it:

Could you provide any insight on the choice of using a full enclosure case vs. the ‘sneeze guard’?? What would be risked if a more minimal (no side covering) case was used?

Uhh… crap can get in the side? Nothing really, just what I felt like CAD’ing that day. I do what I want! :smiley:

Ok. I will try to run that command and report back. Every time the FS goes read-only, a reboot does not fix it. It continues to start up as read-only, and trying to remount just puts it back to read-only automatically. I need to re-flash the drive and then will try again. I was also going to try booting off an SD card.

Thanks for the warnings, Tom. Any idea what caused your failures? We used a BeagleBone in the past with much success, but learned that the rootfs needs to be read-only to keep intermittent power shutdowns from corrupting the disk. Typically we plan to run our TK1 in read-only mode, but right now we keep it in write mode because we’re actively developing on it.

What did you do with the 2 that didn’t boot? Did you contact NVIDIA somehow?

I do have it in a case; we modified the 3D-printed NVIDIA case and keep it in there.

My best guess is eMMC corruption due to yanking the power. Mounting the file system as read only probably makes this better. We didn’t really have the cycles to investigate and fix this so we just ended up using another solution.