I was digging through the source and found this:
https://github.com/opencv/opencv/blo...pp#L1536-L1690
So now I'm curious: does the performance boost come from your OpenCV library being built without CV_NEON, or from NEON intrinsics being significantly slower than plain assembly?