It is my understanding that the Raspberry Pi supports NEON (I'm targeting the Pi 4 and Zero 2), although I couldn't find this documented directly anywhere. There are intrinsics for NEON, but is there any way to get auto-vectorization working?
After much digging to get auto-vectorization working for any ARMv8 target, I found that -march=armv8-a has no visible effect, whereas -march=armv8-a+sve finally produces vectorized code. However, that code crashes on the Pi 4 with SIGILL. Is this because the Pi 4 doesn't support SVE (and NEON is not SVE)? Or could it be due to data alignment issues?
I'm quite confused, because vector instructions are supposed to be mandatory for ARMv8-A. So why is there no auto-vectorization?
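To make the question concrete, here is a minimal sketch of the kind of loop I'd expect to be auto-vectorized with plain NEON, plus a runtime check for whether the CPU reports SVE at all. The function names (saxpy, cpu_has_neon, cpu_has_sve) are hypothetical and exist only for this illustration. The check assumes a 64-bit Linux userland, where getauxval(AT_HWCAP) exposes the HWCAP_ASIMD and HWCAP_SVE bits; on any other platform the helpers simply return 0. As far as I can tell, the Cortex-A72 in the Pi 4 and the Cortex-A53 in the Zero 2 implement Advanced SIMD (NEON) but not SVE, which would explain a SIGILL better than alignment would.

```c
#include <stddef.h>

#if defined(__linux__) && defined(__aarch64__)
#include <sys/auxv.h>   /* getauxval, AT_HWCAP */
#include <asm/hwcap.h>  /* HWCAP_ASIMD, HWCAP_SVE */
#endif

/* Hypothetical example kernel: y[i] += a * x[i].
 * 'restrict' promises the arrays do not overlap, which removes one
 * common obstacle for the auto-vectorizer. */
void saxpy(float a, const float *restrict x, float *restrict y, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        y[i] += a * x[i];
}

/* Returns 1 if the kernel reports Advanced SIMD (NEON), else 0.
 * Always 0 when not built for AArch64 Linux. */
int cpu_has_neon(void)
{
#if defined(__linux__) && defined(__aarch64__) && defined(HWCAP_ASIMD)
    return (getauxval(AT_HWCAP) & HWCAP_ASIMD) != 0;
#else
    return 0;
#endif
}

/* Returns 1 if the kernel reports SVE, else 0.
 * Always 0 when not built for AArch64 Linux. */
int cpu_has_sve(void)
{
#if defined(__linux__) && defined(__aarch64__) && defined(HWCAP_SVE)
    return (getauxval(AT_HWCAP) & HWCAP_SVE) != 0;
#else
    return 0;
#endif
}
```

With GCC, something like gcc -O3 -march=armv8-a -fopt-info-vec -c saxpy.c should report whether the loop was vectorized. My assumption is that the optimization level matters more here than -march: GCC turns the vectorizer on at -O3 (and at -O2 only in newer releases), so a build at a lower level would show no vectorization regardless of the architecture flag.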
Statistics: Posted by VioletGiraffe — Sat Jan 27, 2024 2:29 pm — Replies 0 — Views 46