Bug 1876834

Summary: No 'vst1q_f32_x2' intrinsic on AArch64
Product: Red Hat Enterprise Linux 8
Reporter: Marcin Juszkiewicz <mjuszkie>
Component: gcc
Assignee: Marek Polacek <mpolacek>
gcc sub component: system-version
QA Contact: Alexandra Petlanová Hájková <ahajkova>
Status: CLOSED ERRATA
Severity: unspecified
Priority: unspecified
CC: ahajkova, fweimer, jakub, mcermak, ohudlick, sipoyare, vmukhame
Version: 8.3
Keywords: Bugfix, Triaged
Target Milestone: rc
Target Release: 8.0
Hardware: aarch64
OS: Unspecified
Fixed In Version: gcc-8.4.1-1.el8
Doc Type: No Doc Update
Last Closed: 2021-05-18 13:28:00 UTC
Type: Bug
Attachments:
patch against git.centos.org gcc repo (flags: none)

Description Marcin Juszkiewicz 2020-09-08 10:07:26 UTC
Created attachment 1714068
patch against git.centos.org gcc repo

Description of problem:

PyTorch cannot be built on AArch64 in CentOS 8. The build fails with:

/root/pytorch/aten/src/ATen/cpu/vec256/vec256_float_neon.h:235:7: error: ‘vst1q_f32_x2’ was not declared in this scope

I filed a bug against PyTorch [1] and, with help from Sebastian Pop, wrote a fix [2]. The proper solution, however, would be to fix this in GCC by backporting two patches.

1. https://github.com/pytorch/pytorch/issues/44198
2. https://github.com/pytorch/pytorch/pull/44199
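
For illustration, a source-level fallback for a compiler that lacks the intrinsic could look roughly like the sketch below. This is not the code from the pull request [2]; the names compat_vst1q_f32_x2 and compat_selfcheck are hypothetical. It assumes that storing the two 128-bit halves with two vst1q_f32 calls is an acceptable substitute for one vst1q_f32_x2 call (both write eight contiguous floats without interleaving). The stand-in types in the #else branch exist only so the sketch compiles on non-AArch64 hosts.

```cpp
#include <cassert>

#if defined(__aarch64__)
#include <arm_neon.h>
#else
// Stand-ins (illustration only) so the sketch also compiles on other targets.
struct float32x4_t { float lane[4]; };
struct float32x4x2_t { float32x4_t val[2]; };
static inline float32x4_t vld1q_f32(const float* p) {
  float32x4_t v;
  for (int i = 0; i < 4; ++i) v.lane[i] = p[i];
  return v;
}
static inline void vst1q_f32(float* p, float32x4_t v) {
  for (int i = 0; i < 4; ++i) p[i] = v.lane[i];
}
#endif

// Hypothetical helper: writes val[0] to ptr[0..3] and val[1] to ptr[4..7]
// using two single-register stores, which GCC 8.3 does provide.
static inline void compat_vst1q_f32_x2(float* ptr, float32x4x2_t v) {
  vst1q_f32(ptr, v.val[0]);
  vst1q_f32(ptr + 4, v.val[1]);
}

// Round-trip check: load eight floats, store them through the fallback,
// and verify that the output matches the input element by element.
static inline void compat_selfcheck() {
  const float in[8] = {0.f, 1.f, 2.f, 3.f, 4.f, 5.f, 6.f, 7.f};
  float out[8] = {};
  float32x4x2_t v;
  v.val[0] = vld1q_f32(in);
  v.val[1] = vld1q_f32(in + 4);
  compat_vst1q_f32_x2(out, v);
  for (int i = 0; i < 8; ++i) assert(out[i] == in[i]);
}
```

Such a helper only works around the problem in one code base; other projects using the _x2 intrinsics would still fail until the compiler provides them.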


Version-Release number of selected component (if applicable):

8.3.1-5.el8.0.2

How reproducible:

always

Steps to Reproduce:
1. Clone https://github.com/pytorch/pytorch
2. cd pytorch
3. USE_CUDA=0 BUILD_CAFFE2_OPS=0 USE_DISTRIBUTED=0 USE_QNNPACK=0 USE_XNNPACK=0 python3 setup.py install

Actual results:

In file included from /root/pytorch/aten/src/ATen/cpu/vec256/vec256.h:10,
                 from /root/pytorch/aten/src/ATen/native/cpu/Loops.h:35,
                 from /root/pytorch/aten/src/ATen/native/Normalization.cpp:10:
/root/pytorch/aten/src/ATen/cpu/vec256/vec256_float_neon.h:262:3: warning: type qualifiers ignored on function return type [-Wignored-qualifiers]
   const float operator[](int idx) const {
   ^~~~~
/root/pytorch/aten/src/ATen/cpu/vec256/vec256_float_neon.h:267:3: warning: type qualifiers ignored on function return type [-Wignored-qualifiers]
   const float operator[](int idx) {
   ^~~~~
/root/pytorch/aten/src/ATen/cpu/vec256/vec256_float_neon.h: In member function ‘void at::vec256::{anonymous}::Vec256<float>::store(void*, int64_t) const’:
/root/pytorch/aten/src/ATen/cpu/vec256/vec256_float_neon.h:235:7: error: ‘vst1q_f32_x2’ was not declared in this scope
       vst1q_f32_x2(reinterpret_cast<float*>(ptr), values);
       ^~~~~~~~~~~~
/root/pytorch/aten/src/ATen/cpu/vec256/vec256_float_neon.h:235:7: note: suggested alternative: ‘vld1q_f32_x2’
       vst1q_f32_x2(reinterpret_cast<float*>(ptr), values);
       ^~~~~~~~~~~~
       vld1q_f32_x2
/root/pytorch/aten/src/ATen/cpu/vec256/vec256_float_neon.h:242:7: error: ‘vst1q_f32_x2’ was not declared in this scope
       vst1q_f32_x2(reinterpret_cast<float*>(tmp_values), values);
       ^~~~~~~~~~~~
/root/pytorch/aten/src/ATen/cpu/vec256/vec256_float_neon.h:242:7: note: suggested alternative: ‘vld1q_f32_x2’
       vst1q_f32_x2(reinterpret_cast<float*>(tmp_values), values);
       ^~~~~~~~~~~~
       vld1q_f32_x2
gmake[2]: *** [caffe2/CMakeFiles/torch_cpu.dir/build.make:2014: caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/Normalization.cpp.o] Error 1
gmake[1]: *** [CMakeFiles/Makefile2:6058: caffe2/CMakeFiles/torch_cpu.dir/all] Error 2
gmake: *** [Makefile:141: all] Error 2

Expected results:

PyTorch builds successfully.

Additional info:

Comment 2 Marek Polacek 2020-09-29 19:55:45 UTC
GCC 8.4 contains:

commit 2c55e6caa9432b2c1f081cb3aeddd36abec03233
Author: Sameera Deshpande <sameera.deshpande>
Date:   Thu May 31 08:46:20 2018 +0000

    Patch implementing vld1_*_x3, vst1_*_x2 and vst1_*_x3 intrinsics for AARCH64 for all types.

(cherry picked from commit 568421baa5a4cdb7bb7c5ac323c939492ee3f052)

and also
commit a4004f62d60ada3a20dbf30146ca461047a575cc
Author: Sylvia Taylor <sylvia.taylor>
Date:   Thu Aug 22 11:28:26 2019 +0000

    add intrinsics for vld1(q)_x4 and vst1(q)_x4
    
    This patch adds the intrinsic functions for:
    - vld1_<mode>_x4
    - vst1_<mode>_x4
    - vld1q_<mode>_x4
    - vst1q_<mode>_x4

(cherry picked from commit 391625888d4d97f9016ab9ac04acc55d81f0c26f)
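
Based on the statement above that GCC 8.4 contains both backports, a project could gate a fallback on the compiler version. The sketch below is only an illustration: gcc_has_vst1_x2 and kThisGccHasVst1X2 are hypothetical names, and it assumes that every GCC release after 8.4 also ships the intrinsics. Distribution compilers may carry extra backports, so a version check like this is an approximation, not a guarantee.

```cpp
// Sketch only: true for GCC versions that, per the comment above,
// should provide the vst1*_x2/_x3/_x4 and vld1*_x3/_x4 intrinsics.
// Assumption: all releases newer than 8.4 include them as well.
constexpr bool gcc_has_vst1_x2(int major, int minor) {
  return major > 8 || (major == 8 && minor >= 4);
}

// 8.3 is the version reported in this bug; 8.4 is the first with the backports.
static_assert(!gcc_has_vst1_x2(8, 3), "GCC 8.3 lacks the intrinsics");
static_assert(gcc_has_vst1_x2(8, 4), "GCC 8.4 contains the backports");

#if defined(__GNUC__) && !defined(__clang__)
// Evaluated for the GCC that is compiling this translation unit.
// Clang is excluded because it also defines __GNUC__.
constexpr bool kThisGccHasVst1X2 =
    gcc_has_vst1_x2(__GNUC__, __GNUC_MINOR__);
#endif
```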

Comment 9 errata-xmlrpc 2021-05-18 13:28:00 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (gcc bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:1571