Bug 1876834

Summary:

No 'vst1q_f32_x2' intrinsic on AArch64

Product:

Red Hat Enterprise Linux 8

Reporter:

Marcin Juszkiewicz <mjuszkie>

Component:

gcc

Assignee:

Marek Polacek <mpolacek>

gcc sub component:

system-version

QA Contact:

Alexandra Petlanová Hájková <ahajkova>

Status:

CLOSED ERRATA

Docs Contact:

Severity:

unspecified

Priority:

unspecified

CC:

ahajkova, fweimer, jakub, mcermak, ohudlick, sipoyare, vmukhame

Version:

8.3

Keywords:

Bugfix, Triaged

Target Milestone:

Target Release:

8.0

Hardware:

aarch64

OS:

Unspecified

Whiteboard:

Fixed In Version:

gcc-8.4.1-1.el8

Doc Type:

No Doc Update

Last Closed:

2021-05-18 13:28:00 UTC

Type:

Bug


Attachments:

Description	Flags
patch against git.centos.org gcc repo	none

Description Marcin Juszkiewicz 2020-09-08 10:07:26 UTC

Created attachment 1714068
patch against git.centos.org gcc repo

Description of problem:

PyTorch cannot be built on AArch64 in CentOS 8. The build fails with:

/root/pytorch/aten/src/ATen/cpu/vec256/vec256_float_neon.h:235:7: error: ‘vst1q_f32_x2’ was not declared in this scope

I filed a bug against PyTorch [1] and, with the help of Sebastian Pop, wrote a fix [2]. However, the proper solution would be to fix this in gcc by backporting the two patches that add the missing intrinsics.

1. https://github.com/pytorch/pytorch/issues/44198
2. https://github.com/pytorch/pytorch/pull/44199
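
For illustration only, here is a minimal sketch of one possible source-level workaround. It is not necessarily what the pull request in [2] does. It assumes that storing two float32x4_t vectors with two plain vst1q_f32 calls is an acceptable substitute for vst1q_f32_x2. The helper names store_f32x4x2 and copy8_f32 are hypothetical and are not part of arm_neon.h or PyTorch. The non-AArch64 branch exists only so that the sketch compiles on other hosts.

```c
#include <string.h>

#if defined(__aarch64__)
#include <arm_neon.h>

/* Hypothetical helper (not part of arm_neon.h): stores a pair of
 * float32x4_t vectors to 8 contiguous floats using only vst1q_f32,
 * avoiding the vst1q_f32_x2 intrinsic that the compiler rejects. */
static inline void store_f32x4x2(float *dst, float32x4x2_t v)
{
    vst1q_f32(dst, v.val[0]);     /* elements 0..3 */
    vst1q_f32(dst + 4, v.val[1]); /* elements 4..7 */
}
#endif

/* Hypothetical demo function: copies 8 floats from src to dst,
 * going through the NEON helper above when built for AArch64. */
static inline void copy8_f32(float *dst, const float *src)
{
#if defined(__aarch64__)
    float32x4x2_t v;
    v.val[0] = vld1q_f32(src);
    v.val[1] = vld1q_f32(src + 4);
    store_f32x4x2(dst, v);
#else
    /* Not AArch64: plain copy so the sketch still builds elsewhere. */
    memcpy(dst, src, 8 * sizeof(float));
#endif
}
```

A workaround like this only hides the problem in one project; other software using the same intrinsics would still fail to build until the compiler provides them.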


Version-Release number of selected component (if applicable):

8.3.1-5.el8.0.2

How reproducible:

always

Steps to Reproduce:
1. Clone https://github.com/pytorch/pytorch
2. cd pytorch
3. USE_CUDA=0 BUILD_CAFFE2_OPS=0 USE_DISTRIBUTED=0 USE_QNNPACK=0 USE_XNNPACK=0 python3 setup.py install

Actual results:

In file included from /root/pytorch/aten/src/ATen/cpu/vec256/vec256.h:10,
                 from /root/pytorch/aten/src/ATen/native/cpu/Loops.h:35,
                 from /root/pytorch/aten/src/ATen/native/Normalization.cpp:10:
/root/pytorch/aten/src/ATen/cpu/vec256/vec256_float_neon.h:262:3: warning: type qualifiers ignored on function return type [-Wignored-qualifiers]
   const float operator[](int idx) const {
   ^~~~~
/root/pytorch/aten/src/ATen/cpu/vec256/vec256_float_neon.h:267:3: warning: type qualifiers ignored on function return type [-Wignored-qualifiers]
   const float operator[](int idx) {
   ^~~~~
/root/pytorch/aten/src/ATen/cpu/vec256/vec256_float_neon.h: In member function ‘void at::vec256::{anonymous}::Vec256<float>::store(void*, int64_t) const’:
/root/pytorch/aten/src/ATen/cpu/vec256/vec256_float_neon.h:235:7: error: ‘vst1q_f32_x2’ was not declared in this scope
       vst1q_f32_x2(reinterpret_cast<float*>(ptr), values);
       ^~~~~~~~~~~~
/root/pytorch/aten/src/ATen/cpu/vec256/vec256_float_neon.h:235:7: note: suggested alternative: ‘vld1q_f32_x2’
       vst1q_f32_x2(reinterpret_cast<float*>(ptr), values);
       ^~~~~~~~~~~~
       vld1q_f32_x2
/root/pytorch/aten/src/ATen/cpu/vec256/vec256_float_neon.h:242:7: error: ‘vst1q_f32_x2’ was not declared in this scope
       vst1q_f32_x2(reinterpret_cast<float*>(tmp_values), values);
       ^~~~~~~~~~~~
/root/pytorch/aten/src/ATen/cpu/vec256/vec256_float_neon.h:242:7: note: suggested alternative: ‘vld1q_f32_x2’
       vst1q_f32_x2(reinterpret_cast<float*>(tmp_values), values);
       ^~~~~~~~~~~~
       vld1q_f32_x2
gmake[2]: *** [caffe2/CMakeFiles/torch_cpu.dir/build.make:2014: caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/native/Normalization.cpp.o] Error 1
gmake[1]: *** [CMakeFiles/Makefile2:6058: caffe2/CMakeFiles/torch_cpu.dir/all] Error 2
gmake: *** [Makefile:141: all] Error 2

Expected results:

PyTorch builds successfully.

Additional info:

Comment 2 Marek Polacek 2020-09-29 19:55:45 UTC

GCC 8.4 contains:

commit 2c55e6caa9432b2c1f081cb3aeddd36abec03233
Author: Sameera Deshpande <sameera.deshpande>
Date:   Thu May 31 08:46:20 2018 +0000

    Patch implementing vld1_*_x3, vst1_*_x2 and vst1_*_x3 intrinsics for AARCH64 for all types.

(cherry picked from commit 568421baa5a4cdb7bb7c5ac323c939492ee3f052)

and also
commit a4004f62d60ada3a20dbf30146ca461047a575cc
Author: Sylvia Taylor <sylvia.taylor>
Date:   Thu Aug 22 11:28:26 2019 +0000

    add intrinsics for vld1(q)_x4 and vst1(q)_x4
    
    This patch adds the intrinsic functions for:
    - vld1_<mode>_x4
    - vst1_<mode>_x4
    - vld1q_<mode>_x4
    - vst1q_<mode>_x4

(cherry picked from commit 391625888d4d97f9016ab9ac04acc55d81f0c26f)

Comment 9 errata-xmlrpc 2021-05-18 13:28:00 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (gcc bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:1571