Message ID | 20250521115945.3629052-1-sundeep.kokkonda@windriver.com |
---|---|
State | Accepted, archived |
Commit | a99a65632116955dc69809a14bf536b22582de72 |
Delegated to: | Steve Sakoman |
Headers | show |
Series | [kirkstone,v2] gcc: AArch64 - Fix strict-align cpymem/setmem | expand |
On 2025-05-21 7:59 a.m., sundeep.kokkonda@windriver.com wrote: It's good practice to CC the person who asked for a v2. I've added Gyorgy to this thread. > From: Sundeep KOKKONDA<sundeep.kokkonda@windriver.com> > > The cpymemdi/setmemdi implementation doesn't fully support strict alignment. > Block the expansion if the alignment is less than 16 with STRICT_ALIGNMENT. > Clean up the condition when to use MOPS. (PR103100) > > This patch includes only the alignment-related changes in the aarch64.c file. > Changes to aarch64.md were skipped due to non-essential impact and potential incompatibility. That's better than not mentioned what parts of the patch you dropped but it's doesn't *explain* why that change isn't needed or I'm missing the point! ;-) Please dig a bit deeper and explain: 1. what TARGET_MOPS means and does and maybe what the .md file's purpose is, 2. why was it removed in the gcc-12 commit, 3. why it's safe to ignore that change for gcc-11, 4. Since you are not backporting the MOPS part, it might be nice to point that out in each part of the patch that mentions MOPS. Maybe that's overkill and you just need to explain in the commit log but think about someone who is just looking at the patch and do what makes sense to you. I can almost understand things better now but I'll let you present it, step by step, in v3. > All changes and outputs have been verified by the author. Is there a test case that covers all of this that starts passing with this change? Please explain how are you testing? ../Randy I looked at this a bit and found gcc/ChangeLog-2021 that talks about the introduction of TARGET_MOPS. This was done in gcc-12 and later as you have hinted at. ❯ git blame gcc/ChangeLog-2021 | rg TARGET_MOPS d04ae83244d3 (Jakub Jelinek 2022-01-03 10:18:16 +0100 1214) (aarch64_expand_setmem): Adjust for TARGET_MOPS. d04ae83244d3 (Jakub Jelinek 2022-01-03 10:18:16 +0100 1215) * config/aarch64/aarch64.h (CLEAR_RATIO): Adjust for TARGET_MOPS. d04ae83244d3 (Jakub Jelinek 2022-01-03 10:18:16 +0100 1219) (setmemdi): Adjust for TARGET_MOPS. d04ae83244d3 (Jakub Jelinek 2022-01-03 10:18:16 +0100 1238) (TARGET_MOPS): Define. d04ae83244d3 (Jakub Jelinek 2022-01-03 10:18:16 +0100 1239) (MOVE_RATIO): Adjust for TARGET_MOPS. d04ae83244d3 (Jakub Jelinek 2022-01-03 10:18:16 +0100 1242) (cpymemdi): Adjust for TARGET_MOPS. 2021-12-13 Kyrylo Tkachov <kyrylo.tkachov@arm.com> * config/aarch64/aarch64.c (aarch64_expand_setmem_mops): Define. (aarch64_expand_setmem): Adjust for TARGET_MOPS. * config/aarch64/aarch64.h (CLEAR_RATIO): Adjust for TARGET_MOPS. (SET_RATIO): Likewise. * config/aarch64/aarch64.md ("unspec"): Add UNSPEC_SETMEM. (aarch64_setmemdi): Define. (setmemdi): Adjust for TARGET_MOPS. * config/aarch64/aarch64.opt (aarch64-mops-memset-size-threshold): New param. ... 2021-12-13 Kyrylo Tkachov <kyrylo.tkachov@arm.com> * config/aarch64/aarch64-option-extensions.def (mops): Define. * config/aarch64/aarch64.c (aarch64_expand_cpymem_mops): Define. (aarch64_expand_cpymem): Define. * config/aarch64/aarch64.h (AARCH64_FL_MOPS): Define. (AARCH64_ISA_MOPS): Define. (TARGET_MOPS): Define. (MOVE_RATIO): Adjust for TARGET_MOPS. * config/aarch64/aarch64.md ("unspec"): Add UNSPEC_CPYMEM. (aarch64_cpymemdi): New pattern. (cpymemdi): Adjust for TARGET_MOPS. * config/aarch64/aarch64.opt (aarch64-mops-memcpy-size-threshol): New param. * doc/invoke.texi (AArch64 Options): Document +mops. > > Upstream-Status: Backport [https://gcc.gnu.org/cgit/gcc/commit/?id=b9d16d8361a9e3a82a2f21e759e760d235d43322] > > Signed-off-by: Sundeep KOKKONDA<sundeep.kokkonda@windriver.com> > --- > meta/recipes-devtools/gcc/gcc-11.5.inc | 1 + > ...rch64-fix-strict-align-cpymem-setmem.patch | 45 +++++++++++++++++++ > 2 files changed, 46 insertions(+) > create mode 100644 meta/recipes-devtools/gcc/gcc/0032-gcc-aarch64-fix-strict-align-cpymem-setmem.patch > > diff --git a/meta/recipes-devtools/gcc/gcc-11.5.inc b/meta/recipes-devtools/gcc/gcc-11.5.inc > index f17ec9da5c..1e8371b2bd 100644 > --- a/meta/recipes-devtools/gcc/gcc-11.5.inc > +++ b/meta/recipes-devtools/gcc/gcc-11.5.inc > @@ -60,6 +60,7 @@ SRC_URI = "\ > file://0029-Fix-install-path-of-linux64.h.patch \ > file://0030-rust-recursion-limit.patch \ > file://0031-gcc-sanitizers-fix.patch \ > +file://0032-gcc-aarch64-fix-strict-align-cpymem-setmem.patch \ > file://0001-CVE-2021-42574.patch \ > file://0002-CVE-2021-42574.patch \ > file://0003-CVE-2021-42574.patch \ > diff --git a/meta/recipes-devtools/gcc/gcc/0032-gcc-aarch64-fix-strict-align-cpymem-setmem.patch b/meta/recipes-devtools/gcc/gcc/0032-gcc-aarch64-fix-strict-align-cpymem-setmem.patch > new file mode 100644 > index 0000000000..4c2d827799 > --- /dev/null > +++ b/meta/recipes-devtools/gcc/gcc/0032-gcc-aarch64-fix-strict-align-cpymem-setmem.patch > @@ -0,0 +1,45 @@ > +gcc: AArch64 - Fix strict-align cpymem/setmem > + > +The cpymemdi/setmemdi implementation doesn't fully support strict alignment. > +Block the expansion if the alignment is less than 16 with STRICT_ALIGNMENT. > +Clean up the condition when to use MOPS. > + > +Upstream-Status: Backport [https://gcc.gnu.org/cgit/gcc/commit/?id=b9d16d8361a9e3a82a2f21e759e760d235d43322] > + > +Signed-off-by: Wilco Dijkstra<wilco.dijkstra@arm.com> > +Signed-off-by: Sundeep KOKKONDA<sundeep.kokkonda@windriver.com> > +--- > +--- a/gcc/config/aarch64/aarch64.c 2025-05-08 20:40:10.969865898 -0700 > ++++ b/gcc/config/aarch64/aarch64.c 2025-05-13 23:11:07.006796627 -0700 > +@@ -23621,14 +23621,15 @@ > + int mode_bits; > + rtx dst = operands[0]; > + rtx src = operands[1]; > ++ unsigned align = UINTVAL (operands[3]); > + rtx base; > + machine_mode cur_mode = BLKmode; > + > + /* Only expand fixed-size copies. */ > +- if (!CONST_INT_P (operands[2])) > ++ if (!CONST_INT_P (operands[2]) || (STRICT_ALIGNMENT && align < 16)) > + return false; > + > +- unsigned HOST_WIDE_INT size = INTVAL (operands[2]); > ++ unsigned HOST_WIDE_INT size = UINTVAL (operands[2]); > + > + /* Inline up to 256 bytes when optimizing for speed. */ > + unsigned HOST_WIDE_INT max_copy_size = 256; > +@@ -23750,11 +23751,12 @@ > + unsigned HOST_WIDE_INT len; > + rtx dst = operands[0]; > + rtx val = operands[2], src; > ++ unsigned align = UINTVAL (operands[3]); > + rtx base; > + machine_mode cur_mode = BLKmode, next_mode; > + > + /* We can't do anything smart if the amount to copy is not constant. */ > +- if (!CONST_INT_P (operands[1])) > ++ if (!CONST_INT_P (operands[1]) || (STRICT_ALIGNMENT && align < 16)) > + return false; > + > + bool speed_p = !optimize_function_for_size_p (cfun);
On 21-May-25 21:54, Randy MacLeod wrote: > On 2025-05-21 7:59 a.m., sundeep.kokkonda@windriver.com wrote: > > It's good practice to CC the person who asked for a v2. > I've added Gyorgy to this thread. > >> From: Sundeep KOKKONDA<sundeep.kokkonda@windriver.com> >> >> The cpymemdi/setmemdi implementation doesn't fully support strict alignment. >> Block the expansion if the alignment is less than 16 with STRICT_ALIGNMENT. >> Clean up the condition when to use MOPS. (PR103100) >> >> This patch includes only the alignment-related changes in the aarch64.c file. >> Changes to aarch64.md were skipped due to non-essential impact and potential incompatibility. > That's better than not mentioned what parts of the patch you dropped but > it's doesn't *explain* why that change isn't needed or I'm missing the > point! ;-) > > Please dig a bit deeper and explain: > 1. what TARGET_MOPS means and does and maybe what the .md file's > purpose is, > TARGET_MOPS is a target-specific macro or flag in GCC that stands for "Memory Operations", that controls whether the target (e.g., AArch64) supports specialized memory operation instructions, such as mops.memcpy, mops.memset etc In GCC, TARGET_MOPS is used to conditionally enable instruction expansion patterns (e.g., setmemdi, cpymemdi). GCC .md files (machine description files) define /Instruction patterns/ (define_insn), /Expansion rules/ (define_expand), /Instruction splitting/, peephole optimizations, etc. They bridge GCC’s intermediate RTL with target machine assembly. So when we write setmemdi in .md, we're telling GCC: “Here’s how to emit instructions for setting memory blocks on this machine.” > > 2. why was it removed in the gcc-12 commit, > There are many changes in aarch64 implementation in gcc-12 w.r.t gcc-11, which includes mops changes as in this thread https://inbox.sourceware.org/gcc-patches/mptee4mhon1.fsf@arm.com/T/#t GCC 12 likely refactored memory ops which may make TARGET_MOPS obsolete? > > 3. why it's safe to ignore that change for gcc-11, > GCC-11 and older versions are still used TARGET_MOPS for certain expansions, like setmemdi and cpymemdi. So, we must rely on its original behavior to prevent unsafe expansion on unsupported hardware. Removing TARGET_MOPS in GCC 11 without full backporting of GCC 12’s changes would break instruction matching, cause ICEs, or expose unsupported paths (which we have already seen with the build failures) > > 4. Since you are not backporting the MOPS part, > it might be nice to point that out in each part of the patch that > mentions MOPS. > Maybe that's overkill and you just need to explain in the commit > log but > think about someone who is just looking at the patch and do what > makes sense to you. > This patch is based on changes in GCC 12 that remove TARGET_MOPS gating for setmem and cpymem expansions. However, this version for GCC 11 does not backport the MOPS-related behavior change and retains the TARGET_MOPS condition to preserve correctness and compatibility with the GCC 11 backend. > I can almost understand things better now but > I'll let you present it, step by step, in v3. > I'll send v3 by updating the commit log with above analysis. >> All changes and outputs have been verified by the author. > > Is there a test case that covers all of this that starts passing with > this change? > > Please explain how are you testing? I tested the prog given in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103100#c3 on qemuarm64. And, the generated assembly I get it reviewed by the author. > > ../Randy > > > I looked at this a bit and found gcc/ChangeLog-2021 that talks about > the introduction of TARGET_MOPS. This was done in gcc-12 and later > as you have hinted at. > > ❯ git blame gcc/ChangeLog-2021 | rg TARGET_MOPS > d04ae83244d3 (Jakub Jelinek 2022-01-03 10:18:16 +0100 1214) > (aarch64_expand_setmem): Adjust for TARGET_MOPS. > d04ae83244d3 (Jakub Jelinek 2022-01-03 10:18:16 +0100 1215) * > config/aarch64/aarch64.h (CLEAR_RATIO): Adjust for TARGET_MOPS. > d04ae83244d3 (Jakub Jelinek 2022-01-03 10:18:16 +0100 1219) > (setmemdi): Adjust for TARGET_MOPS. > d04ae83244d3 (Jakub Jelinek 2022-01-03 10:18:16 +0100 1238) > (TARGET_MOPS): Define. > d04ae83244d3 (Jakub Jelinek 2022-01-03 10:18:16 +0100 1239) > (MOVE_RATIO): Adjust for TARGET_MOPS. > d04ae83244d3 (Jakub Jelinek 2022-01-03 10:18:16 +0100 1242) > (cpymemdi): Adjust for TARGET_MOPS. > > > 2021-12-13 Kyrylo Tkachov <kyrylo.tkachov@arm.com> > > * config/aarch64/aarch64.c (aarch64_expand_setmem_mops): Define. > (aarch64_expand_setmem): Adjust for TARGET_MOPS. > * config/aarch64/aarch64.h (CLEAR_RATIO): Adjust for TARGET_MOPS. > (SET_RATIO): Likewise. > * config/aarch64/aarch64.md ("unspec"): Add UNSPEC_SETMEM. > (aarch64_setmemdi): Define. > (setmemdi): Adjust for TARGET_MOPS. > * config/aarch64/aarch64.opt (aarch64-mops-memset-size-threshold): > New param. > > ... > > 2021-12-13 Kyrylo Tkachov <kyrylo.tkachov@arm.com> > > * config/aarch64/aarch64-option-extensions.def (mops): Define. > * config/aarch64/aarch64.c (aarch64_expand_cpymem_mops): Define. > (aarch64_expand_cpymem): Define. > * config/aarch64/aarch64.h (AARCH64_FL_MOPS): Define. > (AARCH64_ISA_MOPS): Define. > (TARGET_MOPS): Define. > (MOVE_RATIO): Adjust for TARGET_MOPS. > * config/aarch64/aarch64.md ("unspec"): Add UNSPEC_CPYMEM. > (aarch64_cpymemdi): New pattern. > (cpymemdi): Adjust for TARGET_MOPS. > * config/aarch64/aarch64.opt (aarch64-mops-memcpy-size-threshol): > New param. > * doc/invoke.texi (AArch64 Options): Document +mops. > >> Upstream-Status: Backport [https://gcc.gnu.org/cgit/gcc/commit/?id=b9d16d8361a9e3a82a2f21e759e760d235d43322] >> >> Signed-off-by: Sundeep KOKKONDA<sundeep.kokkonda@windriver.com> >> --- >> meta/recipes-devtools/gcc/gcc-11.5.inc | 1 + >> ...rch64-fix-strict-align-cpymem-setmem.patch | 45 +++++++++++++++++++ >> 2 files changed, 46 insertions(+) >> create mode 100644 meta/recipes-devtools/gcc/gcc/0032-gcc-aarch64-fix-strict-align-cpymem-setmem.patch >> >> diff --git a/meta/recipes-devtools/gcc/gcc-11.5.inc b/meta/recipes-devtools/gcc/gcc-11.5.inc >> index f17ec9da5c..1e8371b2bd 100644 >> --- a/meta/recipes-devtools/gcc/gcc-11.5.inc >> +++ b/meta/recipes-devtools/gcc/gcc-11.5.inc >> @@ -60,6 +60,7 @@ SRC_URI = "\ >> file://0029-Fix-install-path-of-linux64.h.patch \ >> file://0030-rust-recursion-limit.patch \ >> file://0031-gcc-sanitizers-fix.patch \ >> +file://0032-gcc-aarch64-fix-strict-align-cpymem-setmem.patch \ >> file://0001-CVE-2021-42574.patch \ >> file://0002-CVE-2021-42574.patch \ >> file://0003-CVE-2021-42574.patch \ >> diff --git a/meta/recipes-devtools/gcc/gcc/0032-gcc-aarch64-fix-strict-align-cpymem-setmem.patch b/meta/recipes-devtools/gcc/gcc/0032-gcc-aarch64-fix-strict-align-cpymem-setmem.patch >> new file mode 100644 >> index 0000000000..4c2d827799 >> --- /dev/null >> +++ b/meta/recipes-devtools/gcc/gcc/0032-gcc-aarch64-fix-strict-align-cpymem-setmem.patch >> @@ -0,0 +1,45 @@ >> +gcc: AArch64 - Fix strict-align cpymem/setmem >> + >> +The cpymemdi/setmemdi implementation doesn't fully support strict alignment. >> +Block the expansion if the alignment is less than 16 with STRICT_ALIGNMENT. >> +Clean up the condition when to use MOPS. >> + >> +Upstream-Status: Backport [https://gcc.gnu.org/cgit/gcc/commit/?id=b9d16d8361a9e3a82a2f21e759e760d235d43322] >> + >> +Signed-off-by: Wilco Dijkstra<wilco.dijkstra@arm.com> >> +Signed-off-by: Sundeep KOKKONDA<sundeep.kokkonda@windriver.com> >> +--- >> +--- a/gcc/config/aarch64/aarch64.c 2025-05-08 20:40:10.969865898 -0700 >> ++++ b/gcc/config/aarch64/aarch64.c 2025-05-13 23:11:07.006796627 -0700 >> +@@ -23621,14 +23621,15 @@ >> + int mode_bits; >> + rtx dst = operands[0]; >> + rtx src = operands[1]; >> ++ unsigned align = UINTVAL (operands[3]); >> + rtx base; >> + machine_mode cur_mode = BLKmode; >> + >> + /* Only expand fixed-size copies. */ >> +- if (!CONST_INT_P (operands[2])) >> ++ if (!CONST_INT_P (operands[2]) || (STRICT_ALIGNMENT && align < 16)) >> + return false; >> + >> +- unsigned HOST_WIDE_INT size = INTVAL (operands[2]); >> ++ unsigned HOST_WIDE_INT size = UINTVAL (operands[2]); >> + >> + /* Inline up to 256 bytes when optimizing for speed. */ >> + unsigned HOST_WIDE_INT max_copy_size = 256; >> +@@ -23750,11 +23751,12 @@ >> + unsigned HOST_WIDE_INT len; >> + rtx dst = operands[0]; >> + rtx val = operands[2], src; >> ++ unsigned align = UINTVAL (operands[3]); >> + rtx base; >> + machine_mode cur_mode = BLKmode, next_mode; >> + >> + /* We can't do anything smart if the amount to copy is not constant. */ >> +- if (!CONST_INT_P (operands[1])) >> ++ if (!CONST_INT_P (operands[1]) || (STRICT_ALIGNMENT && align < 16)) >> + return false; >> + >> + bool speed_p = !optimize_function_for_size_p (cfun); > > > -- > # Randy MacLeod > # Wind River Linux
diff --git a/meta/recipes-devtools/gcc/gcc-11.5.inc b/meta/recipes-devtools/gcc/gcc-11.5.inc index f17ec9da5c..1e8371b2bd 100644 --- a/meta/recipes-devtools/gcc/gcc-11.5.inc +++ b/meta/recipes-devtools/gcc/gcc-11.5.inc @@ -60,6 +60,7 @@ SRC_URI = "\ file://0029-Fix-install-path-of-linux64.h.patch \ file://0030-rust-recursion-limit.patch \ file://0031-gcc-sanitizers-fix.patch \ + file://0032-gcc-aarch64-fix-strict-align-cpymem-setmem.patch \ file://0001-CVE-2021-42574.patch \ file://0002-CVE-2021-42574.patch \ file://0003-CVE-2021-42574.patch \ diff --git a/meta/recipes-devtools/gcc/gcc/0032-gcc-aarch64-fix-strict-align-cpymem-setmem.patch b/meta/recipes-devtools/gcc/gcc/0032-gcc-aarch64-fix-strict-align-cpymem-setmem.patch new file mode 100644 index 0000000000..4c2d827799 --- /dev/null +++ b/meta/recipes-devtools/gcc/gcc/0032-gcc-aarch64-fix-strict-align-cpymem-setmem.patch @@ -0,0 +1,45 @@ +gcc: AArch64 - Fix strict-align cpymem/setmem + +The cpymemdi/setmemdi implementation doesn't fully support strict alignment. +Block the expansion if the alignment is less than 16 with STRICT_ALIGNMENT. +Clean up the condition when to use MOPS. + +Upstream-Status: Backport [https://gcc.gnu.org/cgit/gcc/commit/?id=b9d16d8361a9e3a82a2f21e759e760d235d43322] + +Signed-off-by: Wilco Dijkstra <wilco.dijkstra@arm.com> +Signed-off-by: Sundeep KOKKONDA <sundeep.kokkonda@windriver.com> +--- +--- a/gcc/config/aarch64/aarch64.c 2025-05-08 20:40:10.969865898 -0700 ++++ b/gcc/config/aarch64/aarch64.c 2025-05-13 23:11:07.006796627 -0700 +@@ -23621,14 +23621,15 @@ + int mode_bits; + rtx dst = operands[0]; + rtx src = operands[1]; ++ unsigned align = UINTVAL (operands[3]); + rtx base; + machine_mode cur_mode = BLKmode; + + /* Only expand fixed-size copies. */ +- if (!CONST_INT_P (operands[2])) ++ if (!CONST_INT_P (operands[2]) || (STRICT_ALIGNMENT && align < 16)) + return false; + +- unsigned HOST_WIDE_INT size = INTVAL (operands[2]); ++ unsigned HOST_WIDE_INT size = UINTVAL (operands[2]); + + /* Inline up to 256 bytes when optimizing for speed. */ + unsigned HOST_WIDE_INT max_copy_size = 256; +@@ -23750,11 +23751,12 @@ + unsigned HOST_WIDE_INT len; + rtx dst = operands[0]; + rtx val = operands[2], src; ++ unsigned align = UINTVAL (operands[3]); + rtx base; + machine_mode cur_mode = BLKmode, next_mode; + + /* We can't do anything smart if the amount to copy is not constant. */ +- if (!CONST_INT_P (operands[1])) ++ if (!CONST_INT_P (operands[1]) || (STRICT_ALIGNMENT && align < 16)) + return false; + + bool speed_p = !optimize_function_for_size_p (cfun);