1
0
Fork 0
mirror of https://github.com/sockspls/badfish synced 2025-04-30 00:33:09 +00:00

Enable popcount and prefetch for ppc-64

PowerPC has had popcount instructions for a long time, at least as far
back as POWER5 (released 2004). Enable them via a gcc builtin.

Using a gcc builtin has the added bonus that if compiled for a processor
that lacks a hardware instruction, gcc will include a software popcount
implementation that does not use the instruction. It might be slower
than the table lookups (or it might be faster) but it will certainly work.
So this isn't going to break anything.

On my POWER8 VM, this leads to a ~4.27% speedup.

Fir prefetch, the gcc builtin generates a 'dcbt' instruction, which is
supported at least as far back as the G5 (2002) and POWER4 (2001).

This leads to a ~5% speedup on my POWER8 VM.

No functional change
This commit is contained in:
Daniel Axtens 2019-07-11 09:12:54 +10:00 committed by Stéphane Nicolet
parent ca51d1ee63
commit fa1a2a0667

View file

@ -136,6 +136,8 @@ endif
ifeq ($(ARCH),ppc-64)
arch = ppc64
bits = 64
popcnt = yes
prefetch = yes
endif
@ -313,7 +315,9 @@ endif
### 3.6 popcnt
ifeq ($(popcnt),yes)
ifeq ($(comp),icc)
ifeq ($(arch),ppc64)
CXXFLAGS += -DUSE_POPCNT
else ifeq ($(comp),icc)
CXXFLAGS += -msse3 -DUSE_POPCNT
else
CXXFLAGS += -msse3 -mpopcnt -DUSE_POPCNT