Group :: Monitoring
Package: cpuburn


cpuburn-1.4/Design

I wrote these programs to fill a vacuum.  Chris Brady's memtest-86 is
an excellent program for testing memory, but I wanted something that
would do stability testing for CPUs since I had decided to overclock
my pair of Celeron 366's on an Abit BP-6 motherboard. No comments from
the peanut gallery. burnBX was added to test RAM & controller stability.

Besides the much-vilified overclockers, other people may find these
programs useful. System builders may wish to test their systems and
heatsinks. PC buyers may wish to test their systems, particularly if
they have doubts about the builder's expertise. Leaving out thermal
interface material (grease) on the heatsink is a likely flaw.

The usual advice is to run kernel compiles. This is dangerous since a
crash will likely corrupt the filesystem with all the files make -j 4
will have open. Worse, I doubt that gcc has any significant FPU code.
Worse still, gcc is compiled with gcc, and I doubt that it produces
highly optimized code.

Since I couldn't find anything, I decided to write it.

It's certain that Intel and other CPU manufacturers have devoted enormous
effort to CPU testing. They have some programs for stability testing
and parts speed rating ("binning"). Some of these (HIPWR30.EXE) are
available to qualified Intel customers under NDA.

I wanted a program that would load the CPU to maximum. Coincidentally,
code optimization does this. I chose a base of FPU code (DDOT) since
I believed from 8087 days that the FPU consumes a lot of current, and
is left untested by gcc. Then integer instructions were slipped into their
shadow to try to keep the other P6 ports loaded. Agner Fog's excellent
article helped quite a bit. Trial and much error.

I also tried to choose data (all-bits-lit) that would maximize power
consumption. But I do not claim that my code is the most optimized
nor the most power consuming. There could always be better.

Once I found lm-sensors, I could measure the results of my efforts.
Subject to thermistor vagaries, here are my results [revised]:

29'C at idle (hlt)
41' doing idle loop
46' mprime95 (as-is or reniced -19)
47' make -j 4 on kernel
47' 2 * burnP5 (estimated)
47' 2 * burnBX L (default, 4 MB)
48' 2 * burnMMX L
48' 2 * burnK6 (estimated)
50' 2 * burnMMX F (default, 64 kB in L2)
51' 2 * burnMMX D (16 kB, L1 cache)
51' 2 * burnP6 on zeroes for data
52' 2 * burnP6 with FF's for data

All at 2 * 5.5 * 97 MHz (26'C ambient). Higher and my CPU1 will lock up
under burnP6 in 5-10 min. Kernel compiles are stable to 99 MHz for
24 h. But 98 MHz will give `burnBX` errors every 5-8 hours, and 95
MHz will give burnMMX D errors every ~6 hours, so now I run 94 MHz.
Errors seem to increase 10x for every 1 MHz.

I got tired of waiting for temperature steady-state so I measure current
instead. Mostly I use the ATX power harness as a shunt, and measure
current by voltage drop. Email for details. This permits testing many
different instruction mix ideas quickly. As it turns out, the original
burnP6 is close to the best I've found, needing only minor tweaking for
a 2% improvement. The optimum burnK6 is also fairly similar, with just
minor architectural adjustments for AMD.

I also did some measurements with an inductive ammeter. They gave 90% of
the estimated maximum datasheet current draw for burnP6. So I'm fairly
happy with the code. But suggestions for improvement are most welcome.
I don't claim this code is perfect, nor that it will catch all system
deficiencies.

BURNBX: This program has been quite frustrating to develop. It's hard to
measure the results. I've finally hit on a reasonable pattern (walking
bit through carry, inverted every quadword except for cacheline leadoff)
that really brings out errors, and occasional lockups (more on FreeBSD).
The 82443BX only gets to 42'C.
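
The walking pattern can be illustrated outside the assembler. The sketch below is not part of cpuburn; it models only the fill word that the init loop of burnBX.S builds with `rcll $1, %eax` (a 33-bit rotate through the carry flag, per the "walking zero, 33 cycle" source comment). The helper names `rcl1` and `walk` are made up for this illustration, and the shell is assumed to do arithmetic in at least 64 bits.

```shell
# Model of the fill word: eax rotated left through the carry flag.

rcl1() {
    # the bit leaving the top of eax becomes the new carry,
    # and the old carry enters at bit 0
    new_cf=$(( (eax >> 31) & 1 ))
    eax=$(( ((eax << 1) | cf) & 0xFFFFFFFF ))
    cf=$new_cf
}

# walk N: print the fill word used for each of the first N 64-byte chunks
walk() {
    eax=$(( 0xFFFFFFFF ))   # xorl + notl: all bits lit
    cf=0                    # xorl clears CF, notl leaves it alone
    n=0
    while [ "$n" -lt "$1" ]; do
        printf '%08X\n' "$eax"
        rcl1
        n=$(( n + 1 ))
    done
}
```

Running `walk 34` shows the zero bit climbing from bit 0 to bit 31, disappearing into the carry for one step, and then starting over, which is why the cycle length is 33 rather than 32.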

Essentially, burnBX is a RAM tester, using whatever pages the OS allocates
to the process. As such, it cannot test kernel RAM. But it is designed
to be very intense, using the P6 optimized `rep movsd` instructions.
Please note that burnBX is _not_ optimal on AMD K6 based systems because
they don't have the optimized `rep movsd` block move.

Beta testers have mostly reported quick error terminations. Their impact
should not be minimized, because such a data error could occur in kernel
code, causing system crashes. The errors may be from the CPU/BX bus, in
which case ECC RAM will not help. The cause is not perfectly clear, but
general case & 440BX cooling helps and so does an adequate power supply.
300 W is suggested.

Errors on my "instrumented" version of burnBX have not been isolated
to one memory cell but have been distributed across many addresses and
a few bits [only one at a time]. It is suspected that there is a bus
or transistor driver problem. Or there may be undetected transients in
the 3.3 V supply.

REVISED BURNMMX: I started this project as simply a way for AMD system
owners to check out their systems. I was very surprised when my own
system started throwing errors with the MMX memory moves, and had to
downclock from 2 * 5.5 * 97 MHz to 94 MHz. It would seem that the simple
memory moves are more fragile (less robust to interrupts) than the 2%
higher bandwidth string moves.

BURNK7: I finally bought an AMD Athlon and had to write a tester even
though I don't overclock it. Writing burnK7 was much trial and error,
but the ammeter gave me immediate feedback on my efforts. The powerful
K7 core was easy and fun to optimize. I parallel pathed DDOT to remove a
dependency, and could have gone much further, but current didn't increase,
so I stuffed in integer instructions which did increase current. On my
850 Thunderbird, burnK7 draws 9% more power than burnK6.


Robert Redelmeier redelm@ev1.net June 15, 2001




cpuburn-1.4/Makefile

all : burnP5 burnP6 burnK6 burnK7 burnBX burnMMX
.S:
	gcc -s -nostdlib -o $@ $<

cpuburn-1.4/README

N E W burnK7 for the AMD Athlon/Duron has been released.

These programs are designed to load x86 CPUs as heavily as possible for
the purposes of system testing. They have been optimized for different
processors. FPU and ALU instructions are coded in an endless assembler loop.
They do not test every instruction. The goal has been to maximize heat
production from the CPU, putting stress on the CPU itself, cooling
system, motherboard (especially voltage regulators) and power supply
(likely cause of burnBX/MMX errors).

burnP5  is optimized for Intel Pentium processors with and without MMX
burnP6  is for Intel PentiumPro, PentiumII&III and Celeron CPUs
burnK6  is for AMD K6 processors
burnK7  is for AMD Athlon/Duron processors
burnMMX is to test cache/memory interfaces on all CPUs with MMX
burnBX  is an alternate cache/memory test for Intel CPUs

TO USE: root privileges are NOT required. It has been designed for ELF
Linux, but also tested under FreeBSD and a.out. Burn testing
is best done from a ramdisk distribution (tomsrtbt) or with
filesystems unmounted or mounted read-only. Untar the source
in a convenient directory:
`tar zxf cpuburn`
compile executables:
`make`
run desired program in background [ _repeat_ for SMP]:
`burnP6 || echo $? &`

Monitor progress of cpuburn by `ps`. When finished, `kill` the burn*
process(es). If you have temperature probes (fingers) or the lm-sensors
package, you can check your CPU temperature and/or system voltages.
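
For SMP boxes, the "repeat" step can be wrapped in a small helper. This is only a sketch and not part of cpuburn: the function name `start_burn` and the variable `BURN_PIDS` are invented here, and the helper simply starts the requested number of background copies and remembers their PIDs.

```shell
# start_burn COUNT PROGRAM [ARGS...]
# Launch COUNT background copies of PROGRAM and record their PIDs
# in BURN_PIDS (space separated) so they can be killed later.
start_burn() {
    count=$1
    shift
    BURN_PIDS=
    i=0
    while [ "$i" -lt "$count" ]; do
        "$@" &
        BURN_PIDS="$BURN_PIDS $!"
        i=$(( i + 1 ))
    done
}
```

On a dual-CPU board this might be used as `start_burn 2 ./burnP6`, followed later by `kill $BURN_PIDS`. Running `wait` on one of the recorded PIDs in the same shell returns that copy's exit status, which takes the place of the `|| echo $?` idiom above.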

If an error occurs in calculations, it will be preserved, and the
program will terminate with error code 254 for an integer/memory error,
and error code 255 for an FP/MMX error. Error checking happens every
10-40 sec for burnP6/K6/K7 and I haven't seen any CPU errors in testing
[lockups occur first]. burnBX and burnMMX check for errors every 512 MB
(4-10 sec); error termination is frequently seen, and lockups are rarer.
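
The exit codes above can be turned into readable messages with a few lines of shell. This is a sketch only: `describe_burn_status` is a made-up name, the 254/255 meanings are taken from the paragraph above, and the "128 + signal number" rule is the usual shell convention for a process killed by a signal (139 would be the `Sig 11` case).

```shell
# describe_burn_status STATUS
# Map the exit status of a burn* process to a short description.
describe_burn_status() {
    case $1 in
        254) echo "integer/memory error" ;;
        255) echo "FP/MMX error" ;;
        *)
            if [ "$1" -gt 128 ]; then
                # shells report death by signal N as 128 + N
                echo "killed by signal $(( $1 - 128 ))"
            else
                echo "unexpected exit status $1"
            fi
            ;;
    esac
}
```

A typical use would be `./burnMMX; describe_burn_status $?` in a terminal that is left open during the test. A status of 143 (signal 15) is what a plain `kill` of the process produces.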

burnBX and burnMMX are essentially very intense RAM testers. They can
also take an optional parameter indicating the RAM size to be tested:

A = 2 kB E = 32 kB I = 512 kB M = 8 MB
B = 4 F = 64 J = 1 MB N = 16
C = 8 G = 128 K = 2 O = 32
D = 16 H = 256 L = 4 P = 64

`burnBX L` (4 MB) and `burnMMX F` (64 kB) are the default sizes.
A-E mostly test L1 cache, F-H test L2 cache, and I-P force their way
to RAM. But even A-E will have some cacheline writeouts to RAM.
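
The letter table follows directly from how the sources decode the parameter: the first character's ASCII code is decremented and masked with 15, and the result is used as a shift count. The sketch below reproduces that arithmetic; `burn_size_bytes` is a hypothetical helper, and reading the table entry as the total of the two buffers (each 256 dwords shifted left by the count) is my interpretation of the code, not something the README states.

```shell
# burn_size_bytes LETTER
# Print the tested size in bytes for a burnBX/burnMMX size letter.
burn_size_bytes() {
    first=${1%"${1#?}"}                 # keep only the first character
    code=$(printf '%d' "'$first")       # its ASCII code, e.g. L -> 76
    shift_count=$(( (code - 1) & 15 ))  # same as decl + andl $15
    # two buffers of 1 kB << shift_count each
    echo $(( 2048 << shift_count ))
}
```

For example, `burn_size_bytes L` prints 4194304 (4 MB) and `burn_size_bytes F` prints 65536 (64 kB), matching the defaults. Because only the low four bits survive the mask, lower-case letters decode to the same sizes.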

In spite of its name, burnBX can be run on any chipset [RAM controller]
and tests a lot more than the RAM controller. Unfortunately, burnBX is
not optimal on AMD processors. burnMMX is preferable for any CPU that
has an MMX unit.

burnBX/MMX need about 72 MB of total RAM + swap to start (not necessarily
free), but don't use this much unless you request it. They will
throw a `Sig 11` if you don't have enough swap. If you don't want to
add more, you can adjust the .bss section downward as indicated in the
source comments. I use very simple memory management. They can also
test swap, and at least on my system, I can run 2*`burnBX P` with 128
MB SDRAM with some use of swap, but no excessive thrashing [seeks]. YMMV.
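
Whether a machine meets the "about 72 MB of total RAM + swap" figure can be checked before starting. The sketch below is an assumption-laden illustration, not part of cpuburn: `total_ram_swap_mb` is a made-up name, and it expects Linux-style `/proc/meminfo` text (`MemTotal:` and `SwapTotal:` lines in kB) on standard input, which other systems such as FreeBSD may not provide.

```shell
# total_ram_swap_mb
# Read /proc/meminfo-style text on stdin and print MemTotal + SwapTotal
# in whole megabytes.
total_ram_swap_mb() {
    awk '/^(MemTotal|SwapTotal):/ { kb += $2 }
         END { print int(kb / 1024) }'
}
```

On Linux this could be used as `[ "$(total_ram_swap_mb < /proc/meminfo)" -ge 72 ] || echo "add swap or shrink .bss"`.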

If sub-spec, your system may lock up after 2-10 minutes. It shouldn't.
burn* are just unprivileged user processes. But it probably means
your CPU is undercooled, most likely no thermal grease or other interface
material between CPU & heatsink. Or some other deficiency. A power
cycle should reset the system. But you should fix it.

Robert Redelmeier
redelm@ev1.net

*** WARNING *** This program is designed to heavily load CPU chips.
Undercooled, overclocked or otherwise weak systems may fail causing data
loss (filesystem corruption) and possibly permanent damage to electronic
components. Nor will it catch all flaws. *** USE AT YOUR OWN RISK ***

cpuburn-1.4/burnBX.S

# cpuburn-1.4: burnBX Chipset/DRAM Loading Utility
# Copyright 2000 Robert J. Redelmeier. All Rights Reserved
# Licensed under GNU General Public Licence 2.0. No warranty.
# *** USE AT YOUR OWN RISK ***

.text
#ifdef WINDOWS
.globl _main
_main:
movl 4(%esp),%eax
movl $12, %ecx # default L = 4 MB
subl $1,%eax # 1 string -> no parameter
jz no_size

movl 8(%esp),%eax # address of strings
movl 4(%eax),%eax # address of first parameter
movzb (%eax),%ecx # first parameter - a byte
no_size:
subl $12, %esp # stack allocation
#else
.globl _start
_start:
subl $12, %esp #stack space
movl 20(%esp), %eax
movl $12, %ecx # default L = 4 MB
testl %eax, %eax # is a param given?
jz no_size
movl (%eax), %ecx
no_size:
#endif
decl %ecx # 'A'..'P' -> 0..15 once masked
andl $15, %ecx # mask off ASCII bits
movl $256, %eax
shll %cl, %eax # blocksize in dwords
movl %eax, 4(%esp) # save blocksize
movl $256*1024, %eax
shrl %cl, %eax
movl %eax, 8(%esp) # save count blks / 512 MB

movl 4(%esp), %ecx
shrl $4, %ecx
movl $buffer, %edi
xorl %eax, %eax
notl %eax
more: # init fill of 2 cachelines
movl %eax, %edx # qwords F-F-0-F , F-0-F-0
notl %edx
movl %eax, 0(%edi)
movl %eax, 4(%edi)
movl %eax, 8(%edi)
movl %eax, 12(%edi)
movl %edx, 16(%edi)
movl %edx, 20(%edi)
movl %eax, 24(%edi)
movl %eax, 28(%edi)

movl %eax, 32(%edi)
movl %eax, 36(%edi)
movl %edx, 40(%edi)
movl %edx, 44(%edi)
movl %eax, 48(%edi)
movl %eax, 52(%edi)
movl %edx, 56(%edi)
movl %edx, 60(%edi)
rcll $1, %eax # walking zero, 33 cycle
leal 64(%edi), %edi # odd inst to preserve CF
decl %ecx
jnz more

cld
thrash: # MAIN LOOP
movl 8(%esp), %edx
mov_again:
movl $buffer, %esi
movl $buf2, %edi
movl 4(%esp), %ecx
rep # move block up
movsl

movl $buffer + 32, %edi
movl $buf2, %esi
movl 4(%esp), %ecx
subl $8, %ecx
rep # move block back shifting
movsl # by 1 cacheline

movl $buffer, %edi
movl $8, %ecx
rep # replace last c line
movsl

decl %edx # do again for 512 MB.
jnz mov_again

movl $buffer, %edi # DATA CHECK
xorl %ecx, %ecx
.align 16, 0x90
test:
mov 0(%edi,%ecx,4), %eax
cmp %eax, 4(%edi,%ecx,4)
jnz error
incl %ecx
incl %ecx
cmpl 4(%esp), %ecx
jc test
jmp thrash

error: # error abend
movl $1, %eax
#ifdef WINDOWS
addl $12, %esp # deallocate stack
ret
#else
movl $-2, %ebx
pushl %ebx # *BSD syscall convention
pushl %eax
int $0x80
#endif
.bss # Data allocation
.align 32
.lcomm buffer, 32 <<20 # reduce both to 8 <<20 for only
.lcomm buf2, 32 <<20 # 16 MB virtual memory available

#

cpuburn-1.4/burnK6.S

# cpuburn-1.4: burnK6 CPU Loading Utility
# Copyright 1999 Robert J. Redelmeier. All Rights Reserved
# Licensed under GNU General Public Licence 2.0. No warranty.
# *** USE AT YOUR OWN RISK ***

.text
#ifdef WINDOWS
.globl _main
_main:
#else
.globl _start
_start:
#endif
finit
pushl %ebp
movl %esp, %ebp
andl $-32, %ebp
subl $96, %esp
fldpi
fldl rt
fstpl -24(%ebp)
fldl e
fstpl -32(%ebp)
movl half, %edx
movl %edx, -8(%ebp)
after_check:
xorl %eax, %eax
movl %eax, %ebx
lea -1(%eax), %esi
movl $400000000, %ecx
movl %ecx, -4(%ebp)
.align 32, 0x90
crunch:
fldl 8-24(%ebp,%esi,8) # CALC BLOCK
fmull 8-32(%ebp,%esi,8)
addl half+9(%esi,%esi,8), %edx
jnz . + 2
faddp
fldl 8-24(%ebp,%esi,8)
decl %ebx
subl half+9(%esi,%esi,8), %edx
jmp . + 2
fmull 8-32(%ebp,%esi,8)
incl %ebx
decl 8-4(%ebp,%esi,8)
fsubp
jnz crunch # time for testing ?

test %ebx, %ebx # TEST BLOCK
jnz int_exit
cmpl half, %edx
jnz int_exit
fldpi
fcomp %st(1)
fstsw %ax
sahf
jz after_check
decl %ebx
int_exit:
decl %ebx
addl $96, %esp
popl %ebp
movl $1, %eax
#ifdef WINDOWS
ret
#else
push %ebx
push %eax
int $0x80
#endif
.align 32,0
half: .long 0x7fffffff,0
e: .long 0xffffffff,0x3fdfffff
rt: .long 0xffffffff,0x3fefffff


cpuburn-1.4/burnK7.S

# cpuburn-1.4: burnK7 CPU Loading Utility
# Copyright 2000 Robert J. Redelmeier. All Rights Reserved
# Licensed under GNU General Public Licence 2.0. No warranty.
# *** USE AT YOUR OWN RISK ***

.text
#ifdef WINDOWS
.globl _main
_main:
#else
.globl _start
_start:
#endif
finit
pushl %ebp
movl %esp, %ebp
andl $-32, %ebp
subl $96, %esp
fldl rt
fstpl -24(%ebp)
fldl e
fstpl -32(%ebp)
fldpi
fldpi
xorl %eax, %eax
xorl %ebx, %ebx
xorl %ecx, %ecx
movl half, %edx
lea -1(%eax), %esi
movl %eax, -12(%ebp)
movl %edx, -8(%ebp)
after_check:
movl $850000000, -4(%ebp)
.align 32, 0x90
crunch:
fxch # CALC BLOCK
fldl 8-24(%ebp,%esi,8) # 17 instr / 6.0 cycles
addl half+9(%esi,%esi,8), %edx
fmull 8-32(%ebp,%esi,8)
faddp
decl %ecx
fldl 8-24(%ebp,%esi,8)
decl %ebx
incl 8-12(%ebp,%esi,8)
subl half+9(%esi,%esi,8), %edx
incl %ecx
fmull 8-32(%ebp,%esi,8)
incl %ebx
decl 8-4(%ebp,%esi,8)
jmp . + 2
fsubp %st, %st(2)
jnz crunch # time for testing ?

test %ebx, %ebx # TEST BLOCK
jnz int_exit
test %ecx, %ecx
jnz int_exit
cmpl half, %edx
jnz int_exit
fcom %st(1)
fstsw %ax
sahf
jz after_check
decl %ebx
int_exit:
decl %ebx
addl $96, %esp
popl %ebp
movl $1, %eax
#ifdef WINDOWS
ret
#else
push %ebx
push %eax
int $0x80
#endif
.align 32,0
.fill 64
half: .long 0x7fffffff,0
e: .long 0xffffffff,0x3fdfffff
rt: .long 0xffffffff,0x3fefffff


cpuburn-1.4/burnMMX.S

# cpuburn-1.4: burnMMX Chipset/DRAM Loading Utility
# Copyright 2000 Robert J. Redelmeier. All Rights Reserved
# Licensed under GNU General Public Licence 2.0. No warranty.
# *** USE AT YOUR OWN RISK ***

.text
#ifdef WINDOWS
.globl _main
_main:
movl 4(%esp),%eax
movl $6, %ecx # default f = 64 kB
subl $1, %eax # is a param given?
jz no_size

movl 8(%esp),%eax # address of strings
movl 4(%eax),%eax # address of first parameter
movzb (%eax),%ecx # first parameter - a byte
no_size:
subl $12, %esp # stack space
#else
.globl _start
_start:
subl $12, %esp
movl 20(%esp), %eax
movl $6, %ecx # default f = 64 kB
testl %eax, %eax # is a param given?
jz no_size
movl (%eax), %ecx
no_size:
#endif
emms
movq rt, %mm0
decl %ecx
andl $15, %ecx # mask off ASCII bits
movl $256, %eax
shll %cl, %eax
movl %eax, 4(%esp) # save blocksize
movl $256*1024, %eax
shrl %cl, %eax
movl %eax, 8(%esp) # save count blks / 512 MB

movl 4(%esp), %ecx # initial fill of 2 cachelines
shrl $4, %ecx
movl $buffer, %edi
xorl %eax, %eax
notl %eax
more:
movl %eax, %edx # qwords F-F-0-F , F-0-F-0
notl %edx
movl %eax, 0(%edi)
movl %eax, 4(%edi)
movl %eax, 8(%edi)
movl %eax, 12(%edi)
movl %edx, 16(%edi)
movl %edx, 20(%edi)
movl %eax, 24(%edi)
movl %eax, 28(%edi)

movl %eax, 32(%edi)
movl %eax, 36(%edi)
movl %edx, 40(%edi)
movl %edx, 44(%edi)
movl %eax, 48(%edi)
movl %eax, 52(%edi)
movl %edx, 56(%edi)
movl %edx, 60(%edi)
rcll $1, %eax # walking zero, 33 cycle
leal 64(%edi), %edi # odd inst to preserve CF
decl %ecx
jnz more

thrash: # OUTER LOOP
movl 8(%esp), %edx # reset count for 512 MB
mov_again:
movq %mm0, %mm1
movq %mm0, %mm2
movl $buffer, %esi
movl $buf2, %edi
movl 4(%esp), %ecx
shll $2, %ecx # move block up
addl %ecx, %esi
addl %ecx, %edi
negl %ecx
.align 16, 0x90
0: # WORKLOOP 7 uops/ 3 clks in L1
movq 0(%esi,%ecx),%mm7
pmaddwd %mm0, %mm1
pmaddwd %mm0, %mm2
movq %mm7, 0(%edi,%ecx)
addl $8, %ecx
jnz 0b

movl $buffer + 32, %edi # move block back
movl $buf2, %esi # shifting by
movl 4(%esp), %ecx # one cacheline
subl $8, %ecx
shll $2, %ecx
addl %ecx, %esi
addl %ecx, %edi
negl %ecx
.align 16, 0x90
0: # second workloop
movq 0(%esi,%ecx),%mm7
pmaddwd %mm0, %mm1
pmaddwd %mm0, %mm2
movq %mm7, 0(%edi,%ecx)
addl $8, %ecx
jnz 0b

movl $buffer, %edi
movsl # replace last c line
movsl
movsl
movsl
movsl
movsl
movsl
movsl
decl %edx # do again for 512 MB.
jnz mov_again

xorl %ebx ,%ebx # DATA CHECK
decl %ebx
pcmpeqd %mm2, %mm1
psrlq $16, %mm1
movd %mm1, %eax
incl %eax
jnz error # MMX calcs OK?

decl %ebx
subl $32, %edi
xorl %ecx, %ecx
test: # Check data (NOT optimized)
mov 0(%edi,%ecx,4), %eax
cmp %eax, 4(%edi,%ecx,4)
jnz error
incl %ecx
incl %ecx
cmpl 4(%esp), %ecx
jc test
jmp thrash

error: # error abend
emms
movl $1, %eax
#ifdef WINDOWS
addl $12, %esp # deallocate stack
ret
#else
push %ebx
push %eax
int $0x80
#endif
rt: .long 0x7fffffff, 0x7fffffff

.bss # Data allocation
.align 32
.lcomm buffer, 32 <<20 # reduce both to 8 <<20 for only
.lcomm buf2, 32 <<20 # 16 MB virtual memory available

#

cpuburn-1.4/burnP5.S

# cpuburn-1.4: burnP5 CPU Loading Utility
# Copyright 1999 Robert J. Redelmeier. All Rights Reserved
# Licensed under GNU General Public Licence 2.0. No warranty.
# *** USE AT YOUR OWN RISK ***

.text
#ifdef WINDOWS
.globl _main
_main:
#else
.globl _start
_start:
#endif
finit
pushl %ebp
movl %esp, %ebp
andl $-32, %ebp
subl $96, %esp
fldl half
fstpl -24(%ebp)
fldl one
fstl -16(%ebp)
fld %st
fld %st
after_check:
xorl %eax, %eax
movl %eax, %ebx
movl $200000000, %ecx
.align 32, 0x90
# MAIN LOOP 16 flops / 18 cycles
crunch:
fmull -24(%ebp)
fxch %st(1)
faddl -16(%ebp)
fxch %st(2)
fmull -24(%ebp)
fxch %st(1)
faddl -16(%ebp)
fxch %st(2)

fmull -24(%ebp)
fxch %st(1)
faddl -16(%ebp)
fxch %st(2)
fmull -24(%ebp)
fxch %st(1)
faddl -16(%ebp)
fxch %st(2)

fmull -24(%ebp)
fxch %st(1)
faddl -16(%ebp)
fxch %st(2)
fmull -24(%ebp)
fxch %st(1)
faddl -16(%ebp)
fxch %st(2)

fmull -24(%ebp)
fxch %st(1)
faddl -16(%ebp)
fxch %st(2)
fmull -24(%ebp)
fxch %st(1)
faddl -16(%ebp)
fxch %st(2)

decl %ecx
jnz crunch

jmp after_check
addl $96, %esp # never reached
popl %ebp # no checking done
movl $1, %eax
#ifdef WINDOWS
ret
#else
int $0x80
#endif
.align 32,0
half: .long 0xffffffff,0x3fdfffff
one: .long 0xffffffff,0x3fefffff

cpuburn-1.4/burnP6.S

# cpuburn-1.4: burnP6 CPU Loading Utility
# Copyright 1999 Robert J. Redelmeier. All Rights Reserved
# Licensed under GNU General Public Licence 2.0. No warranty.
# *** USE AT YOUR OWN RISK ***

.text
#ifdef WINDOWS
.globl _main
_main:
#else
.globl _start
_start:
#endif
finit
pushl %ebp
movl %esp, %ebp
andl $-32, %ebp
subl $96, %esp
fldpi
fldl rt
fstpl -24(%ebp)
fldl e
fstpl -32(%ebp)
movl half, %edx
movl %edx, -8(%ebp)
after_check:
xorl %eax, %eax
movl %eax, %ebx
lea -1(%eax), %esi
movl $539000000, %ecx # check after this count
movl %ecx, -4(%ebp)
.align 32, 0x90
crunch: # MAIN LOOP 21uops / 8.0 clocks
fldl 8-24(%ebp,%esi,8)
fmull 8-32(%ebp,%esi,8)
addl half, %edx
jnz . + 2
faddp
fldl -24(%ebp)
decl %ebx
subl half+9(%esi,%esi,8), %edx
jmp . + 2
fmull 8-32(%ebp,%esi,8)
incl %ebx
decl 8-4(%ebp,%esi,8)
fsubp
jnz crunch

test %ebx, %ebx # Testing block
mov $0, %ebx
jnz int_exit
cmpl half, %edx
jnz int_exit
fldpi
fcomp %st(1)
fstsw %ax
sahf
jz after_check # fp result = pi ?
decl %ebx
int_exit: # error abort
decl %ebx
addl $96, %esp
popl %ebp
movl $1, %eax # Linux syscall
#ifdef WINDOWS
ret
#else
push %ebx
push %eax # *BSD syscall
int $0x80
#endif
.align 32,0
half: .long 0x7fffffff,0
e: .long 0xffffffff,0x3fdfffff
rt: .long 0xffffffff,0x3fefffff
#

 