Megalextoria
Retro computing and gaming, sci-fi books, tv and movies and other geeky stuff.

Home » Digital Archaeology » Computer Arcana » Apple » Apple II » Sieve of Eratosthenes benchmark
Show: Today's Messages :: Show Polls :: Message Navigator
E-mail to friend 
Switch to threaded view of this topic Create a new topic Submit Reply
Sieve of Eratosthenes benchmark [message #365309] Sun, 18 March 2018 14:56 Go to next message
Anonymous
Karma:
Originally posted by: John Brooks

I wrote an optimized Apple IIGS program to calculate all the prime numbers up to 16834. Finding primes via the SoE method is a common CPU benchmark and I wanted to see how fast the GS could compute primes.

The fastest previous versions for the 6502/65c02/65802/65816 was 140 usec @ 1 MHz.

This version is over 2x faster, finding the 1899 primes in ~59 usec @ 1 MHz, and ~24 usec @ 2.8 MHz.

The program is 245 bytes long, consisting of the sieve routine at 192 bytes, and a 53 byte test program which runs the sieve 100 times and prints out the number of primes found: 1899.

00/2000:A2 64 18 FB C2 30 86 40 20 35 20 C6 40 D0 F9 38
00/2010:FB 20 3A FF 18 FB C2 30 78 BA 9B A9 00 60 1B 7B
00/2020:AB D0 01 1A BA 10 F9 BB 9A 4B AB AA EB 58 FB 20
00/2030:24 ED 4C 8E FD 0B 3B 8D EC 20 A9 FF 7F 1B A2 4E
00/2040:00 86 42 7B 1A AA EB A8 7B 3A 5A 8B 48 48 0B 5A
00/2050:DA DA 5A DA 5A 5A DA DA 5A 48 0B 48 DA 0B 48 DA
00/2060:DA 5A 48 5A 5A DA 0B 5A 48 5A 48 DA DA 5A DA DA
00/2070:5A DA 5A 48 DA 0B 5A 48 0B 48 DA 0B 5A DA 48 C6
00/2080:42 D0 C7 7A 0B 0B E2 14 A9 00 20 5B A9 05 60 1B
00/2090:A9 3C 60 A0 0A 85 E2 C8 84 AE 84 B3 84 B8 84 BD
00/20A0:84 C2 84 C7 84 CC 84 D1 C2 10 BA 1B 08 69 00 00
00/20B0:1B 08 69 00 00 1B 08 69 00 00 1B 08 69 00 00 1B
00/20C0:08 69 00 00 1B 08 69 00 00 1B 08 69 00 00 1B 08
00/20D0:69 00 00 10 D6 9A E2 10 C8 10 04 C8 C8 30 0C 98
00/20E0:0A 69 00 00 85 E2 AB D0 F2 80 AC A9 00 00 1B 2B
00/20F0:4B AB C2 34 60

The sieve routine at $2035 is called in native 65816 mode with 16-bit AXY, B=0, D=0.

Enjoy.

-JB
@JBrooksBSI

00/2000: A2 64 LDX #64
00/2002: 18 CLC
00/2003: FB XCE
00/2004: C2 30 REP #30
00/2006: 86 40 STX 40
00/2008: 20 35 20 JSR 2035
00/200B: C6 40 DEC 40
00/200D: D0 F9 BNE 2008 {-07}
00/200F: 38 SEC
00/2010: FB XCE
00/2011: 20 3A FF JSR FF3A
00/2014: 18 CLC
00/2015: FB XCE
00/2016: C2 30 REP #30
00/2018: 78 SEI
00/2019: BA TSX
00/201A: 9B TXY
00/201B: A9 00 60 LDA #6000
00/201E: 1B TCS
00/201F: 7B TDC
00/2020: AB PLB
00/2021: D0 01 BNE 2024 {+01}
00/2023: 1A INC
00/2024: BA TSX
00/2025: 10 F9 BPL 2020 {-07}
00/2027: BB TYX
00/2028: 9A TXS
00/2029: 4B PHK
00/202A: AB PLB
00/202B: AA TAX
00/202C: EB XBA
00/202D: 58 CLI
00/202E: FB XCE
00/202F: 20 24 ED JSR ED24
00/2032: 4C 8E FD JMP FD8E
00/2035: 0B PHD
00/2036: 3B TSC
00/2037: 8D EC 20 STA 20EC
00/203A: A9 FF 7F LDA #7FFF
00/203D: 1B TCS
00/203E: A2 4E 00 LDX #004E
00/2041: 86 42 STX 42
00/2043: 7B TDC
00/2044: 1A INC
00/2045: AA TAX
00/2046: EB XBA
00/2047: A8 TAY
00/2048: 7B TDC
00/2049: 3A DEC
00/204A: 5A PHY
00/204B: 8B PHB
00/204C: 48 PHA
00/204D: 48 PHA
00/204E: 0B PHD
00/204F: 5A PHY
00/2050: DA PHX
00/2051: DA PHX
00/2052: 5A PHY
00/2053: DA PHX
00/2054: 5A PHY
00/2055: 5A PHY
00/2056: DA PHX
00/2057: DA PHX
00/2058: 5A PHY
00/2059: 48 PHA
00/205A: 0B PHD
00/205B: 48 PHA
00/205C: DA PHX
00/205D: 0B PHD
00/205E: 48 PHA
00/205F: DA PHX
00/2060: DA PHX
00/2061: 5A PHY
00/2062: 48 PHA
00/2063: 5A PHY
00/2064: 5A PHY
00/2065: DA PHX
00/2066: 0B PHD
00/2067: 5A PHY
00/2068: 48 PHA
00/2069: 5A PHY
00/206A: 48 PHA
00/206B: DA PHX
00/206C: DA PHX
00/206D: 5A PHY
00/206E: DA PHX
00/206F: DA PHX
00/2070: 5A PHY
00/2071: DA PHX
00/2072: 5A PHY
00/2073: 48 PHA
00/2074: DA PHX
00/2075: 0B PHD
00/2076: 5A PHY
00/2077: 48 PHA
00/2078: 0B PHD
00/2079: 48 PHA
00/207A: DA PHX
00/207B: 0B PHD
00/207C: 5A PHY
00/207D: DA PHX
00/207E: 48 PHA
00/207F: C6 42 DEC 42
00/2081: D0 C7 BNE 204A {-39}
00/2083: 7A PLY
00/2084: 0B PHD
00/2085: 0B PHD
00/2086: E2 14 SEP #14
00/2088: A9 00 20 LDA #2000
00/208B: 5B TCD
00/208C: A9 05 60 LDA #6005
00/208F: 1B TCS
00/2090: A9 3C 60 LDA #603C
00/2093: A0 0A LDY #0A
00/2095: 85 E2 STA E2
00/2097: C8 INY
00/2098: 84 AE STY AE
00/209A: 84 B3 STY B3
00/209C: 84 B8 STY B8
00/209E: 84 BD STY BD
00/20A0: 84 C2 STY C2
00/20A2: 84 C7 STY C7
00/20A4: 84 CC STY CC
00/20A6: 84 D1 STY D1
00/20A8: C2 10 REP #10
00/20AA: BA TSX
00/20AB: 1B TCS
00/20AC: 08 PHP
00/20AD: 69 00 00 ADC #0000
00/20B0: 1B TCS
00/20B1: 08 PHP
00/20B2: 69 00 00 ADC #0000
00/20B5: 1B TCS
00/20B6: 08 PHP
00/20B7: 69 00 00 ADC #0000
00/20BA: 1B TCS
00/20BB: 08 PHP
00/20BC: 69 00 00 ADC #0000
00/20BF: 1B TCS
00/20C0: 08 PHP
00/20C1: 69 00 00 ADC #0000
00/20C4: 1B TCS
00/20C5: 08 PHP
00/20C6: 69 00 00 ADC #0000
00/20C9: 1B TCS
00/20CA: 08 PHP
00/20CB: 69 00 00 ADC #0000
00/20CE: 1B TCS
00/20CF: 08 PHP
00/20D0: 69 00 00 ADC #0000
00/20D3: 10 D6 BPL 20AB {-2A}
00/20D5: 9A TXS
00/20D6: E2 10 SEP #10
00/20D8: C8 INY
00/20D9: 10 04 BPL 20DF {+04}
00/20DB: C8 INY
00/20DC: C8 INY
00/20DD: 30 0C BMI 20EB {+0C}
00/20DF: 98 TYA
00/20E0: 0A ASL
00/20E1: 69 00 00 ADC #0000
00/20E4: 85 E2 STA E2
00/20E6: AB PLB
00/20E7: D0 F2 BNE 20DB {-0E}
00/20E9: 80 AC BRA 2097 {-54}
00/20EB: A9 00 00 LDA #0000
00/20EE: 1B TCS
00/20EF: 2B PLD
00/20F0: 4B PHK
00/20F1: AB PLB
00/20F2: C2 34 REP #34
00/20F4: 60 RTS
Re: Sieve of Eratosthenes benchmark [message #365310 is a reply to message #365309] Sun, 18 March 2018 16:24 Go to previous messageGo to next message
Antoine Vignau is currently offline  Antoine Vignau
Messages: 1860
Registered: October 2012
Karma: 0
Senior Member
You are crazy, John ;-)
Re: Sieve of Eratosthenes benchmark [message #365323 is a reply to message #365309] Sun, 18 March 2018 16:57 Go to previous messageGo to next message
bpiltz is currently offline  bpiltz
Messages: 78
Registered: October 2012
Karma: 0
Member
Yes, crazy in a totally great, awesome way!

I love these optimized benchmark programs. How about giving us a native 6502/6502c version where we can specify the number of times the sieve is run, and therefore, the ultimate number of primes desired? My rusty old skills are not up the the task, although I think it is another great exercise in demonstrating speed of execution vs size of code issue, and here, of course, we're going for ultimate speed in calculating the sieve.
Re: Sieve of Eratosthenes benchmark [message #365324 is a reply to message #365323] Sun, 18 March 2018 18:01 Go to previous messageGo to next message
Antoine Vignau is currently offline  Antoine Vignau
Messages: 1860
Registered: October 2012
Karma: 0
Senior Member
See https://rosettacode.org/wiki/Sieve_of_Eratosthenes
Re: Sieve of Eratosthenes benchmark [message #365325 is a reply to message #365324] Sun, 18 March 2018 18:04 Go to previous messageGo to next message
Antoine Vignau is currently offline  Antoine Vignau
Messages: 1860
Registered: October 2012
Karma: 0
Senior Member
In John's code, all # must be read as #$. Hex code, not decimal,
av
Re: Sieve of Eratosthenes benchmark [message #365332 is a reply to message #365309] Sun, 18 March 2018 20:24 Go to previous messageGo to next message
Anonymous
Karma:
Originally posted by: John Brooks

On Sunday, March 18, 2018 at 11:56:23 AM UTC-7, John Brooks wrote:
> I wrote an optimized Apple IIGS program to calculate all the prime numbers up to 16834. Finding primes via the SoE method is a common CPU benchmark and I wanted to see how fast the GS could compute primes.
>
> The fastest previous versions for the 6502/65c02/65802/65816 was 140 usec @ 1 MHz.
>
> This version is over 2x faster, finding the 1899 primes in ~59 usec @ 1 MHz, and ~24 usec @ 2.8 MHz.
>
> The program is 245 bytes long, consisting of the sieve routine at 192 bytes, and a 53 byte test program which runs the sieve 100 times and prints out the number of primes found: 1899.
>
> 00/2000:A2 64 18 FB C2 30 86 40 20 35 20 C6 40 D0 F9 38
> 00/2010:FB 20 3A FF 18 FB C2 30 78 BA 9B A9 00 60 1B 7B
> 00/2020:AB D0 01 1A BA 10 F9 BB 9A 4B AB AA EB 58 FB 20
> 00/2030:24 ED 4C 8E FD 0B 3B 8D EC 20 A9 FF 7F 1B A2 4E
> 00/2040:00 86 42 7B 1A AA EB A8 7B 3A 5A 8B 48 48 0B 5A
> 00/2050:DA DA 5A DA 5A 5A DA DA 5A 48 0B 48 DA 0B 48 DA
> 00/2060:DA 5A 48 5A 5A DA 0B 5A 48 5A 48 DA DA 5A DA DA
> 00/2070:5A DA 5A 48 DA 0B 5A 48 0B 48 DA 0B 5A DA 48 C6
> 00/2080:42 D0 C7 7A 0B 0B E2 14 A9 00 20 5B A9 05 60 1B
> 00/2090:A9 3C 60 A0 0A 85 E2 C8 84 AE 84 B3 84 B8 84 BD
> 00/20A0:84 C2 84 C7 84 CC 84 D1 C2 10 BA 1B 08 69 00 00
> 00/20B0:1B 08 69 00 00 1B 08 69 00 00 1B 08 69 00 00 1B
> 00/20C0:08 69 00 00 1B 08 69 00 00 1B 08 69 00 00 1B 08
> 00/20D0:69 00 00 10 D6 9A E2 10 C8 10 04 C8 C8 30 0C 98
> 00/20E0:0A 69 00 00 85 E2 AB D0 F2 80 AC A9 00 00 1B 2B
> 00/20F0:4B AB C2 34 60
>
> The sieve routine at $2035 is called in native 65816 mode with 16-bit AXY, B=0, D=0.
>
> Enjoy.
>
> -JB
> @JBrooksBSI
>
> 00/2000: A2 64 LDX #64
> 00/2002: 18 CLC
> 00/2003: FB XCE
> 00/2004: C2 30 REP #30
> 00/2006: 86 40 STX 40
> 00/2008: 20 35 20 JSR 2035
> 00/200B: C6 40 DEC 40
> 00/200D: D0 F9 BNE 2008 {-07}
> 00/200F: 38 SEC
> 00/2010: FB XCE
> 00/2011: 20 3A FF JSR FF3A
> 00/2014: 18 CLC
> 00/2015: FB XCE
> 00/2016: C2 30 REP #30
> 00/2018: 78 SEI
> 00/2019: BA TSX
> 00/201A: 9B TXY
> 00/201B: A9 00 60 LDA #6000
> 00/201E: 1B TCS
> 00/201F: 7B TDC
> 00/2020: AB PLB
> 00/2021: D0 01 BNE 2024 {+01}
> 00/2023: 1A INC
> 00/2024: BA TSX
> 00/2025: 10 F9 BPL 2020 {-07}
> 00/2027: BB TYX
> 00/2028: 9A TXS
> 00/2029: 4B PHK
> 00/202A: AB PLB
> 00/202B: AA TAX
> 00/202C: EB XBA
> 00/202D: 58 CLI
> 00/202E: FB XCE
> 00/202F: 20 24 ED JSR ED24
> 00/2032: 4C 8E FD JMP FD8E
> 00/2035: 0B PHD
> 00/2036: 3B TSC
> 00/2037: 8D EC 20 STA 20EC
> 00/203A: A9 FF 7F LDA #7FFF
> 00/203D: 1B TCS
> 00/203E: A2 4E 00 LDX #004E
> 00/2041: 86 42 STX 42
> 00/2043: 7B TDC
> 00/2044: 1A INC
> 00/2045: AA TAX
> 00/2046: EB XBA
> 00/2047: A8 TAY
> 00/2048: 7B TDC
> 00/2049: 3A DEC
> 00/204A: 5A PHY
> 00/204B: 8B PHB
> 00/204C: 48 PHA
> 00/204D: 48 PHA
> 00/204E: 0B PHD
> 00/204F: 5A PHY
> 00/2050: DA PHX
> 00/2051: DA PHX
> 00/2052: 5A PHY
> 00/2053: DA PHX
> 00/2054: 5A PHY
> 00/2055: 5A PHY
> 00/2056: DA PHX
> 00/2057: DA PHX
> 00/2058: 5A PHY
> 00/2059: 48 PHA
> 00/205A: 0B PHD
> 00/205B: 48 PHA
> 00/205C: DA PHX
> 00/205D: 0B PHD
> 00/205E: 48 PHA
> 00/205F: DA PHX
> 00/2060: DA PHX
> 00/2061: 5A PHY
> 00/2062: 48 PHA
> 00/2063: 5A PHY
> 00/2064: 5A PHY
> 00/2065: DA PHX
> 00/2066: 0B PHD
> 00/2067: 5A PHY
> 00/2068: 48 PHA
> 00/2069: 5A PHY
> 00/206A: 48 PHA
> 00/206B: DA PHX
> 00/206C: DA PHX
> 00/206D: 5A PHY
> 00/206E: DA PHX
> 00/206F: DA PHX
> 00/2070: 5A PHY
> 00/2071: DA PHX
> 00/2072: 5A PHY
> 00/2073: 48 PHA
> 00/2074: DA PHX
> 00/2075: 0B PHD
> 00/2076: 5A PHY
> 00/2077: 48 PHA
> 00/2078: 0B PHD
> 00/2079: 48 PHA
> 00/207A: DA PHX
> 00/207B: 0B PHD
> 00/207C: 5A PHY
> 00/207D: DA PHX
> 00/207E: 48 PHA
> 00/207F: C6 42 DEC 42
> 00/2081: D0 C7 BNE 204A {-39}
> 00/2083: 7A PLY
> 00/2084: 0B PHD
> 00/2085: 0B PHD
> 00/2086: E2 14 SEP #14
> 00/2088: A9 00 20 LDA #2000
> 00/208B: 5B TCD
> 00/208C: A9 05 60 LDA #6005
> 00/208F: 1B TCS
> 00/2090: A9 3C 60 LDA #603C
> 00/2093: A0 0A LDY #0A
> 00/2095: 85 E2 STA E2
> 00/2097: C8 INY
> 00/2098: 84 AE STY AE
> 00/209A: 84 B3 STY B3
> 00/209C: 84 B8 STY B8
> 00/209E: 84 BD STY BD
> 00/20A0: 84 C2 STY C2
> 00/20A2: 84 C7 STY C7
> 00/20A4: 84 CC STY CC
> 00/20A6: 84 D1 STY D1
> 00/20A8: C2 10 REP #10
> 00/20AA: BA TSX
> 00/20AB: 1B TCS
> 00/20AC: 08 PHP
> 00/20AD: 69 00 00 ADC #0000
> 00/20B0: 1B TCS
> 00/20B1: 08 PHP
> 00/20B2: 69 00 00 ADC #0000
> 00/20B5: 1B TCS
> 00/20B6: 08 PHP
> 00/20B7: 69 00 00 ADC #0000
> 00/20BA: 1B TCS
> 00/20BB: 08 PHP
> 00/20BC: 69 00 00 ADC #0000
> 00/20BF: 1B TCS
> 00/20C0: 08 PHP
> 00/20C1: 69 00 00 ADC #0000
> 00/20C4: 1B TCS
> 00/20C5: 08 PHP
> 00/20C6: 69 00 00 ADC #0000
> 00/20C9: 1B TCS
> 00/20CA: 08 PHP
> 00/20CB: 69 00 00 ADC #0000
> 00/20CE: 1B TCS
> 00/20CF: 08 PHP
> 00/20D0: 69 00 00 ADC #0000
> 00/20D3: 10 D6 BPL 20AB {-2A}
> 00/20D5: 9A TXS
> 00/20D6: E2 10 SEP #10
> 00/20D8: C8 INY
> 00/20D9: 10 04 BPL 20DF {+04}
> 00/20DB: C8 INY
> 00/20DC: C8 INY
> 00/20DD: 30 0C BMI 20EB {+0C}
> 00/20DF: 98 TYA
> 00/20E0: 0A ASL
> 00/20E1: 69 00 00 ADC #0000
> 00/20E4: 85 E2 STA E2
> 00/20E6: AB PLB
> 00/20E7: D0 F2 BNE 20DB {-0E}
> 00/20E9: 80 AC BRA 2097 {-54}
> 00/20EB: A9 00 00 LDA #0000
> 00/20EE: 1B TCS
> 00/20EF: 2B PLD
> 00/20F0: 4B PHK
> 00/20F1: AB PLB
> 00/20F2: C2 34 REP #34
> 00/20F4: 60 RTS

..
> I think it is another great exercise in demonstrating speed of execution vs size of code issue

Below is a version of sieve without unrolled code or self-modified code.

The sieve routine is just 66 bytes at $2035. The test code is 53 bytes. The code is simpler to read, but is 2x slower: ~125 usec to calculate the 1899 primes up to 16384.

00/2000:A2 64 18 FB C2 30 86 40 20 35 20 C6 40 D0 F9 38
00/2010:FB 20 3A FF 18 FB C2 30 78 BA 9B A9 00 60 1B 7B
00/2020:AB D0 01 1A BA 10 F9 BB 9A 4B AB AA EB 58 FB 20
00/2030:24 ED 4C 8E FD 3B 85 44 A2 FF 7F 9A A9 AA 0A E2
00/2040:14 DA 0B 3A D0 FB 0B 68 A9 02 60 1B A9 0C 60 85
00/2050:48 A0 04 C2 10 C8 84 46 BA 1B 08 65 46 10 FA 9A
00/2060:88 C8 C8 98 0A 65 48 30 07 85 48 AB D0 F3 80 E5
00/2070:A5 44 1B 4B AB 58 60


> In John's code, all # must be read as #$. Hex code, not decimal

Yes, these listings are from the GS monitor, so all numbers are in hex

-JB
@JBrooksBSI

00/2000: A2 64 LDX #64
00/2002: 18 CLC
00/2003: FB XCE
00/2004: C2 30 REP #30
00/2006: 86 40 STX 40
00/2008: 20 35 20 JSR 2035
00/200B: C6 40 DEC 40
00/200D: D0 F9 BNE 2008 {-07}
00/200F: 38 SEC
00/2010: FB XCE
00/2011: 20 3A FF JSR FF3A
00/2014: 18 CLC
00/2015: FB XCE
00/2016: C2 30 REP #30
00/2018: 78 SEI
00/2019: BA TSX
00/201A: 9B TXY
00/201B: A9 00 60 LDA #6000
00/201E: 1B TCS
00/201F: 7B TDC
00/2020: AB PLB
00/2021: D0 01 BNE 2024 {+01}
00/2023: 1A INC
00/2024: BA TSX
00/2025: 10 F9 BPL 2020 {-07}
00/2027: BB TYX
00/2028: 9A TXS
00/2029: 4B PHK
00/202A: AB PLB
00/202B: AA TAX
00/202C: EB XBA
00/202D: 58 CLI
00/202E: FB XCE
00/202F: 20 24 ED JSR ED24
00/2032: 4C 8E FD JMP FD8E
00/2035: 3B TSC
00/2036: 85 44 STA 44
00/2038: A2 FF 7F LDX #7FFF
00/203B: 9A TXS
00/203C: A9 AA 0A LDA #0AAA
00/203F: E2 14 SEP #14
00/2041: DA PHX
00/2042: 0B PHD
00/2043: 3A DEC
00/2044: D0 FB BNE 2041 {-05}
00/2046: 0B PHD
00/2047: 68 PLA
00/2048: A9 02 60 LDA #6002
00/204B: 1B TCS
00/204C: A9 0C 60 LDA #600C
00/204F: 85 48 STA 48
00/2051: A0 04 C2 LDY #C204
00/2054: 10 C8 BPL 201E {-38}
00/2056: 84 46 STY 46
00/2058: BA TSX
00/2059: 1B TCS
00/205A: 08 PHP
00/205B: 65 46 ADC 46
00/205D: 10 FA BPL 2059 {-06}
00/205F: 9A TXS
00/2060: 88 DEY
00/2061: C8 INY
00/2062: C8 INY
00/2063: 98 TYA
00/2064: 0A ASL
00/2065: 65 48 ADC 48
00/2067: 30 07 BMI 2070 {+07}
00/2069: 85 48 STA 48
00/206B: AB PLB
00/206C: D0 F3 BNE 2061 {-0D}
00/206E: 80 E5 BRA 2055 {-1B}
00/2070: A5 44 LDA 44
00/2072: 1B TCS
00/2073: 4B PHK
00/2074: AB PLB
00/2075: 58 CLI
00/2076: 60 RTS
Re: Sieve of Eratosthenes benchmark [message #365344 is a reply to message #365309] Mon, 19 March 2018 02:38 Go to previous messageGo to next message
barrym95838 is currently offline  barrym95838
Messages: 130
Registered: April 2013
Karma: 0
Senior Member
On Sunday, March 18, 2018 at 11:56:23 AM UTC-7, John Brooks wrote:
> I wrote an optimized Apple IIGS program to calculate all the prime numbers
> up to 16834. Finding primes via the SoE method is a common CPU benchmark
> and I wanted to see how fast the GS could compute primes.
>
> The fastest previous versions for the 6502/65c02/65802/65816 was 140 usec
> @ 1 MHz.
>
> This version is over 2x faster, finding the 1899 primes in ~59 usec @ 1 MHz,
> and ~24 usec @ 2.8 MHz.
>
I know you're pretty darn good, but getting a 1 MHz anything to find 1899
anythings in ~59 microseconds is pure sorcery. Unless you really meant to
say "msec" ... ;-)

Mike B.
Re: Sieve of Eratosthenes benchmark [message #365346 is a reply to message #365344] Mon, 19 March 2018 03:42 Go to previous messageGo to next message
Anonymous
Karma:
Originally posted by: James Davis

On Sunday, March 18, 2018 at 11:38:58 PM UTC-7, barrym95838 wrote:
> On Sunday, March 18, 2018 at 11:56:23 AM UTC-7, John Brooks wrote:
>> I wrote an optimized Apple IIGS program to calculate all the prime numbers
>> up to 16834. Finding primes via the SoE method is a common CPU benchmark
>> and I wanted to see how fast the GS could compute primes.
>>
>> The fastest previous versions for the 6502/65c02/65802/65816 was 140 usec
>> @ 1 MHz.
>>
>> This version is over 2x faster, finding the 1899 primes in ~59 usec @ 1 MHz,
>> and ~24 usec @ 2.8 MHz.
>>
> I know you're pretty darn good, but getting a 1 MHz anything to find 1899
> anythings in ~59 microseconds is pure sorcery. Unless you really meant to
> say "msec" ... ;-)
>
> Mike B.

He is talking usec, You-Seconds; they are totally subjective, like 1-mississippi, 2-mississippi, 3-mississippi, etc., or at the speed of Your sub-articulation. ;-D & LOL!
Re: Sieve of Eratosthenes benchmark [message #365365 is a reply to message #365346] Mon, 19 March 2018 11:25 Go to previous messageGo to next message
barrym95838 is currently offline  barrym95838
Messages: 130
Registered: April 2013
Karma: 0
Senior Member
On Monday, March 19, 2018 at 12:42:48 AM UTC-7, James Davis wrote:
> On Sunday, March 18, 2018 at 11:38:58 PM UTC-7, barrym95838 wrote:
>> On Sunday, March 18, 2018 at 11:56:23 AM UTC-7, John Brooks wrote:
>>> I wrote an optimized Apple IIGS program to calculate all the prime numbers
>>> up to 16834. Finding primes via the SoE method is a common CPU benchmark
>>> and I wanted to see how fast the GS could compute primes.
>>>
>>> The fastest previous versions for the 6502/65c02/65802/65816 was 140 usec
>>> @ 1 MHz.
>>>
>>> This version is over 2x faster, finding the 1899 primes in ~59 usec @ 1 MHz,
>>> and ~24 usec @ 2.8 MHz.
>>>
>> I know you're pretty darn good, but getting a 1 MHz anything to find 1899
>> anythings in ~59 microseconds is pure sorcery. Unless you really meant to
>> say "msec" ... ;-)
>>
>> Mike B.
>
> He is talking usec, You-Seconds; they are totally subjective, like 1-
> mississippi, 2-mississippi, 3-mississippi, etc., or at the speed of Your
> sub-articulation. ;-D & LOL!

.... and while we're doing a bit of friendly trolling, please observe:

10 INPUT "PRIME NUMBER UPPER SEARCH LIMIT: ";L: DIM F%(L)
20 FOR N = 2 TO SQR (L): IF F%(N) THEN 40
30 FOR K = N * N TO L STEP N:F%(K) = 1: NEXT
40 NEXT :K = 0: FOR N = 2 TO L: IF F%(N) = 0 THEN K = K + 1: REM PRINT N,
50 NEXT : PRINT "TOTAL OF "K

My version is about 4000 times slower, but because it takes its time it was
able to find the 1900th prime that John's speed demon completely missed!

Mike B.
Re: Sieve of Eratosthenes benchmark [message #365366 is a reply to message #365365] Mon, 19 March 2018 12:45 Go to previous messageGo to next message
Anonymous
Karma:
Originally posted by: James Davis

On Monday, March 19, 2018 at 8:25:45 AM UTC-7, barrym95838 wrote:
> On Monday, March 19, 2018 at 12:42:48 AM UTC-7, James Davis wrote:
>> On Sunday, March 18, 2018 at 11:38:58 PM UTC-7, barrym95838 wrote:
>>> On Sunday, March 18, 2018 at 11:56:23 AM UTC-7, John Brooks wrote:
>>>> I wrote an optimized Apple IIGS program to calculate all the prime numbers
>>>> up to 16834. Finding primes via the SoE method is a common CPU benchmark
>>>> and I wanted to see how fast the GS could compute primes.
>>>>
>>>> The fastest previous versions for the 6502/65c02/65802/65816 was 140 usec
>>>> @ 1 MHz.
>>>>
>>>> This version is over 2x faster, finding the 1899 primes in ~59 usec @ 1 MHz,
>>>> and ~24 usec @ 2.8 MHz.
>>>>
>>> I know you're pretty darn good, but getting a 1 MHz anything to find 1899
>>> anythings in ~59 microseconds is pure sorcery. Unless you really meant to
>>> say "msec" ... ;-)
>>>
>>> Mike B.
>>
>> He is talking usec, You-Seconds; they are totally subjective, like 1-
>> mississippi, 2-mississippi, 3-mississippi, etc., or at the speed of Your
>> sub-articulation. ;-D & LOL!
>
> ... and while we're doing a bit of friendly trolling, please observe:
>
> 10 INPUT "PRIME NUMBER UPPER SEARCH LIMIT: ";L: DIM F%(L)
> 20 FOR N = 2 TO SQR (L): IF F%(N) THEN 40
> 30 FOR K = N * N TO L STEP N:F%(K) = 1: NEXT
> 40 NEXT :K = 0: FOR N = 2 TO L: IF F%(N) = 0 THEN K = K + 1: REM PRINT N,
> 50 NEXT : PRINT "TOTAL OF "K
>
> My version is about 4000 times slower, but because it takes its time it was
> able to find the 1900th prime that John's speed demon completely missed!
>
> Mike B.

Back in the day, after reading an article by WOZ about generating prime numbers using the Sweet16 interpreter, I wrote an Applesoft program to do it and print them out. Or, maybe WOZ wrote it and I just copied it. I don't rightly remember. It used the "&" vector to hook into the WOZ's Sweet16 interpreter routine, IIRC. It generated a list of 6,140 primes from 3 to 60,923 before I stopped it. It ran for at least 1/2 hour or maybe more. I do not know if it was using this algorithm or not, though.
Re: Sieve of Eratosthenes benchmark [message #365369 is a reply to message #365344] Mon, 19 March 2018 13:45 Go to previous messageGo to next message
Anonymous
Karma:
Originally posted by: John Brooks

On Sunday, March 18, 2018 at 11:38:58 PM UTC-7, barrym95838 wrote:
> On Sunday, March 18, 2018 at 11:56:23 AM UTC-7, John Brooks wrote:
>> I wrote an optimized Apple IIGS program to calculate all the prime numbers
>> up to 16834. Finding primes via the SoE method is a common CPU benchmark
>> and I wanted to see how fast the GS could compute primes.
>>
>> The fastest previous versions for the 6502/65c02/65802/65816 was 140 usec
>> @ 1 MHz.
>>
>> This version is over 2x faster, finding the 1899 primes in ~59 usec @ 1 MHz,
>> and ~24 usec @ 2.8 MHz.
>>
> I know you're pretty darn good, but getting a 1 MHz anything to find 1899
> anythings in ~59 microseconds is pure sorcery. Unless you really meant to
> say "msec" ... ;-)
>
> Mike B.

Good catch! Yes, the reported sieve times are in ms milliseconds. Got stuck in cycle-counting usec mode...


> My version is about 4000 times slower, but because it takes its time it was
able to find the 1900th prime that John's speed demon completely missed!

Optimized SoE calculators usually report zero-based prime counts. The first prime, 2, is typically implied as it is the only prime that is even. The calculation logic considers only odd numbers.

The following basic program will print out the prime numbers calculated by either the fast or small SoE routines:

10 HOME
20 PRINT "0) 2"
30 CTR = 1
40 FOR I = 1 TO 8191
50 IF PEEK (24576 + I) THEN NEXT : END
60 PRINT CTR") "I + I + 1:CTR = CTR + 1
70 NEXT

The GS listing of the small SoE routine had a misalignment at $2051. Here is a correctly aligned listing:

00/2035: 3B TSC
00/2036: 85 44 STA 44
00/2038: A2 FF 7F LDX #7FFF
00/203B: 9A TXS
00/203C: A9 AA 0A LDA #0AAA
00/203F: E2 14 SEP #14
00/2041: DA PHX
00/2042: 0B PHD
00/2043: 3A DEC
00/2044: D0 FB BNE 2041 {-05}
00/2046: 0B PHD
00/2047: 68 PLA
00/2048: A9 02 60 LDA #6002
00/204B: 1B TCS
00/204C: A9 0C 60 LDA #600C
00/204F: 85 48 STA 48
00/2051: A0 04 LDY #04
00/2053: C2 10 REP #10
00/2055: C8 INY
00/2056: 84 46 STY 46
00/2058: BA TSX
00/2059: 1B TCS
00/205A: 08 PHP
00/205B: 65 46 ADC 46
00/205D: 10 FA BPL 2059 {-06}
00/205F: 9A TXS
00/2060: 88 DEY
00/2061: C8 INY
00/2062: C8 INY
00/2063: 98 TYA
00/2064: 0A ASL
00/2065: 65 48 ADC 48
00/2067: 30 07 BMI 2070 {+07}
00/2069: 85 48 STA 48
00/206B: AB PLB
00/206C: D0 F3 BNE 2061 {-0D}
00/206E: 80 E5 BRA 2055 {-1B}
00/2070: A5 44 LDA 44
00/2072: 1B TCS
00/2073: 4B PHK
00/2074: AB PLB
00/2075: 58 CLI
00/2076: 60 RTS

-JB
@JBrooksBSI
Re: Sieve of Eratosthenes benchmark [message #365382 is a reply to message #365366] Mon, 19 March 2018 17:38 Go to previous messageGo to next message
barrym95838 is currently offline  barrym95838
Messages: 130
Registered: April 2013
Karma: 0
Senior Member
On Monday, March 19, 2018 at 9:45:04 AM UTC-7, James Davis wrote:
>
> Back in the day, after reading an article by WOZ about generating prime
> numbers using the Sweet16 interpreter, I wrote an Applesoft program to do
> it and print them out. Or, maybe WOZ wrote it and I just copied it. I
> don't rightly remember. It used the "&" vector to hook into the WOZ's
> Sweet16 interpreter routine, IIRC. It generated a list of 6,140 primes
> from 3 to 60,923 before I stopped it. It ran for at least 1/2 hour or
> maybe more. I do not know if it was using this algorithm or not, though.

AIUI, it's difficult for Sweet16 and Applesoft to get along with each
other for very long, since they both like to stomp on $0000..$001f. Or
maybe it's not as bad as I imagine ... I was never brave enough to try.

Mike B.
Re: Sieve of Eratosthenes benchmark [message #365488 is a reply to message #365365] Wed, 21 March 2018 18:56 Go to previous messageGo to next message
Anonymous
Karma:
Originally posted by: Gregory Schroeder

On Monday, March 19, 2018 at 10:25:45 AM UTC-5, barrym95838 wrote:
> On Monday, March 19, 2018 at 12:42:48 AM UTC-7, James Davis wrote:
>> On Sunday, March 18, 2018 at 11:38:58 PM UTC-7, barrym95838 wrote:
>>> On Sunday, March 18, 2018 at 11:56:23 AM UTC-7, John Brooks wrote:
>>>> I wrote an optimized Apple IIGS program to calculate all the prime numbers
>>>> up to 16834. Finding primes via the SoE method is a common CPU benchmark
>>>> and I wanted to see how fast the GS could compute primes.
>>>>
>>>> The fastest previous versions for the 6502/65c02/65802/65816 was 140 usec
>>>> @ 1 MHz.
>>>>
>>>> This version is over 2x faster, finding the 1899 primes in ~59 usec @ 1 MHz,
>>>> and ~24 usec @ 2.8 MHz.
>>>>
>>> I know you're pretty darn good, but getting a 1 MHz anything to find 1899
>>> anythings in ~59 microseconds is pure sorcery. Unless you really meant to
>>> say "msec" ... ;-)
>>>
>>> Mike B.
>>
>> He is talking usec, You-Seconds; they are totally subjective, like 1-
>> mississippi, 2-mississippi, 3-mississippi, etc., or at the speed of Your
>> sub-articulation. ;-D & LOL!
>
> ... and while we're doing a bit of friendly trolling, please observe:
>
> 10 INPUT "PRIME NUMBER UPPER SEARCH LIMIT: ";L: DIM F%(L)
> 20 FOR N = 2 TO SQR (L): IF F%(N) THEN 40
> 30 FOR K = N * N TO L STEP N:F%(K) = 1: NEXT
> 40 NEXT :K = 0: FOR N = 2 TO L: IF F%(N) = 0 THEN K = K + 1: REM PRINT N,
> 50 NEXT : PRINT "TOTAL OF "K
>
> My version is about 4000 times slower, but because it takes its time it was
> able to find the 1900th prime that John's speed demon completely missed!
>
> Mike B.

I use this program in GSoft and the upper limit over 16377 gives me an overflow error.
Re: Sieve of Eratosthenes benchmark [message #365489 is a reply to message #365309] Wed, 21 March 2018 19:29 Go to previous messageGo to next message
Anonymous
Karma:
Originally posted by: Chris Rupnik

On Sunday, March 18, 2018 at 2:56:23 PM UTC-4, John Brooks wrote:
> I wrote an optimized Apple IIGS program to calculate all the prime numbers up to 16834. Finding primes via the SoE method is a common CPU benchmark and I wanted to see how fast the GS could compute primes.
>
> The fastest previous versions for the 6502/65c02/65802/65816 was 140 usec @ 1 MHz.
>
> This version is over 2x faster, finding the 1899 primes in ~59 usec @ 1 MHz, and ~24 usec @ 2.8 MHz.
>
> The program is 245 bytes long, consisting of the sieve routine at 192 bytes, and a 53 byte test program which runs the sieve 100 times and prints out the number of primes found: 1899.
>
> 00/2000:A2 64 18 FB C2 30 86 40 20 35 20 C6 40 D0 F9 38
> 00/2010:FB 20 3A FF 18 FB C2 30 78 BA 9B A9 00 60 1B 7B
> 00/2020:AB D0 01 1A BA 10 F9 BB 9A 4B AB AA EB 58 FB 20
> 00/2030:24 ED 4C 8E FD 0B 3B 8D EC 20 A9 FF 7F 1B A2 4E
> 00/2040:00 86 42 7B 1A AA EB A8 7B 3A 5A 8B 48 48 0B 5A
> 00/2050:DA DA 5A DA 5A 5A DA DA 5A 48 0B 48 DA 0B 48 DA
> 00/2060:DA 5A 48 5A 5A DA 0B 5A 48 5A 48 DA DA 5A DA DA
> 00/2070:5A DA 5A 48 DA 0B 5A 48 0B 48 DA 0B 5A DA 48 C6
> 00/2080:42 D0 C7 7A 0B 0B E2 14 A9 00 20 5B A9 05 60 1B
> 00/2090:A9 3C 60 A0 0A 85 E2 C8 84 AE 84 B3 84 B8 84 BD
> 00/20A0:84 C2 84 C7 84 CC 84 D1 C2 10 BA 1B 08 69 00 00
> 00/20B0:1B 08 69 00 00 1B 08 69 00 00 1B 08 69 00 00 1B
> 00/20C0:08 69 00 00 1B 08 69 00 00 1B 08 69 00 00 1B 08
> 00/20D0:69 00 00 10 D6 9A E2 10 C8 10 04 C8 C8 30 0C 98
> 00/20E0:0A 69 00 00 85 E2 AB D0 F2 80 AC A9 00 00 1B 2B
> 00/20F0:4B AB C2 34 60
>
> The sieve routine at $2035 is called in native 65816 mode with 16-bit AXY, B=0, D=0.
>
> Enjoy.
>
> -JB
> @JBrooksBSI
>
> 00/2000: A2 64 LDX #64
> 00/2002: 18 CLC
> 00/2003: FB XCE
> 00/2004: C2 30 REP #30
> 00/2006: 86 40 STX 40
> 00/2008: 20 35 20 JSR 2035
> 00/200B: C6 40 DEC 40
> 00/200D: D0 F9 BNE 2008 {-07}
> 00/200F: 38 SEC
> 00/2010: FB XCE
> 00/2011: 20 3A FF JSR FF3A
> 00/2014: 18 CLC
> 00/2015: FB XCE
> 00/2016: C2 30 REP #30
> 00/2018: 78 SEI
> 00/2019: BA TSX
> 00/201A: 9B TXY
> 00/201B: A9 00 60 LDA #6000
> 00/201E: 1B TCS
> 00/201F: 7B TDC
> 00/2020: AB PLB
> 00/2021: D0 01 BNE 2024 {+01}
> 00/2023: 1A INC
> 00/2024: BA TSX
> 00/2025: 10 F9 BPL 2020 {-07}
> 00/2027: BB TYX
> 00/2028: 9A TXS
> 00/2029: 4B PHK
> 00/202A: AB PLB
> 00/202B: AA TAX
> 00/202C: EB XBA
> 00/202D: 58 CLI
> 00/202E: FB XCE
> 00/202F: 20 24 ED JSR ED24
> 00/2032: 4C 8E FD JMP FD8E
> 00/2035: 0B PHD
> 00/2036: 3B TSC
> 00/2037: 8D EC 20 STA 20EC
> 00/203A: A9 FF 7F LDA #7FFF
> 00/203D: 1B TCS
> 00/203E: A2 4E 00 LDX #004E
> 00/2041: 86 42 STX 42
> 00/2043: 7B TDC
> 00/2044: 1A INC
> 00/2045: AA TAX
> 00/2046: EB XBA
> 00/2047: A8 TAY
> 00/2048: 7B TDC
> 00/2049: 3A DEC
> 00/204A: 5A PHY
> 00/204B: 8B PHB
> 00/204C: 48 PHA
> 00/204D: 48 PHA
> 00/204E: 0B PHD
> 00/204F: 5A PHY
> 00/2050: DA PHX
> 00/2051: DA PHX
> 00/2052: 5A PHY
> 00/2053: DA PHX
> 00/2054: 5A PHY
> 00/2055: 5A PHY
> 00/2056: DA PHX
> 00/2057: DA PHX
> 00/2058: 5A PHY
> 00/2059: 48 PHA
> 00/205A: 0B PHD
> 00/205B: 48 PHA
> 00/205C: DA PHX
> 00/205D: 0B PHD
> 00/205E: 48 PHA
> 00/205F: DA PHX
> 00/2060: DA PHX
> 00/2061: 5A PHY
> 00/2062: 48 PHA
> 00/2063: 5A PHY
> 00/2064: 5A PHY
> 00/2065: DA PHX
> 00/2066: 0B PHD
> 00/2067: 5A PHY
> 00/2068: 48 PHA
> 00/2069: 5A PHY
> 00/206A: 48 PHA
> 00/206B: DA PHX
> 00/206C: DA PHX
> 00/206D: 5A PHY
> 00/206E: DA PHX
> 00/206F: DA PHX
> 00/2070: 5A PHY
> 00/2071: DA PHX
> 00/2072: 5A PHY
> 00/2073: 48 PHA
> 00/2074: DA PHX
> 00/2075: 0B PHD
> 00/2076: 5A PHY
> 00/2077: 48 PHA
> 00/2078: 0B PHD
> 00/2079: 48 PHA
> 00/207A: DA PHX
> 00/207B: 0B PHD
> 00/207C: 5A PHY
> 00/207D: DA PHX
> 00/207E: 48 PHA
> 00/207F: C6 42 DEC 42
> 00/2081: D0 C7 BNE 204A {-39}
> 00/2083: 7A PLY
> 00/2084: 0B PHD
> 00/2085: 0B PHD
> 00/2086: E2 14 SEP #14
> 00/2088: A9 00 20 LDA #2000
> 00/208B: 5B TCD
> 00/208C: A9 05 60 LDA #6005
> 00/208F: 1B TCS
> 00/2090: A9 3C 60 LDA #603C
> 00/2093: A0 0A LDY #0A
> 00/2095: 85 E2 STA E2
> 00/2097: C8 INY
> 00/2098: 84 AE STY AE
> 00/209A: 84 B3 STY B3
> 00/209C: 84 B8 STY B8
> 00/209E: 84 BD STY BD
> 00/20A0: 84 C2 STY C2
> 00/20A2: 84 C7 STY C7
> 00/20A4: 84 CC STY CC
> 00/20A6: 84 D1 STY D1
> 00/20A8: C2 10 REP #10
> 00/20AA: BA TSX
> 00/20AB: 1B TCS
> 00/20AC: 08 PHP
> 00/20AD: 69 00 00 ADC #0000
> 00/20B0: 1B TCS
> 00/20B1: 08 PHP
> 00/20B2: 69 00 00 ADC #0000
> 00/20B5: 1B TCS
> 00/20B6: 08 PHP
> 00/20B7: 69 00 00 ADC #0000
> 00/20BA: 1B TCS
> 00/20BB: 08 PHP
> 00/20BC: 69 00 00 ADC #0000
> 00/20BF: 1B TCS
> 00/20C0: 08 PHP
> 00/20C1: 69 00 00 ADC #0000
> 00/20C4: 1B TCS
> 00/20C5: 08 PHP
> 00/20C6: 69 00 00 ADC #0000
> 00/20C9: 1B TCS
> 00/20CA: 08 PHP
> 00/20CB: 69 00 00 ADC #0000
> 00/20CE: 1B TCS
> 00/20CF: 08 PHP
> 00/20D0: 69 00 00 ADC #0000
> 00/20D3: 10 D6 BPL 20AB {-2A}
> 00/20D5: 9A TXS
> 00/20D6: E2 10 SEP #10
> 00/20D8: C8 INY
> 00/20D9: 10 04 BPL 20DF {+04}
> 00/20DB: C8 INY
> 00/20DC: C8 INY
> 00/20DD: 30 0C BMI 20EB {+0C}
> 00/20DF: 98 TYA
> 00/20E0: 0A ASL
> 00/20E1: 69 00 00 ADC #0000
> 00/20E4: 85 E2 STA E2
> 00/20E6: AB PLB
> 00/20E7: D0 F2 BNE 20DB {-0E}
> 00/20E9: 80 AC BRA 2097 {-54}
> 00/20EB: A9 00 00 LDA #0000
> 00/20EE: 1B TCS
> 00/20EF: 2B PLD
> 00/20F0: 4B PHK
> 00/20F1: AB PLB
> 00/20F2: C2 34 REP #34
> 00/20F4: 60 RTS

Hi John,
How are you calculating the time it takes to compute the result? I entered this program on my accelerated IIGS and whether i use the "FAST" or "Normal" Speed setting, from the time of 2000G command to execute it to the time to the putput of '1899' seems the same.

Am I doing something wrong? I was hoping to get a different result.

Chris
Re: Sieve of Eratosthenes benchmark [message #365514 is a reply to message #365488] Thu, 22 March 2018 01:17 Go to previous messageGo to next message
barrym95838 is currently offline  barrym95838
Messages: 130
Registered: April 2013
Karma: 0
Senior Member
On Wednesday, March 21, 2018 at 3:56:35 PM UTC-7, Gregory Schroeder wrote:
>
> I use this program in GSoft and the upper limit over 16377 gives me an
> overflow error.

Hmmm ... I'm running on AppleWin with DOS 3.3 ... not really sure what
"GSoft" is ...

]POKE33,33:LIST:POKE33,40

10 INPUT "PRIME NUMBER UPPER SEARCH LIMIT: ";L: DIM F%(L)
20 FOR N = 2 TO SQR (L): IF F%(N) THEN 40
30 FOR K = N * N TO L STEP N:F%(K) = 1: NEXT
40 NEXT :K = 0: FOR N = 2 TO L: IF F%(N) = 0 THEN K = K + 1: REM PRINT N,
50 NEXT : PRINT "TOTAL OF "K

]RUN
PRIME NUMBER UPPER SEARCH LIMIT: 18077

?OUT OF MEMORY ERROR IN 30
]RUN
PRIME NUMBER UPPER SEARCH LIMIT: 18076
TOTAL OF 2071

]?FRE(0)
2

Mike B.
Re: Sieve of Eratosthenes benchmark [message #365522 is a reply to message #365309] Thu, 22 March 2018 05:28 Go to previous messageGo to next message
Anonymous
Karma:
Originally posted by: Tom Porter

This is what I use to list a program instead of
]POKE33,33:LIST:POKE33,40

....
EXEC FILE (LIST)- works great in APPLEWIN's Printer File

PR#3
PR#1
POKE 33,66
LIST
PR#3

As for making faster primes, I'd just slow it down!
Re: Sieve of Eratosthenes benchmark [message #365527 is a reply to message #365489] Thu, 22 March 2018 07:21 Go to previous messageGo to next message
Anonymous
Karma:
Originally posted by: John Brooks

On Wednesday, March 21, 2018 at 4:29:18 PM UTC-7, Chris Rupnik wrote:

> Hi John,
> How are you calculating the time it takes to compute the result? I entered this program on my accelerated IIGS and whether i use the "FAST" or "Normal" Speed setting, from the time of 2000G command to execute it to the time to the putput of '1899' seems the same.
>
> Am I doing something wrong? I was hoping to get a different result.
>
> Chris

Hi Chris. I am timing by increasing the loop ctr to 16 bits and measuring with a stopwatch.

For 1000 loops of the sieve calc, use this patch:
1FFF:18 FB C2 30 A2 E8 03
1FFFG

For 10,000 loops:
1FFF:18 FB C2 30 A2 10 27
1FFFG

-JB
@JBrooksBSI
Re: Sieve of Eratosthenes benchmark [message #365528 is a reply to message #365514] Thu, 22 March 2018 07:36 Go to previous messageGo to next message
Anonymous
Karma:
Originally posted by: John Brooks

On Wednesday, March 21, 2018 at 10:17:13 PM UTC-7, barrym95838 wrote:
> On Wednesday, March 21, 2018 at 3:56:35 PM UTC-7, Gregory Schroeder wrote:
>>
>> I use this program in GSoft and the upper limit over 16377 gives me an
>> overflow error.
>
> Hmmm ... I'm running on AppleWin with DOS 3.3 ... not really sure what
> "GSoft" is ...
>
> ]POKE33,33:LIST:POKE33,40
>
> 10 INPUT "PRIME NUMBER UPPER SEARCH LIMIT: ";L: DIM F%(L)
> 20 FOR N = 2 TO SQR (L): IF F%(N) THEN 40
> 30 FOR K = N * N TO L STEP N:F%(K) = 1: NEXT
> 40 NEXT :K = 0: FOR N = 2 TO L: IF F%(N) = 0 THEN K = K + 1: REM PRINT N,
> 50 NEXT : PRINT "TOTAL OF "K
>
> ]RUN
> PRIME NUMBER UPPER SEARCH LIMIT: 18077
>
> ?OUT OF MEMORY ERROR IN 30
> ]RUN
> PRIME NUMBER UPPER SEARCH LIMIT: 18076
> TOTAL OF 2071
>
> ]?FRE(0)
> 2
>
> Mike B.

Hi Mike.

Some ideas for Applesoft SoE:

1) Try replacing the F% array with direct peek/pokes of memory = 1 byte per search number. This cuts memory use in half compared to 16-bit F% words.

2) My SoE code omits all even numbers from the search array, doing that in basic would cut memory use in half again.

3) Try Beagle Basic which has very fast integer math and can use GS or Ramworks memory for larger arrays.

Fun stuff!
-JB
@JBrooksBSI
Re: Sieve of Eratosthenes benchmark [message #365537 is a reply to message #365527] Thu, 22 March 2018 14:48 Go to previous messageGo to next message
Anonymous
Karma:
Originally posted by: Chris Rupnik

On Thursday, March 22, 2018 at 7:21:19 AM UTC-4, John Brooks wrote:
> On Wednesday, March 21, 2018 at 4:29:18 PM UTC-7, Chris Rupnik wrote:
>
>> Hi John,
>> How are you calculating the time it takes to compute the result? I entered this program on my accelerated IIGS and whether i use the "FAST" or "Normal" Speed setting, from the time of 2000G command to execute it to the time to the putput of '1899' seems the same.
>>
>> Am I doing something wrong? I was hoping to get a different result.
>>
>> Chris
>
> Hi Chris. I am timing by increasing the loop ctr to 16 bits and measuring with a stopwatch.
>
> For 1000 loops of the sieve calc, use this patch:
> 1FFF:18 FB C2 30 A2 E8 03
> 1FFFG
>
> For 10,000 loops:
> 1FFF:18 FB C2 30 A2 10 27
> 1FFFG
>
> -JB
> @JBrooksBSI


On my 11GS - the 1000 loop runs in 13 seconds, either in FAST or NORMAL.
That is not what i was expecting. What do you get for the 1000 loop total time?
Re: Sieve of Eratosthenes benchmark [message #365545 is a reply to message #365514] Thu, 22 March 2018 18:59 Go to previous messageGo to next message
gids.rs is currently offline  gids.rs
Messages: 1395
Registered: October 2012
Karma: 0
Senior Member
On Wednesday, March 21, 2018 at 11:17:13 PM UTC-6, barrym95838 wrote:
> On Wednesday, March 21, 2018 at 3:56:35 PM UTC-7, Gregory Schroeder wrote:
>>
>> I use this program in GSoft and the upper limit over 16377 gives me an
>> overflow error.
>
> Hmmm ... I'm running on AppleWin with DOS 3.3 ... not really sure what
> "GSoft" is ...
>
> ]POKE33,33:LIST:POKE33,40
>
> 10 INPUT "PRIME NUMBER UPPER SEARCH LIMIT: ";L: DIM F%(L)
> 20 FOR N = 2 TO SQR (L): IF F%(N) THEN 40
> 30 FOR K = N * N TO L STEP N:F%(K) = 1: NEXT
> 40 NEXT :K = 0: FOR N = 2 TO L: IF F%(N) = 0 THEN K = K + 1: REM PRINT N,
> 50 NEXT : PRINT "TOTAL OF "K
>
> ]RUN
> PRIME NUMBER UPPER SEARCH LIMIT: 18077
>
> ?OUT OF MEMORY ERROR IN 30
> ]RUN
> PRIME NUMBER UPPER SEARCH LIMIT: 18076
> TOTAL OF 2071
>
> ]?FRE(0)
> 2
>
> Mike B.


This routine skips all even numbers and should run a little faster

10 INPUT "PRIME NUMBER UPPER SEARCH LIMIT: ";L: DIM F%(L / 2)
20 FOR N = 3 TO SQR (L) STEP 2:K% = N / 2: IF F%(K%) THEN NEXT : GOTO 50
30 FOR J = N * N TO L STEP N:K% = J / 2: IF J / 2 = K% THEN NEXT : NEXT : GOTO 50
40 F%(K%) = 1: NEXT : NEXT
50 K = 0: FOR N = 3 TO L / 2: IF F%(N) = 0 THEN K = K + 1
60 NEXT : PRINT "TOTAL OF "K + 1: PRINT "2 ";
65 FOR I = 1 TO 256: IF F%(I) = 1 THEN NEXT : END
70 PRINT (F%(I) = 0) * I * 2 + 1" ";: NEXT
Re: Sieve of Eratosthenes benchmark [message #365549 is a reply to message #365537] Fri, 23 March 2018 00:00 Go to previous messageGo to next message
Anonymous
Karma:
Originally posted by: John Brooks

On Thursday, March 22, 2018 at 11:48:45 AM UTC-7, Chris Rupnik wrote:
> On Thursday, March 22, 2018 at 7:21:19 AM UTC-4, John Brooks wrote:
>> On Wednesday, March 21, 2018 at 4:29:18 PM UTC-7, Chris Rupnik wrote:
>>
>>> Hi John,
>>> How are you calculating the time it takes to compute the result? I entered this program on my accelerated IIGS and whether i use the "FAST" or "Normal" Speed setting, from the time of 2000G command to execute it to the time to the putput of '1899' seems the same.
>>>
>>> Am I doing something wrong? I was hoping to get a different result.
>>>
>>> Chris
>>
>> Hi Chris. I am timing by increasing the loop ctr to 16 bits and measuring with a stopwatch.
>>
>> For 1000 loops of the sieve calc, use this patch:
>> 1FFF:18 FB C2 30 A2 E8 03
>> 1FFFG
>>
>> For 10,000 loops:
>> 1FFF:18 FB C2 30 A2 10 27
>> 1FFFG
>>
>> -JB
>> @JBrooksBSI
>
>
> On my 11GS - the 1000 loop runs in 13 seconds, either in FAST or NORMAL.
> That is not what i was expecting. What do you get for the 1000 loop total time?

I'm at GDC this week, but I'll run it on my 9 MHz ZipGS this weekend and post numbers.

IIRC, normal speed (1.0 MHz) was 24 seconds for 1000 iterations, and fast speed (2.8 MHz) was 60 seconds for 1000 iterations.

-JB
@JBrooksBSI
Re: Sieve of Eratosthenes benchmark [message #365581 is a reply to message #365549] Fri, 23 March 2018 23:12 Go to previous messageGo to next message
Michael J. Mahon is currently offline  Michael J. Mahon
Messages: 1767
Registered: October 2012
Karma: 0
Senior Member
John Brooks <jbrooks@blueshiftinc.com> wrote:
> On Thursday, March 22, 2018 at 11:48:45 AM UTC-7, Chris Rupnik wrote:
>> On Thursday, March 22, 2018 at 7:21:19 AM UTC-4, John Brooks wrote:
>>> On Wednesday, March 21, 2018 at 4:29:18 PM UTC-7, Chris Rupnik wrote:
>>>
>>>> Hi John,
>>>> How are you calculating the time it takes to compute the result? I
>>>> entered this program on my accelerated IIGS and whether i use the
>>>> "FAST" or "Normal" Speed setting, from the time of 2000G command to
>>>> execute it to the time to the putput of '1899' seems the same.
>>>>
>>>> Am I doing something wrong? I was hoping to get a different result.
>>>>
>>>> Chris
>>>
>>> Hi Chris. I am timing by increasing the loop ctr to 16 bits and
>>> measuring with a stopwatch.
>>>
>>> For 1000 loops of the sieve calc, use this patch:
>>> 1FFF:18 FB C2 30 A2 E8 03
>>> 1FFFG
>>>
>>> For 10,000 loops:
>>> 1FFF:18 FB C2 30 A2 10 27
>>> 1FFFG
>>>
>>> -JB
>>> @JBrooksBSI
>>
>>
>> On my 11GS - the 1000 loop runs in 13 seconds, either in FAST or NORMAL.
>> That is not what i was expecting. What do you get for the 1000 loop total time?
>
> I'm at GDC this week, but I'll run it on my 9 MHz ZipGS this weekend and post numbers.
>
> IIRC, normal speed (1.0 MHz) was 24 seconds for 1000 iterations, and fast
> speed (2.8 MHz) was 60 seconds for 1000 iterations.
>
> -JB
> @JBrooksBSI
>

....or the other way around!

--
-michael - NadaNet 3.1 and AppleCrate II: http://michaeljmahon.com
Re: Sieve of Eratosthenes benchmark [message #365588 is a reply to message #365522] Sat, 24 March 2018 02:15 Go to previous messageGo to next message
Anonymous
Karma:
Originally posted by: Brian Patrie

On 2018-03-22 04:28, Tom Porter wrote:
> This is what I use to list a program instead of
> ]POKE33,33:LIST:POKE33,40
>
> ....
> EXEC FILE (LIST)- works great in APPLEWIN's Printer File
>
> PR#3
> PR#1
> POKE 33,66
> LIST
> PR#3
>
> As for making faster primes, I'd just slow it down!

32 or 64 for width is more reliable. If conditions are just right, an
unwanted cr can slip in with 33 (and probably 66). Harmless if all you
need is more convenient right-arrow editing; but a problem when writing
to a file, or printer.


FWIW, here's my (kinda ridiculously fancy) LLIST exec file:

:L=PEEK(32):W=PEEK(33):POKE32,0:POKE33,32:PR#1:CALL-998:LIST
:CALL-998:CALL-958:CALL-756+588*(W<41):POKE33,W:POKE32,L:CALL976

(all on one line). The CALL 976 is the reconnect (Pro)DOS call at $3D0,
since i'm not using the (Pro)DOS PR#1. -998 is cursor up. The
-756+588*(W<41) dance conditionally calls RDKEY, to absorb an extra cr
that happens in 80 column mode. (Yeah, a lot of fuss just to keep the
screen relatively pretty; but it was entertaining to write.) :)
Re: Sieve of Eratosthenes benchmark [message #365630 is a reply to message #365549] Sat, 24 March 2018 15:39 Go to previous messageGo to next message
Anonymous
Karma:
Originally posted by: Chris Rupnik

On Friday, March 23, 2018 at 12:00:50 AM UTC-4, John Brooks wrote:
> On Thursday, March 22, 2018 at 11:48:45 AM UTC-7, Chris Rupnik wrote:
>> On Thursday, March 22, 2018 at 7:21:19 AM UTC-4, John Brooks wrote:
>>> On Wednesday, March 21, 2018 at 4:29:18 PM UTC-7, Chris Rupnik wrote:
>>>
>>>> Hi John,
>>>> How are you calculating the time it takes to compute the result? I entered this program on my accelerated IIGS and whether i use the "FAST" or "Normal" Speed setting, from the time of 2000G command to execute it to the time to the putput of '1899' seems the same.
>>>>
>>>> Am I doing something wrong? I was hoping to get a different result..
>>>>
>>>> Chris
>>>
>>> Hi Chris. I am timing by increasing the loop ctr to 16 bits and measuring with a stopwatch.
>>>
>>> For 1000 loops of the sieve calc, use this patch:
>>> 1FFF:18 FB C2 30 A2 E8 03
>>> 1FFFG
>>>
>>> For 10,000 loops:
>>> 1FFF:18 FB C2 30 A2 10 27
>>> 1FFFG
>>>
>>> -JB
>>> @JBrooksBSI
>>
>>
>> On my 11GS - the 1000 loop runs in 13 seconds, either in FAST or NORMAL..
>> That is not what i was expecting. What do you get for the 1000 loop total time?
>
> I'm at GDC this week, but I'll run it on my 9 MHz ZipGS this weekend and post numbers.
>
> IIRC, normal speed (1.0 MHz) was 24 seconds for 1000 iterations, and fast speed (2.8 MHz) was 60 seconds for 1000 iterations.
>
> -JB
> @JBrooksBSI


On my accelerated GS - i got 13 seconds for 1000 iterations.
On my regular GS - i would get 24 seconds in FAST mode - and 59 seconds in NORMAL mode.
Re: Sieve of Eratosthenes benchmark [message #365867 is a reply to message #365309] Fri, 30 March 2018 11:41 Go to previous messageGo to next message
Anonymous
Karma:
Originally posted by: dmrogers99

On Sunday, March 18, 2018 at 2:56:23 PM UTC-4, John Brooks wrote:
> I wrote an optimized Apple IIGS program to calculate all the prime numbers up to 16834. Finding primes via the SoE method is a common CPU benchmark and I wanted to see how fast the GS could compute primes.
>
> The fastest previous versions for the 6502/65c02/65802/65816 was 140 usec @ 1 MHz.
>
> This version is over 2x faster, finding the 1899 primes in ~59 usec @ 1 MHz, and ~24 usec @ 2.8 MHz.
>
> The program is 245 bytes long, consisting of the sieve routine at 192 bytes, and a 53 byte test program which runs the sieve 100 times and prints out the number of primes found: 1899.


John,

Can you provide a pointer to the previous benchmark program? Am I correct in reading that time as 140 ms @ 1MHz?

It's an interesting topic, the Sieve of Eratosthenes. I found Bob Sander-Cederlof's first take on it here: http://www.txbobsc.com/aal/1981/aal8110.html

Follow-ups here: http://www.txbobsc.com/aal/1982/aal8202.html#a3

And here: http://www.txbobsc.com/aal/1982/aal8211.html#a4

A comment on the original article: http://www.txbobsc.com/aal/1983/aal8303.html#a6

Comments about the 68000 processor benchmark: http://www.txbobsc.com/aal/1984/aal8407.html#a4

An update to the 6502 routine based on the 68000 algorithm: http://www.txbobsc.com/aal/1984/aal8407.html#a5

A 65802 routine is here: http://www.txbobsc.com/aal/1985/aal8509.html#a1

Feedback about the 68000 routine and informed speculation on memory handicaps: http://www.txbobsc.com/aal/1985/aal8510.html#a4

I think that was all the references I could find in Assembly Lines, and it appears as though your previous fastest benchmark is based on Bob's 65802 results. Is that correct?

I can't profess to understanding much of it, but it's short enough that I might, with some effort, eventually figure it out! ;^)

Thanks for the post. I led to a fascinating excursion down memory lane. And Ulam's Spiral is pretty interesting too!

Dave Rogers
Re: Sieve of Eratosthenes benchmark [message #365869 is a reply to message #365867] Fri, 30 March 2018 12:57 Go to previous messageGo to next message
Anonymous
Karma:
Originally posted by: John Brooks

On Friday, March 30, 2018 at 8:41:48 AM UTC-7, dmrog...@yahoo.com wrote:
> On Sunday, March 18, 2018 at 2:56:23 PM UTC-4, John Brooks wrote:
>> I wrote an optimized Apple IIGS program to calculate all the prime numbers up to 16834. Finding primes via the SoE method is a common CPU benchmark and I wanted to see how fast the GS could compute primes.
>>
>> The fastest previous versions for the 6502/65c02/65802/65816 was 140 usec @ 1 MHz.
>>
>> This version is over 2x faster, finding the 1899 primes in ~59 usec @ 1 MHz, and ~24 usec @ 2.8 MHz.
>>
>> The program is 245 bytes long, consisting of the sieve routine at 192 bytes, and a 53 byte test program which runs the sieve 100 times and prints out the number of primes found: 1899.
>
>
> John,
>
> Can you provide a pointer to the previous benchmark program? Am I correct in reading that time as 140 ms @ 1MHz?
>
> It's an interesting topic, the Sieve of Eratosthenes. I found Bob Sander-Cederlof's first take on it here: http://www.txbobsc.com/aal/1981/aal8110.html
>
> Follow-ups here: http://www.txbobsc.com/aal/1982/aal8202.html#a3
>
> And here: http://www.txbobsc.com/aal/1982/aal8211.html#a4
>
> A comment on the original article: http://www.txbobsc.com/aal/1983/aal8303.html#a6
>
> Comments about the 68000 processor benchmark: http://www.txbobsc.com/aal/1984/aal8407.html#a4
>
> An update to the 6502 routine based on the 68000 algorithm: http://www.txbobsc.com/aal/1984/aal8407.html#a5
>
> A 65802 routine is here: http://www.txbobsc.com/aal/1985/aal8509.html#a1
>
> Feedback about the 68000 routine and informed speculation on memory handicaps: http://www.txbobsc.com/aal/1985/aal8510.html#a4
>
> I think that was all the references I could find in Assembly Lines, and it appears as though your previous fastest benchmark is based on Bob's 65802 results. Is that correct?
>
> I can't profess to understanding much of it, but it's short enough that I might, with some effort, eventually figure it out! ;^)
>
> Thanks for the post. I led to a fascinating excursion down memory lane. And Ulam's Spiral is pretty interesting too!
>
> Dave Rogers

..

> Can you provide a pointer to the previous benchmark program? Am I correct in reading that time as 140 ms @ 1MHz?

Yes, the fastest previous version I could find was the Assembly Lines 65802 version by Bob S-C at 140ms. It uses a wheel2 for both the flags memory (storing only odd bytes) and init routines.

While traveling last week I made a fast 6502 sieve, and slightly smaller/faster versions for the 65816.

Below is the 6502 version which runs in 145ms per sieve and is 174 bytes long (228 bytes with test code). The flags data is stored wheel2 (odd only) at $6000-$7FFF with primes == bit7 set.

2000:A2 FA 86 06 20 36 20 C6 06 D0 F9 20 3A FF A9 00
2010:AA 18 A0 60 8C 19 20 2C 00 7F 10 05 E8 D0 02 69
2020:01 EE 18 20 D0 F1 C8 10 EB 20 DA FD 8A 20 DA FD
2030:20 8E FD 6C FC FF A2 9C BD 48 20 95 39 CA D0 F8
2040:A0 00 A2 0E A9 F6 4C 6F 00 C2 2C A6 4A 4C A6 88
2050:6C 22 CA 64 A4 CA 68 86 85 71 85 75 85 79 85 7D
2060:85 81 85 85 85 89 86 70 E8 86 74 E8 86 78 E8 86
2070:7C E8 86 80 E8 86 84 E8 86 88 A2 0E B5 3A 99 00
2080:60 0A 99 01 60 0A 99 02 60 0A 99 03 60 0A 99 04
2090:60 0A 99 05 60 0A 99 06 60 98 18 69 07 A8 CA 10
20A0:DB 98 10 D6 29 7F A8 A5 70 49 80 AA 30 B8 A5 71
20B0:69 01 10 A4 A2 60 A9 3C A0 0B 85 CD E6 B0 2C 04
20C0:60 10 15 86 C8 84 C1 86 BF 18 85 BE 8E 00 FF 69
20D0:00 90 F7 E8 10 F1 A2 00 C8 98 0A 69 00 90 01 E8
20E0:C8 10 D7 60

It should run on all Apple II & compatibles. I've tested it on the integer Apple II too (if you save it as a system file & launch from P8 2.4 Bitsy Bye).

It prints the prime count in hex: $076C == 1900 primes, and ends via simulated reset to set up zero page and drop to the monitor on integer Apple II machines.

For Applesoft systems, this basic program will print out all 1900 primes:

10 HOME
20 PRINT "1) 2"
30 CTR = 2
40 FOR I = 1 TO 8191
50 IF PEEK (24576 + I) < 128 THEN NEXT : END
60 PRINT CTR") "I + I + 1:CTR = CTR + 1
70 NEXT

> I can't profess to understanding much of it, but it's short enough that I might, with some effort, eventually figure it out! ;^)

I'm happy to explain how it works.

My changes are primarily to optimize the inner loops of the sieve calc - the flag clear routine, and to initialize the flags array as wheel105 (3*5*7) instead of wheel2.

A nonintuitive part of the code is incrementally calculating squares. Adding Ctr*8 to the previously calculated square gives the next square:

1*1 = 1
3*3 = 9 (1*8+1)
5*5 = 25 (2*8+9)
7*7 = 49 (3*8+25)

In my case, I increment the ctr by 2 via INY, INY, then double again to get 4x via TYA, ASL. Since the flags array is wheel2, storing only odd flags, I can skip the last double to 8x.

One a prime is found, all flags less then Prime*Prime have already been cleared by smaller primes, so skipping to Prime squared is a big speed up.

Below is the 6502 disasm.

Enjoy.
-JB
@JBrooksBSI

-- Test code --
2000- A2 64 LDX #$64
2002- 86 06 STX $06
2004- 20 36 20 JSR $2036
2007- C6 06 DEC $06
2009- D0 F9 BNE $2004
200B- 20 3A FF JSR $FF3A
200E- A9 00 LDA #$00
2010- AA TAX
2011- 18 CLC
2012- A0 60 LDY #$60
2014- 8C 19 20 STY $2019
2017- 2C 00 60 BIT $6000
201A- 10 05 BPL $2021
201C- E8 INX
201D- D0 02 BNE $2021
201F- 69 01 ADC #$01
2021- EE 18 20 INC $2018
2024- D0 F1 BNE $2017
2026- C8 INY
2027- 10 EB BPL $2014
2027- 10 EB BPL $2014
2029- 20 DA FD JSR $FDDA
202C- 8A TXA
202D- 20 DA FD JSR $FDDA
2030- 20 8E FD JSR $FD8E
2033- 6C FC FF JMP ($FFFC)

-- 6502 Sieve code --
2036- A2 9C LDX #$9C
2038- BD 48 20 LDA $2048,X
203B- 95 39 STA $39,X
203D- CA DEX
203E- D0 F8 BNE $2038
2040- A0 00 LDY #$00
2042- A2 0E LDX #$0E
2044- A9 F6 LDA #$F6
2046- 4C 6F 00 JMP $006F
-- Wheel 105 (3*5*7) constants --
2049- C2 2C A6 4A 4C A6 88
2050- 6C 22 CA 64 A4 CA 68 86
-- Flags init --
2058- 85 71 STA $71
205A- 85 75 STA $75
205C- 85 79 STA $79
205E- 85 7D STA $7D
2060- 85 81 STA $81
2062- 85 85 STA $85
2064- 85 89 STA $89
2066- 86 70 STX $70
2068- E8 INX
2069- 86 74 STX $74
206B- E8 INX
206C- 86 78 STX $78
206E- E8 INX
206F- 86 7C STX $7C
2071- E8 INX
2072- 86 80 STX $80
2074- E8 INX
2075- 86 84 STX $84
2077- E8 INX
2078- 86 88 STX $88
2078- 86 88 STX $88
207A- A2 0E LDX #$0E
207C- B5 3A LDA $3A,X
207E- 99 00 60 STA $6000,Y
2081- 0A ASL
2082- 99 01 60 STA $6001,Y
2085- 0A ASL
2086- 99 02 60 STA $6002,Y
2089- 0A ASL
208A- 99 03 60 STA $6003,Y
208D- 0A ASL
208E- 99 04 60 STA $6004,Y
2091- 0A ASL
2092- 99 05 60 STA $6005,Y
2095- 0A ASL
2096- 99 06 60 STA $6006,Y
2099- 98 TYA
209A- 18 CLC
209B- 69 07 ADC #$07
209D- A8 TAY
209E- CA DEX
209E- CA DEX
209F- 10 DB BPL $207C
20A1- 98 TYA
20A2- 10 D6 BPL $207A
20A4- 29 7F AND #$7F
20A6- A8 TAY
20A7- A5 70 LDA $70
20A9- 49 80 EOR #$80
20AB- AA TAX
20AC- 30 B8 BMI $2066
20AE- A5 71 LDA $71
20B0- 69 01 ADC #$01
20B2- 10 A4 BPL $2058
20B4- A2 60 LDX #$60
20B6- A9 3C LDA #$3C
20B8- A0 0B LDY #$0B
-- Sieve loop --
20BA- 85 CD STA $CD
20BC- E6 B0 INC $B0
20BE- 2C 04 60 BIT $6004
20C1- 10 15 BPL $20D8
20C3- 86 C8 STX $C8
20C3- 86 C8 STX $C8
20C5- 84 C1 STY $C1
20C7- 86 BF STX $BF
20C9- 18 CLC
-- Inner loop --
20CA- 85 BE STA $BE
20CC- 8E 00 FF STX $FF00
20CF- 69 00 ADC #$00
20D1- 90 F7 BCC $20CA
20D3- E8 INX
20D4- 10 F1 BPL $20C7
20D6- A2 00 LDX #$00
20D8- C8 INY
20D9- 98 TYA
20DA- 0A ASL
20DB- 69 00 ADC #$00
20DD- 90 01 BCC $20E0
20DF- E8 INX
20E0- C8 INY
20E1- 10 D7 BPL $20BA
20E3- 60 RTS
Re: Sieve of Eratosthenes benchmark [message #365889 is a reply to message #365545] Fri, 30 March 2018 17:02 Go to previous messageGo to next message
gids.rs is currently offline  gids.rs
Messages: 1395
Registered: October 2012
Karma: 0
Senior Member
Here are some Basic comparisons
All tests were done with Sweet16 at 1 Mhz

* Integer Basic - 154 secs
* time matches with this website
* http://www.txbobsc.com/aal/1981/aal8110.html

10 S=8191: DIM F(8191): N=0
20 FOR I=0 TO S: F(I)=1: NEXT I
30 FOR I=0 TO S: IF F(I)=0 THEN 80
40 P=I+I+3 : IF P>130 THEN 70 : K=I+P
50 IF K<S THEN F(K)=0: K=K+P: IF K<S THEN 50
70 N=N+1: REM PRINT P;" ";
80 NEXT I
90 PRINT: PRINT N;" PRIMES": END


* AS - without calculating even numbers
* using all FP variables - 128 secs
* max limit is 14443 using FP array DIM F(14443/2)
* using integer variables - 134 secs
* max limit is 18077 when using integer array DIM F%(18077/2)

10 L = 8191: DIM F%(L / 2)
20 FOR N = 3 TO SQR (L) STEP 2:K% = N / 2: IF F%(K%) THEN NEXT : GOTO 42
30 FOR J = N * N TO L STEP N:K% = J / 2: IF J / 2 = K% THEN NEXT : NEXT : GOTO 42
40 F%(K%) = 1: NEXT : NEXT
42 PRINT CHR$ (7)
50 PRINT "2 ";: FOR N = 1 TO 500: IF F%(N) = 0 THEN K = K + 1: PRINT N * 2 + 1" ";
60 NEXT : PRINT: PRINT "Total Primes: "K + 1


* Applesoft - 63 secs for Sweet16, 74 secs for Applewin
* max FP array is 7221 so unable to use FP to find all primes

10 L = 8191: DIM F%(L):A = 1
20 FOR N = 2 TO SQR (L): IF F%(N) THEN NEXT : GOTO 42
30 FOR K = N * N TO L STEP N:F%(K) = A: NEXT
40 NEXT
42 PRINT CHR$ (7):L = 500
45 K = 0: FOR N = 2 TO L: IF F%(N) = 0 THEN K = K + A: PRINT N" ";
50 NEXT : PRINT : PRINT "Total Primes: "K


Result: Sweet16 has a slightly faster 1 Mhz than Applewin
Now to test on a real machine to see which one is closer to authentic

Sweet16 will be compared with a real IIGS and Applewin to a real IIe. Will post back here with some results.
Re: Sieve of Eratosthenes benchmark [message #365907 is a reply to message #365889] Fri, 30 March 2018 23:42 Go to previous messageGo to next message
Anonymous
Karma:
Originally posted by: John Brooks

On Friday, March 30, 2018 at 2:02:33 PM UTC-7, I am Rob wrote:
> Here are some Basic comparisons
> All tests were done with Sweet16 at 1 Mhz
>
> * Integer Basic - 154 secs
> * time matches with this website
> * http://www.txbobsc.com/aal/1981/aal8110.html
>
> 10 S=8191: DIM F(8191): N=0
> 20 FOR I=0 TO S: F(I)=1: NEXT I
> 30 FOR I=0 TO S: IF F(I)=0 THEN 80
> 40 P=I+I+3 : IF P>130 THEN 70 : K=I+P
> 50 IF K<S THEN F(K)=0: K=K+P: IF K<S THEN 50
> 70 N=N+1: REM PRINT P;" ";
> 80 NEXT I
> 90 PRINT: PRINT N;" PRIMES": END
>
>
> * AS - without calculating even numbers
> * using all FP variables - 128 secs
> * max limit is 14443 using FP array DIM F(14443/2)
> * using integer variables - 134 secs
> * max limit is 18077 when using integer array DIM F%(18077/2)
>
> 10 L = 8191: DIM F%(L / 2)
> 20 FOR N = 3 TO SQR (L) STEP 2:K% = N / 2: IF F%(K%) THEN NEXT : GOTO 42
> 30 FOR J = N * N TO L STEP N:K% = J / 2: IF J / 2 = K% THEN NEXT : NEXT : GOTO 42
> 40 F%(K%) = 1: NEXT : NEXT
> 42 PRINT CHR$ (7)
> 50 PRINT "2 ";: FOR N = 1 TO 500: IF F%(N) = 0 THEN K = K + 1: PRINT N * 2 + 1" ";
> 60 NEXT : PRINT: PRINT "Total Primes: "K + 1
>
>
> * Applesoft - 63 secs for Sweet16, 74 secs for Applewin
> * max FP array is 7221 so unable to use FP to find all primes
>
> 10 L = 8191: DIM F%(L):A = 1
> 20 FOR N = 2 TO SQR (L): IF F%(N) THEN NEXT : GOTO 42
> 30 FOR K = N * N TO L STEP N:F%(K) = A: NEXT
> 40 NEXT
> 42 PRINT CHR$ (7):L = 500
> 45 K = 0: FOR N = 2 TO L: IF F%(N) = 0 THEN K = K + A: PRINT N" ";
> 50 NEXT : PRINT : PRINT "Total Primes: "K
>
>
> Result: Sweet16 has a slightly faster 1 Mhz than Applewin
> Now to test on a real machine to see which one is closer to authentic
>
> Sweet16 will be compared with a real IIGS and Applewin to a real IIe. Will post back here with some results.

Last week I did a quick compiled Applesoft (Beagle Basic) version that took about 9 seconds at 1 MHz IIRC. I'll try to clean it up this weekend and post it too.

-JB
@JBrooksBSI
Re: Sieve of Eratosthenes benchmark [message #365940 is a reply to message #365907] Sun, 01 April 2018 04:05 Go to previous messageGo to next message
Anonymous
Karma:
Originally posted by: John Brooks

This Applesoft version of sieve takes 19.8 seconds @ 1 MHz, and 464 bytes:

5 F = 6 * 4096:D = F + 2
10 FOR I = 1 TO 15: READ W
15 FOR J = 1 TO 7: POKE D,W: POKE D + 105,W
20 IF W > 127 THEN W = W - 128
25 W = W + W:D = D + 1: NEXT J,I
30 FOR I = 1 TO 3: READ D,L,H
35 POKE D,L: POKE D + 1,H: NEXT I
40 FOR I = 1 TO 38: POKE 171,0: POKE 172,0: CALL - 6700: NEXT
45 FOR I = 0 TO 3: POKE F + I,128: NEXT
50 SQ = 9 * 9 / 2 + F:S = 9 / 2 + F
55 FOR N = 11 TO 127 STEP 2
60 SQ = SQ - 2 + N + N:S = S + 1
65 IF PEEK (S) < 128 THEN NEXT : GOTO 75
70 FOR D = SQ TO 32767 STEP N: POKE D,0: NEXT : NEXT
75 PRINT CHR$ (7)"1) 2":C = 2: FOR I = 1 TO 8191
80 IF PEEK (F + I) > 127 THEN PRINT C") "I + I + 1:C = C + 1
85 NEXT
90 DATA 26,166,44,146,150,40,138,180,36,154,50,44,152,182,12
95 DATA 0,210,0,1,2,96,113,212,96
save sieve.bas

This Beagle Basic version of sieve takes 2.6 seconds @ 1 MHz, and 412 bytes compiled:

5 FLAGS% = 6 * 4096:DST% = FLAGS% + 2
10 FOR I = 1 TO 15: READ WHEEL%
15 FOR J = 1 TO 7: POKE DST%,WHEEL%: POKE DST% + 105,WHEEL%
20 IF WHEEL% > 127 THEN WHEEL% = WHEEL% - 128
25 WHEEL% = WHEEL% + WHEEL%:DST% = DST% + 1: NEXT J,I
30 FOR I = 1 TO 3: READ DST%,VL%,VH%
35 POKE DST%,VL%: POKE DST% + 1,VH%: NEXT I
40 FOR I = 1 TO 38: POKE 171,0: POKE 172,3: CALL - 6700: NEXT
45 FOR I = 0 TO 3: POKE FLAGS% + I,128: NEXT
50 SQ% = 9 * 9 / 2 + FLAGS%:SRC% = 9 / 2 + FLAGS%
55 FOR NUM% = 11 TO 127 STEP 2
60 SQ% = SQ% - 2 + NUM% + NUM%:SRC% = SRC% + 1
65 IF PEEK (SRC%) < 128 THEN NEXT : GOTO 75
70 FOR DST% = SQ% TO 32767 STEP NUM%: POKE DST%,0: NEXT : NEXT
75 PRINT CHR$ (7)"1) 2":C = 2: FOR I = 1 TO 8191
80 IF PEEK (FLAGS% + I) > 127 THEN PRINT C") "I + I + 1:C = C + 1
85 NEXT
90 DATA 26,166,44,146,150,40,138,180,36,154,50,44,152,182,12
95 DATA 768,210,0,769,2,96,113,212,96
save sieve

NOTES:
5 Wheel2 (odd) flags are stored at $6000-$7FFF. wheel105*2*38 is 8190 bytes (+2=8192)
10 Init 2x wheel105 via 15 DATA bytes
15 Each DATA byte inits 7 flags (7 * 15 = 105). Store 2x wheels
20 Strip high bit
25 Add to itself to shift left one bit
30 Set up a string descriptor at $300 and set string dest ptr to FLAGS%+105+105
35 Poke a 16-bit word into memory
40 Init the flags array via 38x 210-byte string copies using descriptor at $300
45 Mark first 4x flags as primes (2,3,5,7)
50 Init SQ = previous NUM squared. SRC = ptr to previous NUM
55 Check odd numbers 11 to 127 for primes
60 Update NUM squared and the ptr to the FLAGS num to check
65 If FLAGS entry for this odd NUM is not prime, then try the next
70 Found a prime. Zero flags from SQ to the end
75 Beep to show calc is finished. Then print 2 as the only even prime
80 For each prime, print count & prime
85 Loop through all flag bytes
90 Wheel105 constants: 3*5*7 bits stored 7 bits per byte
95 String descriptor at $300=len:210,ptr:$6002. String dest ptr @ ZP $71 = $60D4

Note: The string descriptor is at $00-$02 in the Applesoft sieve.

To compile the Beagle Basic version:
1) Run Compiler.System to launch the Beagle Basic interpreter
2) "-compiler" to load the basic compiler
3) "compile sieve,sieve.com" to compile applesoft into Beagle Basic tokenized format
4) "-compiler.system" to reload the interpreter, freeing memory used by the compiler
5) "-sieve.com" to run the compiled sieve via the Beagle Basic interpreter

Have fun,
-JB
@JBrooksBSI
Re: Sieve of Eratosthenes benchmark [message #365953 is a reply to message #365940] Sun, 01 April 2018 16:05 Go to previous messageGo to next message
barrym95838 is currently offline  barrym95838
Messages: 130
Registered: April 2013
Karma: 0
Senior Member
On Sunday, April 1, 2018 at 1:05:06 AM UTC-7, John Brooks wrote:
> This Applesoft version of sieve takes 19.8 seconds @ 1 MHz, and 464 bytes:
>
> 5 F = 6 * 4096:D = F + 2
> 10 FOR I = 1 TO 15: READ W
> 15 FOR J = 1 TO 7: POKE D,W: POKE D + 105,W
> 20 IF W > 127 THEN W = W - 128
> 25 W = W + W:D = D + 1: NEXT J,I
> 30 FOR I = 1 TO 3: READ D,L,H
> 35 POKE D,L: POKE D + 1,H: NEXT I
> 40 FOR I = 1 TO 38: POKE 171,0: POKE 172,0: CALL - 6700: NEXT
> 45 FOR I = 0 TO 3: POKE F + I,128: NEXT
> 50 SQ = 9 * 9 / 2 + F:S = 9 / 2 + F
> 55 FOR N = 11 TO 127 STEP 2
> 60 SQ = SQ - 2 + N + N:S = S + 1
> 65 IF PEEK (S) < 128 THEN NEXT : GOTO 75
> 70 FOR D = SQ TO 32767 STEP N: POKE D,0: NEXT : NEXT
> 75 PRINT CHR$ (7)"1) 2":C = 2: FOR I = 1 TO 8191
> 80 IF PEEK (F + I) > 127 THEN PRINT C") "I + I + 1:C = C + 1
> 85 NEXT
> 90 DATA 26,166,44,146,150,40,138,180,36,154,50,44,152,182,12
> 95 DATA 0,210,0,1,2,96,113,212,96
> save sieve.bas
>

I'd like to move the sieve buffer into a hi-res range so I could view
a "live animation" of the routine's progress, but it looks to be a
bit more work than I can afford at the moment ... the "honey-do" list
is looming large, and must take priority!

P.S. I know that the "wheel methods" are great for improving performance,
but they seem to be a bit "shady" IMO. Heck, what's to stop a person
from "wheeling" just about everything, and actually "sieving" next to
nothing? The next step after that is 1,900 PRINT statements, right?

Mike B.
Re: Sieve of Eratosthenes benchmark [message #365956 is a reply to message #365953] Sun, 01 April 2018 17:28 Go to previous messageGo to next message
Antoine Vignau is currently offline  Antoine Vignau
Messages: 1860
Registered: October 2012
Karma: 0
Senior Member
On Sunday, April 1, 2018 at 10:05:01 PM UTC+2, barrym95838 wrote:
> On Sunday, April 1, 2018 at 1:05:06 AM UTC-7, John Brooks wrote:
>> This Applesoft version of sieve takes 19.8 seconds @ 1 MHz, and 464 bytes:
>>
>> 5 F = 6 * 4096:D = F + 2
>> 10 FOR I = 1 TO 15: READ W
>> 15 FOR J = 1 TO 7: POKE D,W: POKE D + 105,W
>> 20 IF W > 127 THEN W = W - 128
>> 25 W = W + W:D = D + 1: NEXT J,I
>> 30 FOR I = 1 TO 3: READ D,L,H
>> 35 POKE D,L: POKE D + 1,H: NEXT I
>> 40 FOR I = 1 TO 38: POKE 171,0: POKE 172,0: CALL - 6700: NEXT
>> 45 FOR I = 0 TO 3: POKE F + I,128: NEXT
>> 50 SQ = 9 * 9 / 2 + F:S = 9 / 2 + F
>> 55 FOR N = 11 TO 127 STEP 2
>> 60 SQ = SQ - 2 + N + N:S = S + 1
>> 65 IF PEEK (S) < 128 THEN NEXT : GOTO 75
>> 70 FOR D = SQ TO 32767 STEP N: POKE D,0: NEXT : NEXT
>> 75 PRINT CHR$ (7)"1) 2":C = 2: FOR I = 1 TO 8191
>> 80 IF PEEK (F + I) > 127 THEN PRINT C") "I + I + 1:C = C + 1
>> 85 NEXT
>> 90 DATA 26,166,44,146,150,40,138,180,36,154,50,44,152,182,12
>> 95 DATA 0,210,0,1,2,96,113,212,96
>> save sieve.bas
>>
>
> I'd like to move the sieve buffer into a hi-res range so I could view
> a "live animation" of the routine's progress, but it looks to be a
> bit more work than I can afford at the moment ... the "honey-do" list
> is looming large, and must take priority!
>
> P.S. I know that the "wheel methods" are great for improving performance,
> but they seem to be a bit "shady" IMO. Heck, what's to stop a person
> from "wheeling" just about everything, and actually "sieving" next to
> nothing? The next step after that is 1,900 PRINT statements, right?
>
> Mike B.

2 HGR
5 F = 2 * 4096 : D = F + 2
....should do the trick,
av
Re: Sieve of Eratosthenes benchmark [message #365957 is a reply to message #365956] Sun, 01 April 2018 18:20 Go to previous messageGo to next message
barrym95838 is currently offline  barrym95838
Messages: 130
Registered: April 2013
Karma: 0
Senior Member
On Sunday, April 1, 2018 at 2:28:57 PM UTC-7, Antoine Vignau wrote:
>
> 2 HGR
> 5 F = 2 * 4096 : D = F + 2
> ...should do the trick,
> av

Yeah, I thought the same, until I saw that "32767" in line 70.
That "CALL - 6700" in line 40 may have mysterious side-effects also.

Mike B.
Re: Sieve of Eratosthenes benchmark [message #365958 is a reply to message #365956] Sun, 01 April 2018 18:26 Go to previous messageGo to next message
barrym95838 is currently offline  barrym95838
Messages: 130
Registered: April 2013
Karma: 0
Senior Member
On Sunday, April 1, 2018 at 2:28:57 PM UTC-7, Antoine Vignau wrote:
> On Sunday, April 1, 2018 at 10:05:01 PM UTC+2, barrym95838 wrote:
>> On Sunday, April 1, 2018 at 1:05:06 AM UTC-7, John Brooks wrote:
>>> ...
>>> 40 FOR I = 1 TO 38: POKE 171,0: POKE 172,0: CALL - 6700: NEXT
>>> ...
>> ...
> ...
John, could you explain why you're poking 172,0 for Applesoft and
172,3 for Beagle BASIC? Typo?

Mike B.
Re: Sieve of Eratosthenes benchmark [message #365959 is a reply to message #365957] Sun, 01 April 2018 18:34 Go to previous messageGo to next message
Antoine Vignau is currently offline  Antoine Vignau
Messages: 1860
Registered: October 2012
Karma: 0
Senior Member
On Monday, April 2, 2018 at 12:20:30 AM UTC+2, barrym95838 wrote:
> On Sunday, April 1, 2018 at 2:28:57 PM UTC-7, Antoine Vignau wrote:
>>
>> 2 HGR
>> 5 F = 2 * 4096 : D = F + 2
>> ...should do the trick,
>> av
>
> Yeah, I thought the same, until I saw that "32767" in line 70.
> That "CALL - 6700" in line 40 may have mysterious side-effects also.
>
> Mike B.

The two lines show poor results, that's right.

-6700 = $E5D4 which is a ROM routine that copies strings, it is called MOVINS. On entry, you have the source pointer at $AB..$AC (poke 171...), and on exit, the next string pointed at $71..$72 (113..114)

av
Re: Sieve of Eratosthenes benchmark [message #365961 is a reply to message #365953] Sun, 01 April 2018 19:15 Go to previous messageGo to next message
Anonymous
Karma:
Originally posted by: John Brooks

On Sunday, April 1, 2018 at 1:05:01 PM UTC-7, barrym95838 wrote:
> On Sunday, April 1, 2018 at 1:05:06 AM UTC-7, John Brooks wrote:
>> This Applesoft version of sieve takes 19.8 seconds @ 1 MHz, and 464 bytes:
>>
>> 5 F = 6 * 4096:D = F + 2
>> 10 FOR I = 1 TO 15: READ W
>> 15 FOR J = 1 TO 7: POKE D,W: POKE D + 105,W
>> 20 IF W > 127 THEN W = W - 128
>> 25 W = W + W:D = D + 1: NEXT J,I
>> 30 FOR I = 1 TO 3: READ D,L,H
>> 35 POKE D,L: POKE D + 1,H: NEXT I
>> 40 FOR I = 1 TO 38: POKE 171,0: POKE 172,0: CALL - 6700: NEXT
>> 45 FOR I = 0 TO 3: POKE F + I,128: NEXT
>> 50 SQ = 9 * 9 / 2 + F:S = 9 / 2 + F
>> 55 FOR N = 11 TO 127 STEP 2
>> 60 SQ = SQ - 2 + N + N:S = S + 1
>> 65 IF PEEK (S) < 128 THEN NEXT : GOTO 75
>> 70 FOR D = SQ TO 32767 STEP N: POKE D,0: NEXT : NEXT
>> 75 PRINT CHR$ (7)"1) 2":C = 2: FOR I = 1 TO 8191
>> 80 IF PEEK (F + I) > 127 THEN PRINT C") "I + I + 1:C = C + 1
>> 85 NEXT
>> 90 DATA 26,166,44,146,150,40,138,180,36,154,50,44,152,182,12
>> 95 DATA 0,210,0,1,2,96,113,212,96
>> save sieve.bas
>>
>
> I'd like to move the sieve buffer into a hi-res range so I could view
> a "live animation" of the routine's progress, but it looks to be a
> bit more work than I can afford at the moment ... the "honey-do" list
> is looming large, and must take priority!
>
> P.S. I know that the "wheel methods" are great for improving performance,
> but they seem to be a bit "shady" IMO. Heck, what's to stop a person
> from "wheeling" just about everything, and actually "sieving" next to
> nothing? The next step after that is 1,900 PRINT statements, right?
>
> Mike B.

..

> I'd like to move the sieve buffer into a hi-res range so I could view
> a "live animation" of the routine's progress

Good idea. Here is an Applesoft version that stores the 8192 flag bytes on hires page 2 $4000-$5FFF. Press a key after the beep to switch to text mode and print out the 1900 primes.

5 F = 4 * 4096:D = F + 2: HGR2
10 FOR I = 1 TO 15: READ W
15 FOR J = 1 TO 7: POKE D,W: POKE D + 105,W
20 IF W > 127 THEN W = W - 128
25 W = W + W:D = D + 1: NEXT J,I
30 FOR I = 1 TO 3: READ D,L,H
35 POKE D,L: POKE D + 1,H: NEXT I
40 FOR I = 1 TO 38: POKE 171,0: POKE 172,0: CALL - 6700: NEXT
45 FOR I = 0 TO 3: POKE F + I,128: NEXT
50 SQ = 9 * 9 / 2 + F:S = 9 / 2 + F
55 FOR N = 11 TO 127 STEP 2
60 SQ = SQ - 2 + N + N:S = S + 1
65 IF PEEK (S) < 128 THEN NEXT : GOTO 75
70 FOR D = SQ TO 24576 STEP N: POKE D,0: NEXT : NEXT
75 PRINT CHR$ (7)"1) 2":C = 2: GET A$: TEXT : FOR I = 1 TO 8191
80 IF PEEK (F + I) > 127 THEN PRINT C") "I + I + 1:C = C + 1
85 NEXT
90 DATA 26,166,44,146,150,40,138,180,36,154,50,44,152,182,12
95 DATA 0,210,0,1,2,64,113,212,64


> P.S. I know that the "wheel methods" are great for improving performance,
> but they seem to be a bit "shady" IMO. Heck, what's to stop a person
> from "wheeling" just about everything, and actually "sieving" next to
> nothing?

Yes, the ultimate speed up is to just precompute all the primes. In practice, all the 'fast SoE' routines use wheel optimizations, particularly wheel2 to halve the memory size.

It's less of a win to use wheel optimizations above 105 as 3*5*7*11 is wheel1155 and 3*5*7*11*13 is wheel15015, so it becomes prohibitively large to store the wheel data.


> John, could you explain why you're poking 172,0 for Applesoft and 172,3 for Beagle BASIC?

Sure. The Applesoft interpreter zeroes the string descriptor ZP ptr at $AB/$AC, so the descriptor must be stored at $00-$02. But the Beagle Basic interpreter uses ZP $00, so the string descriptor is moved to $300 for BB.

-JB
@JBrooksBSI
Re: Sieve of Eratosthenes benchmark [message #365962 is a reply to message #365961] Sun, 01 April 2018 20:34 Go to previous messageGo to next message
qkumba is currently offline  qkumba
Messages: 1584
Registered: March 2013
Karma: 0
Senior Member
I have no opportunity to try at the moment, but at a glance, it looks like the flags init code in the assembler version could be preset in the code, since the original code at $208x is never executed from that address.
Re: Sieve of Eratosthenes benchmark [message #365965 is a reply to message #365962] Sun, 01 April 2018 23:33 Go to previous messageGo to next message
Anonymous
Karma:
Originally posted by: John Brooks

On Sunday, April 1, 2018 at 5:34:22 PM UTC-7, qkumba wrote:
> I have no opportunity to try at the moment, but at a glance, it looks like the flags init code in the assembler version could be preset in the code, since the original code at $208x is never executed from that address.

The entire sieve routine runs from zero page. The 7x STA abs,Y at $207E-$2096 are self-modified at $2058 (high byte) and $2066 (low byte). Those 7x stores init the 8K flags array at $6000-$7FFF.

The self-mod addresses are updated every ~128x INY (low adr mod only if no page cross) using a single lazy-update-check every 7x flags.

BTW: The disasm post above incorrectly duplicated lines $209E and $20C3. Ignore the 2nd line.

-JB
Re: Sieve of Eratosthenes benchmark [message #421153 is a reply to message #365309] Sat, 25 November 2023 17:04 Go to previous message
qkumba is currently offline  qkumba
Messages: 1584
Registered: March 2013
Karma: 0
Senior Member
ldx #SieveEnd-SieveZP+1 ; Num bytes to copy to ZP. +1 for X=0 exit
CopyToZp
lda RelocToZP-1,x
sta ZpCodeOrg-1,x
dex
bne CopyToZp

ldy #0 ; y: Flags index == 0

->

ldy #SieveEnd-SieveZP+1 ; Num bytes to copy to ZP. +1 for X=0 exit
CopyToZp
ldx RelocToZP-1,y
stx ZpCodeOrg-1,y
dey
bne CopyToZp

; ldy #0 ; y: Flags index == 0
  Switch to threaded view of this topic Create a new topic Submit Reply
Previous Topic: Oh no
Next Topic: Kelvin Sherlock's ProFUSE 2.0 - Any Details?
Goto Forum:
  

-=] Back to Top [=-
[ Syndicate this forum (XML) ] [ RSS ] [ PDF ]

Current Time: Thu Mar 28 19:57:26 EDT 2024

Total time taken to generate the page: 0.04995 seconds