Megalextoria
Retro computing and gaming, sci-fi books, tv and movies and other geeky stuff.

Wanted: Assembler (long post) [message #94948] Mon, 27 December 2004 09:48
Jody Bruchon
I am seeking an assembler that probably does not exist. The specific
features I need are:

* Support for #ifdef, #include, #define, #undef, #else, #endif
(preprocessing conditional includes like lupo/luna/lld)
* Support for 6502, 6510 illegal ops, 65816 (emulation mode) ops, AND 65C02 ops

And I need them *in the same assembler*. The luna toolkit looked promising
until I saw that it doesn't do 65C02 and 6510 illegal ops. I can't program
in C (though I'm great in BASIC ;) so I'm not particularly up to adding this
support to luna. Any suggestions?

The reasoning behind this is: I'm making that darn C02 OS and I was
thinking, "Hey, it would be pretty smart to rewrite some parts of the system
in 65C02 ops and take advantage of 65C02/65816 new ops if the user wants to
build it that way!" I want the OS to become "portable and optimized" across
the entire available 6502 family.

For those who don't want to see lots of assembly code, close this message.

*** INSANELY LENGTHY CODE COMPARISON FOLLOWS ***

Take this code from the scheduler I posted on c.b.cbm:

irq
pha ; Save A so we won't lose it!
txa ; X too!
pha
ldx task ; Find out what task we're on
lda #$00 ; Init A for countdown

That was for a stock 6502/6510 NMOS CPU. 7 bytes, 13 cycles. Rework it for
the 65C02 and we can get:

irq
pha ; Save A so we won't lose it!
phx ; X too!
ldx task ; Find out what task we're on
lda #$00 ; Init A for countdown

6 bytes, 11 cycles. Okay, okay, two cycles ain't a big deal, but that's
just the init for the IRQ routine, and that's triggering 60 times per
second at a minimum, so that's 120 cycles per second that can go to
something else.
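
As a rough cross-check of those figures, here is a small Python tally. It is only a sketch: the byte and cycle values are the standard documented ones for these opcodes, and it assumes `task` lives in zero page (the 7-byte total only works out that way). The names `NMOS_ENTRY`, `C02_ENTRY` and `tally` are made up for this example.

```python
# (mnemonic, bytes, cycles); "ldx zp" assumes `task` is a zero-page variable.
NMOS_ENTRY = [("pha", 1, 3), ("txa", 1, 2), ("pha", 1, 3),
              ("ldx zp", 2, 3), ("lda #imm", 2, 2)]
C02_ENTRY = [("pha", 1, 3), ("phx", 1, 3),
             ("ldx zp", 2, 3), ("lda #imm", 2, 2)]

def tally(listing):
    """Return (total bytes, total cycles) for a list of instructions."""
    return (sum(b for _, b, _ in listing), sum(c for _, _, c in listing))

nmos = tally(NMOS_ENTRY)                     # (7, 13)
c02 = tally(C02_ENTRY)                       # (6, 11)
saved_per_second = (nmos[1] - c02[1]) * 60   # 60 IRQs per second -> 120
print(nmos, c02, saved_per_second)
```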

How about running on a 65816 in Emulation mode? NMOS has:

irqsav
tax ; Load index
sty t1y,x ; Store Y
pla ; Pull X
sta t1x,x ; Store X
pla ; Pull A
sta t1a,x ; Store A
pla ; Pull P (IRQ stored)
sta t1p,x ; Store P
pla ; Pull PC low byte
sta t1pc,x ; Store PC low
pla ; Pull PC high
sta t1pc+1,x ; Store PC high
stx temp ; Save index
tsx ; Get current SP
txa ; Save SP elsewhere
ldx temp ; Restore index
sta t1sp,x ; Save SP
ldx task ; Load task number
inx ; Increment task number
cpx tasks ; Compare with running task counter
bne irqtsk ; If not at max, proceed normally
ldx #$01 ; Reset task number to 1

That counts out to 35 bytes and 70 cycles if the task counter hasn't maxed
out, and 72 cycles if it has. 65816 says:

irqsav
tax ; Load index
sty t1y,x ; Store Y
pla ; Pull X
sta t1x,x ; Store X
pla ; Pull A
sta t1a,x ; Store A
pla ; Pull P (IRQ stored)
sta t1p,x ; Store P
pla ; Pull PC low byte
sta t1pc,x ; Store PC low
pla ; Pull PC high
sta t1pc+1,x ; Store PC high
tsc ; Get SP
sta t1sp,x ; Save SP
ldx task ; Load task number
inx ; Increment task number
cpx tasks ; Compare with running task counter
bne irqtsk ; If not at max, proceed normally
ldx #$01 ; Reset task number to 1

That knocked out stx $xx, txa, ldx $xx, saving 5 bytes and 8 cycles...only
30 bytes and 62 cycles to save a context on a 65816 in emulation mode.
Don't forget the same silliness executes on the context restore operation as
well (X denotes lines that would be removed entirely):

irqload
tax ; Load index
X stx temp ; Save index to memory temporarily
lda t1sp,x ; Load SP
X tax ; Prepare SP for change
txs ; Change SP (on the 65816, with the tax above removed, this becomes tcs)
X ldx temp ; Restore index
lda t1pc+1,x ; Load PC high
pha ; Push PC high
lda t1pc,x ; Load PC low
pha ; Push PC low
lda t1p,x ; Load P
pha ; Push P
lda t1a,x ; Load A
sta temp ; Temporarily save A
ldy t1y,x ; Load Y
lda t1x,x ; Load X with A
tax ; Move X's value from A into X
lda $dc0d ; c64: Silence the CIA 1 interrupts
lda $dd0d ; c64: Silence the CIA 2 interrupts
lda temp ; Load A from temporary location (last, so the CIA reads can't clobber it)
rti ; Return from IRQ into next task
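
The byte and cycle totals quoted for the two irqsav listings further up can be re-counted the same way. This is a sketch under stated assumptions: every operand is in zero page, the branch does not cross a page, and on the 65816 the direct page is left at $0000 so emulation-mode timings match the 6502. With those assumptions the sizes come out at 35 and 30 bytes as stated, and the saving is 5 bytes and 8 cycles as stated. The absolute cycle counts come out one higher on the "not at max" path (71 and 63), because a taken branch costs 3 cycles rather than 2. All names here are invented for the example.

```python
# (mnemonic, bytes, cycles), assuming every operand is in zero page.
PULL_AND_STORE = [("pla", 1, 4), ("sta zp,x", 2, 4)]

COMMON_HEAD = [("tax", 1, 2), ("sty zp,x", 2, 4)] + PULL_AND_STORE * 5
GET_SP_NMOS = [("stx zp", 2, 3), ("tsx", 1, 2), ("txa", 1, 2), ("ldx zp", 2, 3)]
GET_SP_816 = [("tsc", 1, 2)]
COMMON_TAIL = [("sta zp,x", 2, 4), ("ldx zp", 2, 3), ("inx", 1, 2),
               ("cpx zp", 2, 3)]

def tally(get_sp):
    """Return (bytes, cycles if bne is taken, cycles if it falls through)."""
    body = COMMON_HEAD + get_sp + COMMON_TAIL
    size = sum(b for _, b, _ in body) + 2 + 2   # + bne + ldx #imm
    cycles = sum(c for _, _, c in body)
    taken = cycles + 3              # bne taken, no page crossing
    not_taken = cycles + 2 + 2      # bne falls through, then ldx #imm
    return size, taken, not_taken

nmos = tally(GET_SP_NMOS)    # (35, 71, 72)
emu816 = tally(GET_SP_816)   # (30, 63, 64)
print(nmos, emu816)
```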

I'm sure there are other things I could do to optimize more, such as
replacing the "sta temp" and "lda temp" with XBA instructions (no cycle
count gain, but drop two bytes of code size). It appears that I stand to
gain about 18 cycles per IRQ for this extremely minimal IRQ handler on a
65816, which gives me 1,080 cycles per second that can go to something
else...and the IRQ handler is definitely going to expand. If someone wanted
to port C02 to an Apple IIgs, an SNES, a VIC-20 with a 65C02 stuck in there
to replace the NMOS 6502, or even their own homebuilt computer system, they
would be able to reap the full benefits of their CPUs by being able to plug
in optimized versions of the most important pieces of code. Unfortunately,
it seems I would have to mix and match some sort of preprocessor with a
different assembler to achieve these results. Any suggestions?

Jody
Re: Wanted: Assembler (long post) [message #94953 is a reply to message #94948] Mon, 27 December 2004 11:10
Pekka Takala
On Mon, 27 Dec 2004 14:48:45 +0000, Jody Bruchon took his/her keyboard and
got this out:

> * Support for #ifdef, #include, #define, #undef, #else, #endif
> (preprocessing conditional includes like lupo/luna/lld)


Acme has also those.

> * Support for 6502, 6510 illegal ops, 65816 (emulation mode) ops, AND 65C02 ops


Acme supports all except 6510 illegals. But why you need 6510 illegals?
They are not as reliable as documented codes. And I suggest that you
detect the presence of 65816 and 65c02, and if you find out that there is
65816, program it in native mode. With 65816 and native mode you have
access to true 16 bit registers and you can change their length "on-fly"
when needed, and there are also other neat features, for example stack can
be anywhere in first 64k, so can the zeropage (actually direct page) and
so on.

If I write code, I do not use illegals. That makes the code working with
the whole serie. 65c02 has bit-manipulating instructions which do not
exist with 65816. So this would mean that you need to write four different
codes and this takes four times as much space.

I would write just two codes, the task scheduler for 6502 and 65816. And
the latter would use the native mode, since there you can relocate the
stack in whole 64 k area- no need of reading bytes from stack and storing
them elsewhere.

Illegal operations just might render your program so that it won't work on
some machines.

For example, ldax $c000 and lda $c000, tax perform the same, latter takes
one memory cell more than first but it WILL work with all processors.

Of course you are free to do different codes for all processor types if
you want it. Acme supports binary embedding so you can assembly different
codes with other tools if Acme does not support them.



>
> And I need them *in the same assembler*. The luna toolkit looked
> promising until I saw that it doesn't do 65C02 and 6510 illegal ops. I
> can't program in C (though I'm great in BASIC ;) so I'm not particularly
> up to adding this support to luna. Any suggestions?
>
> The reasoning behind this is: I'm making that darn C02 OS and I was
> thinking, "Hey, it would be pretty smart to rewrite some parts of the
> system in 65C02 ops and take advantage of 65C02/65816 new ops if the
> user wants to build it that way!" I want the OS to become "portable and
> optimized" across the entire available 6502 family.


What you shoved in your coding examples you do not seem to code
efficiently. I am lazy, so I try to optimize everything, even coding. So
it means that if I can save typing an extra opcode I will do that. Result
is that code is damn small.

I would do my task scheduler in a way that does not use the stack at all
on 6502. It is faster than first putting all data to stack, then reading
it immediately back. Just put the data to better place and read only three
bytes from stack- program counter and status. And task swapping must be
faster than 1/60 secs, and preferably have priorities. A program waiting
for keyboard does not need attention before it gets a key. So you need to
provide also these services, and if you use graphics mode, the graphics
could be drawn from a "stack" where draw commands are added. And if you
swap tasks 1/200 or faster, you will get smoother multitasking.
Multitasking os is not an easy to write- you must also notice that if you
have two programs waiting for a key, you must know which program gets it
when you push a key.




--
Pekka "Pihti" Takala
When replying to my mail, remove SPAMREMOVE and .invalid
Nothing can be so bad that you cannot find something good in it!
65XXX assembler programmer/developer, linux user
Re: Wanted: Assembler (long post) [message #94958 is a reply to message #94953] Mon, 27 December 2004 14:35
Jody Bruchon
Pekka Takala wrote:

> On Mon, 27 Dec 2004 14:48:45 +0000, Jody Bruchon took his/her keyboard and
> got this out:
>
>> * Support for #ifdef, #include, #define, #undef, #else, #endif
>> (preprocessing conditional includes like lupo/luna/lld)
>
> Acme has also those.


Shameless plug?

>> * Support for 6502, 6510 illegal ops, 65816 (emulation mode) ops, AND 65C02 ops
>
> Acme supports all except 6510 illegals. But why you need 6510 illegals?
> They are not as reliable as documented codes. And I suggest that you
> detect the presence of 65816 and 65c02, and if you find out that there is
> 65816, program it in native mode. With 65816 and native mode you have


I don't want to program in 65816 native mode. I figured I could use the
illegal opcodes for some optimizations along the way that would only work on
the C64, and if the person wanted to use these illegal optimizations in
place of the standards-compliant base code, they could rebuild the system
and have it their way.

> access to true 16 bit registers and you can change their length "on-fly"
> when needed, and there are also other neat features, for example stack can
> be anywhere in first 64k, so can the zeropage (actually direct page) and
> so on.


This OS is for the 6502. I want to provide modularity in the source so I
can plug in replacements optimized for different CPUs. The 65816 native
mode would require a LOT of replacement of code, and I don't have a SCPU,
and I don't have anything with a 65816 (except an old SNES). If someone was
to replace their CPU in their VIC-20 with a 65C02, for example, I want to be
able to provide optimized code for that hack if the person wants it. If
someone has a SCPU, then I can optimize for emulation mode's new opcodes
without major code chunk rewrites.

>
> If I write code, I do not use illegals. That makes the code working with
> the whole serie. 65c02 has bit-manipulating instructions which do not
> exist with 65816. So this would mean that you need to write four different
> codes and this takes four times as much space.


The point is that I won't have to *use* four different pieces of code in the
final assembled object code, only the *one* piece that matches the target
sub-architecture. The whole point of having #define/#ifdef type
preprocessing directives in my assembler is to allow size and speed
optimization for 6502 cores that have special features. If there were an
interest in it, I would optimize for R65C02 as well :)

>
> I would write just two codes, the task scheduler for 6502 and 65816. And
> the latter would use the native mode, since there you can relocate the
> stack in whole 64 k area- no need of reading bytes from stack and storing
> them elsewhere.


*sigh* I'm not writing for 65816 native mode at this time. It would make
context switching easier but it is a different system from 6502 and I don't
want to deal with it yet. End Of Discussion on that topic.

>
> Illegal operations just might render your program so that it won't work on
> some machines.
>


They work on stock C64s and most emulators. "Universally optimized" code is
provided by making it in such a fashion that I can do something like this:

#ifdef IS_NMOS6502
(NMOS 6502 code)
#endif
#ifdef IS_65816_E
(65816 Emulation mode code)
#endif

The preprocessor should use a configuration file that #define's what the
target should have and cuts out un-#defined code. You can pick from base
6502, 65C02, MOS 6502 illegals, and 65816 emulation, all in the
configuration file, and whatever you don't pick is chopped out.
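
To make the "chopped out" behaviour concrete, here is a minimal Python sketch of such a preprocessor. It is not luna's, Acme's or any real tool's implementation, just an illustration of the idea: it understands only #define, #undef, #ifdef, #else and #endif (no #include, no error handling for unbalanced directives), and the function name `preprocess` is made up. The symbols `IS_NMOS6502` and `IS_65816_E` are the ones from the example above.

```python
def preprocess(source, defined=()):
    """Return the source lines that survive #ifdef/#else/#endif filtering."""
    symbols = set(defined)
    stack = []      # one bool per open #ifdef: is that branch selected?
    output = []
    for line in source.splitlines():
        words = line.split()
        directive = words[0] if words else ""
        active = all(stack)             # every enclosing block must be selected
        if directive == "#ifdef":
            stack.append(words[1] in symbols)
        elif directive == "#else":
            stack[-1] = not stack[-1]
        elif directive == "#endif":
            stack.pop()
        elif directive == "#define" and active:
            symbols.add(words[1])
        elif directive == "#undef" and active:
            symbols.discard(words[1])
        elif active:
            output.append(line)         # ordinary line in a selected block
    return output

SOURCE = """\
#ifdef IS_NMOS6502
 stx temp
 tsx
 txa
 ldx temp
#endif
#ifdef IS_65816_E
 tsc
#endif
 sta t1sp,x
"""
print("\n".join(preprocess(SOURCE, {"IS_65816_E"})))
```

Passing `{"IS_NMOS6502"}` instead keeps the four-instruction NMOS sequence and drops the `tsc`, which is the "pick one in the configuration file" behaviour described above.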

> For example, ldax $c000 and lda $c000, tax perform the same, latter takes
> one memory cell more than first but it WILL work with all processors.
>


Good point, assuming I am writing a very un-modular system that has to
assemble the same way every time. I don't want to. I know my assembler
wants are a little silly-sounding but I know that if I had a custom-built
system with a 65C02, I'd love to have a full-blown OS that would not only
port, but also run in a fully optimized manner. No, 120 cycles is not a
whole lot of gain, but like squeezing gas mileage from an old car, you only
get more out of it by causing a lot of tiny boosts in different places so
the system as a whole is more efficient.

> Of course you are free to do different codes for all processor types if
> you want it. Acme supports binary embedding so you can assembly different
> codes with other tools if Acme does not support them.
>


Where can I find Acme?

>
>> And I need them *in the same assembler*. The luna toolkit looked
>> promising until I saw that it doesn't do 65C02 and 6510 illegal ops. I
>> can't program in C (though I'm great in BASIC ;) so I'm not particularly
>> up to adding this support to luna. Any suggestions?
>>
>> The reasoning behind this is: I'm making that darn C02 OS and I was
>> thinking, "Hey, it would be pretty smart to rewrite some parts of the
>> system in 65C02 ops and take advantage of 65C02/65816 new ops if the
>> user wants to build it that way!" I want the OS to become "portable and
>> optimized" across the entire available 6502 family.
>
> What you shoved in your coding examples you do not seem to code
> efficiently. I am lazy, so I try to optimize everything, even coding. So
> it means that if I can save typing an extra opcode I will do that. Result
> is that code is damn small.
>


"Shoved" has a negative connotation to it and is a little abrasive for
professional discussion. How is my code inefficient? I humbly request that
you produce code for a stock NMOS 6502 and for a 65816 in emulation mode
that performs the same function as what my code does. Please back up your
assertion that I generate sloppy code with proof that you could write my
scheduler's code better. Also, were you referring to my original code or my
optimization hacks that I spent about two minutes total to generate as examples?

> I would do my task scheduler in a way that does not use the stack at all
> on 6502. It is faster than first putting all data to stack, then reading
> it immediately back. Just put the data to better place and read only three


What better place? I push A and X to the stack because I clobber them in
the IRQ handler. PHA/TXA/PHA vs. STA $xx/STX $xx. Zero-page stores
require three cycles and two bytes of space, stack pushes require three
cycles (pulls take four) and *one* byte of space. The only inefficiency here is
the TXA in the middle, which becomes a size/speed tradeoff; I get 8 cycles
and 3 bytes my original way vs. 6 cycles, 4 bytes, and 2 ZP bytes your way.
Keep in mind I am currently using ZP to hold context information and
overusing ZP for the kernel's IRQ stuff will lower the task count limit
here. I will probably move to using a 128-byte chunk of a page somewhere
for contexts (128 bytes will allow 18 tasks to run, and there's no business
running more than that on systems this slow AFAIK).
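
Those two trade-offs are easy to re-check. The sketch below uses the documented NMOS 6502 sizes and timings, and it assumes that one saved context is exactly the seven bytes stored by the irqsav listing (Y, X, A, P, PC low, PC high, SP). The helper name `total` is invented for this example.

```python
# (bytes, cycles) per instruction on an NMOS 6502.
PHA, TXA = (1, 3), (1, 2)
STA_ZP, STX_ZP = (2, 3), (2, 3)

def total(*instructions):
    """Sum (bytes, cycles) over a sequence of instructions."""
    return tuple(sum(column) for column in zip(*instructions))

stack_way = total(PHA, TXA, PHA)         # (3, 8)
zero_page_way = total(STA_ZP, STX_ZP)    # (4, 6), plus 2 bytes of zero page

# Assumption: one context = Y, X, A, P, PC low, PC high, SP.
CONTEXT_BYTES = 7
tasks_in_128_bytes = 128 // CONTEXT_BYTES   # 18, with 2 bytes left over

print(stack_way, zero_page_way, tasks_in_128_bytes)
```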

> bytes from stack- program counter and status. And task swapping must be
> faster than 1/60 secs, and preferably have priorities. A program waiting


How do you propose to make task swapping faster than 1/60 sec on a C64?
Every system that uses this OS will need some timer hooked to the IRQ line
to trigger the task switch, otherwise the system will become co-operative
instead of pre-emptive. I haven't gotten around to priorities yet, and if I
don't want them, I don't have to have them.

> for keyboard does not need attention before it gets a key. So you need to


I know this. It is up to the program to execute a BRK instruction to return
control to the task switcher if it does not want its timeslice. That's my
design. It allows the program to run a check on what it's waiting on and
time out if needed. The other way is to skip a "sleeping" task entirely
until the IRQ handler gets a call from the requested resource, which will
add more bulk to the IRQ handler that runs 60 (200 in your desired system)
times per second. The IRQ handler would have to maintain these sleeping
tasks to watch for deadlocks, etc. etc. That sounds crummy to me, because
this maintenance is being performed many times per second. My focus is *low
overhead multitasking* here, as well as *giving programs more control over
their own behavior*. If the SID is locked and the task gets switched into
more than X times, it can abort the SID request with an error message, and
if it was causing the other task wanting, say, a CIA timer to deadlock with
it, this will remove the deadlock.

> provide also these services, and if you use graphics mode, the graphics
> could be drawn from a "stack" where draw commands are added. And if you


I'm sorry, but it seems you've placed the subject in the middle of a field
and walked very far away from it. I'm not coding this stuff now. Why do
you insist on telling me how to code something I may NEVER EVER decide I
want to code after all?

> swap tasks 1/200 or faster, you will get smoother multitasking.


At a price. By adding 140 more context switches per second, with my current
*optimized and very minimal* scheduler, I'd be putting an additional
overhead of, at a MINIMUM, 22,260 cycles per second in there. The base
cycle count for my context switching IRQ routine is already at 159 cycles,
and every additional task will cause successive context switches until the
task counter is at max to jump 10*(tasknum*2-1) cycles. So if we're running
10 tasks at 200 switches per second, total IRQ handler overhead in cycles
per second will be (922 cycles per 10 task switches * 20 sets of 10
switches) + 159*200 cycles base = 50240 cycles, as compared to 15072 cycles
for the 60 Hz timer approach. Granted, multitasking won't be as smooth, but
then again this is an estimate that assumes every program eats the entire
allocated timeslice and doesn't BRK out to "sleep" and force another context
switch. If a program is not busy, by my design standards, it should BRK to
force context switching so another program can have the rest of its
timeslice, thus providing *very* smooth multitasking when the system is not
under multiple tasks using 100% of their timeslices. Blah, blah, I speak
too much.
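
For anyone who wants to re-run those numbers, here is the same arithmetic as a short Python sketch. The 159-cycle base cost and the 922 cycles per round of 10 task switches are taken from the paragraph above as given, not re-derived from the code; the function and constant names are invented for this example.

```python
BASE_CYCLES = 159          # base cost of one context-switching IRQ (from the post)
CYCLES_PER_ROUND = 922     # extra cycles per round of 10 task switches (from the post)
TASKS = 10

def irq_overhead(switches_per_second):
    """Cycles per second spent in the IRQ handler with 10 busy tasks."""
    rounds = switches_per_second // TASKS
    return rounds * CYCLES_PER_ROUND + switches_per_second * BASE_CYCLES

extra_base_only = (200 - 60) * BASE_CYCLES   # 22260
at_200_hz = irq_overhead(200)                # 50240
at_60_hz = irq_overhead(60)                  # 15072
print(extra_base_only, at_200_hz, at_60_hz)
```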

> Multitasking os is not an easy to write- you must also notice that if you
> have two programs waiting for a key, you must know which program gets it
> when you push a key.
>


I'll provide a facility for foreground tasks. That's not going to be too
difficult, especially since I control the API entirely.

Re: Wanted: Assembler (long post) [message #94990 is a reply to message #94948] Tue, 28 December 2004 00:03
Anton Treuenfels
First, shameless plug:

www.home.earthlink.net/~hxa

That's my HXA assembler, currently v0.10.

"Jody Bruchon" <jbruchon@nc.rr.com> wrote in message
news:hhVzd.3268$aM4.631360@twister.southeast.rr.com...
> I am seeking an assembler that probably does not exist. The specific
> features I need are:
>
> * Support for #ifdef, #include, #define, #undef, #else, #endif
> (preprocessing conditional includes like lupo/luna/lld)
> * Support for 6502, 6510 illegal ops, 65816 (emulation mode) ops, AND
> 65C02 ops
>
> And I need them *in the same assembler*. The luna toolkit looked promising
> until I saw that it doesn't do 65C02 and 6510 illegal ops. I can't program
> in C (though I'm great in BASIC ;) so I'm not particularly up to adding this
> support to luna. Any suggestions?


HXA supports conditional assembly, includes and macros.
HXA supports 6502, 65C02, R65C02 and W65C02S directly. Each successive set
is a superset of the previous.

The 65816 is not directly supported, although it will execute all the
instructions of the supported processors except the bit-manipulation
instructions of the R65C02 and W65C02S.

As for illegal 6502/6510 instructions, have you considered macros? Any
instructions using immediate, implied, absolute or absolute-indexed address
modes are fairly easy to implement this way. Relative instructions are a bit
trickier. Even some 65816-specific instructions could be managed this way
(although they would generally be 16-bit instructions, which you don't want).
You might want to be able to force the processor into 8-bit emulation mode,
though, and that should be an easy macro.

> The reasoning behind this is: I'm making that darn C02 OS and I was
> thinking, "Hey, it would be pretty smart to rewrite some parts of the system
> in 65C02 ops and take advantage of 65C02/65816 new ops if the user wants to
> build it that way!" I want the OS to become "portable and optimized" across
> the entire available 6502 family.


Again I'd suggest macros. Also includes, so early in the source file you'd
have something like:

#if P_6502
#include "p6502.def"
#elseif P_65C02
#include "p65C02.def"
# ...etc...

#endif

> For those who don't want to see lots of assembly code, close this message.
>
> *** INSANELY LENGTHY CODE COMPARISON FOLLOWS ***
>
> Take this code from the scheduler I posted on c.b.cbm:
>
> irq
> pha ; Save A so we won't lose it!
> txa ; X too!
> pha
> ldx task ; Find out what task we're on
> lda #$00 ; Init A for countdown


So there's a macro SAVEAX.

In "p6502.def" it looks like:

.macro SAVEAX
pha
txa
pha
.endm

But in "p65C02.def" it looks like:

.macro SAVEAX
pha
phx
.endm

And your code then looks like:

irq
SAVEAX
ldx task
lda #$00

> That was for a stock 6502/6510 NMOS CPU. 7 bytes, 13 cycles. Rework it for
> the 65C02 and we can get:
>
> irq
> pha ; Save A so we won't lose it!
> phx ; X too!
> ldx task ; Find out what task we're on
> lda #$00 ; Init A for countdown
>
> 6 bytes, 11 cycles. Okay, okay, two cycles ain't a big deal, but that's
> just the init for the IRQ routine, and that's triggering 60 times per
> second at a minimum, so that's 120 cycles per second that can go to
> something else.
>
> How about running on a 65816 in Emulation mode? NMOS has:
>
> irqsav
> tax ; Load index
> sty t1y,x ; Store Y
> pla ; Pull X
> sta t1x,x ; Store X
> pla ; Pull A
> sta t1a,x ; Store A
> pla ; Pull P (IRQ stored)
> sta t1p,x ; Store P
> pla ; Pull PC low byte
> sta t1pc,x ; Store PC low
> pla ; Pull PC high
> sta t1pc+1,x ; Store PC high
> stx temp ; Save index
> tsx ; Get current SP
> txa ; Save SP elsewhere
> ldx temp ; Restore index
> sta t1sp,x ; Save SP
> ldx task ; Load task number
> inx ; Increment task number
> cpx tasks ; Compare with running task counter
> bne irqtsk ; If not at max, proceed normally
> ldx #$01 ; Reset task number to 1
>
> That counts out to 35 bytes and 70 cycles if the task counter hasn't maxed
> out, and 72 cycles if it has. 65816 says:
>
> irqsav
> tax ; Load index
> sty t1y,x ; Store Y
> pla ; Pull X
> sta t1x,x ; Store X
> pla ; Pull A
> sta t1a,x ; Store A
> pla ; Pull P (IRQ stored)
> sta t1p,x ; Store P
> pla ; Pull PC low byte
> sta t1pc,x ; Store PC low
> pla ; Pull PC high
> sta t1pc+1,x ; Store PC high
> tsc ; Get SP
> sta t1sp,x ; Save SP
> ldx task ; Load task number
> inx ; Increment task number
> cpx tasks ; Compare with running task counter
> bne irqtsk ; If not at max, proceed normally
> ldx #$01 ; Reset task number to 1


.macro GETSP ; for 6502
stx temp
tsx
txa
ldx temp
.endm

.macro GETSP ; for 65816 (supported directly)
tsc
.endm

.macro GETSP ; for 65816 (not supported directly)
.byte whatever_the_opcode_is
.endm

Well, you get the idea. Identify code sequences that could benefit from the
instruction set of a higher processor, then replace them with macros that
use those instructions. Select a set of macros to use at assembly time to
generate the different versions.

Of course, you could also replace those identified code sequences with new
instructions directly:

#if P_6502
(one sequence)
#elseif P_65C02
(another sequence)
#...etc...

#endif

It certainly works, but personally it always looks messy to me. Plus, to add
another processor means going through the entire source code and adding new
sequences (and duplicating them if they are used more than once). My
preference simply means coding another macro file (and you can use an
existing one as an easy reference to what needs to be created).

- Anton Treuenfels
Re: Wanted: Assembler (long post) [message #94992 is a reply to message #94990] Tue, 28 December 2004 00:27
Jody Bruchon
Anton Treuenfels wrote:
---snip---
>
> .macro GETSP ; for 6502
> stx temp
> tsx
> txa
> ldx temp
> .endm
>
> .macro GETSP ; for 65816 (supported directly)
> tsc
> .endm
>
> .macro GETSP ; for 65816 (not supported directly)
> .byte whatever_the_opcode_is
> .endm
>
> Well, you get the idea. Identify code sequences that could benefit from the
> instruction set of a higher processor, then replace them with macros that
> use those instructions. Select a set of macros to use at assembly time to
> generate the different versions.
>
> Of course, you could also replace those identified code sequences with new
> instructions directly:
>
> #if P_6502
> (one sequence)
> #elseif P_65C02
> (another sequence)
> #...etc...
>
> #endif
>
> It certainly works, but personally it always looks messy to me. Plus, to add
> another processor means going through the entire source code and adding new
> sequences (and duplicating them if they are used more than once). My
> preference simply means coding another macro file (and you can use an
> existing one as an easy reference to what needs to be created).
>
> - Anton Treuenfels
>


That sheds a little light on the whole operation, though what I actually
have in mind is a little bit more complex. I want to have the source chunks
expanded out in their own arch/<archname> folders and the glue assembly file
be dropped in place via an inclusion directive such that the following...

#ifdef CONFIG_IS_NMOS
#include "arch/nmos/schedirq.a"
#elseif CONFIG_IS_WDC_6502
#include "arch/wdc6502/schedirq.a"
#elseif CONFIG_IS_R_6502
#include "arch/r6502/schedirq.a"
#elseif CONFIG_IS_65816_EMU
#include "arch/65816emu/schedirq.a"
#endif

...will drop the assembly source file in place that is specified by the
configuration file that #defines everything. If #define CONFIG_IS_WDC_6502
is in the configuration file, then the assembler preprocessor drops in the
entirety of the file arch/wdc6502/schedirq.a where all that #directive-mess
is. I don't know for 100% sure if this is how it works in the C/C++ world
but that's how I want it to work here. I think #include is used primarily
for header files that define things like global variables in C, not for
dropping code chunks in place dynamically, but I could be wrong and it
doesn't matter because I'm straying off topic.
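
The selection logic being asked for boils down to "first defined symbol wins". A tiny Python sketch of that rule, using the symbol names and paths from the directive chain above (the table `ARCH_FILES` and the function `select_arch_file` are invented for illustration, and this says nothing about how any particular assembler implements it):

```python
# Ordered like the #ifdef/#elseif chain above: the first match wins.
ARCH_FILES = [
    ("CONFIG_IS_NMOS", "arch/nmos/schedirq.a"),
    ("CONFIG_IS_WDC_6502", "arch/wdc6502/schedirq.a"),
    ("CONFIG_IS_R_6502", "arch/r6502/schedirq.a"),
    ("CONFIG_IS_65816_EMU", "arch/65816emu/schedirq.a"),
]

def select_arch_file(defined):
    """Return the include path for the first defined symbol, or None."""
    for symbol, path in ARCH_FILES:
        if symbol in defined:
            return path
    return None

print(select_arch_file({"CONFIG_IS_WDC_6502"}))
```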

I would like a little bit of reassurance from the people in the group that
my coding for my IRQ handler is as fast as it can be without increasing
memory usage (I remember a suggestion to use bit shifting to cheat on
division by 8 instead of my looping counter and I may provide an alternate
scheduler with this for people who run loads of tasks). I realize that I
could use two more ZP bytes and gain two cycles in one spot I mentioned
earlier, but that would increase ZP usage even more and I want low overhead
in all areas if possible.

I'll check out the assembler you shamelessly plugged. As usual, I am very
thankful to everyone who gives me input. (Let's hope I don't commit a parse
error though! :)

Jody
Re: Wanted: Assembler (long post) [message #94999 is a reply to message #94992] Tue, 28 December 2004 02:02
Payton Byrd
Jody Bruchon wrote:

> That sheds a little light on the whole operation, though what I
> actually have in mind is a little bit more complex. I want to have
> the source chunks expanded out in their own arch/<archname> folders
> and the glue assembly file be dropped in place via an inclusion
> directive such that the following...
>
> #ifdef CONFIG_IS_NMOS
> #include "arch/nmos/schedirq.a"
> #elseif CONFIG_IS_WDC_6502
> #include "arch/wdc6502/schedirq.a"
> #elseif CONFIG_IS_R_6502
> #include "arch/r6502/schedirq.a"
> #elseif CONFIG_IS_65816_EMU
> #include "arch/65816emu/schedirq.a"
> #endif
>
> ...will drop the assembly source file in place that is specified by
> the configuration file that #defines everything. If #define
> CONFIG_IS_WDC_6502 is in the configuration file, then the assembler
> preprocessor drops in the entirety of the file arch/wdc6502/schedirq.a
> where all that #directive-mess is. I don't know for 100% sure if this
> is how it works in the C/C++ world but that's how I want it to work
> here. I think #include is used primarily for header files that define
> things like global variables in C, not for dropping code chunks in
> place dynamically, but I could be wrong and it doesn't matter because
> I'm straying off topic.
>
> I would like a little bit of reassurance from the people in the group
> that my coding for my IRQ handler is as fast as it can be without
> increasing memory usage (I remember a suggestion to use bit shifting
> to cheat on division by 8 instead of my looping counter and I may
> provide an alternate scheduler with this for people who run loads of
> tasks). I realize that I could use two more ZP bytes and gain two
> cycles in one spot I mentioned earlier, but that would increase ZP
> usage even more and I want low overhead in all areas if possible.
>
> I'll check out the assembler you shamelessly plugged. As usual, I am
> very thankful to everyone who gives me input. (Let's hope I don't
> commit a parse error though! :)
>
> Jody


Does CA65 not do this already? I couldn't imagine that it would be
difficult for Uz to add if it doesn't. This is exactly how it would be
done in C and since CA65 is the support assembler for CC65 I would
imagine it would be a requirement.
Re: Wanted: Assembler (long post) [message #95000 is a reply to message #94999] Tue, 28 December 2004 02:06
Jody Bruchon
Payton Byrd wrote:

---snip---

> Does CA65 not do this already? I couldn't imagine that it would be
> difficult for Uz to add if it doesn't. This is exactly how it would be
> done in C and since CA65 is the support assembler for CC65 I would
> imagine it would be a requirement.


Does CC65 use a preprocessor or is CA65 a pp+asm in one?

Jody
Re: Wanted: Assembler (long post) [message #95044 is a reply to message #94948] Tue, 28 December 2004 11:46
last_ninja
Jody Bruchon wrote:
> I am seeking an assembler that probably does not exist. The specific
> features I need are:
>
> * Support for #ifdef, #include, #define, #undef, #else, #endif
> (preprocessing conditional includes like lupo/luna/lld)
> * Support for 6502, 6510 illegal ops, 65816 (emulation mode) ops, AND 65C02 ops
>
> And I need them *in the same assembler*. The luna toolkit looked promising
> until I saw that it doesn't do 65C02 and 6510 illegal ops. I can't program
> in C (though I'm great in BASIC ;) so I'm not particularly up to adding this
> support to luna. Any suggestions?


There is a cross-assembler called MXASS that I believe supports all
those processors and might have conditional support as well? You should
be able to find it with google if you want to check it out.

-Chad
Re: Wanted: Assembler (long post) [message #95049 is a reply to message #94999] Tue, 28 December 2004 13:42
uz
Payton Byrd <plbyrd@bellsouth.net> wrote:
> Does CA65 not do this already?


ca65 has all of the requested features. However, the syntax and the concepts
are somewhat different, which is why I didn't answer the post in the first
place.

The current development version offers support for all the requested CPUs,
including a 65X02 CPU which is a 6502 plus illegal opcodes. There's even
support for the sweet16 pseudo CPU in case someone wants to extend the list:-)

There's conditional assembly, but the directives are named differently.
There's a .define directive, but it does something different (and quite
complex), so its use is strongly discouraged for "normal" code. For
conditional assembly, standard symbols are used (these symbols can also be
assigned on the command line).
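
A minimal sketch of how the IRQ entry from the original post might look with
ca65-style conditional assembly. This is an illustration, not code from the
thread: the symbol CONFIG_IS_WDC_6502 is borrowed from Jody's post, the
zero-page address used for task is a made-up placeholder, and the file name
in the comment is hypothetical.

```
; Assumed invocation (hypothetical file name):
;   ca65 -D CONFIG_IS_WDC_6502 sched.s     ; 65C02 build
;   ca65 sched.s                           ; plain NMOS 6502 build

.ifdef CONFIG_IS_WDC_6502
        .setcpu "65C02"         ; enable 65C02 opcodes such as PHX
.endif

task = $02                      ; hypothetical zero-page location

irq:    pha                     ; save A
.ifdef CONFIG_IS_WDC_6502
        phx                     ; 65C02: push X directly
.else
        txa                     ; NMOS 6502: push X by way of A
        pha
.endif
        ldx task                ; find out what task we're on
        lda #$00                ; init A for countdown
```

The symbol only needs to be defined, not given a particular value, for the
.ifdef branch to be taken. The output is still an object file that has to go
through the linker.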

ca65 also differs in a few other aspects from "normal" assemblers, so some
people don't like it. The development suite (which the assembler is part of)
is more like "big" PC development suites. For example, the assembler outputs
relocatable object code, which must be run through a linker to generate raw
binary data. This is good for big projects, or programs written partly in high
level languages, but may be overkill for a small program with just one hundred
bytes of code. Especially if you have to learn all the details about the
assembler and linker before use. ca65 is quite powerful, but with power comes
complexity, so it's not an assembler for everyone (as I've learned:-).

And last but not least, it's the development version that supports all this.
The current stable version (2.10.1) has all the features with the exception of
6502 illegal opcodes. This means that one of the snapshots has to be used.
They are usually quite stable, but your mileage may vary.

Regards


Uz


--
Ullrich von Bassewitz uz@spamtrap.musoftware.de
Re: Wanted: Assembler (long post) [message #95363 is a reply to message #95044] Thu, 30 December 2004 02:46
Jody Bruchon
last_ninja wrote:

> There is a cross-assembler called MXASS that I believe supports all
> those processors and might have conditional support as well? You should
> be able to find it with google if you want to check it out.
>
> -Chad


I got MXASS and when I run it on my W2K system it crashes out with a runtime
error, so that's no good :(

Jody