CPU - 6502

Parent Previous Next

#

# $Id: 6502_cpu.txt,v 1.1.1.1 2004/08/29 01:29:35 bryan Exp $

#

# This file is part of Commodore 64 emulator

#      and Program Development System.

#

# See README for copyright notice

#

# This file contains documentation for 6502/6510/8500/8502 instruction set.

#

#

# Written by

#   John West       (john@ucc.gu.uwa.edu.au)

#   Marko MЉkelЉ    (msmakela@kruuna.helsinki.fi)

#

#

# $Log: 6502_cpu.txt,v $

# Revision 1.1.1.1  2004/08/29 01:29:35  bryan

# no message

#

# Revision 1.1  2002/05/21 00:42:27  xodnizel

# updates

#

# Revision 1.8  1994/06/03  19:50:04  jopi

# Patchlevel 2

#

# Revision 1.7  1994/04/15  13:07:04  jopi

# 65xx Register descriptions added

#

# Revision 1.6  1994/02/18  16:09:36  jopi

#

# Revision 1.5  1994/01/26  16:08:37  jopi

# X64 version 0.2 PL 1

#

# Revision 1.4  1993/11/10  01:55:34  jopi

#

# Revision 1.3  93/06/21  13:37:18  jopi

#  X64 version 0.2 PL 0

#

# Revision 1.2  93/06/21  13:07:15  jopi

# *** empty log message ***

#

#


Note: To extract the uuencoded ML programs in this article most

      easily you may use e.g. "uud" by Edwin Kremer ,

      which extracts them all at once.



Documentation for the NMOS 65xx/85xx Instruction Set


       6510 Instructions by Addressing Modes

       6502 Registers

       6510/8502 Undocumented Commands

       Register selection for load and store

       Decimal mode in NMOS 6500 series

       6510 features

       Different CPU types

       6510 Instruction Timing

       How Real Programmers Acknowledge Interrupts

       Memory Management

       Autostart Code

       Notes

       References



6510 Instructions by Addressing Modes


off- ++++++++++ Positive ++++++++++  ---------- Negative ----------

set  00      20      40      60      80      a0      c0      e0      mode


+00  BRK     JSR     RTI     RTS     NOP*    LDY     CPY     CPX     Impl/immed

+01  ORA     AND     EOR     ADC     STA     LDA     CMP     SBC     (indir,x)

+02   t       t       t       t      NOP*t   LDX     NOP*t   NOP*t     ? /immed

+03  SLO*    RLA*    SRE*    RRA*    SAX*    LAX*    DCP*    ISB*    (indir,x)

+04  NOP*    BIT     NOP*    NOP*    STY     LDY     CPY     CPX     Zeropage

+05  ORA     AND     EOR     ADC     STA     LDA     CMP     SBC     Zeropage

+06  ASL     ROL     LSR     ROR     STX     LDX     DEC     INC     Zeropage

+07  SLO*    RLA*    SRE*    RRA*    SAX*    LAX*    DCP*    ISB*    Zeropage


+08  PHP     PLP     PHA     PLA     DEY     TAY     INY     INX     Implied

+09  ORA     AND     EOR     ADC     NOP*    LDA     CMP     SBC     Immediate

+0a  ASL     ROL     LSR     ROR     TXA     TAX     DEX     NOP     Accu/impl

+0b  ANC**   ANC**   ASR**   ARR**   ANE**   LXA**   SBX**   SBC*    Immediate

+0c  NOP*    BIT     JMP     JMP ()  STY     LDY     CPY     CPX     Absolute

+0d  ORA     AND     EOR     ADC     STA     LDA     CMP     SBC     Absolute

+0e  ASL     ROL     LSR     ROR     STX     LDX     DEC     INC     Absolute

+0f  SLO*    RLA*    SRE*    RRA*    SAX*    LAX*    DCP*    ISB*    Absolute


+10  BPL     BMI     BVC     BVS     BCC     BCS     BNE     BEQ     Relative

+11  ORA     AND     EOR     ADC     STA     LDA     CMP     SBC     (indir),y

+12   t       t       t       t       t       t       t       t         ?

+13  SLO*    RLA*    SRE*    RRA*    SHA**   LAX*    DCP*    ISB*    (indir),y

+14  NOP*    NOP*    NOP*    NOP*    STY     LDY     NOP*    NOP*    Zeropage,x

+15  ORA     AND     EOR     ADC     STA     LDA     CMP     SBC     Zeropage,x

+16  ASL     ROL     LSR     ROR     STX  y) LDX  y) DEC     INC     Zeropage,x

+17  SLO*    RLA*    SRE*    RRA*    SAX* y) LAX* y) DCP*    ISB*    Zeropage,x


+18  CLC     SEC     CLI     SEI     TYA     CLV     CLD     SED     Implied

+19  ORA     AND     EOR     ADC     STA     LDA     CMP     SBC     Absolute,y

+1a  NOP*    NOP*    NOP*    NOP*    TXS     TSX     NOP*    NOP*    Implied

+1b  SLO*    RLA*    SRE*    RRA*    SHS**   LAS**   DCP*    ISB*    Absolute,y

+1c  NOP*    NOP*    NOP*    NOP*    SHY**   LDY     NOP*    NOP*    Absolute,x

+1d  ORA     AND     EOR     ADC     STA     LDA     CMP     SBC     Absolute,x

+1e  ASL     ROL     LSR     ROR     SHX**y) LDX  y) DEC     INC     Absolute,x

+1f  SLO*    RLA*    SRE*    RRA*    SHA**y) LAX* y) DCP*    ISB*    Absolute,x


       ROR intruction is available on MC650x microprocessors after

       June, 1976.


       Legend:


       t       Jams the machine

       *t      Jams very rarely

       *       Undocumented command

       **      Unusual operation

       y)      indexed using Y instead of X

       ()      indirect instead of absolute


Note that the NOP instructions do have other addressing modes than the

implied addressing. The NOP instruction is just like any other load

instruction, except it does not store the result anywhere nor affects the

flags.


6502 Registers


The NMOS 65xx processors are not ruined with too many registers. In addition

to that, the registers are mostly 8-bit. Here is a brief description of each

register:


    PC Program Counter

         This register points the address from which the next instruction

         byte (opcode or parameter) will be fetched. Unlike other

         registers, this one is 16 bits in length. The low and high 8-bit

         halves of the register are called PCL and PCH, respectively. The

         Program Counter may be read by pushing its value on the stack.

         This can be done either by jumping to a subroutine or by causing

         an interrupt.

    S Stack pointer

         The NMOS 65xx processors have 256 bytes of stack memory, ranging

         from $0100 to $01FF. The S register is a 8-bit offset to the stack

         page. In other words, whenever anything is being pushed on the

         stack, it will be stored to the address $0100+S.


         The Stack pointer can be read and written by transfering its value

         to or from the index register X (see below) with the TSX and TXS

         instructions.

    P Processor status

         This 8-bit register stores the state of the processor. The bits in

         this register are called flags. Most of the flags have something

         to do with arithmetic operations.


         The P register can be read by pushing it on the stack (with PHP or

         by causing an interrupt). If you only need to read one flag, you

         can use the branch instructions. Setting the flags is possible by

         pulling the P register from stack or by using the flag set or

         clear instructions.


         Following is a list of the flags, starting from the 8th bit of the

         P register (bit 7, value $80):

              N Negative flag

                   This flag will be set after any arithmetic operations

                   (when any of the registers A, X or Y is being loaded

                   with a value). Generally, the N flag will be copied from

                   the topmost bit of the register being loaded.


                   Note that TXS (Transfer X to S) is not an arithmetic

                   operation. Also note that the BIT instruction affects

                   the Negative flag just like arithmetic operations.

                   Finally, the Negative flag behaves differently in

                   Decimal operations (see description below).

              V oVerflow flag

                   Like the Negative flag, this flag is intended to be used

                   with 8-bit signed integer numbers. The flag will be

                   affected by addition and subtraction, the instructions

                   PLP, CLV and BIT, and the hardware signal -SO. Note that

                   there is no SEV instruction, even though the MOS

                   engineers loved to use East European abbreviations, like

                   DDR (Deutsche Demokratische Republik vs. Data Direction

                   Register). (The Russian abbreviation for their former

                   trade association COMECON is SEV.) The -SO (Set

                   Overflow) signal is available on some processors, at

                   least the 6502, to set the V flag. This enables response

                   to an I/O activity in equal or less than three clock

                   cycles when using a BVC instruction branching to itself

                   ($50 $FE).


                   The CLV instruction clears the V flag, and the PLP and

                   BIT instructions copy the flag value from the bit 6 of

                   the topmost stack entry or from memory.


                   After a binary addition or subtraction, the V flag will

                   be set on a sign overflow, cleared otherwise. What is a

                   sign overflow? For instance, if you are trying to add

                   123 and 45 together, the result (168) does not fit in a

                   8-bit signed integer (upper limit 127 and lower limit

                   -128). Similarly, adding -123 to -45 causes the

                   overflow, just like subtracting -45 from 123 or 123 from

                   -45 would do.


                   Like the N flag, the V flag will not be set as expected

                   in the Decimal mode. Later in this document is a precise

                   operation description.


                   A common misbelief is that the V flag could only be set

                   by arithmetic operations, not cleared.

              1 unused flag

                   To the current knowledge, this flag is always 1.

              B Break flag

                   This flag is used to distinguish software (BRK)

                   interrupts from hardware interrupts (IRQ or NMI). The B

                   flag is always set except when the P register is being

                   pushed on stack when jumping to an interrupt routine to

                   process only a hardware interrupt.


                   The official NMOS 65xx documentation claims that the BRK

                   instruction could only cause a jump to the IRQ vector

                   ($FFFE). However, if an NMI interrupt occurs while

                   executing a BRK instruction, the processor will jump to

                   the NMI vector ($FFFA), and the P register will be

                   pushed on the stack with the B flag set.

              D Decimal mode flag

                   This flag is used to select the (Binary Coded) Decimal

                   mode for addition and subtraction. In most applications,

                   the flag is zero.


                   The Decimal mode has many oddities, and it operates

                   differently on CMOS processors. See the description of

                   the ADC, SBC and ARR instructions below.

              I Interrupt disable flag

                   This flag can be used to prevent the processor from

                   jumping to the IRQ handler vector ($FFFE) whenever the

                   hardware line -IRQ is active. The flag will be

                   automatically set after taking an interrupt, so that the

                   processor would not keep jumping to the interrupt

                   routine if the -IRQ signal remains low for several clock

                   cycles.

              Z Zero flag

                   The Zero flag will be affected in the same cases than

                   the Negative flag. Generally, it will be set if an

                   arithmetic register is being loaded with the value zero,

                   and cleared otherwise. The flag will behave differently

                   in Decimal operations.

              C Carry flag

                   This flag is used in additions, subtractions,

                   comparisons and bit rotations. In additions and

                   subtractions, it acts as a 9th bit and lets you to chain

                   operations to calculate with bigger than 8-bit numbers.

                   When subtracting, the Carry flag is the negative of

                   Borrow: if an overflow occurs, the flag will be clear,

                   otherwise set. Comparisons are a special case of

                   subtraction: they assume Carry flag set and Decimal flag

                   clear, and do not store the result of the subtraction

                   anywhere.


                   There are four kinds of bit rotations. All of them store

                   the bit that is being rotated off to the Carry flag. The

                   left shifting instructions are ROL and ASL. ROL copies

                   the initial Carry flag to the lowmost bit of the byte;

                   ASL always clears it. Similarly, the ROR and LSR

                   instructions shift to the right.

    A Accumulator

         The accumulator is the main register for arithmetic and logic

         operations. Unlike the index registers X and Y, it has a direct

         connection to the Arithmetic and Logic Unit (ALU). This is why

         many operations are only available for the accumulator, not the

         index registers.

    X Index register X

         This is the main register for addressing data with indices. It has

         a special addressing mode, indexed indirect, which lets you to

         have a vector table on the zero page.

    Y Index register Y

         The Y register has the least operations available. On the other

         hand, only it has the indirect indexed addressing mode that

         enables access to any memory place without having to use

         self-modifying code.


6510/8502 Undocumented Commands


-- A brief explanation about what may happen while using don't care states.


       ANE $8B         A = (A | #$EE) & X & #byte

                       same as

                       A = ((A & #$11 & X) | ( #$EE & X)) & #byte


                       In real 6510/8502 the internal parameter #$11

                       may occasionally be #$10, #$01 or even #$00.

                       This occurs when the video chip starts DMA

                       between the opcode fetch and the parameter fetch

                       of the instruction.  The value probably depends

                       on the data that was left on the bus by the VIC-II.


       LXA $AB         C=Lehti:   A = X = ANE

                       Alternate: A = X = (A & #byte)


                       TXA and TAX have to be responsible for these.


       SHA $93,$9F     Store (A & X & (ADDR_HI + 1))

       SHX $9E         Store (X & (ADDR_HI + 1))

       SHY $9C         Store (Y & (ADDR_HI + 1))

       SHS $9B         SHA and TXS, where X is replaced by (A & X).


                       Note: The value to be stored is copied also

                       to ADDR_HI if page boundary is crossed.


       SBX $CB         Carry and Decimal flags are ignored but the

                       Carry flag will be set in substraction. This

                       is due to the CMP command, which is executed

                       instead of the real SBC.


       ARR $6B         This instruction first performs an AND

                       between the accumulator and the immediate

                       parameter, then it shifts the accumulator to

                       the right. However, this is not the whole

                       truth. See the description below.


Many undocumented commands do not use AND between registers, the CPU

just throws the bytes to a bus simultaneously and lets the

open-collector drivers perform the AND. I.e. the command called 'SAX',

which is in the STORE section (opcodes $A0...$BF), stores the result

of (A & X) by this way.


More fortunate is its opposite, 'LAX' which just loads a byte

simultaneously into both A and X.


       $6B  ARR


This instruction seems to be a harmless combination of AND and ROR at

first sight, but it turns out that it affects the V flag and also has

a special kind of decimal mode. This is because the instruction has

inherited some properties of the ADC instruction ($69) in addition to

the ROR ($6A).


In Binary mode (D flag clear), the instruction effectively does an AND

between the accumulator and the immediate parameter, and then shifts

the accumulator to the right, copying the C flag to the 8th bit. It

sets the Negative and Zero flags just like the ROR would. The ADC code

shows up in the Carry and oVerflow flags. The C flag will be copied

from the bit 6 of the result (which doesn't seem too logical), and the

V flag is the result of an Exclusive OR operation between the bit 6

and the bit 5 of the result.  This makes sense, since the V flag will

be normally set by an Exclusive OR, too.


In Decimal mode (D flag set), the ARR instruction first performs the

AND and ROR, just like in Binary mode. The N flag will be copied from

the initial C flag, and the Z flag will be set according to the ROR

result, as expected. The V flag will be set if the bit 6 of the

accumulator changed its state between the AND and the ROR, cleared

otherwise.


Now comes the funny part. If the low nybble of the AND result,

incremented by its lowmost bit, is greater than 5, the low nybble in

the ROR result will be incremented by 6. The low nybble may overflow

as a consequence of this BCD fixup, but the high nybble won't be

adjusted. The high nybble will be BCD fixed in a similar way. If the

high nybble of the AND result, incremented by its lowmost bit, is

greater than 5, the high nybble in the ROR result will be incremented

by 6, and the Carry flag will be set. Otherwise the C flag will be

cleared.


To help you understand this description, here is a C routine that

illustrates the ARR operation in Decimal mode:


       unsigned

          A,  /* Accumulator */

          AL, /* low nybble of accumulator */

          AH, /* high nybble of accumulator */


          C,  /* Carry flag */

          Z,  /* Zero flag */

          V,  /* oVerflow flag */

          N,  /* Negative flag */


          t,  /* temporary value */

          s;  /* value to be ARRed with Accumulator */


       t = A & s;                      /* Perform the AND. */


       AH = t >> 4;                    /* Separate the high */

       AL = t & 15;                    /* and low nybbles. */


       N = C;                          /* Set the N and */

       Z = !(A = (t >> 1) | (C << 7)); /* Z flags traditionally */

       V = (t ^ A) & 64;               /* and V flag in a weird way. */


       if (AL + (AL & 1) > 5)          /* BCD "fixup" for low nybble. */

         A = (A & 0xF0) | ((A + 6) & 0xF);


       if (C = AH + (AH & 1) > 5)      /* Set the Carry flag. */

         A = (A + 0x60) & 0xFF;        /* BCD "fixup" for high nybble. */


       $CB  SBX   X <- (A & X) - Immediate


The 'SBX' ($CB) may seem to be very complex operation, even though it

is a combination of the subtraction of accumulator and parameter, as

in the 'CMP' instruction, and the command 'DEX'. As a result, both A

and X are connected to ALU but only the subtraction takes place. Since

the comparison logic was used, the result of subtraction should be

normally ignored, but the 'DEX' now happily stores to X the value of

(A & X) - Immediate.  That is why this instruction does not have any

decimal mode, and it does not affect the V flag. Also Carry flag will

be ignored in the subtraction but set according to the result.


Proof:


begin 644 vsbx

M`0@9$,D'GL(H-#,IJC(U-JS"*#0T*:HR-@```*D`H#V1*Z`_D2N@09$KJ0>%

M^QBE^VEZJ+$KH#F1*ZD`2"BI`*(`RP`(:-B@.5$K*4#P`E@`H#VQ*SAI`)$K

JD-Z@/[$K:0"1*Y#4J2X@TO\XH$&Q*VD`D2N0Q,;[$+188/_^]_:_OK>V

`

end


and


begin 644 sbx

M`0@9$,D'GL(H-#,IJC(U-JS"*#0T*:HR-@```'BI`*!-D2N@3Y$KH%&1*ZD#

MA?L8I?M*2)`#J1@LJ3B@29$K:$J0`ZGX+*G8R)$K&/BXJ?2B8\L)AOP(:(7]

MV#B@3;$KH$\Q*Z!1\2L(1?SP`0!H1?TIM]#XH$VQ*SAI`)$KD,N@3[$K:0"1

9*Y#!J2X@TO\XH%&Q*VD`D2N0L<;[$))88-#X

`

end


These test programs show if your machine is compatible with ours

regarding the opcode $CB. The first test, vsbx, proves that SBX does

not affect the V flag. The latter one, sbx, proves the rest of our

theory. The vsbx test tests 33554432 SBX combinations (16777216

different A, X and Immediate combinations, and two different V flag

states), and the sbx test doubles that amount (16777216*4 D and C flag

combinations). Both tests have run successfully on a C64 and a Vic20.

They ought to run on C16, +4 and the PET series as well. The tests

stop with BRK, if the opcode $CB does not work as expected. Successful

operation ends in RTS. As the tests are very slow, they print dots on

the screen while running so that you know that the machine has not

jammed. On computers running at 1 MHz, the first test prints

approximately one dot every four seconds and a total of 2048 dots,

whereas the second one prints half that amount, one dot every seven

seconds.


If the tests fail on your machine, please let us know your processor's

part number and revision. If possible, save the executable (after it

has stopped with BRK) under another name and send it to us so that we

know at which stage the program stopped.


The following program is a Commodore 64 executable that Marko M"akel"a

developed when trying to find out how the V flag is affected by SBX.

(It was believed that the SBX affects the flag in a weird way, and

this program shows how SBX sets the flag differently from SBC.)  You

may find the subroutine at $C150 useful when researching other

undocumented instructions' flags. Run the program in a machine

language monitor, as it makes use of the BRK instruction. The result

tables will be written on pages $C2 and $C3.


begin 644 sbx-c100

M`,%XH`",#L&,$,&,$L&XJ8*B@LL7AOL(:(7\N#BM#L$M$,'M$L$(Q?OP`B@`

M:$7\\`,@4,'N#L'0U.X0P=#/SB#0[A+!T,<``````````````)BJ\!>M#L$M

L$,'=_\'0":T2P=W_PM`!8,K0Z:T.P2T0P9D`PID`!*T2P9D`PYD`!


Other undocumented instructions usually cause two preceding opcodes

being executed. However 'NOP' seems to completely disappear from 'SBC'

code $EB.


The most difficult to comprehend are the rest of the instructions

located on the '$0B' line.


All the instructions located at the positive (left) side of this line

should rotate either memory or the accumulator, but the addressing

mode turns out to be immediate! No problem. Just read the operand, let

it be ANDed with the accumulator and finally use accumulator

addressing mode for the instructions above them.


RELIGION_MODE_ON

/* This part of the document is not accurate.  You can

  read it as a fairy tale, but do not count on it when

  performing your own measurements. */


The rest two instructions on the same line, called 'ANE' and 'LXA'

($8B and $AB respectively) often give quite unpredictable results.

However, the most usual operation is to store ((A | #$ee) & X & #$nn)

to accumulator. Note that this does not work reliably in a real 64!

In the Commodore 128 the opcode $8B uses values 8C, CC, EE, and

occasionally 0C and 8E for the OR instead of EE,EF,FE and FF used in

the C64. With a C128 running at 2 MHz #$EE is always used.  Opcode $AB

does not cause this OR taking place on 8502 while 6510 always performs

it. Note that this behaviour depends on processor and/or video chip

revision.


Let's take a closer look at $8B (6510).


       A <- X & D & (A | VAL)


       where VAL comes from this table:


      X high   D high  D low   VAL

       even     even    ---    $EE (1)

       even     odd     ---    $EE

       odd      even    ---    $EE

       odd      odd      0     $EE

       odd      odd     not 0  $FE (2)


(1) If the bottom 2 bits of A are both 1, then the LSB of the result may

   be 0. The values of X and D are different every time I run the test.

   This appears to be very rare.

(2) VAL is $FE most of the time. Sometimes it is $EE - it seems to be random,

   not related to any of the data. This is much more common than (1).


 In decimal mode, VAL is usually $FE.


Two different functions have been discovered for LAX, opcode $AB. One

is A = X = ANE (see above) and the other, encountered with 6510 and

8502, is less complicated A = X = (A & #byte). However, according to

what is reported, the version altering only the lowest bits of each

nybble seems to be more common.


What happens, is that $AB loads a value into both A and X, ANDing the

low bit of each nybble with the corresponding bit of the old

A. However, there are exceptions. Sometimes the low bit is cleared

even when A contains a '1', and sometimes other bits are cleared. The

exceptions seem random (they change every time I run the test). Oops -

that was in decimal mode. Much the same with D=0.


What causes the randomness?  Probably it is that it is marginal logic

levels - when too much wired-anding goes on, some of the signals get

very close to the threshold. Perhaps we're seeing some of them step

over it. The low bit of each nybble is special, since it has to cope

with carry differently (remember decimal mode). We never see a '0'

turn into a '1'.


Since these instructions are unpredictable, they should not be used.


There is still very strange instruction left, the one named SHA/X/Y,

which is the only one with only indexed addressing modes. Actually,

the commands 'SHA', 'SHX' and 'SHY' are generated by the indexing

algorithm.


While using indexed addressing, effective address for page boundary

crossing is calculated as soon as possible so it does not slow down

operation. As a result, in the case of SHA/X/Y, the address and data

are processed at the same time making AND between them to take place.

Thus, the value to be stored by SAX, for example, is in fact (A & X &

(ADDR_HI + 1)).  On page boundary crossing the same value is copied

also to high byte of the effective address.


RELIGION_MODE_OFF



Register selection for load and store


  bit1 bit0     A  X  Y

   0    0             x

   0    1          x

   1    0       x

   1    1       x  x


So, A and X are selected by bits 1 and 0 respectively, while

~(bit1|bit0) enables Y.


Indexing is determined by bit4, even in relative addressing mode,

which is one kind of indexing.


Lines containing opcodes xxx000x1 (01 and 03) are treated as absolute

after the effective address has been loaded into CPU.


Zeropage,y and Absolute,y (codes 10x1 x11x) are distinquished by bit5.



Decimal mode in NMOS 6500 series


 Most sources claim that the NMOS 6500 series sets the N, V and Z

flags unpredictably when performing addition or subtraction in decimal

mode. Of course, this is not true. While testing how the flags are

set, I also wanted to see what happens if you use illegal BCD values.


 ADC works in Decimal mode in a quite complicated way. It is amazing

how it can do that all in a single cycle. Here's a C code version of

the instruction:


       unsigned

          A,  /* Accumulator */

          AL, /* low nybble of accumulator */

          AH, /* high nybble of accumulator */


          C,  /* Carry flag */

          Z,  /* Zero flag */

          V,  /* oVerflow flag */

          N,  /* Negative flag */


          s;  /* value to be added to Accumulator */


       AL = (A & 15) + (s & 15) + C;         /* Calculate the lower nybble. */


       AH = (A >> 4) + (s >> 4) + (AL > 15); /* Calculate the upper nybble. */


       if (AL > 9) AL += 6;                  /* BCD fixup for lower nybble. */


       Z = ((A + s + C) & 255 != 0);         /* Zero flag is set just

                                                like in Binary mode. */


       /* Negative and Overflow flags are set with the same logic than in

          Binary mode, but after fixing the lower nybble. */


       N = (AH & 8 != 0);

       V = ((AH << 4) ^ A) & 128 && !((A ^ s) & 128);


       if (AH > 9) AH += 6;                  /* BCD fixup for upper nybble. */


       /* Carry is the only flag set after fixing the result. */


       C = (AH > 15);

       A = ((AH << 4) | (AL & 15)) & 255;


 The C flag is set as the quiche eaters expect, but the N and V flags

are set after fixing the lower nybble but before fixing the upper one.

They use the same logic than binary mode ADC. The Z flag is set before

any BCD fixup, so the D flag does not have any influence on it.


Proof: The following test program tests all 131072 ADC combinations in

      Decimal mode, and aborts with BRK if anything breaks this theory.

      If everything goes well, it ends in RTS.


begin 600 dadc

M 0@9",D'GL(H-#,IJC(U-JS"*#0T*:HR-@   'BI&*  A/N$_$B@+)$KH(V1

M*Q@(I?PI#X7]I?LI#V7]R0J0 FD%J"D/A?VE^RGP9?PI\ C $) ":0^JL @H

ML ?)H) &""@X:5\X!?V%_0AH*3W@ ! ""8"HBD7[$ JE^T7\, 28"4"H**7[

M9?S0!)@) J@8N/BE^V7\V A%_= G:(3]1?W0(.;[T(?F_-"#:$D8\ )88*D=

0&&4KA?NI &4LA?RI.&S[  A%


end


 All programs in this chapter have been successfully tested on a Vic20

and a Commodore 64 and a Commodore 128D in C64 mode. They should run on

C16, +4 and on the PET series as well. If not, please report the problem

to Marko M"akel"a. Each test in this chapter should run in less than a

minute at 1 MHz.


SBC is much easier. Just like CMP, its flags are not affected by

the D flag.


Proof:


begin 600 dsbc-cmp-flags

M 0@9",D'GL(H-#,IJC(U-JS"*#0T*:HR-@   'B@ (3[A/RB XH8:66HL2N@

M09$KH$R1*XII::BQ*Z!%D2N@4)$K^#BXI?OE_-@(:(7].+BE^^7\"&A%_? !

5 .;[T./F_-#?RA"_8!@X&#CEY<7%


end


 The only difference in SBC's operation in decimal mode from binary mode

is the result-fixup:


       unsigned

          A,  /* Accumulator */

          AL, /* low nybble of accumulator */

          AH, /* high nybble of accumulator */


          C,  /* Carry flag */

          Z,  /* Zero flag */

          V,  /* oVerflow flag */

          N,  /* Negative flag */


          s;  /* value to be added to Accumulator */


       AL = (A & 15) - (s & 15) - !C;        /* Calculate the lower nybble. */


       if (AL & 16) AL -= 6;                 /* BCD fixup for lower nybble. */


       AH = (A >> 4) - (s >> 4) - (AL & 16); /* Calculate the upper nybble. */


       if (AH & 16) AH -= 6;                 /* BCD fixup for upper nybble. */


       /* The flags are set just like in Binary mode. */


       C = (A - s - !C) & 256 != 0;

       Z = (A - s - !C) & 255 != 0;

       V = ((A - s - !C) ^ s) & 128 && (A ^ s) & 128;

       N = (A - s - !C) & 128 != 0;


       A = ((AH << 4) | (AL & 15)) & 255;


 Again Z flag is set before any BCD fixup. The N and V flags are set

at any time before fixing the high nybble. The C flag may be set in any

phase.


 Decimal subtraction is easier than decimal addition, as you have to

make the BCD fixup only when a nybble overflows. In decimal addition,

you had to verify if the nybble was greater than 9. The processor has

an internal "half carry" flag for the lower nybble, used to trigger

the BCD fixup. When calculating with legal BCD values, the lower nybble

cannot overflow again when fixing it.

So, the processor does not handle overflows while performing the fixup.

Similarly, the BCD fixup occurs in the high nybble only if the value

overflows, i.e. when the C flag will be cleared.


 Because SBC's flags are not affected by the Decimal mode flag, you

could guess that CMP uses the SBC logic, only setting the C flag

first. But the SBX instruction shows that CMP also temporarily clears

the D flag, although it is totally unnecessary.


 The following program, which tests SBC's result and flags,

contains the 6502 version of the pseudo code example above.


begin 600 dsbc

M 0@9",D'GL(H-#,IJC(U-JS"*#0T*:HR-@   'BI&*  A/N$_$B@+)$KH':1

M*S@(I?PI#X7]I?LI#^7]L /I!1@I#ZBE_"GPA?VE^RGP"#CE_2GPL KI7RBP

M#ND/.+ )*+ &Z0^P NE?A/T%_87]*+BE^^7\"&BH.+CXI?OE_-@(1?W0FVB$

8_47]T)3F^]">YOS0FFA)&- $J3C0B%A@


end


 Obviously the undocumented instructions RRA (ROR+ADC) and ISB

(INC+SBC) have inherited also the decimal operation from the official

instructions ADC and SBC. The program droradc proves this statement

for ROR, and the dincsbc test proves this for ISB. Finally,

dincsbc-deccmp proves that ISB's and DCP's (DEC+CMP) flags are not

affected by the D flag.


begin 644 droradc

M`0@9",D'GL(H-#,IJC(U-JS"*#0T*:HR-@```'BI&*``A/N$_$B@+)$KH(V1

M*S@(I?PI#X7]I?LI#V7]R0J0`FD%J"D/A?VE^RGP9?PI\`C`$)`":0^JL`@H

ML`?)H)`&""@X:5\X!?V%_0AH*3W@`!`""8"HBD7[$`JE^T7\,`28"4"H**7[

M9?S0!)@)`J@XN/BE^R;\9_S8"$7]T"=HA/U%_=`@YOO0A>;\T(%H21CP`EA@

2J1T892N%^ZD`92R%_*DX;/L`

`

end


begin 644 dincsbc

M`0@9",D'GL(H-#,IJC(U-JS"*#0T*:HR-@```'BI&*``A/N$_$B@+)$KH':1

M*S@(I?PI#X7]I?LI#^7]L`/I!1@I#ZBE_"GPA?VE^RGP"#CE_2GPL`KI7RBP

M#ND/.+`)*+`&Z0^P`NE?A/T%_87]*+BE^^7\"&BH.+CXI?O&_.?\V`A%_="9

::(3]1?W0DN;[T)SF_-"8:$D8T`2I.-"&6&#\

`

end


begin 644 dincsbc-deccmp

M`0@9",D'GL(H-#,IJC(U-JS"*#0T*:HR-@```'B@`(3[A/RB`XH8:7>HL2N@

M3Y$KH%R1*XII>ZBQ*Z!3D2N@8)$KBFE_J+$KH%61*Z!BD2OX.+BE^^;\Q_S8

L"&B%_3BXI?OF_,?\"&A%_?`!`.;[T-_F_-#;RA"M8!@X&#CFYL;&Q\?GYP#8

`

end



6510 features


  o  PHP always pushes the Break (B) flag as a `1' to the stack.

     Jukka Tapanim"aki claimed in C=lehti issue 3/89, on page 27 that the

     processor makes a logical OR between the status register's bit 4

     and the bit 8 of the stack pointer register (which is always 1).

     He did not give any reasons for this argument, and has refused to clarify

     it afterwards. Well, this was not the only error in his article...


  o  Indirect addressing modes do not handle page boundary crossing at all.

     When the parameter's low byte is $FF, the effective address wraps

     around and the CPU fetches high byte from $xx00 instead of $xx00+$0100.

     E.g. JMP ($01FF) fetches PCL from $01FF and PCH from $0100,

     and LDA ($FF),Y fetches the base address from $FF and $00.


  o  Indexed zero page addressing modes never fix the page address on

     crossing the zero page boundary.

     E.g. LDX #$01 : LDA ($FF,X) loads the effective address from $00 and $01.


  o  The processor always fetches the byte following a relative branch

     instruction. If the branch is taken, the processor reads then the

     opcode from the destination address. If page boundary is crossed, it

     first reads a byte from the old page from a location that is bigger

     or smaller than the correct address by one page.


  o  If you cross a page boundary in any other indexed mode,

     the processor reads an incorrect location first, a location that is

     smaller by one page.


  o  Read-Modify-Write instructions write unmodified data, then modified

     (so INC effectively does LDX loc;STX loc;INX;STX loc)


  o  -RDY is ignored during writes

     (This is why you must wait 3 cycles before doing any DMA --

     the maximum number of consecutive writes is 3, which occurs

     during interrupts except -RESET.)


  o  Some undefined opcodes may give really unpredictable results.


  o  All registers except the Program Counter remain unmodified after -RESET.

     (This is why you must preset D and I flags in the RESET handler.)



Different CPU types


The Rockwell data booklet 29651N52 (technical information about R65C00

microprocessors, dated October 1984), lists the following differences between

NMOS R6502 microprocessor and CMOS R65C00 family:



    1. Indexed addressing across page boundary.

            NMOS: Extra read of invalid address.

            CMOS: Extra read of last instruction byte.



    2. Execution of invalid op codes.

            NMOS: Some terminate only by reset. Results are undefined.

            CMOS: All are NOPs (reserved for future use).



    3. Jump indirect, operand = XXFF.

            NMOS: Page address does not increment.

            CMOS: Page address increments and adds one additional cycle.



    4. Read/modify/write instructions at effective address.

            NMOS: One read and two write cycles.

            CMOS: Two read and one write cycle.



    5. Decimal flag.

            NMOS: Indeterminate after reset.

            CMOS: Initialized to binary mode (D=0) after reset and interrupts.



    6. Flags after decimal operation.

            NMOS: Invalid N, V and Z flags.

            CMOS: Valid flag adds one additional cycle.



    7. Interrupt after fetch of BRK instruction.

            NMOS: Interrupt vector is loaded, BRK vector is ignored.

            CMOS: BRK is executed, then interrupt is executed.



6510 Instruction Timing


 The NMOS 6500 series processors always perform at least two reads

for each instruction. In addition to the operation code (opcode), they

fetch the next byte. This is quite efficient, as most instructions are

two or three bytes long.


 The processors also use a sort of pipelining. If an instruction does

not store data in memory on its last cycle, the processor can fetch

the opcode of the next instruction while executing the last cycle. For

instance, the instruction EOR #$FF truly takes three cycles. On the

first cycle, the opcode $49 will be fetched. During the second cycle

the processor decodes the opcode and fetches the parameter #$FF. On

the third cycle, the processor will perform the operation and store

the result to accumulator, but simultaneously it fetches the opcode

for the next instruction. This is why the instruction effectively

takes only two cycles.


 The following tables show what happens on the bus while executing

different kinds of instructions.


 Interrupts


    NMI and IRQ both take 7 cycles. Their timing diagram is much like

    BRK's (see below). IRQ will be executed only when the I flag is

    clear. IRQ and BRK both set the I flag, whereas the NMI does not

    affect its state.


    The processor will usually wait for the current instruction to

    complete before executing the interrupt sequence. To process the

    interrupt before the next instruction, the interrupt must occur

    before the last cycle of the current instruction.


    There is one exception to this rule: the BRK instruction. If a

    hardware interrupt (NMI or IRQ) occurs before the fourth (flags

    saving) cycle of BRK, the BRK instruction will be skipped, and

    the processor will jump to the hardware interrupt vector. This

    sequence will always take 7 cycles.


    You do not completely lose the BRK interrupt, the B flag will be

    set in the pushed status register if a BRK instruction gets

    interrupted. When BRK and IRQ occur at the same time, this does

    not cause any problems, as your program will consider it as a

    BRK, and the IRQ would occur again after the processor returned

    from your BRK routine, unless you cleared the interrupt source in

    your BRK handler. But the simultaneous occurrence of NMI and BRK

    is far more fatal. If you do not check the B flag in the NMI

    routine and subtract two from the return address when needed, the

    BRK instruction will be skipped.


    If the NMI and IRQ interrupts overlap each other (one interrupt

    occurs before fetching the interrupt vector for the other

    interrupt), the processor will most probably jump to the NMI

    vector in every case, and then jump to the IRQ vector after

    processing the first instruction of the NMI handler. This has not

    been measured yet, but the IRQ is very similar to BRK, and many

    sources state that the NMI has higher priority than IRQ. However,

    it might be that the processor takes the interrupt that comes

    later, i.e. you could lose an NMI interrupt if an IRQ occurred in

    four cycles after it.


    After finishing the interrupt sequence, the processor will start

    to execute the first instruction of the interrupt routine. This

    proves that the processor uses a sort of pipelining: it finishes

    the current instruction (or interrupt sequence) while reading the

    opcode of the next instruction.


    RESET does not push program counter on stack, and it lasts

    probably 6 cycles after deactivating the signal. Like NMI, RESET

    preserves all registers except PC.


 Instructions accessing the stack


    BRK


       #  address R/W description

      --- ------- --- -----------------------------------------------

       1    PC     R  fetch opcode, increment PC

       2    PC     R  read next instruction byte (and throw it away),

                      increment PC

       3  $0100,S  W  push PCH on stack (with B flag set), decrement S

       4  $0100,S  W  push PCL on stack, decrement S

       5  $0100,S  W  push P on stack, decrement S

       6   $FFFE   R  fetch PCL

       7   $FFFF   R  fetch PCH


    RTI


       #  address R/W description

      --- ------- --- -----------------------------------------------

       1    PC     R  fetch opcode, increment PC

       2    PC     R  read next instruction byte (and throw it away)

       3  $0100,S  R  increment S

       4  $0100,S  R  pull P from stack, increment S

       5  $0100,S  R  pull PCL from stack, increment S

       6  $0100,S  R  pull PCH from stack


    RTS


       #  address R/W description

      --- ------- --- -----------------------------------------------

       1    PC     R  fetch opcode, increment PC

       2    PC     R  read next instruction byte (and throw it away)

       3  $0100,S  R  increment S

       4  $0100,S  R  pull PCL from stack, increment S

       5  $0100,S  R  pull PCH from stack

       6    PC     R  increment PC


    PHA, PHP


       #  address R/W description

      --- ------- --- -----------------------------------------------

       1    PC     R  fetch opcode, increment PC

       2    PC     R  read next instruction byte (and throw it away)

       3  $0100,S  W  push register on stack, decrement S


    PLA, PLP


       #  address R/W description

      --- ------- --- -----------------------------------------------

       1    PC     R  fetch opcode, increment PC

       2    PC     R  read next instruction byte (and throw it away)

       3  $0100,S  R  increment S

       4  $0100,S  R  pull register from stack


    JSR


       #  address R/W description

      --- ------- --- -------------------------------------------------

       1    PC     R  fetch opcode, increment PC

       2    PC     R  fetch low address byte, increment PC

       3  $0100,S  R  internal operation (predecrement S?)

       4  $0100,S  W  push PCH on stack, decrement S

       5  $0100,S  W  push PCL on stack, decrement S

       6    PC     R  copy low address byte to PCL, fetch high address

                      byte to PCH


 Accumulator or implied addressing


       #  address R/W description

      --- ------- --- -----------------------------------------------

       1    PC     R  fetch opcode, increment PC

       2    PC     R  read next instruction byte (and throw it away)


 Immediate addressing


       #  address R/W description

      --- ------- --- ------------------------------------------

       1    PC     R  fetch opcode, increment PC

       2    PC     R  fetch value, increment PC


 Absolute addressing


    JMP


       #  address R/W description

      --- ------- --- -------------------------------------------------

       1    PC     R  fetch opcode, increment PC

       2    PC     R  fetch low address byte, increment PC

       3    PC     R  copy low address byte to PCL, fetch high address

                      byte to PCH


    Read instructions (LDA, LDX, LDY, EOR, AND, ORA, ADC, SBC, CMP, BIT,

                       LAX, NOP)


       #  address R/W description

      --- ------- --- ------------------------------------------

       1    PC     R  fetch opcode, increment PC

       2    PC     R  fetch low byte of address, increment PC

       3    PC     R  fetch high byte of address, increment PC

       4  address  R  read from effective address


    Read-Modify-Write instructions (ASL, LSR, ROL, ROR, INC, DEC,

                                    SLO, SRE, RLA, RRA, ISB, DCP)


       #  address R/W description

      --- ------- --- ------------------------------------------

       1    PC     R  fetch opcode, increment PC

       2    PC     R  fetch low byte of address, increment PC

       3    PC     R  fetch high byte of address, increment PC

       4  address  R  read from effective address

       5  address  W  write the value back to effective address,

                      and do the operation on it

       6  address  W  write the new value to effective address


    Write instructions (STA, STX, STY, SAX)


       #  address R/W description

      --- ------- --- ------------------------------------------

       1    PC     R  fetch opcode, increment PC

       2    PC     R  fetch low byte of address, increment PC

       3    PC     R  fetch high byte of address, increment PC

       4  address  W  write register to effective address


 Zero page addressing


    Read instructions (LDA, LDX, LDY, EOR, AND, ORA, ADC, SBC, CMP, BIT,

                       LAX, NOP)


       #  address R/W description

      --- ------- --- ------------------------------------------

       1    PC     R  fetch opcode, increment PC

       2    PC     R  fetch address, increment PC

       3  address  R  read from effective address


    Read-Modify-Write instructions (ASL, LSR, ROL, ROR, INC, DEC,

                                    SLO, SRE, RLA, RRA, ISB, DCP)


       #  address R/W description

      --- ------- --- ------------------------------------------

       1    PC     R  fetch opcode, increment PC

       2    PC     R  fetch address, increment PC

       3  address  R  read from effective address

       4  address  W  write the value back to effective address,

                      and do the operation on it

       5  address  W  write the new value to effective address


    Write instructions (STA, STX, STY, SAX)


       #  address R/W description

      --- ------- --- ------------------------------------------

       1    PC     R  fetch opcode, increment PC

       2    PC     R  fetch address, increment PC

       3  address  W  write register to effective address


 Zero page indexed addressing


    Read instructions (LDA, LDX, LDY, EOR, AND, ORA, ADC, SBC, CMP, BIT,

                       LAX, NOP)


       #   address  R/W description

      --- --------- --- ------------------------------------------

       1     PC      R  fetch opcode, increment PC

       2     PC      R  fetch address, increment PC

       3   address   R  read from address, add index register to it

       4  address+I* R  read from effective address


      Notes: I denotes either index register (X or Y).


             * The high byte of the effective address is always zero,

               i.e. page boundary crossings are not handled.


    Read-Modify-Write instructions (ASL, LSR, ROL, ROR, INC, DEC,

                                    SLO, SRE, RLA, RRA, ISB, DCP)


       #   address  R/W description

      --- --------- --- ---------------------------------------------

       1     PC      R  fetch opcode, increment PC

       2     PC      R  fetch address, increment PC

       3   address   R  read from address, add index register X to it

       4  address+X* R  read from effective address

       5  address+X* W  write the value back to effective address,

                        and do the operation on it

       6  address+X* W  write the new value to effective address


      Note: * The high byte of the effective address is always zero,

              i.e. page boundary crossings are not handled.


    Write instructions (STA, STX, STY, SAX)


       #   address  R/W description

      --- --------- --- -------------------------------------------

       1     PC      R  fetch opcode, increment PC

       2     PC      R  fetch address, increment PC

       3   address   R  read from address, add index register to it

       4  address+I* W  write to effective address


      Notes: I denotes either index register (X or Y).


             * The high byte of the effective address is always zero,

               i.e. page boundary crossings are not handled.


 Absolute indexed addressing


    Read instructions (LDA, LDX, LDY, EOR, AND, ORA, ADC, SBC, CMP, BIT,

                       LAX, LAE, SHS, NOP)


       #   address  R/W description

      --- --------- --- ------------------------------------------

       1     PC      R  fetch opcode, increment PC

       2     PC      R  fetch low byte of address, increment PC

       3     PC      R  fetch high byte of address,

                        add index register to low address byte,

                        increment PC

       4  address+I* R  read from effective address,

                        fix the high byte of effective address

       5+ address+I  R  re-read from effective address


      Notes: I denotes either index register (X or Y).


             * The high byte of the effective address may be invalid

               at this time, i.e. it may be smaller by $100.


             + This cycle will be executed only if the effective address

               was invalid during cycle #4, i.e. page boundary was crossed.


    Read-Modify-Write instructions (ASL, LSR, ROL, ROR, INC, DEC,

                                    SLO, SRE, RLA, RRA, ISB, DCP)


       #   address  R/W description

      --- --------- --- ------------------------------------------

       1    PC       R  fetch opcode, increment PC

       2    PC       R  fetch low byte of address, increment PC

       3    PC       R  fetch high byte of address,

                        add index register X to low address byte,

                        increment PC

       4  address+X* R  read from effective address,

                        fix the high byte of effective address

       5  address+X  R  re-read from effective address

       6  address+X  W  write the value back to effective address,

                        and do the operation on it

       7  address+X  W  write the new value to effective address


      Notes: * The high byte of the effective address may be invalid

               at this time, i.e. it may be smaller by $100.


    Write instructions (STA, STX, STY, SHA, SHX, SHY)


       #   address  R/W description

      --- --------- --- ------------------------------------------

       1     PC      R  fetch opcode, increment PC

       2     PC      R  fetch low byte of address, increment PC

       3     PC      R  fetch high byte of address,

                        add index register to low address byte,

                        increment PC

       4  address+I* R  read from effective address,

                        fix the high byte of effective address

       5  address+I  W  write to effective address


      Notes: I denotes either index register (X or Y).


             * The high byte of the effective address may be invalid

               at this time, i.e. it may be smaller by $100. Because

               the processor cannot undo a write to an invalid

               address, it always reads from the address first.


 Relative addressing (BCC, BCS, BNE, BEQ, BPL, BMI, BVC, BVS)


       #   address  R/W description

      --- --------- --- ---------------------------------------------

       1     PC      R  fetch opcode, increment PC

       2     PC      R  fetch operand, increment PC

       3     PC      R  Fetch opcode of next instruction,

                        If branch is taken, add operand to PCL.

                        Otherwise increment PC.

       4+    PC*     R  Fetch opcode of next instruction.

                        Fix PCH. If it did not change, increment PC.

       5!    PC      R  Fetch opcode of next instruction,

                        increment PC.


      Notes: The opcode fetch of the next instruction is included to

             this diagram for illustration purposes. When determining

             real execution times, remember to subtract the last

             cycle.


             * The high byte of Program Counter (PCH) may be invalid

               at this time, i.e. it may be smaller or bigger by $100.


             + If branch is taken, this cycle will be executed.


             ! If branch occurs to different page, this cycle will be

               executed.


 Indexed indirect addressing


    Read instructions (LDA, ORA, EOR, AND, ADC, CMP, SBC, LAX)


       #    address   R/W description

      --- ----------- --- ------------------------------------------

       1      PC       R  fetch opcode, increment PC

       2      PC       R  fetch pointer address, increment PC

       3    pointer    R  read from the address, add X to it

       4   pointer+X   R  fetch effective address low

       5  pointer+X+1  R  fetch effective address high

       6    address    R  read from effective address


      Note: The effective address is always fetched from zero page,

            i.e. the zero page boundary crossing is not handled.


    Read-Modify-Write instructions (SLO, SRE, RLA, RRA, ISB, DCP)


       #    address   R/W description

      --- ----------- --- ------------------------------------------

       1      PC       R  fetch opcode, increment PC

       2      PC       R  fetch pointer address, increment PC

       3    pointer    R  read from the address, add X to it

       4   pointer+X   R  fetch effective address low

       5  pointer+X+1  R  fetch effective address high

       6    address    R  read from effective address

       7    address    W  write the value back to effective address,

                          and do the operation on it

       8    address    W  write the new value to effective address


      Note: The effective address is always fetched from zero page,

            i.e. the zero page boundary crossing is not handled.


    Write instructions (STA, SAX)


       #    address   R/W description

      --- ----------- --- ------------------------------------------

       1      PC       R  fetch opcode, increment PC

       2      PC       R  fetch pointer address, increment PC

       3    pointer    R  read from the address, add X to it

       4   pointer+X   R  fetch effective address low

       5  pointer+X+1  R  fetch effective address high

       6    address    W  write to effective address


      Note: The effective address is always fetched from zero page,

            i.e. the zero page boundary crossing is not handled.


 Indirect indexed addressing


    Read instructions (LDA, EOR, AND, ORA, ADC, SBC, CMP)


       #    address   R/W description

      --- ----------- --- ------------------------------------------

       1      PC       R  fetch opcode, increment PC

       2      PC       R  fetch pointer address, increment PC

       3    pointer    R  fetch effective address low

       4   pointer+1   R  fetch effective address high,

                          add Y to low byte of effective address

       5   address+Y*  R  read from effective address,

                          fix high byte of effective address

       6+  address+Y   R  read from effective address


      Notes: The effective address is always fetched from zero page,

             i.e. the zero page boundary crossing is not handled.


             * The high byte of the effective address may be invalid

               at this time, i.e. it may be smaller by $100.


             + This cycle will be executed only if the effective address

               was invalid during cycle #5, i.e. page boundary was crossed.


    Read-Modify-Write instructions (SLO, SRE, RLA, RRA, ISB, DCP)


       #    address   R/W description

      --- ----------- --- ------------------------------------------

       1      PC       R  fetch opcode, increment PC

       2      PC       R  fetch pointer address, increment PC

       3    pointer    R  fetch effective address low

       4   pointer+1   R  fetch effective address high,

                          add Y to low byte of effective address

       5   address+Y*  R  read from effective address,

                          fix high byte of effective address

       6   address+Y   R  read from effective address

       7   address+Y   W  write the value back to effective address,

                          and do the operation on it

       8   address+Y   W  write the new value to effective address


      Notes: The effective address is always fetched from zero page,

             i.e. the zero page boundary crossing is not handled.


             * The high byte of the effective address may be invalid

               at this time, i.e. it may be smaller by $100.


    Write instructions (STA, SHA)


       #    address   R/W description

      --- ----------- --- ------------------------------------------

       1      PC       R  fetch opcode, increment PC

       2      PC       R  fetch pointer address, increment PC

       3    pointer    R  fetch effective address low

       4   pointer+1   R  fetch effective address high,

                          add Y to low byte of effective address

       5   address+Y*  R  read from effective address,

                          fix high byte of effective address

       6   address+Y   W  write to effective address


      Notes: The effective address is always fetched from zero page,

             i.e. the zero page boundary crossing is not handled.


             * The high byte of the effective address may be invalid

               at this time, i.e. it may be smaller by $100.


 Absolute indirect addressing (JMP)


       #   address  R/W description

      --- --------- --- ------------------------------------------

       1     PC      R  fetch opcode, increment PC

       2     PC      R  fetch pointer address low, increment PC

       3     PC      R  fetch pointer address high, increment PC

       4   pointer   R  fetch low address to latch

       5  pointer+1* R  fetch PCH, copy latch to PCL


      Note: * The PCH will always be fetched from the same page

              than PCL, i.e. page boundary crossing is not handled.


               How Real Programmers Acknowledge Interrupts


 With RMW instructions:


       ; beginning of combined raster/timer interrupt routine

       LSR $D019       ; clear VIC interrupts, read raster interrupt flag to C

       BCS raster      ; jump if VIC caused an interrupt

       ...             ; timer interrupt routine


       Operational diagram of LSR $D019:


         #  data  address  R/W

        --- ----  -------  ---  ---------------------------------

         1   4E     PC      R   fetch opcode

         2   19    PC+1     R   fetch address low

         3   D0    PC+2     R   fetch address high

         4   xx    $D019    R   read memory

         5   xx    $D019    W   write the value back, rotate right

         6  xx/2   $D019    W   write the new value back


       The 5th cycle acknowledges the interrupt by writing the same

       value back. If only raster interrupts are used, the 6th cycle

       has no effect on the VIC. (It might acknowledge also some

       other interrupts.)


 With indexed addressing:


       ; acknowledge interrupts to both CIAs

       LDX #$10

       LDA $DCFD,X


       Operational diagram of LDA $DCFD,X:


         #  data  address  R/W  description

        --- ----  -------  ---  ---------------------------------

         1   BD     PC      R   fetch opcode

         2   FD    PC+1     R   fetch address low

         3   DC    PC+2     R   fetch address high, add X to address low

         4   xx    $DC0D    R   read from address, fix high byte of address

         5   yy    $DD0D    R   read from right address


       ; acknowledge interrupts to CIA 2

       LDX #$10

       STA $DDFD,X


       Operational diagram of STA $DDFD,X:


         #  data  address  R/W  description

        --- ----  -------  ---  ---------------------------------

         1   9D     PC      R   fetch opcode

         2   FD    PC+1     R   fetch address low

         3   DC    PC+2     R   fetch address high, add X to address low

         4   xx    $DD0D    R   read from address, fix high byte of address

         5   ac    $DE0D    W   write to right address


 With branch instructions:


       ; acknowledge interrupts to CIA 2

               LDA #$00  ; clear N flag

               JMP $DD0A

       DD0A    BPL $DC9D ; branch

       DC9D    BRK       ; return


       You need the following preparations to initialize the CIA registers:


               LDA #$91  ; argument of BPL

               STA $DD0B

               LDA #$10  ; BPL

               STA $DD0A

               STA $DD08 ; load the ToD values from the latches

               LDA $DD0B ; freeze the ToD display

               LDA #$7F

               STA $DC0D ; assure that $DC0D is $00


       Operational diagram of BPL $DC9D:


         #  data  address  R/W  description

        --- ----  -------  ---  ---------------------------------

         1   10    $DD0A    R   fetch opcode

         2   91    $DD0B    R   fetch argument

         3   xx    $DD0C    R   fetch opcode, add argument to PCL

         4   yy    $DD9D    R   fetch opcode, fix PCH

       ( 5   00    $DC9D    R   fetch opcode )


       ; acknowledge interrupts to CIA 1

               LSR       ; clear N flag

               JMP $DCFA

       DCFA    BPL $DD0D

       DD0D    BRK


       ; Again you need to set the ToD registers of CIA 1 and the

       ; Interrupt Control Register of CIA 2 first.


       Operational diagram of BPL $DD0D:


         #  data  address  R/W  description

        --- ----  -------  ---  ---------------------------------

         1   10    $DCFA    R   fetch opcode

         2   11    $DCFB    R   fetch argument

         3   xx    $DCFC    R   fetch opcode, add argument to PCL

         4   yy    $DC0D    R   fetch opcode, fix PCH

       ( 5   00    $DD0D    R   fetch opcode )


       ; acknowledge interrupts to CIA 2 automagically

               ; preparations

               LDA #$7F

               STA $DD0D       ; disable all interrupt sources of CIA2

               LDA $DD0E

               AND #$BE        ; ensure that $DD0C remains constant

               STA $DD0E       ; and stop the timer

               LDA #$FD

               STA $DD0C       ; parameter of BPL

               LDA #$10

               STA $DD0B       ; BPL

               LDA #$40

               STA $DD0A       ; RTI/parameter of LSR

               LDA #$46

               STA $DD09       ; LSR

               STA $DD08       ; load the ToD values from the latches

               LDA $DD0B       ; freeze the ToD display

               LDA #$09

               STA $0318

               LDA #$DD

               STA $0319       ; change NMI vector to $DD09

               LDA #$FF        ; Try changing this instruction's operand

               STA $DD05       ; (see comment below).

               LDA #$FF

               STA $DD04       ; set interrupt frequency to 1/65536 cycles

               LDA $DD0E

               AND #$80

               ORA #$11

               LDX #$81

               STX $DD0D       ; enable timer interrupt

               STA $DD0E       ; start timer


               LDA #$00        ; To see that the interrupts really occur,

               STA $D011       ; use something like this and see how

       LOOP    DEC $D020       ; changing the byte loaded to $DD05 from

               BNE LOOP        ; #$FF to #$0F changes the image.


       When an NMI occurs, the processor jumps to Kernal code, which jumps to

       ($0318), which points to the following routine:


       DD09    LSR $40         ; clear N flag

               BPL $DD0A       ; Note: $DD0A contains RTI.


       Operational diagram of BPL $DD0A:


         #  data  address  R/W  description

        --- ----  -------  ---  ---------------------------------

         1   10    $DD0B    R   fetch opcode

         2   11    $DD0C    R   fetch argument

         3   xx    $DD0D    R   fetch opcode, add argument to PCL

         4   40    $DD0A    R   fetch opcode, (fix PCH)


 With RTI:


       ; the fastest possible interrupt handler in the 6500 family

               ; preparations

               SEI

               LDA $01         ; disable ROM and enable I/O

               AND #$FD

               ORA #$05

               STA $01

               LDA #$7F

               STA $DD0D       ; disable CIA 2's all interrupt sources

               LDA $DD0E

               AND #$BE        ; ensure that $DD0C remains constant

               STA $DD0E       ; and stop the timer

               LDA #$40

               STA $DD0C       ; store RTI to $DD0C

               LDA #$0C

               STA $FFFA

               LDA #$DD

               STA $FFFB       ; change NMI vector to $DD0C

               LDA #$FF        ; Try changing this instruction's operand

               STA $DD05       ; (see comment below).

               LDA #$FF

               STA $DD04       ; set interrupt frequency to 1/65536 cycles

               LDA $DD0E

               AND #$80

               ORA #$11

               LDX #$81

               STX $DD0D       ; enable timer interrupt

               STA $DD0E       ; start timer


               LDA #$00        ; To see that the interrupts really occur,

               STA $D011       ; use something like this and see how

       LOOP    DEC $D020       ; changing the byte loaded to $DD05 from

               BNE LOOP        ; #$FF to #$0F changes the image.


       When an NMI occurs, the processor jumps to Kernal code, which

       jumps to ($0318), which points to the following routine:


       DD0C    RTI


       How on earth can this clear the interrupts? Remember, the

       processor always fetches two successive bytes for each

       instruction.


       A little more practical version of this is redirecting the NMI

       (or IRQ) to your own routine, whose last instruction is JMP

       $DD0C or JMP $DC0C.  If you want to confuse more, change the 0

       in the address to a hexadecimal digit different from the one

       you used when writing the RTI.


       Or you can combine the latter two methods:


       DD09    LSR $xx  ; xx is any appropriate BCD value 00-59.

               BPL $DCFC

       DCFC    RTI


       This example acknowledges interrupts to both CIAs.


 If you want to confuse the examiners of your code, you can use any

of these techniques. Although these examples use no undefined opcodes,

they do not necessarily run correctly on CMOS processors. However, the

RTI example should run on 65C02 and 65C816, and the latter branch

instruction example might work as well.


 The RMW instruction method has been used in some demos, others were

developed by Marko M"akel"a. His favourite is the automagical RTI

method, although it does not have any practical applications, except

for some time dependent data decryption routines for very complicated

copy protections.




Created with the Personal Edition of HelpNDoc: Full featured multi-format Help generator