Comp200 lecture -- assembly language

00.Mar.12-17

-------------
00.mar.13 (mon)

We will transition 
from a high-level language (eg scheme) to a low-level language:
one concerned with running on an actual machine.

Consider the problem of adding 1 + 2 + 3 + ... + N.
We have seen how we could write this in scheme.
  (Even ignoring the fact that there's a formula for this:
   pair terms from the front and back, or see the pyramid structure.)
Here's a high-level language program in a different language (as in book);
it uses the idea of *variables* (their value can vary over time):

;; Given N, return 1+2+3+...+N
;; We do this w/o recursion, but instead using variables:
;; keep track of the sumSoFar, *and* 
;; keep track of i, the next number we will add to our sumSoFar.
;;
sum(N) {
  sumSoFar <-- 0
  i <-- 1
  while (i <= N)
    sumSoFar <-- sumSoFar + i
    i <-- i+1
  return sumSoFar
  }

Variables are like mail-slots which can hold one piece of paper;
when a new paper is shoved in, the old one is lost on the floor.
(This is what "<--" means; it is pronounced "gets".  You may have seen
languages that abuse the symbol "=" for this, producing strange-looking
statements like "i = i + 1", which looks clearly false for any value of i.
The arrow indicates which direction the information is flowing.)
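
For comparison, here is a minimal sketch of the same loop in Python, one of
the languages that writes "=" where we wrote "<--".  (Python is not otherwise
used in these notes, and the name sum_to is made up for illustration.)

```python
def sum_to(n):
    """Return 1 + 2 + ... + n using variables instead of recursion."""
    sum_so_far = 0                      # sumSoFar <-- 0
    i = 1                               # i <-- 1
    while i <= n:
        sum_so_far = sum_so_far + i     # sumSoFar <-- sumSoFar + i
        i = i + 1                       # i <-- i+1
    return sum_so_far


print(sum_to(4))   # 1+2+3+4
```

Each "=" here overwrites the old contents of the mail-slot on its left.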


Okay, forget that high-level bit for now; that was to motivate
the idea of variables and assignment.
Now we're going to switch to a low-level view:

So what are machines like?
They have two main parts:
   cpu, memory.
(plus input and output)

The Memory is just a large bank of variables (mailboxes),
which are named "1", "2", "3", ..., "64Meg".

Cpu has one local memory slot, the accumulator ("ACC").
Cpu can do simple instructions: 
  Move between ACC and a memory location (store, load)
  add contents of memory to accumulator (putting results back in ACC),
  Test if ACC is 0 and if so jump to different part of the program.
    (To be discussed further, later).


For the exact list of what the Jelly2000 machine can do, see
the Jelly2000 reference sheet.



The high-level statement "X <-- X + Y" actually takes work:
  Suppose X and Y are stored in memory locations 234, 567 resp.
  Then we have to do three things:
       (a) move X into ACC
       (b) add Y to ACC, with answer stored back to ACC,
       (c) move ACC into X
  These are the "memory load/store" and "arithmetic" instructions
  from the sheet.  Looking at the third column ("meaning") we can
  find instructions with the meaning we want.  Thus:
       (ldm 234)   ; (a) move X into ACC
       (add 567)   ; (b) add Y to ACC, with answer stored back to ACC,
       (stm 234)   ; (c) move ACC into X
  Ignore the first column of the reference sheet, for now.
  Also, ignore how X and Y got into those particular memory locations.
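
To watch the three steps happen, here is a rough Python sketch that treats
memory as a table of numbered slots and ACC as one ordinary variable.
The starting values 10 and 32 are made up for illustration; the comments
give the Jelly instruction each line stands for.

```python
# Memory as numbered slots; only the two we care about are filled in.
mem = {234: 10, 567: 32}   # X lives in mem[234], Y in mem[567] (sample values)
acc = 0                    # the cpu's single local slot

acc = mem[234]             # (ldm 234)  ACC <-- mem[234]
acc = acc + mem[567]       # (add 567)  ACC <-- ACC + mem[567]
mem[234] = acc             # (stm 234)  mem[234] <-- ACC

print(mem[234], mem[567])  # X has changed, Y has not
```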

 
-------------
00.mar.15 (wed)
99.mar.29 (mon)
Last time, we saw the concept of (assigning to) variables, and also
a series of three assembly instructions for the high-level "X <-- X + Y".


Hey, how did 1152 get stored as the previous value for X, anyway?
Exercise:
Write two assembly instructions which place the value "1152" into
memory location 234.
Hint: You *have* to look at the Jelly2000 reference sheet;
try looking at the "memory load/store" part.
We already know that if we could get 1152 to be in the ACCumulator,
then we could use the stm instruction, as seen last time.


We're about to do an exercise.  Ignore the details of branching for now;
just remember that the cpu can compare if the ACC is equal to 0
(or, if we look at the "control flow" part of the sheet,
if it's bigger-than 0, or ...), and if so jump to some other part
of the program.
(This involves PC, which we haven't talked about yet; ignore PC for now.)
How would we say "if X > Y then jump to [line # 7 of the program]"?
Note that we have to put X-Y in the accumulator, and then test if
that's bigger than 0:
  (ldm 234)    ; ACC <-- mem[234]        or ACC <-- X
  (sub 567)    ; ACC <-- ACC - mem[567]  or ACC <-- ACC - Y,  which is X-Y
  (bgz line#7) ; if ACC > 0 (that is, X > Y), jump to line #7
Well, we don't literally say "line#7"; we'll see what to put there later.
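
The trick of comparing X with Y by comparing X-Y with 0 can be sketched in
Python as follows.  The function name should_branch is hypothetical; it just
answers whether the jump would be taken.

```python
def should_branch(x, y):
    """Mimic 'if X > Y then jump': put X-Y in ACC, then compare ACC with 0."""
    acc = x           # ACC <-- X
    acc = acc - y     # ACC <-- ACC - Y, which is X-Y
    return acc > 0    # jump only when ACC is bigger than 0


print(should_branch(5, 3), should_branch(3, 5), should_branch(4, 4))
```

Note that when X equals Y the ACC is exactly 0, so the jump is not taken.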


Exercise:
Okay, to work on yourselves, in groups of 2:
Recall our high-level program from last time, for sum(N).
sum(N)
  sumSoFar <-- 0
  i <-- 1
  while (i <= N)
    sumSoFar <-- sumSoFar + i
    i <-- i+1
  return sumSoFar

We'll write an equivalent program, in assembly!
First, a few notions: the input "N" will be in location 502;
the "returned" value will be printed out.
The rest is up to us!

First thing to decide: hey, the high-level program used
variables i,sumSoFar.  However, in our assembly world,
all we have available to us is numbered memory locations (and ACC).
So, let's use location 500 for sumSoFar, and 501 for i.

Second thing: we have something new in this program, the while-loop.  
When writing this in assembly, we'll have to make use of the fact
that each line is going to have a line-number, and we can
use commands like "jmpi" ("jump immediate") to immediately go to
a different line number, and "bgz" ("branch-greater-zero") to
go to a different line number depending on how the ACC compares to 0.

Thus we translate the while-loop to mean 
"if *not* (i <= N), then branch to (some line number near) the 
end of the program"; and
"go back to the start of the while-loop and test the condition again" will
become, in assembly, "jump back to (the line number at) the start of the loop"
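
Here is one way to picture that translation, as a Python sketch (not Jelly
assembly, so it doesn't spoil the exercise).  Python has no jump command, so
the variable "line" stands in for "which line we are on"; the numbering 0-3
is made up for illustration.

```python
def sum_with_jumps(n):
    """sum(N) with the while-loop spelled out as a test plus two jumps."""
    sum_so_far = 0
    i = 1
    line = 0                      # hypothetical line numbers 0..3
    while True:
        if line == 0:             # start of loop: test the *negated* condition
            if i - n > 0:         # not (i <= N)  is the same as  i - N > 0
                line = 3          # branch to the end of the program
            else:
                line = 1          # otherwise fall through to the body
        elif line == 1:           # loop body
            sum_so_far = sum_so_far + i
            i = i + 1
            line = 2
        elif line == 2:           # jump back to the start of the loop
            line = 0
        else:                     # line 3: past the loop
            return sum_so_far


print(sum_with_jumps(4))   # 1+2+3+4
```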

... Okay, work out the problem: translate the above program into
    assembly instructions! ...

The solution is on the back of the Jelly reference sheet.


---------
00.mar.17 (fri, st patrick)
99.mar.31 (wed)

  Topic: Jelly2000: encoding instructions as numbers,
         the truth about line numbers; what the cpu really does.
         branch instructions;

If a computer is just cpu and memory, where does the program sit?
Yes, in memory.  In fact, jelly2000 instructions don't quite 
have line-numbers; rather they reside in a memory location.
So instead of the instruction "(ldm 500)" having line number (say) 497,
actually it's mem[497] which contains the command "(ldm 500)".

In fact, it's a bit worse: the only thing you can store in memory
locations is a 7-digit number (positive or negative).
So we must encode these instructions, as numbers.

(Encode vs encrypt: the latter implies secrecy; the former is
just changing one format to another: e.g., the sound of my voice
is encoded into electrical impulses over the phone system, then decoded
at the other end.)

How to encode instructions as numbers?  This is what the first column
is, on the jelly reference sheet.  So we look at (ldm <addr>),
and see that the encoding column for ldm says "<addr> 1 1".
Thus (ldm 500) becomes 50011.  Note that we're not thinking of this
as fifty-thousand-and-eleven, as much as we're thinking 5-followed-by-
zero-followed-by-....
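
A small Python sketch of this digit-gluing, assuming (as the ldm entry
suggests) that an instruction's code is always the last two digits and the
address is everything in front.  The names encode, decode and LDM_CODE are
made up for illustration.

```python
LDM_CODE = 11   # from the sheet's encoding column for ldm: "<addr> 1 1"


def encode(addr, code):
    """Glue the address digits in front of the two-digit instruction code."""
    return addr * 100 + code


def decode(value):
    """Split an encoded number back into (addr, code)."""
    return divmod(value, 100)


print(encode(500, LDM_CODE))   # 50011
print(decode(50011))           # (500, 11)
```

Multiplying by 100 just shifts the address two digits to the left, which is
the arithmetic way of saying "5-followed-by-zero-followed-by-...".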

Okay, now that this explains the encoding, we're ready to see what
really goes on in the cpu.  There's a little leprechaun in there,
which does the following, over and over and over:
  1. Fetch the value stored in mem[PC].
  2. PC <-- PC + 1
  3. Decode the fetched value (using reference sheet)
  4. Execute the (meaning of) the decoded instruction.
(This list of four items is also found on the back of the reference sheet.)

Example program:
Recall the program which did "x <-- x + y"
(assume that x is 24302, and y is 39):
  ;; x stored in mem[500], currently this is 24302
  ;; y stored in mem[505], currently this is 39.
  mem[497]:    50011  (ldm 500)
  mem[498]:    50510  (add 505)
  mem[499]:    50031  (stm 500)

Is this program complete?  Unfortunately, not quite.
What is the next command executed?  The one in mem[500], which is where x lives.
A value like 24302 happens to decode as "(mov 243)", which is mem[ACC] <-- mem[243].
This is probably a bug -- the value of x was data, and not intended
to be decoded as an instruction!
When this happens, it can lead to the Blue Screen of Death,
or a bomb icon, with the happy message "illegal instruction".

Note that programs are in memory merely as numbers; so is data.
It's up to the programmer not to confuse them!
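
The leprechaun's four steps, run on the example program above, can be
sketched in Python.  This is only a rough model: it knows just the three
two-digit codes that appear in the example (11, 10, 31), it assumes the code
is always the last two digits, and it runs for exactly three steps because
these notes haven't said how a Jelly2000 program stops.

```python
# Two-digit codes read off the example: 50011, 50510, 50031.
LDM, ADD, STM = 11, 10, 31


def step(mem, pc, acc):
    """Do one fetch / increment / decode / execute cycle; return new (pc, acc)."""
    value = mem[pc]                  # 1. fetch the value stored in mem[PC]
    pc = pc + 1                      # 2. PC <-- PC + 1
    addr, code = divmod(value, 100)  # 3. decode
    if code == LDM:                  # 4. execute
        acc = mem[addr]
    elif code == ADD:
        acc = acc + mem[addr]
    elif code == STM:
        mem[addr] = acc
    else:
        raise ValueError(f"illegal instruction {value} at mem[{pc - 1}]")
    return pc, acc


mem = {497: 50011, 498: 50510, 499: 50031, 500: 24302, 505: 39}
pc, acc = 497, 0
for _ in range(3):                   # run exactly the three instructions
    pc, acc = step(mem, pc, acc)

print(pc, mem[500])                  # 500 24341
```

After the three steps PC is 500, so a fourth step would fetch x's data as
if it were an instruction; this sketch, knowing only three codes, would
complain about an illegal instruction.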

------------
99.apr.02 (fri) -- spring recess
------------
99.apr.05 (mon)

Since programs are just numbers in memory, you could write self-modifying code.
Core Wars!
Imagine further, Core Wars with a "split" command, so the cpu can
  copy code to location 621, then say "spawn off a program at location 621".