Intro to Assembly and a Brainfuck Interpreter

LD R0,BLOGPOST —If you’ve listened to March 2nd’s Geeknights, you’ve heard Scott discuss how a high level code is either complied or interpreted into Assembly or machine code.

Well, if you would like to delve further into this subject matter, here is a small lecture which will go over the basics of computing (hastily I’ll admit) and later on we will cover a simple interpreter for the Brainfuck language. The lecture covers a lot of material within a small amount of text, so I have glossed over a lot of information. If you wish, I can delve deeper into any topics you would like to know. This will be a multipost topic within this thread, hopefully I can bang it out in two posts.

I have also assumed that you have at least listened to the Geeknights shows on computers and programming or have a basic knowledge of how computers work which will provide a decent background in which to understand the material. If you would like to listen to shows, try starting here.

With this little project, you will:

  • Learn about the computing process in detail
  • Learn Brainfuck ^__^
  • Understand how assembly works with a processor (The LC-3)
  • Get a basic overview of a simple computer.
  • Impress your friends with awesome knowledge

Alright, let’s begin. First off, let’s go over the scope of the computer hierarchy. Here is a basic flow chart of computing system:

Problem->Algorithm->Language->Machine Architecture->Micro-Architecture->Circuits->Devices

Problem– This is what we wish to do with our computer. What is the task we wish to perform with the computer? Do we want to play a video game, write a paper, or perform calculations? Having a solid grasp of the problem allows us to find a solution to the problem in the quickest and most efficient manner. This stage is the furthest most computer users will ever utilize and will never delve deeper into the computing hierarchy.

Algorithm– The algorithm is the basic overview of the solution to our problem. It is what allows us to transform the solution to the problem from a natural description to some series of steps that progress towards termination and a final solution. So, do we need a program which can take in input from our keyboard and save it in a separate file? Do we need a three dimensional world which can simulate realistic physics? Or do we need a mathematical formula to solve a complex equation? With the correct algorithm we can solve our problem quickly and correctly every time.

Language– This is the level most programmers spend their time. The language consists of any high-level programming language in which one uses to implement their algorithm. There are hundreds of languages out there today, all of which allow you to solve different kinds of problems. Popular examples today are Java, C++, Python, Lisp, C, and the one we will be using today, Brainfuck. These languages are often complex, robust, and allow the programmer to implement their algorithm into a digital form or a series of steps that the computer can follow to create reproducible results.

Machine Architecture or ISA (Instruction Set Architecture)– This is the assembly language which takes the higher level language and transforms it into instructions which the hardware can interpret and operate with. The two ways that high level languages can be converted into Machine Architecture code are compliers or interpreters. One of the most famous ISA’s is the x86, which most of the chips on the market today are based on. Both AMD and Intel use x86 in their processors today. The ISA is made up of various Opcodes, or instructions for the processor. Because the LC-3 is a 16-bit processor, the opcodes are 16 bits long. Let’s say you have a 32 bit processor in your computer right now, it’s opcodes would be 32 bits long. A 64 bit processor would have 64 bit opcodes.

Micro-Architecture
– This is the level that most people think of when they envision a computer. You processor, motherboard, RAM, Video Card, Hard Drive, and input devices such as mice and keyboards fall in this category. This level transforms the Machine Architecture into an actual implementation.

Circuits– The circuit level describes physical construction of each of the components of the Micro-Architecture. The main component of this level are logic gates, simple components of wires and devices which allow simple boolean logic execution.

Devices
Transistors. Only the most advanced electrical engineers will work at this level. Logic gates are constructed using wires and transistors. They are the main units of operation in your computer.

So, what is the LC-3? The LC-3 is a simple implementation of the Von Neumann model computer, a simple but outdated architecture. The LC-3 is also an assembly language (Instruction Set Architecture or ISA) for the specific LC-3 processor (there is only one). The LC-3 processor is a 16-bit processor which means that it can handle bit strings of length 16. This bits are either 1’s or 0’s, your basic binary. However, there are multiple ways to represent numbers with binary. The most common today is Two’s Complement. I won’t be going over the specifics as it is a little more complicated that what I was planning on going over.

Here is the Micro-Architecture diagram of the LC-3 based computer:

LC-3

To most of you, this diagram is extremely confusing and is totally beyond all comprehension. I’ll try to go over the basics and hopefully you will understand some of the aspects of a computer.

A: This is your Register File or Processor Register, it is a small amount of memory on the processor which can be quickly accessed by other aspects of the processor. The LC-3 has eight registers within the Register File.
B: This is your memory, commonly known as RAM. This is where the majority of your operations and instructions are stored.
C: This large black line is your BUS, which is used to carry 16 bit binary numbers to several components within the computer. It is the main system of transit for your computer.
D: The PC or Program Control holds the MEMORY LOCATION of the next instruction to be carried out. This memory location is sent to the memory which then retrieves the instruction (opcode) and sends it to the processor for execution.
E: The ALU or Arithmetic Logic Unit is what performs operations to the bits in your computer. In the LC-3 it can only perform the operations ADD, bitwise AND, and bitwise NOT.
F*: I forgot to mark this component on the diagram, but it is the IR or Instruction Register. This keeps track of the current instruction being executed during the clock cycle and sends the instruction opcode to the Control and eventually the ALU.

Everything else is not necessary for this brief overview, but they all serve an important purpose. Components A and E are what make up the processor, B is your memory, and E and F make up your Control Unit. These are the basic components of a computer.

So how do we interact with the computer? We utilize the processors opcodes, or instructions which when inputed into the computers logic will cause changes in the computers state, which are then saved into the memory. Here is a list of the LC-3’s opcodes:

OPCODES

Let’s look at the first opcode on the list, ADD. The first four bits (15-12) in the opcode are the instruction code, it tells the computer which operation it wants to carry out. The next three bits represent the destination register, where we want to place the result of our operation. Bits 8-6 represent the first source register and bits 2-0 represent the second source register. Bits 5-3 are not used and thus are just zeros, the processor will ignore these bits. The registers are part of the register file, which the processor interacts with directly (the processor never directly interacts with memory). So, what would it look like when we want to add the value at register 1 and register 2 and place it into register 0?

In assembly:
ADD R0,R1,R2

In binary:
0001 000 001 000 010

000 is 0 in binary which represents R0, 001 is 1 in binary which represents R1 and 010 represents R2

So if we had the value 3 stored at R1 and 4 stored at R2, after executing this instruction the value 7 will be inputed into the register at R0.

So, utilizing these opcodes, we can write programs which can be executed directly by the computer hardware in order to perform computing tasks. This covers the basics on how one programs in assembly to perform basic computer executions. Hopefully with this background you will be able to marginally understand the interpreter.

If you have any questions, feel free to ask them and hopefully I can answer them within my next post. Until the next post, I recommend these PDF’s for a more detailed explanation of the LC-3 ISA and the Micro Architecture of the LC-3.
ISA
Micro Architecture

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: