University of Iowa, 2020  
CS2630: Computer Organization  
Project 1  
MiniMa: (mini) MIPS assembler written in  
C  
Goals for this assignment  
• Translate MAL instructions to TAL, and TAL to binary  
• Resolve the addresses of branch and jump labels  
• Build an assembler  
Before you start  
This project is fairly involved, so start early. To be prepared, you should:  
• review the readings from Ch 6  
• complete the Stored Programs inquiry activity  
• take the Knowledge Check on pseudoinstructions  
Read the document as soon as you can and ask questions in Debug Your Brain/discussion  
board/office hours.   
The testing infrastructure is designed so that it is possible to get MiniMa working one phase at  
a time (there are 3 phases), so you can pace yourself.  
So that we know who is working together: you must declare your partner at   
https://uiowa.instructure.com/courses/101698/pages/sign-up-for-project-1-partner   
Rubric  
See the assignment on ICON for the rubric. Notice that non-compiling programs and programs  
that don’t pass any tests earn 0 points. Therefore, we cannot stress enough that you should  
approach this project incrementally. Consider the following stories:  
• Team 0xBA51C: got a few tests working each day, running the code frequently to check.  
They ended up passing all tests except a couple tricky cases and got an A (> 90 %).  
• Team 0xCOFFEE: wrote all functionality in one all-night coding fever. They are pretty  
sure they covered every single case for all 3 phases. However, they never compiled or  
ran their code until the due date. When they did, they found it didn’t even compile.  
They didn’t have time to fix it and had to turn it in as is. Grade: F (0%).  
University of Iowa, 2020  
Setup  
You can git clone (or download zip) the project from  
https://research-git.uiowa.edu/cs2630-assignments/minima-c  
Log into this website using your hawkid and password.  
A note on collaboration: We encourage you and your partner to create a https://research- 
git.uiowa.edu/  
repository to collaborate on your code more effectively. If you do so, you must mark your  
repository Private and add your partner by going to Members and inviting your partner by  
hawkid.   
Repositories marked Internal or Public will be considered an intent to cheat by sharing code  
with other students.  
Introduction  
In this project, you will be writing some components of a basic assembler for MIPS, called  
MiniMa (mini MIPS assembler). You will be writing MiniMa in Java.  
The input to MiniMa is an array of Instruction objects. The Instruction class has several fields  
indicating different aspects of the instruction. MiniMa does not include a parser to turn a text  
file into Instruction objects, so you will write programs using the functions in  
InstructionFactory, which creates Instructions.  
struct Instruction {  
enum ID instruction_id;   // id indicating the instruction  
uint8_t rd;            // register number destination  
University of Iowa, 2020  
uint8_t rs;            // register number source  
uint8_t rt;            // register number secondary source  
int32_t immediate;     // immediate, may use up to 32 bits  
uint32_t jump_address;  // jump address    
uint8_t shift_amount;  // shift amount   
char label[MAX_LABEL_LENGTH]; // label of the line this Instruction appears on  
char branch_label[MAX_LABEL_LENGTH];  // label used by branch or jump  
instructions  
};  
MiniMa has three basic phases for translating a MIPS program into binary. The next three  
sections describe these phases. The section after that, “What you need to do”, will describe  
your job in Project 1.  
1. Convert MAL to TAL  
In this phase, MiniMa converts any pseudo instructions into TAL instructions. Specifically,  
MiniMa creates a new output array of Instruction objects and stores the TAL instructions into it  
in order. For any true instruction in the input, MiniMa just copies it from the input to the  
output. For any pseudo instruction, MiniMa writes 1-3 real instructions into the output.   
Examples:  
Ex a)  
label2: addu $t0,$zero,$zero  
This instruction will be provided to you as the Instruction object:  
Instruction_id rd rs rt imm jump_address shift_amount label branch_label  
addu 8 0 0 0 0 0 "label2" 0  
Because this instruction is already a TAL instruction, you will just copy it into the output array.  
Ex b)  
blt $s0, $t0, label3  
This pseudo instruction, will be provided to you as the instruction object:  
Instruction_id rd rs rt imm jump_address shift_amount label branch_label  
blt 8 0 0 0 0 0 "" "label3"  
Note that label="" because the line the instruction was on is unlabeled.  
This instruction is a pseudo instruction, so we must translate it to TAL instructions. In this case:  
slt $at,$s0,$t0  
University of Iowa, 2020  
bne $at,$zero,label3  
Which you will represent with the following two Instruction objects.  
Instruction_id rd rs rt imm jump_address shift_amount label branch_label  
slt 1 16 8 0 0 0 "" ""  
Instruction_id rd rs rt imm jump_address shift_amount label branch_label  
bne 0 1 0 0 0 0 "" "label3"  
We used $at (the assembler register) to store the result of the comparison. Since MIPS  
programmers are not allowed to use $at themselves, we know we can safely use it for passing  
data between generated TAL instructions.  
IMPORTANT: notice that branch instructions have Immediate=0 in phase 1. Instead, they  
specify the target using branch_label. In phase 2, the branch_label will get translated into the  
correct immediate.  
You must also make sure that you detect I-type instructions that use an immediate using more  
than the bottom 16 bits of the immediate field and translate them to the appropriate sequence  
of instructions.  
This is just the end of the Part 1 background information. There are no tasks to do yet, so  
keep reading!  
2. Convert labels into addresses  
This phase converts logical labels into actual addresses. This process requires two passes over  
the instruction array.   
- Pass one: find the mapping of labels to the PC where that label occurs  
- Pass two: for each instruction with a non-zero branch_label (jumps and branches)  
calculate the appropriate address using the mapping.  
Example  
before phase2: branch target for branch instructions indicated using branch_label field  
Address Label Instruction Important instruction field values  
0x00400000 label1: addu $t0,$t0,$t1 label="label1"  
0x00400004  ori $t0,$t0,0xFF    
0x00400008 label2: beq $t0,$t2,label1 label="label2", branch_label="label1"  
0x0040000C  addiu $t1,$t1,-1   
0x00400010 label3: addiu $t2,$t2,-1 label="label3"  
after phase2: branch target for branch instructions indicated using immediate field  
Address Label Instruction Important instruction field values  
0x00400000 label1: addu $t0,$t0,$t1   
University of Iowa, 2020  
0x00400004  ori $t0,$t0,0xFF    
0x00400008 label2: beq $t0,$t2,-3 immediate = -3  
0x0040000C  addiu $t1,$t1,-1   
0x00400010 label3: addiu $t2,$t2,-1   
This is just the end of the Part 2 background information. There are no tasks to do yet, so  
keep reading!  
3. Translate instructions to binary  
This phase converts each Instruction to a 32-bit integer using the MIPS instruction encoding, as  
specified by the MIPS reference card. We will be able to test the output of this final phase by  
using MARs to translate the same input instructions and compare them byte-for-byte.  
To limit the work you have to do, MiniMa only needs to support the following instructions:  
Instruction  
addiu   (whether it is MAL depends on imm)  
sll  
addu  
or  
beq  
bne  
slt        
lui       
move (always MAL)  
ori    (whether it is MAL depends on imm)  
blt     (always MAL)  
bge   (always MAL)  
This is just the end of the Part 3 background information. Your tasks begin in the next section.  
What you need to do  
1. You will complete the implementation of phase 1 by modifying the file Phase1.c. Do  
not modify Phase1.h.  
/* Translates the MAL instruction to 1-3 TAL instructions  
* and returns the TAL instructions in a list  
*  
* mals: input program as a list of Instruction objects  
* tals: after the function returns, will contain TAL instructions (should be same  
size or longer than input list)  
*/  
void mal_to_tal(struct ArrayList *mals, struct ArrayList *tals);  
University of Iowa, 2020  
If a MAL Instruction is already in TAL format, then you should just copy that Instruction object  
into your output list.   
• It is strongly discouraged to modify the fields of an Instruction object individually.  
Instead you should do the below.  
• To create an Instruction with specific fields, see the functions in  
InstructionFactory.h. Find examples of their usage in AssemblerTest.c.  
If a MAL Instruction is a pseudo-instruction, such as blt, then you should create the TAL  
Instructions that it translates to in order in result ArrayList.  
You must check I-type instructions for the case where the immediate does not fit into 16 bits  
and translate it to lui, ori, followed by the appropriate r-type instruction. Remember: the 16-bit  
immediate check does not need to be done on branch instructions because they do not have  
immediates in phase 1 (see phase 1 description above).  
Use the following translations for pseudo instructions. These translations are the same as MARS  
uses.  
I. Instruction passed to mal_to_tal:  
addiu r1,r2,Immediate        
# when immediate is too large (note that addi/addiu RTL has sign  
extension)  
=>  
Instructions returned from mal_to_tal:  
lui $at,Upper 16-bit immediate              
ori $at,$at,Lower 16-bit immediate  
addu r1,r2,$at  
Note that lui will never be given an immediate too large because it is not well-defined for  
more than 16 bits (MARS also disallows lui with >16-bit immediate, try it yourself).  
II. Instruction passed to mal_to_tal:  
blt r1,r2,gohere  
=>  
Instructions returned from mal_to_tal:  
slt $at,r1,r2  
bne $at,$zero,gohere  
III. Instruction passed to mal_to_tal:  
bge rs,rt,hello  
=>  
Instructions returned from mal_to_tal:  
slt $at,r1,r2  
beq $at,$zero,hello  
University of Iowa, 2020  
IV. Instruction passed to mal_to_tal:  
ori r1,r2,Immediate      
# when immediate is too large (note that ori RTL has zero extension)  
=>  
Instructions returned from mal_to_tal:  
lui $at,Upper 16-bit immediate              
ori $at,$at,Lower 16-bit immediate  
or r1,r2,$at  
V. Instruction passed to mal_to_tal:  
move r1,r2      
=>  
Instructions returned from mal_to_tal:  
addu r1, $0, r2  
NOTES:  
If in doubt about a case of translation there are things you can do before asking for help.  
1) examine the test cases, and   
2) assemble the test program in MARS to see what it produces  
You only need to support the instructions that are already uncommented in Instruction.c. You  
should not change Instruction.c or InstructionFactory.c.  
2. You will complete the implementation of phase 2 by implementing the 2-pass address  
resolution in a function called resolve_addresses in Phase2.c. Do not modify  
Phase2.h.  
/* Returns a list of copies of the Instructions with the  
* immediate field (i-type) or jump_address (j-type) of the instruction filled in  
* with the address calculated from the branch_label.  
*  
* The instruction should not be changed if it is not a branch or jump instruction.  
*  
* unresolved: input program, whose branch/jump instructions don't have resolved  
immediate/jump_address  
* first_pc: address where the first instruction of the program will eventually be placed in  
memory  
* resolved: after the function returns, resolved contains the resulting instructions  
*/  
void resolve_addresses(struct ArrayList *unresolved, uint32_t first_pc, struct ArrayList  
*resolved);  
Using our example from the phase 2 description:  
Address Label Instruction Important instruction field values  
University of Iowa, 2020  
0x00400000 label1: addu $t0,$t0,$t1 label="label1"  
0x00400004  ori $t0,$t0,0xFF    
0x00400008 label2: beq $t0,$t2,label1 label="label2", branch_label=1  
0x0040000C  addiu $t1,$t1,-1   
0x00400010 label3: addiu $t2,$t2,-1 label="label3"  
The first_pc argument is the address where the first instruction in unresolved would be  
written to memory after phase 3. Using the above example, resolve_addresses would be  
called with first_pc=0x00400000.  
Refer to the earlier description of phase 2 for how to calculate the immediate field.  
3. You will complete the implementation of phase 3 by implementing the function  
translate_instruction in Phase3.c. Do not modify Phase3.h.  
/* Translate each Instruction object into  
* a 32-bit number.  
*  
* tals: list of Instructions to translate  
*  
* returns a list of instructions in their 32-bit binary representation  
*  
*/  
public static List translate_instructions(List tals)  
This function produces an encoding of each R-type, I-type, or J-type instruction. Refer to the  
MIPS reference sheet for format of the 32-bit format.   
Make sure to set unused fields to 0. This default value is not required by the MIPS architecture,  
but it is required by the provided MiniMa test code. You'll also notice that MARS chooses to use  
0 for unused fields.  
General implementation help  
• .h files are header files, which declare symbols (functions, structs, enums) that are  
defined elsewhere in a .c file. The idea of putting declarations in an .h file and  
definitions in a .c file is roughly analogous to putting declarations in an interface and  
definitions in a class in Java.  
• If you need a function or struct declared in a .h file, then put the following at the top of  
your .c file.  
#include “filename.h”  
University of Iowa, 2020  
For example, if your code needed to refer to struct Instruction, then put  
#include “Instruction.h”  
• We’ve structured the skeleton code so that you only need to modify…   
o Phase1.c  
o Phase2.c  
o Phase3.c  
o AssemblerTest.c (optional but recommended)  
Running and testing your code  
The three phases are run on several test inputs by …  
make  
./AssemblerTest  
The provided test cases separately test the three Phases. Therefore, you'll have good evidence  
for a correct Phase when all tests named by that Phase # are passing.   
You can add your own tests to AssemblerTest.java. Use an existing test as a template. If you  
don't want to manually calculate what phase3_expected should be for your test, you can let  
MARS do it for you: assemble your input program in MARS and look at the Code column in the  
Text Segment.  
You should add additional test cases to AssemblerTest.java. Find corner cases the tests do not  
cover:   
• other input instructions  
• I-type instructions that do or do not exceed 16 bits  
• different label positions  
• different combinations of instructions  
You are responsible for testing your assembler beyond the given tests. We will use additional  
tests during grading.  
Testing tips  
When you run the tests, you’ll get a message like.  
Assertion failed: (size(expectedP1) == size(tals)), function testHelperPhase1, file  
AssemblerTest.c, line 18.  
Since many tests call testHelperPhase1, this information alone is not enough to know which  
test failed. You’ll need a stack trace. To get one, use gdb. Here’s an example interaction.  
University of Iowa, 2020  
gdb ./AssemblerTest  
(gdb) run  
(gdb) bt  
#0  0x00007ffff7deee35 in raise () from /lib64/libc.so.6  
#1  0x00007ffff7dd9895 in abort () from /lib64/libc.so.6  
#2  0x00007ffff7dd9769 in __assert_fail_base.cold () from /lib64/libc.so.6  
#3  0x00007ffff7de7566 in __assert_fail () from /lib64/libc.so.6  
#4  0x000000000040249d in testHelperPhase1 (input=0x40b2a0, expectedP1=0x40b650) at  
AssemblerTest.c:18  
#5  0x0000000000402c17 in test1Phase1 () at AssemblerTest.c:80  
#6  0x00000000004067b0 in main () at AssemblerTest.c:1174  
Aha! It was test1Phase1 that failed. From here you could examine the variable values by adding  
print statements or by continuing your gdb interaction. Tip: the frame command in gdb takes  
an argument N and goes to the function call labeled #N in the stack trace.  
(gdb) frame 4  
#4  0x000000000040249d in testHelperPhase1 (input=0x40b2a0, expectedP1=0x40b650) at  
AssemblerTest.c:18  
18   assert(size(expectedP1) == size(tals));  
(gdb) display size(expectedP1)  
1: size(expectedP1) = 7  
(gdb) display size(tals)  
2: size(tals) = 0  
What to submit  
For credit your MiniMa implementation must at least compile and be runnable. You should not  
depend on modifications to files other than those you are submitting:   
Required:  
• Phase1.c (method implemented)  
• Phase2.c (method implemented)  
• Phase3.c (method implemented)  
Both partners are responsible for double-double-checking that 3 files submitted to ICON are  
the version of the files that you intended.   
Good job!  
Once you have completed this project, if you added support for the rest of the MIPS  
instructions and the dot (.) directives you could replace the assembler of MARS (i.e., the code  
behind this button ).  
The only piece we’ve excluded is the parser that turns text files of MIPS code into our internal  
representation of instructions: a list of Instruction objects.