首页
网站开发
桌面应用
管理软件
微信开发
App开发
嵌入式软件
工具软件
数据采集与分析
其他
首页
>
> 详细
program编程代做、代写Java, C++/Python程序
项目预算:
开发周期:
发布时间:
要求地区:
LW1
DA53: COMPILERS AND LANGUAGE
THEORY
Lexical Analysis
by hand
00 | Goal
The goal of this on-computer tutorial session is to write a small lexical analyzer (or lexer) by hand.
The development could be done in Java, C# or Python.
Remember that the lexical analyzer takes a source program, and scans it, and replies the recognized
tokens when they are requested by the parser.
Definitions :
• Lexeme : Sequence of characters that is recognized by the parser (usually expressed by
regular expression)
• Token : Symbolic name of a lexeme
During this on-computer tutorial, we must :
Define the language of the source program, its syntax and its semantic.
Write a symbol table
Write the lexer
01 | Language Description
The language to implement in this Labwork (LW) is a part of the Tiny Basic dialect of the BASIC
language.
01.1 Statements
The language is composed of statements, one per line. If this ST we do not recognize
multiple statements on the same line. A line must follow one of the following formats :
statement
number statement
where number is the number of the line, and statement is the instruction of the language.
Lexical
Analyzer Parser
Symbol
Table
tokens
syntax
tree
source
program
getNextToken
The supported statements are :
PRINT expression
▪ Print the given expression on the standard output
IF expression relational_operator expression THEN statement
▪ The statement is executed only if the given relational operator on the two
given expressions replies true.
▪ The relational operators are :
= equality
< less than
> greater than
<= less or equal
>= greater or equal
<>, >< not equal
GOTO expression
▪ Run the statement at the line de)ned by the expression.
INPUTvariable [, variable ...]
▪ Read variable values from the standard input .
LET variable = expression
▪ Assign the value of the expression to the variable.
GOSUB expression
▪ Run the statement at the line de)ned by the expression, but in opposite
than GOTO, remember the line of the GOSUB and run the statement just
after that line when the statement RETURN is encountered.
RETURN
▪ Run the statement at the line following the last GOSUB.
END
▪ Stop the program.
REM text
▪ Insert a comment in the source code.
01.2 Expressions
Many of these statement types involve the use of expressions. An expression is the
combination of one or more numbers, strings or variables, joined by operators, and
possibly grouped by parentheses.
There are four operators: +, -, *, /
Note that the operators + and – have two versions : the binary (a+b) and the unary (+a).
A string of characters is enclosed by the quote characters (").
01.3 Variable Definition and Scope
The variables do not need to be declared. The scope of all the variables is global. So that
a variable, when it is used, could be undened.
All the variables are of numerical type or of string type, depending on the type of the
expression assigned to the variable.
01.4 Example
Let the following Tiny Basic program :
5 LET S = 0
10 INPUT NUM
15 INPUT V
20 LET N = NUM
30 IF N <= 0 THEN GOTO 99
40 LET S = S + V
50 LET N = N - 1
70 GOTO 30
99 PRINT S/NUM
100 END
Your lexer must reply the following tokens, in that order :
<=>
<=>
<=>
<=>
02 | Exercices
02.1 Exercice 1
What is the alphabet of the language ?
What are the lexemes (as regular expressions) of the language ?
Write the table that is matching the lexemes, the tokens, and the attributes of the tokens.
02.2 Symbol Table
Create the class SymbolTable that permits to store several informations about the variables :
name, lexeme, line of the first occurrence, etc.
This symbol table is a kind of map with the name of the variable as key and the description of
the variable as value.
02.2 Tokens
Define the hierarchy of classes that is describing all the tokens of the language.
Note that the token ID should point to an entry of the symbol table.
02.3 Scanner
Create the Scanner class that takes an input stream (Reader is prefered in Java) as
parameter of its constructor.
The scanner is able to read the characters from the stream, to compact white spaces when
possible.
The scanner is able to reply the line in the source program at which the read character is
located.
We recommend to provides to functions : char peek(), and char get(). The first replies the next
available character but does not consume it. The second replies the next available character
and consumes it to pass to the next available character.
02.4 Lexer
Create the Lexer class and a function inside that is able to read a character from an input
stream.
Create a Finite Automata (FA) or an ad-hoc (from-scratch) code that is recognizing the
language. If you are encountering problems to do the FA, you may take the FA at the end of
this document.
Implements the FA or the ad-hoc in the function getNextSymbol() of your Lexer, and it should
use the Scanner.
Remember that the Lexer should not reply a token for the comment statements.
02.5 Main Program
Write the main program that creates the Symbol Table, and the Lexer on an input file.
The main program invokes the getNextSymbol() function of the Lexer to retreive the tokens.
The tokens are displayed on the standard output.
03 | Finite Automata
A Finite Automata (explained in Chapter 2) is able to recognized one or more tokens. When a state is
drawn with two circles, it means that a token is recognized and the Lexer could return it. If a star is
drawn on the side of a two-circle state, it means that it does not consume the character from the input
stream that has permits to reach this star-two-circle state.
03.1 Alphabet
The alphabet of the language is :
0, 1, 2, 3, 4, 5, 6, 7, 8, 9
+, -, /, *
a, b, c, d, …., x, y, z
A, B, C, D, ….., X, Y, Z
., <, >, =, (, ), "
03.2 Finite Automata for Token ID
1 2 3 start letter other *
letter or digit
03.3 Finite Automata for Token RELOP
03.4 Finite Automata for Token NUM
LW2 | LW3
DA53: COMPILERS AND LANGUAGE
THEORY
Lexical Analysis
and
Syntax Analysis
00 | Goal
The goal of this on-computer tutorial session is to write a small interpreter for the Tiny Basic language.
The development must be done in Java.
JavaCC (https://javacc.github.io/javacc/) should be used. This library is also available on Teams.
You must download and complete a skeleton of Java project, downloadable from Teams.
Remember that the lexical analyzer takes a source program, and scans it, and replies the recognized
tokens when they are requested by the parser. The parser generates a syntax tree that can be
executed (interpretation mode).
During this on-computer tutorial, you will :
1. Define the grammar and the Syntax-Directed Definition (SDD) of the Tiny Basic language.
2. Write a symbol table
3. Write the lexer
4. Write the interpretation rules
Lexical
Analyzer Parser
Symbol
Table
tokens
syntax
tree
source
program
getNextToken
01 | Language Definition
The language to implement in this labwork is a part of the Tiny Basic dialect of the BASIC language.
01.1 Statements
The language is composed of statements, one per line. If this labwork we do not
recognize multiple statements on the same line. A line must follow one of the following
formats :
statement
number statement
where number is the number of the line, and statement is the instruction of the language.
The supported statements are :
PRINT expression
▪ Print the given expression on the standard output
IF expression relational_operator expression THEN statement
▪ The statement is executed only if the given relational operator on the two
given expressions replies true.
▪ The relational operators are :
= equality
< less than
> greater than
<= less or equal
>= greater or equal
<>, >< not equal
GOTO expression
▪ Run the statement at the line de(ned by the expression.
INPUTvariable [, variable ...]
▪ Read variable values from the standard input .
LET variable = expression
▪ Assign the value of the expression to the variable.
GOSUB expression
▪ Run the statement at the line de(ned by the expression, but in opposite
than GOTO, remember the line of the GOSUB and run the statement just
after that line when the statement RETURN is encountered.
RETURN
▪ Run the statement at the line following the last GOSUB.
END
▪ Stop the program.
REM text
▪ Insert a comment in the source code.
01.2 Expressions
Many of these statement types involve the use of expressions. An expression is the
combination of one or more numbers, strings or variables, joined by operators, and
possibly grouped by parentheses.
There are four operators: +, -, *, /
Note that the operators + and – have two versions : the binary (a+b) and the unary (+a).
A string of characters is enclosed by the quote characters (").
01.3 Variable Definition and Scope
The variables do not need to be declared. The scope of all the variables is global. So that
a variable, when it is used, could be undened.
All the variables are of numerical type or of string type, depending on the type of the
expression assigned to the variable.
01.4 Example
Let the following Tiny Basic program :
5 LET S = 0
10 INPUT NUM
15 INPUT V
20 LET N = NUM
30 IF N <= 0 THEN GOTO 99
40 LET S = S + V
50 LET N = N - 1
70 GOTO 30
99 PRINT S/NUM
100 END
02 | JavaCC
JavaCC is a Java library and a collection of CLI tools that is generating a lexer and a parser
written in Java.
02.1 Input file
The input file of JavaCC is a file with extension « .jj », named here «tinybasic.jj ».
The input file must start with the Java definition of the Parser (named TinyBasicParser). You may
put in this part all the functions and attributes that are invoked in the grammar rules.
After the Java definition, you must put the lexer rules and the grammar (BNF) rules.
options {
IGNORE_CASE = true;
STATIC = false;
}
PARSER_BEGIN(TinyBasicParser)
package fr.utbm.gi.lo46.tp2.parser;
import java.util.SortedMap;
import java.util.Map;
import java.util.List;
import java.util.TreeMap;
import java.util.ArrayList;
import fr.utbm.gi.lo46.tp2.context.*;
import fr.utbm.gi.lo46.tp2.symbol.*;
import fr.utbm.gi.lo46.tp2.syntaxtree.*;
public class TinyBasicParser {
private int basicLineNumber = 1;
private final SymbolTable symbolTable = new SymbolTable();
/** Replies the symbol table used by the parser.
* @return the symbol table.
*/
public SymbolTable getSymbolTable() {
return this.symbolTable;
}
}
PARSER_END(TinyBasicParser)
02.2 Lexer rule syntax
Each lexer rule defines the lexeme that is recognized by the lexer. The recognized lexemes
are named tokens.
Different types of lexer rules are available : SKIP, TOKEN, …
The best practice in JavaCC is to define tokens only when they are associated to complex
regular expressions, ie. then an regular expression operator is used. For example, the lexeme
« <= » may not be defined in a token rule. Rather is should be directed typed in the BNF rules.
The SKIP rule tells to the lexer to ignore several sequences of characters :
SKIP :
{
" "
| "\t"
| "\r"
}
The TOKEN rule tells to the lexer the token to reply to the syntax analyzer :
TOKEN :
{
< CR: "\n" >
}
02.3 Parser rule syntax
The grammar (BNF) rules defines the syntactic rules of the language.
Each rule is defined like a kind of Java function (with a similar but different syntax) :
void ruleName() :
{
}
{
}
The prototype of the rule has the same syntax as the prototype of a Java function. Return type
and parameters are possible. Indeed, a rule could return a value or take values as parameters
when it is invoked.
The first block after the column character must contains any declaration of Java variable.
The second block after the column character must contains the BNF rules (separated by the
« | » character).
The BNF rule is a sequence of one of :
1. a token
• the value of the token could be retreived with an assign symbol (see
example below).
2. a BNF rule invokation
• it is a call to one of the BNF-rule functions, using the Java syntax.
• the value returned by the BNF-rule could be retreived with an assign symbol (see
example below).
• Parameters could be passed to the BNF-rules function.
3. a Java block
The following example is rule that is recognizing a parenthezed numerical expression, a
number literal, or an identifier.
private Number factor() :
{
Number value;
Token t;
}
{
"(" value = expression() ")"
{ return value; }
| t =
{ return NumberUtil.parse(t.image); }
| t =
{ this.symbolTable.add(t.image,t.beginLine);
return this.symbolTable.get(t.image);
}
}
02.4 Run JavaCC
On the CLI, type :
$> cd src/fr/utm/gi/lo46/tp2/parser
$> javacc tinybasic.jj
02 | Working Steps
02.1 Skeleton
Download the skeleton, and install it into your Eclipse.
The skeleton contains : a symbol table (SymbolTable), a context of interpretation
(ExecutionContext), an interpreter (Interpreter, and SyntaxTreeInterpreter), the main function
(TinyBasicInterpreter), and several abstract classes that correspond to the tree nodes of the
syntax tree.
Takes a look on the code.
02.2 JavaCC Scanner and Lexer Definition
Write the JavaCC input file that is corresponding to the Lexer (the token).
02.3 Complete the JavaCC Definition
Write the grammar inside your JavaCC input file (the BNF rules with the Java blocks empty).
02.4 Parse Tree Node
Create classes for all the nodes that are required in a parse tree dedicated to the Tiny Basic
language.
Each node of the parse tree must corresponds to an element of the language.
Each statement node should be linked to child nodes that are corresponding to the
parameters of the statement.
02.5 Write the SDD in JavaCC
Update your JavaCC input file with the SDD rules that permit to :
check the types
generate the parse tree for the Tiny Basic program.
02.6 Interpretation of the Parse Tree
Add in each node of your parse tree, the function run(...) and evaluate(...) which may be
invoked to run the parse tree as in an interpreter, and to compute the value of an expression.
02.7 Extension of the Language
Update your compiler implementation to support the new statements :
Statement : for id = num to num step num statements next id
Arrays as id ( expression )
The new keyword that is replying the size of an array : LEN ( id )
Statement : while ( expression relop expression ) do statements wend
LW4
DA53: COMPILERS AND LANGUAGE
THEORY
Intermediate
Code
Generation
00 | Goal
The goal of this on-computer tutorial session is to write a small compiler that is able to compile a Tiny
Basic program and generate three-address code program.
The development must be done in Java.
You must reuse your works from the previous on-computer tutorial sessions.
01 | Working Steps
01.1 Three-Address Code
Select the three-address instructions to support and put them in the ThreeAddressInstruction
enumeration.
Write the class ThreeAddressRecord. It is a record that contains the three-address instruction,
the parameters (String), and the result (String). A record may also be associated to a label,
and optionally to a comment. Note that during this tutorial, you should not compute the
addresses of the instructions and of the variables (this computation will be done during
the tutorial TU5).
Write the class ThreeAddressCode. It is a list of three-address records in one hand. In the
other hand, it should permit to :
1. create temporary variables ;
2. maintain a mapping between a TinyBasic line and a three-address code line ;
3. access to the symbol table ;
4. create labels.
01.2 Three-Address Code Generation
Update classes of the syntax tree with the generate() function. This function put in the given
ThreeAddressCode all the three-address statements that may corresponds to the syntax tree
node.
Note that the gosub and goto statements take a TinyBasic line number. You must,
dynamically, retreive the three-address code line that is corresponding to a given TinyBasic
line. This mapping is a part of the generated code by the ThreeAddressCode class.
Note that the symbol table may contains string literals (constants). You may define the threeaddress variables that are corresponding to these constants prior to any execution of the
program.
软件开发、广告设计客服
QQ:99515681
邮箱:99515681@qq.com
工作时间:8:00-23:00
微信:codinghelp
热点项目
更多
代做ceng0013 design of a pro...
2024-11-13
代做mech4880 refrigeration a...
2024-11-13
代做mcd1350: media studies a...
2024-11-13
代写fint b338f (autumn 2024)...
2024-11-13
代做engd3000 design of tunab...
2024-11-13
代做n1611 financial economet...
2024-11-13
代做econ 2331: economic and ...
2024-11-13
代做cs770/870 assignment 8代...
2024-11-13
代写amath 481/581 autumn qua...
2024-11-13
代做ccc8013 the process of s...
2024-11-13
代写csit040 – modern comput...
2024-11-13
代写econ 2070: introduc2on t...
2024-11-13
代写cct260, project 2 person...
2024-11-13
热点标签
mktg2509
csci 2600
38170
lng302
csse3010
phas3226
77938
arch1162
engn4536/engn6536
acx5903
comp151101
phl245
cse12
comp9312
stat3016/6016
phas0038
comp2140
6qqmb312
xjco3011
rest0005
ematm0051
5qqmn219
lubs5062m
eee8155
cege0100
eap033
artd1109
mat246
etc3430
ecmm462
mis102
inft6800
ddes9903
comp6521
comp9517
comp3331/9331
comp4337
comp6008
comp9414
bu.231.790.81
man00150m
csb352h
math1041
eengm4100
isys1002
08
6057cem
mktg3504
mthm036
mtrx1701
mth3241
eeee3086
cmp-7038b
cmp-7000a
ints4010
econ2151
infs5710
fins5516
fin3309
fins5510
gsoe9340
math2007
math2036
soee5010
mark3088
infs3605
elec9714
comp2271
ma214
comp2211
infs3604
600426
sit254
acct3091
bbt405
msin0116
com107/com113
mark5826
sit120
comp9021
eco2101
eeen40700
cs253
ece3114
ecmm447
chns3000
math377
itd102
comp9444
comp(2041|9044)
econ0060
econ7230
mgt001371
ecs-323
cs6250
mgdi60012
mdia2012
comm221001
comm5000
ma1008
engl642
econ241
com333
math367
mis201
nbs-7041x
meek16104
econ2003
comm1190
mbas902
comp-1027
dpst1091
comp7315
eppd1033
m06
ee3025
msci231
bb113/bbs1063
fc709
comp3425
comp9417
econ42915
cb9101
math1102e
chme0017
fc307
mkt60104
5522usst
litr1-uc6201.200
ee1102
cosc2803
math39512
omp9727
int2067/int5051
bsb151
mgt253
fc021
babs2202
mis2002s
phya21
18-213
cege0012
mdia1002
math38032
mech5125
07
cisc102
mgx3110
cs240
11175
fin3020s
eco3420
ictten622
comp9727
cpt111
de114102d
mgm320h5s
bafi1019
math21112
efim20036
mn-3503
fins5568
110.807
bcpm000028
info6030
bma0092
bcpm0054
math20212
ce335
cs365
cenv6141
ftec5580
math2010
ec3450
comm1170
ecmt1010
csci-ua.0480-003
econ12-200
ib3960
ectb60h3f
cs247—assignment
tk3163
ics3u
ib3j80
comp20008
comp9334
eppd1063
acct2343
cct109
isys1055/3412
math350-real
math2014
eec180
stat141b
econ2101
msinm014/msing014/msing014b
fit2004
comp643
bu1002
cm2030
联系我们
- QQ: 9951568
© 2021
www.rj363.com
软件定制开发网!