首页
网站开发
桌面应用
管理软件
微信开发
App开发
嵌入式软件
工具软件
数据采集与分析
其他
首页
>
> 详细
CSC3050代做、C++程序语言代写
项目预算:
开发周期:
发布时间:
要求地区:
CSC3050 Project 3: RISC-V Simulator with RVV
1 Background
RISC-V, an open standard instruction set architecture (ISA), has rapidly become a
pivotal force in academic research and industrial development due to its flexibility
and open-source nature. Unlike proprietary ISAs, RISC-V offers the freedom for
developers to customize and extend the architecture, making it an ideal platform
for innovation in research, education, and the design of specialized hardware. One
of its most impactful extensions is the RISC-V Vector Extension (RVV), which
introduces efficient vector processing capabilities—a cornerstone of modern high performance computing. This is especially critical for applications like machine
learning, cryptography, and scientific simulations, where parallel data processing is
essential for improving computational speed and efficiency.
In this project, you are tasked with extending the QTRVSim RISC-V simulator
to support vector operations by implementing some of the RVV instructions.
After reviewing the number of cycles, you will get a feeling of how this is faster
than conducting element-wise operations.
Start early, this project can be time-consuming if you are not familiar with
simulators.
2 QTRVSim
QTRVSim is a RISC-V CPU simulator for education, where you can try its online
version on this link. Just in case you want to try different instructions, you can refer
to this page: RISC-V Instruction Set Specifications. A helpful video about using
QTRVSim can be found on Youtube
After familiarizing yourself with the QtRVSim manual, you can begin planning how
to integrate RVV instructions into the existing implementation. The simulator’s
source code, written in C++ and including both the core simulation functions and
graphical user interfaces (GUIs), can be found in the repository at this link. To test
your modifications, QtRVSim offers two methods for simulating assembly code: GUI
or command-line prompts.
Note: For this project, you are not required to modify any of the GUI components.
Your primary goal is to ensure that the RVV instructions function correctly when
using command-line prompts. Another objective in this project is to save the number
of cycles; the smaller the number you get, the better the score you get.
1
2.1 How to run
We give the example of running QTRVSim on Ubuntu with the terminal. You can
follow these steps:
1. We assume you already have the necessary packages for compiling cpp. If
not, you can easily find tutorial for them on the internet.
2. Install QT6 (QT5 does not work in most cases) with sudo apt install qt6-
base-dev. You might need sudo apt update first, and make sure you are
installing QT6, not QT5.
3. Download QTRVSim from the given repository.
4. Make a new directory for building files (mkdir build; cd build)
5. cmake -DCMAKE BUILD TYPE=Release /path/to/qtrvsim
6. make -j X, where X is the number of threads you want to use
7. If everything goes correctly, you can use ./target/qtrvsim cli –asm XXXXX.S
to run your .S file.
8. Via ./target/qtrvsim cli –help, you can check all helpful arguments.
3 RVV Instructions
In this assignment, you are required to implement the following RVV instructions
(suppose max vector size is 32):
1. vsetvl rd, rs1, rs2: sets the length register vl to rs1 and rd, also sets the
register holding the type of vector to rs2 (8/16/32).
2. vadd.vv vd, vs2, vs1: adds two vectors vs2 and vs1, and stores the result
in vd
3. vadd.vx vd, vs2, rs1: adds rs1 to each element of vector vs2, and stores
the result in vd
4. vadd.vi vd, vs2, imm: adds the scalar value imm to each element of vector
vs2, and stores the result in vd
5. vmul.vv vd, vs2, vs1: conducts dot production on two vectors vs2 and vs1,
and stores the result in vd
6. vlw.v vd, (rs1): loads elements stored starting at rs1 into vector vd. The
length to load is dependent on the length stored at vl and the unit length
specified earlier.
7. vsw.v vs3, (rs1): stores vector elements of vs3 into memory starting at rs1.
The length to load is dependent on the length stored at vl and the unit length
specified earlier.
2
Figure 1: Matrix stored as vector
The whole point of this project is that, through the implementation, you will
understand why are vector operations is much faster than manipulate each ele ment individually. For example, writing 100 elements into memory will require 100
individual store instructions if in an element-wise manner. However, using vector
write, you only need to do one vector store instruction.
A detailed explanation of RVV instructions can be found at this manual. Reminder:
Do not forget to update vl when switching to operate on vectors with different
lengths.
4 Matrix Multiplication
After implementing and testing the aforementioned functionalities, you are required
to write a .S file that conduct matrix to matrix multiplication.
Ci,j =
X Ai,kBk,j
k
The actual matrix will be stored as a vector in memory, as shown in Figure 1. In
order to conduct vector multiplication, the size of the matrix n × m will be given.
We require you to generate two random matrices with sizes of 20 × 46 and
46 × 50 where elements can be of your own choice.
5 Tricks
There are several tricks you can apply to reduce cycle counts.
1. Reduction (required): This is similar to calculate the summation of a
vector, but more efficiently. The basic requirement is that you conduct this
summation on each element one-by-one, which leads to excessive cycles.
Another approach is to do binary split, i.e. repeatedly decompose the a vector
of size n into 2 vectors of size n//2, and then conduct vadd. There are also
other trick for conducting reduction, and you can explore any of them.
3
Possible reduction:
(a) scalar loop
(b) vector shift
(c) reduction instruction
(d) ...
2. Chaining (Extra credit): When conducting vector operations, it is not nec essary to wait for the entire instruction to complete. As shown in Figure 2, it
is possible to conduct VADD on the first element, right after obtaining the
first element of VMUL. A much better illustration can be found at Prof.Hsu’s
slides at this link.
Figure 2: chaining
6 Instruction on Implementation
The code involved in QTRVSim is quite complicated. Luckily, you only need to
focus on few script files.
1. src/machine/instruction.cpp: Edit this file to add new instructions. The
boxed fields are:
• instruction name
• instruction enum type (you can edit this by yourself; no need to follow
the example)
• input types (you can go through instruction.cpp to see what char is for
what type)
• machine code (hexadecimal)
• mask for effective bits for instruction (hexadecimal)
• customize flags (you can edit this by yourself; no need to follow the
example)
2. src/machine/core.cpp: Main pipeline of the simulator. You can find fetch,
decode, execute, writeback, memory in it, and edit these codes for your con venience.
4
3. src/machine/execute/alu.cpp: specify what to do for each alu operation.
You can create/edit these codes for your own convenience.
Other files might also interest you, but we will not go through all of them here.
Feel free to modify any codes as long as they work.
Notice: you need to use state.cycle count++; in core.cpp when needed.
Notice2: If you want to use v1,v2... as the vector register, you can modify
parse reg from string() in instruction.cpp.
Notice3: You might want to check dt.num rt, dt.num rd, dt.num rs for specific
register indexing.
Notice4: The largest vector register length is 32. Load instruction will have a
memory latency of 32. Besides, the cycles for multiplication is 4. (This means that,
to load a vector of length 10, the total cycles will be 1 + 1 + 32 + 10 + 1 + 1 = 46)
7 Grading Criteria
The maximum score you can get for this lab is 100 points. We will first exam ine the correctness of your outputs to test cases. Since hard-coding each opera tion is fairly easy in C++, we will check the execution information, such as the
number of cycles, and content in memories/registers. Using of ChatGPT to im prove writing/generate codes/provide ideas is allowed and highly-recommended
as ChatGPT has become one of the best productivity tools.
Conducting ”higher-level” reduction or finishing the task with less number of cycles
will be granted with extra credit.
You are also required to compose a report, where you should show the results
of your test case executions. Besides you also need to show the total number of
cycles and explain where those cycles come from. (few sentences, no need to be
super specific.)
The deadline of this project is 23:59, Tuesday, 2024/11/19. For each day after
the deadline, 10 points will be deducted from your final score up to 30 points, after
which you will get 0 points.
Besides, if anyone is interested in developing with QT, you are more than welcome
to implement GUI support for RVV instruction. If done properly, you will earn extra
credits, and might contribute to future contents of this class.
Feel free to ask questions if you find anything confusing.
5
8 Submission
You should make sure your code compiles and runs. Then, it should be compressed
into a .zip file and submitted to BlackBoard. Any necessary instructions to
compile and run your code should also be documented and included. Finally, you are
also required to include a report containing the results of your test case execution.
6
软件开发、广告设计客服
QQ:99515681
邮箱:99515681@qq.com
工作时间:8:00-23:00
微信:codinghelp
热点项目
更多
代做 program、代写 c++设计程...
2024-12-23
comp2012j 代写、代做 java 设...
2024-12-23
代做 data 编程、代写 python/...
2024-12-23
代做en.553.413-613 applied s...
2024-12-23
代做steady-state analvsis代做...
2024-12-23
代写photo essay of a deciduo...
2024-12-23
代写gpa analyzer调试c/c++语言
2024-12-23
代做comp 330 (fall 2024): as...
2024-12-23
代写pstat 160a fall 2024 - a...
2024-12-23
代做pstat 160a: stochastic p...
2024-12-23
代做7ssgn110 environmental d...
2024-12-23
代做compsci 4039 programming...
2024-12-23
代做lab exercise 8: dictiona...
2024-12-23
热点标签
mktg2509
csci 2600
38170
lng302
csse3010
phas3226
77938
arch1162
engn4536/engn6536
acx5903
comp151101
phl245
cse12
comp9312
stat3016/6016
phas0038
comp2140
6qqmb312
xjco3011
rest0005
ematm0051
5qqmn219
lubs5062m
eee8155
cege0100
eap033
artd1109
mat246
etc3430
ecmm462
mis102
inft6800
ddes9903
comp6521
comp9517
comp3331/9331
comp4337
comp6008
comp9414
bu.231.790.81
man00150m
csb352h
math1041
eengm4100
isys1002
08
6057cem
mktg3504
mthm036
mtrx1701
mth3241
eeee3086
cmp-7038b
cmp-7000a
ints4010
econ2151
infs5710
fins5516
fin3309
fins5510
gsoe9340
math2007
math2036
soee5010
mark3088
infs3605
elec9714
comp2271
ma214
comp2211
infs3604
600426
sit254
acct3091
bbt405
msin0116
com107/com113
mark5826
sit120
comp9021
eco2101
eeen40700
cs253
ece3114
ecmm447
chns3000
math377
itd102
comp9444
comp(2041|9044)
econ0060
econ7230
mgt001371
ecs-323
cs6250
mgdi60012
mdia2012
comm221001
comm5000
ma1008
engl642
econ241
com333
math367
mis201
nbs-7041x
meek16104
econ2003
comm1190
mbas902
comp-1027
dpst1091
comp7315
eppd1033
m06
ee3025
msci231
bb113/bbs1063
fc709
comp3425
comp9417
econ42915
cb9101
math1102e
chme0017
fc307
mkt60104
5522usst
litr1-uc6201.200
ee1102
cosc2803
math39512
omp9727
int2067/int5051
bsb151
mgt253
fc021
babs2202
mis2002s
phya21
18-213
cege0012
mdia1002
math38032
mech5125
07
cisc102
mgx3110
cs240
11175
fin3020s
eco3420
ictten622
comp9727
cpt111
de114102d
mgm320h5s
bafi1019
math21112
efim20036
mn-3503
fins5568
110.807
bcpm000028
info6030
bma0092
bcpm0054
math20212
ce335
cs365
cenv6141
ftec5580
math2010
ec3450
comm1170
ecmt1010
csci-ua.0480-003
econ12-200
ib3960
ectb60h3f
cs247—assignment
tk3163
ics3u
ib3j80
comp20008
comp9334
eppd1063
acct2343
cct109
isys1055/3412
math350-real
math2014
eec180
stat141b
econ2101
msinm014/msing014/msing014b
fit2004
comp643
bu1002
cm2030
联系我们
- QQ: 9951568
© 2021
www.rj363.com
软件定制开发网!