Coursework: Colliding suns
Parallel Scientific Computing I
Tobias Weinzierl
November 16, 2020
In this assignment, you will write a very simple N-body solver which runs efficiently on a
single node of a supercomputer. It simulates N objects (bodies) in space—think of suns—
which all move and fuse when they hit each other. I have broken down the assignment into
steps of increasing difficulty. Each step’s result is to be handed in separately (see instructions
below), so I can mark them independently.
The assignment steps mirror content from the lecture. I strongly recommend that you
revise the lecture script while you complete the coursework. Consult the script’s appendix for
hints and helpful material on how to produce high-quality solutions. The challenges within the
assignment cover both topics from Parallel Scientific Computing: numerical algorithms and
parallelisation. This assignment is to be completed and handed in via DUO. For
deadlines, please consult the relevant level rubric on DUO (3rd Year Computing).
Submission format
• You are given a code assignment-code.cpp. Use this one as your starting point. It is
slightly different from the demo code we have used throughout the lecture (it plots two
additional lines).
• You submit exactly one zip file which contains files named step-1.cpp, step-2.cpp,
step-3.cpp and step-4.cpp, plus one PDF file report.pdf. If you don’t stick to the
naming conventions, your submission will not be marked. There is a Python validation
script on DUO. You can run it via python3 validate.py my-zip and it will tell you
whether your zip file conforms to the submission guidelines.
• Do not make the code output any additional information to the terminal, i.e. do not
alter any screen output.
• Do not make the code accept any additional arguments. No matter how you alter your
code, it should still only read the parameters my template code accepts, too.
• I will compile your code on a Linux machine with the Intel compiler. My tests will use
the compiler options -fopenmp -O3 -xhost --std=c++0x. On Hamilton, you can try
out these flags with
module load intel/2019.5
icpc -fopenmp -O3 -xhost --std=c++0x assignment-code.cpp
• You can reorganise the code in whichever way you want, but all the implementation has
to end up in the single .cpp file per step. Do not split up your code into multiple files
per step!
• DUO provides a Python script that allows you to generate arbitrarily complex, random
initial configurations of particles. I’ll use this same script to generate scenarios. My
masses of choice for the suns $m_i$ will be normalised, i.e. drawn from $m_i \in (0, 1]$.
Step 1: Multiple free objects
This first step extends the given assignment code into a real N-body code. This is the baseline
of the whole coursework. It is used in all follow-up steps.
In the template code given to you, one object (the first one specified on the command
line) is free, while all the other bodies are fixed in space. Alter the code such that all objects
move freely through space. Furthermore, make two objects fuse into one if they collide and
ensure that the terminal output of the code (minimal object distances, maximum velocities,
final number of objects and object’s position) remains valid.
• Two objects collide if they end up in positions $p_1$ and $p_2$ with
$|p_1 - p_2| \le C (m_1 + m_2)$ with $C = 10^{-2}/N$
after a time step. N is the object count at the simulation startup. There is no need to
check the actual trajectory. Simply use the final position after each step.
• The merged object has a new mass of $m_1 + m_2$.
• You can average the objects’ positions. Use the weighted average here, too.
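The collision rule above can be sketched as follows. This is a minimal illustration, not the template's actual data layout: the `Body` struct and the function names are hypothetical, the weights of the average are assumed to be the masses, and averaging the velocity with the same weights is an assumption that the text does not state explicitly.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical body record; the template code may store its data differently.
struct Body {
  double x[3];  // position
  double v[3];  // velocity
  double mass;
};

// True if |p1 - p2| <= C (m1 + m2) with C = 1e-2 / N, where initialCount
// is the object count N at simulation startup (not the current count).
bool collides(const Body& a, const Body& b, std::size_t initialCount) {
  assert(initialCount > 0);
  const double C = 1e-2 / static_cast<double>(initialCount);
  double d2 = 0.0;
  for (int k = 0; k < 3; ++k) {
    const double d = a.x[k] - b.x[k];
    d2 += d * d;
  }
  return std::sqrt(d2) <= C * (a.mass + b.mass);
}

// Fuse two bodies: the masses add up, the position is the mass-weighted
// average. ASSUMPTION: the velocity uses the same mass weights.
Body merge(const Body& a, const Body& b) {
  Body m;
  m.mass = a.mass + b.mass;
  for (int k = 0; k < 3; ++k) {
    m.x[k] = (a.mass * a.x[k] + b.mass * b.x[k]) / m.mass;
    m.v[k] = (a.mass * a.v[k] + b.mass * b.v[k]) / m.mass;
  }
  return m;
}

// One pass over all pairs, to be called after a time step. A fused-away
// body is overwritten by the last body and the vector shrinks by one.
// Pairs already visited are not re-checked after a merge in this sketch.
void fuseCollidingBodies(std::vector<Body>& bodies, std::size_t initialCount) {
  for (std::size_t i = 0; i < bodies.size(); ++i) {
    std::size_t j = i + 1;
    while (j < bodies.size()) {
      if (collides(bodies[i], bodies[j], initialCount)) {
        bodies[i] = merge(bodies[i], bodies[j]);
        bodies[j] = bodies.back();
        bodies.pop_back();
        // j is not advanced: the swapped-in body still has to be checked.
      } else {
        ++j;
      }
    }
  }
}
```

Only the final positions after a step enter the check, which matches the remark that the actual trajectory need not be examined.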
Marks will be given for the correctness of the solution. The solution to this step is to be
handed in as a single file called step-1.cpp. The step assesses your understanding of the test
code that I used throughout the lecture. It touches on the principle of numerical imprecision
due to time stepping. Finally, it establishes a baseline to assess workload imbalances for the
next steps. I do recommend that you use some basic profiling to eliminate major runtime
bottlenecks.
This step is worth 20 marks (correct movement: 6, fusion: 6, speed of implementation: 8).
Step 2: Vectorisation
This step extends the source code from step 1 and makes it exploit vectorisation.
Assess the vector efficiency of your solution. Where appropriate—you might want to use
performance analysis tools—add OpenMP vectorisation pragmas. The step assesses your
understanding of the vector paradigm and basic OpenMP vectorisation pragmas. Marks will
primarily be awarded for performance, but you have to ensure that you don’t break the
code semantics. The solution to this step is to be handed in as a single file called step-2.cpp.
This step is worth 20 marks (correctness: 5, obtained efficiency: 15).
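As an illustration of an OpenMP vectorisation pragma, the sketch below accumulates the force on one body with `#pragma omp simd` and a reduction. The structure-of-arrays type `Bodies`, the function name and the unit gravitational constant are assumptions made for this example; they are not taken from the template code.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical structure-of-arrays layout: one contiguous array per
// coordinate, so that the inner loop reads memory with unit stride.
struct Bodies {
  std::vector<double> x, y, z, mass;
};

// Accumulates the force acting on body i (gravitational constant assumed
// to be 1). The three partial sums are declared as simd reductions.
void forceOnBody(const Bodies& b, std::size_t i, double force[3]) {
  assert(i < b.x.size());
  const double* x = b.x.data();
  const double* y = b.y.data();
  const double* z = b.z.data();
  const double* m = b.mass.data();
  const int n = static_cast<int>(b.x.size());
  const int self = static_cast<int>(i);

  double fx = 0.0, fy = 0.0, fz = 0.0;
  #pragma omp simd reduction(+ : fx, fy, fz)
  for (int j = 0; j < n; ++j) {
    const double dx = x[j] - x[self];
    const double dy = y[j] - y[self];
    const double dz = z[j] - z[self];
    const double d2 = dx * dx + dy * dy + dz * dz;
    // The body exerts no force on itself; the conditional also avoids
    // the division by zero for j == i.
    const double scale =
        (j == self) ? 0.0 : m[self] * m[j] / (d2 * std::sqrt(d2));
    fx += scale * dx;
    fy += scale * dy;
    fz += scale * dz;
  }
  force[0] = fx;
  force[1] = fy;
  force[2] = fz;
}
```

Whether the compiler actually vectorises this loop has to be checked with a compiler report or a performance analysis tool; the pragma alone does not guarantee efficient vector code.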
Step 3: Higher order time integration
This step extends the source code from step 2, and adds a better numerical scheme. If you
struggle with step 2, you are allowed to start with the solution of step 1.
Implement an RK(2) (Runge-Kutta 2) algorithm for your particle movement. This scheme
is also called Runge’s method (cmp. script).
The step assesses your understanding of numerical time integrators. Your lecture script
discusses multiple variants of time stepping schemes with second-order accuracy. The most
widely used scheme for N-body codes is the leapfrog scheme. I however want you to implement
Runge-Kutta here to assess that you understand RK’s data flow.
Marks will be awarded for the correctness of the implementation. The solution to this
step is to be handed in as a single file called step-3.cpp.
This step is worth 20 marks.
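To make the data flow of a two-stage scheme concrete, here is a minimal sketch for a single particle in one dimension. It assumes that Runge's method refers to the explicit midpoint rule; the lecture script remains the authoritative definition. The `State` struct, the function name and the `accel` callback are hypothetical.

```cpp
#include <cassert>
#include <functional>

// Hypothetical state of one particle in one dimension.
struct State {
  double x;  // position
  double v;  // velocity
};

// One explicit midpoint step for x' = v, v' = accel(x).
State midpointStep(const State& s, double h,
                   const std::function<double(double)>& accel) {
  assert(h > 0.0);
  // Stage 1: explicit Euler half step to the midpoint of the interval.
  const double xMid = s.x + 0.5 * h * s.v;
  const double vMid = s.v + 0.5 * h * accel(s.x);
  // Stage 2: full step from the old state, using the derivatives
  // evaluated at the midpoint state.
  return {s.x + h * vMid, s.v + h * accel(xMid)};
}
```

In an N-body setting the second stage needs the forces evaluated at the midpoint positions of all bodies, so every body's half step has to be complete before any full step is taken.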
Step 4: OpenMP
This step extends the source code from step 3 and adds multicore parallelism. If you
struggle with step 3, you are allowed to fall back to the solution from step 1 or step 2.
Parallelise your code with OpenMP pragmas. Ensure that you do not break
the global statistics $v_{\max}$ and $dx_{\min}$ which are plotted after each step.
The step assesses your understanding of the OpenMP parallelisation paradigm. Marks will
be awarded for performance, but you have to ensure that you don’t break the code semantics.
The solution to this step is to be handed in as a single file called step-4.cpp.
This step is worth 20 marks (correctness: 5, obtained efficiency: 15).
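Global statistics such as a maximum velocity and a minimum distance are typical candidates for OpenMP reduction clauses. The sketch below shows one way to keep them correct in a parallel loop; the `Bodies` and `Statistics` types and the function name are hypothetical and do not mirror the template code.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <limits>
#include <vector>

// Hypothetical structure-of-arrays layout for positions and velocities.
struct Bodies {
  std::vector<double> x, y, z, vx, vy, vz;
};

struct Statistics {
  double vMax;   // largest velocity magnitude
  double dxMin;  // smallest distance between two bodies
};

// Each thread keeps private copies of the two statistics; OpenMP combines
// them at the end of the loop, so no thread overwrites another's result.
Statistics globalStatistics(const Bodies& b) {
  assert(b.x.size() == b.vx.size());
  const long n = static_cast<long>(b.x.size());
  double vMax2 = 0.0;
  double dxMin2 = std::numeric_limits<double>::max();

  #pragma omp parallel for reduction(max : vMax2) reduction(min : dxMin2)
  for (long i = 0; i < n; ++i) {
    const double v2 = b.vx[i] * b.vx[i] + b.vy[i] * b.vy[i] + b.vz[i] * b.vz[i];
    vMax2 = std::max(vMax2, v2);
    for (long j = i + 1; j < n; ++j) {
      const double dx = b.x[j] - b.x[i];
      const double dy = b.y[j] - b.y[i];
      const double dz = b.z[j] - b.z[i];
      dxMin2 = std::min(dxMin2, dx * dx + dy * dy + dz * dz);
    }
  }
  // Square roots are taken once at the end rather than per pair.
  return {std::sqrt(vMax2), std::sqrt(dxMin2)};
}
```

The inner loop gets shorter as `i` grows, so the default schedule may distribute the work unevenly; a different `schedule` clause is worth measuring.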
Step 5: Report
This step assesses your ability to run numerical/HPC experiments. You should use your
solution of step 4 for your experiments.
Study the scalability of your code and compare it to a strong scaling model. What are
the involved constants? You can either use your own machine (if it is reasonably powerful)
or use Durham’s supercomputer Hamilton or use another Durham machine.
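One common strong scaling model is Amdahl's law, where the constant to calibrate is the serial fraction $f$ of the runtime. Whether this is the model intended here is an assumption; the function names below are hypothetical. With $t(p) = f\,t(1) + (1-f)\,t(1)/p$, the serial fraction can be estimated from two measured runtimes:

```cpp
#include <cassert>

// Speedup predicted by Amdahl's law for serial fraction f on p cores:
// S(p) = 1 / (f + (1 - f) / p).
double amdahlSpeedup(double f, int p) {
  assert(p >= 1 && f >= 0.0 && f <= 1.0);
  return 1.0 / (f + (1.0 - f) / p);
}

// Estimates f from the runtime t1 on one core and the runtime tp on
// p > 1 cores by solving 1/S = f + (1 - f)/p for f, with S = t1 / tp.
double estimateSerialFraction(double t1, double tp, int p) {
  assert(p > 1 && t1 > 0.0 && tp > 0.0);
  const double inverseSpeedup = tp / t1;
  return (inverseSpeedup - 1.0 / p) / (1.0 - 1.0 / p);
}
```

Estimating $f$ from several core counts and comparing the values shows how well a single constant describes the measurements.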
Furthermore, assess the convergence order of your time stepping scheme for both the
scheme from step 1 and the scheme from step 3. For this, I recommend that you take a
two-particle setup and you track where the particles collide. Rely solely on numerical data,
i.e. do not try to use any analytical solution.
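A convergence order can be estimated from numerical data alone by comparing results for three time step sizes $h$, $h/2$ and $h/4$. The sketch below assumes that the measured quantity (for example the position where the two particles collide) behaves like $u(h) \approx u^* + c\,h^p$; the function name is hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Observed convergence order from three results computed with time step
// sizes h, h/2 and h/4. If u(h) = u* + c h^p, the differences of
// successive results shrink by a factor of 2^p, so
// p = log2(|u_h - u_{h/2}| / |u_{h/2} - u_{h/4}|).
double observedOrder(double uH, double uH2, double uH4) {
  const double coarse = std::fabs(uH - uH2);
  const double fine = std::fabs(uH2 - uH4);
  assert(coarse > 0.0 && fine > 0.0);
  return std::log2(coarse / fine);
}
```

No analytical solution enters the estimate, because the unknown limit $u^*$ cancels in the differences.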
The step assesses your understanding of how to design numerical and upscaling experiments,
how to present findings and how to calibrate models. Marks will be awarded for
the data and the presentation. Besides some background information about the machine and
the like, your report should consist of two plots (one for the scaling and one for the convergence
behaviour) and a brief paragraph that interprets the data. The solution to this step is to be
handed in as a one-page PDF report report.pdf.
This step is worth 20 marks (data: 5, speedup plot: 5, convergence plot: 5, interpretation: 5).
Hamilton
If you plan to use Hamilton, please consider:
• Start early. If you submit tests to run on the machine, these tests are queued—you
share the resources with others—so you have to expect a few hours until they start.
• Familiarise yourself with the machine early. To use a supercomputer, you have to write
some startup scripts, you have to familiarise yourself with the modules environment,
and so forth. It is not difficult, but you might have to invest some time.
• Make your supercomputer runs as brief as possible. This way, the scheduler on the
supercomputer will squeeze them in whenever some idle time arises on a node. For the
speedup studies, a few minutes of simulation time typically do the job.