CMSC 421 Assignment One
Neural Networks and Optimization
September 12, 2023
General Instructions. Please submit TWO (2) files to ELMS:
(1) a PDF file that is the report of your experimental results and answers to the questions.
(2) a codebase submission in the form of a zip file, including only the code folders/files you modified and
the Questions folder. Please do not submit the Data folder we provided. The code should contain
your implementations of the experiments and code for producing visualizations of the results.
The project is due at 11:59 pm on September 26 (Monday), 2023.
Please read through this document before starting your implementation and experiments. Your score
will be mostly dependent on the completion of experiments, the effectiveness of the reported results,
visualizations, the consistency between the experimental results and analysis, and the clarity of the
report. Neatness and clarity count! Good visualization helps!
As you will need to use PyTorch for the second half of the programming assignment (Section 4, Convolutional
Neural Networks - 15 Points), we have included links to some tutorials and documentation to help
you get started with PyTorch:
• Official Pytorch Documentation
• Quickstart Guide
• Tensors
• Data Loading
• Building models in Pytorch
Implementation Details
For each problem, you’ll need to code both the training and application phases of the neural network.
During training, you’ll adjust the network’s weights and biases using gradient descent. Use a single
parameter, η, to control the step size during gradient descent. The updated weights and biases are
calculated as the old values minus the gradient multiplied by the step size, i.e., $\theta \leftarrow \theta - \eta \nabla_\theta L$.
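For concreteness, a minimal numpy sketch of this update rule, assuming the gradients have already been computed by backpropagation (the function and variable names are illustrative, not part of the provided template):

```python
import numpy as np

def gradient_step(params, grads, eta):
    """One gradient-descent update: theta <- theta - eta * grad.

    params and grads are matching lists of numpy arrays (weights and
    biases, and their gradients); eta is the step size.
    """
    return [p - eta * g for p, g in zip(params, grads)]

# Example: one update of a weight matrix and a bias vector.
W, b = np.random.randn(3, 2), np.zeros(2)
dW, db = np.ones((3, 2)), np.ones(2)
W, b = gradient_step([W, b], [dW, db], eta=0.01)
```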
We will be providing code snippets and datasets for some parts of the assignment. You will be required
to read the comments in the code files and fill in the missing pieces so that these files execute correctly.
Please ensure that you read through all the code files we provide. These will be available
in the CMSC421 - Fall2023 GitHub repository.
Part 1: Programming Task - (50 Points)
Objective
The goal of this assignment is to build a neural network from scratch, focusing on implementing the
backpropagation algorithm. You’ll apply your neural network to simple, synthetic datasets to gain
hands-on experience in tuning network parameters.
Language and Libraries
Python is mandatory for this assignment. Use numpy for all linear algebra operations. Do not use
machine learning libraries like PyTorch or TensorFlow for Questions 1, 2 & 3; only numpy, matplotlib,
and Python built-in libraries are permitted.
1 Simple Linear Regression Model - (10 Points)
1.1 Network Architecture
• The network consists of an input layer, a hidden layer with one unit, a bias layer, and an output
layer with one unit.
• The output is a linear combination of the input, represented as $a^1 = \sum_{k} w^0_k\, a^0_k + b^1$.
1.2 Loss Function
Use a regression loss for training, defined as
$$\frac{1}{2} \sum_{i=1}^{n} \left( y_i - a^1(x_i) \right)^2$$
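As a reference point, here is a minimal numpy sketch of this model and loss, together with the gradients that gradient descent would use (names are illustrative; your solution should follow the provided Trainer/template structure):

```python
import numpy as np

def forward(X, w, b):
    # X: (n, d) inputs; w: (d,) weights; b: scalar bias -> a1 for each example.
    return X @ w + b

def loss_and_grads(X, y, w, b):
    pred = forward(X, w, b)
    err = pred - y                   # a1(x_i) - y_i
    loss = 0.5 * np.sum(err ** 2)    # (1/2) * sum_i (y_i - a1(x_i))^2
    grad_w = X.T @ err               # dL/dw
    grad_b = np.sum(err)             # dL/db
    return loss, grad_w, grad_b
```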
1.3 Implementation
Using the template_for_solitions file, write code to train this network and apply it to both
1D data (q1_a) and higher-dimensional data (q1_b).
• Data Preparation: Use the q1_a and q1_b functions from the Data.generator module to generate
training and testing data; call the appropriate one for each experiment.
• Network Setup: Use the net_setup method in the Trainer class to initialize the network, loss
layer, and optimizer.
• Training: Use the train method in the Trainer class to train the network. Plot the training
loss over iterations.
• Testing: Use the test data to evaluate the model’s performance. Plot the actual vs. predicted
values and compute evaluation metrics.
Tests and Experiments
1.4 Hyperparameters
• The main hyperparameters are the step size (η) and the number of gradient descent iterations.
• You may also have implicit hyperparameters like weight and bias initialization.
Hyperparameter Tuning
Discuss the difficulty level in finding an appropriate set of hyperparameters.
2 A Shallow Network - (10 Points)
The goal of this assignment is to implement a fully connected neural network with a single hidden
layer and a ReLU (Rectified Linear Unit) activation function. The network should be flexible enough
to accommodate any number of units in the hidden layer and any size of input, while having just one
output unit.
2.1 Network Architecture
The network consists of an input layer, a hidden layer with an arbitrary number of units (each with a
bias), a ReLU activation, and an output layer with one unit.
• Input Layer: $a^0_1, a^0_2, \ldots, a^0_d$
• Hidden Layer: $z^1_j = \sum_{k=1}^{d} w^1_{jk}\, a^0_k + b^1_j$
• ReLU Activation: $a^1_j = \max(0, z^1_j)$
• Output Layer: $a^2 = \sum_{k} w^2_k\, a^1_k + b^2$ (the sum runs over the hidden units)
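A minimal numpy sketch of this forward pass, assuming m hidden units (illustrative names only; your implementation should live in the provided layer classes):

```python
import numpy as np

# x: (d,) input; W1: (m, d) and b1: (m,) for m hidden units; W2: (m,), b2: scalar.
def forward(x, W1, b1, W2, b2):
    z1 = W1 @ x + b1            # z1_j = sum_k w1_jk * a0_k + b1_j
    a1 = np.maximum(0.0, z1)    # ReLU: a1_j = max(0, z1_j)
    return W2 @ a1 + b2         # a2: single linear output unit
```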
2.2 Loss Function
Continue to use a regression loss in training the network, defined as
$$\frac{1}{2} \sum_{i=1}^{n} \left( y_i - a^2(x_i) \right)^2$$
2.3 Implementation
Using the template_for_solitions file, write code to train this network and apply it to both
1D data (q2_a.py) and higher-dimensional data (q2_b.py).
• Data Preparation: Use the q2_a and q2_b functions from the Data.generator module to generate
training and testing data; call the appropriate one for each experiment.
• Network Setup: Use the net_setup method in the Trainer class to initialize the network, loss
layer, and optimizer.
• Training: Use the train method in the Trainer class to train the network. Plot the training
loss over iterations.
• Testing: Use the test data to evaluate the model’s performance. Plot the actual vs. predicted
values and compute evaluation metrics.
Tests and Experiments
2.4 Hyperparameters
You now have an additional hyperparameter: the number of hidden units.
Hyperparameter Tuning:
• Discuss the difficulty in finding an appropriate set of hyperparameters.
• Compare the difficulty level between solving the 1D problem and the higher-dimensional problem.
3 General Deep Learning - (15 Points)
The goal of this section of the assignment is to extend your neural network to handle fully-connected
networks of arbitrary depth. It will be just like the network in Problem 2, but with more layers. Each
layer uses a ReLU activation function, except for the final layer.
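For intuition, a minimal numpy sketch of an arbitrary-depth forward pass under these rules (illustrative, not the required template structure):

```python
import numpy as np

# ReLU after every layer except the last, which stays linear for regression.
def forward(x, weights, biases):
    a = x
    for i, (W, b) in enumerate(zip(weights, biases)):
        z = W @ a + b
        a = z if i == len(weights) - 1 else np.maximum(0.0, z)
    return a

# Example: a 3-hidden-layer network on 2-D input.
sizes = [2, 8, 8, 8, 1]
rng = np.random.default_rng(0)
weights = [rng.normal(size=(m, n)) * 0.1 for n, m in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(m) for m in sizes[1:]]
print(forward(np.array([0.5, -1.0]), weights, biases))
```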
Tests and Experiments
• Test your network with the same training data that you used in Problem 2 (A Shallow Network),
using both 1D and higher-dimensional data. Experiment with using 3 and 5 hidden layers.
Evaluate the accuracy of your solutions in the same way as in Problem 2.
• Conduct and report on experiments to determine whether the depth of a network has any significant effect on how quickly your network can converge to a good solution. Include at least one
plot to justify your conclusions.
Again ensure your files are saved as q3_a.py and q3_b.py.
EXTRA CREDIT (EC): Cross-Entropy Loss (10 Points). Modify your network from Section 3 (General
Deep Learning) to perform classification tasks using a cross-entropy loss and a logistic
activation function in the output layer.
If you are submitting the EC save the code files as qec_a.py and qec_b.py.
3.1 Network Architecture
• Input Layer: Arbitrary size
• Hidden Layers: ReLU activation, arbitrary depth
• Output Layer: Logistic activation function defined as $a^L_1 = \frac{1}{1 + e^{-z^L_1}}$
3.2 Loss Function
Use a cross-entropy loss defined as:
$$-\sum_{i=1}^{n} \left[ y_i \log\left(a^L_1(x_i)\right) + (1 - y_i) \log\left(1 - a^L_1(x_i)\right) \right]$$
Here, $y_i$ is assumed to be a binary value (0 or 1).
3.3 Note on Numerical Stability
Be cautious when exponentiating numbers in the sigmoid function to avoid overflow. Utilize np.maximum
and np.minimum for a concise implementation.
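One possible reading of that hint, sketched in numpy (the clipping bound of 30 is an arbitrary choice, and the guard inside the cross-entropy below is likewise just one option):

```python
import numpy as np

def stable_sigmoid(z):
    # Clamp z so np.exp never sees a huge magnitude; 30 is an arbitrary bound.
    z = np.minimum(np.maximum(z, -30.0), 30.0)
    return 1.0 / (1.0 + np.exp(-z))

def binary_cross_entropy(y, p, eps=1e-12):
    # Guard the logs as well, so p = 0 or 1 cannot produce -inf.
    p = np.clip(p, eps, 1.0 - eps)
    return -np.sum(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))
```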
Tests and Experiments
3.4 Test Scenarios
1. 1D Data Tests:
• Linearly Separable Data:
– Vary the margin between points and the number of layers.
– Investigate the difficulty in finding hyperparameters based on the margin.
– Examine the speed of convergence based on the margin. Include plots.
• Non-Linearly Separable Data:
– Note the differences you observe when the data is not linearly separable.
2. Higher-Dimensional Data Tests:
• Repeat the experiments with higher-dimensional data.
• Use both linearly separable and non-linearly separable data sets.
• Include data to support your conclusions.
4 Convolutional Neural Networks - 15 Points
In this section, you are required to implement a Convolutional Neural Network (CNN) using PyTorch
to classify images from the provided CINIC-10 dataset.
Requirements
Your CNN model should meet the following criteria:
(A) Utilize dropout for regularization. Mathematically, dropout sets a fraction p of the input units
to 0 at each update during training time, which helps to prevent overfitting.
(B) Be trained with the RMSprop and ADAM optimizers, separately. The update rule for
RMSprop is given by:
$$\theta_{t+1} = \theta_t - \frac{\eta}{\sqrt{v_t} + \epsilon} \cdot g_t$$
where $\theta$ are the parameters, $\eta$ is the learning rate, $v_t$ is the moving average of the squared
gradient, $\epsilon$ is a smoothing term to avoid division by zero, and $g_t$ is the gradient.
For ADAM, the update rule is:
$$\theta_{t+1} = \theta_t - \frac{\eta \cdot \hat{m}_t}{\sqrt{\hat{v}_t} + \epsilon}$$
where $\hat{m}_t$ and $\hat{v}_t$ are bias-corrected estimates of the first and second moments of the gradients.
Report on how each optimizer performed.
(C) Include at least 3 convolutional layers and 2 fully connected layers. The convolution operation
can be represented as:
$$(f * g)(t) = \sum_{\tau} f(\tau) \cdot g(t - \tau)$$
(D) Use wandb for visualization of the training loss $L$, which could be the cross-entropy loss for
classification:
$$L = -\sum_i y_i \log(\hat{y}_i)$$
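A hedged PyTorch sketch satisfying criteria (A)-(C) is shown below; the layer widths, dropout rate, and learning rates are illustrative choices, not requirements (CINIC-10 images are 3x32x32 with 10 classes):

```python
import torch
import torch.nn as nn

class SmallCNN(nn.Module):
    def __init__(self, p_drop=0.25):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),   # -> 32x16x16
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # -> 64x8x8
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2), # -> 128x4x4
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Dropout(p_drop),                      # criterion (A): dropout
            nn.Linear(128 * 4 * 4, 256), nn.ReLU(),  # fully connected layer 1
            nn.Linear(256, 10),                      # fully connected layer 2
        )

    def forward(self, x):
        return self.classifier(self.features(x))

model = SmallCNN()
loss_fn = nn.CrossEntropyLoss()  # the loss L; log it per step, e.g. wandb.log(...)  (D)

# Criterion (B): run two separate trainings, one with each optimizer, and compare.
opt_rmsprop = torch.optim.RMSprop(model.parameters(), lr=1e-3)
opt_adam = torch.optim.Adam(model.parameters(), lr=1e-3)
```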
Experimental Results
In addition to reporting the Test Accuracy and plotting the figure of Training Loss over iterations, the
following experimental results should also be reported for a comprehensive evaluation of the model’s
performance:
1. Validation Accuracy and Loss: Monitor and report the accuracy and loss on a separate
validation set to assess the model’s generalization capability.
2. Confusion Matrix: Include a confusion matrix to identify which classes the model is having
difficulty distinguishing between.
3. Precision, Recall, and F1-Score: Calculate and report these metrics to provide a more
nuanced view of the model’s performance (a small computation sketch follows this list). The
F1-Score is the harmonic mean of Precision and Recall and is defined as:
$$F_1 = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$$
4. Model Size: Report the number of parameters and the memory footprint of the model.
5. Hyperparameter Tuning: If hyperparameter tuning is performed, report the performance
under different hyperparameter settings, such as learning rate, batch size, etc.
6. Class-wise Accuracy: Report the accuracy for each individual class to show how well the
model performs on different categories.
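A small numpy sketch of these metrics for binary labels (illustrative; for the 10-class CINIC-10 report you would compute them per class):

```python
import numpy as np

def precision_recall_f1(y_true, y_pred):
    # Binary counts; assumes at least one positive prediction and label.
    tp = np.sum((y_pred == 1) & (y_true == 1))
    fp = np.sum((y_pred == 1) & (y_true == 0))
    fn = np.sum((y_pred == 0) & (y_true == 1))
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

y_true = np.array([1, 0, 1, 1, 0])
y_pred = np.array([1, 0, 0, 1, 1])
print(precision_recall_f1(y_true, y_pred))  # (0.666..., 0.666..., 0.666...)
```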
Part 2: Theoretical Questions - (50 Points + 3 Bonus Points)
1. Please answer the following questions about the activation function: - (9 Points)
(A) Why do we need activation functions in neural networks? (1 point)
(B) Write down the formula of the Sigmoid function and its derivative. What are the pros and cons
of using the Sigmoid function in neural networks? (4 points)
(C) Write down the formula of the ReLU function and its derivative. What are the pros and cons of
using the ReLU function in neural networks? (4 points)
2. When we optimize the neural networks, we usually use gradient descent to update
the weights of neural networks. To obtain well-trained neural networks, one of the most
important hyperparameters is the learning rate. Please answer the following questions
about learning rate: - (6 Points)
(A) What is the role of the learning rate in the gradient descent algorithm? (2 points)
(B) What happens to the neural network if the Learning Rate is too low or too high? (4 points)
3. After we train a neural network, we need to evaluate the model performance by determining if
the model is underfitting or overfitting. Please answer the following questions about underfitting
and overfitting: - (12 Points)
(A) Explain the concepts of underfitting and overfitting in your own words, and explain how to
determine whether a model is overfitting or underfitting based on its performance on the
training set and validation set. (4 points)
(B) Please write down four methods that can be used to prevent the overfitting of a neural network.
(4 points)
(C) Please write down four methods that can be used to prevent the underfitting of a neural network.
(4 points)
4. Computer Vision (CV) and Natural Language Processing (NLP) are two primary application
areas of neural networks. In CV, CNN models are often used to extract information from images
and videos, while RNNs and Transformers are often used in NLP to handle text data. - (9 Points + 3 Bonus Points)
(A) The key components of a CNN architecture include convolutional layers, pooling layers, and fully
connected layers. Provide a brief description of the function of each component. (4.5 points)
(B) Explain the concept of Hidden State, Time Steps and Weight Sharing in the design of RNN. (4.5
points)
(C) Bonus Question: Batch Normalization (BN) is important in real-world practice. Please describe
what BN is doing and explain why we need BN in neural networks. (3 points)
5. Convolutional to Multi-layer Perceptron - (14 Points)
A convolution operation is a linear operation, and therefore convolutional layers can be represented
in the form of matrix multiplication, or in other words, represented by a multi-layer perceptron. More
precisely, if we denote the convolution operation as $c(x, \theta_w, \theta_b, \gamma)$, where $\theta_w$ are the filter weights, $\theta_b$
are the filter biases, and $\gamma$ are the padding and stride parameters, we want to convert the filters to a
weight matrix so that
$$\text{flatten}(c(x, \theta_w, \theta_b, \gamma)) = W\,\text{flatten}(x) + b, \tag{1}$$
where $\text{flatten}(\cdot)$ takes in a tensor of size $(d_1, d_2, d_3)$ and outputs a 1-D vector of size $d_1 \times d_2 \times d_3$. For example,
$\text{flatten}(\text{Filter 1}) = (i_{1,1}, i_{1,2}, i_{1,3}, i_{2,1}, i_{2,2}, i_{2,3}, i_{3,1}, i_{3,2}, i_{3,3}, j_{1,1}, j_{1,2}, j_{1,3}, j_{2,1}, j_{2,2}, j_{2,3}, j_{3,1}, j_{3,2}, j_{3,3})$.
The converted weights and biases $W$ and $b$ depend on the convolution filters $\theta_w$, $\theta_b$ and also $\gamma$ (paddings
and strides).
Suppose the input is a $2 \times 2 \times 3$ ($C \times H \times W$) image, and we have a convolutional layer with
two filters as shown in Figure 1, where the filter size is $3 \times 3$, the padding is 1 (filled with zeros),
and the stride is 1.

[Figure 1: Input image and filters. Filter 1 consists of $3 \times 3$ weights $i_{m,n}$ (1st channel) and $j_{m,n}$ (2nd channel); Filter 2 consists of weights $k_{m,n}$ (1st channel) and $l_{m,n}$ (2nd channel). Note that the sliding window slides in row-major order, i.e., it first slides right until it reaches the end of the first row, then moves to the first position of the second row. The white region around the input image is the zero padding.]

The bias terms for the two channels in Filter 1 (Filter 2) are $b_1$ ($b_3$) and $b_2$ ($b_4$),
respectively. For one filter, we convolve it with every sliding window of the input
image, and each such convolution over one sliding window generates one output of this
convolutional layer. For one filter, there are 6 sliding windows in total, which correspond to the
6 outputs of that filter. For every sliding window, we can think of the output as being generated by
a dot product of a weight vector with the flattened input image, where the non-zero entries of
the weight vector have exactly the same values as the filter, and their positions depend
on the sliding window. Once we have the weight vector for each sliding window, we can simply
stack them together to get the converted weight matrix $W$. The bias part is simple: for one
filter, we add the same bias to every sliding window output. Write out the weight matrix
$W$ and bias $b$ in terms of the filter weights and biases. Convince yourself that you get exactly
the same output (flattened) as the original convolution.
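To make the construction concrete, here is a hedged numpy sketch for a single single-channel filter on the same 2 x 3 spatial grid (the assignment's two-channel, two-filter case stacks more rows into W; as is conventional for CNNs, the "convolution" is implemented as cross-correlation):

```python
import numpy as np

H, W_in, K = 2, 3, 3                     # spatial size and filter size
x = np.arange(H * W_in, dtype=float).reshape(H, W_in)
f = np.random.randn(K, K)                # one single-channel 3x3 filter
b1 = 0.5                                 # its bias

xp = np.pad(x, 1)                        # zero padding of width 1
rows = []
for r in range(H):                       # sliding windows in row-major order
    for c in range(W_in):
        row = np.zeros_like(xp)
        row[r:r + K, c:c + K] = f        # filter values at this window position
        # keep only the entries that touch the *unpadded* pixels
        rows.append(row[1:-1, 1:-1].ravel())
Wmat = np.stack(rows)                    # converted weight matrix W
bvec = np.full(H * W_in, b1)             # same bias for every window output

# Direct sliding-window computation for comparison.
direct = np.array([[np.sum(xp[r:r + K, c:c + K] * f) + b1
                    for c in range(W_in)] for r in range(H)])
assert np.allclose(Wmat @ x.ravel() + bvec, direct.ravel())
```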