TARUSH AGARWAL / Anomaly_detection_and_recommender_system
Commit d820dee5
Authored Sep 27, 2018 by RACHIT BANSAL
Committed by tarush, Jul 30, 2019
added for verification
Showing 33 changed files with 1392 additions and 0 deletions
.DS_Store (+0 -0)
Anomaly Detection and Recommender Systems/.DS_Store (+0 -0)
Anomaly Detection and Recommender Systems/checkCostFunction.m (+49 -0)
Anomaly Detection and Recommender Systems/cofiCostFunc.m (+73 -0)
Anomaly Detection and Recommender Systems/computeNumericalGradient.m (+29 -0)
Anomaly Detection and Recommender Systems/estimateGaussian.m (+42 -0)
Anomaly Detection and Recommender Systems/ex8.m (+121 -0)
Anomaly Detection and Recommender Systems/ex8_cofi.m (+237 -0)
Anomaly Detection and Recommender Systems/ex8_movieParams.mat (+0 -0)
Anomaly Detection and Recommender Systems/ex8_movies.mat (+0 -0)
Anomaly Detection and Recommender Systems/ex8data1.mat (+0 -0)
Anomaly Detection and Recommender Systems/ex8data2.mat (+0 -0)
Anomaly Detection and Recommender Systems/fmincg.m (+175 -0)
Anomaly Detection and Recommender Systems/lib/jsonlab/AUTHORS.txt (+41 -0)
Anomaly Detection and Recommender Systems/lib/jsonlab/ChangeLog.txt (+74 -0)
Anomaly Detection and Recommender Systems/lib/jsonlab/LICENSE_BSD.txt (+25 -0)
Anomaly Detection and Recommender Systems/lib/jsonlab/README.txt (+0 -0)
Anomaly Detection and Recommender Systems/lib/jsonlab/jsonopt.m (+32 -0)
Anomaly Detection and Recommender Systems/lib/jsonlab/loadjson.m (+0 -0)
Anomaly Detection and Recommender Systems/lib/jsonlab/loadubjson.m (+0 -0)
Anomaly Detection and Recommender Systems/lib/jsonlab/mergestruct.m (+33 -0)
Anomaly Detection and Recommender Systems/lib/jsonlab/savejson.m (+0 -0)
Anomaly Detection and Recommender Systems/lib/jsonlab/saveubjson.m (+0 -0)
Anomaly Detection and Recommender Systems/lib/jsonlab/varargin2struct.m (+40 -0)
Anomaly Detection and Recommender Systems/lib/makeValidFieldName.m (+30 -0)
Anomaly Detection and Recommender Systems/lib/submitWithConfiguration.m (+179 -0)
Anomaly Detection and Recommender Systems/loadMovieList.m (+25 -0)
Anomaly Detection and Recommender Systems/movie_ids.txt (+0 -0)
Anomaly Detection and Recommender Systems/multivariateGaussian.m (+23 -0)
Anomaly Detection and Recommender Systems/normalizeRatings.m (+17 -0)
Anomaly Detection and Recommender Systems/selectThreshold.m (+49 -0)
Anomaly Detection and Recommender Systems/submit.m (+77 -0)
Anomaly Detection and Recommender Systems/visualizeFit.m (+21 -0)
.DS_Store
0 → 100644
File added
Anomaly Detection and Recommender Systems/.DS_Store
0 → 100644
File added
Anomaly Detection and Recommender Systems/checkCostFunction.m
0 → 100755
function checkCostFunction(lambda)
%CHECKCOSTFUNCTION Creates a collaborative filtering problem
%to check your cost function and gradients
%   CHECKCOSTFUNCTION(lambda) Creates a collaborative filtering problem
%   to check your cost function and gradients. It will output the
%   analytical gradients produced by your code and the numerical gradients
%   (computed using computeNumericalGradient). These two gradient
%   computations should result in very similar values.

% Set lambda
if ~exist('lambda', 'var') || isempty(lambda)
    lambda = 0;
end

%% Create small problem
X_t = rand(4, 3);
Theta_t = rand(5, 3);

% Zap out most entries
Y = X_t * Theta_t';
Y(rand(size(Y)) > 0.5) = 0;
R = zeros(size(Y));
R(Y ~= 0) = 1;

%% Run Gradient Checking
X = randn(size(X_t));
Theta = randn(size(Theta_t));
num_users = size(Y, 2);
num_movies = size(Y, 1);
num_features = size(Theta_t, 2);

numgrad = computeNumericalGradient( ...
                @(t) cofiCostFunc(t, Y, R, num_users, num_movies, ...
                                  num_features, lambda), [X(:); Theta(:)]);

[cost, grad] = cofiCostFunc([X(:); Theta(:)], Y, R, num_users, ...
                            num_movies, num_features, lambda);

disp([numgrad grad]);
fprintf(['The above two columns you get should be very similar.\n' ...
         '(Left-Your Numerical Gradient, Right-Analytical Gradient)\n\n']);

diff = norm(numgrad-grad)/norm(numgrad+grad);
fprintf(['If your cost function implementation is correct, then \n' ...
         'the relative difference will be small (less than 1e-9). \n' ...
         '\nRelative Difference: %g\n'], diff);

end
\ No newline at end of file
Anomaly Detection and Recommender Systems/cofiCostFunc.m
0 → 100755
function [J, grad] = cofiCostFunc(params, Y, R, num_users, num_movies, ...
                                  num_features, lambda)
%COFICOSTFUNC Collaborative filtering cost function
%   [J, grad] = COFICOSTFUNC(params, Y, R, num_users, num_movies, ...
%   num_features, lambda) returns the cost and gradient for the
%   collaborative filtering problem.
%

% Unfold the X and Theta matrices from params
X = reshape(params(1:num_movies*num_features), num_movies, num_features);
Theta = reshape(params(num_movies*num_features+1:end), ...
                num_users, num_features);

% You need to return the following values correctly
J = 0;
X_grad = zeros(size(X));
Theta_grad = zeros(size(Theta));

% ====================== YOUR CODE HERE ======================
% Instructions: Compute the cost function and gradient for collaborative
%               filtering. Concretely, you should first implement the cost
%               function (without regularization) and make sure it
%               matches our costs. After that, you should implement the
%               gradient and use the checkCostFunction routine to check
%               that the gradient is correct. Finally, you should implement
%               regularization.
%
% Notes: X - num_movies x num_features matrix of movie features
%        Theta - num_users x num_features matrix of user features
%        Y - num_movies x num_users matrix of user ratings of movies
%        R - num_movies x num_users matrix, where R(i, j) = 1 if the
%            i-th movie was rated by the j-th user
%
% You should set the following variables correctly:
%
%        X_grad - num_movies x num_features matrix, containing the
%                 partial derivatives w.r.t. each element of X
%        Theta_grad - num_users x num_features matrix, containing the
%                     partial derivatives w.r.t. each element of Theta
%

% Regularized cost: squared error over rated entries plus L2 penalties
J = sum(sum(((X*(Theta')-Y).^2).*R))/2 + lambda*(sum(sum(Theta.^2)))/2 + lambda*(sum(sum(X.^2)))/2;

% Gradient w.r.t. each movie's features, using only users who rated it
for i = 1:num_movies
    idx = find(R(i, :) == 1);
    Thetatemp = Theta(idx, :);
    Ytemp = Y(i, idx);
    X_grad(i, :) = (X(i, :)*(Thetatemp') - Ytemp)*Thetatemp;
end

% Gradient w.r.t. each user's parameters, using only movies they rated
for i = 1:num_users
    idx = find(R(:, i) == 1);
    Xtemp = X(idx, :);
    Ytemp = Y(idx, i);
    Theta_grad(i, :) = (Theta(i, :)*(Xtemp') - Ytemp')*Xtemp;
end

% Add the regularization terms to the gradients
X_grad = X_grad + lambda.*X;
Theta_grad = Theta_grad + lambda.*Theta;

% =============================================================

grad = [X_grad(:); Theta_grad(:)];

end
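The two per-row loops in cofiCostFunc.m can also be written as matrix products. The following is only a sketch on a small hypothetical problem (the values of X, Theta, Y and lambda are made up for illustration and are not part of the commit); it computes the same cost and gradients in vectorized form and compares one row against the loop formula used above.

```matlab
% Hypothetical toy problem: 5 movies, 4 users, 3 features
X = reshape(1:15, 5, 3) / 10;        % movie features
Theta = reshape(1:12, 4, 3) / 10;    % user parameters
Y = [5 4 0 0; 3 0 0 0; 4 0 0 0; 3 0 0 0; 3 0 0 0];
R = double(Y ~= 0);                  % 1 where a rating exists
lambda = 1.5;

% Prediction error, zeroed wherever no rating exists
E = (X * Theta' - Y) .* R;

% Regularized cost and gradients without explicit loops
J = sum(E(:) .^ 2) / 2 + (lambda / 2) * (sum(Theta(:) .^ 2) + sum(X(:) .^ 2));
X_grad = E * Theta + lambda * X;
Theta_grad = E' * X + lambda * Theta;

% Cross-check the first movie against the loop formulation
i = 1;
idx = find(R(i, :) == 1);
row_loop = (X(i, :) * Theta(idx, :)' - Y(i, idx)) * Theta(idx, :) + lambda * X(i, :);
max_diff = max(abs(row_loop - X_grad(i, :)));
```

Masking the error with R once means unrated entries contribute nothing to either product, which is what the `find` calls achieve row by row in the loop version.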
Anomaly Detection and Recommender Systems/computeNumericalGradient.m
0 → 100755
function numgrad = computeNumericalGradient(J, theta)
%COMPUTENUMERICALGRADIENT Computes the gradient using "finite differences"
%and gives us a numerical estimate of the gradient.
%   numgrad = COMPUTENUMERICALGRADIENT(J, theta) computes the numerical
%   gradient of the function J around theta. Calling y = J(theta) should
%   return the function value at theta.

% Notes: The following code implements numerical gradient checking, and
%        returns the numerical gradient. It sets numgrad(i) to (a numerical
%        approximation of) the partial derivative of J with respect to the
%        i-th input argument, evaluated at theta. (i.e., numgrad(i) should
%        be (approximately) the partial derivative of J with respect
%        to theta(i).)
%

numgrad = zeros(size(theta));
perturb = zeros(size(theta));
e = 1e-4;
for p = 1:numel(theta)
    % Set perturbation vector
    perturb(p) = e;
    loss1 = J(theta - perturb);
    loss2 = J(theta + perturb);
    % Compute Numerical Gradient
    numgrad(p) = (loss2 - loss1) / (2*e);
    perturb(p) = 0;
end

end
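As a quick illustration of the central-difference check, the sketch below applies the same loop to a hypothetical test function whose gradient is known in closed form (the function `J` and the point `theta` here are assumptions chosen for illustration, not taken from the exercise), then forms the relative difference that checkCostFunction.m reports.

```matlab
% Hypothetical test function with a known gradient
J = @(t) sum(t .^ 2) / 2 + t(1) * t(2);
theta = [1; -2; 0.5];
grad = theta + [theta(2); theta(1); 0];   % analytical gradient of J

% Central differences, one coordinate at a time
e = 1e-4;
numgrad = zeros(size(theta));
for p = 1:numel(theta)
    perturb = zeros(size(theta));
    perturb(p) = e;
    numgrad(p) = (J(theta + perturb) - J(theta - perturb)) / (2 * e);
end

% Same relative-difference measure as in checkCostFunction.m
rel_diff = norm(numgrad - grad) / norm(numgrad + grad);
fprintf('Relative difference: %g\n', rel_diff);
```

Because this test function is quadratic, the central difference has no truncation error and the relative difference should be limited only by floating-point round-off.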
Anomaly Detection and Recommender Systems/estimateGaussian.m
0 → 100755
function [mu sigma2] = estimateGaussian(X)
%ESTIMATEGAUSSIAN This function estimates the parameters of a
%Gaussian distribution using the data in X
%   [mu sigma2] = estimateGaussian(X),
%   The input X is the dataset with each n-dimensional data point in one row
%   The output is an n-dimensional vector mu, the mean of the data set
%   and the variances sigma^2, an n x 1 vector
%

% Useful variables
[m, n] = size(X);

% You should return these values correctly
mu = zeros(n, 1);
sigma2 = zeros(n, 1);

% ====================== YOUR CODE HERE ======================
% Instructions: Compute the mean of the data and the variances
%               In particular, mu(i) should contain the mean of
%               the data for the i-th feature and sigma2(i)
%               should contain variance of the i-th feature.
%

mu = (sum(X))'/m;

for i = 1:n
    for j = 1:m
        sigma2(i) = sigma2(i) + (X(j, i) - mu(i)).^2;
    end
end
sigma2 = sigma2/m;

% =============================================================

end
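The double loop in estimateGaussian.m computes the per-feature variance normalized by m (not m - 1). A possible vectorized equivalent is sketched below on a small made-up dataset; the data values are hypothetical, and the use of `var(X, 1, 1)` (weight 1 selects normalization by m, along dimension 1) is a substitution for the committed loop rather than the author's code.

```matlab
% Hypothetical dataset: 3 examples, 2 features
X = [1 2; 3 4; 5 9];
m = size(X, 1);

% Vectorized estimates
mu = mean(X, 1)';            % n x 1 vector of feature means
sigma2 = var(X, 1, 1)';      % n x 1 vector of variances, normalized by m

% Reference computation matching the loop's formula
sigma2_ref = sum(bsxfun(@minus, X, mu') .^ 2, 1)' / m;
max_diff = max(abs(sigma2 - sigma2_ref));
```

Passing the dimension explicitly to `mean` and `var` avoids the edge case where X has a single row and the default behaviour would reduce across columns instead.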
Anomaly Detection and Recommender Systems/ex8.m
0 → 100755
%% Machine Learning Online Class
% Exercise 8 | Anomaly Detection and Collaborative Filtering
%
% Instructions
% ------------
%
% This file contains code that helps you get started on the
% exercise. You will need to complete the following functions:
%
% estimateGaussian.m
% selectThreshold.m
% cofiCostFunc.m
%
% For this exercise, you will not need to change any code in this file,
% or any other files other than those mentioned above.
%
%% Initialization
clear; close all; clc
%% ================== Part 1: Load Example Dataset ===================
%  We start this exercise by using a small dataset that is easy to
%  visualize.
%
%  Our example case consists of 2 network server statistics across
%  several machines: the latency and throughput of each machine.
%  This exercise will help us find possibly faulty (or very fast) machines.
%

fprintf('Visualizing example dataset for outlier detection.\n\n');

%  The following command loads the dataset. You should now have the
%  variables X, Xval, yval in your environment
load('ex8data1.mat');

%  Visualize the example dataset
plot(X(:, 1), X(:, 2), 'bx');
axis([0 30 0 30]);
xlabel('Latency (ms)');
ylabel('Throughput (mb/s)');

fprintf('Program paused. Press enter to continue.\n');
pause
%% ================== Part 2: Estimate the dataset statistics ===================
%  For this exercise, we assume a Gaussian distribution for the dataset.
%
%  We first estimate the parameters of our assumed Gaussian distribution,
%  then compute the probabilities for each of the points and then visualize
%  both the overall distribution and where each of the points falls in
%  terms of that distribution.
%
fprintf('Visualizing Gaussian fit.\n\n');

%  Estimate mu and sigma2
[mu sigma2] = estimateGaussian(X);

%  Returns the density of the multivariate normal at each data point (row)
%  of X
p = multivariateGaussian(X, mu, sigma2);

%  Visualize the fit
visualizeFit(X, mu, sigma2);
xlabel('Latency (ms)');
ylabel('Throughput (mb/s)');

fprintf('Program paused. Press enter to continue.\n');
pause;
%% ================== Part 3: Find Outliers ===================
%  Now you will find a good epsilon threshold using the cross-validation
%  set probabilities given the estimated Gaussian distribution
%

pval = multivariateGaussian(Xval, mu, sigma2);

[epsilon F1] = selectThreshold(yval, pval);
fprintf('Best epsilon found using cross-validation: %e\n', epsilon);
fprintf('Best F1 on Cross Validation Set: %f\n', F1);
fprintf(' (you should see a value epsilon of about 8.99e-05)\n');
fprintf(' (you should see a Best F1 value of 0.875000)\n\n');

%  Find the outliers in the training set and plot them
outliers = find(p < epsilon);

%  Draw a red circle around those outliers
hold on
plot(X(outliers, 1), X(outliers, 2), 'ro', 'LineWidth', 2, 'MarkerSize', 10);
hold off

fprintf('Program paused. Press enter to continue.\n');
pause;
%% ================== Part 4: Multidimensional Outliers ===================
%  We will now use the code from the previous part and apply it to a
%  harder problem in which more features describe each datapoint and only
%  some features indicate whether a point is an outlier.
%

%  Loads the second dataset. You should now have the
%  variables X, Xval, yval in your environment
load('ex8data2.mat');

%  Apply the same steps to the larger dataset
[mu sigma2] = estimateGaussian(X);

%  Training set
p = multivariateGaussian(X, mu, sigma2);

%  Cross-validation set
pval = multivariateGaussian(Xval, mu, sigma2);

%  Find the best threshold
[epsilon F1] = selectThreshold(yval, pval);

fprintf('Best epsilon found using cross-validation: %e\n', epsilon);
fprintf('Best F1 on Cross Validation Set: %f\n', F1);
fprintf(' (you should see a value epsilon of about 1.38e-18)\n');
fprintf(' (you should see a Best F1 value of 0.615385)\n');
fprintf('# Outliers found: %d\n\n', sum(p < epsilon));
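ex8.m relies on selectThreshold.m to pick epsilon by maximizing F1 on the cross-validation set, but that file's contents are not shown in this part of the diff. The following is only a sketch of one way such a search could work, under stated assumptions: the labels and densities are made-up values, and scanning 1000 evenly spaced thresholds is an illustrative choice, not necessarily what the committed file does.

```matlab
% Hypothetical cross-validation data: yval == 1 marks an anomaly
yval = [0; 0; 0; 1; 0; 1];
pval = [0.30; 0.25; 0.20; 0.01; 0.15; 0.02];   % estimated densities

bestEpsilon = 0;
bestF1 = 0;
stepsize = (max(pval) - min(pval)) / 1000;

for epsilon = min(pval):stepsize:max(pval)
    pred = pval < epsilon;               % flag low-density points as anomalies
    tp = sum(pred & yval == 1);          % true positives
    fp = sum(pred & yval == 0);          % false positives
    fn = sum(~pred & yval == 1);         % false negatives
    if tp == 0
        continue;                        % precision and recall undefined or zero
    end
    prec = tp / (tp + fp);
    rec = tp / (tp + fn);
    F1 = 2 * prec * rec / (prec + rec);
    if F1 > bestF1
        bestF1 = F1;
        bestEpsilon = epsilon;
    end
end
```

Skipping thresholds with no true positives avoids a division by zero when nothing is flagged; the strict `>` keeps the smallest threshold that reaches the best F1.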
Anomaly Detection and Recommender Systems/ex8_cofi.m
0 → 100755
%% Machine Learning Online Class
% Exercise 8 | Anomaly Detection and Collaborative Filtering
%
% Instructions
% ------------
%
% This file contains code that helps you get started on the
% exercise. You will need to complete the following functions:
%
% estimateGaussian.m
% selectThreshold.m
% cofiCostFunc.m
%
% For this exercise, you will not need to change any code in this file,
% or any other files other than those mentioned above.
%
%% =============== Part 1: Loading movie ratings dataset ================
%  You will start by loading the movie ratings dataset to understand the
%  structure of the data.
%
fprintf('Loading movie ratings dataset.\n\n');

%  Load data
load('ex8_movies.mat');

%  Y is a 1682x943 matrix, containing ratings (1-5) of 1682 movies by
%  943 users
%
%  R is a 1682x943 matrix, where R(i,j) = 1 if and only if user j gave a
%  rating to movie i

%  From the matrix, we can compute statistics like average rating.
fprintf('Average rating for movie 1 (Toy Story): %f / 5\n\n', ...
        mean(Y(1, R(1, :))));

%  We can "visualize" the ratings matrix by plotting it with imagesc
imagesc(Y);
ylabel('Movies');
xlabel('Users');

fprintf('\nProgram paused. Press enter to continue.\n');
pause;
%% ============ Part 2: Collaborative Filtering Cost Function ===========
%  You will now implement the cost function for collaborative filtering.
%  To help you debug your cost function, we have included a set of weights
%  that we trained on that dataset. Specifically, you should complete the
%  code in cofiCostFunc.m to return J.

%  Load pre-trained weights (X, Theta, num_users, num_movies, num_features)
load('ex8_movieParams.mat');

%  Reduce the data set size so that this runs faster
num_users = 4; num_movies = 5; num_features = 3;
X = X(1:num_movies, 1:num_features);
Theta = Theta(1:num_users, 1:num_features);
Y = Y(1:num_movies, 1:num_users);
R = R(1:num_movies, 1:num_users);

%  Evaluate cost function
J = cofiCostFunc([X(:) ; Theta(:)], Y, R, num_users, num_movies, ...
                 num_features, 0);

fprintf(['Cost at loaded parameters: %f ' ...
         '\n(this value should be about 22.22)\n'], J);

fprintf('\nProgram paused. Press enter to continue.\n');
pause;
%% ============== Part 3: Collaborative Filtering Gradient ==============
%  Once your cost function matches up with ours, you should now implement
%  the collaborative filtering gradient function. Specifically, you should
%  complete the code in cofiCostFunc.m to return the grad argument.
%
fprintf('\nChecking Gradients (without regularization) ... \n');

%  Check gradients by running checkCostFunction
checkCostFunction;

fprintf('\nProgram paused. Press enter to continue.\n');
pause;
%% ========= Part 4: Collaborative Filtering Cost Regularization ========
%  Now, you should implement regularization for the cost function for
%  collaborative filtering. You can implement it by adding the cost of
%  regularization to the original cost computation.
%

%  Evaluate cost function
J = cofiCostFunc([X(:) ; Theta(:)], Y, R, num_users, num_movies, ...
                 num_features, 1.5);

fprintf(['Cost at loaded parameters (lambda = 1.5): %f ' ...
         '\n(this value should be about 31.34)\n'], J);

fprintf('\nProgram paused. Press enter to continue.\n');
pause;
%% ======= Part 5: Collaborative Filtering Gradient Regularization ======
%  Once your cost matches up with ours, you should proceed to implement
%  regularization for the gradient.
%

%
fprintf('\nChecking Gradients (with regularization) ... \n');

%  Check gradients by running checkCostFunction
checkCostFunction(1.5);

fprintf('\nProgram paused. Press enter to continue.\n');
pause;
%% ============== Part 6: Entering ratings for a new user ===============
%  Before we train the collaborative filtering model, we will first
%  add ratings that correspond to a new user that we just observed. This
%  part of the code will also allow you to put in your own ratings for the
%  movies in our dataset!
%
movieList = loadMovieList();

%  Initialize my ratings
my_ratings = zeros(1682, 1);

% Check the file movie_ids.txt for the id of each movie in our dataset
% For example, Toy Story (1995) has ID 1, so to rate it "4", you can set
my_ratings(1) = 4;

% Or suppose you did not enjoy Silence of the Lambs (1991), you can set
my_ratings(98) = 2;

% We have selected a few movies we liked / did not like and the ratings we
% gave are as follows:
my_ratings(7) = 3;
my_ratings(12) = 5;
my_ratings(54) = 4;
my_ratings(64) = 5;
my_ratings(66) = 3;
my_ratings(69) = 5;
my_ratings(183) = 4;
my_ratings(226) = 5;
my_ratings(355) = 5;

fprintf('\n\nNew user ratings:\n');
for i = 1:length(my_ratings)
    if my_ratings(i) > 0
        fprintf('Rated %d for %s\n', my_ratings(i), ...
                movieList{i});
    end
end

fprintf('\nProgram paused. Press enter to continue.\n');
pause;
%% ================== Part 7: Learning Movie Ratings ====================
%  Now, you will train the collaborative filtering model on a movie rating
%  dataset of 1682 movies and 943 users
%

fprintf('\nTraining collaborative filtering...\n');

%  Load data
load('ex8_movies.mat');

%  Y is a 1682x943 matrix, containing ratings (1-5) of 1682 movies by
%  943 users
%
%  R is a 1682x943 matrix, where R(i,j) = 1 if and only if user j gave a
%  rating to movie i

%  Add our own ratings to the data matrix
Y = [my_ratings Y];
R = [(my_ratings ~= 0) R];

%  Normalize Ratings
[Ynorm, Ymean] = normalizeRatings(Y, R);

%  Useful Values
num_users = size(Y, 2);
num_movies = size(Y, 1);
num_features = 10;

% Set Initial Parameters (Theta, X)
X = randn(num_movies, num_features);
Theta = randn(num_users, num_features);

initial_parameters = [X(:); Theta(:)];

% Set options for fmincg
options = optimset('GradObj', 'on', 'MaxIter', 100);

% Set Regularization
lambda = 10;
theta = fmincg(@(t)(cofiCostFunc(t, Ynorm, R, num_users, num_movies, ...
                                 num_features, lambda)), ...
               initial_parameters, options);

% Unfold the returned theta back into X and Theta
X = reshape(theta(1:num_movies*num_features), num_movies, num_features);
Theta = reshape(theta(num_movies*num_features+1:end), ...
                num_users, num_features);

fprintf('Recommender system learning completed.\n');

fprintf('\nProgram paused. Press enter to continue.\n');
pause;
%% ================== Part 8: Recommendation for you ====================
%  After training the model, you can now make recommendations by computing
%  the predictions matrix.
%

p = X * Theta';
my_predictions = p(:, 1) + Ymean;

movieList = loadMovieList();

[r, ix] = sort(my_predictions, 'descend');
fprintf('\nTop recommendations for you:\n');
for i = 1:10
    j = ix(i);
    fprintf('Predicting rating %.1f for movie %s\n', my_predictions(j), ...
            movieList{j});
end

fprintf('\n\nOriginal ratings provided:\n');
for i = 1:length(my_ratings)
    if my_ratings(i) > 0
        fprintf('Rated %d for %s\n', my_ratings(i), ...
                movieList{i});
    end
end
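Part 7 trains on `Ynorm` and Part 8 adds `Ymean` back to the predictions, which implies per-movie mean normalization. normalizeRatings.m itself is not shown in this part of the diff, so the snippet below is only a sketch of what such a step might look like, using a made-up 3 x 3 ratings matrix in which every movie has at least one rating.

```matlab
% Hypothetical ratings: rows are movies, columns are users, 0 = not rated
Y = [5 4 0; 0 0 3; 1 0 0];
R = (Y ~= 0);

[m, n] = size(Y);
Ymean = zeros(m, 1);
Ynorm = zeros(m, n);

for i = 1:m
    idx = find(R(i, :));                    % users who rated movie i
    Ymean(i) = mean(Y(i, idx));             % average over rated entries only
    Ynorm(i, idx) = Y(i, idx) - Ymean(i);   % unrated entries stay at 0
end

% Adding the mean back recovers the original rating of a rated entry
recovered = Ynorm(1, 1) + Ymean(1);
```

With this normalization, a user with no ratings ends up with predictions close to each movie's mean once `Ymean` is added back, rather than predictions near zero.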
Anomaly Detection and Recommender Systems/ex8_movieParams.mat
0 → 100755
File added
Anomaly Detection and Recommender Systems/ex8_movies.mat
0 → 100755
File added
Anomaly Detection and Recommender Systems/ex8data1.mat
0 → 100755
File added
Anomaly Detection and Recommender Systems/ex8data2.mat
0 → 100755
File added
Anomaly Detection and Recommender Systems/fmincg.m
0 → 100755
function [X, fX, i] = fmincg(f, X, options, P1, P2, P3, P4, P5)
% Minimize a continuous differentiable multivariate function. Starting point
% is given by "X" (D by 1), and the function named in the string "f", must
% return a function value and a vector of partial derivatives. The Polak-
% Ribiere flavour of conjugate gradients is used to compute search directions,
% and a line search using quadratic and cubic polynomial approximations and the
% Wolfe-Powell stopping criteria is used together with the slope ratio method
% for guessing initial step sizes. Additionally a bunch of checks are made to
% make sure that exploration is taking place and that extrapolation will not
% be unboundedly large. The "length" gives the length of the run: if it is
% positive, it gives the maximum number of line searches, if negative its
% absolute gives the maximum allowed number of function evaluations. You can
% (optionally) give "length" a second component, which will indicate the
% reduction in function value to be expected in the first line-search (defaults
% to 1.0). The function returns when either its length is up, or if no further
% progress can be made (ie, we are at a minimum, or so close that due to
% numerical problems, we cannot get any closer). If the function terminates
% within a few iterations, it could be an indication that the function value
% and derivatives are not consistent (ie, there may be a bug in the
% implementation of your "f" function). The function returns the found
% solution "X", a vector of function values "fX" indicating the progress made
% and "i" the number of iterations (line searches or function evaluations,
% depending on the sign of "length") used.
%
% Usage: [X, fX, i] = fmincg(f, X, options, P1, P2, P3, P4, P5)
%
% See also: checkgrad
%
% Copyright (C) 2001 and 2002 by Carl Edward Rasmussen. Date 2002-02-13
%
%
% (C) Copyright 1999, 2000 & 2001, Carl Edward Rasmussen
%
% Permission is granted for anyone to copy, use, or modify these
% programs and accompanying documents for purposes of research or
% education, provided this copyright notice is retained, and note is
% made of any changes that have been made.
%
% These programs and documents are distributed without any warranty,
% express or implied. As the programs were written for research
% purposes only, they have not been tested to the degree that would be
% advisable in any important application. All use of these programs is
% entirely at the user's own risk.
%
% [ml-class] Changes Made:
% 1) Function name and argument specifications
% 2) Output display
%
% Read options
if exist('options', 'var') && ~isempty(options) && isfield(options, 'MaxIter')
    length = options.MaxIter;
else
    length = 100;
end

RHO = 0.01;                            % a bunch of constants for line searches
SIG = 0.5;       % RHO and SIG are the constants in the Wolfe-Powell conditions
INT = 0.1;    % don't reevaluate within 0.1 of the limit of the current bracket
EXT = 3.0;                    % extrapolate maximum 3 times the current bracket
MAX = 20;                         % max 20 function evaluations per line search
RATIO = 100;                                      % maximum allowed slope ratio

argstr = ['feval(f, X'];                 % compose string used to call function
for i = 1:(nargin - 3)
  argstr = [argstr, ',P', int2str(i)];
end
argstr = [argstr, ')'];

if max(size(length)) == 2, red=length(2); length=length(1); else red=1; end
S=['Iteration '];

i = 0;                                            % zero the run length counter
ls_failed = 0;                             % no previous line search has failed
fX = [];
[f1 df1] = eval(argstr);                      % get function value and gradient
i = i + (length<0);                                            % count epochs?!
s = -df1;                                        % search direction is steepest
d1 = -s'*s;                                                 % this is the slope
z1 = red/(1-d1);                                  % initial step is red/(|s|+1)
while i < abs(length)                                      % while not finished
  i = i + (length>0);                                      % count iterations?!

  X0 = X; f0 = f1; df0 = df1;                   % make a copy of current values
  X = X + z1*s;                                             % begin line search
  [f2 df2] = eval(argstr);
  i = i + (length<0);                                          % count epochs?!
  d2 = df2'*s;
  f3 = f1; d3 = d1; z3 = -z1;             % initialize point 3 equal to point 1
  if length>0, M = MAX; else M = min(MAX, -length-i); end
  success = 0; limit = -1;                              % initialize quantities
  while 1
    while ((f2 > f1+z1*RHO*d1) || (d2 > -SIG*d1)) && (M > 0)
      limit = z1;                                         % tighten the bracket
      if f2 > f1
        z2 = z3 - (0.5*d3*z3*z3)/(d3*z3+f2-f3);                 % quadratic fit
      else
        A = 6*(f2-f3)/z3+3*(d2+d3);                                 % cubic fit
        B = 3*(f3-f2)-z3*(d3+2*d2);
        z2 = (sqrt(B*B-A*d2*z3*z3)-B)/A;       % numerical error possible - ok!
      end
      if isnan(z2) || isinf(z2)
        z2 = z3/2;                  % if we had a numerical problem then bisect
      end
      z2 = max(min(z2, INT*z3),(1-INT)*z3);  % don't accept too close to limits
      z1 = z1 + z2;                                           % update the step
      X = X + z2*s;
      [f2 df2] = eval(argstr);
      M = M - 1; i = i + (length<0);                           % count epochs?!
      d2 = df2'*s;
      z3 = z3-z2;                    % z3 is now relative to the location of z2
    end
    if f2 > f1+z1*RHO*d1 || d2 > -SIG*d1
      break;                                                % this is a failure
    elseif d2 > SIG*d1
      success = 1; break;                                             % success
    elseif M == 0
      break;                                                          % failure
    end
    A = 6*(f2-f3)/z3+3*(d2+d3);                      % make cubic extrapolation
    B = 3*(f3-f2)-z3*(d3+2*d2);
    z2 = -d2*z3*z3/(B+sqrt(B*B-A*d2*z3*z3));        % num. error possible - ok!
    if ~isreal(z2) || isnan(z2) || isinf(z2) || z2 < 0 % num prob or wrong sign?
      if limit < -0.5                               % if we have no upper limit