Title: | Fast k Nearest Neighbor |
---|---|
Description: | These are different functions related to the k Nearest Neighbor classifier. The distance matrix is an input, making the computation faster and allowing distances other than Euclidean. |
Authors: | Gaston Besanson |
Maintainer: | Gaston Besanson <[email protected]> |
License: | GPL-3 |
Version: | 0.0.1 |
Built: | 2024-10-29 06:16:53 UTC |
Source: | https://github.com/besanson/fastknn |
Distance for KNN Test The Distance_for_KNN_test function returns the distance matrix between the test set and the training set.
Distance_for_KNN_test(test_set, train_set)
test_set |
is a matrix where the columns are the features of the test set |
train_set |
is a matrix with the features of the training set |
a distance matrix
knn_test_function
pdist
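As a rough illustration of what such a matrix contains, the sketch below computes Euclidean distances between every test row and every training row in base R. The helper name cross_distance_sketch and the layout (one row per test observation, one column per training observation) are assumptions made for illustration; this is not the package's implementation, which references pdist.

```r
# Hypothetical helper (not part of the package): Euclidean distance between
# each row of test_set and each row of train_set, using base R only.
cross_distance_sketch <- function(test_set, train_set) {
  stopifnot(ncol(test_set) == ncol(train_set))
  # Squared norm of each row
  test_sq  <- rowSums(test_set^2)
  train_sq <- rowSums(train_set^2)
  # ||a - b||^2 = ||a||^2 + ||b||^2 - 2 * a.b, evaluated for all pairs at once
  sq <- outer(test_sq, train_sq, "+") - 2 * tcrossprod(test_set, train_set)
  # Clamp tiny negative values caused by floating-point rounding
  sqrt(pmax(sq, 0))
}

test_set  <- matrix(c(0, 0,
                      3, 4), ncol = 2, byrow = TRUE)
train_set <- matrix(c(0, 0,
                      6, 8,
                      3, 0), ncol = 2, byrow = TRUE)
d <- cross_distance_sketch(test_set, train_set)
dim(d)  # 2 rows (test) by 3 columns (training)
```

Any other metric could be substituted here, since the classifier functions only need the finished distance matrix.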
K Nearest Neighbors The k.nearest.neighbors function gives the list of points (k neighbours) that are closest to row i, in descending order.
k.nearest.neighbors(i, distance_matrix, k = 5)
i |
is a numeric value giving the row of the distance_matrix. |
distance_matrix |
is an nxn matrix. |
k |
is a numeric value giving the number of neighbours that the function will return. |
The output of this function is used by knn_test_function.
a vector of length k with the k closest neighbours to observation i.
order
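The neighbour lookup can be sketched with order(), which this entry references. The function name nearest_indices_sketch is hypothetical, and dropping column i (on the assumption that in a square matrix it is the point's distance to itself) is a guess at sensible behaviour, not a statement about the package source.

```r
# Hypothetical sketch, not the package source: indices of the k smallest
# distances in row i of a distance matrix.
nearest_indices_sketch <- function(i, distance_matrix, k = 5) {
  # order() returns column indices sorted by increasing distance
  ranked <- order(distance_matrix[i, ])
  # Assumption: in a square matrix, column i is the point itself, so skip it
  ranked <- ranked[ranked != i]
  head(ranked, k)
}

# Four points on a line at 0, 1, 3 and 7
pts <- matrix(c(0, 1, 3, 7), ncol = 1)
d <- as.matrix(dist(pts))
nearest_indices_sketch(1, d, k = 2)  # indices 2 and 3 (the points at 1 and 3)
```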
KNN Test The knn_test_function returns the labels for a test set using the KNN classification method.
knn_test_function(dataset, test, distance, labels, k = 3)
dataset |
is a matrix with the features of the training set |
test |
is a matrix where the columns are the features of the test set |
distance |
is an nxn matrix with the distance between each observation of the test set and the training set |
labels |
is an nx1 vector with the labels of the training set |
k |
is a numeric value giving the number of neighbours to be used in the classifier. |
a vector with the predicted labels for the test set.
k.nearest.neighbors
sample
# Create Data for restaurant reviews
training <- matrix(rexp(600,1), ncol=2)
test <- matrix(rexp(200,1), ncol=2)
# Label "Good", "Bad", "Average"
labelsExample <- c(rep("Good",100), rep("Bad",100), rep("Average",100))
# Distance Matrix
distanceExample <- Distance_for_KNN_test(test, training)
# KNN
knn_test_function(training, test, distanceExample, labelsExample, k = 3)
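To make the classification step concrete, here is a minimal majority-vote sketch in base R. It assumes the distance matrix has one row per test observation and one column per training observation, and it resolves ties by taking the first label in table order. Both are assumptions, and knn_vote_sketch is a hypothetical name; the package references sample, so it may well break ties differently.

```r
# Hypothetical sketch, not the package source: majority-vote KNN prediction.
# Assumes distance[r, c] is the distance from test row r to training row c.
knn_vote_sketch <- function(distance, labels, k = 3) {
  apply(distance, 1, function(row) {
    nearest <- order(row)[seq_len(k)]   # k closest training points
    votes <- table(labels[nearest])     # count the labels among them
    names(votes)[which.max(votes)]      # ties: first label in table order
  })
}

# Two test observations, five training observations (made-up distances)
distance <- matrix(c(0.1, 0.2, 0.3, 5.0, 6.0,
                     4.0, 5.0, 0.5, 0.2, 0.1), nrow = 2, byrow = TRUE)
labels <- c("Good", "Good", "Bad", "Bad", "Bad")
pred <- knn_vote_sketch(distance, labels, k = 3)
```

Because only the distance matrix and the training labels are needed at this stage, the feature matrices themselves do not appear in the sketch.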
KNN Training The knn_training_function returns the labels for a training set using the KNN classification method.
knn_training_function(dataset, distance, label, k = 1)
dataset |
is a matrix with the features of the training set |
distance |
is an nxn matrix with the distance between each observation of the training set |
label |
is an nx1 vector with the labels of the training set |
k |
is a numeric value giving the number of neighbours to be used in the classifier. |
This function is used to check the quality of the classifier, because the predicted labels can then be compared with the true labels.
a vector with the predicted labels for the training set.
k.nearest.neighbors
sample
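Comparing predicted and true labels might look like the sketch below. The label vectors are made-up values for illustration, not output from the package, and the variable names are hypothetical.

```r
# Hypothetical sketch: share of predicted labels that match the true labels.
# Both vectors are invented for illustration.
true_labels      <- c("Good", "Bad", "Average", "Good", "Bad")
predicted_labels <- c("Good", "Bad", "Good",    "Good", "Average")

# Proportion of observations classified correctly
accuracy <- mean(predicted_labels == true_labels)

# Cross-tabulate to see which classes are confused with which
confusion <- table(predicted = predicted_labels, actual = true_labels)
```

In practice predicted_labels would be the vector returned by knn_training_function, and true_labels the label argument passed to it.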