Publication Date:
2017
Abstract:
We introduce and test a binary classification method aimed at detecting malicious
URLs on the basis of information about both the URL syntax and its domain properties. Our
method belongs to the class of supervised Machine Learning models, where, in particular,
classification is performed by using information coming from a set of URLs (samples in
Machine Learning parlance) whose class membership is known in advance.
The main novelty of our approach is the use of a Spherical Separation-based algorithm instead
of SVM-type methods, which use hyperplanes as separation surfaces in the sample space.
In particular, we adopt a simplified Spherical Separation model which runs in O(t log t) time
(where t is the number of samples in the training set), and is thus suitable for large-scale
applications.
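As an illustration of why a spherical model with a fixed center can be trained in O(t log t) time, the sketch below is a minimal example and not the authors' implementation. It assumes that the center is fixed in advance (here, the barycenter of the enclosed class), that one class is to be enclosed by the sphere, and that the squared radius z minimises z + C * (sum of violations); the abstract does not state this objective, and the names sq_dist, barycenter, fit_squared_radius, is_inside and the parameter C are hypothetical. Under these assumptions the objective is piecewise linear and convex in z, so after sorting the squared distances a single sweep over the breakpoints finds the minimiser.

```python
def sq_dist(p, c):
    """Squared Euclidean distance between feature vector p and center c."""
    return sum((pi - ci) ** 2 for pi, ci in zip(p, c))


def barycenter(points):
    """Component-wise mean of a list of feature vectors."""
    return [sum(col) / len(points) for col in zip(*points)]


def fit_squared_radius(inside, outside, center, C=1.0):
    """Squared radius z >= 0 minimising (assumed objective)
         f(z) = z + C * sum(max(0, d - z) for d in d_in)
                  + C * sum(max(0, z - d) for d in d_out),
    where d_in / d_out are squared distances of the points that should
    lie inside / outside the sphere."""
    d_in = [sq_dist(p, center) for p in inside]
    d_out = [sq_dist(p, center) for p in outside]
    breakpoints = sorted(d_in + d_out)      # the O(t log t) step

    z = 0.0
    value = C * sum(d_in)                   # f(0): every inside point violates
    slope = 1.0 - C * len(d_in)             # slope of f just to the right of 0
    best_z, best_value = z, value
    for d in breakpoints:                   # linear sweep over the breakpoints
        value += slope * (d - z)            # f is linear between breakpoints
        z = d
        slope += C                          # each breakpoint raises the slope by C
        if value < best_value:
            best_z, best_value = z, value
    return best_z


def is_inside(x, center, z):
    """True if x falls inside (or on) the sphere of squared radius z."""
    return sq_dist(x, center) <= z


# Toy two-dimensional data standing in for URL feature vectors (invented).
enclosed = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
excluded = [(3.0, 0.0), (0.0, 3.0), (3.0, 3.0)]
center = barycenter(enclosed)
z = fit_squared_radius(enclosed, excluded, center, C=1.0)
```

In this sketch, computing the distances is linear in the number of samples, the sort dominates the cost, and the sweep is linear again, which is consistent with the O(t log t) bound quoted above.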
We test our approach using different sets of features and report the results in terms of training
correctness according to the well-established ten-fold cross validation paradigm.
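The ten-fold cross validation protocol mentioned above can be sketched as follows. This is a generic, minimal example: the helper names k_fold_indices and cross_validate, the fit/predict callables, the fixed shuffling seed, and the toy threshold classifier are all assumptions made for illustration and do not come from the paper. The sketch reports the average correctness on both the training folds and the held-out fold.

```python
import random


def k_fold_indices(n, k=10, seed=0):
    """Shuffle 0..n-1 and split them into k disjoint folds (requires n >= k)."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_validate(samples, labels, fit, predict, k=10, seed=0):
    """Return (mean training correctness, mean testing correctness) over k folds.

    fit(train_samples, train_labels) -> model
    predict(model, sample) -> predicted label
    """
    folds = k_fold_indices(len(samples), k, seed)
    train_scores, test_scores = [], []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in range(len(samples)) if i not in held]
        model = fit([samples[i] for i in train], [labels[i] for i in train])

        def correctness(ids):
            hits = sum(predict(model, samples[i]) == labels[i] for i in ids)
            return hits / len(ids)

        train_scores.append(correctness(train))
        test_scores.append(correctness(held_out))
    return sum(train_scores) / k, sum(test_scores) / k


# Toy one-dimensional classifier: threshold halfway between the class means.
def fit_threshold(xs, ys):
    pos = [x for x, y in zip(xs, ys) if y == 1]
    neg = [x for x, y in zip(xs, ys) if y == -1]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2


def predict_threshold(threshold, x):
    return 1 if x >= threshold else -1


samples = [float(v) for v in range(10)] + [float(v) for v in range(100, 110)]
labels = [-1] * 10 + [1] * 10
train_acc, test_acc = cross_validate(samples, labels, fit_threshold, predict_threshold)
```

Any classifier that exposes a fit step and a predict step could be plugged into cross_validate in place of the toy threshold rule.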
Iris type:
01.01 Journal article
Keywords:
Classification; Spherical separation; Malicious Web sites
List of contributors:
Chiarello, Antonino; Astorino, Annabella
Published in: