Presentation 2021-03-05
Consideration of embedding methods and machine learning models for detecting malicious URLs
Qisheng Chen, Kazumasa Omote
Abstract(in Japanese) (See Japanese page)
Abstract(in English) As Internet access becomes more widespread, the harm caused by malicious websites grows more serious. Many solutions address this problem, such as Google Safe Browsing, which checks whether a website is malicious. The blacklist method is very useful for detecting malicious URLs, but it still has many shortcomings. For example, many new malicious websites are generated every day, and blacklist expansion cannot keep up. In this paper, we use different embedding methods and different machine learning models to detect malicious URLs, and we compare the accuracy of these embedding methods and models. In our evaluation, the TF-IDF embedding algorithm and the token segmentation method perform well, and we conclude that the segmentation method plays an important role in malicious URL detection.
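The abstract names token segmentation and TF-IDF but this record does not give their exact formulation, so the following is only a minimal sketch under stated assumptions: URLs are split on non-alphanumeric characters, term frequency is normalized by URL length, and a smoothed inverse document frequency is used. The function names `tokenize` and `tfidf_vectors` and the sample URLs are hypothetical and do not come from the paper.

```python
import math
import re
from collections import Counter


def tokenize(url):
    """Token segmentation (assumed): lowercase, split on non-alphanumerics."""
    return [t for t in re.split(r"[^A-Za-z0-9]+", url.lower()) if t]


def tfidf_vectors(urls):
    """Return (vocabulary, one TF-IDF vector per URL)."""
    docs = [tokenize(u) for u in urls]
    vocab = sorted({t for d in docs for t in d})
    n = len(docs)
    # Document frequency: in how many URLs each token appears.
    df = Counter(t for d in docs for t in set(d))
    # Smoothed IDF (an assumption, not the paper's stated formula).
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1.0 for t in vocab}
    vectors = []
    for d in docs:
        tf = Counter(d)
        vectors.append([tf[t] / len(d) * idf[t] if d else 0.0 for t in vocab])
    return vocab, vectors


# Hypothetical sample URLs, for illustration only.
sample_urls = [
    "http://example.com/login",
    "http://example.com/about",
    "http://secure-login.example.test/verify/account",
]
vocab, vectors = tfidf_vectors(sample_urls)
```

Tokens shared by every URL (such as `http`) receive a lower weight than tokens that appear in only one URL. The resulting vectors could then be passed to whichever classifier is being compared.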
Keyword(in Japanese) (See Japanese page)
Keyword(in English) Machine learning / Malicious URLs detection / Embedding / Segmentation method
Paper # IT2020-157,ISEC2020-87,WBS2020-76
Date of Issue 2021-02-25 (IT, ISEC, WBS)

Conference Information
Committee WBS / IT / ISEC
Conference Date 2021/3/4(2days)
Place (in Japanese) (See Japanese page)
Place (in English) Online
Topics (in Japanese) (See Japanese page)
Topics (in English) Joint Meeting of WBS, IT, and ISEC
Chair Masanori Hamamura(Kochi Univ. of Tech.) / Tadashi Wadayama(Nagoya Inst. of Tech.) / Shoichi Hirose(Univ. of Fukui)
Vice Chair Takashi Shono(INTEL) / Masahiro Fujii(Utsunomiya Univ.) / Tetsuya Kojima(Tokyo Kosen) / Tetsuya Izu(Fujitsu Labs.) / Noboru Kunihiro(Tsukuba Univ.)
Secretary Takashi Shono(Okayama Univ. of Science) / Masahiro Fujii(National Defence Academy) / Tetsuya Kojima(Yamaguchi Univ.) / Tetsuya Izu(Saga Univ.) / Noboru Kunihiro(Tsukuba Univ.)
Assistant Duong Quang Thang(NAIST) / Masafumi Moriyama(NICT) / Masayuki Kinoshita(Chiba Univ. of Tech.) / Takahiro Ohta(Senshu Univ.) / Kazuki Yoneyama(Ibaraki Univ.)

Paper Information
Registration To Technical Committee on Wideband System / Technical Committee on Information Theory / Technical Committee on Information Security
Language ENG-JTITLE
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) Consideration of embedding methods and machine learning models for detecting malicious URLs
Sub Title (in English)
Keyword(1) Machine learning
Keyword(2) Malicious URLs detection
Keyword(3) Embedding
Keyword(4) Segmentation method
1st Author's Name Qisheng Chen
1st Author's Affiliation University of Tsukuba(Univ. of Tsukuba)
2nd Author's Name Kazumasa Omote
2nd Author's Affiliation University of Tsukuba(Univ. of Tsukuba)
Date 2021-03-05
Paper # IT2020-157,ISEC2020-87,WBS2020-76
Volume (vol) vol.120
Number (no) IT-410,ISEC-411,WBS-412
Page pp.281-287(IT), pp.281-287(ISEC), pp.281-287(WBS)
#Pages 7
Date of Issue 2021-02-25 (IT, ISEC, WBS)