Presentation 2022-11-24
Anomaly Detection on Web Pages Using HDBSCAN and Deep SVDD
Yusuke Noji, Tomotaka Kimura, Jun Cheng
Abstract(in Japanese) (See Japanese page)
Abstract(in English) In this paper, we propose a method for detecting anomalous web pages using Deep SVDD (Support Vector Data Description), a deep learning method. Deep SVDD assumes that most of the training data is normal, but the learning process is unstable because the training data includes a certain percentage of anomalous web pages. We therefore remove anomalous data by applying a clustering method before using Deep SVDD. Specifically, we use HDBSCAN (Hierarchical Density-Based Spatial Clustering of Applications with Noise), a density-based clustering method, to remove the anomalous data. Experiments on a web page dataset show that HDBSCAN can remove anomalous data points and that removing them stabilizes the performance of Deep SVDD.
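The two-stage idea in the abstract (filter the training set by density, then score by distance to a centre) can be illustrated with a minimal sketch. This is not the authors' implementation, and both stages are simplified stand-ins: a k-nearest-neighbour core-distance filter takes the place of HDBSCAN's noise labelling, and the centre-distance score is computed on raw features instead of the output of a trained Deep SVDD network. The function names, the parameters `k` and `factor`, and the synthetic 2-D data are all assumptions made for illustration.

```python
import math
import random
import statistics


def core_distances(points, k):
    """Distance from each point to its k-th nearest neighbour (itself excluded)."""
    result = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        result.append(dists[k - 1])
    return result


def remove_sparse_points(points, k=5, factor=3.0):
    """Stand-in for HDBSCAN noise removal (assumption, not real HDBSCAN):
    drop points whose core distance exceeds factor * median core distance."""
    core = core_distances(points, k)
    cutoff = factor * statistics.median(core)
    return [p for p, d in zip(points, core) if d <= cutoff]


def fit_centre(points):
    """Centre of the retained data. Deep SVDD would use the mean of network
    outputs; here the identity map replaces the network (assumption)."""
    dim = len(points[0])
    return tuple(sum(p[d] for p in points) / len(points) for d in range(dim))


def anomaly_score(point, centre):
    """Squared Euclidean distance to the centre; larger means more anomalous."""
    return sum((a - b) ** 2 for a, b in zip(point, centre))


# Synthetic training set: mostly normal points plus a few contaminating outliers.
random.seed(0)
normal = [(random.gauss(0.0, 1.0), random.gauss(0.0, 1.0)) for _ in range(200)]
outliers = [(8.0, 8.0), (-9.0, 7.5), (10.0, -8.0)]
training = normal + outliers

# Stage 1: density-based filtering. Stage 2: centre-distance scoring.
cleaned = remove_sparse_points(training)
centre = fit_centre(cleaned)
radius_sq = max(anomaly_score(p, centre) for p in cleaned)

for o in outliers:
    print(o, "flagged:", anomaly_score(o, centre) > radius_sq)
```

In a real pipeline, the first stage would be replaced by an actual HDBSCAN implementation (discarding points it labels as noise) and the second by a trained Deep SVDD model.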
Keyword(in Japanese) (See Japanese page)
Keyword(in English) anomaly detection / machine learning / clustering / web pages
Paper # CQ2022-51
Date of Issue 2022-11-17 (CQ)

Conference Information
Committee NS / ICM / CQ
Conference Date 2022/11/24 (2 days)
Place (in Japanese) (See Japanese page)
Place (in English) Humanities and Social Sciences Center, Fukuoka Univ. + Online
Topics (in Japanese) (See Japanese page)
Topics (in English) Network quality, Network measurement/management, Network virtualization, Network service, Blockchain, Security, Network intelligence/AI, etc.
Chair Tetsuya Oishi(NTT) / Yuji Nomura(Fujitsu) / Jun Okamoto(NTT)
Vice Chair Takumi Miyoshi(Shibaura Inst. of Tech.) / Yu Miyoshi(NTT) / Eiji Takahashi(NEC) / Takefumi Hiraguri(Nippon Inst. of Tech.) / Gou Hasegawa(Tohoku Univ.)
Secretary Takumi Miyoshi(NTT) / Yu Miyoshi(Kogakuin Univ.) / Eiji Takahashi(NTT) / Takefumi Hiraguri(Fujitsu) / Gou Hasegawa(NTT)
Assistant Kotaro Mihara(NTT) / Ryo Yamamoto(Univ. of Electro-Comm) / Kimiko Kawashima(NTT) / Ryo Nakamura(Fukuoka Univ.) / Toshiro Nakahira(NTT) / Kenta Tsukatsune(Tokyo Metropolitan Univ.)

Paper Information
Registration To Technical Committee on Network Systems / Technical Committee on Information and Communication Management / Technical Committee on Communication Quality
Language JPN
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) Anomaly Detection on Web Pages Using HDBSCAN and Deep SVDD
Sub Title (in English)
Keyword(1) anomaly detection
Keyword(2) machine learning
Keyword(3) clustering
Keyword(4) web pages
1st Author's Name Yusuke Noji
1st Author's Affiliation Doshisha University(Doshisha Univ.)
2nd Author's Name Tomotaka Kimura
2nd Author's Affiliation Doshisha University(Doshisha Univ.)
3rd Author's Name Jun Cheng
3rd Author's Affiliation Doshisha University(Doshisha Univ.)
Date 2022-11-24
Paper # CQ2022-51
Volume (vol) vol.122
Number (no) CQ-275
Page pp.23-27(CQ)
#Pages 5
Date of Issue 2022-11-17 (CQ)