Presentation 2017-02-24
Detecting code clone using sequential data mining
Yoshihisa Udagawa,
Abstract(in Japanese) (See Japanese page)
Abstract(in English) This paper proposes a method for applying a sequential data mining algorithm to detect code clones (duplicated code fragments) and reports experimental results on the source code of the Java SDK SWING package. A frequent sequential data mining algorithm generally extracts a vast number of sequential patterns when the minimum support threshold (minSup) is small, which is an obstacle to detecting code clones. The proposed method reduces the number of extracted frequent sequences by incorporating pruning processes and techniques for extracting maximal frequent sequences. The results show that the proposed sequential data mining algorithm maintains a practical level of performance as minSup is lowered to 2.
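The idea of keeping only maximal frequent sequences can be illustrated with a small sketch. This is not the algorithm from the paper: it assumes that each method has already been reduced to a list of statement tokens, it mines only contiguous token runs, and it omits the paper's pruning processes and LCS step. The function names, the token names, and the sample data are all hypothetical.

```python
from collections import defaultdict


def frequent_contiguous_sequences(sequences, min_sup):
    """Return {pattern: support} for contiguous token runs that occur
    in at least min_sup of the input sequences (simplified miner)."""
    frequent = {}
    length = 1
    while True:
        support = defaultdict(int)
        for seq in sequences:
            # Count each distinct run at most once per sequence.
            seen = {tuple(seq[i:i + length])
                    for i in range(len(seq) - length + 1)}
            for pattern in seen:
                support[pattern] += 1
        level = {p: s for p, s in support.items() if s >= min_sup}
        if not level:
            # No frequent run of this length, so no longer run can be frequent.
            return frequent
        frequent.update(level)
        length += 1


def is_contiguous_subsequence(short, long):
    """True if the tuple `short` appears as a contiguous run inside `long`."""
    n = len(short)
    return any(long[i:i + n] == short for i in range(len(long) - n + 1))


def maximal_sequences(frequent):
    """Keep only patterns not contained in a longer frequent pattern."""
    patterns = list(frequent)
    return {
        p: frequent[p]
        for p in patterns
        if not any(len(q) > len(p) and is_contiguous_subsequence(p, q)
                   for q in patterns)
    }


# Hypothetical token sequences standing in for three Java methods.
methods = [
    ["if", "call", "assign", "return"],
    ["for", "if", "call", "assign"],
    ["if", "call", "return"],
]

frequent = frequent_contiguous_sequences(methods, min_sup=2)
maximal = maximal_sequences(frequent)
print(len(frequent), "frequent patterns,", len(maximal), "maximal")
for pattern, sup in maximal.items():
    print(pattern, "support =", sup)
```

With a minSup of 2, the sample data yields seven frequent patterns but only two maximal ones, which shows in miniature why restricting the output to maximal sequences makes small minSup values more manageable.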
Keyword(in Japanese) (See Japanese page)
Keyword(in English) Code clone / Maximal frequent sequence / Longest common subsequence (LCS) algorithm / Java source code
Paper # SWIM2016-21
Date of Issue 2017-02-17 (SWIM)

Conference Information
Committee SWIM
Conference Date 2017/2/24 (1 day)
Place (in Japanese) (See Japanese page)
Place (in English) Kikai-Shinko-Kaikan Bldg.
Topics (in Japanese) (See Japanese page)
Topics (in English) Evaluation of business model and reliability, Student session, etc.
Chair Yoshihisa Udagawa(Tokyo Polytechnic Univ.)
Vice Chair Tadashi Ogino(Meisei Univ.) / Osamu Yuki(Canon)
Secretary Tadashi Ogino(Fujitsu Labs.) / Osamu Yuki(Bunkyo Univ.)
Assistant Akihiro Hayashi(Onosokki) / Kenji Saotome(Hosei Univ.)

Paper Information
Registration To Technical Committee on Software Interprise Modeling
Language JPN
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) Detecting code clone using sequential data mining
Sub Title (in English) Experimental results on Java SWING source code
Keyword(1) Code clone
Keyword(2) Maximal frequent sequence
Keyword(3) Longest common subsequence (LCS) algorithm
Keyword(4) Java source code
1st Author's Name Yoshihisa Udagawa
1st Author's Affiliation Tokyo Polytechnic University(Tokyo Polytechnic Univ.)
Date 2017-02-24
Paper # SWIM2016-21
Volume (vol) vol.116
Number (no) SWIM-473
Page pp.17-22 (SWIM)
#Pages 6
Date of Issue 2017-02-17 (SWIM)