Presentation title 2022-01-28
An Objective Article Search Method from Printed Japanese Contract Document Using Optical Character Recognition
Shixi Chen (Okayama Univ.), Masaki Sakagami (Okayama Univ.), Nobuo Funabiki (Okayama Univ.), Takashi Toshida (Astrolab), Kohei Suga (Astrolab)
Abstract A contract is essential for companies to do business with each other successfully. The contract document is an important legal document that defines the formal agreements and consists of multiple articles, each of which describes the agreement on a certain subject or condition. It is useful to automatically search for and extract the article describing a given subject, but contract documents are often filed on printed paper in many companies. In this paper, we propose an objective article search method for printed Japanese contract documents using optical character recognition (OCR) technology. From the recognized characters, it finds the article whose title contains the subject, or finds the paragraph that best matches a given keyword list. This list can be generated automatically from sample articles related to the subject in existing contract documents. For evaluation, we implemented the proposed method in Python and applied it to 35 contract documents. The results confirm the effectiveness of the proposal: the objective articles were successfully found in all of them.
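The two-stage search described in the abstract (title match first, keyword match as a fallback) might look roughly like the following Python sketch. It is not the authors' implementation. The heading pattern `第N条(title)`, the function names `split_articles` and `find_article`, and the simple keyword-count score are all assumptions made for illustration. For brevity the fallback scores whole article bodies rather than individual paragraphs, and the input is assumed to be text already produced by an OCR step.

```python
import re

# Assumed heading format (not taken from the paper): "第N条" at the start of a
# line, optionally followed by a title in half- or full-width parentheses.
ARTICLE_HEADING = re.compile(
    r"^第[ \t\u3000]*([0-90-9一二三四五六七八九十百]+)[ \t\u3000]*条"
    r"[ \t\u3000]*(?:[((]([^))]*)[))])?",
    re.MULTILINE,
)


def split_articles(text):
    """Split OCR text into (number, title, body) tuples at each heading."""
    matches = list(ARTICLE_HEADING.finditer(text))
    articles = []
    for i, m in enumerate(matches):
        # The body runs from the end of this heading to the next heading.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        body = text[m.end():end].strip()
        articles.append((m.group(1), m.group(2) or "", body))
    return articles


def find_article(text, subject, keywords):
    """Return the article for `subject`, or None if nothing matches."""
    articles = split_articles(text)

    # Stage 1: an article whose title contains the subject.
    for article in articles:
        if subject in article[1]:
            return article

    # Stage 2 (hypothetical scoring): the article body containing the
    # largest number of distinct keywords from the list.
    best, best_score = None, 0
    for article in articles:
        score = sum(1 for kw in keywords if kw in article[2])
        if score > best_score:
            best, best_score = article, score
    return best
```

Calling `find_article(ocr_text, "秘密保持", keyword_list)` would return the first article whose title contains the subject. The keyword list is used only when no title matches, which is how the abstract's two stages are read here.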
Keywords contract document / article / subject / OCR / regular expression
Report number ICM2021-39, LOIS2021-37
Publication date 2022-01-20 (ICM, LOIS)

Technical Committee Information
Committee LOIS / ICM
Conference dates 2022/1/27 (2-day meeting starting on this date)
Venue Online
Theme Practical Use of Lifelog, Office Information System, Business Management, etc.
Chair Toru Kobayashi (Nagasaki Univ.) / Kazuhiko Kinoshita (Tokushima Univ.)
Vice chair Hiroyuki Toda (NTT) / Haruo Ooishi (NTT) / Eiji Takahashi (NEC)
Secretary Kenichi Arai (Nagasaki Univ.) / Harumi Saito (NTT) / Hiroki Nakayama (Bosco) / Tetsuya Uchiumi (Fujitsu)
Assistant secretary Kazuki Fukae (Nagasaki Univ.) / Yoshifumi Kato (NTT)

Paper Information Details
Registered committees Technical Committee on Life Intelligence and Office Information Systems / Technical Committee on Information and Communication Management
Language of paper English
Title (Japanese)
Subtitle (Japanese)
Title (English) An Objective Article Search Method from Printed Japanese Contract Document Using Optical Character Recognition
Subtitle (Japanese)
Keyword (1) contract document
Keyword (2) article
Keyword (3) subject
Keyword (4) OCR
Keyword (5) regular expression
1st author name (Japanese/English) 陳 仕璽 / Shixi Chen
1st author affiliation Okayama University (abbrev.: Okayama Univ.)
2nd author name (Japanese/English) 坂上 暢規 / Masaki Sakagami
2nd author affiliation Okayama University (abbrev.: Okayama Univ.)
3rd author name (Japanese/English) 舩曵 信生 / Nobuo Funabiki
3rd author affiliation Okayama University (abbrev.: Okayama Univ.)
4th author name (Japanese/English) 土信田 高 / Takashi Toshida
4th author affiliation Astrolab Inc. (abbrev.: Astrolab)
5th author name (Japanese/English) 菅 恒平 / Kohei Suga
5th author affiliation Astrolab Inc. (abbrev.: Astrolab)
Presentation date 2022-01-28
Report number ICM2021-39, LOIS2021-37
Volume vol.121
Issue no.ICM-354, LOIS-355
Page range pp.34-39 (ICM), pp.34-39 (LOIS)
Number of pages 6
Publication date 2022-01-20 (ICM, LOIS)