Presentation Abstract / Keywords |
Presentation title |
2022-01-28 10:00
An Objective Article Search Method from Printed Japanese Contract Document Using Optical Character Recognition ○Shixi Chen・Masaki Sakagami・Nobuo Funabiki(Okayama Univ.)・Takashi Toshida・Kohei Suga(Astrolab) ICM2021-39 LOIS2021-37 |
Abstract |
(English) |
Contracts are essential for companies to do business with one another successfully. A contract document is an important legal document that defines the formal agreements between the parties. It consists of multiple articles, each of which describes the agreement on a particular subject or condition.
Automatically searching for and extracting the article that describes a given subject would be useful, but many companies still file contract documents on printed paper.
In this paper, we propose a method that uses optical character recognition (OCR) to search a printed Japanese contract document for the objective article, i.e., the article on a given subject. From the recognized characters, the method finds the article whose title contains the subject, or else the paragraph that best matches a given keyword list. This list can be generated automatically from sample articles on the subject taken from existing contract documents.
For evaluation, we implemented the proposed method in Python and applied it to 35 contract documents. The method found the objective article in all 35 documents, which confirms its effectiveness. |
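The two-stage search described in the abstract could be sketched in Python roughly as follows. This is only an illustration, not the authors' implementation. It assumes that the OCR step has already produced plain text, that each article starts with a heading of the form 第N条(title), and that keyword matching is scored per article rather than per paragraph. The names `Article`, `HEADING`, `split_articles`, `find_by_title`, `find_by_keywords`, and `search_article`, as well as the sample text, are hypothetical.

```python
import re
from typing import List, NamedTuple, Optional


class Article(NamedTuple):
    number: str  # article number as it appears in the heading
    title: str   # text inside the parentheses of the heading
    body: str    # text up to the next heading


# Assumed heading layout: "第N条(title)" at the start of a line,
# with either full-width or half-width parentheses.
HEADING = re.compile(
    r"^第\s*([0-90-9一二三四五六七八九十百]+)\s*条\s*[((]([^))\n]*)[))]",
    re.MULTILINE,
)


def split_articles(text: str) -> List[Article]:
    """Split recognized text into articles at each heading match."""
    matches = list(HEADING.finditer(text))
    articles = []
    for i, m in enumerate(matches):
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        body = text[m.end():end].strip()
        articles.append(Article(m.group(1), m.group(2).strip(), body))
    return articles


def find_by_title(articles: List[Article], subject: str) -> Optional[Article]:
    """Return the first article whose title contains the subject."""
    for article in articles:
        if subject in article.title:
            return article
    return None


def find_by_keywords(articles: List[Article], keywords: List[str]) -> Optional[Article]:
    """Return the article whose body contains the most keywords, if any match."""
    best, best_score = None, 0
    for article in articles:
        score = sum(1 for keyword in keywords if keyword in article.body)
        if score > best_score:
            best, best_score = article, score
    return best


def search_article(text: str, subject: str, keywords: List[str]) -> Optional[Article]:
    """Try the title search first, then fall back to keyword matching."""
    articles = split_articles(text)
    hit = find_by_title(articles, subject)
    if hit is None:
        hit = find_by_keywords(articles, keywords)
    return hit


# Hypothetical OCR output used only to exercise the sketch.
sample = """第1条(目的)
本契約は、甲乙間の取引条件を定める。
第2条(秘密保持)
甲及び乙は、相手方の秘密情報を第三者に開示してはならない。
第3条(契約期間)
本契約の有効期間は、締結日から1年間とする。
"""

by_title = search_article(sample, "秘密保持", [])
by_keywords = search_article(sample, "損害賠償", ["有効期間", "締結日"])
```

Real OCR output usually contains misrecognized characters, so a practical version would likely need looser heading patterns and some tolerance in the keyword comparison.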
Keywords |
(English) |
contract document / article / subject / OCR / regular expression |
Bibliographic information |
IEICE Technical Report, vol. 121, no. 355, LOIS2021-37, pp. 34-39, January 2022. |
Report number |
LOIS2021-37 |
Issue date |
2022-01-20 (ICM, LOIS) |
ISSN |
Online edition: ISSN 2432-6380 |
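Copyright |
Copyright of papers published in the Technical Report belongs to the IEICE (The Institute of Electronics, Information and Communication Engineers). (Permission numbers: 10GA0019/12GB0052/13GB0056/17GB0034/18GB0034) |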