Presentation | 2022-01-28 An Objective Article Search Method from Printed Japanese Contract Document Using Optical Character Recognition Shixi Chen, Masaki Sakagami, Nobuo Funabiki, Takashi Toshida, Kohei Suga |
---|---|
PDF Download Page | PDF download Page Link |
Abstract(in Japanese) | (See Japanese page) |
Abstract(in English) | A contract is essential for companies to do business successfully with one another. The contract document is an important legal document that defines the formal agreements and consists of multiple articles, where each article describes the agreement on a certain subject or condition. It is useful to automatically search for and extract the article describing a given subject, although contract documents are often filed on printed paper in many companies. In this paper, we propose an objective article search method for printed Japanese contract documents using optical character recognition (OCR) technology. From the recognized characters, it finds the article whose title contains the subject, or finds the paragraph that matches a given keyword list well. This list can be generated automatically from sample articles related to the subject in existing contract documents. For evaluation, we implemented the proposed method in Python and applied it to 35 contract documents. The results confirm the effectiveness of the proposal: the objective articles were found successfully in all of them. |
Keyword(in Japanese) | (See Japanese page) |
Keyword(in English) | contract document / article / subject / OCR / regular expression |
Paper # | ICM2021-39,LOIS2021-37 |
Date of Issue | 2022-01-20 (ICM, LOIS) |
Conference Information | |
Committee | LOIS / ICM |
---|---|
Conference Date | 2022/1/27(2days) |
Place (in Japanese) | (See Japanese page) |
Place (in English) | Online |
Topics (in Japanese) | (See Japanese page) |
Topics (in English) | Practical Use of Lifelog, Office Information System, Business Management, etc. |
Chair | Toru Kobayashi(Nagasaki Univ.) / Kazuhiko Kinoshita(Tokushima Univ.) |
Vice Chair | Hiroyuki Toda(NTT) / Haruo Ooishi(NTT) / Eiji Takahashi(NEC) |
Secretary | Hiroyuki Toda(Nagasaki Univ.) / Haruo Ooishi(NTT) / Eiji Takahashi(Bosco) |
Assistant | Kazuki Fukae(Nagasaki Univ.) / Yoshifumi Kato(NTT) |
Paper Information | |
Registration To | Technical Committee on Life Intelligence and Office Information Systems / Technical Committee on Information and Communication Management |
---|---|
Language | ENG |
Title (in Japanese) | (See Japanese page) |
Sub Title (in Japanese) | (See Japanese page) |
Title (in English) | An Objective Article Search Method from Printed Japanese Contract Document Using Optical Character Recognition |
Sub Title (in English) | |
Keyword(1) | contract document |
Keyword(2) | article |
Keyword(3) | subject |
Keyword(4) | OCR |
Keyword(5) | regular expression |
1st Author's Name | Shixi Chen |
1st Author's Affiliation | Okayama University(Okayama Univ.) |
2nd Author's Name | Masaki Sakagami |
2nd Author's Affiliation | Okayama University(Okayama Univ.) |
3rd Author's Name | Nobuo Funabiki |
3rd Author's Affiliation | Okayama University(Okayama Univ.) |
4th Author's Name | Takashi Toshida |
4th Author's Affiliation | Astrolab Inc.(Astrolab) |
5th Author's Name | Kohei Suga |
5th Author's Affiliation | Astrolab Inc.(Astrolab) |
Date | 2022-01-28 |
Paper # | ICM2021-39,LOIS2021-37 |
Volume (vol) | vol.121 |
Number (no) | ICM-354,LOIS-355 |
Page | pp.34-39(ICM), pp.34-39(LOIS) |
#Pages | 6 |
Date of Issue | 2022-01-20 (ICM, LOIS) |
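
The abstract describes two search steps applied to the OCR output: finding the article whose title contains the subject, and otherwise finding the text that matches a keyword list. The following is a minimal sketch of that idea in Python, not the authors' implementation. It assumes the OCR text is already available as a string, that article headings follow the common form 第N条（title）, and that keyword matching is scored per article body by a simple hit count. The names `ARTICLE_HEADING`, `split_articles`, `find_by_title`, and `find_by_keywords` are hypothetical.

```python
import re

# Assumed heading form such as "第2条（秘密保持）"; this pattern is an
# illustration and is not taken from the paper.
ARTICLE_HEADING = re.compile(
    r"^第\s*([0-9０-９一二三四五六七八九十百]+)\s*条\s*[（(]\s*(.+?)\s*[）)]",
    re.MULTILINE,
)


def split_articles(text):
    """Split OCR text into (number, title, body) tuples at each heading."""
    matches = list(ARTICLE_HEADING.finditer(text))
    articles = []
    for i, m in enumerate(matches):
        # The body runs from the end of this heading to the next heading.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        articles.append((m.group(1), m.group(2), text[m.end():end].strip()))
    return articles


def find_by_title(articles, subject):
    """Return the articles whose title contains the subject string."""
    return [a for a in articles if subject in a[1]]


def find_by_keywords(articles, keywords):
    """Return the article whose body contains the most keywords, or None."""
    best, best_score = None, 0
    for a in articles:
        score = sum(1 for k in keywords if k in a[2])
        if score > best_score:
            best, best_score = a, score
    return best


if __name__ == "__main__":
    sample = (
        "第1条（目的）\n"
        "本契約は、甲乙間の取引条件を定める。\n"
        "第2条（秘密保持）\n"
        "甲及び乙は、相手方の秘密情報を第三者に開示してはならない。\n"
        "第3条（損害賠償）\n"
        "甲又は乙は、相手方に与えた損害を賠償する。\n"
    )
    articles = split_articles(sample)
    print(find_by_title(articles, "秘密保持"))
    print(find_by_keywords(articles, ["損害", "賠償"]))
```

Real OCR output is likely to contain misrecognized characters and broken line starts, so a practical version would need a more tolerant heading pattern and fuzzier matching than the exact substring tests used here.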