An input information enhanced model for relation extraction

  • Original Article
  • Published:
Neural Computing and Applications

Abstract

We present a novel end-to-end model that jointly extracts semantic relations and their argument entities from sentences. The model requires no handcrafted features or auxiliary toolkits, so it can be easily extended to a wide range of sequence tagging tasks. This paper studies a new way of using word morphology features for relation extraction: we combine the word morphology feature with the semantic feature to enrich the representational capacity of the input vectors. We then develop an input information enhanced unit for the bidirectional long short-term memory network (Bi-LSTM) to counteract the information loss caused by the gate operations and the concatenation operations in the LSTM memory unit. A new tagging scheme with uncertain labels, together with a corresponding objective function, is introduced to reduce the interference from non-entity words. Experiments are performed on three datasets: the New York Times (NYT) and ACE2005 datasets for relation extraction, and the SemEval 2010 Task 8 dataset for relation classification. The results show that our model achieves a significant improvement over the state-of-the-art model for relation extraction on the NYT dataset and competitive performance on the ACE2005 dataset.
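The abstract's idea of enriching input vectors by combining a word-level semantic embedding with a morphology feature can be illustrated with a minimal sketch. This is not the paper's architecture: the morphology feature here is a hashed character-trigram count vector, and the dimensionalities and the toy embedding are invented for demonstration.

```python
# Illustrative sketch only: one simple way to build an enriched input
# representation from a semantic (word-level) embedding plus a
# character-based morphology feature. Dimensions are toy values.
import hashlib

MORPH_DIM = 8  # hypothetical morphology feature size


def morphology_vector(word, dim=MORPH_DIM):
    """Hash character trigrams of the word into a fixed-size count vector."""
    padded = f"#{word.lower()}#"  # boundary markers expose prefixes/suffixes
    vec = [0.0] * dim
    for i in range(len(padded) - 2):
        trigram = padded[i:i + 3]
        h = int(hashlib.md5(trigram.encode()).hexdigest(), 16)
        vec[h % dim] += 1.0
    return vec


def enriched_input(word, word_embedding):
    """Concatenate the semantic embedding with the morphology feature."""
    return list(word_embedding) + morphology_vector(word)


# Toy 4-dimensional "semantic" embedding, for demonstration only.
toy_embeddings = {"York": [0.1, 0.2, 0.3, 0.4]}
vec = enriched_input("York", toy_embeddings["York"])
assert len(vec) == 4 + MORPH_DIM
```

In a full tagger, vectors like `vec` would be fed token by token into a Bi-LSTM; the paper's contribution is an enhanced memory unit that preserves more of this input information through the gates.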

Notes

  1. The NYT dataset can be downloaded at: https://github.com/shanzhenren/CoType.

Acknowledgements

The authors would like to thank Xiang Ren, Zeqiu Wu, Wenqi He, et al. for constructing and releasing the public NYT dataset. The authors are also grateful to Mikolov et al. for their publicly released word-embedding training tool. This research work is supported by the National Key Research and Development Program of China (Grant Nos. 2017YFB0803302 and 2016QY03D0602) and the National Natural Science Foundation of China (No. 61751201).

Author information

Corresponding author

Correspondence to Chong Feng.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Lei, M., Huang, H., Feng, C. et al. An input information enhanced model for relation extraction. Neural Comput & Applic 31, 9113–9126 (2019). https://doi.org/10.1007/s00521-019-04430-3
