NER Resources
NER is short for Named Entity Recognition, one of the fundamental tasks in NLP and a building block for many other NLP tasks.
As machine learning has developed, more and more new methods have been applied in this area. This resource book aims to give an overview of these methods.
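Most of the toolkits listed below treat NER as sequence labeling: each token gets a tag, and entities are read back from the tags. The following is a minimal sketch of that last step, assuming the common BIO scheme (`B-` begins an entity, `I-` continues it, `O` is outside). The function name `bio_to_spans` and the sample sentence are illustrative only and do not come from any of the listed toolkits.

```python
def bio_to_spans(tokens, tags):
    """Collect (entity_type, text) pairs from BIO-tagged tokens."""
    spans = []
    current_type, current_tokens = None, []
    for token, tag in zip(tokens, tags):
        # A span continues only on an I- tag matching the open entity type.
        if tag.startswith("I-") and current_type == tag[2:]:
            current_tokens.append(token)
            continue
        # Any other tag closes the open span, if there is one.
        if current_type is not None:
            spans.append((current_type, " ".join(current_tokens)))
            current_type, current_tokens = None, []
        # B- starts a new span; a stray I- is also treated as a start
        # (an assumption of this sketch, other conventions exist).
        if tag.startswith(("B-", "I-")):
            current_type, current_tokens = tag[2:], [token]
    # Flush a span that runs to the end of the sentence.
    if current_type is not None:
        spans.append((current_type, " ".join(current_tokens)))
    return spans


tokens = ["Barack", "Obama", "visited", "New", "York", "."]
tags = ["B-PER", "I-PER", "O", "B-LOC", "I-LOC", "O"]
print(bio_to_spans(tokens, tags))
```

Other tagging schemes (for example BIOES) are also in use; the decoding step changes accordingly.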
Vanilla machine learning methods
CRF
CRF is short for Conditional Random Field, a discriminative probabilistic model for segmenting and labeling sequential data.
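At prediction time, a linear-chain CRF picks the label sequence with the highest total score, which is usually found with the Viterbi algorithm. The following is a rough, pure-Python sketch of that decoding step. It assumes the per-token label scores (`emissions`) and label-to-label scores (`transitions`) have already been computed by a trained model; the numbers and the three-label set are made up for illustration.

```python
def viterbi(emissions, transitions):
    """Return the highest-scoring label sequence and its score.

    emissions[t][y]   : score of label y at position t
    transitions[a][b] : score of moving from label a to label b
    """
    n_labels = len(emissions[0])
    score = list(emissions[0])  # best score of any path ending in each label
    backpointers = []
    for t in range(1, len(emissions)):
        new_score, pointers = [], []
        for y in range(n_labels):
            # Best previous label for ending in label y at position t.
            best_prev = max(range(n_labels),
                            key=lambda p: score[p] + transitions[p][y])
            new_score.append(score[best_prev] + transitions[best_prev][y]
                             + emissions[t][y])
            pointers.append(best_prev)
        score = new_score
        backpointers.append(pointers)
    # Follow the backpointers from the best final label.
    best_last = max(range(n_labels), key=lambda y: score[y])
    path = [best_last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, score[best_last]


labels = ["O", "B-PER", "I-PER"]  # hypothetical label set
emissions = [[1.0, 2.0, 0.0],
             [0.5, 0.0, 1.5],
             [2.0, 0.0, 0.5]]
# The O -> I-PER transition is heavily penalized, as a trained model might learn.
transitions = [[0.0, 0.0, -10.0],
               [0.0, -1.0, 1.0],
               [0.0, 0.0, 0.5]]
path, best_score = viterbi(emissions, transitions)
print([labels[y] for y in path], best_score)
```

The toolkits below implement this step (and training) far more efficiently; the sketch is only meant to show what the transition scores contribute beyond per-token classification.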
Toolkits
CRF++ is a simple, customizable, and open source implementation of Conditional Random Fields (CRFs) for segmenting/labeling sequential data.
CRFsuite: A fast implementation of Conditional Random Fields (CRFs).
python-crfsuite is a Python binding to CRFsuite.
sklearn-crfsuite is a thin CRFsuite (python-crfsuite) wrapper which provides an interface similar to scikit-learn.
CRF in TensorFlow: a linear-chain CRF layer.
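CRF toolkits are typically fed hand-crafted features for each token. As a hedged illustration, the sketch below builds one feature dictionary per token, which is, to my understanding, the kind of input the Python wrappers above accept (a list of dicts per sentence). The feature names and the helper functions `token_features` and `sentence_features` are hypothetical choices for this example, not part of any toolkit's API.

```python
def token_features(sentence, i):
    """Build a feature dict for the token at position i (illustrative features)."""
    word = sentence[i]
    features = {
        "bias": 1.0,
        "word.lower": word.lower(),
        "word.istitle": word.istitle(),   # capitalization is a strong NER cue
        "word.isupper": word.isupper(),
        "word.isdigit": word.isdigit(),
        "suffix3": word[-3:],
    }
    # Context features from the neighboring tokens.
    if i > 0:
        features["prev.lower"] = sentence[i - 1].lower()
    else:
        features["BOS"] = True  # beginning of sentence
    if i < len(sentence) - 1:
        features["next.lower"] = sentence[i + 1].lower()
    else:
        features["EOS"] = True  # end of sentence
    return features


def sentence_features(sentence):
    """Return one feature dict per token in the sentence."""
    return [token_features(sentence, i) for i in range(len(sentence))]


feats = sentence_features(["John", "lives", "in", "Paris"])
print(feats[0])
```

A list of such per-sentence feature lists, paired with per-sentence tag lists, would then be passed to the chosen toolkit's training routine; check that toolkit's documentation for the exact call.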
Papers
Lafferty, J., McCallum, A., & Pereira, F. (2001). Conditional random fields: Probabilistic models for segmenting and labeling sequence data. ACM.
Sutton, C., & McCallum, A. (2012). An Introduction to Conditional Random Fields. Now Pub.
Sha, F., & Pereira, F. (2003). Shallow parsing with conditional random fields. ACLWeb.
Other readings
Tutorial on using sklearn-crfsuite for the NER task
Learning2Search
Learning to Search (L2S) is a family of structured prediction methods; an implementation is available in Vowpal Wabbit.
Toolkits
Vowpal Wabbit on GitHub
Papers
Chang, K.-W., He, H., Daumé, H., III, & Langford, J. (2015, March 19). Learning to Search for Dependencies. arXiv.org
Other readings
Named Entity Classification by Themis Mavridis from booking.com
Deep learning methods
LSTM
Toolkits
NeuroNER is a program that performs named-entity recognition (NER).
Papers
Dernoncourt, F., Lee, J. Y., & Szolovits, P. (2017, May 16). NeuroNER: an easy-to-use program for named-entity recognition based on neural networks. arXiv.org
Other readings
LSTM with CRF
Toolkits
Sequence Tagging with TensorFlow
Papers
Ma, X., & Hovy, E. (2016, March 4). End-to-end Sequence Labeling via Bi-directional LSTM-CNNs-CRF. arXiv.org.
Other readings