35.80 €
ISBN 978-3-8440-0213-3
Paperback
198 pages
60 figures
296 g
22.5 x 16.0 cm
English
Dissertation
July 2011
Roman Klinger
Conditional Random Fields for Named Entity Recognition
Feature Selection and Optimization in Biology and Chemistry

Most knowledge is stored and communicated as natural language text. Databases containing abstracts of journal articles or conference proceedings are freely available. Making this knowledge available in a structured form, which allows deeper analysis and combination with existing databases, requires technologies from the field of information extraction. Named entity recognition is a foundation for most such methods, including relation extraction and semantic search. Conditional random fields are an established probabilistic method for labeling sequences. Nevertheless, adapting them to novel domains or entity classes of interest requires manual effort.

This dissertation presents such adaptations for entity classes from the biological and chemical domains. It describes workflows for detecting gene and protein names, mentions of gene mutations, and chemical names that follow the nomenclature of the International Union of Pure and Applied Chemistry (IUPAC). For these classes, training corpora are discussed and built. Questions addressed include how to use knowledge from multiple annotators, how stable a model is on data from different time ranges, and how to normalize the entities found.

The presented use cases exemplify the need for feature design and selection. Different methods for choosing a meaningful feature subset are developed and evaluated; they clearly reduce both the run time and the number of features. To extend the applicability of conditional random fields, a training method based on multi-criteria optimization is introduced that allows the user to choose between different precision-recall weightings without increasing run time. Additionally, the dissertation analyzes whether automatically selected structures that go beyond the common linear structure of conditional random fields can be beneficial for named entity recognition.

These methods and analyses support the generation of workflows to build novel named entity recognition tools with less user intervention.

Keywords: Conditional Random Fields; Graphical Models; Feature Selection; Machine Learning; Text Mining; Named Entity Recognition
Fraunhofer Series in Information and Communication Technology
Edited by Fraunhofer-Verbund Informations- und Kommunikationstechnik
Volume 2011,1
Shaker Verlag GmbH
Am Langen Graben 15a
52353 Düren