Collection of Articles On Text Categorization
The following articles helped me a lot in my work on Text
Classification. Only the titles are listed here, as I didn't want to
break any copyright laws. You can find most of these papers by using
the titles as search keywords on Google.
Section 0
- Rennie, Rifkin: Improving Multiclass Text Classification with
the Support Vector Machine (Oct. 2001)
(using 20 Newsgroups Data Set)
- Georges Siolas, Florence d'Alche-Buc:
Support Vector Machines based on a Semantic Kernel for Text
Categorization (using 20 Newsgroups Data Set)
- Burges: A Tutorial on Support Vector Machines
- Osuna et al.: Support Vector Machines,
Training and Applications
- Ngai Tang: Text Categorisation using Support Vector Machines
(interesting dissertation, 30 August 2001)
Section 1
- Domingos, Pazzani: On the
Optimality of the Simple Bayesian Classifier under Zero-One Loss
- Fabrizio Sebastiani: A Tutorial on Automated Text
Categorisation
- Fabrizio Sebastiani: Machine Learning in Automated Text
Categorization
- Fabrizio Sebastiani: Machine Learning in Automated Text
Categorization (differently formatted, i.e. 55 pages instead of 63)
- Galavotti, Sebastiani, Simi: Experiments on the Use of Feature
Selection and Negative Evidence in Automated Text Categorization
- Automatic Web Page Categorization by Link and Context Analysis
- Categorisation by Context
- Guest Editors' Introduction to the Special Issue on Automated Text
Categorization
- Caropreso, Matwin,
Sebastiani: A Learner-Independent Evaluation of the Usefulness of
Statistical Phrases for Automated Text Categorization
- Lewis et al.: Naive (Bayes) at Forty
Section 2
- Jason D.M. Rennie: Improving Multi-class Text Classification
with Naive Bayes (Master's Thesis)
- McCallum, Nigam, Rennie, Seymore: A Machine Learning Approach
to Building Domain-Specific Search Engines
- Nigam, McCallum, Thrun, Mitchell: Learning to Classify Text from
Labeled and Unlabeled Documents
- Nigam, McCallum, Thrun, Mitchell: Learning to Classify Text from
Labeled and Unlabeled Documents (condensed version)
- McCallum, Nigam: A Comparison of Event Models for Naive Bayes
Text Classification
- McCallum: Multi-Label
Text Classification with a Mixture Model Trained by EM
- McCallum, Nigam: Employing EM and Pool-Based Active Learning for
Text Classification
- Craven, DiPasquo, Freitag, McCallum, Mitchell, Nigam, Slattery:
Learning to Extract Symbolic Knowledge from the WWW
- Baker, McCallum: Distributional Clustering of Words for Text
Classification (newer?)
- Baker, McCallum: Distributional Clustering of Words
for Text Classification
- Using Maximum Entropy for Text Classification
- Andrew McCallum, Dayne Freitag, Fernando Pereira: Maximum Entropy
Markov Models for Information Extraction and Segmentation
- D'Alessio, Murray, Schiaffino: The Effect of Using Hierarchical
Classifiers in Text Categorization
Section 3
- David Yarowsky: Word-Sense Disambiguation, Using Statistical
Models of Roget's Categories, Trained on Large Corpora
- Ide, Veronis: Word Sense Disambiguation: The State of
the Art
- Schütze: Automatic Word Sense Discrimination
- Mladenic, Grobelnik: Word Sequences as Features in Text-Learning
- Yang et al.: Learning Approaches for Detecting
and Tracking News Events
Section 4
- Apte, Damerau, Weiss: Automated Learning of Decision Rules for
Text Categorization
- Susan Dumais, Hao Chen: Hierarchical Classification of Web Content
Section 5
- Lewis, Jones: Natural Language Processing for
Information Retrieval
- Wiener, Pedersen, Weigend: A Neural Network Approach
to Topic Spotting
- Gorniak, Peter: Sorting Email Messages by Topic
- Gorniak, Peter: MailMind, A Connectionist E-Mail
Sorting Client
Section 6
- Vijay Boyapati: Towards a Comprehensive Topic Hierarchy for News
- Moulinier, Raskinis, Ganascia: Text Categorization: a
Symbolic Approach
- Quasthoff, Wolff: Effizientes Dokumentclustering durch niederfrequente Terme
(in German; roughly "Efficient Document Clustering Using Low-Frequency Terms")
Section 7
- Yang, Pedersen: A Comparative Study on Feature Selection in Text
Categorization
- Yang, Liu: A Re-examination of Text Categorisation Methods
- Improving Text Classification by Shrinkage in a Hierarchy of Classes
- John, Kohavi, Pfleger: Irrelevant Features and the
Subset Selection Problem
- Martijn Spitters: Comparing feature sets for learning text categorization
- Ellen Riloff: Little Words Can Make a Big Difference
- Fuka, Hanka:
Feature Set Reduction for Document Classification Problems
- Feature subset selection in text-learning
- Ruiz, Srinivasan: Hierarchical Neural Networks for Text
Categorization
Section 8 (some only in print)
- An Algorithm for Suffix Stripping
- Hsu, Lang: Feature Reduction and Database Maintenance in NETNEWS
Classification
- Thomas Hofmann: Learning and Representing Topic
- Seminar paper: Advanced Information Retrieval
Methods
Section 9
- Sam Scott: Feature Engineering for a Symbolic
Approach to Text Classification
- Kermit et al.: Automatic Complexity
Management: Personalised Document Retrieval from the World Wide Web
- Michie et al.: Machine Learning, Neural and
Statistical Classification (298 pages! A review of different
approaches to text classification)
- Joachims: A Probabilistic Analysis of
the Rocchio Algorithm with TFIDF for Text Categorization
- Meghini et al.: A Model of
Multimedia Information Retrieval
- Slonim, Tishby: Document Clustering using Word
Clusters via the Information Bottleneck Method
- Mladenic: Turning Yahoo into an Automatic Web-Page Classifier
- Mladenic, Grobelnik: Assigning keywords to documents using machine
learning
- Fuhr et al.: AIR/X - a Rule-Based Multistage Indexing
System for Large Subject Fields
- Mitchell: Machine Learning, Slides for instructors
Various
- Articles by Junker
- Mladenic: Text-Learning and Related Intelligent Agents: A Survey
- Compression: A Key for Next-Generation Text
Retrieval Systems
- Chang: Enabling Concept-Based Relevance
Feedback for Information Retrieval on the WWW
© Copyright 1996 - 2018, Bernd Klein