T.J. Trimble

Natural Language Processing
& Machine Learning Engineer

trimblet at me dot com


Projects

Adjectives in the LinGO Grammar Matrix

Master's Thesis & Project

Supervisor: Emily Bender

I extended the Grammar Matrix, an open source grammar engineering project, to enable the morphological, syntactic, and semantic analysis of adjectives cross-linguistically.

  • I developed a core HPSG analysis of adjectives that accounts for data from dozens of languages.
  • I extended an online grammar customization system, which generates starter HPSG-style grammars for natural languages, adding new features in Python and JavaScript.
  • I extended a Python server-side grammar customization library to produce machine-readable grammatical descriptions of adjectival lexemes, inflection, and agreement.
  • I developed a test suite covering both constructed and natural languages to guide system development and serve as regression tests.

MachineLearningTools

I designed and implemented a collection of machine learning classification algorithms around a common collection of classes. These classes are designed to be useful in implementing a variety of machine learning algorithms.

  • Includes several classifiers, among them Decision Trees, Naive Bayes, and KNN.
  • Includes utility methods for working with the Java standard library that are useful when implementing machine learning algorithms.

Question Answering with Off-the-shelf Deep Processing Systems

with Woodley Packard & Melanie Bolla

We designed a TREC-style Question Answering system using open source deep processing tools such as the Stanford CoreNLP dcoref coreference resolver, WordNet, DELPH-IN syntax/semantics processing, and NLTK.

  • I designed and implemented a distributed coreference resolution system for TREC-style questions using Stanford CoreNLP dcoref and HTCondor.
  • Our system was the best in our class after 9 weeks of development, achieving a TREC strict score of 21.76 and a lenient score of 33.89 on unseen data.

Towards Augmenting Coreference Resolution with a Broad Coverage Precision Grammar

with Ryan Aldrich

We used the English Resource Grammar, a large open source HPSG grammar, to extend the Stanford CoreNLP dcoref coreference resolution system, augmenting its existing functionality with semantic representations.

  • I designed several new rules for coreference resolution using semantic representations in Minimal Recursion Semantics.
  • Our system increased the best-in-class dcoref CoNLL recall score by 1.63 on unseen data after 8 weeks of development.

Training Joint Models to Discover Topic Sentiment in Review Mining

with Yi-Shu Wei

We designed and implemented an end-to-end sentiment analyzer using machine learning classifiers in MALLET. We implemented a feature selection algorithm that uses Latent Dirichlet Allocation (LDA) to divide the data by topic in an attempt to improve training. We found that LDA topic modeling did not improve classifier performance.

  • I implemented an end-to-end sentiment analyzer using shallow features and MALLET machine learning classifiers.
  • My system achieved 69.7% accuracy on unseen data in a three-way classification task.

Education

Professional Master of Science, Computational Linguistics

(2012-2014)

Coursework & Projects in Natural Language Processing, Machine Learning, Statistics, Systems Engineering, and Linguistics.

Bachelor of Arts, Linguistics

(2007-2009)

Coursework in Syntax, Semantics, Morphology, Phonology, Phonetics, Psycholinguistics, and Neurolinguistics.


Key Coursework

Advanced Statistical Methods in Natural Language Processing

I worked in a team to implement and test several machine learning algorithms and techniques, including Decision Trees, KNN, Naive Bayes, and Support Vector Machines. I also implemented techniques for improving classifier performance, such as feature selection algorithms (chi-squared) and boosting methods (such as Transformation-Based Learning).

Deep Processing Techniques for NLP

I worked in a team to implement several deep processing methods, such as parsing, word sense disambiguation, and coreference resolution, using techniques such as CKY, (P)CFGs, and Hobbs' algorithm.

Linguistic Expressions of Sentiment, Subjectivity, and Stance

This course included presentations and discussions of cutting-edge sentiment analysis research, such as review mining, aspect extraction, spam recognition, and summarization. I developed an end-to-end sentiment analysis system using MALLET (see above).

MRS in Applications

This course consisted of presentation and discussion of several NLP applications, including sentiment analysis, summarization, and coreference resolution, and how to apply deep processing techniques, especially with respect to graph-based sentential semantic models (Minimal Recursion Semantics), to improve existing cutting-edge systems.


Related Career Experience

Grader: Shallow Processing Techniques for NLP, Deep Processing Techniques for NLP, Advanced Statistical Methods in Natural Language Processing

I worked with a Teaching Assistant to analyze student projects and code, give feedback on student understanding of the material and programming techniques, and develop grading policies and practices.


© T.J. Trimble