Timothy Miller, PhD, is a scientist at the Computational Health Informatics Program (CHIP) at Boston Children's Hospital and an Instructor at Harvard Medical School. His research background is in computer science; his thesis (2010) described linear-time syntactic models for speech repair. In his current position, he works on a variety of clinical natural language processing problems. He has made core contributions in temporal information extraction (Lin et al., 2014; Miller et al., 2013; Miller et al., 2015), UMLS relation extraction (Dligach et al., 2013), coreference resolution (Miller et al., 2012; Zheng et al., 2012; Miller et al., 2017a), and negation detection (Wu et al., 2014; Miller et al., 2017b). He is also a primary contributor to open source projects, including Apache cTAKES (clinical Text Analysis and Knowledge Extraction System) and ClearTK. He is currently interested in Bayesian grammar induction, temporal information extraction in the clinical domain, and domain adaptation for clinical NLP.

From June to July 2017, he is working with LIMSI's "Traitement Automatique de la Langue Naturelle" (natural language processing) team.

His program:


Bayesian Methods for Unsupervised Multilingual Grammar Induction.
  • GT TSDT seminar, 6 June, 14:00, LIMSI, bldg 508
    • Introduction to sequence models for Natural Language Processing
    • Sequence models for NLP – Focus on hidden Markov models (HMMs), which form the core of the method, including inference techniques and their more complex siblings, hierarchical HMMs (HHMMs).
    • Optimizing HMM inference with modern GPU hardware
    • Alternative sequence models, including CRFs and RNNs
  • TLP seminar, 13 June, 11:30, LIMSI, bldg 508
    • Linear time parsing with HHMMs
    • Parsing strategies – An introduction to bottom-up, top-down, and right-corner parsing, from a psycholinguistic perspective.
    • Linear time parsing with HHMMs in a supervised machine learning framework
  • CEA-LIST seminar, 22 June, 10:00, CEA, amphitheater 34, bldg 862
    • Bayesian inference for unsupervised POS tagging and parsing
    • Bayesian inference for unsupervised POS tagging with HMMs, and unsupervised parsing with HHMMs
Topics in Clinical NLP
  • ILES seminar, 4 July, 14:00, LIMSI, bldg 508
    • Generalizability and domain adaptation in clinical NLP
    • Overview of pipeline approaches to clinical NLP
    • Evidence from multiple tasks that performance degrades across domains
    • Introduction to unsupervised domain adaptation algorithms
    • Preliminary work on domain adaptation for negation extraction
  • GT D2K seminar, 12 July, 9:00, LRI (PCRI bldg), room 455
    • Coreference resolution: state of the art and application to biomedical text.
    • Problem description, early systems, and applications – What is coreference, why is it important, what are some of the early methods, and what are some important use cases that rely on solving the coreference resolution problem?
    • Machine learning approaches – An overview of common machine learning approaches for the task, including pairwise, mention-synchronous, agglomerative clustering, and easy-first methods, as well as some unsupervised approaches.
    • Biomedical coreference resolution – Domain-specific issues with solving coreference, as well as an introduction to domain-specific resources that are available for the task.
    • Future directions for coreference research – An introduction to hot topics in coreference resolution, including search-based learning, neural-network-based representation learning, and cross-document coreference, with suggestions for how these methods can be applied to biomedical texts.