I am giving a talk at Columbia University on evaluating Grammatical Error Correction systems across domains.
Attended EMNLP in Brussels.
PhD graduation ceremony in Edinburgh.
I started a new job as a Research Scientist at Grammarly.
At Grammarly, I am working on tools that improve how people communicate in writing. Give it a try! Recently, I have been applying Neural Machine Translation techniques to scale grammatical error correction to different domains.
I worked on adapting Google Translate to new domains. I also had my most viewed YouTube appearance so far.
As part of the Machine Translation Group at the University of Edinburgh, I was at the forefront of research on neural machine translation. For example, I developed a novel neural translation model that learns a better syntactic representation of sentences and improves translation quality for several language pairs.
Nădejde, M., Reddy, S., Sennrich, R., Dwojak, T., Junczys-Dowmunt, M., Koehn, P., Birch, A. (2017), Proceedings of the Second Conference on Machine Translation (WMT17)
Abstract: Neural machine translation (NMT) models are able to partially learn syntactic information from sequential lexical information. Still, some complex syntactic phenomena such as prepositional phrase attachment are poorly modeled. This work aims to answer two questions: 1) Does explicitly modeling target language syntax help NMT? 2) Is tight integration of words and syntax better than multitask training? We introduce syntactic information in the form of CCG supertags in the decoder, by interleaving the target supertags with the word sequence. Our results on WMT data show that explicitly modeling target syntax improves machine translation quality for German→English, a high-resource pair, and for Romanian→English, a low-resource pair, as well as for several syntactic phenomena including prepositional phrase attachment. Furthermore, a tight coupling of words and syntax improves translation quality more than multitask training. By combining target-syntax with adding source-side dependency labels in the embedding layer, we obtain a total improvement of 0.9 BLEU for German→English and 1.2 BLEU for Romanian→English.
[ PDF ]
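The interleaving of target supertags with the word sequence described in the abstract above can be sketched minimally. The example tags and the tag-before-word ordering are illustrative assumptions, not gold CCG annotations:

```python
def interleave_supertags(words, supertags):
    """Interleave CCG supertags with the target word sequence:
    each word is preceded by its supertag, so the decoder predicts
    one mixed sequence of tags and words.
    (Illustrative sketch; ordering is an assumption.)"""
    assert len(words) == len(supertags)
    out = []
    for tag, word in zip(supertags, words):
        out.append(tag)   # supertag token
        out.append(word)  # word token
    return out

# Hypothetical toy sentence with made-up supertags:
words = ["Obama", "receives", "Netanyahu"]
tags = ["NP", "(S[dcl]\\NP)/NP", "NP"]
mixed = interleave_supertags(words, tags)
# mixed is twice as long as the original word sequence
```

The resulting mixed sequence is what the decoder is trained to generate, so syntax and lexical choice are predicted jointly rather than in separate multitask output layers.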
Nădejde, M., Birch, A., Koehn, P. (2016), Proceedings of the International Workshop on Spoken Language Translation (IWSLT16)
Abstract: String-to-tree MT systems translate verbs without lexical or syntactic context on the source side and with limited target-side context. The lack of context is one reason why verb translation recall is as low as 45.5%. We propose a verb lexicon model trained with a feedforward neural network that predicts the target verb conditioned on a wide source-side context. We show that a syntactic context extracted from the dependency parse of the source sentence improves the model’s accuracy by 1.5% over a baseline trained on a window context. When used as an extra feature for re-ranking the n-best list produced by the string-to-tree MT system, the verb lexicon model improves verb translation recall by more than 7%.
[ PDF ]
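A feedforward verb lexicon model of the kind described above can be sketched as a one-hidden-layer network that maps a fixed-length source-side context to a distribution over target verbs. All sizes and weights here are toy, randomly initialized assumptions, not the paper's trained model:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sizes; a real model uses large vocabularies and trained embeddings.
CTX_VOCAB, VERB_VOCAB, EMB, HIDDEN, CTX_LEN = 20, 10, 8, 16, 5

E  = rng.normal(0, 0.1, (CTX_VOCAB, EMB))        # context-word embeddings
W1 = rng.normal(0, 0.1, (CTX_LEN * EMB, HIDDEN)) # input -> hidden
W2 = rng.normal(0, 0.1, (HIDDEN, VERB_VOCAB))    # hidden -> verb scores

def predict_verb_distribution(context_ids):
    """Score every target verb given a fixed-length source-side context
    (e.g. window or dependency-based context tokens); softmax over verbs."""
    h = np.tanh(E[context_ids].reshape(-1) @ W1)  # concatenate embeddings
    logits = h @ W2
    exp = np.exp(logits - logits.max())           # stable softmax
    return exp / exp.sum()

probs = predict_verb_distribution([1, 4, 2, 7, 3])
```

At re-ranking time, the probability the model assigns to the verb in each n-best hypothesis can be used as an additional feature score, which is how the recall gain in the abstract is obtained.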
Nădejde, M., Birch, A., Koehn, P. (2016), Proceedings of the First Conference on Machine Translation (WMT16)
Abstract: We address the problem of mistranslated predicate-argument structures in syntax-based machine translation. This paper explores whether knowledge about semantic affinities between the target predicates and their argument fillers is useful for translating ambiguous predicates and arguments. We propose a selectional preference feature based on the selectional association measure of Resnik (1996) and integrate it in a string-to-tree decoder. The feature models selectional preferences of verbs for their core and prepositional arguments as well as selectional preferences of nouns for their prepositional arguments. We compare our features with a variant of the neural relational dependency language model (RDLM) (Sennrich, 2015) and find that neither of the features improves automatic evaluation metrics. We conclude that mistranslated verbs, errors in the target syntactic trees produced by the decoder and underspecified syntactic relations negatively impact these features.
[ PDF ]
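Resnik's (1996) selectional association, the measure underlying the feature above, can be sketched on toy distributions. The argument classes and probabilities here are made up for illustration; the paper works over real corpus statistics rather than hand-set dictionaries:

```python
import math

def selectional_association(p_c_given_pred, p_c):
    """Resnik (1996) selectional association.
    p_c_given_pred: dict mapping argument class c -> P(c | predicate)
    p_c:            dict mapping argument class c -> prior P(c)
    Returns a dict c -> A(predicate, c)."""
    # Selectional preference strength: KL divergence between the
    # predicate's argument distribution and the prior over classes.
    strength = sum(pc * math.log(pc / p_c[c])
                   for c, pc in p_c_given_pred.items() if pc > 0)
    # Each class's association is its contribution, normalized by strength.
    return {c: pc * math.log(pc / p_c[c]) / strength
            for c, pc in p_c_given_pred.items() if pc > 0}

# Toy example: a verb like "eat" prefers food-like arguments.
assoc = selectional_association(
    {"food": 0.8, "tool": 0.2},   # P(class | predicate), assumed
    {"food": 0.5, "tool": 0.5})   # prior P(class), assumed
```

By construction the associations over a predicate's classes sum to one, and preferred classes (here "food") receive the larger share, which is what lets the decoder feature reward hypotheses whose argument fillers fit the predicate.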
Nădejde, M., Williams, P., Koehn, P. (2013), Proceedings of the Eighth Workshop on Statistical Machine Translation (WMT13)
Abstract: We present the syntax-based string-to-tree statistical machine translation systems built for the WMT 2013 shared translation task. Systems were developed for four language pairs. We report on adapting parameters, targeted reduction of the tuning set, and post-evaluation experiments on rule binarization and preventing dropping of verbs.
[ PDF ]