Hello!
I am an MSc student (Research Track) at MILA and McGill University, supervised by Professor Jackie Cheung. I work primarily in natural language processing, and I am interested in methods for training models with scarce data.
Previously, at Yale, I worked on summarization and simplification (with Prof. Arman Cohan) and on NLP for tabular data (with Prof. Dragomir Radev and Linyong Nan).
Before graduate school, I was a data scientist at McKinsey & Company (QuantumBlack) in New York and an ML engineering intern at Elicit. I completed my undergraduate degree in statistics at Yale University.
Selected Publications
Improving Factuality & Accuracy of Language Models
- Lorenzo Flores and Arman Cohan. 2024. On the Benefits of Fine-Grained Loss Truncation: A Case Study on Factuality in Summarization, EACL 2024 [Paper, Video, Code]
- Lorenzo Flores, Heyuan Huang, Kejian Shi, Sophie Chheang, and Arman Cohan. 2023. Medical Text Simplification: Optimizing for Readability with Unlikelihood Training and Reranked Beam Search Decoding, EMNLP 2023 Findings [Paper, Video, Code, Demo]
- Linyong Nan, Lorenzo Flores, Yilun Zhao, Yixin Liu, Luke Benson, Weijin Zou, and Dragomir Radev. 2022. R2D2: Robust Data-to-Text with Replacement Detection, EMNLP 2022 [Paper]
- Lorenzo Flores and Yiding Hao. 2022. Adversarial Benchmark for Fake News Classification, AAAI 2022 AdvML Workshop [Paper, Code]
Low Resource Applications
- Lorenzo Flores and Dragomir Radev. 2022. Look Ma, Only 400 Samples! Revisiting the Effectiveness of Automatic N-Gram Rule Generation for Spelling Normalization in Filipino, EMNLP 2022 SustaiNLP Workshop [Paper, Video, Code]
- Chiara Ledesma, Oshean Lee Garonita, Lorenzo Flores, Isabelle Tingzon, and Danielle Dalisay. 2020. Interpretable Poverty Mapping using Social Media Data, Satellite Images, and Geospatial Information, ML4D Workshop, NeurIPS 2020 (Best Workshop Paper Award) [Paper]