Janosch Haber

Based in London, United Kingdom · janoschhaber at gmail.com

I am a Senior Deep Learning Researcher with a focus on (large) language models and natural language processing (NLP).
My work experience includes customer feedback analysis at Chattermill and hate speech detection at Rewire.
Before that, I completed a PhD in Computational Linguistics at Queen Mary University of London (QMUL).

Download my CV here.

News

The MYgration Project

Through April and May 2024, our open-air exhibition MYgration will be open to the public in Westerpark Amsterdam. This exhibition is the culmination of a series of interviews with people who, for various reasons, have left their own country to come to the Netherlands.
You can view the online version of the exhibition at mygration.nl

01.04.2024

Joining Chattermill as Senior Deep Learning Researcher

At the end of June, I'll be joining Chattermill's data science team to work on developing dedicated neural language models to analyse customer feedback at scale.

15.06.2023

Rewire has been acquired by ActiveFence

Find a press statement here.

08.03.2023

PhD Thesis submitted

You can now find my PhD thesis on polysemy at the Online Library of QMUL.

01.11.2022

Enrichment placement at The Alan Turing Institute

In October 2021 I will be joining The Alan Turing Institute as an Enrichment Student.

07.09.2021

We founded our NGO Correspondents of the World

Together with some friends, I founded Correspondents of the World, an online platform where people from all around the world share personal stories about their experiences with global issues such as environmental change and the climate emergency, migration, gender and sexuality, liberation, education and the coronavirus. Find out more on correspondentsoftheworld.com.

01.05.2020

PhotoBook Task and Dataset Website is online

Visit our new website for the PhotoBook Task and Dataset at www.dmg-photobook.github.io.

10.06.2019

Publications

Polysemy - Evidence from Linguistics, Behavioural Science and Contextualised Language Models

Computational Linguistics

with Massimo Poesio.
Download the .bibtex file

2024

Modelling Brain Representations of Words' Concreteness in Context using GPT-2 and Human Ratings

Cognitive Science

with Andrea Bruera, Yuan Tao, Andrew Anderson, Derya Cokal, and Massimo Poesio.
Download the .nbib file

2023

Improving the Detection of Multilingual Online Attacks with Rich Social Media Data from Singapore

61st Annual Meeting of the Association for Computational Linguistics (ACL’23)

with Bertie Vidgen, Matthew S. Chapman, Vibhor Agarwal, Roy Ka-Wei Lee, Yong Keong Yap, and Paul Röttger.
Download the .bib file

2023

Computational Models of Anaphora

Annual Review of Linguistics

with Massimo Poesio, Juntao Yu, Silviu Paun, Abdulrahman Aloraini, Pengcheng Lu, Janosch Haber, and Derya Cokal

2023

Word Sense Distance and Similarity Patterns in Regular Polysemy

PhD Thesis, Queen Mary University of London

2023

Patterns of Polysemy and Homonymy in Contextualised Language Models

In Findings of the Association for Computational Linguistics: EMNLP 2021

with Massimo Poesio.

2021

Assessing Polyseme Sense Similarity through Co-predication Acceptability and Contextualised Embedding Distance

In Proceedings of the Ninth Joint Conference on Lexical and Computational Semantics (*SEM 2020)

with Massimo Poesio.
Download the .bib file or view a video recording of my talk.

2020

Word Sense Distance in Human Similarity Judgements and Contextualised Word Embeddings

In Proceedings of the Conference on Probability and Meaning (PaM 2020)

with Massimo Poesio.
Download the .bib file or view a video recording of my talk.

2020

Classification of Low-Agreement Pronouns Through Collaborative Dialogue: A Proof of Concept

In Proceedings of the 24th Workshop on the Semantics and Pragmatics of Dialogue (SemDial 2020)

with Massimo Poesio.
Download the .bib file

2020

The PhotoBook Dataset: Building Common Ground through Visually-Grounded Dialogue

In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics (ACL 2019)

with Tim Baumgärtner, Ece Takmaz, Lieke Gelderloos, Elia Bruni and Raquel Fernández.
Download the .bib file or visit the project website.

2019

How should we call it? - Introducing the PhotoBook Conversation Task and Dataset for Training Natural Referring Expression Generation in Artificial Dialogue Agents

Master's Thesis

under supervision of Dr. Raquel Fernández and Dr. Elia Bruni

2018

Education

Doctor of Philosophy (PhD) in Computational Linguistics

Queen Mary University of London (QMUL), United Kingdom

Focus on ambiguity and specifically the interpretation of polysemous expressions.

PhD Thesis written under supervision of Massimo Poesio.
Topic: Word Sense Distance and Similarity Patterns in Regular Polysemy.

September 2018 - July 2022

Master of Science (MSc) in Artificial Intelligence

University of Amsterdam (UvA), The Netherlands

Focus on the interface of Natural Language Processing and Machine Learning

Master's Thesis written under the supervision of Dr. Raquel Fernández and Dr. Elia Bruni.
Topic: Partner-Specificity in Visually-Grounded Dialogue.
Supported by a Facebook ParlAI Research Award

Cum Laude, GPA: 8.7

August 2016 - July 2018

Bachelor of Science (BSc) in Artificial Intelligence

University of Amsterdam (UvA), The Netherlands

Bachelor's Thesis written under the supervision of Dr. Roberto Valenti.
Topic: Modeling Distributed Cybernetic Management for Resource Based Economies - A simulation approach to Stafford Beer’s 1971 CyberSyn Project.

Cum Laude, GPA: 8.3

August 2012 - July 2015

Zeugnis der Allgemeinen Hochschulreife (Abitur)

Internatsschule Schloss Hansenberg (ISH), Johannisberg, Germany

Advanced courses in Mathematics, Chemistry and Politics & Economy

Final grade: 1.6

August 2009 - July 2011

Experience

Senior Deep Learning Scientist

Chattermill

Main tasks: Developing and testing dedicated LLMs for analysing customer feedback at scale.

June 2023 - now

Lead NLP Researcher

Rewire

Main tasks: Developing and testing dedicated LLMs for detecting hate speech and other forms of abusive language.

July 2022 - June 2023

Co-Founder, Board Member and Webdesigner

Correspondents of the World, correspondentsoftheworld.com

Main tasks: Managing long-term goals and operational tasks for a non-profit organisation with team members working remotely from all over the world; designing and maintaining the organisation's website.

Ongoing

Reviewer for the NeurIPS Datasets and Benchmarks Track 2021

Thirty-fifth Conference on Neural Information Processing Systems

June 2021 - September 2021

Teaching Assistant for Natural Language Processing

Queen Mary University of London, United Kingdom

Main tasks: Supervising lab sessions, explaining implementation concepts, answering students' questions, and preparing and grading exercises.

September 2019 - August 2021

Student Assistant for the BSc Artificial Intelligence

University of Amsterdam (UvA), The Netherlands

Courses: Computersystemen, Computational Logic, Brein & Cognitie and Natuurlijke Taalmodellen en Interfaces

September 2016 - June 2017

Volunteering as Assistant in a Community Center

House of Light, Shefa-‘Amr, Israel

Main tasks: Organizing and supporting regular youth and children's meetings, maintaining public relations, improving communications with supporters and helping out the center's founders and members in a wide range of tasks.

Internationaler Jugendfreiwilligendienst (IJFD) with CFI Freiwilligendienste

October 2016 - June 2017

Student Assistant for the BSc Artificial Intelligence

University of Amsterdam (UvA), The Netherlands

Courses: Brein & Cognitie and Natuurlijke Taalmodellen en Interfaces

January 2015 - June 2015

Volunteering as High School Assistant Teacher

Jabez Christian School, Dasmariñas, Philippines

Main tasks: Teaching classes in Informatics and Politics (seniors), assisting the Kindergarten supervisors and organizing activities for children in the affiliated orphanage.

Internationaler Jugendfreiwilligendienst (IJFD) with Co-Workers International

September 2011 - August 2012

Awards & Scholarships

Enrichment Student at The Alan Turing Institute

9-month placement starting in October 2021.

Autumn 2021

Accepted for an Internship as NLP Applied Researcher at Amazon Seattle

Internship cancelled due to the coronavirus pandemic.

Autumn 2020

Research Scholarship

Awarded by the DALI project, funded by European Research Council Grant 695662

October 2018 - September 2021

Full Undergraduate and Graduate Scholarships

Awarded by Evangelische Studienstiftung Villigst e.V., Germany

February 2013 - July 2018

Jugendsoftware-Preis

International Competition hosted by Klaus Tschira Stiftung, Germany

Developed, programmed and presented interactive learning software as part of a team.

2009