Text2Story 2020
Third International Workshop on Narrative Extraction from Texts
held in conjunction with the 42nd European Conference on Information Retrieval
Building upon the success of the previous two editions of the workshop (Text2Story@ECIR’18 and Text2Story@ECIR’19) and the subsequent special issue hosted at the IPM journal, we will organize the third edition of the Text2Story Workshop on Narrative Extraction from Texts at ECIR’20. Although natural language understanding has improved over the last few years, with research emerging from information extraction and text mining, the problem of constructing consistent narrative structures is yet to be solved.
In the third edition of the workshop, we aim to discuss scientific advances on all aspects of storyline identification from texts including but not limited to narrative extraction and understanding, content generation, formal representation, and visualization of narratives. This includes the following topics (not necessarily complete):
We invite two kinds of submissions:
All papers must be formatted according to the LNCS style. Papers must be submitted electronically in PDF format through our EasyChair link.
Abstract: Media monitoring is the activity of monitoring the output of the print, online and broadcast media to power the decision-making process of people and organizations (e.g., analysis of emerging technologies, competitive intelligence, public reputation, brand awareness). This is a resource-intensive task that raises several challenges. The main issue discussed in this talk is how we can process and aggregate a vast amount of multilingual data to discover relevant stories, entities, topics and events, while at the same time meeting the specific information needs of each user. These needs can range from monitoring specific entities (competitors, brands, influencers), to coarse topics (e.g., “Aerospace Industry”), and even to fine-grained or ephemeral queries (e.g., “Return of the 737 Max model to service”). In this talk we’ll discuss how we can empower users with relevant and personalized content in the media monitoring setting and introduce the approach Priberam is taking to the problems at hand, in particular by training text retrieval models on-the-fly from user feedback and integrating them in a media monitoring workflow.
Bio: Sebastião Miranda is the Head of Development at Priberam, a Portuguese SME that provides cutting-edge Natural Language Understanding and Artificial Intelligence technologies to companies in the Media, Legal and Healthcare industries. He started working on the problem of Multilingual Media Monitoring in 2016 during the 3-year SUMMA European H2020 research project with media partners British Broadcasting Corporation (BBC) and Deutsche Welle, and further developed this work during 2019 to tackle the problem of Technology Watch with Brazilian plane manufacturer Embraer. Sebastião holds an MSc in Electrical and Computer Engineering from Instituto Superior Técnico (University of Lisbon), 2014, and has published work on High Performance Computing, Text Summarization, Entity Recognition and Linking, Crosslingual Clustering, and Fact-Checking.
Abstract: Specific discourse types present their own special challenges across the spectrum of NLP techniques, from models that are trained or tuned on specific types of discourse (e.g., Wall Street Journal articles), to techniques making certain assumptions about the text (e.g., that everything described takes place "in the real world"). The narrative discourse form presents numerous interesting situations that both challenge the capabilities of existing techniques and suggest novel NLP tasks that are specifically relevant to narrative. I review recent progress in the Cognac Laboratory on NLP as applied to narrative. I discuss four tasks. First, story detection, a variation of the text classification task where the goal is to identify whether a text contains a narrative. Second, animacy and character detection, where the goal is to determine whether a referent is animate and is acting as a "character". We see that this approach requires some narratological sophistication to be successful. Third, new improvements in sub-event detection on narrative texts that take advantage of certain important features of narrative discourse. And, fourth, new approaches to timeline extraction that significantly improve our ability to extract, organize, and characterize timelines of events. This collection of results represents concrete steps toward our ability to "do text2story", and points the way forward to an approach to NLP that is truly "narratologically aware."
Bio: Dr. Mark A. Finlayson is Eminent Scholar Chaired Assistant Professor of Computer Science in the School of Computing and Information Sciences at Florida International University (FIU) in Miami, Florida. His research intersects artificial intelligence, natural language processing, cognitive science, and the digital humanities. He directs the Cognac Laboratory, whose members focus on advancing the science of narrative, including: understanding the relationship between cognition, narrative, and culture; developing new methods and techniques for investigating questions related to language and narrative; and endowing machines with the ability to understand and use narratives for a variety of applications. He received his Ph.D. from MIT in computer science in 2012 under the supervision of Professor Patrick H. Winston. He also holds an M.S. in Electrical Engineering from MIT (2001) and a B.S. in Electrical Engineering from the University of Michigan, Ann Arbor (1998), and served as a research scientist at the MIT Computer Science and Artificial Intelligence Laboratory for 2½ years before coming to FIU. Dr. Finlayson was awarded an NSF CAREER Grant in 2018 in artificial intelligence and natural language processing, and in 2019 was the recipient of an IBM Faculty Research Award, as well as being named the US Patent and Trademark Office Edison Fellow for Artificial Intelligence. His work has been funded by NSF, NIH, ONR, DHS, and DARPA.
09h00 – 09h20 | Introduction (Ricardo Campos) |
Session Chair: Adam Jatowt | |
09h20 – 10h00 | Keynote 1: Tailoring Media Monitoring with User Feedback (Sebastião Miranda) |
Session Chair: Marc Spaniol | |
10h00 – 10h20 | Incorporating Context and Knowledge for Better Sentiment Analysis of Narrative Text (Chenyang Lyu, Tianbo Ji and Yvette Graham) |
10h20 – 10h40 | Temporal Embeddings and Transformer Models for Narrative Text Understanding (Vani Kanjirangat, Simone Mellace and Alessandro Antonucci) |
Session Chair: Arian Pasquali | |
10h40 – 11h00 | Measuring Narrative Fluency by Analyzing Dynamic Interaction Networks in Textual Narratives (O-Joun Lee and Jin-Taek Kim) |
11h00 – 11h20 | Coffee Break - Breakout Rooms |
11h20 – 11h40 | Towards a Cross-article Narrative Comparison of News (position paper) (Martino Mensio, Alistair Willis and Harith Alani) |
Session Chair: Satya Almasian | |
11h40 – 12h00 | Time-centric Exploration of Court Documents (Philip Hausner, Dennis Aumiller and Michael Gertz) |
12h00 – 12h20 | Creating Signed Networks of News Events (Roshni Chakraborty, Srishti Bhandari, Nilotpal Chakraborty and Ritwika Das) |
Session Chair: Jeremy Pickens | |
12h20 – 12h40 | Batch Clustering for Multilingual News Streaming (Mathis Linger and Mhamed Hajaiej) |
12h40 – 13h00 | Timelines: Entity-centric Event Extraction from Online News (demo) (Jakub Piskorski, Vanni Zavarella, Martin Atkinson, and Marco Verile) |
13h00 – 14h00 | Lunch break |
14h00 – 15h00 | ECIR Keynote |
15h00 – 15h15 | ECIR Main Conference Coffee Break |
Session Chair: Alípio Jorge | |
15h15 – 15h55 | Keynote 2: Recent Advances in Narrative Natural Language Processing (Mark Finlayson) |
Session Chair: Ismail Sengor Altingovde | |
15h55 – 16h15 | A Framework towards Computational Narrative Analysis on Blogs/Social Media (Kiran Kumar Bandeli, Muhammad Nihal Hussain and Nitin Agarwal) |
16h15 – 16h35 | Teargas, Water Cannons and Twitter: A case study on detecting protest repression events in Turkey 2013 (Fatma Elsafoury) |
16h35 – 16h50 | Coffee Break - Breakout Rooms |
Session Chair: Paulo Quaresma | |
16h50 – 17h10 | Breaking the Subtopic Barrier in Cross-Document Event Coreference Resolution (Michael Bugert, Nils Reimers, Shany Barhom, Ido Dagan and Iryna Gurevych) |
17h10 – 17h30 | Scene Linking Annotation and Automatic Scene Characterization in TV Series (Aman Berhe, Camille Guinaudeau and Claude Barras) |
Session Chair: Sumit Bhatia | |
17h30 – 17h50 | Moving beyond triples (position paper) (Fabian M. Suchanek) |
17h50 – 18h10 | Wrap up and Awards (Ricardo Campos, Alípio Jorge, Adam Jatowt, Sumit Bhatia) |
Important Information for Participation
Due to the outbreak of COVID-19, ECIR 2020 and, likewise, the Text2Story workshop will be held as an open online event.
If you are interested in participating in the Text2Story workshop, proceed as follows.
Before the workshop starts:
During the Q&A:
Useful links:
More information: http://www.ecir2020.org/