My dear friends,
sorry for a long absense š
Today I want to write about one of my previous experimental projects in the field of the automatic extractive text summarization.
Text summarization is the process of automatically creating a compressed version of a given text that provides useful information for the user. I want to focus on the generic multi-document text summarization, where the goal is to produce a summary of the many documents on the same unspecific topic, chosing a subset of the most relevant sentences. For example, having a set of news articles on the same topic, our system creates a short summarization of the most relative information from these articles.
I re-implemented an existing LexRank approach (graph-based lexical centrality as salience) and replaced the cosine similarity measure with a combination of features fromĀ ECNU [3], a new system for semantic similarity between sentences. This similarity approach is theĀ ensemble of 3 machine learning algorithms and 4 deep learning models byĀ averaging these 7 scores (EN-seven) and is one of the best approaches for calculating the semantic similarity (2016-2017).