My dear friends,
sorry for the long absence 🙂
Today I want to write about one of my previous experimental projects in the field of automatic extractive text summarization.
Text summarization is the process of automatically creating a compressed version of a given text that keeps the information useful to the reader. I want to focus on generic multi-document text summarization, where the goal is to summarize several documents on the same general topic by choosing a subset of the most relevant sentences. For example, given a set of news articles on the same topic, our system creates a short summary of the most relevant information from these articles.
I re-implemented the existing LexRank approach (graph-based lexical centrality as salience) and replaced its cosine similarity measure with a combination of features from ECNU, a new system for semantic similarity between sentences. This similarity approach is an ensemble of 3 machine learning algorithms and 4 deep learning models whose 7 scores are averaged (EN-seven), and it was one of the best approaches for calculating semantic similarity in 2016-2017.
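The idea of LexRank with a swappable similarity measure can be sketched roughly as follows. This is a minimal illustration, not the code from the project: it uses plain term-frequency cosine similarity (the original LexRank uses an idf-modified cosine), a thresholded sentence graph and PageRank-style power iteration. The function names and the `threshold` and `damping` values are assumptions chosen for the example.

```python
import math
import re
from collections import Counter


def bag_of_words(sentence):
    """Lowercase a sentence and count its word tokens."""
    return Counter(re.findall(r"[a-z0-9]+", sentence.lower()))


def cosine_similarity(a, b):
    """Plain term-frequency cosine similarity between two sentences."""
    va, vb = bag_of_words(a), bag_of_words(b)
    dot = sum(va[w] * vb[w] for w in va.keys() & vb.keys())
    norm = (math.sqrt(sum(c * c for c in va.values()))
            * math.sqrt(sum(c * c for c in vb.values())))
    return dot / norm if norm else 0.0


def lexrank(sentences, similarity=cosine_similarity,
            threshold=0.1, damping=0.85, iterations=100):
    """Score sentences by centrality in a thresholded similarity graph."""
    n = len(sentences)
    if n == 0:
        return []
    # Connect two sentences if they are similar enough; every sentence
    # is connected to itself, so no row of the matrix is empty.
    adjacency = [
        [1.0 if i == j or similarity(sentences[i], sentences[j]) >= threshold
         else 0.0
         for j in range(n)]
        for i in range(n)
    ]
    # Row-normalize to get transition probabilities of a random walk.
    transition = [[v / sum(row) for v in row] for row in adjacency]
    # Power iteration with a uniform random jump (PageRank convention).
    scores = [1.0 / n] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) / n
            + damping * sum(scores[i] * transition[i][j] for i in range(n))
            for j in range(n)
        ]
    return scores


def summarize(sentences, k=2, **kwargs):
    """Return the k highest-scoring sentences in their original order."""
    scores = lexrank(sentences, **kwargs)
    top = sorted(range(len(sentences)),
                 key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]


if __name__ == "__main__":
    docs = ["Cats chase dogs", "Cats drink milk", "Dogs bark loudly"]
    print(summarize(docs, k=1))
```

To try a different measure, pass any function that maps two sentences to a score as `similarity`; a semantic similarity model such as the ensemble described above would plug in at that point.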
Continue reading “Experiments with the semantic similarity measure between sentences for the LexRank Text Summarization System”
Data-to-text System using encoder-decoder architecture with attention, BiRNN and LSTMs
“Speak English!” said the Eaglet. “I don’t know the meaning of half those long words, and, what’s more, I don’t believe you do either!” — “Alice in Wonderland”, Chapter 3
Let’s teach computers to speak 😉
Today you will read about a Natural Language Generation (NLG) system that can describe images given some textual attributes.
Keywords: natural language generation (NLG), data-to-text, natural language processing (NLP), image description, encoder-decoder architecture, sequence-to-sequence architecture, bidirectional recurrent neural networks (BiRNNs), long short-term memory neural networks (LSTMs), attention mechanism, neural word embeddings, machine learning, deep learning, structured data
Why do we need such a system?
and many other useful things 🙂
Continue reading “AI describing images: Natural Language Generation (NLG) using textual attributes”
What I did in the last few months and what we are going to talk about in the future
After quite a long pause, I am back with you with an ocean of very important and interesting information 🙂
This year was a year of inspiration for me,
Continue reading “Again with you – a small overview of topics/projects”
Pre-process unstructured Data
Hello, my dear friends. In the last article we had an overview of some interesting datasets for Natural Language Processing and Machine Learning. Let’s learn how to work with them!
Data preparation is the alpha and omega for every data scientist. Even with good data, you cannot access the information in it unless it is processed. Only about 20% of information today is available in structured form. The majority of data comes as text, which is highly unstructured by nature.
In order to produce actionable insights from data, you have to prepare it. In this article we’ll learn how to pre-process text information.
Continue reading “Prepare your data. Part 1: Pre-processing”
How to find Good Data for AI projects
Dear AI friends,
all you need is … DATA! And love, of course 🙂
No, I’m not joking. The fuel for any AI system is data. No matter how clever the technologies are, they depend on data. More importantly, they depend on “good” data. If you have good data, you have already solved 50% of your problem. Any AI system is “data-hungry” and can only be as smart as the information you provide it with.
So, before we start with clever ML algorithms, let’s make sure we know how to find data for your AI 😉
Continue reading “Find Data for AI”
Creating a bot in Python with Telegram (an echo-bot, an else-if bot and a bot with Levenshtein Distance)
Hello, my dear friends 🙂
Today, we are going to create our own communication bot.
There are two reasons for this.
- Reason Number One: Today it’s a TREND and a MUST
First of all, bots have become a real craze in the modern world. It’s about humans talking directly to machines. It’s about science fiction. It’s about the future.
Continue reading “Let’s create a chatbot”
I’m Tatjana Chernenko, a writer and computer scientist living in Germany.
I’m a student of Computational Linguistics at the University of Heidelberg. My interests include science, Artificial Intelligence, Machine Learning and Natural Language Processing.
Every day I gain a lot of interesting experience and valuable knowledge. I found my passion in the world of science and AI. Every single step in the journey of a computer scientist is a kind of magic. I decided to start this blog to share my knowledge with you!
Continue reading “AI Blog”