Introduction
Text mining allows us to derive useful insights from unstructured and unsorted data. Data scientists who can perform these operations are highly sought after in the workforce.
Learn the tools and techniques for mining and analyzing text data with R. Discover patterns, and extract knowledge and trends from unstructured data to support decision making. The course includes a hands-on example of extracting data from the web.
Course Content
- Introduction
- Introduction to text mining
- Applications of text mining
- Basic Text Functions
- Text manipulation functions
- Working with strings
- Working with gsub
- Advanced methods
- Converting to corpus
- Importing and Converting Data
- Converting docx into corpus
- Converting pdf into corpus
- Converting html to corpus
- Web scraping
- Tidytext Package
- Tidying text objects
- Tidying document-term matrix objects
- Tidying document-feature matrix objects
- Tidying corpus objects
- Mining literary works
- Word Frequencies & Relationships
- Pre-processing text
- Wordcloud
- Frequency analysis
- N-grams & bigrams
- Bigrams for sentiment analysis
- Visualizing a bigram network
- Sentiment Analysis
- Sentiment libraries
- Analyzing positive & negative words
- Comparing 3 sentiment libraries
- Common positive & negative words
- Topic Modelling
- Latent Semantic Indexing (LSI)
- Latent Dirichlet Allocation (LDA)
- Word-topic probabilities
- Document-topic probabilities
- Chapter probabilities
- Per-document classification
- Document Similarity & Classifier
- Text alignment & pairwise comparison
- Minhashing and locality-sensitive hashing
- Extracting keywords
- Classify by location, language, topic
- Extracting data from the Internet and Social Media
- Extracting data from Amazon
- Extracting data from Twitter
- Extracting YouTube comments
- Extracting Facebook comments