Analyzing patterns using text mining techniques
Muhammadu Buhari has been the President of Nigeria since 2015, and there has been plenty of controversy around the president of Africa’s most populous nation, especially given the troubling issues that have arisen since his ascent to the presidency. On Twitter in particular, you’ll find a lot of tweeps spamming the President’s tweets with comments such as “IFB”, or “I follow back”, which is a way of protesting the president’s rule as the GCFR. …
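As a taste of the kind of pattern mining the article covers, here is a minimal sketch using the tidytext and dplyr packages on a small, made-up vector of tweet text (standing in for tweets pulled from the President’s timeline; not the article’s actual data):

```r
library(dplyr)
library(tidytext)

# Hypothetical tweets; in practice these would come via the Twitter API
tweets <- tibble(
  text = c("IFB please, I follow back", "I follow back always #IFB")
)

tweets %>%
  unnest_tokens(word, text) %>%            # split each tweet into words
  anti_join(stop_words, by = "word") %>%   # drop common English stop words
  count(word, sort = TRUE)                 # frequent terms expose patterns like "ifb"
```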
Web scraping couldn’t be easier with the ralger package
If you’re like me and you have to web scrape a lot, then you’re in for a treat. As a data scientist, you will often be responsible for dataset creation, so web scraping is a very important skill to have in your toolbox. I won’t go into much web scraping theory here; for that, take a look at this article I wrote here. You should also learn how to use SelectorGadget for Chromium browsers, detailed in my article.
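As a quick sketch of how simple the package makes things, the snippet below scrapes one set of nodes from one page; the URL and the CSS selector are hypothetical placeholders, and you would find your own selector with SelectorGadget:

```r
library(ralger)

link <- "https://example.com/articles"  # hypothetical page to scrape
node <- ".article-title"                # hypothetical CSS selector

# scrap() fetches the page and returns the matched elements as a character vector
titles <- scrap(link = link, node = node)
head(titles)
```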
There are many use cases for machine learning models. Text analytics, a branch of Natural Language Processing, provides ways of applying machine learning algorithms to textual data for classification. Patterns exist in data in ways we cannot ordinarily detect, but with analytical tools we are able to surface them. Detecting these patterns has become even easier and more scalable using machine learning, and a typical example we probably interact with every day is the spam classifier.
This article takes you through how you can represent textual data as encodings that…
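As an illustration of one such encoding, here is a minimal TF-IDF sketch with tidytext, using a tiny made-up corpus rather than the article’s data:

```r
library(dplyr)
library(tidytext)

# Hypothetical two-document corpus
docs <- tibble(
  doc  = c(1, 1, 2),
  text = c("win a free prize now", "free free free", "meeting at noon tomorrow")
)

docs %>%
  unnest_tokens(word, text) %>%   # one row per word per document
  count(doc, word) %>%            # term counts per document
  bind_tf_idf(word, doc, n)       # adds tf, idf and tf_idf columns, usable as model features
```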
Predicting Employee Churn
Employee churn is the overall turnover in an organization’s staff as existing employees leave and new ones are hired. The churn rate is usually calculated as the percentage of employees leaving the company over a specified time period. Although some staff turnover is inevitable, a high rate of churn is costly.
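For a concrete sense of the calculation, here is the churn rate described above computed in R, with made-up headcount figures:

```r
# Hypothetical figures for one year
employees_at_start <- 200   # headcount at the start of the period
employees_left     <- 18    # employees who left during the period

churn_rate <- employees_left / employees_at_start * 100
churn_rate  # 9, i.e. a 9% churn rate over the period
```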
Don’t get overwhelmed with so many resources
As I write this, I have no fewer than 10 data science courses on Udemy alone, combined with free micro-courses on Kaggle and Cognitive AI, and a ton of free online PDFs. You would think I spend 20 hours a day learning… lol, I wish.
Are you a newbie? Intermediate? Whatever level you consider yourself to be at, it is very easy to get overwhelmed by so many resources. Perhaps, as you read this, you find yourself spending hours dabbling in various resources while making very little progress. You find yourself constantly comparing…
Every resource you’d ever need, from beginner to advanced
R is a programming language and free software environment for statistical computing and graphics, supported by the R Foundation for Statistical Computing. It is readily available to any user, with over 13,000 packages already developed and more still in development. With such a wide variety of packages available, one may get overwhelmed and not know where to start, or where to go next, in the learning process.
To make your learning process easier, this article gives you a concise list of materials ranging…
Simultaneously scraping multiple web pages with R
This short tutorial covers how to scrape multiple pages of a website simultaneously. It assumes you can use the Google Chrome CSS SelectorGadget; if not, see the first part here.
Because some web pages contain large chunks of data (or text), e.g. a comment section, comments usually spill over onto subsequent pages. Pagination may also be used for sorting; whatever the case may be, scraping each of these pages individually would be tedious. …
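To preview the idea, here is a minimal sketch using rvest (shown as one illustration of the technique; the base URL and the .comment-text selector are hypothetical, and the selector would come from SelectorGadget):

```r
library(rvest)

base_url <- "https://example.com/comments?page="  # hypothetical paginated URL
pages    <- paste0(base_url, 1:5)                 # build links for pages 1 to 5

# Visit every page and combine all comments into one character vector
comments <- unlist(lapply(pages, function(url) {
  read_html(url) %>%
    html_nodes(".comment-text") %>%  # the same selector works on every page
    html_text()
}))
```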
Understanding your data
EDA is a critical, core skill every data analyst/scientist should have. It involves an in-depth examination of the possibilities captured in a dataset.
“Exploratory data analysis is an attitude, a state of flexibility, a willingness to look for those things that we believe are not there, as well as those we believe to be there.” — John W. Tukey
EDA involves the application of statistical and visualization techniques to understand and gain insights into data. As long as there is data, there is a need to explore. …
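A minimal sketch of a first EDA pass, using R’s built-in mtcars data rather than any dataset from the article:

```r
data(mtcars)

str(mtcars)      # dimensions and variable types
summary(mtcars)  # five-number summaries help spot skew and outliers

# Quick visual check of pairwise relationships between a few variables
pairs(mtcars[, c("mpg", "wt", "hp")])
```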
A large percentage of the top ten most in-demand tech skills require some form of programming. Our increasing reliance on machines and computers requires that we tell them what to do, and what not to do.
Programming is how we achieve this. Basic human abilities are being modeled for computers to learn, such as understanding emotions through Natural Language Processing; computers are even being trained, using neural networks, to smell. Consider programming a schooling process for computers.
The 21st-century economy is called a knowledge-based economy, an economy where goods and services are produced through the innovative use of knowledge.
In the 21st century, you need leverage. In the 20th century, the leverage required to achieve maximum output was heavy machinery of some kind. In the Agrarian age, wealth was born of the natural tillage of the soil, i.e. nature; in the Industrial age, man harnessed nature through machines, and machinery became the new source of wealth. In the 21st century, the Digital age, the leverage is data and knowledge.
The best innovators are those who have been able…
Data Scientist who writes too.