RSentiment

Every system needs continuous improvement, and feedback, positive or negative, plays an important role in it. Humans interpret the tone of feedback almost instinctively, but teaching a machine to do the same is highly complex. Various algorithms and tools are available today to automatically identify and categorize the opinions expressed in textual feedback.

Sentiment analysis has a wide range of useful applications. It gives a broad overview of opinion on various topics and makes it possible to quickly understand the impact of a product or system and react accordingly.

In one of my works, I applied sentiment analysis to predict the opinion of students regarding various academic dimensions of an institute. It is published by Springer. I used R for the purpose and tried various packages already available on CRAN, but none of them worked the way I needed. So I conceptualized a tool that applies text mining techniques to elicit insights from textual data, and published it on CRAN as an open source package (RSentiment).

RSelenium: A wonderful tool for web scraping.

For one of my projects, I needed to fetch data in R from online sources. We all know that it's common practice to collect data from Twitter, Facebook and other social media websites and analyse it. I used to do this with the XML package until a problem occurred while scraping data from this. Even after searching the internet, I was unable to find a solution. Hence, I raised my concern on Stack Overflow, where someone was generous enough to tell me about the RSelenium package. And trust me, I fell in love with this package. Kudos to its author.

But I have to admit, I faced a lot of trouble using this package even after following the steps mentioned here. The following steps are written in a simple but detailed manner to make it easy to set up the RSelenium package.

Android Dictionary

Learning a new language is hard because it requires new cognitive frameworks. Along the way, you will come across new terms and words whose meaning and usage you need to learn. With age and experience, your vocabulary will develop and serve as a fundamental tool in communication.

Developing Android apps is similar to learning a new language altogether. Although it primarily uses the Java language for coding, one also has to know the underlying Android framework, the structure on which it is based. Following is a list of some of the dictionaries available online which you may refer to whenever required during your learning or implementation process. The more you use them, the more you’ll reinforce your knowledge and your memory.

Reading multiple files.

By now, we are all familiar with reading a csv file into R. But what if there is a block of operations that we need to perform on multiple files? Including each csv by hand and rerunning the script every time would be quite tiring.

The best and easiest way is to automate the whole process, for which we need to write an R script.

Step 1:  We begin by listing all the files in the working directory. The pattern argument is a regular expression, so “\\.csv$” restricts the list to file names ending in “.csv”.

file_list <- list.files(pattern = "\\.csv$")

Step 2:  After listing, it’s time to find the number of csv files in the directory.

l <- length(file_list)

Step 3: Now, by running a loop, we can access the content of each csv file.

for (i in seq_len(l)) {
  ## read the i-th csv file listed in Step 1
  x <- read.csv(file_list[i])
}

Yeah! By now we can read the contents of all the files automatically by running the script.

Now, suppose the csv files have different numbers of columns and you want to work with one specific column in all of them, but that column sits at a different position in each file. This is a rather difficult situation to handle.

Say, for example, I have three files named “A.csv”, “B.csv” and “C.csv” and I want to work with the “Entropy” column of all of them, but it occurs as the 3rd column in “A.csv”, the 5th column in “B.csv” and the 9th column in “C.csv”. As there is no uniformity in the column number, the column cannot be accessed by a fixed index. This is a major drawback when automating the process. So, what I would do is:
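The loop above overwrites x on every pass, so only the last file is still in memory when it finishes. A minimal sketch of one way to keep every file, using only base R; the helper name read_all_csv and its dir argument are made up for illustration:

```r
# Hypothetical helper: read every csv file in `dir` into a named list
# of data frames, one element per file.
read_all_csv <- function(dir = ".") {
  files <- list.files(dir, pattern = "\\.csv$", full.names = TRUE)
  tables <- lapply(files, read.csv)
  # name each element after its file so it can be looked up later
  names(tables) <- basename(files)
  tables
}
```

Each data frame can then be picked out by file name, for example read_all_csv()[["A.csv"]].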

## checking if the name of the jth column is "Entropy"
## (x is the current data frame; y, ent and q are set up in the full script)
if (colnames(x)[j] == "Entropy") {

  ## saving the original column name for future use
  y[j] <- colnames(x)[j]

  ## changing the name of the jth column
  colnames(x)[j] <- 'test'

  ## accessing the column by its name
  ## (entropy() is assumed to come from the "entropy" package)
  ent[q] <- entropy(table(x$test))

  ## again assigning the original column name to the jth column
  colnames(x)[j] <- y[j]
}
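A column can also be selected by name directly, which avoids the rename-and-restore step. A minimal sketch in base R; the helper name get_column and the sample data frames are made up for illustration:

```r
# Hypothetical helper: return the named column of a data frame,
# or NULL when the data frame has no such column.
get_column <- function(x, name = "Entropy") {
  if (name %in% colnames(x)) x[[name]] else NULL
}

# Made-up data frames where "Entropy" sits at different positions.
a <- data.frame(id = 1:3, label = c("p", "q", "r"), Entropy = c(0.1, 0.5, 0.9))
b <- data.frame(Entropy = c(0.2, 0.4), id = 1:2)

get_column(a)                  # the 3rd column of a
get_column(b)                  # the 1st column of b
get_column(data.frame(z = 1))  # NULL, as there is no "Entropy" column
```

Because the lookup goes by name, the position of the column in each file no longer matters.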

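So, finally my R script looks like this:

## entropy() is assumed to come from the "entropy" package
library(entropy)

file_list <- list.files(pattern = "\\.csv$")
l <- length(file_list)

## results collected across all files
csv <- character()
ent <- numeric()
q <- 1

for (i in seq_len(l)) {

  x <- read.csv(file_list[i])
  y <- names(x)

  for (j in seq_len(ncol(x))) {

    if (colnames(x)[j] == "Entropy") {
      y[j] <- colnames(x)[j]
      csv <- c(csv, file_list[i])
      colnames(x)[j] <- 'test'
      ent[q] <- entropy(table(x$test))
      colnames(x)[j] <- y[j]
      ## move to the next result slot only after storing a value
      q <- q + 1
    }
  }
}

## one row per file that has an "Entropy" column
df <- data.frame(csv = csv, entropy = ent, stringsAsFactors = FALSE)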

Hope this helps and saves lots of time and effort. Happy Mining!

Output Text To Console

There are various ways to output text in RStudio.

Without New Line:

cat("#9....")
cat("times")

Output:

#9....times

With New Line:

print("Happy Birthday")
print("dearie")

Output:

[1] "Happy Birthday"
[1] "dearie"
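A few other base R functions also write text to the console. A small sketch, using only base R and the same sample strings:

```r
# cat() adds a newline only when "\n" is given explicitly.
cat("Happy Birthday\n")

# writeLines() prints each element on its own line, without quotes.
writeLines(c("Happy Birthday", "dearie"))

# message() writes to stderr and adds a newline at the end.
message("Happy Birthday")

# print() keeps the index prefix; quote = FALSE only drops the quotation marks.
print("dearie", quote = FALSE)
```

writeLines() is convenient when the text should appear exactly as written, while message() is usually preferred for status or diagnostic output.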

If you want to add any more methods to the list, please leave a comment below.

Adding Multiple Comments in R

We have all needed, at least once, to comment out previously written code.

To comment out a single line in RStudio, we can put “#” at the start of the line.

To comment out multiple lines in one go, select them and use the following shortcut in RStudio:

  • In Windows: CTRL + SHIFT + C

  • In OS X: COMMAND + SHIFT + C