Twitter Sentiment Analysis using OpenNLP JAVA API

Hi, everyone!

Hope everyone is having a great time.
In this post, we are going to perform Twitter sentiment analysis using Java as the programming language.

We are using the OpenNLP Maven dependency for this sentiment analysis.
Following is the pom.xml file that declares it.

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>com.mycompany</groupId>
    <artifactId>twitter-sentiments</artifactId>
    <version>1.0-SNAPSHOT</version>
    <packaging>jar</packaging>
    <properties>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
        <maven.compiler.source>1.7</maven.compiler.source>
        <maven.compiler.target>1.7</maven.compiler.target>
    </properties>
    <dependencies>
        <dependency>
            <groupId>org.apache.opennlp</groupId>
            <artifactId>opennlp-tools</artifactId>
            <version>1.5.3</version>
        </dependency>
        <dependency>
            <groupId>org.twitter4j</groupId>
            <artifactId>twitter4j-core</artifactId>
            <version>4.0.1</version>
        </dependency>
        <dependency>
            <groupId>org.twitter4j</groupId>
            <artifactId>twitter4j-stream</artifactId>
            <version>4.0.1</version>
        </dependency>
    </dependencies>
</project>

As you can see from the above pom.xml file, we are using three dependencies here.
Two of these are for Twitter, namely twitter4j-core and twitter4j-stream.
The remaining dependency is opennlp-tools, which is responsible for classifying the sentiment of a tweet.
Once it is trained on labeled examples, OpenNLP will tell you whether a tweet is positive or negative.

For building this model, we are going to use the following input file.

I got this input file from this link.

This file contains two columns separated by a tab character.
The first column contains either 0 or 1, where 0 indicates negative and 1 indicates positive.
The second column contains the actual tweet.
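To make the file format concrete, here is a minimal sketch of how one such line could be split into its label and its tweet text. The class LabeledTweet and its parse() method are hypothetical names made up for this illustration; they are not part of OpenNLP, which reads the file on its own in the code later in this post.

```java
/** Hypothetical helper: parses one "label<TAB>tweet" line of the training file. */
final class LabeledTweet {
    final boolean positive;
    final String text;

    LabeledTweet(boolean positive, String text) {
        this.positive = positive;
        this.text = text;
    }

    /** Splits a line at the first tab into the 0/1 label and the tweet text. */
    static LabeledTweet parse(String line) {
        int tab = line.indexOf('\t');
        if (tab < 0) {
            throw new IllegalArgumentException("Expected label<TAB>tweet: " + line);
        }
        String label = line.substring(0, tab).trim();
        if (!label.equals("0") && !label.equals("1")) {
            throw new IllegalArgumentException("Label must be 0 or 1: " + label);
        }
        return new LabeledTweet(label.equals("1"), line.substring(tab + 1).trim());
    }

    public static void main(String[] args) {
        LabeledTweet tweet = parse("1\tWatching a nice movie");
        // prints "positive: Watching a nice movie"
        System.out.println((tweet.positive ? "positive" : "negative") + ": " + tweet.text);
    }
}
```

Rejecting lines without a tab or with an unexpected label is a design choice of this sketch; it makes a malformed training file fail early instead of silently producing wrong labels.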

Following is the input text file.

1 Watching a nice movie
0 The painting is ugly, will return it tomorrow…
1 One of the best soccer games, worth seeing it
1 Very tasty, not only for vegetarians
1 Super party!
0 Too early to travel..need a coffee
0 Damn..the train is late again…
0 Bad news, my flight just got cancelled.
1 Happy birthday mr. president
1 Just watch it. Respect.
1 Wonderful sunset.
1 Bravo, first title in 2014!
0 Had a bad evening, need urgently a beer.
0 I put on weight again
1 On today's show we met Angela, a woman with an amazing story
1 I fell in love again
0 I lost my keys
1 On a trip to Iceland
1 Happy in Berlin
0 I hate Mondays
1 Love the new book I reveived for Christmas
0 He killed our good mood
1 I am in good spirits again
1 This guy creates the most awesome pics ever
0 The dark side of a selfie.
1 Cool! John is back!
1 Many rooms and many hopes for new residents
0 False hopes for the people attending the meeting
1 I set my new year's resolution
0 The ugliest car ever!
0 Feeling bored
0 Need urgently a pause
1 Nice to see Ana made it
1 My dream came true
0 I didn't see that one coming
0 Sorry mate, there is no more room for you
0 Who could have possibly done this?
1 I won the challenge
0 I feel bad for what I did
1 I had a great time tonight
1 It was a lot of fun
1 Thank you Molly making this possible
0 I just did a big mistake
1 I love it!!
0 I never loved so hard in my life
0 I hate you Mike!!
0 I hate to say goodbye
1 Lovely!
1 Like and share if you feel the same
0 Never try this at home
0 Don't spoil it!
1 I love rock and roll
0 The more I hear you, the more annoyed I get
1 Finnaly passed my exam!
1 Lovely kittens
0 I just lost my appetite
0 Sad end for this movie
0 Lonely, I am so lonely
1 Beautiful morning
1 She is amazing
1 Enjoying some time with my friends
1 Special thanks to Marty
1 Thanks God I left on time
1 Greateful for a wonderful meal
1 So happy to be home
0 Hate to wait on a long queue
0 No cab available
0 Electricity outage, this is a nightmare
0 Nobody to ask about directions
1 Great game!
1 Nice trip
1 I just received a pretty flower
1 Excellent idea
1 Got a new watch. Feeling happy
0 I feel sick
0 I am very tired
1 Such a good taste
0 Such a bad taste
1 Enjoying brunch
0 I don't recommend this restaurant
1 Thank you mom for supporting me
0 I will never ever call you again
0 I just got kicked out of the contest
1 Smiling
0 Big pain to see my team loosing
0 Bitter defeat tonight
0 My bike was stollen
1 Great to see you!
0 I lost every hope for seeing him again
1 Nice dress!
1 Stop wasting my time
1 I have a great idea
1 Excited to go to the pub
1 Feeling proud
1 Cute bunnies
0 Cold winter ahead
0 Hopless struggle..
0 Ugly hat
1 Big hug and lots of love
1 I hope you have a wonderful celebration

Using the above input file (the code reads it from a file named tweets.txt), the following code decides whether each tweet that we have fetched is positive or negative.

package com.mycompany.twitter.sentiments;

import java.io.BufferedWriter;
import java.io.FileInputStream;
import java.io.FileWriter;
import java.io.IOException;
import java.io.InputStream;
import opennlp.tools.doccat.DoccatModel;
import opennlp.tools.doccat.DocumentCategorizerME;
import opennlp.tools.doccat.DocumentSample;
import opennlp.tools.doccat.DocumentSampleStream;
import opennlp.tools.util.ObjectStream;
import opennlp.tools.util.PlainTextByLineStream;
import twitter4j.Query;
import twitter4j.QueryResult;
import twitter4j.Status;
import twitter4j.Twitter;
import twitter4j.TwitterException;
import twitter4j.TwitterFactory;
import twitter4j.conf.ConfigurationBuilder;

/**
 * Trains an OpenNLP document categorizer on labeled tweets and counts how
 * many fetched tweets are classified as positive or negative.
 *
 * @author milind
 */
public class SentimentAnalysisWithCount {

    DoccatModel model;
    static int positive = 0;
    static int negative = 0;

    public static void main(String[] args) throws IOException, TwitterException {
        SentimentAnalysisWithCount twitterCategorizer = new SentimentAnalysisWithCount();
        twitterCategorizer.trainModel();

        // Replace the placeholders with your own Twitter API credentials.
        // Never publish real keys in source code.
        ConfigurationBuilder cb = new ConfigurationBuilder();
        cb.setDebugEnabled(true)
                .setOAuthConsumerKey("YOUR_CONSUMER_KEY")
                .setOAuthConsumerSecret("YOUR_CONSUMER_SECRET")
                .setOAuthAccessToken("YOUR_ACCESS_TOKEN")
                .setOAuthAccessTokenSecret("YOUR_ACCESS_TOKEN_SECRET");
        TwitterFactory tf = new TwitterFactory(cb.build());
        Twitter twitter = tf.getInstance();

        // Fetch the tweets matching the keyword and classify each one.
        Query query = new Query("udta punjab");
        QueryResult result = twitter.search(query);
        for (Status status : result.getTweets()) {
            int sentiment = twitterCategorizer.classifyNewTweet(status.getText());
            if (sentiment == 1) {
                positive++;
            } else {
                negative++;
            }
        }

        // Write the counts to results.csv; try-with-resources closes the
        // writer even if a write fails.
        try (BufferedWriter bw = new BufferedWriter(
                new FileWriter("C:\\Users\\User\\Desktop\\results.csv"))) {
            bw.write("Positive Tweets," + positive);
            bw.newLine();
            bw.write("Negative Tweets," + negative);
        }
    }

    /** Trains the model from the labeled tweets in tweets.txt. */
    public void trainModel() {
        InputStream dataIn = null;
        try {
            dataIn = new FileInputStream("C:\\Users\\User\\Downloads\\tweets.txt");
            ObjectStream<String> lineStream = new PlainTextByLineStream(dataIn, "UTF-8");
            ObjectStream<DocumentSample> sampleStream = new DocumentSampleStream(lineStream);
            // Specifies the minimum number of times a feature must be seen
            int cutoff = 2;
            int trainingIterations = 30;
            model = DocumentCategorizerME.train("en", sampleStream, cutoff,
                    trainingIterations);
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            if (dataIn != null) {
                try {
                    dataIn.close();
                } catch (IOException e) {
                    e.printStackTrace();
                }
            }
        }
    }

    /** Returns 1 if the trained model labels the tweet positive, otherwise 0. */
    public int classifyNewTweet(String tweet) throws IOException {
        DocumentCategorizerME myCategorizer = new DocumentCategorizerME(model);
        double[] outcomes = myCategorizer.categorize(tweet);
        String category = myCategorizer.getBestCategory(outcomes);
        System.out.print("------------------------------\nTWEET :" + tweet + " ===> ");
        if (category.equalsIgnoreCase("1")) {
            System.out.println(" POSITIVE ");
            return 1;
        } else {
            System.out.println(" NEGATIVE ");
            return 0;
        }
    }
}
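The comment on cutoff in trainModel() says a feature must be seen a minimum number of times. As a rough illustration of that idea only, the sketch below counts how often each word occurs across a few training tweets and keeps the words seen at least cutoff times. This is a simplification and an assumption on my part, not OpenNLP's actual implementation; CutoffSketch and frequentWords() are hypothetical names, and lower-casing the words is a choice made just for this example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical illustration of a feature cutoff; not how OpenNLP is implemented. */
final class CutoffSketch {

    /** Returns the lower-cased words that occur at least cutoff times in the tweets. */
    static Set<String> frequentWords(List<String> tweets, int cutoff) {
        Map<String, Integer> counts = new TreeMap<String, Integer>();
        for (String tweet : tweets) {
            for (String word : tweet.toLowerCase().split("\\s+")) {
                if (word.isEmpty()) {
                    continue;
                }
                Integer seen = counts.get(word);
                counts.put(word, seen == null ? 1 : seen + 1);
            }
        }
        // Keep only the words that reach the cutoff.
        Set<String> kept = new TreeSet<String>();
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            if (entry.getValue() >= cutoff) {
                kept.add(entry.getKey());
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> tweets = Arrays.asList(
                "I hate Mondays", "I hate to say goodbye", "I love it!!");
        // With cutoff 2, only "hate" and "i" occur often enough: prints [hate, i]
        System.out.println(frequentWords(tweets, 2));
    }
}
```

With a training file as small as the one in this post, a cutoff of 2 already discards every word that appears only once, so a larger training file should give the model more features to work with.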

Following are the key points regarding the SentimentAnalysisWithCount class above:

  • setOAuthConsumerKey() takes the Consumer (API) Key
  • setOAuthConsumerSecret() takes the Consumer (API) Secret
  • setOAuthAccessToken() takes the Access Token
  • setOAuthAccessTokenSecret() takes the Access Token Secret
  • new Query("udta punjab") sets the keyword which is used for filtering the tweets, i.e. Udta Punjab
  • trainModel() creates the model using the tweets.txt file
  • classifyNewTweet() decides whether a tweet is positive or negative by using the model already created by the trainModel() method
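Inside classifyNewTweet(), categorize() returns one probability per category and getBestCategory() picks the category with the highest one. The sketch below shows that selection step in plain Java, without OpenNLP. BestCategorySketch and bestCategory() are hypothetical names, and the probabilities are made up for illustration.

```java
/** Hypothetical illustration of picking the most probable category. */
final class BestCategorySketch {

    /** Returns the category with the highest probability; ties go to the earlier one. */
    static String bestCategory(String[] categories, double[] outcomes) {
        if (categories.length == 0 || categories.length != outcomes.length) {
            throw new IllegalArgumentException("Need one probability per category");
        }
        int best = 0;
        for (int i = 1; i < outcomes.length; i++) {
            if (outcomes[i] > outcomes[best]) {
                best = i;
            }
        }
        return categories[best];
    }

    public static void main(String[] args) {
        String[] categories = {"0", "1"};   // the labels from the input file
        double[] outcomes = {0.35, 0.65};   // made-up probabilities
        String category = bestCategory(categories, outcomes);
        // prints POSITIVE, because category "1" has the higher probability
        System.out.println(category.equals("1") ? "POSITIVE" : "NEGATIVE");
    }
}
```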

Once you execute this code, the model is created, tweets are fetched using the Twitter API, and each tweet is classified as either POSITIVE or NEGATIVE.
The final counts are stored in an output file named results.csv.
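To show what ends up in results.csv, here is a small sketch of the counting step on its own. ResultsCsvSketch and toCsv() are hypothetical names, and the labels in main() are invented rather than real classification results. The sketch uses "\n" as the line break, whereas the code above uses BufferedWriter.newLine(), which writes the platform's line separator.

```java
/** Hypothetical illustration of the two-row layout of results.csv. */
final class ResultsCsvSketch {

    /** Counts the 1 (positive) and other (negative) labels and formats them as CSV. */
    static String toCsv(int[] labels) {
        int positive = 0;
        int negative = 0;
        for (int label : labels) {
            if (label == 1) {
                positive++;
            } else {
                negative++;
            }
        }
        return "Positive Tweets," + positive + "\n" + "Negative Tweets," + negative;
    }

    public static void main(String[] args) {
        int[] labels = {1, 0, 1, 1}; // invented labels for four tweets
        // prints:
        // Positive Tweets,3
        // Negative Tweets,1
        System.out.println(toCsv(labels));
    }
}
```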

[Image: Twitter Sentiment Analysis Output Part 1]
[Image: Twitter Sentiment Analysis Output Part 2]
[Image: Twitter Sentiment Analysis Output Part 3]

The final output looks something like this.

[Image: results file]

If you want to graphically represent the counts of positive and negative tweets, you can use Microsoft Excel to do that.
Open the results.csv file in Excel, select the output, go to Insert, and click on Recommended Charts.
You will get a list of chart types, from which you can choose the one that suits you.
I used a bar chart, and the output looks something like this.

[Image: Sentiment Analysis Output]

That is it people.
Thanks for having a read.
Kindly let me know if you have any concerns.
Have a great time. Cheers.

Published by milindjagre

I founded my blog www.milindjagre.co four years ago and am currently working as a Data Scientist Analyst at the Ford Motor Company. I graduated from the University of Connecticut pursuing Master of Science in Business Analytics and Project Management. I am working hard and learning a lot of new things in the field of Data Science. I am a strong believer of constant and directional efforts keeping the teamwork at the highest priority. Please reach out to me at milindjagre@gmail.com for further information. Cheers!
