Basic Twitter Sentiment Analysis Using Colab and Tweepy

Colab is a hosted environment similar to Jupyter Notebook that makes data science work easy. It is a Google product and is free to use if you want to learn new things.


It comes with Python 3 and the major libraries preinstalled, so we can get straight to writing our program logic.

To get started, just open https://colab.research.google.com

Storage space is taken from your Google Drive, so signing in with a Google account is required.

Let's begin the sentiment analysis.

Here we cover only the basics.

Sentiment analysis can be done by various methods, but broadly they fall into two categories: lexicon verification and deep learning (neural networks).

We choose lexicon verification, which is the easier option because a package already ships a predefined set of words classified as positive, neutral, or negative.
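To make the lexicon idea concrete, here is a minimal sketch in plain Python. The word list, the scores, and the function name `toy_sentiment` are made up for illustration. A real lexicon such as VADER is much larger and also handles intensity, negation, and punctuation, so treat this only as a picture of the basic idea.

```python
# Toy lexicon: word -> score. These words and scores are invented for illustration.
TOY_LEXICON = {"good": 1.0, "great": 2.0, "bad": -1.0, "terrible": -2.0}

def toy_sentiment(text):
    """Sum the scores of known words and map the total to a label."""
    words = text.lower().split()
    # Strip simple trailing/leading punctuation before looking each word up
    total = sum(TOY_LEXICON.get(word.strip(".,!?"), 0.0) for word in words)
    if total > 0:
        return "positive"
    if total < 0:
        return "negative"
    return "neutral"

print(toy_sentiment("Colab is great!"))      # positive
print(toy_sentiment("The network is bad."))  # negative
print(toy_sentiment("I opened a notebook"))  # neutral
```

Words that are not in the lexicon contribute nothing, which is why the last sentence comes out neutral.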

We need to install the pandas and vaderSentiment packages to start the analysis. Pandas helps with splitting and cleaning the data.
To pull data from Twitter, we also use Tweepy, a Python package for the Twitter API.


Step 1:

!pip install pandas
!pip install Tweepy
!pip install vaderSentiment



Now import the installed packages:

import tweepy
import pandas as pd
from nltk.sentiment.vader import SentimentIntensityAnalyzer

The original screenshot also imported numpy and matplotlib, but you don't need them for now.

To get Twitter API keys, go to https://apps.twitter.com


You need to create an app; after that you will get the following:

consumer_key = ''
consumer_secret = ''
access_token = ''
access_token_secret = ''

Fill in your keys to get access to Twitter data.
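If you would rather not paste the keys directly into a shared notebook, one option is to read them from environment variables. This is only a sketch: the variable names (`TWITTER_CONSUMER_KEY` and so on) and the helper `load_twitter_keys` are hypothetical, not something Twitter or Tweepy defines.

```python
import os

def load_twitter_keys(env=os.environ):
    """Read the four credentials from environment variables (hypothetical names)."""
    names = [
        "TWITTER_CONSUMER_KEY",
        "TWITTER_CONSUMER_SECRET",
        "TWITTER_ACCESS_TOKEN",
        "TWITTER_ACCESS_TOKEN_SECRET",
    ]
    # Fail early with a clear message if any credential is missing or empty
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise KeyError("Missing credentials: " + ", ".join(missing))
    return tuple(env[name] for name in names)
```

Assuming those variables are set, you could then write `consumer_key, consumer_secret, access_token, access_token_secret = load_twitter_keys()` and use the four names exactly as in the rest of this post.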

The following code authenticates with Twitter using the keys provided and runs a search query for a user or hashtag. Pandas then loads the returned tweets into a data frame.

auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)

api = tweepy.API(auth)

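# Change 'techiebouncer' to the user or hashtag you want, and adjust count
# to control how many tweets are requested on the topic.
# Note: api.search is the Tweepy 3.x name; in Tweepy 4.x it is api.search_tweets.
tweets = api.search('techiebouncer', count=1000)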


data = pd.DataFrame(data=[tweet.text for tweet in tweets], columns=['Tweets'])

display(data.head(10))


print(tweets[0].created_at)
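Cleaning was mentioned earlier, but the code above does not show it. The function below is one possible light cleaning step, not part of the original walkthrough. The name `clean_tweet` and the choice of what to strip (retweet prefix, links, @mentions) are assumptions. Punctuation and capitalization are deliberately left alone, since VADER takes them into account when scoring.

```python
import re

def clean_tweet(text):
    """Lightly clean a tweet: drop the retweet prefix, links, and @mentions."""
    text = re.sub(r"^RT\s+@\w+:\s*", "", text)  # leading "RT @user: "
    text = re.sub(r"https?://\S+", "", text)     # links
    text = re.sub(r"@\w+", "", text)             # remaining @mentions
    return re.sub(r"\s+", " ", text).strip()     # collapse extra whitespace

print(clean_tweet("RT @someone: Loving Colab! https://example.com @friend"))
# Loving Colab!
```

With the data frame built above, it could be applied as `data['Tweets'] = data['Tweets'].apply(clean_tweet)` before scoring.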

Now our data is ready for analysis.

Since this kind of sentiment analysis depends on checking individual words, we use the Natural Language Toolkit (NLTK), which includes the VADER lexicon and analyzer.


Use the following commands to get your result:

import nltk

nltk.download('vader_lexicon')  # downloads the VADER lexicon

sid = SentimentIntensityAnalyzer()  # analyzer class imported from nltk.sentiment.vader


listy = []  # empty list that stores the scores produced by the loop

for index, row in data.iterrows():
  ss = sid.polarity_scores(row["Tweets"])
  listy.append(ss)
  
se = pd.Series(listy)
data['polarity'] = se.values


display(data.head(1000))

After running the above, the data frame shows the polarity scores for each tweet.
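Each entry in the polarity column is a dictionary with the keys `neg`, `neu`, `pos`, and `compound`. If you want a single label per tweet, a common convention is to threshold the compound score at plus or minus 0.05. The sketch below assumes that convention; the helper name `label_from_scores` is made up, and the example dictionary is hand-written rather than real output.

```python
def label_from_scores(scores, threshold=0.05):
    """Map a VADER score dictionary to 'positive', 'negative' or 'neutral'."""
    compound = scores["compound"]
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"

# Hand-written dictionary in the same shape that polarity_scores returns
example = {"neg": 0.0, "neu": 0.4, "pos": 0.6, "compound": 0.7}
print(label_from_scores(example))  # positive
```

On the data frame from this post, it could be used as `data['label'] = data['polarity'].apply(label_from_scores)`.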


I hope you try out more with Colab and Python. If you are interested, leave a comment about how you use it.
