Analysis of Women Safety in Indian Cities Using Machine Learning On Tweets
ABSTRACT
Women and girls experience a great deal of violence and harassment in public
places in various cities, ranging from stalking to harassment and assault. This
research paper focuses on the role of social media in promoting the safety of
women in Indian cities, with special reference to social media websites and
applications including Twitter, Facebook and Instagram. The paper also examines
how a sense of responsibility can be developed among ordinary Indian people so
that they attend to the safety of the women around them. Tweets on Twitter,
which usually contain text, images, written messages and quotes focused on the
safety of women in Indian cities, can be used to spread a message among Indian
youth and to educate people to take strict action against, and punish, those
who harass women. Twitter handles and hashtag campaigns that spread across the
globe serve as a platform for women to express how they feel when they go out
for work or travel on public transport, what their state of mind is when they
are surrounded by unknown men, and whether or not they feel safe.
INTRODUCTION
Twitter has emerged as a major micro-blogging website, having over 100 million users
generating over 500 million tweets every day. With such a large audience, Twitter has consistently
attracted users to convey their opinions and perspective about any issue, brand, company or any
other topic of interest. Due to this reason, Twitter is used as an informative source by many
organizations, institutions and companies.
On Twitter, users are allowed to share their opinions in the form of tweets, using only 140
characters. This leads people to compact their statements by using slang, abbreviations,
emoticons, short forms, etc. Along with this, people convey their opinions using sarcasm and
polysemy.
A lot of research has been done on Twitter data in order to classify tweets and analyze the
results. In this paper we aim to review some of the research in this domain and study how to
perform sentiment analysis on Twitter data using Python. The scope of this paper is limited to
machine learning models, and we compare the efficiencies of these models with one another.
There are certain types of harassment and violence that are very aggressive, including staring
and passing comments, and these unacceptable practices are usually seen as a normal part of
urban life. Several studies have been conducted in cities across India, and women report similar
types of sexual harassment and comments passed by unknown people. A study conducted across
the most popular metropolitan cities of India, including Delhi, Mumbai and Pune, showed that
60% of women feel unsafe while going out to work or while travelling on public transport.
Women have the right to the city, which means that they can go freely wherever they want,
whether to an educational institute or any other place they want to go. But women feel that they
are unsafe in places like malls and shopping centers and on the way to their job locations
because of the several unknown eyes body-shaming and harassing them. The lack of safety, and
of concrete consequences, in the lives of women is the main reason for the harassment of girls.
There are instances when girls were harassed by their neighbors while they were on the way to
school, or when a lack of safety created a sense of fear in the minds of small girls who suffer
throughout their lifetime due to that one instance in their lives where they were forced to do
something unacceptable or were sexually harassed by one of their own neighbors or some other
unknown person.
The safest-cities approach treats women's safety from the perspective of women's right to
inhabit the city without fear of violence or sexual harassment. Rather than imposing restrictions
on women, as society usually does, it is the duty of society to emphasize the need for the
protection of women and to recognize that women and girls have the same right as men to be
safe in the city.
Analysis of the collected Twitter text also includes the names of people and of women who
stand up against sexual harassment and the unethical behavior of men in Indian cities that makes
them uncomfortable to walk freely. The dataset obtained from Twitter about the status of
women's safety in Indian society was then processed with machine learning algorithms: the data
was smoothed by removing zero values, Laplace smoothing and Porter's stemming algorithm were
used to build the analysis method, and retweets and redundant data were removed from the
dataset so that a clear and original view of the safety status of women in Indian society could
be obtained.
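The cleaning step described above can be sketched in Python. This is a minimal illustration, not the actual pipeline used in the paper: the sample tweets are invented, and the crude suffix stripper merely stands in for the full Porter stemming algorithm mentioned in the text.

```python
# Sketch of the tweet-cleaning step: drop retweets and exact duplicates,
# then stem each word with a crude suffix stripper (a stand-in for the
# full Porter stemmer mentioned above).

def simple_stem(word):
    """Very rough stemmer: strips a few common English suffixes."""
    for suffix in ("ing", "edly", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

def preprocess(tweets):
    seen = set()
    cleaned = []
    for tweet in tweets:
        text = tweet.strip()
        if text.startswith("RT @"):        # skip retweets
            continue
        key = text.lower()
        if key in seen:                    # skip redundant duplicates
            continue
        seen.add(key)
        cleaned.append(" ".join(simple_stem(w) for w in text.lower().split()))
    return cleaned

tweets = [
    "Walking alone at night feels unsafe",
    "RT @user: Walking alone at night feels unsafe",   # retweet, dropped
    "Walking alone at night feels unsafe",             # duplicate, dropped
    "Streets near the station are well lit",
]
print(preprocess(tweets))
```

Running this keeps only the two distinct original tweets, stemmed (for example, "walking" becomes "walk" and "feels" becomes "feel").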
People often express their views freely on social media about how they feel about Indian
society and about the politicians who claim that Indian cities are safe for women. On social
media websites people can freely express their viewpoint, and women can share experiences in
which they faced sexual harassment or in which they fought back against the sexual harassment
imposed on them.
Tweets about the safety of women and stories of standing up against sexual harassment further
motivate other women on the same social media platforms. Other women share these messages
and tweets, which in turn motivates still more women to stand up and raise their voice against
the people who have made Indian cities an unsafe place for women. In recent years a large
number of people have been attracted to social media platforms like Facebook, Twitter and
Instagram, and most of them use these platforms to express their emotions and opinions about
Indian cities and Indian society. Sentiment analysis methods can be categorized as machine
learning, hybrid and lexicon-based approaches.
There is also another categorization, with categories of statistical, knowledge-based and
age-wise differentiation approaches. It is common practice to extract information from the data
available on social networks through procedures of data extraction, data analysis and data
interpretation. The accuracy of Twitter analysis and prediction can be improved through
behavioral analysis based on social networks.
Sentiments are subjective to the topic of interest. We must formulate what kind of features will
decide the sentiment a tweet embodies.
In the programming model, the sentiment we refer to is the class of entities that the person
performing sentiment analysis wants to find in the tweets. The number of sentiment classes is a
crucial factor in deciding the efficiency of the model.
For example, we can have two-class tweet sentiment classification (positive and negative) or
three-class tweet sentiment classification (positive, negative and neutral).
Sentiment analysis approaches can be broadly categorized in two classes – lexicon based and
machine learning based.
The lexicon-based approach is unsupervised, as it performs analysis using lexicons and a
scoring method to evaluate opinions. The machine learning approach, in contrast, involves
feature extraction and training a model on a labelled dataset.
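A minimal sketch of the lexicon-based approach described above, assuming a tiny hand-made word-score dictionary; a real system would load a full sentiment lexicon rather than these few illustrative entries. The neutral class also illustrates the three-class scheme mentioned earlier.

```python
# Minimal lexicon-based sentiment scorer. The word scores below are
# illustrative assumptions; a real system would load a full lexicon.
LEXICON = {
    "safe": 1, "secure": 1, "happy": 1,
    "unsafe": -1, "harassment": -1, "fear": -1, "afraid": -1,
}

def lexicon_sentiment(text):
    # Sum the scores of known words; unknown words contribute 0.
    score = sum(LEXICON.get(word, 0) for word in text.lower().split())
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"   # three-class output (positive/negative/neutral)

print(lexicon_sentiment("I feel safe and happy here"))        # prints "positive"
print(lexicon_sentiment("harassment makes the city unsafe"))  # prints "negative"
```

Because no training data is involved, this is entirely driven by the dictionary, which is exactly the trade-off the text describes.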
The basic steps for performing sentiment analysis include data collection, pre-processing of
the data, feature extraction, selecting baseline features, sentiment detection and classification,
either using simple computation or machine learning approaches.
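The basic steps above can be outlined as a simple pipeline. Every function here is a deliberately simplified stand-in for the real component, and the sample tweets are invented:

```python
# Skeleton of the sentiment-analysis pipeline described above.
# Each stage is a toy stand-in, not the actual implementation.

def collect():                         # 1. data collection (stubbed)
    return ["The metro feels safe", "Streets are unsafe at night"]

def clean(texts):                      # 2. pre-processing
    return [t.lower() for t in texts]

def features(text):                    # 3. feature extraction: bag of words
    return set(text.split())

def classify(feats):                   # 4-6. sentiment detection/classification
    return "negative" if "unsafe" in feats else "positive"

labels = [classify(features(t)) for t in clean(collect())]
print(labels)                          # prints ['positive', 'negative']
```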
2) Text Cleaning: After the tweets for a topic are extracted, and before they are passed to the
classifier, we need to clean the dataset by removing emojis and stop words, so that non-textual
content not pertinent to the analysis is identified and removed.
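A possible text-cleaning step along these lines, sketched in Python; the stop-word list and regular expressions are illustrative assumptions, not the exact rules used in this work.

```python
import re

# Illustrative cleaning step: strip URLs, @mentions, emoji/non-ASCII
# characters and a small stop-word list before classification.
STOP_WORDS = {"a", "an", "the", "is", "are", "to", "of", "and", "in"}

def clean_tweet(text):
    text = re.sub(r"http\S+", "", text)              # remove URLs
    text = re.sub(r"@\w+", "", text)                 # remove @mentions
    text = text.encode("ascii", "ignore").decode()   # drop emoji/non-ASCII
    words = re.findall(r"[a-z']+", text.lower())
    return " ".join(w for w in words if w not in STOP_WORDS)

print(clean_tweet("@user The streets are unsafe at night \U0001F61F http://t.co/x"))
# prints "streets unsafe at night"
```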
3) Sentiment Analysis: Once the data is cleaned it is ready for classification into positive,
negative and neutral tweets. There are various approaches to sentiment analysis, such as
machine learning, lexicon-based and hybrid approaches, along with others like Natural
Language Processing and Neuro-Linguistic Programming. Machine learning involves a training
dataset and a testing dataset: the training data is used to train the classifier using one of many
algorithms such as Bayesian Networks, Naïve Bayes classification, Maximum Entropy or
Support Vector Machines, and the testing dataset is used to test the classifier's accuracy on
tweets. The lexicon approach does not use any training dataset; it uses a built-in dictionary in
which words are associated with human sentiments. The hybrid approach combines machine
learning and the lexicon approach to improve the performance of the sentiment classifier.
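A toy version of the Naïve Bayes classifier named above, using the Laplace (add-one) smoothing mentioned earlier in the paper. The four training tweets are invented examples, and a real classifier would be trained on a much larger labelled dataset:

```python
import math
from collections import Counter, defaultdict

# Toy Naive Bayes classifier with Laplace (add-one) smoothing.
# The training tweets below are invented illustrative examples.
train = [
    ("felt safe walking home tonight", "positive"),
    ("the new street lights make me feel secure", "positive"),
    ("harassment on the bus again", "negative"),
    ("this area is unsafe after dark", "negative"),
]

word_counts = defaultdict(Counter)   # per-class word frequencies
class_counts = Counter()             # class priors
vocab = set()
for text, label in train:
    class_counts[label] += 1
    for w in text.split():
        word_counts[label][w] += 1
        vocab.add(w)

def predict(text):
    scores = {}
    for label in class_counts:
        total = sum(word_counts[label].values())
        logp = math.log(class_counts[label] / len(train))
        for w in text.split():
            # Laplace smoothing: add 1 to every count so an unseen
            # word never zeroes out the class probability.
            logp += math.log((word_counts[label][w] + 1) / (total + len(vocab)))
        scores[label] = logp
    return max(scores, key=scores.get)

print(predict("I feel safe here"))              # prints "positive"
print(predict("unsafe and scared on the bus"))  # prints "negative"
```

In a full system the held-out testing dataset would then be scored with `predict` to estimate the classifier's accuracy, as the text describes.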
4) Aspect-Level Classification: Aspect-level classification judges various aspects of an entity,
giving different opinions about different aspects; it does not focus on the language construction
but on the opinion itself. The classification centres on breaking an opinion into the sentiment of
the opinion and the target of the opinion.
In paper [1], the authors present a classifier to predict the contextual polarity of subjective
phrases in a sentence. Their approach features lexical scoring derived from the Dictionary of
Affect in Language (DAL) and extended through WordNet, allowing them to automatically score
the vast majority of words in the input and avoiding the need for manual labeling. They augment
lexical scoring with n-gram analysis to capture the effect of context, combine DAL scores with
syntactic constituents, and then extract n-grams of constituents from all sentences. They also use
the polarity of all syntactic constituents within the sentence as features. Their results show
significant improvement over a majority-class baseline as well as over a more difficult baseline
consisting of lexical n-grams.
In paper [2], the authors propose an approach to automatically detect the sentiment of Twitter
messages (tweets) that explores some characteristics of how tweets are written and
meta-information about the words that compose these messages. Moreover, they leverage
sources of noisy labels as their training data; these noisy labels were provided by a few
sentiment-detection websites over Twitter data. In their experiments they show that, since their
features capture a more abstract representation of tweets, their solution is more effective than
previous ones and also more robust to biased and noisy data, which is the kind of data provided
by these sources.
In paper [3], the authors state that microblogs, as a new textual domain, offer a unique
proposition for sentiment analysis. Their short document length suggests any sentiment they
contain is compact and explicit. However, this short length, coupled with their noisy nature, can
pose difficulties for standard machine learning document representations. The authors examine
the hypothesis that it is easier to classify the sentiment in these short-form documents than in
longer-form documents. Surprisingly, they find classifying sentiment in microblogs easier than
in blogs, and they make a number of observations pertaining to the challenge of supervised
learning for sentiment analysis in microblogs.
In paper [4], the authors demonstrate that it is possible to perform automatic sentiment
classification in the very noisy domain of customer feedback data. They show that by using
large feature vectors in combination with feature reduction, they can train linear support vector
machines that achieve high classification accuracy on data that present classification challenges
even for a human annotator. They also show that, surprisingly, the addition of deep linguistic
analysis features to a set of surface-level word n-gram features contributes consistently to
classification accuracy in this domain.
DISADVANTAGES:
1. Most people use platforms such as Twitter and Instagram to express their
emotions and opinions about Indian cities and Indian society.
2. There are several sentiment analysis methods, which can be categorized as
machine learning, hybrid and lexicon-based learning.
3. There is also another categorization, with categories of statistical,
knowledge-based and age-wise differentiation approaches.
PROPOSED SYSTEM:
Women have the right to the city, which means that they can go freely wherever
they want, whether to an educational institute or any other place they want to
go. But women feel that they are unsafe in places like malls and shopping
centers and on the way to their job locations because of the several unknown
eyes body-shaming and harassing them. The lack of safety, and of concrete
consequences, in the lives of women is the main reason for the harassment of
girls. There are instances when girls were harassed by their neighbors while
they were on the way to school, or when a lack of safety created a sense of
fear in the minds of small girls who suffer throughout their lifetime due to
that one instance in their lives where they were forced to do something
unacceptable or were sexually harassed by one of their own neighbors or some
other unknown person.
The safest-cities approach treats women's safety from the perspective of
women's right to inhabit the city without fear of violence or harassment.
Rather than imposing restrictions on women, as society usually does, it is the
duty of society to emphasize the need for the protection of women and to
recognize that women and girls have the same right as men to be safe in the
city.
ADVANTAGES:
1. Analysis of the Twitter text collection also includes the names of people
and of women who stand up against harassment and the unethical behavior
of men in Indian cities that makes them uncomfortable to walk freely.
2. The dataset obtained through Twitter about the status of women's safety in
Indian society is processed to give a clear and original view of that status.
Non-Functional Requirements:
Maintainability
Maintainability is the ease with which a product can be maintained in order to correct defects,
accommodate new requirements, and cope with a changed environment.
Usability
The primary notion of usability is that an object designed with the general users' psychology
and physiology in mind is, for example:
• More efficient to use — takes less time to accomplish a particular task
• Easier to learn — operation can be learned by observing the object
• More satisfying to use
Reliability
Consistent uptime
The new system will be able to stay up and running at least 98% of the time. Any downtime
would be due to maintenance or upgrades. This downtime also includes any potential
failures/crashes.
Familiar Interface
The new system will have an interface that shares some of the feel of the old system so that users
who are familiar with the old system will not have trouble adjusting to the new system.
Real-time Feedback
The new system should display the user's data and show changes made to it in real time as the
user performs add and remove operations.
Focused Layout
The new system will reduce the potential for confusion by having a focused layout. This means
that it will display information that is relevant to the current task and conversely, leave out
irrelevant information.
Web Accessibility
The new system will be compatible with screen readers to assist the visually impaired. This
means that screen readers should interpret the displayed text into speech and should not output
anything that does not correspond to displayed text. It is also important that the colours are
designed so that colour-blind people can still distinguish changes in content.
Effective Recovery
The system must effectively recover from a crash within ten minutes. Effective recovery means
that the data is still in a consistent state accurate to 1 minute before the system crashes when the
system returns.
SYSTEM REQUIREMENTS:
HARDWARE REQUIREMENTS:
Feasibility Study
What is Feasibility Study?
A feasibility study is the process of determining whether or not a project is worth doing.
Feasibility studies are undertaken within tight time constraints and normally culminate in a
written and oral feasibility report. The contents and recommendations of this feasibility study
served as a sound basis for deciding how to proceed with the project. It helped in taking
decisions such as which software to use, which hardware combinations to adopt, etc. The
following is the process diagram for feasibility analysis. In the diagram, the feasibility analysis
starts with the user's set of requirements, and the existing system is also observed. The next
step is to check for deficiencies in the existing system. By evaluating the above points a fresh
idea is conceived to define and quantify the required goals. The user's consent is very important
for the new plan. In addition, the organization's ability to implement the new system is also
checked. Besides that, a set of alternatives and their feasibility is also considered in case of any
failure of the proposed system. Thus, a feasibility study is an important part of software
development.
(Process diagram: the user's stated requirements and the working current system are analyzed
to find deficiencies; goals are then defined and quantified, subject to constraints on resources
and to user consensus.)
1. Technical Feasibility: -
Technical feasibility determines whether the work for the project can be done with the existing
equipment, software technology and available personnel. Technical feasibility is concerned with
specifying equipment and software that will satisfy the user requirement.
This project is feasible on technical grounds as well, since the proposed system is beneficial in
providing a sound system built from the new technical components installed on it. The proposed
system can run on any machine supporting Windows and Internet services, and it works with
the software and hardware used while designing the system, so it is feasible in all technical
terms of feasibility.
The technologies used are mature enough to be applied to our problem. The practicality of the
solution we have developed is demonstrated by the technologies we have chosen. Technologies
such as Java (JSP, Servlets), JavaScript and the compatible hardware are so familiar in today's
knowledge-based industry that anyone can easily adapt to the proposed environment.
Do we currently possess the necessary technology?
We first make sure whether the required technologies are available to us or not. If they are
available, we must ask whether we have the capacity. For instance, "Will our current printer be
able to handle the new reports and forms required of the new system?"
This consideration of technical feasibility is often forgotten during feasibility analysis. We may
have the technology, but that doesn’t mean we have the skills required to properly apply that
technology. As far as our project is concerned we have the necessary expertise so that the
proposed solution can be made feasible.
2. Economical Feasibility: -
Economical feasibility determines whether there are sufficient benefits in creating the system to
make the cost acceptable, or whether the cost of the system is too high; it signifies cost-benefit
analysis and savings. On the basis of the cost-benefit analysis, the proposed system is feasible
and economical with regard to its pre-assumed cost of building the system. During the
economical feasibility test we maintained a balance between operational and economical
feasibility, as the two can conflict: for example, the solution that provides the best operational
impact for the end-users may also be the most expensive and, therefore, the least economically
feasible. We classified the costs of the proposed system according to the phase in which they
occur. Since system development costs are usually one-time costs that do not recur after the
project has been completed, we evaluated the development costs against certain cost
categories, viz.
• Personnel costs
• Computer usage
• Training
• Supply and equipments costs
• Cost of any new computer equipments and software.
In order to test whether the Proposed System is cost-effective or not we evaluated it through
three techniques viz.
Return on Investment
3. Operational Feasibility: -
Operational feasibility criteria measure the urgency of the problem (survey and study phases) or
the acceptability of a solution (selection, acquisition and design phases). Operational feasibility
is the measure of how well a proposed system solves the problems, and takes advantage of the
opportunities identified during scope definition and how it satisfies the requirements identified in
the requirements analysis phase of system development.
The operational feasibility assessment focuses on the degree to which the proposed development
project fits in with the existing business environment and objectives with regard to development
schedule, delivery date, corporate culture and existing business processes.
To ensure success, desired operational outcomes must be imparted during design and
development.
SYSTEM DESIGN
System Architecture:
DATA FLOW DIAGRAM:
GOALS:
The Primary goals in the design of the UML are as follows:
1. Provide users a ready-to-use, expressive visual modeling Language so that
they can develop and exchange meaningful models.
2. Provide extendibility and specialization mechanisms to extend the core
concepts.
3. Be independent of particular programming languages and development
processes.
4. Provide a formal basis for understanding the modeling language.
5. Encourage the growth of OO tools market.
6. Support higher level development concepts such as collaborations,
frameworks, patterns and components.
7. Integrate best practices.
Admin
In this module, the Admin has to log in using a valid user name and password.
After a successful login, he can perform operations such as View All Friend
Requests and Responses, View All User Tweet Blogs, Add Filter Details, View
Negative Sentiment, View Positive Sentiment, View Women Safety Results, and
View Tweet Score Results.
User
In this module, there are n users present. A user should register before
performing any operations. Once a user registers, their details are stored in
the database. After successful registration, the user has to log in using an
authorized user name and password. Once login is successful, the user can
perform operations like View Your Profile, Search Friend & Find Friend
Request, View All My Friends, Create Tweet Blog, View All My Tweet Blogs, and
View All My Friends' Tweet Blogs.
Testing
There are different methods that can be used for software testing. This chapter briefly describes
the methods available.
Black-Box Testing
The technique of testing without having any knowledge of the interior workings of the
application is called black-box testing. The tester is oblivious to the system architecture and
does not have access to the source code. Typically, while performing a black-box test, a tester
will interact with the system's user interface by providing inputs and examining outputs without
knowing how and where the inputs are worked upon.
The following table lists the advantages and disadvantages of black-box testing.
Advantages:
• Well suited and efficient for large code segments.
• Clearly separates the user's perspective from the developer's perspective through visibly
defined roles.
• Large numbers of moderately skilled testers can test the application with no knowledge of
implementation, programming language, or operating systems.
Disadvantages:
• Limited coverage, since only a selected number of test scenarios is actually performed.
• Blind coverage, since the tester cannot target specific code segments or error-prone areas.
• The test cases are difficult to design.
White-Box Testing
White-box testing is the detailed investigation of internal logic and structure of the code.
White-box testing is also called glass testing or open-box testing. In order to perform
white-box testing on an application, a tester needs to know the internal workings of the code.
The tester needs to have a look inside the source code and find out which unit/chunk of the code
is behaving inappropriately.
The following table lists the advantages and disadvantages of white-box testing.
Advantages:
• As the tester has knowledge of the source code, it becomes very easy to find out which type
of data can help in testing the application effectively.
• Extra lines of code can be removed, which can bring in hidden defects.
Disadvantages:
• Because a skilled tester is needed to perform white-box testing, the costs are increased.
• It is difficult to maintain white-box testing, as it requires specialized tools like code
analyzers and debugging tools.
Grey-Box Testing
Grey-box testing is a technique to test the application with limited knowledge of its internal
workings. Mastering the domain of a system always gives the tester an edge over someone with
limited domain knowledge. Unlike black-box testing, where the tester only tests the
application's user interface, in grey-box testing the tester has access to design documents and
the database.
Having this knowledge, a tester can prepare better test data and test scenarios while making a
test plan.
Advantages:
• Offers combined benefits of black-box and white-box testing wherever possible.
• Grey-box testers don't rely on the source code; instead they rely on interface definitions and
functional specifications.
• Based on the limited information available, a grey-box tester can design excellent test
scenarios, especially around communication protocols and data type handling.
Disadvantages:
• Since access to the source code is not available, the ability to go over the code and test
coverage is limited.
• The tests can be redundant if the software designer has already run a test case.
• Testing every possible input stream is unrealistic because it would take an unreasonable
amount of time; therefore, many program paths will go untested.
A comparison of the three testing types:
Black-Box Testing:
• The internal workings of an application need not be known.
• Testing is based on external expectations; the internal behavior of the application is
unknown.
• It is the least exhaustive and the least time-consuming.
• Not suited for algorithm testing.
• This can only be done by the trial-and-error method.
Grey-Box Testing:
• The tester has limited knowledge of the internal workings of the application.
• Testing is done on the basis of high-level database diagrams and data flow diagrams.
• Partly time-consuming and exhaustive.
• Not suited for algorithm testing.
• Data domains and internal boundaries can be tested, if known.
White-Box Testing:
• The tester has full knowledge of the internal workings of the application.
• Internal workings are fully known and the tester can design test data accordingly.
• The most exhaustive and time-consuming type of testing.
• Suited for algorithm testing.
• Data domains and internal boundaries can be better tested.
Levels of Testing
There are different levels during the process of testing. In this chapter, a brief description is
provided about these levels.
Levels of testing include different methodologies that can be used while conducting software
testing. The main levels of software testing are −
Functional Testing
Non-functional Testing
Functional Testing
This is a type of black-box testing that is based on the specifications of the software that is to be
tested. The application is tested by providing input and then the results are examined that need
to conform to the functionality it was intended for. Functional testing of a software is conducted
on a complete, integrated system to evaluate the system's compliance with its specified
requirements.
There are five steps involved in testing an application for functionality; among them, the
output is checked against the test data and the specifications of the application.
An effective testing practice will see the above steps applied to the testing policies of every
organization and hence it will make sure that the organization maintains the strictest of
standards when it comes to software quality.
Unit Testing
This type of testing is performed by developers before the setup is handed over to the testing
team to formally execute the test cases. Unit testing is performed by the respective developers
on the individual units of source code in their assigned areas. The developers use test data that
is different from the test data of the quality assurance team.
The goal of unit testing is to isolate each part of the program and show that individual parts are
correct in terms of requirements and functionality.
There is a limit to the number of scenarios and test data that a developer can use to verify a
source code. After having exhausted all the options, there is no choice but to stop unit testing
and merge the code segment with other units.
Integration Testing
Integration testing is defined as the testing of combined parts of an application to determine if
they function correctly. Integration testing can be done in two ways: Bottom-up integration
testing and Top-down integration testing.
1 Bottom-up integration
This testing begins with unit testing, followed by tests of progressively higher-
level combinations of units called modules or builds.
2 Top-down integration
In this testing, the highest-level modules are tested first and progressively, lower-
level modules are tested thereafter.
System Testing
System testing tests the system as a whole. Once all the components are integrated, the
application as a whole is tested rigorously to see that it meets the specified Quality Standards.
This type of testing is performed by a specialized testing team.
• System testing is the first step in the Software Development Life Cycle where the
application is tested as a whole.
• The application is tested thoroughly to verify that it meets the functional and technical
specifications.
• The application is tested in an environment that is very close to the production environment
where the application will be deployed.
• System testing enables us to test, verify, and validate both the business requirements as well
as the application architecture.
Regression Testing
Whenever a change in a software application is made, it is quite possible that other areas within
the application have been affected by this change. Regression testing is performed to verify that
a fixed bug hasn't resulted in another functionality or business rule violation. The intent of
regression testing is to ensure that a change, such as a bug fix should not result in another fault
being uncovered in the application.
• Minimize the gaps in testing when an application with changes made has to be tested.
• Test the new changes to verify that the changes made did not affect any other area of the
application.
• Mitigate risks when regression testing is performed on the application.
• Increase test coverage without compromising timelines.
• Increase the speed to market the product.
Acceptance Testing
This is arguably the most important type of testing, as it is conducted by the Quality Assurance
Team who will gauge whether the application meets the intended specifications and satisfies the
client’s requirement. The QA team will have a set of pre-written scenarios and test cases that
will be used to test the application.
More ideas will be shared about the application and more tests can be performed on it to gauge
its accuracy and the reasons why the project was initiated. Acceptance tests are not only
intended to point out simple spelling mistakes, cosmetic errors, or interface gaps, but also to
point out any bugs in the application that will result in system crashes or major errors in the
application.
CONCLUSION
Throughout this project we have discussed machine learning algorithms that can
help us organize and analyze the huge amount of Twitter data obtained,
including the millions of tweets and text messages shared every day. These
machine learning algorithms are very effective and useful when it comes to
analyzing large amounts of data, including the SPC algorithm and linear
algebraic factor model approaches, which help to further categorize the data
into meaningful groups. Support Vector Machines are yet another form of
machine learning algorithm that is very popular for extracting useful
information from Twitter and getting an idea of the status of women's safety
in Indian cities.
REFERENCES
[1] Agarwal, Apoorv, Fadi Biadsy, and Kathleen R. McKeown. "Contextual phrase level polarity
analysis using lexical affect scoring and syntactic n-grams." Proceedings of the 12th Conference
of the European Chapter of the Association for Computational Linguistics. Association for
Computational Linguistics, 2009.
[2] Barbosa, Luciano, and Junlan Feng. "Robust sentiment detection on twitter from biased and
noisy data." Proceedings of the 23rd international conference on computational linguistics:
posters. Association for Computational Linguistics, 2010.
[3] Bermingham, Adam, and Alan F. Smeaton. "Classifying sentiment in microblogs: is brevity
an advantage?." Proceedings of the 19th ACM international conference on Information and
knowledge management. ACM, 2010.
[4] Gamon, Michael. "Sentiment classification on customer feedback data: noisy data, large
feature vectors, and the role of linguistic analysis." Proceedings of the 20th international
conference on Computational Linguistics. Association for Computational Linguistics, 2004.
[5] Kim, Soo-Min, and Eduard Hovy. "Determining the sentiment of opinions." Proceedings of
the 20th international conference on Computational Linguistics. Association for Computational
Linguistics, 2004.
[6] Klein, Dan, and Christopher D. Manning. "Accurate unlexicalized parsing." Proceedings of
the 41st Annual Meeting on Association for Computational Linguistics-Volume 1. Association
for Computational Linguistics, 2003.
[7] Charniak, Eugene, and Mark Johnson. "Coarse-to-fine n-best parsing and MaxEnt
discriminative reranking." Proceedings of the 43rd annual meeting on association for
computational linguistics. Association for Computational Linguistics, 2005.
[8] Gupta, B., Negi, M., Vishwakarma, K., Rawat, G., & Badhani, P. (2017). Study of Twitter
sentiment analysis using machine learning algorithms on Python. International Journal of
Computer Applications, 165(9), 0975-8887.
[9] Sahayak, V., Shete, V., & Pathan, A. (2015). Sentiment analysis on twitter data. International
Journal of Innovative Research in Advanced Engineering (IJIRAE), 2(1), 178-183.
[10] Mamgain, N., Mehta, E., Mittal, A., & Bhatt, G. (2016, March). Sentiment analysis of top
colleges in India using Twitter data. In Computational Techniques in Information and
Communication Technology.