Abstract. Hate speech may take different forms in online social media. Most of the investigations in the literature focus on detecting abusive language in discussions about ethnicity, religion, gender identity and sexual orientation. In this paper, we address the problem of automatic detection and categorization of misogynous language in online social media. The main contribution of this paper is two-fold: (1) a corpus of misogynous tweets, labelled from different perspectives, and (2) an exploratory investigation of NLP features and ML models for detecting and classifying misogynistic language.
1 Introduction
Twitter is part of the ordinary life of a great number of people (https://www.statista.com/statistics/282087/number-of-monthly-active-twitter-users/). Users feel free to express themselves as if no limits were imposed, although behavioural norms are declared by the social networking sites. In these settings, different targets of hate speech can be distinguished, and women have recently emerged as victims of abusive language from both men and women. A first study [9], focused on women's experience of sexual harassment in online social networks, reports women's perception of their freedom of expression through the #mencallmethings hashtag. More recently, as the allegations against Hollywood producers were made public, a similar phenomenon became viral through the #metoo hashtag. Another investigation was presented by Fulper et al. [4]. The authors studied the usefulness of monitoring content published in social media for foreseeing sexual crimes. In particular, they confirmed that a correlation might exist between the yearly per capita rate of rape and the misogynistic language used on Twitter. Although the problem of hate speech against women is growing rapidly,
most of the computational approaches in the state of the art detect abusive language related to ethnicity, religion, gender identity, sexual orientation and cyberpedophilia. In this paper, we investigate the problem of misogyny detection and categorization in social media text. The rest of the paper is structured as follows. In Sect. 2, we describe related work about hate speech and misogyny in social media. In Sect. 3, we propose a taxonomy for modelling the misogyny phenomenon in online social environments, together with a manually labelled Twitter dataset. Finally, after discussing experiments and the obtained results in Sect. 4, we draw some conclusions and address future work.
2 Related Work
Recently, several studies have been carried out in the attempt to automatically detect hate speech. The work presented in [15] surveys the main methodologies developed in this area. Although linguistic features may differ among the various approaches, the classification models implemented so far are supervised. Furthermore, a great limitation is that a benchmark dataset does not yet exist. Hence, the authors of [11] built and made public a corpus labelled according to three subcategories (hate speech, derogatory, profanity), where hate speech is considered as a kind of abusive language. In [17] the authors described a corpus labelled with tags about both racist and sexist offenses. A recent study about the distinction between hate speech and offensive language was presented in [8].
Misogyny is a specific case of hate speech whose targets are women. Poland, in her book about cybermisogyny [13], remarked, among other issues, on the problem of online sexual harassment. She examined in depth the Gamergate case, which occurred in 2014, primarily burst on 4chan and then spread across different social networking sites. Gamergate was an organized movement that seriously threatened the lives of women working in the video games industry. The harassment took place firstly online and then degenerated offline. This episode confirms that cybermisogyny does exist, and thus it is necessary to put effort into trying to prevent similar phenomena. To the best of our knowledge, only a preliminary exploratory analysis of misogynous language in online social media has been presented in [7]. The authors collected and manually labelled a set of tweets as positive, negative and neutral. However, nothing has been done from a computational point of view to recognize misogynous text and to distinguish among the variety of types of misogyny.
3.2 Dataset
In order to collect and label a set of texts, and subsequently address the misogyny detection and categorization problems, we started from the set of keywords in [7] and enriched these keywords with new ones, as well as hashtags, to download representative tweets from the Twitter streaming API. We added words useful to represent the different misogyny categories, when they were related to a woman, as well as when implying potential actions against women. We also monitored tweets in which potentially harassed users might have been mentioned. These users were chosen because of either their public effort in feminist movements or pitiful past episodes of online harassment, such as Gamergate. Finally, we identified Twitter profiles that declared themselves to be misogynistic, i.e. users mentioning hate against women in their screen name or biography. The streaming download started on the 20th of July 2017 and was stopped on the 30th of November 2017. Next, among all the collected tweets we selected a subset by querying the database for the co-presence of each keyword with either a phrase or a word not used to download tweets but still reasonable in picturing misogyny online. The dataset has been made available for the IberEval-2018 (https://amiibereval2018.wordpress.com/) and EvalIta-2018 (https://amievalita2018.wordpress.com/) challenges.
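To make the co-presence selection step described above concrete, the following minimal sketch filters downloaded tweets by requiring a seed keyword to co-occur with an additional indicator that was not used for the download; the keyword lists and helper names are illustrative placeholders, not the actual vocabulary used for the corpus.

```python
# Minimal sketch of the co-presence filter: keep only the downloaded tweets in
# which a seed keyword co-occurs with an extra misogyny-related word or phrase
# that was NOT part of the download query. Both lists are placeholders.

SEED_KEYWORDS = {"seed_term_1", "seed_term_2"}          # hypothetical download terms
EXTRA_INDICATORS = {"indicator phrase", "indicator_word"}  # hypothetical extra terms


def co_presence(tweet_text: str) -> bool:
    """Return True if the tweet contains a seed keyword AND an extra indicator."""
    text = tweet_text.lower()
    has_seed = any(k in text for k in SEED_KEYWORDS)
    has_extra = any(p in text for p in EXTRA_INDICATORS)
    return has_seed and has_extra


def select_candidates(tweets):
    """Filter an iterable of raw tweet texts down to candidate misogynous tweets."""
    return [t for t in tweets if co_presence(t)]
```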
The labeling phase involved two steps: firstly, a gold standard was composed and labelled by two annotators, with cases of disagreement resolved by a third, more experienced contributor; secondly, the remaining tweets were labelled through a majority voting approach by external contributors on the CrowdFlower platform. The gold standard was used for quality control of the judgements throughout the second step. As far as the gold standard is concerned, we estimated the level of agreement among the annotators before the resolution of the cases of disagreement. The kappa coefficient [3] is the most widely used statistic for measuring the degree of reliability between annotators; the need for consistency among annotators arises immediately due to the variability of human perceptions.
interagreement measure can be summarized as:
observedagreement − chanceagreement
k= (1)
1 − chanceagreement
However, considering only this statistic is not appropriate when the prevalence of a given response is very high or very low in a specific class. In this case, the value of kappa may indicate a low level of reliability even with a high observed proportion of agreement. In order to address these imbalances caused by differences in prevalence and bias, the authors of [2] introduced a different version of the kappa coefficient called prevalence-adjusted bias-adjusted kappa (PABAK). The estimation of PABAK depends solely on the observed proportion of agreement between annotators:

PABAK = 2 · observed agreement − 1    (2)
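To illustrate how the two agreement statistics behave, the sketch below computes Cohen's kappa and PABAK for two hypothetical annotators; the toy labels are invented for the example and are not taken from the corpus.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators over the same items (nominal labels)."""
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: sum over labels of the product of each annotator's marginals.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    chance = sum((freq_a[l] / n) * (freq_b[l] / n)
                 for l in set(labels_a) | set(labels_b))
    return (observed - chance) / (1 - chance)


def pabak(labels_a, labels_b):
    """Prevalence-adjusted bias-adjusted kappa: depends only on observed agreement."""
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / len(labels_a)
    return 2 * observed - 1


# Toy example with a highly imbalanced class distribution.
ann_1 = ["misogynous"] * 1 + ["not_misogynous"] * 9
ann_2 = ["not_misogynous"] * 10
print(cohen_kappa(ann_1, ann_2), pabak(ann_1, ann_2))
```

On this skewed toy distribution the two annotators agree on 90% of the items, yet kappa is 0.0 while PABAK is 0.8, which is exactly the imbalance effect discussed above.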
4 Methodology
4.1 Feature Space
Misogyny detection might be considered as a special case of abusive language detection. Therefore, we chose representative features taking into account the guidelines suggested in [11]; a minimal feature-extraction sketch follows the list:
1. N-grams: we considered both character and token n-grams, in particular character n-grams from 3 to 5 characters (blank spaces included) and token unigrams, bigrams and trigrams. We chose to include these features since they usually perform well in text classification.
2. Linguistic: misogyny category classification is a kind of stylistic classification, which might be improved by the use of quantitative features [1,5]. Hence, for the purpose of the current study we employed the following stylistic features:
(a) Length of the tweet in number of characters, since tweets labelled as sexual harassment are usually shorter.
(b) Presence of a URL, since a link to an external source might be a hint of a derailing-type tweet.
(c) Number of adjectives, as stereotype and objectification tweets include more descriptive words.
(d) Number of mentions of users, since it might be useful in distinguishing between an individual and a generic target.
3. Syntactic: we considered Bag-of-POS features, i.e. unigrams, bigrams and trigrams of Part-of-Speech tags.
4. Embedding: the purpose of this type of feature is to represent texts through a vector space model in which words that appear in similar contexts lie close to each other [10]. In particular, we employed the gensim library for Python [14] and used a publicly available model pre-trained on a Twitter dataset (https://www.fredericgodin.com/software/).
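The sketch below shows one possible way to assemble part of this feature space (character 3-5 grams, token 1-3 grams and a few of the stylistic counts) with scikit-learn; the library choice, the regular expressions and the toy tweets are assumptions for illustration, since the paper does not report implementation details, and the POS and embedding features are omitted here.

```python
import re

import numpy as np
from scipy.sparse import csr_matrix, hstack
from sklearn.feature_extraction.text import CountVectorizer


def stylistic_features(tweets):
    """A subset of the linguistic features above: length, URL flag, number of mentions."""
    rows = []
    for t in tweets:
        rows.append([
            len(t),                                   # (a) tweet length in characters
            int("http://" in t or "https://" in t),   # (b) presence of a URL
            len(re.findall(r"@\w+", t)),              # (d) number of user mentions
        ])
    return csr_matrix(np.array(rows, dtype=float))


# (1) character 3-5 grams (respecting word boundaries) and token 1-3 grams.
char_ngrams = CountVectorizer(analyzer="char_wb", ngram_range=(3, 5))
token_ngrams = CountVectorizer(analyzer="word", ngram_range=(1, 3))

tweets = ["@user example tweet number one", "another example #NLP https://t.co/x"]
X = hstack([
    char_ngrams.fit_transform(tweets),
    token_ngrams.fit_transform(tweets),
    stylistic_features(tweets),
])
print(X.shape)  # one row per tweet, one column per n-gram or stylistic feature
```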
Results. In Table 3 we report the results obtained by the classifiers employed for misogyny identification; in Table 4 we show the results obtained for the misogynistic behaviour classification. When training the considered classifiers, we did not apply any feature filtering or parameter tuning. As far as misogynistic language identification is concerned, the performances reached by the chosen classifiers are close to each other; on the contrary, for misogyny classification they may differ substantially. Token n-grams allow competitive results to be achieved, especially for the task of misogynistic language identification on Twitter, where using all the features marginally decreases the obtained accuracy (results obtained with All Features are statistically significant according to a Student t-test with p-value equal to 0.05). Results about misogyny classification show the difficulty of recognizing the different phenomena of misogyny. Nevertheless, token n-grams obtained competitive results compared to when linguistic features were also employed. Although the investigated features show promising results, additional feature sets related to skip character n-grams [6] could be considered in the future.
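As a hedged illustration of the kind of comparison reported above, the sketch below cross-validates a linear classifier on two token n-gram configurations and applies a paired Student t-test to the per-fold accuracies; the classifier, the number of folds and the placeholder data are assumptions and do not reproduce the paper's setup.

```python
from scipy import stats
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Placeholder data: in practice, the tweets and binary misogyny labels of the corpus.
tweets = ["sample tweet one", "sample tweet two",
          "sample tweet three", "sample tweet four"] * 10
labels = [0, 1, 0, 1] * 10

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)


def fold_accuracies(ngram_range):
    """Cross-validated accuracy of a linear SVM over token n-gram counts."""
    model = make_pipeline(
        CountVectorizer(analyzer="word", ngram_range=ngram_range),
        LinearSVC(),
    )
    return cross_val_score(model, tweets, labels, cv=cv, scoring="accuracy")


acc_unigrams = fold_accuracies((1, 1))
acc_trigrams = fold_accuracies((1, 3))

# Paired t-test over per-fold accuracies of the two feature configurations.
# Note: on this trivial toy data the two settings may tie on every fold (nan p-value);
# real corpus data gives meaningful fold-to-fold variation.
t_stat, p_value = stats.ttest_rel(acc_unigrams, acc_trigrams)
print(acc_unigrams.mean(), acc_trigrams.mean(), p_value)
```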
Acknowledgements. The work of the third author was partially funded by the Spanish MINECO under the research project SomEMBED (TIN2015-71147-C2-1-P).
References
1. Argamon, S., Whitelaw, C., Chase, P., Hota, S.R., Garg, N., Levitan, S.: Stylistic
text classification using functional lexical features: research articles. J. Am. Soc.
Inf. Sci. Technol. 58(6), 802–822 (2007)
2. Byrt, T., Bishop, J., Carlin, J.B.: Bias, prevalence and kappa. J. Clin. Epidemiol.
46(5), 423–429 (1993)
3. Cohen, J.: A coefficient of agreement for nominal scales. Educ. Psychol. Measur.
20(1), 37–46 (1960)
4. Fulper, R., Ciampaglia, G.L., Ferrara, E., Ahn, Y., Flammini, A., Menczer, F.,
Lewis, B., Rowe, K.: Misogynistic language on Twitter and sexual violence. In:
Proceedings of the ACM Web Science Workshop on Computational Approaches to
Social Modeling (ChASM) (2014)
5. HaCohen-Kerner, Y., Beck, H., Yehudai, E., Rosenstein, M., Mughaz, D.: Cuisine:
classification using stylistic feature sets and/or name-based feature sets. J. Assoc.
Inf. Sci. Technol. 61(8), 1644–1657 (2010)
6. HaCohen-Kerner, Y., Ido, Z., Ya'akobov, R.: Stance classification of tweets
using skip char Ngrams. In: Altun, Y., Das, K., Mielikäinen, T., Malerba, D.,
Stefanowski, J., Read, J., Žitnik, M., Ceci, M., Džeroski, S. (eds.) ECML PKDD
2017. LNCS (LNAI), vol. 10536, pp. 266–278. Springer, Cham (2017).
https://doi.org/10.1007/978-3-319-71273-4_22
7. Hewitt, S., Tiropanis, T., Bokhove, C.: The problem of identifying misogynist
language on Twitter (and other online social spaces). In: Proceedings of the 8th
ACM Conference on Web Science, pp. 333–335. ACM, May 2016
8. Davidson, T., Warmsley, D., Macy, M., Weber, I.: Automated hate speech detection
and the problem of offensive language. In: Proceedings of the 12th International
AAAI Conference on Web and Social Media (2017)
9. Megarry, J.: Online incivility or sexual harassment? Conceptualising women’s expe-
riences in the digital age. In: Women’s Studies International Forum, vol. 47, pp.
46–55. Pergamon (2014)
10. Le, Q., Mikolov, T.: Distributed representations of sentences and documents. In:
International Conference on Machine Learning, pp. 1188–1196, January 2014
11. Nobata, C., Tetreault, J., Thomas, A., Mehdad, Y., Chang, Y.: Abusive language
detection in online user content. In: Proceedings of the 25th International Confer-
ence on World Wide Web, pp. 145–153. International World Wide Web Conferences
Steering Committee (2016)
12. Parker, R.I., Vannest, K.J., Davis, J.L.: Effect size in single-case research: a review
of nine nonoverlap techniques. Behav. Modif. 35(4), 303–322 (2011)
13. Poland, B.: Haters: Harassment, Abuse, and Violence Online. University of
Nebraska Press, Lincoln (2016)
14. Rehurek, R., Sojka, P.: Software framework for topic modelling with large cor-
pora. In: Proceedings of the LREC 2010 Workshop on New Challenges for NLP
Frameworks (2010)
15. Schmidt, A., Wiegand, M.: A survey on hate speech detection using natural lan-
guage processing. In: Proceedings of the Fifth International Workshop on Natural
Language Processing for Social Media. Association for Computational Linguistics,
Valencia, Spain, pp. 1–10 (2017)
16. Sebastiani, F.: Machine learning in automated text categorization. ACM Comput.
Surv. (CSUR) 34(1), 1–47 (2002)
17. Waseem, Z., Hovy, D.: Hateful symbols or hateful people? Predictive features for
hate speech detection on Twitter. In: SRW@HLT-NAACL, pp. 88–93 (2016)