
This document describes the process of building and evaluating an LSTM neural network for text classification in MATLAB. It loads and preprocesses event narrative data, trains a word embedding on the text, converts the documents to sequences of word vectors, creates an LSTM network with sequence input, LSTM, fully connected, softmax, and classification layers, trains the network, evaluates it on the held-out validation data, and uses it to predict labels for new text reports.

clc
clear all

filename = "file.csv";
data = readtable(filename,'TextType','string');
head(data)

data.event_type = categorical(data.event_type);
% f = figure;
% f.Position(3) = 1.5*f.Position(3);
figure
h = histogram(data.event_type);   % keep the handle; the bin counts are read from it below
xlabel("Class")
ylabel("Frequency")
title("Class Distribution")
%% Get the frequency counts of the classes and their names from the histogram.

classCounts = h.BinCounts;
classNames = h.Categories;
% Find the classes containing fewer than ten observations.

idxLowCounts = classCounts < 10;
infrequentClasses = classNames(idxLowCounts)
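% A possible follow-up, not part of the original listing: the infrequent
% classes found above could be removed before partitioning, so that every
% class kept has at least ten observations. This sketch assumes the table
% and variable names used above.
% idxInfrequent = ismember(data.event_type,infrequentClasses);
% data(idxInfrequent,:) = [];
% data.event_type = removecats(data.event_type);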
%%
% The next step is to partition the data into sets for training and validation.
% Partition the data into a training partition and a held-out partition for validation and testing.
% Specify the holdout percentage to be 20%.
cvp = cvpartition(data.event_type,'Holdout',0.2);
dataTrain = data(training(cvp),:);
dataValidation = data(test(cvp),:);
%% Extract the text data and labels from the partitioned tables.
textDataTrain = dataTrain.event_narrative;
textDataValidation = dataValidation.event_narrative;
YTrain = dataTrain.event_type;
YValidation = dataValidation.event_type;
%%
figure
wordcloud(textDataTrain);
title("Training Data")
%% Pre-processing
textDataTrain = erasePunctuation(textDataTrain);
textDataTrain = lower(textDataTrain);
documentsTrain = tokenizedDocument(textDataTrain);
% View the first few preprocessed training documents.
documentsTrain(1:5)
%% Word embedding
embeddingDimension = 100;
embeddingEpochs = 50;
emb = trainWordEmbedding(documentsTrain, ...
    'Dimension',embeddingDimension, ...
    'NumEpochs',embeddingEpochs, ...
    'Verbose',0)
%% The trainingOptions function provides options to pad and truncate input sequences automatically.
documentLengths = doclength(documentsTrain);
figure
histogram(documentLengths)
title("Document Lengths")
xlabel("Length")
ylabel("Number of Documents")
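% A sketch of what the comment above refers to, not part of the original
% listing: padding and truncation can be requested at training time through
% the sequence options of trainingOptions. The values below (length 75, left
% padding with zeros) are assumptions chosen to match the rest of this script.
% optionsPadded = trainingOptions('adam', ...
%     'SequenceLength',75, ...
%     'SequencePaddingDirection','left', ...
%     'SequencePaddingValue',0);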
%% Convert the documents to sequences of word vectors using doc2sequence.
% To truncate or left-pad the sequences to a fixed length, the 'Length' option can be used;
% here the training documents are first truncated to sequenceLength words.
sequenceLength = 75;
documentsTruncatedTrain = docfun(@(words) words(1:min(sequenceLength,end)),documentsTrain);
XTrain = doc2sequence(emb,documentsTruncatedTrain);
XTrain(1:5)
%% Convert the validation documents to sequences using the same options.
% Preprocess the validation text with the same steps as the training data first.
textDataValidation = erasePunctuation(textDataValidation);
textDataValidation = lower(textDataValidation);
documentsValidation = tokenizedDocument(textDataValidation);
XValidation = doc2sequence(emb,documentsValidation,'Length',sequenceLength);

%% Create and Train LSTM Network
inputSize = embeddingDimension;
outputSize = 180;
numClasses = numel(categories(YTrain));

layers = [ ...
    sequenceInputLayer(inputSize)
    lstmLayer(outputSize,'OutputMode','last')
    fullyConnectedLayer(numClasses)
    softmaxLayer
    classificationLayer]

%% Specify Training Options
options = trainingOptions('adam', ...
    'GradientThreshold',1, ...
    'InitialLearnRate',0.01, ...
    'Plots','training-progress', ...
    'Verbose',0);
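% A possible variation, not part of the original listing: the XValidation and
% YValidation arrays prepared above could be monitored during training through
% the 'ValidationData' option of trainingOptions, e.g.
% options = trainingOptions('adam', ...
%     'GradientThreshold',1, ...
%     'InitialLearnRate',0.01, ...
%     'ValidationData',{XValidation,YValidation}, ...
%     'Plots','training-progress', ...
%     'Verbose',0);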
%% Train the LSTM network using the trainNetwork function.
net = trainNetwork(XTrain,YTrain,layers,options);

%% Test LSTM Network
% Use the held-out partition as the test set and apply the same preprocessing steps.
textDataTest = dataValidation.event_narrative;
YTest = dataValidation.event_type;
textDataTest = erasePunctuation(textDataTest);
textDataTest = lower(textDataTest);
documentsTest = tokenizedDocument(textDataTest);

%%
documentsTruncatedTest = docfun(@(words) words(1:min(sequenceLength,end)),documentsTest);
% Truncate and left-pad the sequences to sequenceLength with the 'Length' option.
XTest = doc2sequence(emb,documentsTruncatedTest,'Length',sequenceLength);
XTest(1:5)

%% Classify the test documents using the trained LSTM network.
YPred = classify(net,XTest);

%% Calculate the classification accuracy.
accuracy = sum(YPred == YTest)/numel(YPred)
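% An optional extra check, not part of the original listing: a confusion chart
% shows the per-class breakdown behind the overall accuracy.
% figure
% confusionchart(YTest,YPred);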
%% Predict Using New Data
reportsNew = [ ...
    "Lots of water damage to computer equipment inside the office."
    "A large tree is downed and blocking traffic outside Apple Hill."
    "Damage to many car windshields in parking lot."
    ];

% Preprocess the text data using the same preprocessing steps as the training documents.
reportsNew = lower(reportsNew);
reportsNew = erasePunctuation(reportsNew);
documentsNew = tokenizedDocument(reportsNew);
%% Convert the text data to sequences using doc2sequence with the same options as when creating the training sequences.
documentsTruncatedNew = docfun(@(words) words(1:min(sequenceLength,end)),documentsNew);
% Truncate and left-pad the sequences to sequenceLength with the 'Length' option.
XNew = doc2sequence(emb,documentsTruncatedNew,'Length',sequenceLength);
% Classify the new sequences using the trained LSTM network.
[labelsNew,score] = classify(net,XNew);

% Show the weather reports with their predicted labels.
[reportsNew string(labelsNew)]