Research Topics
Natural Language Processing
Image Processing
CSC 3990
Natural Language Processing
CSC 3990
What is NLP?
• Natural Language Processing (NLP)
– Computers use (analyze, understand,
generate) natural language
– A somewhat applied field
• Computational Linguistics (CL)
– Computational aspects of the human
language faculty
– More theoretical
Why Study NLP?
• Human language is interesting & challenging
– NLP offers insights into language
• Language is the medium of the web
• Interdisciplinary: Ling, CS, psych, math
• Help in communication
– With computers (ASR, TTS)
– With other humans (MT)
• Ambitious yet practical
Goals of NLP
• Scientific Goal
– Identify the computational machinery
needed for an agent to exhibit various
forms of linguistic behavior
• Engineering Goal
– Design, implement, and test systems
that process natural languages for
practical applications
Applications
• speech processing: get flight information or book
a hotel over the phone
• information extraction: discover names of people
and events they participate in, from a document
• machine translation: translate a document from
one human language into another
• question answering: find answers to natural
language questions in a text collection or
database
• summarization: generate a short biography of
Noam Chomsky from one or more news articles
General Themes
• Ambiguity of Language
• Language as a formal system
• Computation with human language
• Rule-based vs. Statistical Methods
• The need for efficiency
Topic Ideas
1. Text to Speech – artificial voices
2. Speech Recognition – understanding
3. Textual Analysis – readability
4. Plagiarism Detection – candidate selection
5. Intelligent Agents – machine interaction
Text to Speech – artificial voice
• Text Input
• Break text into phonemes
– Match phonemes to voice elements
– Concatenate voice elements
– Manipulate pitch and spacing
• Output results
• Research question: How can a human voice be
used to produce an artificial voice?
• Model Talker - opportunities for active, hands-on
research (http://www.modeltalker.com)
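
The text-to-speech steps above can be sketched in a few lines of Python. This is only a toy illustration, not how Model Talker works: the pronunciation table, the phoneme labels, and the "voice elements" (short lists of made-up sample values standing in for recorded snippets) are all hypothetical, and pitch manipulation is left out.

```python
# Toy concatenative synthesis. Every table entry below is a made-up placeholder.

# Hypothetical pronunciation dictionary: word -> list of phoneme labels.
PRONUNCIATIONS = {
    "hi": ["HH", "AY"],
    "there": ["DH", "EH", "R"],
}

# Hypothetical voice elements: phoneme label -> recorded samples (fake values).
VOICE_ELEMENTS = {
    "HH": [0.1, 0.2],
    "AY": [0.5, 0.4, 0.3],
    "DH": [0.2],
    "EH": [0.6, 0.5],
    "R": [0.3, 0.1],
}


def text_to_phonemes(text):
    """Break text into phonemes by looking up each word in the dictionary."""
    phonemes = []
    for word in text.lower().split():
        phonemes.extend(PRONUNCIATIONS[word])  # KeyError for unknown words
    return phonemes


def synthesize(text, gap=1):
    """Match phonemes to voice elements and concatenate them.

    `gap` is the number of silent samples inserted after each element,
    a crude stand-in for the "spacing" step.
    """
    samples = []
    for phoneme in text_to_phonemes(text):
        samples.extend(VOICE_ELEMENTS[phoneme])
        samples.extend([0.0] * gap)
    return samples


if __name__ == "__main__":
    print(text_to_phonemes("Hi there"))
    print(len(synthesize("Hi there")), "samples")
```

A real system would replace the two tables with a pronunciation lexicon and a database of recordings from a human speaker, which is where the research question above comes in.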
Speech Recognition
• Spoken Input
• Identify words and phonemes in speech
– Generate text for recognized word parts
– Concatenate text elements
– Perform spelling, grammar and context checking
• Output results
• Research question: How can speech recognition
assist a deaf student taking notes in class?
• VUST – Villanova University Speech Transcriber
(http://www.csc.villanova.edu/~tway/publications/wayAT08.pdf)
Textual Analysis - Readability
• Text Input
• Analyze text & estimate “readability”
– Grade level of writing
– Consistency of writing
– Appropriateness for certain educ. level
• Output results
• Research question: How can a computer
analyze text and measure readability?
• Opportunities for hands-on research
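
One possible way to estimate the grade level of writing is the Flesch-Kincaid grade-level formula, sketched below. The slides do not name a specific measure, so the choice of formula is an assumption, and the syllable counter is a rough vowel-group heuristic rather than a dictionary lookup.

```python
import re


def count_syllables(word):
    """Approximate syllables as the number of vowel groups (heuristic)."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_kincaid_grade(text):
    """Estimate the U.S. school grade level of a text.

    grade = 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
    """
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * len(words) / len(sentences)
            + 11.8 * syllables / len(words)
            - 15.59)


if __name__ == "__main__":
    easy = "The cat sat. The dog ran."
    hard = "Computational linguistics investigates sophisticated grammatical formalisms."
    print(round(flesch_kincaid_grade(easy), 2))
    print(round(flesch_kincaid_grade(hard), 2))
```

Short sentences with short words score low (even below zero), while long sentences full of polysyllabic words score high. Consistency of writing could be explored by scoring each paragraph separately and comparing the results.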
Plagiarism Detection
• Text Input
• Analyze text & locate “candidates”
– Find one or more passages that might be plagiarized
– Algorithm tries to do what a teacher does
– Search on Internet for candidate matches
• Output results
• Research question: What algorithms work like
humans when finding plagiarism?
• Experimental CS research
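
Candidate selection can be illustrated with word n-gram overlap. The sketch below is one simple technique, not the algorithm the slides have in mind: it measures what fraction of a passage's three-word sequences also appear in a source text. The function names and the 0.5 threshold are illustrative assumptions, and the Internet search step is not modeled.

```python
import re


def shingles(text, n=3):
    """Return the set of word n-grams ("shingles") in a text."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def containment(passage, source, n=3):
    """Fraction of the passage's n-grams that also occur in the source."""
    passage_shingles = shingles(passage, n)
    if not passage_shingles:
        return 0.0
    shared = passage_shingles & shingles(source, n)
    return len(shared) / len(passage_shingles)


def find_candidates(passages, source, threshold=0.5, n=3):
    """Return passages whose overlap with the source meets the threshold."""
    return [p for p in passages if containment(p, source, n) >= threshold]


if __name__ == "__main__":
    source = "The quick brown fox jumps over the lazy dog."
    passages = [
        "The quick brown fox jumps",
        "A completely different sentence here",
    ]
    print(find_candidates(passages, source))
```

A teacher also notices paraphrase and sudden changes in style, which exact n-gram matching misses; comparing such heuristics against human judgment is the experimental part.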
Intelligent Agents
• Example: ELIZA
• AIML: Artificial Intelligence Markup Lang.
• Human types something
• Computer parses, “understands”, and generates
response
• Response is viewed by human
• Research question: How can computers
“understand” and “generate” human writing?
• Also good area for experimentation
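
The parse-and-respond loop above can be sketched with a few pattern-matching rules in the spirit of ELIZA. The rules and responses below are invented for illustration; they are not ELIZA's original script and do not use AIML syntax.

```python
import re

# Hypothetical rules: (pattern to look for, response template).
RULES = [
    (re.compile(r"\bi am (.+)", re.IGNORECASE), "Why do you say you are {0}?"),
    (re.compile(r"\bi feel (.+)", re.IGNORECASE), "How long have you felt {0}?"),
]

# Fallback when no rule matches.
DEFAULT_RESPONSE = "Please tell me more."


def respond(utterance):
    """Parse the human's input and generate a canned response."""
    text = utterance.strip().rstrip(".!?")
    for pattern, template in RULES:
        match = pattern.search(text)
        if match:
            # Reuse the matched fragment so the reply appears to "understand".
            return template.format(match.group(1))
    return DEFAULT_RESPONSE


if __name__ == "__main__":
    for line in ["I am tired.", "I feel sad", "Hello"]:
        print(line, "->", respond(line))
```

The program only echoes fragments of the input, which is why "understands" appears in quotation marks on the slide.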
Image Processing
CSC 3990
Some slides from Xin Li lecture notes, West Virginia Univ.
What is Image Processing?
• Digital Image Processing
– Analog transmission in 1920
– Early improvements in 1920s
– Required digital computer (1948)
– Rapid advancement since
Historical Background
Newspaper industry used
Bartlane cable picture
transmission system to send
pictures by submarine cable
between London and New
York in 1920s
The number of distinct gray
levels coded by Bartlane
system was improved from 5
to 15 by the end of 1920s
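
To get a feel for what 5 versus 15 gray levels means, the sketch below reduces 8-bit pixel values (0 to 255) to a given number of evenly spaced levels. It is a modern illustration only; the Bartlane system was not software, and the evenly spaced levels are an assumption.

```python
def quantize(pixels, levels):
    """Map each 8-bit gray value to the nearest of `levels` evenly spaced values.

    `pixels` is a list of rows, each row a list of integers in 0..255.
    """
    if levels < 2:
        raise ValueError("need at least two gray levels")
    step = 255 / (levels - 1)
    return [[round(round(p / step) * step) for p in row] for row in pixels]


if __name__ == "__main__":
    ramp = [list(range(256))]  # one row containing every gray value
    for levels in (5, 15):
        distinct = set(quantize(ramp, levels)[0])
        print(levels, "levels requested,", len(distinct), "distinct values produced")
```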
Digital Image Processing
• The images in previous slides are digital
(now), but they are NOT the result of DIP
• Digital Image Processing is
– Processing digital images by a digital
computer
• DIP requires a digital computer and other
supporting technologies (e.g., data storage,
display and transmission)
Cool Applications
The first picture of the moon
by US spacecraft Ranger 7
on July 31, 1964 at
9:09AM EDT
• Digitization
• Compression
• Error Recovery
Sir Godfrey N. Hounsfield and Prof.
Allan M. Cormack shared the 1979
Nobel Prize in Medicine for the
invention of CT
• Enhancement
• Edges, Contrast,
Brightness, etc.
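
The three enhancement operations named above can each be sketched in a few lines on a grayscale image stored as a list of rows of 8-bit values. These are deliberately simple versions chosen for illustration (a constant brightness offset, a linear contrast stretch, and a horizontal difference as an edge measure); practical systems use more refined methods.

```python
def adjust_brightness(pixels, offset):
    """Add a constant to every pixel, clamping the result to 0..255."""
    return [[min(255, max(0, p + offset)) for p in row] for row in pixels]


def stretch_contrast(pixels):
    """Linearly rescale so the darkest pixel becomes 0 and the brightest 255."""
    flat = [p for row in pixels for p in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        return [row[:] for row in pixels]  # flat image: nothing to stretch
    return [[round((p - lo) * 255 / (hi - lo)) for p in row] for row in pixels]


def horizontal_edges(pixels):
    """Absolute difference between horizontal neighbors; large values mark edges."""
    return [[abs(row[i + 1] - row[i]) for i in range(len(row) - 1)]
            for row in pixels]


if __name__ == "__main__":
    image = [[50, 100], [150, 200]]
    print(adjust_brightness(image, 20))
    print(stretch_contrast(image))
    print(horizontal_edges([[10, 10, 200]]))
```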
• Acquisition
– Digital cameras, scanners
– MRI and Ultrasound imaging
– Infrared and microwave imaging
• Transmission
– Internet, wireless communication
• Display
– Printers, LCD monitor, digital TV
Past 20 Years
Photography
Motion Pictures
Law Enforcement and Biometrics
Remote Sensing
Hurricane Andrew
taken by NOAA GOES
America at night
(Nov. 27, 2000)
Thermal Images
Human body disperses
heat (red pixels)
Different colors indicate
varying temperatures
Operate in infrared frequency
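
The idea that different colors indicate varying temperatures can be sketched as a false-color mapping. The blue-to-red scale and the temperature range below are assumptions made for illustration; real thermal cameras use their own palettes and calibration.

```python
def false_color(temp, t_min, t_max):
    """Map a temperature to an (R, G, B) color: blue when cold, red when hot.

    Temperatures outside [t_min, t_max] are clamped to the nearest end.
    """
    if t_max <= t_min:
        raise ValueError("t_max must be greater than t_min")
    fraction = min(1.0, max(0.0, (temp - t_min) / (t_max - t_min)))
    red = round(255 * fraction)
    return (red, 0, 255 - red)


def colorize(temps, t_min, t_max):
    """Apply the mapping to a 2D grid of temperature readings."""
    return [[false_color(t, t_min, t_max) for t in row] for row in temps]


if __name__ == "__main__":
    readings = [[20.0, 28.5], [33.0, 37.0]]  # hypothetical values in degrees Celsius
    for row in colorize(readings, 20.0, 37.0):
        print(row)
```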
Medical Diagnostics
chest head
Operate in X-ray frequency
PET and Astronomy
Positron Emission Tomography
Cygnus Loop in the
constellation of Cygnus
Operate in gamma-ray frequency
Cartoon Pictures (Non-photorealistic)
Synthetic Images in Gaming
Age of Empires III by Ensemble Studios
Virtual Reality (Photorealistic)
General Themes
• Human vision is limited
• Digital images contain more information
than humans perceive
• Computers can use algorithms to extract
more information from digital images
• Computers can acquire, manipulate,
compress, transmit and modify images
Topic Ideas
1. Biometrics – identifying faces & retinas
2. Target Acquisition – see a tank from space
3. Computer Vision – detect microscopic flaws in
manufacturing
4. Assistive Technology – convert visual images
into tactile or textual form
5. Entertainment – remove red eye, morph faces,
digital filmmaking, movie magic
6. Image Description – use 3D dictionary to
describe contents of 2D image
