Accepted to Carnegie Mellon, Stanford, Cornell, and the University of
Washington, and rejected from MIT and Berkeley
Many applications now hinge on enormous amounts of data, so it is important to find algorithms that run efficiently on large datasets. Biologists have to fish through billions of base pairs to pry information out of the human genome. The Square Kilometer Array of telescopes will produce exabytes of data every day.
Unfortunately, because these datasets are so huge, it can be prohibitively expensive to run algorithms on them. In grad school I would like to design algorithms that help us process data at this massive scale.
This problem became personally important to me when I entered the (mock) Netflix challenge for a class project. With 100 million movie ratings, any algorithm I implemented took many hours to run, even with 12 GB of RAM. That's when I grew frustrated and started dreaming of ways to make big data smaller. I came across papers by Li et al. on conditional random sampling [1-2], which convinced me of the power of sketching: replacing a massive dataset with a much smaller summary that still answers the queries you care about. I was excited to learn that sketches can be used to estimate Euclidean distances, since computing Euclidean distances was the slowest step in some of my algorithms. I was also intrigued that sketching could help optimize database performance, since I had spent a whole term studying databases.
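As an illustration of the kind of trick that hooked me, here is a minimal distance-preserving sketch. It uses a dense Gaussian random projection (a Johnson-Lindenstrauss-style construction, not the conditional random sampling of [1-2]), and the dimensions, seed, and test vectors are all arbitrary choices of mine:

```python
import numpy as np

def make_sketcher(d, k, seed=0):
    """Map d-dimensional vectors to k-dimensional sketches.

    A dense Gaussian random projection: with k on the order of
    log(n) / eps**2, squared Euclidean distances among n vectors
    are preserved up to a (1 +/- eps) factor.
    """
    rng = np.random.default_rng(seed)
    proj = rng.standard_normal((k, d)) / np.sqrt(k)
    return lambda vec: proj @ vec

d, k = 100_000, 400                    # sketch is 250x smaller
sketch = make_sketcher(d, k)

rng = np.random.default_rng(1)
u, v = rng.standard_normal(d), rng.standard_normal(d)

exact = np.linalg.norm(u - v)
approx = np.linalg.norm(sketch(u) - sketch(v))  # by linearity, = sketch(u - v)
print(f"exact {exact:.1f}, sketched {approx:.1f}")
```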
Exploring further, I found Jelani Nelson's papers on data streams [3-4], which introduced me to the idea of computing summary statistics of data streams in sublinear space. I was amazed that he could approximate the norm of a constantly changing vector without ever storing its coordinates. I was equally impressed by Cormode and Muthukrishnan's paper on the count-min sketch [5], which answers inner-product and various other queries in the same tiny footprint. Reading these papers made me want to work on similar topics in grad school.
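To make that concrete, here is a minimal count-min sketch in the spirit of [5]. The table sizes and the example stream are illustrative, and Python's built-in hash stands in for the pairwise-independent hash families the paper uses:

```python
import numpy as np

class CountMinSketch:
    """A depth x width table of counters summarizing a frequency vector.

    Point queries overestimate the true count by at most
    eps * ||f||_1 with probability 1 - delta, for
    width = ceil(e / eps) and depth = ceil(ln(1 / delta)).
    """

    def __init__(self, width=2000, depth=5, seed=0):
        rng = np.random.default_rng(seed)
        self.width = width
        self.table = np.zeros((depth, width), dtype=np.int64)
        # one salt per row; hash((salt, item)) plays the role of the
        # paper's pairwise-independent hash functions
        self.salts = [int(s) for s in rng.integers(0, 2**31, size=depth)]

    def _cols(self, item):
        return [hash((salt, item)) % self.width for salt in self.salts]

    def update(self, item, count=1):
        for row, col in enumerate(self._cols(item)):
            self.table[row, col] += count

    def query(self, item):
        # each row only overcounts (collisions add), so take the min
        return int(min(self.table[row, col]
                       for row, col in enumerate(self._cols(item))))

cm = CountMinSketch()
for word in ["a", "b", "a", "c", "a"]:
    cm.update(word)
print(cm.query("a"))   # 3, up to collision error
```

The min over rows is what keeps the estimate one-sided: hash collisions can only inflate a counter, never deflate it.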
I'm very interested in big data, but I have many other interests as well. Since high school I've done research in genetics, bioinformatics, psychology, cognitive science, and pure math. I think this interdisciplinary background would be an asset: where someone trained only in computer science will tend to think like a computer scientist, I can also think like a biologist or a mathematician. My math classes and research have likewise made me comfortable with abstraction, which would serve me well in a computer science PhD program.
When I was fifteen I modeled the dynamics of the selfish gene Medea. Our lab had built a gene that could spread very quickly through a population of flies, and I calculated how quickly it would spread under a wide range of conditions. This work led to three journal papers, one of them in Science.
While I was on medical leave from Caltech (2008-2010), I modeled the dynamics of therapist-client interactions. These dynamics can be very complicated, so we simplified them by considering only the therapist's and the client's "happiness." My mentor related these two variables with nonlinear differential equations, and I solved them, which led to two journal papers and two posters.
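As a sketch of what such a model can look like, the functional forms and coefficients below are hypothetical stand-ins (the actual equations are in the papers): each person's happiness decays toward baseline while responding, with saturation, to the other's.

```python
import numpy as np
from scipy.integrate import solve_ivp

def interaction(t, z, a=-0.2, b=0.6, c=0.5, d=-0.3):
    """Hypothetical coupled therapist-client dynamics.

    x = therapist's happiness, y = client's happiness; a, d < 0 pull
    each toward baseline, and tanh saturates each one's influence
    on the other.
    """
    x, y = z
    return [a * x + b * np.tanh(y),
            c * np.tanh(x) + d * y]

sol = solve_ivp(interaction, (0.0, 50.0), [1.0, -0.5])
print(sol.y[:, -1])   # long-run state of the coupled system
```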
The summer of my sophomore year I worked on Tutte polynomials and the Kontsevich conjecture, which states that the number of roots of a certain type of graph polynomial over a finite field depends polynomially on the size of the field. Even though the conjecture fails for graph polynomials, we thought an analogue might hold for Tutte polynomials. It turned out not to, but I did find some intermediate results that were published in the International Journal of Geometric Methods in Modern Physics.
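For the curious, the Tutte polynomial of a small graph can be computed by the standard deletion-contraction recurrence. The toy code below is mine, just to show the object the conjecture concerns:

```python
from sympy import symbols, expand

x, y = symbols("x y")

def is_bridge(edges, i):
    """True if removing edges[i] disconnects its endpoints."""
    (s, t), rest = edges[i], edges[:i] + edges[i + 1:]
    seen, stack = {s}, [s]
    while stack:
        node = stack.pop()
        for a, b in rest:
            for nxt in ((b,) if a == node else (a,) if b == node else ()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
    return t not in seen

def tutte(edges):
    """Tutte polynomial of a multigraph given as a list of (u, v) edges.

    T(G) = 1 if G has no edges; y * T(G - e) if e is a loop;
    x * T(G / e) if e is a bridge; T(G - e) + T(G / e) otherwise.
    """
    if not edges:
        return 1
    (u, v), rest = edges[0], edges[1:]
    if u == v:                                   # loop
        return y * tutte(rest)
    contracted = [(u if a == v else a, u if b == v else b)
                  for a, b in rest]              # identify v with u
    if is_bridge(edges, 0):
        return x * tutte(contracted)
    return tutte(rest) + tutte(contracted)

# triangle: T(K3) = x**2 + x + y
print(expand(tutte([(0, 1), (1, 2), (0, 2)])))
```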
The summer before I entered college, my mentor and I modified the bioinformatics tool BLAST to better detect similarities between proteins. We did get some improvement, at least on the few datasets we tried, but we fell short of our main goal, which was to find motor proteins in bacteria. A couple of years later I did another bioinformatics project at Protabit, where I benchmarked their software against competing tools, and a small project at MIT, where I ran experiments on how people learn words.
Stanford is my top choice because it is a large school with world-class faculty in a wide variety of fields, which suits interests as broad and interdisciplinary as mine. At Stanford I could apply big-data techniques to problems in biology, machine learning, natural language processing, or any other field that calls for them, and still find a great advisor. My objective in pursuing a PhD at Stanford is to do good work on interesting problems and eventually become a professor.
Citations
[1] Li, Ping, Kenneth W. Church, and Trevor J. Hastie. "One sketch for all: Theory and application of conditional random sampling." Advances in Neural Information Processing Systems. 2008.
[2] Li, Ping, Kenneth W. Church, and Trevor J. Hastie. "Conditional random sampling: A sketch-based sampling technique for sparse data." Advances in Neural Information Processing Systems 19 (2007): 873.
[3] Kane, Daniel M., Jelani Nelson, and David P. Woodruff. "On the exact space complexity of sketching and streaming small norms." Proceedings of the Twenty-First Annual ACM-SIAM Symposium on Discrete Algorithms (SODA). SIAM, 2010.
[4] Harvey, Nicholas J. A., Jelani Nelson, and Krzysztof Onak. "Sketching and streaming entropy via approximation theory." Proceedings of the 49th Annual IEEE Symposium on Foundations of Computer Science (FOCS). IEEE, 2008.
[5] Cormode, Graham, and S. Muthukrishnan. "An improved data stream summary: The count-min sketch and its applications." Journal of Algorithms 55.1 (2005): 58-75.