Sunday, April 03, 2005

Numb3rs Review: "Identity Crisis"

I never got around to posting my Numb3rs review because this episode was rather low on the math. There was a small element of Bayesian reasoning, where Charlie attempts to figure out the likelihood that Don convicted the wrong guy, based on two crimes having the same M.O.

Plot Summary: Don discovers a crime with an M.O. identical to a case that he thought he had solved a year ago. He gets Charlie to tell him that the likelihood of a coincidence is very small, and then goes after the real killer. Along the way, Charlie insults a fingerprint specialist, but she gets her revenge, with a "butcher's hook" (no no, not that kind).

What I found more interesting was a real life Numb3rs story (with no crime though), via the New Yorker, starring the brothers Chudnovsky, and a delicately tampered rug. And if you're tired of such seriousness, check out this profile of the most beautiful mathematics grad student on TV, Navi Rawat (courtesy Sepia Mutiny).


The /. translation

of the NYT article on defense funding that I mentioned earlier:
  • I'll put it in StarCraft terms: you're spending your minerals on upgrading your Zealots, and failing to invest in the pylons and tech structures that would allow you to build a whole frickin' fleet of Protoss Carriers
  • You can't cut Counter Strike research! What will I play now?
The discussion is interesting though (at +5 filtering). Some of the standard fallacies are brought up (and occasionally demolished):
  • All useful CS research was done in the 70s
  • We don't need no stinkin' guvmint: a teenage hacker can do better
  • Why should DARPA fund academic research
  • Why do CS researchers need tons of money anyway ? All they do is think !

Saturday, April 02, 2005

The role of the university in the current funding crisis

When writing a post, it often happens that I have too many thoughts to fit into what is conventionally a short note format. Further reflection on my post about funding led to this:

Although it is not clear that the dot-com bust had anything to do with increased submission rates to conferences/funding agencies, it is fair to say that the dot-com boom had a lot to do with this.
I recall that in the late 90s and early 00s, universities were desperately trying to increase the sizes of their CS departments, allegedly in response to increased undergraduate enrollment and the consequent increased demand for teachers.

Needless to say, when you hire tenure-track professors, you get more teaching, but you get a heck of a lot more research, as you have more slaves running the tenure treadmill as fast as their legs can go. That inevitably leads to more submissions to conferences and more requests for money.

But tenure-track jobs often lead to tenure, i.e permanent jobs. These professors are not likely to stop publishing and stop requesting funding, but the underlying supply of students appears to be drying up (at least at the graduate level, though probably not at the undergraduate level). What happens then ?

This appears reminiscent of what has happened in biology. The PCR revolution led to an explosion in the number of researchers in biology and genetics. The NIH has historically been generous in funding graduate work in biology. However, funding for PIs and post-docs is far more scarce, creating a supply-demand bottleneck that means long years toiling at postdocs for ever-elusive faculty jobs.

In computer science, we are far from that, also because industry jobs are to an extent still a viable option for researchers. But as industrial research labs cut back, and funding for basic research dries up, we could easily be facing the same kind of funding crises that biology faces today. And to draw another lesson from biology, what is true for rats is true for humans: as resources dwindle, organisms become more ferocious towards each other. If you think your grant reviews were harsh before, watch out !

Dwindling research funding in the U.S.

Today's New York Times has an article by John Markoff titled 'Pentagon Redirects Its Research Dollars'. The lead paragraph:
The Defense Advanced Research Projects Agency at the Pentagon - which has long underwritten open-ended "blue sky" research by the nation's best computer scientists - is sharply cutting such spending at universities, researchers say, in favor of financing more classified work and narrowly defined projects that promise a more immediate payoff.
Others have already commented on this article: it is interesting to read it in conjunction with the latest David Patterson letter in CACM. One of the most fascinating items in this letter is a chart outlining the evolution (in terms of money, and where the research was done) of many of the most important technologies available today. Technology transfer is a rather overused buzzword, but this chart really shows how technology moves between academia and industrial research labs (in both directions), and then to commercial settings. If ever one wanted proof of the value of academic "blue-sky" research, this is it.



As always, there are a number of related trends:
  • In a previous letter, David Patterson lamented the increasing submission loads to various conferences. If you look at the chart of NSF funding requests and acceptances over the past few years, it looks strikingly similar.
  • Research is becoming more conservative; again, David Patterson in his two letters cites the lack of 'out-there' conference papers as well as the 'chilling effect' of funding on truly innovative grant proposals.
It is not hard to see why these are related: in a highly competitive conference (and funding) acceptance scenario, papers are not accepted; they are merely 'not rejected'. In other words, we find reasons to eliminate papers rather than accept them. What this means is that you have to have a cast-iron presentation of a paper/grant; anything that can get shot down, will. Thus, more outlandish and innovative proposals die.

Being conservative about accepting 'out-there' papers is not necessarily a bad thing; however, I'd argue that one cannot afford to be conservative about funding new ideas: to use a rather abhorrent analogy, funding is like the VC seed capital for a company, and conference/journal peer review is like the marketplace; if you cut off the source of innovation, there is no fuel for good ideas to emerge.

One has to ask though: How much is ACM itself responsible for this state of affairs ? I have read more interesting articles in CACM in the past three months than I had in the prior 7 years, and I am truly grateful that David Patterson is raising issues that are familiar to all of us in research. However, the ACM, over the past many years, has morphed from one of the truly representative computer science organizations (many foundational research results were published in CACM in days gone by), to an IT trade organization. The ACM does not speak to me anymore, at a time when an active lobbying group for basic computer science research has never been more important.

Part of this, for better or for worse, has been the explosion in the IT industry over the last ten years. With the kind of money pouring into the IT world, it probably made sense to cater to the needs of IT professionals. However, companies are fickle creatures; one of the main problems we face right now is the lack of industrial funding for basic research, something that used to be a big part of the computer science landscape.

Let's be clear here: DARPA is right that if they are
devoting more attention to "quick reaction" projects that draw on the fruits of earlier science and technology to produce useful prototypes as soon as possible.
then they don't need to fund university research (and in fact academic researchers wouldn't want to work on these problems). Their mistake of course is not seeing the value of long-term fundamental research even now, but you have to ask: when the flagship magazine of the most well known computer science organization cannot be distinguished from a random trade magazine, who's to tell the difference between academic research and corporate software efforts ?

Hat tip to Chandra Chekuri for pointing out the above articles.

Friday, April 01, 2005

Theory Day at Maryland

UMIACS@Maryland is organizing a theory day just after STOC, on May 25. STOC is close to Baltimore this year, so for those who can make it, this would be a nice extra day of activities (that is, if you aren't burned out by talks already!)

Here's the program:

No wifi please, we're Japanese...

Uh-oh:
Todd Ogasawara went to Japan for business and discovered that unlike in the US, where hotels are tripping over themselves to offer WiFi, over there it’s mighty rough trying to find a hotel with any sort of broadband access (whether wired or wireless) for guests. Not that there isn’t WiFi in Japan or anything like that, but apparently most hotels assume that you’ll have some sort of high-speed access via your cellphone and so don’t feel compelled to offer WiFi.
Looks like blogging will be greatly reduced on my upcoming trip :)

Proofs

The Economist has a nice article on what they claim to be a potential new trend in mathematical theorem proving: the use of computers to demonstrate the proof of a theorem. They cite the 4-color theorem and Kepler's conjecture as examples of 'proof by computer' and discuss the pros and cons of such approaches.

The article distinguishes two kinds of computer proof: one where the computer is used to enumerate and check a large number of subcases in a proof, and one where the computer is used to generate a proof via an automated theorem-proving process. The article argues that
It is possible that mathematicians will trust computer-based results more if they are backed up by transparent logical steps, rather than the arcane workings of computer code, which could more easily contain bugs that go undetected.
I am skeptical whether mathematicians will be willing to trust a computer over their own intuition. The dark secret at the heart of mathematics is that most proofs are established by ultimately appealing to the reader's intuition. In fact, it is more likely that I can convince a listener of my proof by intuitive arguments than by pure formal methods. This is a sliding scale of course; my intuitively obvious claim is another person's completely nontrivial lemma, and it is through the give and take among mathematicians in the community that a proof is 'stabilized' or 'confirmed'.
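For readers who haven't seen what "transparent logical steps" look like in practice, here are two tiny machine-checked statements in Lean 4 (my own illustration; the article doesn't mention any particular system):

    -- checked by the Lean kernel, not by appeal to anyone's intuition
    theorem twoPlusTwo : 2 + 2 = 4 := rfl            -- true by computation
    theorem addComm (a b : Nat) : a + b = b + a :=
      Nat.add_comm a b                               -- appeal to a library lemma

Of course, this only shifts the trust onto the kernel and the library; it supplies no intuition, which is precisely the point above.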

Ultimately, it's worth remembering that part of the beauty of mathematics is that rigid formal truths have beautiful intuitive interpretations, and it is these intuitive metaphors that are much more lasting than the proofs themselves.

A quote from the last paragraph of the Economist article is fitting, and would make Timothy Gowers proud:
Why should the non-mathematician care about things of this nature? The foremost reason is that mathematics is beautiful, even if it is, sadly, more inaccessible than other forms of art. The second is that it is useful, and that its utility depends in part on its certainty, and that that certainty cannot come without a notion of proof.



Monday, March 28, 2005

Open Problems

John Langford points us to the COLT 2005 open problems CFP. The COLT open problems appear in the proceedings, a feature that distinguishes them from (for example) the SoCG open problems sessions, which are more informal.

It's worth pointing out that there are open problems, and there are open problems. Anyone who writes a paper can come up with open problems; these are merely the questions you didn't get time to address, and are either too bored or too unmotivated to work on yourself. These are not really that interesting. Good open problems are not necessarily easy, and are not too hard either (P?=NP is an open problem in the true sense of the word, but is not particularly useful as a proposal for this kind of forum).

A good open problem thus has some intrigue, has some surprise, and should tantalize the reader; the solution should appear to be just over the horizon, rather than indistinctly fading away. Any thoughts on good candidates ?

Friday, March 25, 2005

New (to me) book on randomization

As the networks are prone to say during the summer, "If you haven't seen it, it's new to you".

Cambridge University Press has cornered the market on randomized algorithms. After publishing the truly excellent Randomized Algorithms by Rajeev Motwani and Prabhakar Raghavan (ed: wasn't Rajeev your advisor ? me: that has nothing, nothing at all to do with it), they have now published Probability and Computing by Michael Mitzenmacher and Eli Upfal. I caught a brief look at the book the other day, and it seems to have a slightly higher focus on probabilistic tools and techniques and possibly a slightly lesser focus on randomized algorithms, but I could be wrong here.

All in all, looks like a great book to have; it's definitely on my list of books-to-get.

Tuesday, March 22, 2005

Blogging from FSU

where I am visiting Piyush Kumar and giving a talk. On the way into town, I saw this amusing billboard:
The best vitamin for Christians -- B1

The latest issue of Salon (you can read the article if you sit thru an ad) reviews two books on Gödel that were mentioned here earlier: Rebecca Goldstein's Incompleteness and Palle Yourgrau's World Without Time...

Monday, March 21, 2005

New Abel Prize(s)

The 2005 Abel Prize has been awarded to Peter Lax
for his groundbreaking contributions to the theory and application of partial differential equations and to the computation of their solutions
An interesting section from the longer, popularized explanation of his work:
Lax considers himself both a pure and an applied mathematician. His advice to young mathematicians is summarized in 'I heartily recommend that all young mathematicians try their skill in some branch of applied mathematics. It is a gold mine of deep problems whose solutions await conceptual as well as technical breakthroughs. It displays an enormous variety, to suit every style; it gives mathematicians a chance to be part of the larger scientific and technological enterprise. Good hunting!'
I really like the fact that the Abel Prize committee supplies these popular explanations of the work of the awardees. The ACM has a detailed press release for its awardees, but a popular exposition of the work would be a great addition.

In other news, the Abel Prize Committee is funding a new prize called the Ramanujan award. From the news blurb on the Abel Prize site:
The Prize will be awarded annually to a researcher from a developing country less than 45 years of age at the time of the award, who has conducted outstanding research in a developing country. The first winner will be announced in 2005.
The award amount will be $10,000.
An interesting tidbit I heard at lunch: apparently Paul Wolfowitz (Deputy Secretary of Defense, World Bank chief nominee) is the son of a very well-known information theorist and statistician. Jacob Wolfowitz wrote one of the classic books on coding theory.

Wednesday, March 16, 2005

Mmmm, ice cream

From the AIR:
There is, or at least was briefly, an ice cream flavor created in honor of physicist Richard Feynman. Devised by Toscanini's Ice Cream, it was inspired by the story that earlier inspired the name of the classic book Surely You're Joking, Mr. Feynman!
On that note,

The Newton: comforting and familiar, with a rich chocolate sauce. But it ends, and you are sad.

The Heisenberg: Every time you think you can nail it down, it escapes you.

The Einstein: Mindbending, dude !

The Gödel: I could describe the flavor, but I need another ice cream...

Tuesday, March 15, 2005

Clickety click

Ernie points out another knitting mathematician. I have to say that I never thought there could be so many blogs about knitting (see Andrea's blogroll).

Friday, March 11, 2005

Numb3rs Review: "Counterfeit Reality"

...also to be referred to as the 'projective Daubechies' episode.

After a brief hiatus, Numb3rs is back, with another kidnapping episode. As usual, spoilers abound..

Plot Summary: Series of store robberies and two murdered teens lead to a gang of counterfeiters, who kidnapped an artist and forced her into doing note sketches for them. If Don, Charlie, and the rest of their merry men can't find her in time, it's lights out !

This is a good "character development episode", with backstory about Don's love life being revealed. I guess the producers didn't have the courage to go all CSI/Law and Order/Law and Order SVU/Law and Order CI/Law and Order WillYouStopWithAllTheSpinoffs on us and dispense with the characters' backgrounds. Pity...(Yes, I know this is a frequent rant, but ...)

As I always say, if you want character development, don't read this blog ! My heart races when they say 'wavelets', which in fact they did. Local cop refers to a 'video-enhancement algorithm' that Charlie has developed, and I was wondering when wavelets would come up. As it turns out, there are scenes with pictures of what appear to be wavelet basis functions (though they didn't look like Haar wavelets at least).

I have now seen 7 episodes of Numb3rs, and in these seven episodes, Charlie uses statistical modelling, attacks P vs NP, does cryptography, understands advanced number theory, knows civil engineering, is a mean hacker and consults on string theory problems. Not to mention having tenure. So my question is: What could Charlie's field of specialization possibly be ? I mean, what on earth could he have written a thesis in ? The 'professor of applied math' part could cover the modelling and the civil engineering, but the rest ? And does he not know that at certain universities, math department professors wrinkle their noses when a professor of 'applied math' walks by ? I have certain departments in mind, but I'm not naming names ;)

Best/Most inane line of the episode: Charlie's grad student (who appears to be a wavelet expert as well as a combinatoricist), says 'In combinatorics, sometimes we vary the angle of attack'.

In algorithms, sometimes we eat the sugared candy...

Thursday, March 10, 2005

Gödel, Original Sin, and Sausage...

Jordan Ellenberg (he of the countably infinite reading list) has a nice Slate review of a new book about Gödel's theorem (Incompleteness, by Rebecca Goldstein). He makes the important point that the incompleteness theorem did not have the cataclysmic effect on modern mathematics that non-mathematicians were under the impression it did. All it does is destroy a
"Communist takeover of mathematics" in which individuality and intuition would be subjugated, for the common good, to logical rules.
This tidbit is funny.
Yet, Gödel is routinely deployed by people with antirationalist agendas as a stick to whack any offending piece of science that happens by. A typical recent article, "Why Evolutionary Theories Are Unbelievable," claims, "Basically, Gödel's theorems prove the Doctrine of Original Sin, the need for the sacrament of penance, and that there is a future eternity."
To that, I have only one response: the Sacred Sausage Invocation. Amen.


Wednesday, March 09, 2005

The answer...

Here's the answer to yesterday's question:

It becomes clear fairly quickly once you start playing with examples that the thing to do is create some kind of bipartite graph, where the left side is the "bad answer" and the right side is the "good answer". Since picking all the vertices on a side is a valid cover, and you have to pick all of at least one side, it is clear that the smaller side is the optimal solution. So the goal is to make an example with more nodes on the left than on the right, and force the algorithm to pick something from the left each time (if you could direct the edges, to force picking from one side, then it would be easier, but then you've just created a SET COVER instance).

The catch is that just by averaging, if the left side has more nodes than the right, then it seems that the nodes on the right should have higher degree and should be picked first. How does one avoid this ? This is where the "trick" is used. Gronwall's theorem bounds the sum of the divisors of n: roughly, it grows like n ln ln n (times a constant) as n goes to infinity. So here's the construction:

Let 1 = d1 < d2 < ... < dk = n be the divisors of n. Create n nodes on the right side. For each i, create di nodes on the left side. Of these di nodes, connect the first to the first n/di nodes on the right, and the second to the next n/di, and so on. There are (asymptotically) n ln ln n nodes on the left side. The maximum degree on the left side is n/d1 (which equals dk = n), and the maximum degree on the right side is k, since each node on the right connects to only one node in each group on the left. It is easy to check that dk is always greater than k, so the algorithm always picks the first node from the left side. Repeat, use induction, clean your hands, and voila! you end up picking n ln ln n nodes instead of n: a ln ln n approximation gap.
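Here's a quick sketch of the construction in Python (mine, not from Khanh's solution), with greedy ties broken in favour of the left side, which is the adversarial tie-break the argument needs. On n = 360 it ends up picking all 1170 left nodes, while the optimal cover is the 360 nodes on the right:

    # Left nodes are ('L', d, j): the j-th node in the group for divisor d.
    # Right nodes are ('R', i) for i in range(n).
    def divisors(n):
        return [d for d in range(1, n + 1) if n % d == 0]

    def build_bad_instance(n):
        adj = {}                  # vertex -> set of neighbours
        def add_edge(u, v):
            adj.setdefault(u, set()).add(v)
            adj.setdefault(v, set()).add(u)
        for d in divisors(n):
            block = n // d        # each node in this group covers a block of n/d right nodes
            for j in range(d):
                for i in range(j * block, (j + 1) * block):
                    add_edge(('L', d, j), ('R', i))
        return adj

    def greedy_vertex_cover(adj):
        adj = {v: set(nbrs) for v, nbrs in adj.items()}
        cover = []
        while any(adj[v] for v in adj):
            # highest-degree vertex; break ties toward the left side
            v = max(adj, key=lambda u: (len(adj[u]), u[0] == 'L'))
            cover.append(v)
            for w in list(adj[v]):
                adj[w].discard(v)
            adj[v].clear()
        return cover

    n = 360
    print(len(greedy_vertex_cover(build_bad_instance(n))), "picked; optimal is", n)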

This is not the best gap possible. With a bit more work along the same lines you can push it up to c ln n. I'll leave that as an exercise ;).


Tuesday, March 08, 2005

An old question, a new approach

In a graph, say that a vertex v covers an edge e if v is incident to e. VERTEX COVER, one of the original 6 NP-Complete problems, asks if you can find a set of vertices of minimum size that covers all edges in the graph (for purists, the decision problem is to check, for a given G and value B, if there is a vertex cover of size at most B).

One of the most obvious attacks on this problem is to use a greedy approach. Pick the vertex of highest degree, and remove all the adjacent edges. Repeat this till all edges have been removed. Clearly the set of vertices picked form a cover. But how good is this ?

The answer is, 'not very good', and one of the first homework problems in a graduate class on algorithms is to show why, by coming up with a counter-example. There are some standard (but not trivial) constructions that can be used, and you can amuse yourself with thinking about this.

However, this is not the point of this post. Amit Chakrabarti is teaching a graduate algorithms class up at Dartmouth, and Khanh Do Ba, a junior taking the class, came up with an ingenious solution that I at least had not heard of before. I will not tell you the solution just yet, but I will give you the hint that Amit gave me when he told me about it: the student used Gronwall's Theorem.

Happy pondering. Tune in tomorrow for the answer.


Sp311ing

No Numb3rs review this week because my TiVo flaked (er, did its job), but in its stead, I present to you,
Sp311ing
Courtesy of the most bizarre blog on the web: Fafblog !

“Oh no, a crime!”
“Maybe we can solve it – with spelling!”
“Applesauce. A-P-P-L-E-S-A-U-C-E. Applesauce.”

“So, you are the American dog who spells for the Great Satan, eh? Well, now you will spell for the jihad!”
“Never!”

“We’ll never disarm this bomb in time! It’s impossible”
“You can’t spell ‘impossible’ without ‘possible’! Let’s go, team!”
“Australopithecine. A-U-S-T-R-A…”

Monday, March 07, 2005

Alpher, Bethe, Gamow

Hans Bethe, Nobel laureate and giant of theoretical physics, died today. I will leave commentary about his life and work to the physicists, but will recall one fairly well-known anecdote regarding him and George Gamow.

From Quark Soup:
Here is Gamow's own account, from his popular science book _The Creation of the Universe_:

"The results of these calculations were first announced in a letter to _The Physical Review_, April 1, 1948. This was signed Alpher, Bethe, and Gamow, and is often referred to as the 'alphabetical article.' It seemed unfair to the Greek alphabet to have the article signed by Alpher and Gamow only, and so the name of Dr. Hans A. Bethe (_in absentia_) was inserted in preparing the manuscript for print. Dr. Bethe, who received a copy of the manuscript, did not object, and, as a matter of fact, was quite helpful in subsequent discussions. There was, however, a rumor that later, when the alpha, beta, gamma theory went temporarily on the rocks, Dr. Bethe seriously considered changing his name to Zacharias.

"The close fit of the calculated curve and the observed abundances is shown in Fig. 15, which represents the results of later calculations carried out on the electronic computer of the National Bureau of Standards by Ralph Alpher and R. C. Herman (who stubbornly refuses to change his name to Delter.)"


First we wrestle, then we ooze

In which the academic, fresh from a bout of rabid-squirrel-wrestling, oozes off into the primordial swamp of knowledge:

How is the modern professoriate like a slime mold? Consider this description from Steven Johnson:

[T]hey solve problems by drawing on masses of relatively stupid elements, rather than a single, intelligent "executive branch." They are bottom-up systems, not top-down. They get their smarts from below. In a more technical language, they are complex adaptive systems that display emergent behavior. [Emergence: Scribner, 2001]

Read the article: it's interesting.

(HT: Three-Toed Sloth)


Saturday, March 05, 2005

updating web pages..

Seen on a mailing list I subscribe to:
for many academics, the practice of keeping
one's web page up to date with both their long term and short
term research interests is something that gets prioritized just
below rabid-squirrel-wrestling.
And just for the heck of it:

Friday, March 04, 2005

Wake up, ACM.

The ACM is the mothership of all CS endeavours. It runs most of our conferences, has the excellent digital library, and dishes out some of the most prestigious awards in the field. But those of us who use the ACM's services on a regular basis know that its affinity for technology is, how shall I say this politely, somewhat on the weaker side. Try buying anything from the ACM store online and you'll see what I mean.

Current case in point: RSS. A recent issue of CACM covered the blogosphere in great detail, so clearly the ACM is familiar with the notion of RSS. However, there are still no RSS feeds for tables of contents for journals. ACM instead provides a TOCalert email service, something that was in vogue quite some years ago.

There is now a standard for RSS feeds from journals. Nature Publishing Group, which publishes most of the Nature:XX journals, provides RSS feeds for all of its journals. The IEEE provides feeds for its journals (though not in the PRISM standard format). The ArXiv provides feeds on all subject categories. Why can't the ACM ?
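(As an aside, consuming such a feed takes all of a few lines; here's a sketch using the feedparser Python library and arXiv's per-category feed, whose URL I'm quoting from memory. So it's not as if this is hard to provide.)

    import feedparser

    # arXiv's cs.CG category feed; URL from memory, adjust as needed
    feed = feedparser.parse("http://arxiv.org/rss/cs.CG")
    for entry in feed.entries[:10]:
        print(entry.title, entry.link)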

The reason I bring this up has to do with CiteULike. One of the things you can do with CiteULike is track RSS feeds of many journals archived there. There are numerous CS journals, including many in the area of machine learning. However, since the ACM doesn't provide RSS feeds, these can't be tracked inside CiteULike.

Even if ACM can't provide feeds in a standard format, any feed would be useful. Maybe some of you readers know people who know people who know people.... inside ACM ?

Wednesday, March 02, 2005

On Grace Murray Hopper

When Karen was in high school, in a special school for the sciences, Rear Admiral Grace Murray Hopper came to speak with them:
Her messages to us:

If you want to get something done, do it first. Apologize (if apologies are necessary) later.

Don't let anyone tell you you can't succeed, especially if you are a woman.

She gave us lengths of wire cut to the distance light travels in a nanosecond, and handed out pepper grain sized wires that she called picoseconds for the same reason.

Her talk was less about doing computer science and more about having the will to succeed against all odds. Very inspirational.

CS Awards

This year's Kanellakis prize goes to Robert Schapire and Yoav Freund for their work on boosting. They had previously won the Gödel Prize in 2003 for their paper "A Decision Theoretic Generalization of On-Line Learning and an Application to Boosting".

Jennifer Rexford has won the Grace Murray Hopper Award.

With a mixture of pride and regret I point out that all three are AT&T Labs alumni.

If you want to know more about what these awards are given for, visit the ACM Awards page. The Kanellakis prize commemorates an achievement that spans theory and practice (has theoretical heft and has been shown to be effective in practice). The Gödel Prize acknowledges a single journal paper that has had great impact in the last seven years. The Grace Murray Hopper Award is a 'young achiever' award, given to researchers below the age of 35. The Turing Award, of course, is the computer science "Nobel" for lifetime achievement. The Knuth prize is the theory lifetime achievement award.

Gödel, Turing and Knuth need no introduction. Paris Kanellakis was a professor at Brown who died tragically in a plane crash in 1995. He was one of the pioneers in database theory. Grace Murray Hopper was one of the first programmers of modern computers, the first compiler designer, and is credited with coining the term 'bug' to describe an error in a program.

Now that's the way to say it:

A university is not a Walmart
-- Daniel Lemire.

Monday, February 28, 2005

a different set of numbers...

350+ posts

87,000+ words

70,000+ visits

113,000+ page views.

....

and a tired yet satisfied blogger, 1 year old, and plugging away.

Ah, the good old days...

BoingBoing points us to a dictionary of swearing, in every language imaginable. Reading the Hindi section brings back memories of the good ol' days at IIT Kanpur: I can assure you that every phrase on the list could be heard almost every day. In fact, we spent the first month of freshman year receiving instruction in all possible variations on these insults (being IITans, we were quick learners ;))

[ brief pause while I am overcome with nostalgia...]

It should come as no surprise to the Indians who read this that the Punjabi section was one of the top swear picks. Yiddish also seems like a good choice, but I am surprised at the inclusion of French in the top picks list.


The Road to Reality

Roger Penrose's new book, The Road To Reality, is out. Peter Woit has a detailed review of the book (I am waiting for mine to show up), but in the meantime, consider this amusing quote from the NYT review:
The perfect reader for ''The Road to Reality,'' I fantasized, would be someone comfortable traversing the highest planes of abstract reasoning, yet who had missed some of the most important landmarks of scientific history -- a being, perhaps, from another place and time.
I don't know. It sounds an awful lot like a mathematician to me.

Sunday, February 27, 2005

SODA Outtakes: Market Equilibria

I know I know. By the time I am done with SODA outtakes, it will be time for SODA 2006. This is why we need more algorithms bloggers ! Machine learning blogs appear to be popping up faster than you can say 'expectation-maximization', but there is no corresponding growth spurt in algorithms. Sigh....

This outtake is much easier, because there is a well-written survey I can point to. The topic is market equilibria, or the problem of determining when equilibrium sets in as traders exchange goods at various prices, and how fast this can be determined.

More formally, the problem is: given an initial allocation of goods to traders (a "supply" vector, where each "dimension" represents a single commodity), and a "preference" function for each trader (I have a high affinity for gold and cookies, you have an aversion to silver, but really like beer), determine a set of prices for goods, and a "demand" vector for each trader so that if each trader sells what they have and buys what they want (at the given prices), their preference is maximized.

Here, the preference function component for any commodity is a concave function of its amount. Note that you want a concave function in order to model "diminishing returns". In addition, having concave utilities helps ensure unique optimal solutions (because the average of two solutions has better utility).

There are also sanity conditions (you can only buy with whatever money you have from selling, and the total amount of goods is fixed - this does not model producer economies).
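As a toy illustration of one ingredient of the problem (my own made-up numbers, not from the survey): given a fixed set of prices, a single trader's optimal demand under a concave utility is just a small convex program, here solved with scipy purely for illustration.

    # One trader, three goods, logarithmic (hence concave) utility.
    import numpy as np
    from scipy.optimize import minimize

    prices = np.array([4.0, 1.0, 2.0])      # prices for gold, cookies, beer
    endowment = np.array([1.0, 2.0, 0.0])   # the trader's initial supply vector
    prefs = np.array([3.0, 1.0, 2.0])       # affinity weights in the utility

    budget = prices @ endowment             # you can only spend what you raise by selling

    def neg_utility(x):
        return -np.sum(prefs * np.log(x + 1e-9))   # concave: diminishing returns

    res = minimize(neg_utility, x0=np.ones(3), bounds=[(0, None)] * 3,
                   constraints=[{"type": "ineq", "fun": lambda x: budget - prices @ x}])
    print(np.round(res.x, 3))               # the demand vector; roughly [0.75, 1.0, 1.0] here

The equilibrium problem is what you get when every trader does this simultaneously and the prices have to clear the market; that coupling is where the difficulty (and the nice techniques in the survey) comes in.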

This SIGACT News survey by Codenotti, Pemmaraju and Varadarajan lays out the terrain of market equilibria very nicely, outlining the history, the techniques (some beautiful methods) and what is currently known.

It's a good example of an area that has been studied extensively (in economics) but where the computer science angle brings in questions that were not necessarily of concern earlier: issues of computational feasibility, efficiency, and approximation that are characteristic of the algorithmic way of approaching a problem.


Friday, February 25, 2005

Numb3rs Review: "Sabotage"

A nice thrilling episode, with very little math (but see below). As always, major spoiler alert.

Plot summary: Series of trains are derailed. At each site, a paper containing the same set of numbers is left behind. Everyone thinks it's a code, but it actually isn't (at least not in the traditional sense). Some good old sleuthing yields the answer.

It was funny when the NTSB official asked Charlie if he knew any cryptography. Although technically, isn't cryptanalysis what code breakers do ? Larry the loony physicist had some hilarious quotes:
  • "Our evening was primal, in the carnal sense, not the mathematical."
  • "The math department, the least libidinous place on campus"
The last fact is sadly not too far from the truth, but wait ! Mathematicians are cool (via Mathpuzzle)

The only real math nuggets came at the end, when Charlie was trying to explain to the female FBI agent that math appears in nature. His example: The Fibonacci series ! Bzzttt...

As it turns out, Rudbeckia Hirta just today posted a note about the exaggerated claims concerning the Fibonacci series; that the Greeks referred to the "golden ratio" (no), that the pyramids and the Parthenon somehow magically encode this ratio (only if you round off heavily), and even how the ratio is supposed to be aesthetically pleasing (no evidence). Keith Devlin has more on this.

Oh well: if Dan Brown can do it...

p.s I understand that we need some kind of romantic interest to give the series some spice (although CSI and Law and Order manage perfectly well without it), but why on earth are they setting up an advisor-student relationship ? Allegedly at some distant point in the past this used to be fairly common in the humanities, but nowadays ? My only question is: how long before this becomes a "Big Issue" in the academic blogosphere ?

Meta: recent comments

I added a new feature so you can see the recent comments posted on the site (this is handy if you posted a note and are looking to see if there was a response). It's on the left, below the "Previous Posts" section.

Comments were broken for a while because I forgot to update the hack that allows me comments in the first place. They should be working again.

SODA Outtakes: Lloyd's algorithm and analysing heuristics

Uri Feige's invited talk at SODA 05 was on 'Rigorous Analysis of Heuristics for NP-hard Problems'. His main argument (at least in the initial portion of his talk) was that it's a good idea to analyze known heuristics for problems, because they work well in practice, and a rigorous analysis will shed light on why.

[Aside: in the theoryCS community, a "heuristic" usually refers to a procedure that has no provable performance guarantees; this could mean either no clear running time analysis, or no clear quality of optimization guarantee. The term "algorithm" is usually reserved for a procedure that has well-defined steps and ends in a well-defined state after a definite (finite) number of steps. One will occasionally find the term 'algorithm' used to refer to a heuristic in non-theory areas; beware !]

Doing such analysis, although maybe not as "sexy" as coming up with a new algorithm, is important also for "sociological" reasons. Although it might technically be correct to argue that a particular heuristic is not reliable and thus should not be used, the resolute theoretician is often faced with a mountain of empirical evidence in support of said heuristic, compared with the possibly complex and hard-to-understand algorithm that she proposes.

Some of the best known heuristics show up in clustering, especially since this is a problem that transcends many different application domains, including areas that are not as familiar with theoretical algorithmic methods. One of the best known methods for clustering with k centers is Lloyd's algorithm (or the k-means algorithm), which works essentially as a discrete variant of expectation-maximization. The algorithm itself is very simple:
  1. Choose k initial "centers"
  2. Assign each point to its nearest center.
  3. Compute the k centroids of points assigned to a given center. These are the new k "centers".
  4. Go back to step 2.
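A minimal sketch of these steps (mine, in numpy, stopping when the assignment stops changing):

    import numpy as np

    def lloyd(points, k, seed=0):
        rng = np.random.default_rng(seed)
        centers = points[rng.choice(len(points), k, replace=False)].astype(float)  # step 1
        assign = None
        while True:
            dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
            new_assign = dists.argmin(axis=1)                        # step 2: nearest center
            if assign is not None and np.array_equal(new_assign, assign):
                return centers, assign                               # no change: local minimum
            assign = new_assign
            for j in range(k):                                       # step 3: recompute centroids
                members = points[assign == j]
                if len(members):
                    centers[j] = members.mean(axis=0)
            # step 4: loop

    pts = np.random.default_rng(1).normal(size=(200, 2))
    centers, labels = lloyd(pts, k=3)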
It is known that, in practice, Lloyd's algorithm hits a local minimum fairly quickly (though not a global one: k-means is NP-hard). However, there was no rigorous analysis of how quickly this happens. In a SODA 2005 paper, Har-Peled and Sadri examine the complexity of k-means. They show that:
  1. In 1D, and in d-D (on the grid), the number of iterations of the k-means algorithm is polynomial in the number of points, the dimension, and the "spread" (the ratio of max distance to min-distance).
  2. Even in 1D, there are example point sets on which the k-means algorithm takes a linear number of iterations to converge.
What they also show is that slight modifications to the k-means heuristic result in algorithms that converge in time independent of the dimension (and with much better upper bounds). The simpler of their two schemes can be described as follows: in Step 2 of the original approach, merely move one "misclassified" point to its true cluster, rather than all.

They go on to present empirical evidence in support of their claim that these algorithms generate similar quality answers (in similar time) to k-means. This part is still a work in progress: there are very fast implementations of k-means that they don't compare against (though the same data structures might be used to speed up their approach as well).

It is not clear to me that their paper answers to any degree the question of why Lloyd's algorithm works as well as it appears to. However, the fact that they can prove bounds for their alternate algorithms suggests that maybe this is a line of attack to take when analyzing Lloyd's method. These alternates are also quite simple, and so could conceivably be used as k-means replacements if one wished guaranteed running time bounds.

One final point: the authors have their code available online. Would that more did the same !

Tuesday, February 22, 2005

CiteULike

Yaroslav Bulatov, who runs one of the new machine learning blogs, points us to CiteULike, a very intriguing site. Some of you may have used web-based bookmark sites like Furl or del.icio.us, and may have even used the "social bookmarking" aspect of these sites.

CiteULike is a similar site, for archiving papers. Here's how it works. If you find a paper you are interested in, you click on a bookmark link, and the paper is added to a web-based collection under your name. However, what makes this unique is that if the paper link is on a "well known" site like the ACM DL, Citeseer, arxiv, or many other places, bibliographic information is automatically added to the entry (this is because the server parses pages etc and extracts the necessary bibliographic information)

Subsequently, when you want to extract bib entries for all these papers, one click gets you a .bib file for all the papers you want. This is very handy (for example) when you are doing a literature search and reading/downloading papers by the dozen.

You can also associate tags with the entries if you choose to. This gets into the "social bookmarking" aspect of CiteULike: the idea being that if you tag a bunch of papers as being related to "artgallery", and so does someone else, you will learn what the other person has found and vice versa. Personally, I am either too old-fashioned or too clueless to have made effective use of this feature, but the archiving/.bib features above are enough for me to try it out.

Like any respectable web-based service, there are RSS feeds for any view of the collection (by tag, by person, by personal tag), and there are also watch lists, so if you want to keep track of when a new page (tag or person) adds a new paper, you can do so.

Here's my paper feed. It will be an interesting experiment.

If you think your committee is rough on you...

From Jim Holt's New Yorker article on Einstein and Gödel:
Einstein’s conclusions were the product of pure thought, proceeding from the most austere assumptions about nature. In the century since he derived them, they have been precisely confirmed by experiment after experiment. Yet his June, 1905, paper on relativity was rejected when he submitted it as a dissertation. (He then submitted his April paper, on the size of atoms, which he thought would be less likely to startle the examiners; they accepted it only after he added one sentence to meet the length threshold.)
The article dwells at length on Gödel's interpretation of time (something I had mentioned a while ago). Read the whole thing: it's fascinating.

(HT: Libertarianoid)


Monday, February 21, 2005

A great new resource for molecular modelling

From WorldChanging:
ZINC -- which, in the free software tradition of recursive acronyms, stands for ZINC Is Not Commercial -- is a free database of compounds for "virtual screening." That is, ZINC provides 3D models of chemical compounds in a standard "docking" format used in chemistry and biochemistry software, allowing researchers to assemble and test new chemical compositions on their computers.
This is really great. When I was doing my Ph.D, I mucked around with molecular structures for docking and matching purposes, and I had access only to structures that our collaborators at Pfizer provided us. Having such a large database to play with will really help geometry folks who do molecular modelling, not to mention all the medical researchers who probably already use this.

Things we take for granted..

Richard Hofstadter's classic book Anti-intellectualism in American Life is a fascinating sweep through American history that provides a good answer for those who wail 'How could someone as (apparently) stupid as Bush get elected ?'. Another book, The Development of Academic Freedom in the United States, written with Walter P. Metzger, should be at the top of my reading list, based solely on this review by Scott McLemee. An excerpt from the review (emphasis mine):

Condensing 500 pages into five paragraphs is a fool's errand, but here goes anyway.

The belief that only the community of scholars has the final authority to determine what counts as valid research or permissible speech has deep roots in the history of the university, going all the way back to its origins in medieval Europe. But it was extremely slow to develop in colonial and antebellum America, which had few institutions of higher learning that were anything but outgrowths of religious denominations.

How we take certain things for granted...

Saturday, February 19, 2005

Numb3rs Review: "Prime Suspect"

This was the most mathematically oriented episode of all: can't get much better than the Riemann Hypothesis ! Scott Aaronson has a review as well: go and read that first !

I agree with much of what Scott says. I think the problem is a little different though: some of the random phrases attributed to mathematicians (which mathematician refers to an open problem as a "math problem" ?) actually make sense if a non-mathematician uses them, and when Don (the FBI agent) uses them, it doesn't sound incongruous at all. It's just that the writers need to switch gears a little when writing dialogue for the mathematical characters.

They've done a good job getting a sense of the mathematics; just less of a sense of the mathematician.

Plot Summary: Mathematician claims to have proof of Riemann Hypothesis. Hackers kidnap daughter, demanding proof (and algorithm), which they will then use to crack cryptography.

Major beef: there is no direct link between a solution to the Riemann Hypothesis and factoring. It is undoubtedly true that a solution will shed a lot of light on the structure of primes, but it's not the kind of immediate corollary that is commonly implied in the popular press (take this, for example).

But credit goes where credit is due: I was wondering how they would make the conceptual jump from a theorem to an actual algorithm (assuming said connection). At least they tried to make it credible, with Charlie pointing out that an algorithm could take years to develop.

A point that I didn't really consider, but both Scott and my wife noticed, was that by using the Riemann Hypothesis, the plot line introduced the idea of a problem that has remained unsolved for 150 years, and that it's good for people to realize that "math problems" can take a while to solve. In fact, this point is reinforced when Don expects Charlie to solve it in a few hours :).

Trivia note: the mathematician father was played by Neil Patrick Harris, the "original" child prodigy from Doogie Howser, M.D.

Tuesday, February 15, 2005

Erik Demaine and Origami

The NYT has a fabulous article on Erik Demaine and origami. It is worth noting that
  • The article has no quotes that make him look like an absent-minded professor.
  • It does a beautiful job of capturing both the art and the science of origami.
  • It has no real clangers in terms of the mathematical descriptions.

SoCG 2005 Results

Are out: 41/140 papers in (no I didn't submit anything).

The entire list is here.

Monday, February 14, 2005

"Laws of Chance"

One would think I would let this pass without comment, but oh well...

A recent /. post links to an article on RedNova News about a "Random Event Generator" that purports to predict world events:
One of these new technologies was a humble-looking black box known as a Random Event Generator (REG). This used computer technology to generate two numbers - a one and a zero - in a totally random sequence, rather like an electronic coin-flipper.
Now it's all very well to generate randomness from a computer, and in fact there have been suggestions that certain kinds of electrical fluctuations on a chip can be recorded and used for randomness generation. But to claim that this is "totally random" by virtue of using the magical "computer technology" is a bit of a stretch.

But wait, it gets worse:

The pattern of ones and noughts - 'heads' and 'tails' as it were - could then be printed out as a graph. The laws of chance dictate that the generators should churn out equal numbers of ones and zeros - which would be represented by a nearly flat line on the graph. Any deviation from this equal number shows up as a gently rising curve.

I presume they mean that they plot |H-T| against n. But as any book on probability will tell you, you do expect to see random fluctuations (this is after all a binomial process), and the standard deviation is around sqrt(n), so bumps are to be expected: a flat line is what would be suspicious.
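A quick simulation (a sketch, nothing fancy) makes the point: for a fair coin flipped a million times, an |H-T| in the hundreds is entirely unremarkable.

    import random

    n = 1_000_000
    for trial in range(5):
        heads = sum(random.getrandbits(1) for _ in range(n))
        print(abs(2 * heads - n))   # |H - T|: typically a few hundred, i.e. on the order of sqrt(n)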

During the late 1970s, Prof Jahn decided to investigate whether the power of human thought alone could interfere in some way with the machine's usual readings. He hauled strangers off the street and asked them to concentrate their minds on his number generator. In effect, he was asking them to try to make it flip more heads than tails.

It was a preposterous idea at the time. The results, however, were stunning and have never been satisfactorily explained.

Well it sounds like the results haven't been stated either. What exactly were these "stunning" results ? Ah, here we go:

Again and again, entirely ordinary people proved that their minds could influence the machine and produce significant fluctuations on the graph, 'forcing it' to produce unequal numbers of 'heads' or 'tails'.

According to all of the known laws of science, this should not have happened - but it did. And it kept on happening.

"All the known laws of science" can't explain an unequal number of heads and tails. In other words, if I just managed to get a streak of 10 heads with a coin, I'm psychic ?

Aieee......

Saturday, February 12, 2005

Theory folks in the press

The NYT has an article about spamblocking in which the 1992 work of Dwork and Naor on junk email is cited:
In a paper titled "Pricing Via Processing, or Combating Junk Mail ," two computer scientists, Cynthia Dwork and Moni Naor, came up with a way to force a sender to pay every time a message was sent - payment not in money, but in time, by applying the computer's resources to a computational puzzle, devised on the fly for that particular message.

[...snip...]

Ms. Dwork and her colleagues have named this the Penny Black Project.
This is remarkably prescient: in 1992 I was grateful to have access to email, let alone worry about junk mail. On an irrelevant note, I am impressed that the NYT has started providing real links in their articles.
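For flavor, here is a toy sketch of the idea in the hashcash style (my illustration; the actual Dwork-Naor paper uses different, moderately hard "pricing functions"): the sender must burn CPU time to find a nonce tied to the message, and the receiver can check the stamp essentially for free.

    import hashlib
    from itertools import count

    def stamp(message: str, bits: int = 20) -> int:
        # costly: roughly 2^bits hash evaluations on average
        for nonce in count():
            digest = hashlib.sha256(f"{message}:{nonce}".encode()).hexdigest()
            if int(digest, 16) >> (256 - bits) == 0:   # leading `bits` bits are zero
                return nonce

    def verify(message: str, nonce: int, bits: int = 20) -> bool:
        # cheap: a single hash evaluation
        digest = hashlib.sha256(f"{message}:{nonce}".encode()).hexdigest()
        return int(digest, 16) >> (256 - bits) == 0

    msg = "cheap pharmaceuticals, act now!"
    nonce = stamp(msg)
    print(nonce, verify(msg, nonce))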


SODA Outtakes: Approximation algorithms for metric embeddings.

One of the most exciting areas of algorithms in the past ten years has been the study of metric embeddings. Given two metric spaces, can we "embed" one in the other (which basically means construct a 1-1 map from points in the first to points in the second) so that distances are roughly preserved ? The seminal paper of Linial, London and Rabinovich demonstrated how to use a low-distortion embedding of a metric space into l1 to obtain a good approximation algorithm for the sparsest cut problem (the distortion of the embedding is the approximation ratio achieved).

This set off a flurry of research into questions of the form "What is the best distortion for embedding spaces of type A into spaces of type B". There is no way I can survey all the results in this area; a new CRC book chapter by Indyk and Matoušek should be able to cover most of the terrain, and the excellent book by Deza and Laurent has much of the technical background. Everything started (in a sense) from the result by Bourgain that any metric space can be embedded into l1 with distortion O(log n) (albeit in log^2 n dimensions).
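To make "distortion" concrete, here is a small utility (mine, not from any of the papers mentioned): given two finite metrics as distance matrices and a 1-1 map between their points, it returns the product of the worst expansion and the worst contraction, which is the usual definition.

    import numpy as np

    def distortion(dA, dB, f):
        # f maps index i in the first space to index f[i] in the second
        expansion, contraction = 0.0, 0.0
        n = len(dA)
        for i in range(n):
            for j in range(i + 1, n):
                ratio = dB[f[i]][f[j]] / dA[i][j]
                expansion = max(expansion, ratio)
                contraction = max(contraction, 1.0 / ratio)
        return expansion * contraction

    # the uniform metric on 3 points, mapped onto a unit-length path 0-1-2
    dA = np.ones((3, 3)) - np.eye(3)
    dB = np.array([[0, 1, 2], [1, 0, 1], [2, 1, 0]], dtype=float)
    print(distortion(dA, dB, [0, 1, 2]))   # 2.0: the pair (0, 2) gets stretched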

Other questions that were studied along the way:
  • what is the minimum distortion that one must face when embedding A into B (metrics derived from expanders proved to be useful gadgets in this regard)
  • Can one reduce the dimensionality of a space without distorting distances too much ? This question has numerous implications for all kinds of problems in clustering, data mining, database search, you name it...
  • Are there general subclasses of metrics that have low-distortion embeddings ? A fundamental open question here involves metrics on planar graphs. It is still open as to whether they can be embedded in l1 with constant distortion (again, this has implications for sparsest cut in planar graphs)
  • Metric "Ramsey theory": if I can't embed the entire metric space, can I embed a large fraction of the points/edges with smaller distortion ? Yair Bartal just gave a nice talk at AT&T on partial embeddings (the "edge" variant).
A set of questions that is now getting more attention is the "natural" approximation angle. Although we know that any metric can be embedded in an lp space with distortion O(log n), what about a particular metric given to us ? Might one not be able to do better ? And can we come up with an algorithm to determine the BEST distortion for this particular metric ?

It was shown early on that one can find the optimal distortion into l2 (up to additive error) in polynomial time using semi-definite programming. For l1, the corresponding question is NP-hard. What I have been seeing is a slew of interesting recent papers that explore hardness results and approximation algorithms for embedding problems:
  • Kenyon/Rabani/Sinclair (STOC 2004): They consider bijections between two n-point metric spaces, and present a constant factor approximation in the case when the metrics are derived from a line (as long as OPT is small).
  • Papadimitriou/Safra (SODA 2005): They show that the minimum distortion bijection between two 3D point sets is hard to approximate to within a factor of 3.
  • Badoiu, Dhamdhere, Gupta, Rabinovich, Raecke, Ravi, and Sidiropoulos (SODA 2005): A collection of results on embedding metrics in 1D and 2D. One main result is an O(sqrt(n))-approximation for embedding a metric on the line.
  • Badoiu, Chuzhoy, Indyk, Sidiropoulos (STOC 2005): I don't have a copy of the paper, but the title is indicative enough: "Low-Distortion Embeddings of General Metrics Into the Line"
This is not to say that these problems have not been studied earlier. As far back as 1993 for example, Farach-Colton, Kannan and Warnow had looked at embedding a metric into an ultrametric. However, the current focus on such problems is an interesting new direction in the theory of metric embeddings.



p.s usual disclaimers apply: this is a blog post, not a survey, and hence is opinionated and far from comprehensive. Pointers to relevant add-ons are welcome.

Turing Train Terminal

When I was an undergrad, we were once given an assignment to write out a universal Turing machine as a state diagram. I have heard of others being asked to program one (using gturing?).

This is by far a more enjoyable variant.



Via BoingBoing.

Friday, February 11, 2005

Numb3rs review: "Structural Corruption"

Bill Gasarch has a second review of the P vs NP episode over at Lance's place. He speculates as to what the characters might have meant by saying that P vs NP is "unsolvable". My favorite:
Charlie was using proofs that naturalize, which won't work because of the results of Razborov and Rudich (Judd Hirsch: So use unnatural techniques)
Last night's episode dealt with calculations involving the structural integrity of a building. Not much actual math shown in this episode, and there were some curious points:
  • Charlie is a "professor of applied mathematics", and yet in the first episode crazy string theorist wants his help with some calculations. I guess string theory could be viewed as applied math in some circles...
  • When trying to figure out why the dead student had come to him for help with calculations (is he a math prof or a human calculator?), Charlie says "We can only infer that it had something to do with math". Uh, yeah......
  • Charlie, "professor of applied math" and string theorist par excellence, can also mock up a structural integrity demo, with GUI, 3D graphics and all, in one night. Sheesh; give him a job at Google, people...
There was one sequence that sounded like a reference to Benford's Law. Charlie finds out worker names and union IDs have been manufactured, by detecting that the numbers on the IDs follow suspicious patterns. This is actually plausible, in that Benford's law has been used to detect fraud and fake accounting.
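For the curious: Benford's law says the leading digit d of "naturally occurring" numbers shows up with probability log10(1 + 1/d), so fabricated figures with uniformly random leading digits stand out. A quick sketch (mine, nothing to do with the episode) checks the law against powers of 2, which are known to follow it:

    from math import log10

    benford = {d: log10(1 + 1 / d) for d in range(1, 10)}
    leading = [int(str(2 ** i)[0]) for i in range(1, 1001)]
    observed = {d: leading.count(d) / len(leading) for d in range(1, 10)}

    for d in range(1, 10):
        print(d, round(benford[d], 3), round(observed[d], 3))   # the two columns track closely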

Thursday, February 10, 2005

Football and Dynamic Programming.

What's not to like !

On Ph.D Theses

When I was in grad school, it was the conventional "wisdom" that the people who read your Ph.D thesis consist of
  • your advisor
  • your committee
  • your friends (to see if they've been acknowledged)
  • your parents (who don't necessarily read it so much as keep it as a trophy/proof that you can now get a "real job").
But the truth is that Ph.D theses can serve a valuable purpose. Over the years, I have come to appreciate a well written thesis in an area that I am new to. Not necessarily because of the results; those I can often dig up from journal papers. What I found most useful is the introduction, because in a well written thesis, the introduction lays out the terrain of the problem: what is known, and what is not. What is folklore and what results follow trivially. What papers are key to understanding the evolution of the area, and what their contributions are. Basically a lot of what a good survey has to offer, but more expansive, more leisurely, and often more useful.

So if there are any graduate students reading this, my recommendation to them is: take the time to expound on your area in an introduction. You are (or should be) an expert on this topic by now, and your voice is authoritative enough, even if you may not feel that way. The reward, of having someone read your work and appreciate it, is something that can be surprisingly elusive.

Using the GPU

GPU (Graphics card) programming is one of my pet projects, and it has been interesting to see the gradual harnessing of the GPU as a fast streaming coprocessor. A new direction is an even tighter coupling with a computer's underlying GUI and window management system.
  • OS X: "Quartz Extreme uses a supported graphics card built into your Mac to relieve the main PowerPC chip of on screen calculations. This dramatically improves system performance, making Panther much more responsive."
  • Longhorn: "Longhorn will feature a graphics subset called WGF (Windows Graphic Foundation). Its goal is to unify 2D and 3D graphics operation in one and will bring 2D Windows drawing and 3D operations together. Nowadays, 3D is done using a Direct X subset with the current version 9.0c. Longhorn will also use 3D menus and 3D interfaces and will require at least Shader 2.0 compliant cards, so it will make the graphics card an important part of your PC."
  • And late to the party, but getting there, Unix/X (via Thane Plambeck): " David Reveman, who became a Novell employee a couple of weeks ago, has been writing a new X server on OpenGL/Glitz called Xgl. Because Xgl is built on GL primitives it naturally gets the benefit of hardware acceleration. For example, window contents get rendered directly into textures (actually they get copied once in video memory for now), and so you get the benefit of the 3d hardware doing the compositing when you move semi-opaque windows or regions around."
Graphics cards will get even cheaper, since everyone will need one. Intel will probably put even more effort into the little joke they call their onboard graphics accelerator. Interesting times....

Wednesday, February 09, 2005

Heal thyself ?

Adam Buchsbaum observes that although Google Maps is the coolest thing since, I don't know, Google Scholar, there is one thing it doesn't do: if you type an address in the regular Google toolbar, it asks you if you want to plot a map using Mapquest or Yahoo !

C'mon guys, you can do better than that.

p.s. Take the tour. Try searching for 'free wifi 19130': excellent !!

Monday, February 07, 2005

"To chop a tea kettle"

A while back, Lance and Ernie and I were discussing the relative lack of questions at theory talks (especially at conferences). Having just returned from SODA, I can confirm that questions were few and far between at talks (although the odd talk here and there generated fairly animated discussions).

Scott McLemee at Inside Higher Ed has an interesting column on modern academic interactions, and how they relate to the blogosphere. The article is worth reading in its entirety (as is the Crooked Timber note that pointed me there), but I couldn't resist excerpting the following, on discussions at conferences:

At conferences, scholars would stand up and read their papers, one by one. Then the audience would "ask questions," as the exercise is formally called. What that often meant, in practice, was people standing up to deliver short lectures on the papers they would have liked to have heard, instead -- and presumably would have delivered, had they been invited.

Hypothetically, if everyone on a panel read one another's papers beforehand, they might be able to get some lively cross-talk going. This does happen in some of the social sciences, but it seems never to occur among humanities scholars. The whole process seems curiously formal, and utterly divorced from any intent to communicate. A routine exercise, or rather perhaps an exercise in routinism. A process streamlined into grim efficiency, yielding one more line on the scholar's vita.

[...snip...]

The inner dynamic of these gatherings is peculiar, but not especially difficult to understand. They are extremely well-regulated versions of what Erving Goffman called "face work" -- an "interaction ritual" through which people lay claim to a given social identity. Thanks to the steady and perhaps irreversible drive to "professionalization," the obligation to perform that ritual now comes very early in a scholar's career.

And so the implicit content of many a conference paper is not, as one might think, "Here is my research." Rather, it is: "Here am I, qualified and capable, performing this role, which all of us here share, and none of us want to question too closely. So let's get it over with, then go out for a drink afterwards."

And the title of my post ? Read the whole article...

Damn !

Unlike some, I prefer to watch the game

Friday, February 04, 2005

Numb3rs Review: "Vector"

This episode had very little math, except a reference to a standard model of disease spread called an SIR model.
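The episode only name-drops the model, but the dynamics are simple enough to write down. Here is a minimal sketch (with parameter values invented purely for illustration) of the standard SIR equations dS/dt = -βSI, dI/dt = βSI - γI, dR/dt = γI, integrated by forward Euler:

```python
# A minimal sketch of the SIR model: susceptible / infected / recovered fractions
# under simple mass-action mixing. Parameter values are made up for illustration.
def sir(beta, gamma, s0=0.99, i0=0.01, days=160, dt=0.1):
    """Forward-Euler integration; returns a list of (time, S, I, R) samples."""
    s, i, r = s0, i0, 1.0 - s0 - i0
    history = [(0.0, s, i, r)]
    for step in range(int(days / dt)):
        ds = -beta * s * i
        di = beta * s * i - gamma * i
        dr = gamma * i
        s, i, r = s + ds * dt, i + di * dt, r + dr * dt
        history.append(((step + 1) * dt, s, i, r))
    return history

# The basic reproductive number R0 = beta / gamma; the outbreak takes off when R0 > 1.
trajectory = sir(beta=0.3, gamma=0.1)
peak_time, _, peak_infected, _ = max(trajectory, key=lambda row: row[2])
print("peak infected fraction %.2f around day %.0f" % (peak_infected, peak_time))
```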

I have to say that the writers spent a lot more time getting the math details right than getting the biology right. Why on earth would you compare geometric structures of DNA, when you could just look at the sequence ? My wife the biologist had two comments on this, her first episode.
  • "This is not a TV show, it's a horror movie"
  • "I've been beaten senseless"
(as an aside, she refused to watch Jurassic Park when it first came out). She also assures me that the pictures of the DNA model are wrong (the alleged differences would be on the inside of the helix, not the outside). They did mumble something about RNA, but RNA wouldn't look like the pictures they had on screen.

There was an amusing incident with the eccentric string theorist where he forgot which way he was headed. I could be wrong, but I think this refers to an anecdote involving Hermann Weyl at Princeton.

There was more Numb3rs PR in the math press. Keith Devlin (the Math Guy) had an interesting article on Numb3rs at MAA Online. He does an occasional segment on Weekend Edition with Scott Simon. His interviews make for interesting listening (a recent one was on the continuum hypothesis).
Tags: ,

And the wrath of God thunders down upon the unbelievers

Ernst Mayr, "leading evolutionary biologist of the 20th century", died today. From the NYT obit:

He was known as an architect of the evolutionary or modern synthesis, an intellectual watershed when modern evolutionary biology was born. The synthesis, which has been described by Dr. Stephen Jay Gould of Harvard as "one of the half-dozen major scientific achievements in our century," revived Darwin's theories of evolution and reconciled them with new findings in laboratory genetics and in field work on animal populations and diversity.


Polynomial hierarchy collapse

Another thing that puzzled me about the Unknotting result was the corollary:

If Unknotting is NP-complete, then PH collapses to the second level.

As it turns out, this is a trivial corollary of an older result by Boppana, Håstad and Zachos that if co-NP has short interactive proofs, then PH collapses to the second level. Specifically, if Unknotting is NP-complete, then Knotting is co-NP-complete, and since Knotting is in AM = IP(2), this implies that co-NP is contained in AM, and everything comes crashing down.
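Written out as one chain (just a restatement of the paragraph above, with BHZ abbreviating the Boppana-Håstad-Zachos result):

```latex
% The chain of implications, restated compactly (BHZ: if coNP is contained in AM,
% then PH collapses to the second level).
\begin{align*}
\text{Unknotting NP-complete} &\implies \text{Knotting coNP-complete}\\
 &\implies \mathrm{coNP} \subseteq \mathrm{AM} \quad (\text{since Knotting} \in \mathrm{AM})\\
 &\implies \mathrm{PH} = \Sigma_2^p = \Pi_2^p \quad (\text{by BHZ}).
\end{align*}
```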

It would be nice if there were some way to augment the Complexity Zoo to make such results easier to find.

Unknotting (proof translation)

Jeff Erickson kindly provides a translation in the comments to my previous post. I reproduce the full comment here, lightly edited for links and with minor corrections/additions marked in italics:

The paper relies on a construction (due to Haken?) of a triangulated 3-dimensional manifold M(K) from a given knot K, such that M(K) is homeomorphic to the 3-sphere if and only if K is unknotted. Specifically, M(K) consists of two copies of S3-T(K), glued along their boundaries (which are toruses), where T(K) is a tubular neighborhood of the knot K.

So one algorithm for deciding whether a knot K is trivial is to construct M(K) and check whether M(K) is the 3-sphere. The 3-sphere recognition problem is in NP. Hass et al. used a slightly different technique to prove that UNKNOTTING is in NP, but both NP-proofs rely on something called "normal surface theory", introduced by Kneser in 1929 and used by Haken in the early 1960s to boil everything down to integer programming(!). (For a brief outline of normal surface theory, read the introduction of Benjamin Burton's Ph.D thesis)

Hara, Tani, and Yamamoto describe an interactive proof protocol for KNOTTING based on this characterization, and similar to an interactive protocol for graph non-isomorphism. The prover is trying to convince the verifier that her knot is nontrivial. The verifier constructs two triangulated 3-manifolds—the manifold M(K) described above, and a particular triangulation S(K) of the 3-sphere. After randomly permuting the vertex labels, the verifier presents the prover with one of the two triangulations, chosen uniformly at random, without telling the prover which it is. The prover then (supposedly) tells the verifier whether his given manifold is the 3-sphere.

(1) If the prover says that S(K) is the 3-sphere, the verifier learns nothing.

(2) If the prover says that S(K) is NOT the 3-sphere, the verifier knows he's lying and assumes the knot is trivial.

(3) If the prover says that M(K) is the 3-sphere, the verifier assumes the knot is trivial.

(4) if the prover says that M(K) is NOT the 3-sphere, the verifier assumes the knot is nontrivial.

If the knot is trivial, then no prover can convince this verifier with probability better than 1/2. If the knot is nontrivial, an honest prover can convince this verifier with probability 1/2. Running this protocol twice inflates the probabilities to 1/4 and 3/4.

END PROOF

As an aside, the original two-round interactive proof for graph nonisomorphism is a beautiful example of the power of interactive proofs. It was first demonstrated by Goldreich, Micali and Wigderson in 1986, and I reproduce the proof here in full (for definitions, refer to the original paper).

1. Given two graphs G1, G2 on n vertices, the verifier randomly picks i ∈ {1,2} and a random permutation π of [1..n].
2. The verifier then sends π(Gi) to the prover.
3. The prover returns j ∈ {1,2}, and the verifier accepts iff j = i.

Proof:
If G1 and G2 are nonisomorphic, the prover can figure out which is which by comparing the graph sent by the verifier to G1 and G2, and thus will always return the correct answer.

If G1 and G2 are isomorphic, the prover cannot tell which graph was permuted, and thus can "fool" the verifier with probability at most 1/2.
QED.
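To see the protocol in action, here is a toy simulation (nobody's actual code; the names and graphs are made up): the all-powerful prover is faked with a brute-force isomorphism check, which is only feasible for very small graphs. On non-isomorphic inputs the honest prover wins every round; on isomorphic inputs it can do no better than a coin flip.

```python
# A toy simulation of the GMW graph non-isomorphism protocol on tiny graphs.
import itertools
import random

def relabel(edges, perm):
    """Apply a vertex relabeling (a dict) to a graph given as a frozenset of edges."""
    return frozenset(frozenset(perm[v] for v in e) for e in edges)

def isomorphic(g, h, n):
    """Brute force: try all n! relabelings of g and see if any equals h."""
    return any(relabel(g, dict(enumerate(p))) == h
               for p in itertools.permutations(range(n)))

def protocol(g1, g2, n, rounds=200):
    """Fraction of rounds in which the (honest) prover identifies the verifier's coin."""
    wins = 0
    for _ in range(rounds):
        i = random.choice([1, 2])                           # verifier's secret choice
        perm = dict(enumerate(random.sample(range(n), n)))   # random relabeling
        h = relabel(g1 if i == 1 else g2, perm)
        iso1, iso2 = isomorphic(h, g1, n), isomorphic(h, g2, n)
        if iso1 != iso2:
            j = 1 if iso1 else 2      # non-isomorphic inputs: prover can tell them apart
        else:
            j = random.choice([1, 2]) # isomorphic inputs: prover is reduced to guessing
        wins += (j == i)
    return wins / rounds

n = 4
path = frozenset({frozenset({0, 1}), frozenset({1, 2}), frozenset({2, 3})})  # 4-vertex path
star = frozenset({frozenset({0, 1}), frozenset({0, 2}), frozenset({0, 3})})  # 4-vertex star
print(protocol(path, star, n))                                     # non-isomorphic: ~1.0
print(protocol(path, relabel(path, {0: 1, 1: 0, 2: 3, 3: 2}), n))  # isomorphic: ~0.5
```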

Thursday, February 03, 2005

Unknotting...

With this I hope to start a series of posts on papers/ideas that I found interesting at SODA 2005. One of the side effects of no wireless access at SODA was that I couldn't comment on talks as they happened, but it (hopefully) meant that I had time to reflect on what I found interesting.

Exhibit 1, in no particular order, is the following result, by Hara, Tani and Yamamoto:

Unknotting is in AM ∩ co-AM

Unknotting is one of the more fundamental problems in knot theory. If we think of a (tame) knot as a (really tangled) loop in three dimensions, then much of simple knot theory focuses on the question: Given two knots, are they equivalent ? (where equivalence is expressed in terms of homeomorphisms).

The trivial knot (or the "unknot") is the simple circle in 3D, and thus the unknotting problem is:

UNKNOTTING: Given a knot, is it equivalent to the unknot ?

It was first shown in 1961 by Haken that Unknotting was decidable. After a brief gap of 36 years, Hass, Lagarias and Pippenger showed that Unknotting was in fact in NP (!), and conjectured that it was in NP ∩ co-NP. Other results along the way showed that Genus (is the genus of a knot less than g?) was in PSPACE, and that ManifoldGenus (a generalization to a given arbitrary manifold) was NP-complete.

The new result speaks for itself. Formally, the authors show that Knotting (the complement of Unknotting) is in AM, by presenting a two-round interactive protocol for it. They also mention a corollary that if Unknotting is NP-Complete, then PH collapses to the second level. If I had paid better attention in my complexity class I would have figured out how this follows, but alas... I'd be grateful for any enlightenment. (Update: here it is)

The tragedy for me is that although it is clear that this result is interesting, the proof itself is rather impenetrable, requiring a knowledge of combinatorial topology above and beyond what I can muster up. This is why translators are needed ! (Update: here it is)

Knot theory is a good example of a (relatively arcane) subfield of mathematics that has suddenly become profoundly important in "real life". For more on this, read what well-known theoretical computer scientist John Baez has to say :).

On a more frivolous note, you can do knot art ! Calling our resident mathematical knot knitter...

Frank Harary, 1921-2005.

I wish I had known this earlier. Frank Harary, a giant of graph theory best known for his textbooks, died on Jan 4, 2005. There is more information at his homepage. (via mathpuzzle.com)

Timothy Gowers and the study of mathematics

When I was in high school I used to watch the Carl Sagan series 'Cosmos', and was fascinated by how he brought physics to life. I even remember the episode where he describes the physical consequences of the natural constants taking values closer to everyday human scales (suppose the speed of light were 50 mph, etc).

Ever since then, I've looked for expositors who could extract the same kind of beauty out of mathematics and computer science, distilling its essence in a way that might seem obvious to those of us who practice it, but that allows the core concepts to be revealed to a non-technical audience in a way that captures the imagination.

Timothy Gowers, the Fields Medalist, is well known for his expository articles on a variety of topics. He gave a speech in 2000 in Paris at an event sponsored by the Clay Mathematics Institute. The speech was titled 'The Importance of Mathematics', and I highly encourage watching it.

What struck me the most about this lecture was his eloquent defense of mathematics, and how he conveyed a sense of the aesthetic beauty of the field (why mathematicians refer to a theorem as 'beautiful'). With beautiful examples, he systematically laid out the landscape of a mathematical life, not in terms of topics, but in terms of how mathematical work is done.

He firmly dismissed the idea of mathematics needing to be useful (something we have had our own controversies over in theoryCS), and distinguished between the two statements
(1) Mathematics is not a useful subject
(2) A typical mathematician does not try to be useful.
arguing that (1) is false even though (2) is usually true, and that it should be this way.

A simple way of expressing some of the ideas in his lecture would be:
Mathematics is surprisingly useful, because complexity arises from simplicity and beauty reveals itself to be important.
I just bought his book 'Mathematics: A Very Short Introduction', which brings out these ideas in much more detail, and also puts to rest some of the usual canards associated with mathematics (the claim that no one does good mathematics after 30, the lack of women, etc).

To some degree, the description of the daily activities of an algorithms researcher is similar to that of a mathematician, but in many ways they are different. We need a similar exposition for theoryCS !

Scian Melt #7

The latest delectable mix of things scientific and Indian is up.
