Sunday, January 30, 2022

Regan Lipton celebrates my 1000th blog post and random thoughts this inspires

Ken Regan emailed me recently asking if our software could tell how many blog posts I had done (not how many Lance+Bill had done). We didn't know how to do that, but he managed it anyway. Apparently he was more interested in this question than either Lance or I was. 

But the answer was interesting: my 1000th post on Complexityblog was about Betty White dying at just the wrong time to be in those `we say goodbye to...' articles that appear CLOSE to the end of the year. (I don't know why, but I think the fact that my 1000th post was on Betty White is just awesome!) The post is here. He was asking because he thought (correctly) that I was around 1000 and wanted to do a tribute blog post to me (actually it was done by Lipton and Regan- more on that later). And indeed they did do the post; it's here.

RANDOM THOUGHT ONE

While preparing it Ken asked me about my papers.  This brings up the more general question: When looking at your old work what do you think? Common reactions are

1) Gee, I was smarter then. That was very clever. OH, now I remember, my co-author did it. 

2) Gee, I was dumber then. I could do that argument so much better now. 

3) Why did I care about Muffins so much to write a book about it? (Replace Muffins with whatever you worked on and book with the venue it appeared in.) 

Item 3 is probably the most common: as a graduate student one works on things without really having a vision of the field (though the advisor can mitigate this), so what you work on may seem odd later on. And one's tastes can change as well. 

RANDOM THOUGHT TWO

Ken and Dick write actual posts together. I find that amazing! By contrast, the extent of Lance's and my interactions about the blog is: 

a) Someone died. Which of us should do the blog obit? Or should we get a guest blogger? (Whenever Lance phones me on the telephone I answer "Who died?" and usually someone did.) 

b) Which of us does the April Fools Day post this year (we usually alternate, or do we)?

c) I plan on doing 2 posts close together- a question and an answer- so when do you NOT plan on blogging, so I can do that?

d) Someone proved X. Which of us should blog? Or should we get a guest blogger?

e) Establish a general rule for the year, like Bill posts Sundays, Lance Thursdays.

f) I ask Lance for technical help on the blog. How do you get rid of the white background when I cut and paste?

g) Sometimes one of us wants commentary on a post we are working on- but that is rare. Though I asked Lance for this post and he added a few things to this list.

h) Sometimes I look at one of his posts before it goes out and offer commentary, or vice versa. Also rare.

i) Lance writes the end of year posts, but always with my input. We jointly choose the theorem of the year.

j) The very rare joint posts.

k) If we happen to be in the same place at the same time, like Dagstuhl, we'll do a typecast capturing our conversations. In the past we've also had podcasts and vidcasts together.



Wednesday, January 26, 2022

A Failure to Communicate

The screenwriter Aaron Sorkin wrote an article on prioritizing "Truth over Accuracy". He tells stories from his movies The Social Network and Being the Ricardos, where he moves away from accuracy to get to the truth of a situation.

My friend and teacher, the late William Goldman, said of his Academy Award-winning screenplay for All the President's Men, "If I'm telling the true story of the fall of the President of the United States, the last thing I'm going to do is make anything up." I understand what he meant in context, but the fact is, as soon as he wrote "FADE IN," he'd committed to making things up. People don't speak in dialogue, and their lives don't play out in a series of scenes that form a narrative. Dramatists do that. They prioritize truth over accuracy. Paintings over photographs.

As scientists we focus on accuracy, as we should in our scientific publications. However, being fully accurate can distract from the "truth", the underlying message we want to convey, particularly in the title, abstract and introduction of our papers. 

Even more so when we promote our research to the public. A science writer once lamented to me that scientists would focus too much on the full accuracy of the science and the names behind it, even though neither serves the reader well.

This reminds me of the recent Netflix movie Don't Look Up, which satirizes scientists trying to communicate an end-of-the-world event to an untrusting society. I wish it were a better movie, but it is still worth watching just to see Leo DiCaprio and Jennifer Lawrence play scientists frustrated in their attempts to communicate a true existential crisis to the government and the general public. 

So how should we as scientists try to frame our messaging to get people onboard, particularly when we say things they don't want to hear? Most importantly, how do scientists regain trust in a world where trust is in short supply? Perhaps we should paint more and photograph less.

Sunday, January 23, 2022

Personal Reflections on Dick Lipton in honor of his 1000th blog/75th bday.

Is it an irony that Lipton's 1000th post and 75th bday are close together? No. It's a coincidence. People use irony/paradox/coincidence interchangeably. Hearing people make that mistake makes me literally want to strangle them. 

The community celebrated this milestone by having  talks on zoom in Lipton's honor. The  blog post by Ken Regan that announced the event and has a list of speakers is here. The talks were recorded so they should be available soon. YEAH KEN for organizing the event! We may one day be celebrating his 2000th blog post/Xth bday. 

I will celebrate this milestone by writing on how Lipton and his work have inspired and enlightened me. 


1) My talk at the Lipton zoom-day-of-talks was on the Chandra-Furst-Lipton (1983) paper (see here) that sparked my interest in Ramsey Theory, led to a paper I wrote that improved their upper and lower bounds, and led to an educational open problem that I posted on this blog, which was answered. There is still more to do. An expanded version of the slide talk I gave on the zoom-day is here. (Their paper also got me interested in Communication complexity.) 

2) I read the De Millo-Lipton-Perlis (1979) paper (see here) my first year in graduate school and found it very enlightening. NOT about program verification, which I did not know much about, but about how mathematics really works. As an ugrad I was very much into THEOREM-PROOF-THEOREM-PROOF as the basis for truth. This is wrongheaded for two reasons: (1) I did not see the value of intuition, and (2) I did not realize that the PROOF is not the END of the story, but the BEGINNING of a process of checking it- many people over time have to check a result. DLP woke me up to point (2) and (to a lesser extent) point (1). A scary thought: most results in math, once published, are never looked at again. So there could be errors in the math literature. However, the important results DO get looked at quite carefully. Even so, I worry that an important result will depend on one that has not been looked at much...Anyway, a link to a blog post about a symposium about DLP is here.

3) The Karp-Lipton theorem is: if SAT has poly-sized circuits then PH collapses (see here). It connects uniform and non-uniform complexity. This impressed me but also made me think about IF-THEN statements. In this case something we don't think is true implies something else we don't think is true. So--- do we know something? Yes! The result has been used to get results like 

                                                   If GI is NPC then PH collapses.

This is evidence that GI is not NPC. 
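For the record, in symbols the standard modern form of both statements (with the collapse to the second level of PH) is:

                                                   SAT \in P/poly  \Rightarrow  PH = \Sigma_2^p

                                                   GI is NP-complete  \Rightarrow  PH = \Sigma_2^p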

4) Lipton originally blogged by himself and a blog book came out of that. I reviewed it in this column. Later it became the  Lipton-Regan blog, which also gave rise to a book, which I reviewed   here.  Both of these books inspired my blog book. This is a shout-out to BOTH Lipton AND Regan.

5) Lipton either thinks P=NP or pretends to since he wants people to NOT all think the same thing. Perhaps someone will prove P NE NP while trying to prove P=NP. Like in The Hitchhiker's Guide to the Galaxy where they say that to fly, you throw yourself on the ground and miss. I took Lipton's advice in another context: While trying to prove that there IS a protocol for 11 muffins, 5 students where everyone gets 11/5 and the smallest piece is 11/25, I wrote down what such a protocol  would have to satisfy (I was sincerely trying to find such a protocol) and ended up proving that you could not do better than 13/30 (for which I already had a protocol). Reminds me of a quote attributed to Erdos: when trying to prove X, spend half your time trying to prove X and half trying to prove NOT(X).

6) Lipton had a blog post (probably also a paper someplace) about using Ramsey Theory as the basis for a proof system (see here). That inspired me to propose a potential randomized n^{log n} algorithm for the CLIQUE-GAP problem (see here). The comments showed why the idea could not work-- no surprise, as my idea would have led to NP contained in RTIME(n^{log n}). Still, it was fun to think about and I learned things in the effort. 





Wednesday, January 12, 2022

On the California Math Framework

Guest post by Boaz Barak and Jelani Nelson

In a recent post, Lance Fortnow critiqued our open letter on the proposed revisions for the California Mathematics Framework (CMF). We disagree with Lance’s critique, and he has kindly allowed us to post our rebuttal here (thank you Lance!).

First, let us point out the aspects where we agree with both Lance and the authors of the CMF. Inequality in mathematical education, and in particular the obstacles faced by low-income students and students of color, is a huge problem in the US at large and California in particular. As a Black mathematician, I (Jelani) found that this portion of the CMF’s introduction particularly resonated with me:

Girls and Black and Brown children, notably, represent groups that more often receive messages that they are not capable of high-level mathematics, compared to their White and male counterparts (Shah & Leonardo, 2017). As early as preschool and kindergarten, research and policy documents use deficit-oriented labels to describe Black and Latinx and low-income children’s mathematical learning and position them as already behind their white and middle-class peers (NCSM & TODOS, 2016).

We agree with the observation that bias in the public education system can have a negative impact on students from underrepresented groups. Where we strongly oppose the CMF though is regarding their conclusions on how to address this concern. 

The CMF may state that they are motivated by increasing equity in mathematics. However, if we read past the introduction to the actual details of the CMF revisions, then we see they suffer from fundamental flaws which, if implemented, we believe would exacerbate educational gaps, and in particular make it harder for low-income students and students of color to reach and be successful in college STEM.

You can read our detailed critique of the CMF, but the revisions we take issue with are:

  1. Recommendation to drop the option of Algebra I in the 8th grade
  2. Recommendation to offer (and in fact push and elevate above others) a “data science” pathway for high school education as an alternative to the traditional Algebra and Geometry curriculum. While data science can be a deep and important field, teaching it to students without a math background will necessarily be shallow. Indeed, the proposed data science courses focus on tools such as using spreadsheets etc., and do not provide mathematical foundations.

1 and 2 make it all but impossible for students that follow the recommended path to reach calculus (perhaps even pre-calculus)  in the 12th grade.  This means that such students will be at a disadvantage if they want to pursue STEM majors in college. And who will be these students? Since the CMF is only recommended, wealthier school districts are free to reject it, and some already signalled that they will do so. Within districts that do adopt the recommendations, students with means are likely to take private Algebra I courses outside the curriculum (as already happened in San Francisco), and reject the calculus-free “data science” pathway. Hence this pathway will amount to a lower-tier track by another name, and worse than now, students will be tracked based on whether their family has the financial means to supplement the child’s public education with private coursework.

Notably, though the CMF aims to elevate data science, we’ve had several data science faculty at the university level express disapproval of the proposal by signing our opposition letter, including a founding faculty member of the Data Science Institute at UCSD, and several others who are directors of various undergraduate programs at their respective universities, including four who direct their universities' undergraduate data science programs (at Indiana University, Loyola University in Chicago, MIT, and the University of Wisconsin)! 

One could say that while the framework may hurt low-income or students of color who want to pursue STEM in college, it might help other students who are not interested in STEM. However, interest in STEM majors is rapidly rising, and with good reasons: employment in math occupations is projected to grow much faster than other occupations. With the increasing centrality of technology and STEM to our society, we urgently need reforms that will diversify these professions rather than the other way around.

As a final note, Lance claimed that by rejecting the CMF, we are “defending the status quo”. This is not true. The CMF revisions are far from the “only game in town” for improving the status quo in mathematics education. In fact, unlike these largely untested proposals, there is a history of approaches that do work for teaching mathematics for under-served populations. We do not need to change the math itself, just invest in more support (including extracurricular support) for students from under-resourced communities. For example, Bob Moses’ Algebra Project has time and again taken the least successful students according to standardized exams, and turned them into a cohort that outperformed state averages in math. One of our letter’s contact people is Adrian Mims, an educator with 27 years of experience, whose dissertation was on  "Improving African American Achievement in Geometry Honors" and who went on to found The Calculus Project, a non-profit organization creating a pathway for low-income students and students of color to succeed in advanced mathematics. 

To close, a critique of the proposed CMF revision is not a defense of the status quo. Even if change is needed, not all change is good change, and our letter does make some recommendations on that front, one of which is a matter of process: if a goal is to best prepare Californian youth for majors in data science and STEM more broadly, and ultimately careers in these spaces, then involve college-level STEM educators and STEM professionals in the Curriculum Framework and Evaluation Criteria Committee.

Sunday, January 09, 2022

Math problems involving Food

A few people emailed me a math article on arXiv about cutting a pizza, and since I wrote the book (literally) on cutting muffins, they thought it might interest me. It did, though perhaps not in the way they intended. I got curious about math problems that involve food. Here are some:

The Muffin Problem. See my book (here), or my website (here)

The Candy Sharing Game. See this paper  (here).

Sharing a pizza. See this paper (here)

Cake Cutting. See  this book (here) or google  Fair Division  on amazon

Chicken McNugget Problem. See this paper (here)

The Ham Sandwich  Theorem. See this paper (here)

Spaghetti Breaking Theorem. See this paper (here)

Perfect Head on a Beer. See this paper (here)

A smorgasbord of math-food connections, see this pinterest posting (here)

And of course the question that has plagued mankind since the time of Stonehenge: 

                     Why do Hot Dogs come in packs of 10 and Hot Dog buns in Packs of 8 (here)


All of these problems could have been stated without food (The Chicken McNugget Problem is also Frobenius's Problem) but they are easier to understand with food.
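To make the food-free connection concrete, here is a tiny Python sketch (my own toy example, not from any of the papers above) for the classic McNugget pack sizes 6, 9, 20; it finds the largest quantity you cannot order exactly, i.e., the Frobenius number, which is 43:

def unreachable(packs=(6, 9, 20), limit=200):
    # Quantities up to limit that are NOT a nonnegative combination of the pack sizes.
    reachable = {0}
    for n in range(1, limit + 1):
        if any(n - p in reachable for p in packs if n - p >= 0):
            reachable.add(n)
    return [n for n in range(1, limit + 1) if n not in reachable]

print(max(unreachable()))   # 43, the Frobenius number of 6, 9, 20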

I am sure I missed some. If you know of any other food-based math problems, leave a comment.




Sunday, January 02, 2022

Did Betty White die in 2021?/Why do people have their `end-of-the-year' lists before the end of the year?

 I am looking at Parade Magazine's issue whose cover story is

                                          We say goodbye to the stars we lost in 2021.

The date on the magazine is Dec 19-26. Betty White is not in it. Neither is Desmond Tutu. Why not? Maybe they did not die in 2021. No, that's not it. They died after the magazine appeared. They also won't be in the issue a year from now whose cover story will be 

                                          We say goodbye to the stars we lost in 2022.

Why do magazines and TV shows have their end-of-the-year stuff before the end of the year? Because they want to beat the competition? Because they all do it, so it's a race-to-the-bottom? Tradition? 

This blog does the same--- we already posted our end-of-the-year post. Every year I worry that someone will prove P=NP or P NE NP between Dec 24 and Jan 1.  We don't have the Betty-White-Problem since the end-of-the-year post is based on when we blogged about it, not when it happened. So if  theorist X died on Dec 27 then we would do the blog obit in Jan, and would mention it in the end-of-the-year post for the next year. (This happened with Martin Kruskal who died on Dec 26, 2006.) 

Why do we have our end-of-the-year post before the end of the year? Tradition! That might not be a good reason.  

ADDED LATER: Ken Regan in the comments points out that Betty White DID die in 2022 if you use AWE: Anywhere on Earth!

This leads to the following question:

a) If a celebrity dies on Dec 31 but it's Jan 1 in some time zones, then do they get to be in the

WE SAY GOODBYE TO THE STARS

articles for the Jan 1 year? 

b) Is there a general rule on this? I doubt it, and I doubt it comes up a lot. I noticed about 20 years ago that celebrities dying between Dec 24 and Dec 31 don't make those lists, and I have never seen a Dec 31 case before, though I am sure there have been some.

c) Note that this `bad to die in that zone' really only applies to minor celebrities. Betty White and Desmond Tutu did get proper attention when they died.  



Thursday, December 23, 2021

Complexity Year in Review 2021

The pandemic hampered many activities but research flourished with a number of great results in complexity. Result of the year goes to

Locally Testable Codes with constant rate, distance, and locality by Irit Dinur, Shai Evra, Ron Livne, Alexander Lubotzky and Shahar Mozes
and independently in
Asymptotically Good Quantum and Locally Testable Classical LDPC Codes by Pavel Panteleev and Gleb Kalachev

They achieved the seemingly impossible, a code that can be checked with constant queries, constant rate, constant distance and constant error. Here's Irit's presentation, a Quanta article and an overview by Oded Goldreich. Ryan O'Donnell presents the Panteleev-Kalachev paper, which also resolved a major open question in quantum coding theory.

Other great results include Superpolynomial Lower Bounds Against Low-Depth Algebraic Circuits by Nutan Limaye, Srikanth Srinivasan and Sébastien Tavenas, The Complexity of Gradient Descent by John Fearnley, Paul W. Goldberg, Alexandros Hollender and Rahul Savani, Slicing the Hypercube is not easy by Gal Yehuda and Amir Yehudayoff, and The Acrobatics of BQP by Scott Aaronson, DeVon Ingram, and William Kretschmer. The latter paper answers a 16-year old open question of mine that suggests you cannot pull out quantumness like you can pull out randomness from a computation.

In computing overall, the story continues to be the growth in machine learning and the power of data. We're entering a phase where data-driven programming often replaces logic-based approaches to solving complex problems. This is also the year that the metaverse started to gain attention. Too early to know where that will take us, but the virtual space may become as disruptive as the Internet over the next decade, and its potential effect on research and education should not be ignored. In the next pandemic, we may wonder how we survived earlier pandemics without it.

The NSF might be going through some major changes and significant increases, or not, especially with the Build Back Better bill on hold. The Computing Research Policy Blog can help you through the morass. 

We remember Matthew Brennan, Benny Chor, Alan Hoffman, Arianna Rosenbluth, Walter Savitch, Alan Selman, Bob Strichartz and Stephen Wiesner.

We thank our guest posters Paul Beame, Varsha Dani, Evangelos Georgiadis, Bogdan Grechuk, David Marcus and Hunter Monroe.

In May I posted on emerging from the pandemic. Then came Delta. Now comes Omicron pushing us back online next month. I hope Pi-day doesn't bring us the Pi-variant.

Wishing for a more normal 2022.

Friday, December 17, 2021

Fifty Years of P vs. NP and the Possibility of the Impossible


I have a new article, Fifty Years of P vs. NP and the Possibility of the Impossible, to mark the anniversary of the 1971 publication of Steve Cook's seminal paper, a month too late, in the January 2022 Communications of the ACM.

Initially Moshe Vardi asked me to update my 2009 CACM survey The Status of the P versus NP Problem. The P vs NP problem hasn't changed much but computing has gone through dramatic changes in the last dozen years. I try to view P vs NP through the lens of modern optimization and learning, where we are heading toward a seemingly impossible Optiland (a play on Impagliazzo's Pessiland), where we can solve many of the toughest NP-complete problems in practice and yet cryptography remains unscathed.

CACM produced a slick companion video to the article.

Fifty Years of P Versus NP and the Possibility of the Impossible from CACM on Vimeo.

Sunday, December 12, 2021

Did Lane Hemaspaandra invent the Fib numbers?

 (I abbreviate Fibonacci by Fib throughout. Lane Hemaspaandra helped me with this post.) 

We all learned that Fib invented or discovered the Fib Numbers:

f_0=1,

f_1=1, and

for all n\ge 2, f_n = f_{n-1} + f_{n-2}.
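For concreteness, here is a minimal Python sketch (my own illustration, not from any of the sources below) that generates the sequence with the indexing above, f_0 = f_1 = 1:

def fib_list(n):
    # returns [f_0, ..., f_n] with f_0 = f_1 = 1 and f_k = f_{k-1} + f_{k-2}
    seq = [1, 1]
    for k in range(2, n + 1):
        seq.append(seq[k - 1] + seq[k - 2])
    return seq[:n + 1]

print(fib_list(10))   # [1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89]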

We may have learned that they come up in nature (NOT true, see here) or that they were important in mathematics (questionable--see this blog post here which says no, but some comments give good counterarguments). You also learned that Fibonacci was the mathematician who first studied them. Also not true!  This one surprised me. 

1) I came across this blog post: here that says they were invented by Hemachandra first. Wow--I then recalled that Lane Hemaspaandra's birth name was Lane Hemachandra (he married Edith Spaan and they are now both Hemaspaandra). So naturally I emailed him to ask how a 20th-century person could invent something earlier than 1170. He told me that a picture of him in the basement ages while he stays young. 

2) It would be nice to say OH, let's call them Hemachandra numbers (would that be easier than convincing the world to use tau instead of pi? See The Tau Manifesto) and let students know that there were people other than Europeans who did math back then. But even that story is not as simple as it seems. Lane emailed me this article: here that tells the whole muddled story. (In what follows I leave out the accents.)

Virahanka seems to be the formulator of the Fib recurrence, though not quite the numbers. His motivation was Sanskrit Poetry.  He did this between 600 and 800 AD.

Gopala, in work prior to 1135, was aware of Virahanka's work. In particular he knew about the inductive rule. But he also set the initial values and generated numbers, so he was the first to have the sequence we now call the Fib numbers. His motivation was Sanskrit Poetry. 

Hemachandra in 1150 also formulated them, independently.  His motivation was Sanskrit poetry.

(I will learn some Sanskrit poetry the next time I teach Discrete Math so I can give the students an application of the material!)

So does Virahanka win? Hardly:

Acarya Pingala's writings from the 5th or 6th century BC (YES- BC!) indicate that he knew about the Fib numbers in the context of (you guessed it!) Sanskrit poetry. 

3) I would support changing the name of the Fib Numbers to the Pingala numbers. This is both good and bad news for Lane:

a) Bad news in that he does not get a sequence of numbers that shares his pre-marriage name.

b) Good news in that if I settled on Hemachandra numbers then Lane and Edith would have to decide if 0 or 1 or 2 of them want to change their name to Hemachandra. I would guess not--too complicated. Plus one name change in a life is enough. 

4) The story (especially  the articles I pointed to) shows just how complicated history can get. Even a straightforward question like:

Who first formulated the Fib Numbers?

might not have a well-defined answer. Perhaps this is the wrong question, since if people formulate the concept independently of each other, they should all get some credit. Even if the authors are 1000 years apart. 

Side note: Independent Discovery may be harder to assert now since, with the web,  Alice could have seen Bob's paper so it may be hard to call Alice's discovery independent. As I have mentioned before on this blog, my students have a hard time with the notion of Cook and Levin coming up with NP-completeness independently  since  surely one would have posted it and the other would have seen it. An era before posting was possible! Unimaginable to them. Sometimes even to me. 

Wednesday, December 08, 2021

Defending the Status Quo

When the Wall Street Journal's editorial board and the New York Post endorse your efforts, that should ring warning bells.

Several members of the theory and mathematics communities have written and endorsed an Open Letter on K-12 Mathematics that attacks the proposed revisions to the California Mathematics Framework. I have mixed feelings about these efforts. 

Certainly the CMF has its issues, and the FAQs protest too much. But the letter goes too far in the other direction, arguing mainly for the status quo that worked well for those who signed the letter, very few of whom have significant experience in K-12 education. The open letter allows for only incremental change unlikely to lead to any significant improvements.

Before you sign the letter, take a look at the CMF introduction

To develop learning that can lead to mathematical power for all California students, the framework has much to correct; the subject and community of mathematics has a history of exclusion and filtering, rather than inclusion and welcoming. There persists a mentality that some people are “bad in math” (or otherwise do not belong), and this mentality pervades many sources and at many levels. Girls and Black and Brown children, notably, represent groups that more often receive messages that they are not capable of high-level mathematics, compared to their White and male counterparts. As early as preschool and kindergarten, research and policy documents use deficit-oriented labels to describe Black and Latinx and low-income children’s mathematical learning and position them as already behind their white and middle-class peers. These signifiers exacerbate and are exacerbated by acceleration programs that stratify mathematics pathways for students as early as sixth grade.

Students internalize these messages to such a degree that undoing a self-identity that is “bad at math” to one that “loves math” is rare. Before students have opportunities to excel in mathematics, many often self-select out of mathematics because they see no relevance for their learning, and no longer recognize the inherent value or purpose in learning mathematics.

You may or may not agree with the CMF approach, but it's hard to deny the real challenges they are trying to address and students they are trying to help. If you don't agree with the CMF, work with them to come up with a good alternative that helps create a more inclusive mathematical citizenry. An outright rejection of the approach won't fix problems and probably won't be taken seriously, except from the conservative press.

Update (1/12/22): Boaz Barak and Jelani Nelson respond to this post.

Sunday, December 05, 2021

Yes Virginia, there is a Santa Clause for Complexity Theorists, If you Only Believe

(Guest Post by Hunter Monroe) In this  guest post and discussion paper, I present a remarkable set of structurally similar conjectures which, if you only believe them, conjure up a dream world for theorists by asserting a new form of diagonalization based on naturally nonrelativizing facts invoking a deep linkage to underlying noncomputable languages. These conjectures, all stronger than the things to be proved, imply that the polynomial hierarchy does not collapse because the arithmetic hierarchy does not collapse, and P≠NP≠coNP. The diagonalizations imply the existence of hard instances, with the result that many complexity classes have speedup, including the Π side of PH, and proof speedup for tautologies stems from proof speedup for arithmetic. These conjectures do two things: (1) let us explore a hypothetical world where many open problems about uniform complexity classes are resolved and consider steps beyond e.g. to circuit complexity, and (2) reduce numerous open questions to a single plausible claim about how Turing machines have limited information about noncomputable languages. This would potentially allow a slew of open questions to be resolved at once with a skeleton key.

The following conjecture implying $\textbf{P!=NP}$ is remarkable: it hints at a deeper, unnoticed relationship between complexity and noncomputability; it is equivalent to speedup for all paddable $\textbf{coNP}$-complete languages and in proof length for tautologies; tweaked versions would separate other complexity classes; and if true it is a nonrelativizing fact.

Conjecture: (*) For any deterministic TM $M$ accepting the $\textbf{coNP}$-complete language ``nondeterministic TM $N$ on input $x$ does not halt within $t$ steps'' ($\texttt{coBHP}$), there exists a $\langle N',x'\rangle\in\texttt{coHP}$ ($N'$ does not halt on $x'$, ever) with $M$'s running time $f(t)=T_M(N',x',1^t)$ not bounded by any polynomial.

If true, (*) is a nonrelativizing fact; there is no hard $\langle N, x\rangle$ for $M$ with an exponential time oracle. The noncomputable language $\texttt{coHP}$ potentially explains why (*) is true, by analogy with this trivial theorem:

Theorem. For any $M$ accepting $\texttt{coBHP}$, there exists some non-halting $\langle N',x'\rangle\in\texttt{coHP}$ with $f(t)=T_M(N',x',1^t)$ not bounded by a constant.

Otherwise, $M$ would accept $\texttt{coHP}$ and have too much information about a non-c.e. language; (*) is just a stronger version. In the extreme, any $M$ is completely ignorant about some $\langle N',x'\rangle$ and requires on the order of $2^t$ steps to rule out every potentially halting branch. Tweaking (*) yields conjectures implying that $\textbf{PH}$ does not collapse:

Conjectures: For any $M^{\Pi^p_i}$ that accepts $\Pi^p_{i+1}=\{\langle N^{\Pi^p_i},x,1^t\rangle|$ $N^{\Pi^p_i}$ does not halt on input $x$ in $t$ steps$\}$, there is a non-halting $\langle N'^{\Pi^p_i},x'\rangle\in \Pi_i$ with $M^{\Pi^p_i}$'s running time not bounded by any polynomial.

By invoking every level $\Pi_i$ of the arithmetic hierarchy ($\textbf{AH}$), these conjectures state that the noncollapse of $\textbf{PH}$ is due to the noncollapse of $\textbf{AH}$. The conjecture (*) can be calibrated depending on the desired separation to equip $M$ with an oracle or nondeterminism or constrain its resources, to choose a resource-bounded complete problem and underlying non-c.e. language, and to fine tune how hard a hard instance needs to be.

Proof speedup for tautologies (equivalent to (*)) may stem from the proof speedup for arithmetic that occurs when adding undecidable statements as new axioms, allowing new theorems to be proved and shortening the proof of existing theorems. This literature translates any arithmetic theorem free of existential quantifiers into a tautology by replacing $k$-bit numbers with $k$ Boolean variables. The analogy with (*) suggests this stronger conjecture may in fact also be equivalent:

Conjecture: The following two statements are equivalent: (1) there is no optimal propositional proof system; and (2) Any propositional proof system $P$ is outperformed by a sufficiently powerful conservative extension $T$ of the Peano arithmetic, and $T$ can be improved further by adding any undecidable statement in $T$ as a new axiom.

So (*) is a Swiss army knife for generating conjectures that give us a vision of a world in which answering essentially one question would serve as a skeleton key that unlocks many open problems.

Wednesday, December 01, 2021

TheoretiCS: A New TCS Journal

Guest Post from Paul Beame on behalf of the TheoretiCS Foundation

I am writing to let you know of the launch today of TheoretiCS, a new fully open-access journal dedicated to Theoretical Computer Science developed by the members of our community that I have been involved in and for which I gave a brief pre-announcement about at STOC.

This journal has involved an unprecedented level of cooperation of representatives of leading conferences from across the entire Theoretical Computer Science spectrum. This includes representatives from STOC, FOCS, SODA, CCC, PODC, SoCG, TCC, COLT, ITCS, ICALP, which may be more familiar to readers of your blog, as well as from LICS, CSL, CONCUR, ICDT, MFCS and a number of others.

Two Points of Emphasis

  • Our quality objective - TheoretiCS aims at publishing articles of a very high quality, and at becoming a reference journal on par with the leading journals in all of Theoretical Computer Science
  • The inclusive view of Theoretical Computer Science that this journal represents, which is evident in the choice of two excellent co-editors-in-chief, Javier Esparza and Uri Zwick, and an outstanding inaugural editorial board.

Guiding principles and objectives

  • We believe that our field (and science in general) needs more 'virtuous' open-access journals, a whole eco-system of them, with various levels of specialization and of selectivity. We also believe that, along with the structuring role played by conferences in theoretical computer science, we collectively need to re-develop the practice of journal publications.
  • The scope of TheoretiCS is the whole of Theoretical Computer Science, understood in an inclusive meaning (concretely: including, but not restricted to, the Theory of Computing and the Theory of Programming; or equivalently, the so-called TCS-A and TCS-B, reminiscent of Jan van Leeuwen et al.'s Handbook of Theoretical Computer Science).
  • Our aim is to rapidly become a reference journal and to contribute to the unity of the Theoretical Computer Science global community. In particular, we will seek to publish only papers that make a very significant contribution to their respective fields, that strive to be accessible to a wider audience within theoretical computer science, and that are, generally, of a quality on par with the very best journals in the field.
  • TheoretiCS adheres to the principles of 'virtuous' open-access: there is no charge to read the journal, nor to publish in it. The copyright of the papers remains with the authors, under a Creative Commons license.

Organization and a bit of history

The project started in 2019 and underwent a long gestation. From the start, we wanted to have a thorough discussion with a wide representation of the community, on how to best implement the guiding principles sketched above. It was deemed essential to make sure that all fields of theoretical computer science would feel at home in this journal, and that it would be recognized as a valid venue for publication all over the world.

This resulted in the creation of an Advisory Board, composed of representatives of most of the main conferences in the field (currently APPROX, CCC, COLT, CONCUR, CSL, FOCS, FoSSaCS, FSCD, FSTTCS, ICALP, ICDT, ITCS, LICS, MFCS, PODC, SoCG, SODA, STACS, STOC, TCC) and of so-called members-at-large. 

Logistics and answers to some natural questions

  • The journal is published by the TheoretiCS Foundation, a non-profit foundation established under German law. Thomas Schwentick, Pascal Weil, and Meena Mahajan are officers of the foundation.
  • TheoretiCS is based on the platform episciences.org, in the spirit of a so-called overlay journal.
  • The Advisory Board, together with the Editors-in-Chief and the Managing Editors, spent much of their efforts in designing and implementing an efficient 2-phase review system: efficient in terms of the added-value it brings to the published papers and their authors, and of the time it takes. Yet, as this review system relies in an essential fashion on the work and expertise of colleagues (like in all classical reputable journals), we can not guarantee a fixed duration for the evaluation of the papers submitted to TheoretiCS.
  • Being charge-free for authors and readers does not mean that there is no cost to publishing a journal. These costs are supported for the foreseeable future by academic institutions (at the moment, CNRS and Inria, in France; others may join).
  • The journal will have an ISSN, and each paper will have a DOI. There will be no print edition.

Sunday, November 28, 2021

Open: 4 colorability for graphs of bounded genus or bounded crossing number (has this been asked before?)

 I have  co-authored (with Nathan Hayes, Anthony Ostuni, Davin Park) an open problems column  on the topic of this post. It is here.

Let g(G) be the genus of a graph and cr(G) be the crossing number of a graph.

As usual chi(G) is the chromatic number of a graph. 

KNOWN to most readers of this blog:

{G: \chi(G) \le 2} is in P

{G: \chi(G) \le 3 and g(G)\le 0 } is NPC (planar graph 3-col)

{G : \chi(G) \le 4 and g(G) \le 0} is in P (it's trivial since all planar graphs are 4-col)

{G: \chi(G) \le 3 and cr(G) \le 0} is NPC (planar graph 3-col)

{G: \chi(G) \le 4 and cr(G) \le 0} is in P (trivial since all planar graphs are 4-col)

LESS WELL KNOWN BUT TRUE (and brought to my attention by my co-authors and also Jacob Fox and Marcus Schaefer) 

For all g\ge 0 and r\ge 5, {G : \chi(G) \le r and g(G) \le g} is in P

For all c\ge 0 and r\ge 5, {G : \chi(G) \le r and cr(G) \le c} is in P 

SO I asked the question: for various r,g,c what is the complexity of the following sets:

{G: \chi(G) \le r AND g(G) \le g} 

{G: \chi(G) \le r AND cr(G) \le c}

SO I believe the status of the following sets is open

{G : \chi(G) \le 4 and g(G)\le 1} (replace 1 with 2,3,4,...)

{G : \chi(G) \le 4 and cr(G)\le 1} (replace 1 with 2,3,4...) 


QUESTIONS

1) If anyone knows the answer to these open questions, please leave comments. 

2) The paper pointed to above mentions all of the times I read of someone asking questions like this. There are not many, and the problem does not seem to be out there. Why is that?

a) It's hard to find out who-asked-what-when. Results are published, open problems often are not. My SIGACT News open problems column gives me (and others) a chance to write down open problems; however, such venues are rare. So it's possible that someone without a blog or an open problems column raised these questions before. (I checked cs stack exchange- not there- and I posted there but didn't get much of a response.) 

b) Proving NPC seems hard since devising gadgets with only one crossing is NOT good enough since you use the gadget many times. This may have discouraged people from thinking about it. 

c) Proving that the problems are in P (for the r\ge 6 case) was the result of using a hard theorem in graph theory from 2007. The authors themselves did not notice the algorithmic result. The first published account of the algorithmic result might be my open problems column.  This may be a case of the graph theorists and complexity theorists not talking to each other, though that is surprising since there is so much overlap that I thought there was no longer a distinction. 

d) While I think this is a natural question to ask, I may be wrong. See here for a blog post about when I had a natural question and found out why I may be wrong about the problem's naturalness. 


Monday, November 22, 2021

Finding an element with nonadaptive questions

Suppose you have a non-empty subset S of {1,...N} and want to find an element of S. You can ask arbitrary questions of the form "Does S contain an element in A?" for some A a subset of {1,...N}. How many questions do you need?

Of course you can use binary search, using questions of the form "is there a number greater than m in S?". This takes log N questions and it's easy to show that's tight.
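Here is a minimal Python sketch of that adaptive search (my own illustration; I phrase the query in the equivalent interval form "does S contain an element in {lo,...,hi}?"):

def find_adaptive(intersects, N):
    # intersects(lo, hi) answers "does S contain an element in {lo,...,hi}?"
    lo, hi = 1, N
    while lo < hi:
        mid = (lo + hi) // 2
        if intersects(lo, mid):      # keep the half that still meets S
            hi = mid
        else:
            lo = mid + 1
    return lo                        # about log N queries in total

S = {13, 42, 99}
print(find_adaptive(lambda lo, hi: any(lo <= m <= hi for m in S), 128))   # 13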

What if you have to ask all the questions ahead of time before you get any of the answers? Now binary search won't work. If |S|=1 you can ask "is there a number in S whose ith bit is one?" That also takes log N questions.

For arbitrary S the situation is trickier. With randomness you still don't need too many questions. Mulmuley, Vazirani and Vazirani's isolating lemma works as follows: For each i <= log N, pick a random weight wi between 1 and 2 log N. For each element m in S, let the weight of m be the sum of the weights of the bits of m that are 1. With probability at least 1/2 there will be an m with a unique minimum weight. There's a cool proof of an isolating lemma by Noam Ta-Shma.

Once you have this lemma, you can ask questions of the form "Given a list of wi's and a value v, is there an m in S of weight v whose jth bit is 1?" Choosing wi and v at random you have a 1/O(log N) chance of a single m whose weight is v, and trying all j will give you a witness. 
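Here is a rough Python sketch of the whole randomized nonadaptive scheme (my own illustration with simplified constants; none of the questions depends on an earlier answer, so they could all be fixed in advance):

import random

def find_nonadaptive(S, N, tries=2000):
    b = N.bit_length()                                     # bits per element
    for _ in range(tries):                                 # independent restarts, all fixable up front
        w = [random.randint(1, 2 * b) for _ in range(b)]   # isolating-lemma weights on bit positions
        def weight(m):
            return sum(w[i] for i in range(b) if (m >> i) & 1)
        v = random.randint(0, 2 * b * b)                   # guess the isolated weight value
        # Questions: "is there an m in S of weight v whose jth bit is 1?"
        bits = [any(weight(m) == v and (m >> j) & 1 for m in S) for j in range(b)]
        candidate = sum(1 << j for j, ans in enumerate(bits) if ans)
        # If exactly one m in S has weight v, candidate equals that m.
        # (Here the check is direct; in applications the candidate can be verified separately.)
        if candidate in S:
            return candidate
    return None

print(find_nonadaptive({13, 42, 99}, 128))                 # one of 13, 42, 99 with high probability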

Randomness is required. The X-search problem described by Karp, Upfal and Wigderson shows that any deterministic procedure requires essentially N queries. 

This all came up because Bill had some colleagues looking at similar problems in testing machines for errors. 

I've been interested in the related question of finding satisfying assignments using non-adaptive NP queries. The results are similar to the above. In particular, you can randomly find a satisfying assignment with high probability using a polynomial number of non-adaptive NP queries. It follows from the techniques above, and even earlier papers, but I haven't been able to track down a reference for the first paper to do so.

Wednesday, November 17, 2021

CS Slow to Change?

Back in March of 2019 I wrote

I was also going to post about Yann LeCun's Facebook rant about stodgy CS departments but then Yann goes ahead and wins a Turing award with Geoffrey Hinton and Yoshua Bengio for their work on machine learning. I knew Yann from when we worked together at NEC Research in the early 2000's and let's just congratulate him and the others and let them bask in glory for truly transforming how we think of computing today. I'll get back to his post soon enough.

So not that soon. Yann's post was from 2015 where he went after "stodgy" CS departments naming Yale, Harvard, Princeton and Chicago.

CS is a quickly evolving field.  Because of excess conservatism, these departments have repeatedly missed important trends in CS and related field, such as Data Science. They seem to view CS as meaning strictly theory, crypto, systems and programming  languages, what some have called "core CS", paying lip service to graphics, vision, machine learning, AI, HCI, robotics, etc. But these areas are the ones that have been expanding the fastest in the last decades, particularly machine learning and computer vision in the last decade....It is quite common, and somewhat natural, that newer areas (eg ML) be looked down upon by members of older, more established areas (eg Theory and Systems). After all, scientists are professional skeptics. But in a fast evolving disciplines like CS and now Data Science, an excessive aversion to risk and change is a recipe for failure.

We've seen some changes since. Yale's Statistics Department is now Statistics and Data Science. The University of Chicago has a new Data Science undergrad major and institute.

I wonder if that's the future. CS doesn't really change that much, at least not quickly. Data science, and perhaps cybersecurity, evolve as separate fields which only have limited intersection with traditional CS. The CS degree itself just focuses on those interested in how the machines work and the theory behind them. We're busy trying to figure this out at Illinois Tech as are most other schools. And what about augmented/virtual reality and the metaverse, quantum computing, fintech, social networks, human and social factors and so on? How do you choose which bets to make? 

Most of all, universities, traditionally slowly moving machines, need to be far more agile even in fields outside computing, since the digital transformation is affecting everything. How do you plan degrees when the computing landscape at graduation will be different from when students start? 

Sunday, November 14, 2021

When did Computer Science Theory Get so Hard?

I posted on When did Math get so hard? A commenter pointed out that one can also ask 


When did Computer Science Theory Get so Hard?

For the Math-question I could only speculate. For CS- I WAS THERE! When I was in Grad School one could learn all of Complexity theory in a year-long course (a hard one, but still!). The main tools were logic and combinatorics. No Fourier Transforms over finite fields. I am NOT going to say

Those were the good old days.

I will say that it was easier to make a contribution without knowing much. Oddly enough, it is MORE common for ugrads and grad students to publish NOW than it was THEN, so that may be a pair of ducks.

Random Thoughts on This Question

1) The Graph Minor Theorem was when P lost its innocence. Before the GMT most (though not all)  problems in P had easy-to-understand  algorithms using algorithmic paradigms (e.g., Dynamic  Programming) and maybe some combinatorics. Computational Number Theory used.... Number Theory (duh), but I don't think it was hard number theory. One exception was Miller's Primality test which needed to assume the Extended Riemann Hypothesis- but you didn't have to understand ERH to use it. 

1.5) GMT again. This did not only give hard-deep-math algorithms to get problems in P. It  also pointed to  how hard proving P NE NP would be--- to rule out something like a GMT-type result to get SAT in P seems rather hard. 

2) Oracle Constructions were fairly easy diagonalizations. I was bummed out that I never got to use an infinite injury priority argument. That is, I knew some complicated recursion theory, but it was never used. 

2.5) Oracles again. Dana Angluin had a paper which used some complicated combinatorics to construct an oracle, see here. Later Andy Yao showed that there is an oracle A such that  PH^A NE  PSPACE^A. You might know that result better as

Constant depth circuits for parity must have exponential size. 

I think we now care about circuits more than oracles, see my post here about that issue. Anyway, oracle results since then have used hard combinatorial and other math arguments. 

3) The PCP result was a leap forward for difficulty. I don't know which paper to pick as THE Leap since there were several. And papers after that were also rather difficult.  

4) I had a blog post here where I asked if REDUCTIONS ever use hard math. Some of the comments are relevant here:

Stella Biderman: The deepest part of the original PCP theorem is the invention of the VC paradigm in the 1990's.

Eldar: Fourier Theory was introduced to CS with Hastad's Optimal Approximation results. Today it might not be considered deep, but I recall when it was.

Also there are Algebraic Geometry codes which use downright arcane mathematics...

Hermann Gruber refers to Comp Topology and Comp Geometry and points to the result that 3-manifold knot genus is NP-complete, see here.

Anonymous (they leave many comments) points to the deep math reductions in arithmetic versions of P/NP classes, and Mulmuley's work (Geometric Complexity Theory).

Timothy Chow points out that `deep' could mean several things and points to a math overflow post on the issue of depth, here.

Marzio De Biasi points out that even back in 1978 there was a poly reduction that required a good amount of number theory: the NPC of the Diophantine binary quad equation

ax^2 + by + c = 0 

by Manders and Adleman, see here.

(Bill Comment) I tend to think this is an outlier- for the most part, CS theory back in the 1970's did not use hard math. 

5) Private Info Retrieval (PIR). k databases each have the same n-bit string and cannot talk to each other. A user wants the ith bit and (in the info-theoretic case) wants the DBs to know NOTHING about the index i. 

Easy result (to understand): 2-server, n^{1/3} communication, here.

Hard result: 2-server, n^{O(\sqrt{log log n/log n})} communication, here.

(I have a website on PIR, not maintained,  here.)

6) Babai's algorithm for GI in quasi-poly time used hard math. 

7) If I knew more CS theory I am sure I would have more papers listed.

But now it's your turn: 

When did you realize: Gee, CS theory is harder than (a) you thought, or (b) it used to be?






Thursday, November 11, 2021

20 Years of Algorithmic Game Theory

Twenty years ago DIMACS hosted a Workshop on Computational Issues in Game Theory and Mechanism Design. This wasn't the very beginning of algorithmic game theory, but it was quite the coming out party. From the announcement

The research agenda of computer science is undergoing significant changes due to the influence of the Internet. Together with the emergence of a host of new computational issues in mathematical economics, as well as electronic commerce, a new research agenda appears to be emerging. This area of research is collectively labeled under various titles, such as "Foundations of Electronic Commerce", "Computational Economics", or "Economic Mechanisms in Computation" and deals with various issues involving the interplay between computation, game-theory and economics.

This workshop is intended to not only summarize progress in this area and attempt to define future directions for it, but also to help the interested but uninitiated, of which there seem many, understand the language, the basis principles and the major issues.

Working at the nearby NEC Research Institute at the time, I attended as one of those "interested but uninitiated."

The workshop had talks from the current and rising stars in the field across the theoretical computer science, AI and economics communities. The presentations included some classic early results, including Competitive Analysis of Incentive Compatible Online Auctions, How Bad is Selfish Routing? and the seminal work on Competitive Auctions.

Beyond the talks, the meeting gathered a powerhouse of people: established players like Noam Nisan, Vijay Vazirani, Eva Tardos and Christos Papadimitriou, along with several newcomers who are now the established players, including Tim Roughgarden and Jason Hartline, to mention just a few from theoretical computer science. 

The highlight was a panel discussion on how to overcome the methodological differences between computer scientists and economic game theorists. The panelists were an all-star collection of  John Nash, Andrew Odlyzko, Christos Papadimitriou, Mark Satterthwaite, Scott Shenker and Michael Wellman. The discussion focused on things like competitive analysis though to me, in hindsight, the real difference is between the focus on models (game theory) vs theorems (CS). 

Interest in these connections exploded after the workshop and a new field blossomed.

Sunday, November 07, 2021

Reflections on Trusting ``Trustlessness'' in the era of ``Crypto'' Blockchains (Guest Post)

 

I trust Evangelos Georgiadis to do a guest post on Trust and Blockchain. 

Today we have a guest post by Evangelos Georgiadis on Trust. It was written before Lance's post on trust here but it can be viewed as a followup to it. 

And now, here's E.G:

==========================================================

Trust is a funny concept, particularly in the realm of blockchains and "crypto".

Do you trust the consensus mechanism of a public blockchain?

Do you trust the architects that engineered the consensus mechanism?

Do you trust the software engineers that implemented the code for the consensus mechanism?

Do you trust the language that the software engineers used?

Do you trust the underlying hardware that the software is running on?

Theoretical Computer Science provides tools for some of this. But then the question becomes
Do you trust the program verifier?
Do you trust the proof of security?

I touch on these issues in: 

                   Reflections on Trusting ‘Trustlessness’ in the era of ”Crypto”/Blockchains

which is here. It's only 3 pages, so enjoy!

Wednesday, November 03, 2021

A Complexity View of Machine Learning?

Complexity is at its best when it models new technologies so we can study them in a principled way. Quantum computing comes to mind as a good, relatively recent example. With machine learning playing an ever growing role in computing, how can complexity play a role?

The theory community's questions about machine learning typically look at finding mathematical reasons to explain why the models generalize well with little overfitting, or at trying to get good definitions of privacy, fairness and explainability to mitigate the social challenges of ML. But what about from a computational complexity point of view? I don't have a great answer yet, but here are some thoughts.

In much of structural complexity, we use relativization to understand the relative power of complexity classes. We define an oracle as a set A where a machine can ask questions about membership in A and magically get an answer. Relativization can be used to help us define classes like Σ2P = NP^NP or allow us to succinctly state Toda's theorem as PH ⊆ P^#P.

As I tweeted last week, machine learning feels like an oracle; after all, machine learning models and algorithms are typically accessed through APIs and Python modules. What kind of oracle? Definitely not an NP-complete problem like SAT, since machine learning fails miserably if you try to use it to break cryptography. 

The real information in machine learning comes from the data. For a length parameter n, consider a string x whose length might be exponential in n. Think of x as a list of labeled or unlabeled examples of some larger set S. Machine learning creates a model M from x that tries to predict whether a given input is in S. Think of M as the oracle, as some compressed version of S.

Is there a computational view of M? We can appeal to Ockham's razor and consider the simplest model consistent with the data, one for which x, viewed as a set, is random in the S that M generates. One can formalize this Minimum Description Length approach using Kolmogorov Complexity. This model is too ideal: for one, it can also break cryptography, and typical deep learning models are not simple at all, sometimes with millions of parameters.

This is just a start. One could try time bounds on the Kolmogorov definitions or try something different completely. Adversarial and foundational learning models might yield different kinds of oracles. 

If we can figure out even a rough complexity way to understand learning, we can start to get a hold of learning's computational power and limitations, which is the purpose of studying computational complexity in the first place. 

Sunday, October 31, 2021

When did Math Get So Hard?


I have been on many Math PhD thesis defenses as the Dean's Representative. This means I don't have to understand the work, just make sure the rules are followed. I've done this for a while and I used to understand some of it, but now there are times I understand literally none of it. As a result, when the student leaves the room and we talk among ourselves I ask


When did Math get so hard?

I mean it as a statement and maybe a joke, but I decided to email various people and ask for a serious answer. Here are some thoughts of mine and others

1) When you get older, math gets harder. Lance blogged on this here.

2) When math got more abstract it got harder. Blame Grothendieck.

3) When math stopped being tied to the real world it got harder. Blame Hardy. 

4) Math has always been hard. We NOW understand some of the older math better so it seems easy to us, but it wasn't at the time. 

5) With the web and more people working in math, new results come out faster so it's harder to keep up.

6) All fields of math have a period of time when they are easy, at the beginning, and then as the low-hanging fruit gets picked it gets harder and harder.  So if a NEW branch was started it might initially be easy. Counterthought- even a new branch might be hard now since it can draw on so much prior math. Also, the low hanging fruit may be picked rather quickly.