Showing posts with label random. Show all posts
Showing posts with label random. Show all posts

Thursday, June 12, 2008

/dev/random (literally and figuratively)

1. So I'm trying to figure out a way to construct an extractor (in the sense of Barak and Halevi; I'm still unclear in exactly what way this is related to the usual definition of an extractor) from a block cipher.

A construction like this should work pretty well: Determine an m-bit key K from a long string with high min-entropy. Then, use block cipher B with that key to encrypt the all-0 block; then, take consecutive substrings A_i of length m (padding the last if necessary) and compute a sequence defined by E_(i+1) = B(K, E_i xor A_i). The last term of this sequence is the output of the extractor.

Assuming the key is chosen from some distribution with reasonably high min-entropy, then E_i and A_i should be mostly independent. If E_i is "near uniform" (i.e. has min-entropy close to the Shannon entropy) and small, and A_i and E_i are uncorrelated, then the min-entropy is almost additive. The problem is, it's not obvious how to make this rigorous, or even under what conditions it actually holds.

What does this have to do with /dev/random? Well, it's well-known how to build a pseudo-random generator from a block cipher in counter mode. So if we can also construct an extractor from a secure block cipher, we can obtain a robust, secure random generator which has the added bonuses of being 1) small and 2) fast. Which makes it a perfect candidate for replacing/supplementing /dev/random, CryptGenRandom, etc.

2. Anyone know a good resource for learning about expanders? (Other than Luca Trevisan's blog, of course; his series on the subject is pretty much just epic. I need an introduction to the topic, though, which is shallower and broader).

3. How on earth do four Supreme Court justices believe that Gitmo detainees aren't entitled to challenge their detention in civil court? If Scalia and Thomas are such "strict constructionists," why don't they note that the relevant clause of the Constitution reads:
The privilege of the writ of habeas corpus shall not be suspended, unless when in cases of rebellion or invasion, the public safety may require it.
Which, you might notice, says nothing at all about "enemy combatants," or "acceptable alternatives to habeas corpus." Unless they personally amended the Constitution themselves?

Sunday, March 23, 2008

Coloring lattices

A problem:

You have an infinite (2-dimensional) chessboard. Can you 2-color the squares of the chessboard so that a chess king, starting on any one square, can move to at most n distinct squares of the same color for some fixed n? (I.e., that all the connected monochromatic regions are of bounded size?)

If you think about it for a few minutes, it should become obvious that the answer is no. Proving it is a bit more difficult. I don't wanna cut this post (also I'm not sure how to in Blogger), so I'll just put the proof outline in rot13 below.

Fhccbfr bgurejvfr. Vg'f pyrne gung n zbabpuebzngvp ertvba arrqf gb or pbzcyrgryl fheebhaqrq ol n ertvba bs gur bccbfvgr pbybe; vg'f nyfb boivbhf gung gur nern bs gur bhgre ertvba cyhf gur rairybcrq fdhnerf vf fgevpgyl terngre guna gur nern bs gur rairybcrq fdhnerf. Abj, vg'f rnfl gb fubj gung nf gur nern bs n pbaarpgrq ertvba nccebnpurf vasvavgl, fb qbrf gur nern bs na rairybcvat ertvba. Chggvat gurfr snpgf gbtrgure, gur gurberz rnfvyl sbyybjf.

Not that bad, but what happens if we increase the number of colors (and the number of dimensions)? Using an inductive argument, it's possible to color Z^d with (d+1) colors and get regions of bounded size; but I don't see any easy combinatorial argument that gives any (non-constant) lower bound. Any ideas?

Saturday, December 29, 2007

It runs in the family

So at dinner on Wednesday, I noticed that my younger brother (age 14, non-Honors Algebra II, smart but unmotivated, pretty much what I would have been had I had an academically overachieving older sibling) was filling out 16-man brackets. But these weren't NCAA basketball brackets (what he would have been doing with those in December is beyond me anyway), or fantasy NCAA football playoffs, or any of the above. They were 16 competitors labeled from "A" to "P."

So I asked him: What are you doing?
Him: Filling out brackets.
Me: Why?
(Silence).
(A little later)
Me: Chris, why are you filling out brackets all over your paper?
(Silence).
(Later still)
Him: Harrison, I have a question.
Me: Shoot.
Him: If you have 16 people and you know their initial positions in a bracket, what's the minimum number of spots you have to fill in so you can fill in the rest of the bracket?

I'm pretty sure I've been more proud of a member of my family than I was of him at that moment. But I'm having trouble thinking of an example.

[PS: I told him he should guest-post the problem on this blog. He politely declined.]

Wednesday, December 19, 2007

P, BPP, VDW, and all that jazz: A Complex Conversation in Three Parts

BACKGROUND

Louis and I were discussing the problem of finding an arithmetic progression of "special elements" in a general series. It all started like this:

Louis: Awesome question:
Me: ?
Louis: Find the longest arithmetic sequence of awesome XKCD comics
Me: XD
Me: 1. that's totally subjective, 2. that's AWESOME
Louis: well, it's different for every person of coures
Louis: *course
Me: well, of course, since they do have positive density, if xkcd were to continue forever
Me: there'd be arbitrarily long APs...
[...]
Louis: I'm talking a computational
Louis: problem
Me: oh
Me: hm.
Louis: not VDW :P
Me: well that's technically szemeredi...
Me: wait, is finding long APs...
Louis: VDW.
Me: szemeredi, and is it in P?
Louis: VDW.
Louis: And yes, I believe so
Louis: given that there are only n^2 possible AP'sMe: oh, d'oh. gp.
Me: lol
Louis: ;-)
Me: (incidentally, there's more like n^3 partial APs, but still polynomial)
Louis: ah, partial AP's.
Louis: well, we needn't consider only partial AP's
Louis: for each a_i, for each a_j, search the arithmetic sequence going a_i, a_j, a_i + 2(a_j - a_i),...
Louis: as far as you can go until you reach something not in the set.
Me: yeah, yeah
Me: and, btw, it is about n^2 log n [Exercise for the reader: Why?]

I'm many things, but I'm not an Algorithms person, and this was now, to me, an Algorithms question and therefore dead. (It didn't help that I couldn't find any way of checking for an AP in o(n^2) time). But then:

AN EPIPHANY

Me: I'm wondering if this might
Me: if this could provide an oracle separation between P and BPP
Louis: XD
Me: seriously!
Louis: how?
Louis: it's in P!
Me: yeah, but
Me: let's take sets of size
Me: well, let's consider the problem of finding an AP of length at least n in the range [1, superpoly(n)], 2-colored according to some cuh-razy random oracle.
Louis: okay
Me: (actually, we should probably specify that it has to be one specific color)
Louis: you do realize you're now reminding me of the quantum database search problem, too?
Me: let's say that all of the elements of the AP have to be yellow, and the oracle colors a number yellow with some fairly high probability
Louis: okay
Me: actually, let's not make it length n, let's make it length [O(log(superpoly(n)))]
Louis: how will BPP help us?
Me: for almost all oracles, all we have to do is pick a random AP and there's positive probability it's yellow.
Louis: but wait
Louis: if the numbers are colored randomly
Louis: then why would randomized search be any better than brute force?
Me: on average? duh, it wouldn't. worst-case, though...
Me: it seems less hard to show brute-force won't work well in some fraction of oracle universes

Mathematical backing-up of my assertions in the conversation: Consider a random coloring of [1, superpoly(n)] s.t. the probability of a given number being colored yellow is \delta. There are approximately superpoly(n)^2 APs of length log(superpoly(n)) [I'm totally fudging the numbers there, but the basic argument should still hold.] If the implied constant of the log(superpoly(n)) is sufficiently small, then the probability of a randomly chosen AP of that length being yellow is at least \delta^log(superpoly(n)) > 1/superpoly(n)^2.

N.B., in fact, that by doing nothing more than messing around with a few constants, we can get the probability to be > k/superpoly(n)^2 for any constant k we want to work with. (Or even, for that matter, > k/superpoly(n).) Then the expected value of the number of APs of the...oh, screw it, of the right length is k.

So! In retrospect, the functions I picked don't work very well, but it's not difficult to change them around a bit so that our above argument (after throwing in some stuff about the Chernoff bound) actually ends up telling us the following:

1. For almost all oracles we've considered here, there exists f(n) superpolynomial, g(n) really slow-growing such that a BPP algorithm can pick polynomially many APs of length g(n) in the range [1, f(n)] s.t. with positive, constant probability, at least one will be yellow.

And here's the crux: There's no obvious way that a deterministic algorithm can do the same thing. Actually, with all these oracle universes floating around, it should be obvious to even the most dimwitted individual who holds an advanced degree in hyperbolic topology, n'hey [sorry], that some of them won't.

CONCLUSION

The problem is, though, that like most lower bounds, this fact is easier noticed than proved. And since there are already known oracle separations of P from BPP, one more wouldn't be particularly exciting. Still, though, it's fun to see two of my (and some other occasional readers') main interests -- computational complexity and additive combinatorics -- work together in a way I can understand, even if I can't quite finish the proof.

Wednesday, November 21, 2007

The chromatic number of n-space

Consider the infinite graph G defined as follows:

The vertex set of G is the set of points of n-dimensional Euclidean space R^n. Two vertices are adjacent if the corresponding points are unit distance apart. The question is, what is the chromatic number of G?

First note that a simple compactness argument shows that this is equivalent to finding the maximum chromatic number of a unit distance graph in R^n; i.e., if we can show that any finite subgraph of G is k-chromatic, then G is k-chromatic.

For n = 2, this is a fairly famous problem. But for higher dimensions, although a lot of work has been done, the known results are (as often happens in combinatorics!) sometimes woefully poor.

Just as one example: The best known general upper bound for the chromatic number of R^n is (3+o(1))^n. But consider the c x c x ... x c hypercube, where c < 3, subdivided into 3^n hypercubes of edge length c/3. If we color each of these hypercubes distinctly, we may tessellate n-space with them, so \Chi(R^n) \leq 3^n is practically trivial!

This is one of the many examples in combinatorics where the best known results are only marginally better than easy, or even trivial, results. (Another good one: upper bounds for the Ramsey numbers R(n,n).) The interesting thing here is that it derives from a geometric fact: namely, that the only regular polytope that tiles n-space for n \geq 5 is the hypercube. The best known bound (7) for 2-space derives from a hexagonal tiling; the best known bound for 3-space is 16 -- I don't know where that comes from. The best bound for 4-space is 49 -- does this arise directly from the tessellation of hyperspace by 24-cells? I'd be greatly indebted to anyone who can shed light on that.

Incidentally, the best known lower bound for \Chi(R^n) is something like (1.239+o(1))^n. I don't see any obvious way to construct even a superlinear unit distance graph in R^n. Anyone know the details of the proof of that bound?

Tuesday, September 18, 2007

The Riemann Hypothesis for Dummies; or, How Bad Music can Make You Obscenely Rich

No, this blog post isn't about 50 Cent.

Here's the scenario:

I'm an ultra-modern, avant-garde composer. I...compose music. One day, I decide, "Hey! I'm gonna write the weirdest piece EVER!"

Here's what I do:

For strings, I write a waltz (which, as you probably know, is in 3/4 time.) But actually, I lied. I don't write a waltz; I only write one measure of a waltz. Which seems like a weird thing to do, but you'll see my plan.

Then, for horns, I write, say, one measure of 4/4 time. And for piano, one measure of, let's say, 7/4 time. And so on.

Then I instruct each section to play their one single measure over and over and over again until they get tired, and then to keep playing it until all the sections finish their respective measures at the same moment.

The musicians will gripe about this arrangement, as musicians often do, but the overall sound won't repeat for a very long time! For example: If, as above, the strings play in 3/4, the horns in 4/4, and the piano in 7/4, then the piece will last a total of 3*4*7 = 84 beats.

But there's a catch! Yes, I'm an edgy and avant-garde composer, but I'm also environmentally conscious. Which means I want to make this piece as long as I possibly can (without repeating itself, of course) while only using a fixed number of beats, so that I won't waste paper. (Don't think too hard about that part. It's avant-garde composer logic.)

So, here's my question: Exactly how long can I make my piece last without writing more than, oh, n total beats?

Believe it or not, this question is actually unsolved! More than 100 years ago, in 1902, a German mathematician named Edmund Landau proved that when n is really big, I can't make it go on for much longer than e^\sqrt{n ln n} beats (where e is the base of the natural logarithms, or 2.71828...) This is a weird function! It grows a LOT slower than any exponential function (e.g. 1,2,4,8,...) but a LOT faster than any polynomial (for example, n^2, or 1,4,9,16...).

But mathematicians think they can do better than Landau. In fact, they think that something similar to Landau's answer -- exactly the same, in fact, except that n ln n has been replaced by something called "Li^-1(n)," which behaves very similarly -- provides an upper bound for big enough n, not just a close-to-upper bound.

But no one's been able to prove their conjecture! In fact, mathematicians have shown that this conjecture is precisely equivalent to the famous Riemann Hypothesis.

And the RH is considered to be one of the hardest problems in mathematics. It's so hard, in fact, that the Clay Mathematics Institute is offering a million-dollar reward to the first person who can publish a correct proof. In other words, my composer's idle question is actually worth seven figures.

Monday, September 3, 2007

Revenge of the Knapsack

Bram Cohen describes a modified knapsack system and suggests it could be used as a public-key cryptosystem. My thoughts:

1. Let's call the elements of the public key a_i, for 1 \leq i \leq k. Consider the sets B_n = {b_i: b_i = a_i (mod n)} for arbitrary n.

2. Note that, if gcd(p,n) is sufficiently large (i.e., > p/2k), then the elements of B_n will follow a very specific distribution. Specifically, each will fall into one of the intervals [a*gcd(p,n), a*gcd(p,n)+p/2k) for some integer a.

3. For p << n << kp with gcd(p,n) large, there will necessarily be more than 1 residue in at least some of the intervals, and, with high probability, each interval will contain at least one residue.

4. Trivially: gcd(p,n) = gcd(p,n+p).

So let's posit a hypothetical function f(n) that will give us a good probabilistic estimate of whether the residues of the a_i (mod n) really do "clump together" in the intervals of size p/2k. This should be close to 0 when gcd(p,n) is low, larger when gcd(p,n) is greater than about p/k, and close to 1 when gcd(p,n) = p. Then f(n) is periodic (or "close to it") with period p. So if we can construct f as a quantum function, we may be able to use a QFT to estimate its period, which would recover the private key.

This is shaky on several levels; first, I don't know whether you can use a QFT to measure functions that are "close to periodic." However, this could perhaps be rectified by defining a new function g: N -> {0, 1/2, 1}, where g(n) is whichever of those three f(n) is nearest to. Then g(n) might be periodic with positive probability.

More crucially: Can we actually devise a function f with the properties outlined above? I would venture to say "yes," since humans can (with high probability) tell whether 300 numbers fall within just three separate, small intervals. And as humans can do, so can do computers. The problem is actually exhibiting a function which we can then prove to work.

Any thoughts on the matter would be appreciated.


Harrison

Thursday, August 30, 2007

Things I Believe, Part 1: 13.7*10^9 B.C.E - c. 1 B.C.E.

So, just for the hell of it, I've decided to attempt a complete inventory of all the things that I believe. All my beliefs, if you will. This'll be a three-part series: Part 1 will cover events from the Beginning of the Universe 'til around the time of some guy's birth in Judea (since that's how people tend to reckon their calendars these days); Part 2 will cover around that dude's birth until the present day; and Part 3 will cover what I believe to be Mathematical and Other Truths (and the Future?). Capitalized. So, without further ado, here is What Harrison Believes.

1. I believe that the Universe as we know it exploded out of a small, hot, dense region approximately 13.7 billion years ago, plus or minus 200 million;

2. That an infinitesimal fraction of a second later, space expanded at an exponential rate far beyond c, and that this led to a "flattening" of local spacetime, and that quantum fluctuations were magnified to macroscopic scales;

3. That a fraction of a second later, the electromagnetic force separated from the weak force, and fundamental particles acquired mass;

4. That under a second after this, quarks were bound together by the strong nuclear force into hadrons and anti-hadrons (most of which annihilated each other immediately);

5. That about 100-300 seconds later, temperatures were right for some protons and neutrons left over to fuse into deuterium nuclei and those of a few other light elements (mostly helium);

6. That hadron/anti-hadron and lepton/anti-lepton annihilation reactions created an abundance of photons, which reacted over the next ~400,000 years with other nuclei, leptons, and hadrons wandering the cosmos;

7. That, as nuclei and electrons combined to form atoms (at around 400,000 years after the Big Bang), the Universe became transparent to radiation, and we see this radiation today as the cosmic microwave background;

8. That, around 200-500 million years later, the first stars (made almost entirely of lightweight elements) began to form, as did the first galaxies;

9. That, around this time, a very special galaxy began forming, one that certain sentient beings living in it would later call the "Milky Way"...


More to come - hold on tight.

Friday, August 10, 2007

Random graphs, logic, and math, oh my!

So a random graph on n vertices is, (semi-)formally, a probability space on the set of all graphs on n vertices. Normally, we consider pretty "vanilla" probability spaces; for example, the one defined by setting the probability of there being an edge between u and v (which we'll denote by P(E(u,v)) for the remainder of this blog post) to be some constant 0 < c < 1. (Technically, yes, we can set c=0 and c=1, but those are really boring probability spaces.) However, as is well known, I'm not normal. So I'd like to investigate whether we can even construct probability spaces, in general, that have weird properties. Here's what I want to know:


Given a sentence Q of first-order logic with two free variables; say, u and v (where the variables will represent vertices of our graph, and we'll restrict ourselves to the predicate E(w,x) which, in short, means "there is an edge between w and x"), and a constant k>1, does there exist a (nontrivial*) probability space on the set of graphs with n vertices which satisfies:

P(E(u,v)|Q(u,v)) \geq k*P(E(u,v))

for all vertices u,v?

*"Nontrivial" in this context basically means "the probability of the empty graph isn't 1," or maybe "for any k, there's some n for which the probability of choosing a graph with at least k edges is positive." I'm not really sure.

By studying these distributions, we can look at "random" graphs which nonetheless are expected to have some preassigned structure (i.e., by letting Q be "there exists w: E(u,w) and E(w,v)", we're looking at graphs that are in some sense "likelier" to contain triangles than a random graph. Alternatively, by setting k very large and constructing rather intricate FOL sentences, we might be able to get some insight into Ramsey-theoretic questions. This is definitely true if we move to SOL; I'm not not quite as sure with this weaker model.) You know, just in case you're the kind of weird person who wants some sort of "motivation" for studying a problem.


Apologies for the awkward wording of my problem; I'm just not sure how else to express it.

Tuesday, August 7, 2007

Problems #1

1. Given an unsorted list L with n elements, define the set S: {a, b \in L: a < b and f(a) < f(b)} (where f(x) is x's position in the list). Call a list c-sorted (0 < c \leq 1) if |S|/(\choose{n, 2}) \geq c. For what 0 < c < 1 do there exist sublinear algorithms to c-sort any list L? (We can consider either algorithms that are guaranteed to produce a c-sorting of a list, or those for which the expected value of the expression |S|/(\choose{n, 2}) is at least c.)

2. Define a candy-passing game on a graph G as follows: For the initial state s_0, we assign a number of "chips" to each vertex v. We go from state s_i to state s_(i+1) as follows: if a vertex v with degree d has at least d chips on it, then it "passes" one chip to each adjacent vertex. Denote the maximum degree of a vertex in G by \Delta(G); is it true that, for any initial configuration with at least (\Delta(G)+1)*|V(G)| total chips, there is some k for which s_k = s_(k+1)? Can we place an upper bound on k?

3. Anyone know how to do LaTeX (or any sort of mathematical notation stuff) in Blogger?