Math Quiz 9

***What’s the average length of all the line segments which can be drawn inside a 1 inch square?

(accurate to 5 decimal digits)

(show your work)




point of clarification: I assume you mean the average of all possible line segments, which allows for intersecting and overlapping lines, if drawn within a single square. For example, you could have an X formed by two lines stretching from opposite corners, or you could have two lines that lay on top of each other, one going from corner to opposite corner and the other from the center to the same corner.

by “inside” are they allowed to touch the edge of the square?

Does it matter?

I I’ll post my work tonight

What assumed was that when you do this, every line length possible creates a square with rounded edges with radius from 0 to 1/2. So I integrated all those together and divided by 1/2 to get my answer

My second attempt over lunch came up with 0.55272, but I’m pretty sure that’s not right (It’s too high). I know what i’m missing from my equations (logically, even if i haven’t figured out how to express it mathematically just yet) though.

I just wish I remembered more of my calculus than I do!

I finally came up with 0.28967 as my final answer. Work below, colored white so people can ignore it until they’ve done their own work on the problem :slight_smile: Note: I don’t remember my calculus very well anymore, so I had to use wolfram alpha to get me (hopefully) close enough!

To solve this (assuming it’s correct), I needed a couple of components and assumptions/assertions.

Lets start with a simple case - the average length of line segments contained within a N length line. A good description of this can be found here:

What that boils down to, is that if I draw a line such that it intersects two edges of the square, taking the length of that line divided by 3 gives me the average length of every line segment that sits on that line. This little trick lets us GREATLY simplify the calculations, as we can use it to assume we’re only looking at lines that intersect two walls of the square, and effectively ignore snort lines that don’t intersect the lines of the square.

So, with that in hand, we can now figure out the length of every possible line that intersects two lines of the square. This can be broken down into two categories: lines that intersect adjacent edges of the square, and lines that intersect non-adjacent edges. Due to the symmetry of a square, we know that the first group can be simplified into 4 repetitions (top/right, right/bottom, bottom/left, left/top) and the second into 2 repetitions (top/bottom, left/right). This will come in handy later.

So, lets look at the case where you have adjacent sides. You really have two variables here, X and Y, each being an independent number between 0 and 1 representing their location on one of the sides, and the line that stretches between them. The length of that line, as defined by the Pythagorean Theorem is sqrt(x^2+y^2)… but remember we want that length/3 to get the average length of line segments along it. And remember that we have an infinite number of points between 0 and 1 for both x and y… calculus! So, written in a form wolfram alpha will recognize, integrating over x and y gives us:

integrate integrate (sqrt(x^2+y^2)/3) dx dy from 0 to 1 from 0 to 1

or 0.255065.

We can do the same for the case of non adjacent sides. Here the equation is a little trickier, but ultimately the line length is sqrt((x-y)^2+1). Dividing by 3 and integrating gives us:

integrate integrate (sqrt((x-y)^2+1)/3) dx dy from 0 to 1 from 0 to 1

or 0.358879.

Keep in mind that the average of an integral is that integral times 1/(b-a) - in this case, b-a is 1, so we don’t need to do anything else.

So, lets get these two numbers together. Remember, we have 4 sets of the adjacent sides, and 2 sets of the non-adjacent sides. And since those sides all integrated over the same values, we should just be able to average the sets, right? So, averaging those in proportion gives us:

(40.255065 + 20.358879)/6

or 0.28967.

Do a sanity check with a simple Monte Carlo simulation.


So I integrated all those together and divided by 1/2

Here’s my wild guess: 0.52026

In other words, sqrt(2)/e


I ran a simulation with 100M iterations and got 0.521388
The value seemed to be decreasing with greater numbers of iterations. Then I plugged the value into this handy tool and took the first result :rolleyes:

After seeing the original post earlier today, I did not check back until I had this. It does not seem that anyone got as far as I did:

Initial thoughts and reasoning:
After looking at the problem for about a minute, my “eyeball integrator” came up with “a bit ovor 0.5”. While I did not go through all these steps conciously, this is roughly what I think happened:

Consider a circles of radius 1/2 unit and 1 unit. Considering 25 points in a 5x5 grid, including the corners of the square and the center of the square.
When the circles are co-centered with the square, about 79% of the square is inside the small circle, all within the large circle. There is one such point.
When the circles are located halfway between the center and an edge, the small circle covers 66% of the square, the large circle all of the square. There are four such points.
When the circles are located halfway between the center and a corner, the small circle covers 55% of the circle. The large circle covers 99+% of the square (ignore). There are four such points.
When the circles are located at the center of an edge, the small circle covers 37% of the large square. About 4% of the square is outside of the large circle. There are four such points, but only half of a square centered here is inside the square. Weight=2.
When the circles are located halfway between the previous point and a corner, the small circle covers 33% of the square. About 8% of the square is outside the large circle. There are 8 such points, only half of the square centered here is inside the big square. Weight = 4.
When the circles are located at the corner, only 18% of the square is inside the small circle, and 21% of the square is outside the large circle. There are four points, onle a quarter of a square centered here is inside the big square. Weight=1.
Therefore, for an arbitrary point, .79 + .18 + 2 * .37 + 4 * (.66 + .55 + .33) / 16 = 49% of segments are shorter than 1/2 a unit. Likewise, (.04*2 + .08 * 4 + .21 * 1)/16 = 4% of the segments are longer than a full unit. I expect the answer to be close to a half unit, perhaps a bit larger.

Next, a numerical solution:

#!/bin/gawk -f
  for (steps=10; steps<200; steps+=10){
    for (i=0; i< steps; i++)
      for (j=0; j< steps; j++)
        for (k=0; k< steps; k++)
          for (l=0; l< steps; l++) {
            numer += sqrt((i-j)*(i-j)+(k-l)*(k-l));
    print steps, numer/denom


10 0.518687
20 0.520757
30 0.521121
40 0.521247
50 0.521304
60 0.521336
70 0.521354
80 0.521366
90 0.521375
100 0.52138
110 0.521385
120 0.521388
130 0.521391
140 0.521393
150 0.521394
160 0.521396
170 0.521397
180 0.521398
190 0.521399

This appears to meet Ether's criterion of 5 significant digits at ".52140".

Nonetheless, let's try to find a closed-form solution.  As indicated in my numerical solution, the problem is at its most basic the ratio of two quadruple integrals over the interval of [0,1], the content of the numerator integral being the hypotenuse formula and the denominator being unity.  As integrating 1 over the range 0 to 1 yields 1, doing this four times still yields one, so we can skip the denominator.  I recognize that this solution counts all of the non-zero-length segments twice,and the zero-length segments only once.  As there are infinitely fewer zero-length segments than non-zero-length segments, and I am calculating an average, this is a problem that I can safely ignore. 
I had originally envisioned (i,j) as one point, and (k,l) as the other point, and did not see how to approach the problem. Fortunately, I used the form above for the numerical solution, which seems to indicate a simpler integral.  
Let us address the simpler problem of "What is the average length of a segment within the unit line segment?" and keep track of the statistics.  This problem is more easily understood by considering individual points for the "outer integral" and what the lengths are for the "inner integral".
When the outer integral as near the center, the inner integral yield lengths equally likely between 0 and 0.5 (one each way).  When the outer integral variable is at an end, the inner integral value yields lengths equally likely between 0 and 1.  
When the outer integral variable has a value x&lt;0.5, the inner integral yeilds 2 solutions for numbers less than x, and one for answers between x and 1-x.  
These indicate a linear ramp of "x-lengths" from a maximum likelihood of 1.0 at lenth zero, to a likelihood of 0.0 at length one.  Multiplying this by the line length yields the integral of 2(1-x)(x), or 2(x-x2).  The integral of 2x over 0..1 is one.  The integral of 2x2 over 0..1 is 2/3. The average length of a segment on a unit segment is 1/3.

The simple linear ramp found in the integral above implies a fairly simple two-variable integral for the average length of a segment in a square:
4ʃʃ(1-x)(1-y)√(x2+y2)dxdy, both integrals being over 0..1.
After looking for a couple of hours, I’ll admit that I do not see an obvious closed-form solution.  Unless I find one in the thread, I’ll probably search for one on and off for the next two or three months.

**No one has yet come up with an answer correct to 5 decimal places.

OK to use whatever computational tools you like.**

Here’s a script I wrote in order to try to solve this problem. My second attempt was to, for each line length find the area where the center of the line could be. This resulted, for line length 0 --> 1 in a box with quarter circles cut out. For 1 --> sqrt(2), it was a little more complicated. Overall, it gives the answer of 0.77019523 (1.78 KB) (1.78 KB)

Sanity-check your answer with a simple Monte Carlo simulation.

*this can be done with a one-line AWK script

What is a Monte Carlo simulation?

Randomly sample the set of all line segments, and find the mean length of only the segments in the sample (a finite number chosen to suit your computational resources, rather than the uncountably infinite number in the entire set). If the sampling matches the probability distribution/weighting of the set, then the law of large numbers says your mean will approach the true mean as sample size increases.

Aren beat me to it.

One-line AWK script:

BEGIN{for(i=1;i<100000;i++){sum+=sqrt((rand()-rand())^2+(rand()-rand())^2);}print sum/i;}

I ran the Monte Carlo thing in a python script and the answers, each run with 1,000,000 trials, were all around 0.333. I’m gonna keep thinking, though, how to get an exact answer.

EDIT: I found the error - the value comes out to 0.521

I am thinking the answer is a number that is very small (close to 0), as you can draw a ton more 0.000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000001 lines than you can 1 inch (or larger on a diagonal) lines.

Yet if you look at the Monte Carlo simulation, it looks closer to 0.521"