Rearranging Digits – Brute Force Solutions


Posted on December 26, 2016


In this post we will revisit an example from the previous post Rearranging Digits with a brute force coding solution, and expand with an another example (analytically) that is trivial with with a brute force solution. The two examples are:

  1. Geometric Progression: Find all ordered pairs of distinct single-digit positive integers $(A, B, C)$ such that the three 3-digit positive integers $ABC$, $BCA$, and $CAB$ form a geometric progression.
  2. Arithmetic Progression: Find all ordered pairs of distinct single-digit positive integers $(A, B, C, D, E, F)$ such that the four 6-digit positive integers $ABCDEF$, $CDEFAB$, $FABCDE$, and $DEFABC$ form an arithmetic progression.
These solutions will highlight the pros (simplicity and guaranteed solutions) and cons (far less insight) of sacrificing analytic solutions for brute force solutions. These brute force solutions clearly have exponential run time. The algorithmic complexity is not a significant hurdle for three or six digit numbers, but would be for larger numbers. Coding examples will be presented in MATLAB and Python.

In the previous post we walked through a somewhat informed, yet clumsy and incomplete analytic solution that end with a less than satisfying guess and check (not to say there aren’t elegant solutions out there, I didn’t quite get there though).

This can be remedied quite easily by brute forcing our way through each of the $9^3$ possibilities for $A$, $B$, and $C$.  It would look something like this in MATLAB

function [] = three_digit_with_fors()
for a=1:9
    for b=1:9
        for c=1:9
            three_digit_geometric([a, b, c])
        end
    end
end
end %function

This calls the function three_digit_geometric:

function [] = three_digit_geometric(final)
tens = [100 10 1];
N1=dot(final([1 2 3]), tens);
N2=dot(final([2 3 1]), tens);
N3=dot(final([3 1 2]), tens);
 
if N2/N1==N3/N2 && N3~=N2
    fprintf('%d, %d, %d -> ', final);
    fprintf('%d, %d, %d, r=%g \n', N1, N2, N3, N3/N2);
end
end

If we wanted to generalize to higher digit numbers, the for loops would get out of hand pretty quickly, especially in with Python with forced indentation (good luck with PEP 8). To alleviate this (at least for our eyes), let’s generate a recursive function call to put these for loops in.

Now, if you’ve been around the block a bit, you know that python has itertools.permutations and MATLAB has perms – and you could use those.  If you do, be wary of edge cases! The next problem is a rearrangement where some of the digits can be 0, but others are leading digits and therefore cannot. Tread lightly.

Also, perms in MATLAB appears to return the permutations in a large array (as opposed to the memory friendly generator for itertools.permutations in Python). Also, perms is recursively implemented so it’s quite easy to reach max recursive depth.  Just try perms([1:n] for a large enough value of n (11 does it for me on 32-bit MATLAB 2011).

Back to our recursion.  Let’s call the function rfor (short for recursive for).  When given an arbitrary number of arrays, it will recursively loop through them.  This will have the same shortcomings that come with any recursive algorithm, without even writing it one can posit that it will be slower and will risk reaching max recursive depth for large problems.

All of that aside, here’s what rfor would look like in MATLAB

function [] = rfor (current, final_func, level, varargin)
a=varargin;
for i=1:length(a{1})
    current(level) = a{1}(i);
    if length(a)>1
        rfor (current, final_func, level+1, a{2:length(a)})
    else
        feval(final_func, current);
    end
end
end

If you’re wondering whether or not time is saved by tracking the level of recursion instead of dynamically changing the size of the array, a quick benchmark showed a $>10\%$ speedup over the assignment current = [current a{1}(i)];

An example use would be

a=1:9; b=1:9; c=1:9;
rfor([], 'three_digit_geometric', 1, a, b, c);

The output is

4, 3, 2 -> 432, 324, 243, r=0.75 
8, 6, 4 -> 864, 648, 486, r=0.75

In case you aren't familiar with it, varargin is a way to pass a variable number of parameters to a MATLAB function. If the length is 1, we are at the lowest level of recursion required.

Python has an simple implementation as well.

def three_digit_geometric(final):
    tens = [100, 10, 1]
    n1 = sum([tens[i]*final[i] for i in range(len(final))])
    n2 = 10 * (n1 % 100) + n1 // 100
    n3 = 10 * (n2 % 100) + n2 // 100
    if n2 / float(n1) == n3 / float(n2) and n3 != n2:
        print n1, n2, n3, '->', final, 'r=%g' % (n3 / float(n2))


def rfor(current, final_func, level, args):
    for i in args[0]:
        current[level] = i
        if len(args) == 1:
            globals()[final_func](current)
        else:
            rfor(current, final_func, level+1, args[1:])


def main():
    a, b, c = [range(1, 10)] * 3
    rfor([0] * 3, 'three_digit_geometric', 0, [a, b, c])

As a last task, let’s solve one more fun challenge: The four six-digit numbers $ABCDEF$, $CDEFAB$, $FABCDE$, and $DEFABC$ for an arithmetic sequence. Find all possible values of ($A$, $B$, $C$, $D$, $E$, $F$).

This could probably be solved analytically. To avoid 3 equations and 6 unknowns, you would want to treat $AB$ and $DE$ as individual variables, reducing the problem to 4 unknowns (and knowledge of which should be 1-digit a 2-digit). This will be left to the reader as an exercise ;)

Here’s the code for the brute force solution (note how some digits are allowed to be zero but not others).

a=1:9; b=0:9; c=1:9; d=1:9; e=0:9; f=1:9;
rfor([], 'six_digit', a, b, c, d, e, f);

function [] = six_digit(final)
tens = [100000 10000 1000 100 10 1];
N1 = dot(final(1:6), tens);
N2 = dot(final([3:6 1:2]), tens);
N3 = dot(final([6 1:5]), tens);
N4 = dot(final([4:6 1:3]), tens);
 
if N2-N1==N3-N2 && N4-N3==N3-N2 && N3>N2
    fprintf('%d, %d, %d, %d, %d, %d -> ', final);
    fprintf('%d, %d, %d, %d, d=%d \n', N1, N2, N3, N4, N3-N2);
end
end

Here’s the output

1, 0, 2, 5, 6, 4 -> 102564, 256410, 410256, 564102, d=153846 
1, 5, 3, 8, 4, 6 -> 153846, 384615, 615384, 846153, d=230769 
1, 6, 2, 3, 9, 3 -> 162393, 239316, 316239, 393162, d=76923 
2, 1, 3, 6, 7, 5 -> 213675, 367521, 521367, 675213, d=153846 
2, 6, 4, 9, 5, 7 -> 264957, 495726, 726495, 957264, d=230769 
2, 7, 3, 5, 0, 4 -> 273504, 350427, 427350, 504273, d=76923 
3, 2, 4, 7, 8, 6 -> 324786, 478632, 632478, 786324, d=153846 
3, 8, 4, 6, 1, 5 -> 384615, 461538, 538461, 615384, d=76923 
4, 3, 5, 8, 9, 7 -> 435897, 589743, 743589, 897435, d=153846 
4, 9, 5, 7, 2, 6 -> 495726, 572649, 649572, 726495, d=76923 
6, 0, 6, 8, 3, 7 -> 606837, 683760, 760683, 837606, d=76923 
7, 1, 7, 9, 4, 8 -> 717948, 794871, 871794, 948717, d=76923 

So there are $12$ solutions.

Again if you are wondering where these came from, look no further than repeating decimals for fractions with $117$ (one of the many divisors of $999, 999$) in the denominator.

Rearranging Digits – Analytic Solutions


Posted on September 22, 2016


In this post we will work through some digit rearrangement problems.
  1. Six-Digit Numbers: The 6-digit number $ABCDEF$ is exactly $23/76$ times the number $BCDEFA$ (where $A$, $B$, $C$, $D$, $E$, and $F$ represent single-digit non-negative integers and $A,B>0$). What is $ABCDEF?$
  2. Arithmetic Progression: Find all ordered pairs of distinct single-digit positive integers $(A,B,C)$ such that the three 3-digit positive integers $ABC$, $BCA$, and $CAB$ form an arithmetic progression.
  3. Geometric Progression: Find all ordered pairs of distinct single-digit positive integers $(A,B,C)$ such that the three 3-digit positive integers $ABC$, $BCA$, and $CAB$ form a geometric progression.

This post will include analytic solutions, a future post will show some code-based solutions.

Six-Digit Numbers

Let's attack the first problem, repeated for clarity:

The 6-digit number $ABCDEF$ is exactly $23/76$ times the number $BCDEFA$ (where $A$, $B$, $C$, $D$, $E$, and $F$ represent single-digit non-negative integers and $A,B>0$). What is $ABCDEF?$

We will let $x=BCDEF$ be a five-digit number. Then

\begin{eqnarray*} ABCDEF &=& \frac{23}{76}\cdot BCDEFA \\ 100000\cdot A+x &=& \frac{23}{76} (10\cdot x+A) \\ 7600000\cdot A+76\cdot x &=& 230\cdot x+23\cdot A \\ 7600000\cdot A+76\cdot x &=& 230\cdot x+23\cdot A \\ 7599977\cdot A &=& 154\cdot x \\ 98701\cdot A &=& 2\cdot x \end{eqnarray*}

Remember, $x$ is a five-digit integer and $A$ is a one-digit number, so one such solution is $A=2$ and $x=98701$. Are there other solutions though? Well, $A$ must be even for $x$ to be an integer, but if $A>2$ then $x$ would be a six-digit number. So there is just the one solution. As a check, this makes our numbers $ABCDEF=298701$ and $BCDEFA=987012$.

(Side note: if you are wondering where these seemingly random six-digit numbers came from, check out the repeating decimals $\frac{23}{77}$ and $\frac{76}{77}$. Fractions whose denominators are factors of $999,999$ will create repeating decimals with cycles of six digits or less and quite often use the same digits in different order.

Arithmetic Progression

Now let's look at the second problem, repeated for clarity: Find all ordered pairs of distinct single-digit positive integers $(A,B,C)$ such that the three 3-digit positive integers $ABC$, $BCA$, and $CAB$ form an arithmetic progression.

We will approach this the same way, let's define the 3 numbers \begin{eqnarray*} N_1&=&100A+10B+C \\ N_2&=&100B+10C+A \\ N_3&=&100C+10A+B \end{eqnarray*}

These terms are in arithmetic progression, which can be exploited. Let the common difference of this sequence be $d$, then \begin{eqnarray*} N_2-N_1&=& 100B+10C+A-(100A+10B+C)=d\\ N_3-N_2&=& 100C+10A+B-(100B+10C+A)= d\\ N_3-N_1&=& 100C+10A+B -(100A+10B+C)= 2d \end{eqnarray*}

After simplifying, we have \begin{eqnarray*} -99A +90B +~9C - d=0\\ ~9A -99B +90C - d =0\\ -90A ~-9B +99C - 2d =0 \end{eqnarray*}

We have a system with three equations and four unknowns, which should make you a little anxious (and the third equation is just the sum of the first two). However, if we simplify a little bit (one could do this by hand, but rref in MATLAB is our friend)

Using rref on the coefficient matrix gives us

1  0  -1  0.021021
0  1  -1  0.012012
0  0   0         0

We need integer solutions, so where do we go from here? Take a look at the two decimals, $0.\overline{021}$ and $0.\overline{012}$. The fraction equivalents are $\displaystyle\frac{7}{333}$ and $\displaystyle\frac{4}{333}$ respectively.

The equations are then \begin{eqnarray*} a-c+\frac{7}{333}d=0 \\ b-c+\frac{4}{333}d=0 \end{eqnarray*}

So if we need integers, let's get rid of the fraction by letting $d=333$. Then we have $a-c=-7$, and $b-c=-4$. This gives us the solutions $(a,b,c)=(1,4,8)$ and $(2,5,9)$. Did we get them all though? We can also let $d=-333$, which yields $a-c=7$, and $b-c=4$. This gives us the solutions $(a,b,c)=(8,5,1)$ and $(9,6,2)$.

We could try multiples of $333$ as well, but will quickly run into trouble. If $d=666$, then $a-c=14$, and $b-c=8$ but $a$ and $c$ must be single digit numbers and larger values of $d$ will violate this. We can also let $d=0$ but that just gives us the trivial solutions of an arithmetic progression with common difference $0$.

Therefore our four solutions are \begin{eqnarray*} 148,~481,~814 \\ 185,~518,~851 \\ 259,~592,~925 \\ 296,~629,~962 \\ \end{eqnarray*}

If you are wondering if these numbers came out of nowhere, check out the decimal equivalents of the $27$ths. It is no coincidence that $\displaystyle\frac{4}{27}=0.\overline{148}$, $\displaystyle\frac{13}{27}=0.\overline{481}$, and $\displaystyle\frac{22}{27}=0.\overline{814}$.

Geometric Progression

Now let's look at the last problem, again repeated for clarity: Find all ordered pairs of distinct single-digit positive integers $(A,B,C)$ such that the three 3-digit positive integers $ABC$, $BCA$, and $CAB$ form a geometric progression.

For the geomtric sequence \begin{eqnarray*} N_2-r\cdot N_1&=& 100B+10C+A-r(100A+10B+C)= 0\\ N_3-r\cdot N_2&=& 100C+10A+B-r(100B+10C+A)= 0 \end{eqnarray*}

There are other manipulations, but I haven't found any that are linearly independent of the two equations shown.

Not only do we have two equations and four unknowns, the equations are non-linear. With some simplification, we have \begin{eqnarray*} 100B+10C+A-r(100A+10B+C)= 0\\ 100C+10A+B-r(100B+10C+A)= 0 \end{eqnarray*}

Using the symbolic toolbox and rref we can simplify

% a  b                          c
[ 1, 0,     (r - 10)/(10*r^2 - 1)]
[ 0, 1, (r*(r - 10))/(10*r^2 - 1)]

Not incredibly helpful. However, we can reverse the columns of the array (now instead of A,B,C we have C,B,A) and rref gives us

% c  b                      a
[ 1, 0, (10*r^2 - 1)/(r - 10)]
[ 0, 1,                    -r]

Which is a little cleaner. So where do we go from here? Well, from the last equation we have $b-ra=0 \rightarrow r=b/a$ from the second equation. So we could just try every $(a,b)$ combination ($72$ since $a\neq b$) and see if $$c=\displaystyle\frac{10r^2 - 1}{10 - r}a$$ is an integer. Still some brute force required though. You could also substitute $r=b/a$ into $c=\displaystyle\frac{10r^2 - 1}{10 - r}a$ yielding $$ c= \displaystyle\frac{10b^2-a^2}{10a - b} $$ which we just have to ensure is an integer. But again, it looks like guessing and checking is required. You could even simplify further with some polynomial division $$ c = \displaystyle\frac{10b^2-a^2}{10a - b} = \displaystyle\frac{999a^2}{10a - b} - 10b - 100a$$ which takes you down a path of finding $a$ and $b$ such that $10a - b$ is a divisor of $999a^2$, but that may even be more dangerous than guess and checking since it's so easy to leave out a case and every case must be checked to ensure $c$ is a positive single digit integer.

You can though, and it will lead you to two ordered triples, $(a,b,c)=(4,3,2)$ and $(8,6,4)$ (the second is just a less exciting multiple of the first, both have the common ratio $r=\frac{3}{4}$. The geometric series are $432,~324,~243$ and $864,~648,~486$.

This last solution left me feeling less than satisfied. In the next post we will do a code-based solution to ensure with some certainty that those are the only values.

Speeding up Code with Vectorization


Posted on January 15, 2016


In a previous post we numerically solved the heat equation with an explicit finite difference formulation, one of the simpler 2D problems in numerical analysis.

In this post we are going to speed up that computation with some vectorization. Results show we can more than double the speed of our code with some simple changes (but also that we can make it dangerously slow if implemented incorrectly).

The code examples below are in MATLAB, but you can take advantage of vectorization with other interpreted languages such as Python as well. You can read more about vectorization of MATLAB code here.

Step 1: Initialize Variables

Not much changes in this section except that we will replace nx and ny with np and use a square mesh for simplicity.

%Initialize Variables
dt=.005;                 %time step
nts=500;                 %number of time steps
alpha=.01;               %thermal diffusivity
l=1;                     %reference length plate
np=50;                   %number of points (x and y)
dx=l/(np-1); dy=dx;      %discretized distance

%Temperature and derivatives stored in 2D arrays
u = zeros(np); uxx = zeros(np); uyy = zeros(np);

Step 2: Apply Boundary Conditions

In the following sections we will use snippets outlined in yellow to show the original code, green to show standard vectorization, and red to show sliced vectorization. In the original code (below) we looped through the two edges to apply the boundary conditions

%Boundary Conditions, original loop
for i=1:np
   u(1,i)=1;
end

for i=1:np
   u(i,np)=1;
end

This is a one-time computation so there isn’t significant time saved by changing this, but regardless we can do better by both simplifying and speeding up our code.

%Boundary Conditions, % Vectorization without slicing
u(1,:)=u(1,:)+1;
u(:,np)=u(:,np)+1;
u(1,np)=1;

We had to correct the value of at the coordinate $(1,np)$ because are adding to it twice. So why didn’t we just take a slice of the array, like 2:np? Here is what it looks like with sliced vectorization

%Boundary Conditions, vectorization with slicing
part=2:np;
u(1,:)=u(1,:)+1;
u(part,np)=u(part,np)+1;

This looks a little nicer, but actually has a significant impact on run time. Results of a benchmark comparison with 10,000,000 test cases is shown in the table below

Original loop 13.46 s
Vectorization with slicing 33.29 s
Vectorization without slicing 10.29 s

So while sliced vectorization looks nice and doesn't require a correction, it more than triples execution time.

Step 3: Time Loop

The more important changes come with the code inside of the time loop, so let's shift our focus there. Here is the original code.

% Original Loop
for t=1:nts

   %Calc the Spatial 2nd Derivative, Centered in Space
   for i=2:np-1
      for j=2:np-1
         uxx(i,j)=(u(i+1,j)-2*u(i,j)+u(i-1,j))/(dx)^2;
         uyy(i,j)=(u(i,j+1)-2*u(i,j)+u(i,j-1))/(dy)^2;
      end
   end

   %Update Temperature, Forwards in Time
   for i=2:np-1
      for j=2:np-1
         u(i,j)=u(i,j)+alpha*(uxx(i,j)+uyy(i,j))*dt;
      end
   end
end

We can simplify this code greatly with vectorization. The code below requires corrections, but improves run time drastically.

% Vectorization without slicing
for t=1:nts

   %Calc the Spatial 2nd Derivative, Centered in Space
    for i=2:np-1
        uxx(i,:)=(u(i+1,:)-2*u(i,:)+u(i-1,:))/(dx)^2;
        uyy(:,i)=(u(:,i+1)-2*u(:,i)+u(:,i-1))/(dy)^2;
    end
    uxx(2,1)=0; %correction
    uyy(np,np-1)=0; %correction
    % Note, we do not need to correct other BC points since the uxx,uyy=0 - it would be safer to tho
    
    %Update Temperature, Forwards in Time
    u=u+alpha*(uxx+uyy)*dt;
end

Note that we had to once again make a correction after the vectorized calculations. Here is what it looks like with sliced vectorization that wouldn't require corrections

% Vectorization with slicing
part=2:np-1;
for t=1:nts

    %Calc the Spatial 2nd Derivative, Centered in Space
    for i=2:np-1
        uxx(i,part)=(u(i+1,part)-2*u(i,part)+u(i-1,part))/(dx)^2;
        uyy(part,i)=(u(part,i+1)-2*u(part,i)+u(part,i-1))/(dy)^2;
    end
        
    %Update Temperature, Forwards in Time
    u=u+alpha*(uxx+uyy)*dt;
end

Once again this looks much nicer but as we see below, the slices are once again much slower.

The comparison of all three methods run 30 times with 5000 time steps each are shown in the following table.

Original loop 25.48 s
Vectorization with slicing 84.31 s
Vectorization without slicing 10.68 s

With properly implemented vectorization we can drastically speed up our code, but as the results show it must be done carefully and tested to ensure both accuracy (by making the proper corrections) and speed.