Kaprekar Series
Nice name isn't it? Well it is named after an Indian mathematician Mr. D.R. Kaprekar who discovered it.
Back in 1989 I read about this series in a Mathematical journal, and was fascinated the moment I read about it. It is simple to generate. Just consider a four digit number ... hmmm lets say 3241. Now order the digits in ascending and descending order: that gives us 4321 and 1234. Now subtract the two numbers - which gives us 3087. Now repeat this operation and notice what happens ...
3241: 4321 - 1234 ------- 3087 3087: 8730 - 0378 ------- 8352 8352: 8532 - 2358 ------- 6174 6174: 7641 - 1467 ------- 6174 6174: 7641 - 1467 ------- 6174
Notice that you are stuck with 6174, once you reach that number. The fun part is that all four digit numbers, except 9 of them (no awards for guessing that right!), lead to this very same number! The number of steps required to reach the number might be different, but sooner or later you'll land up with 6174.
You can try this with larger or smaller numbers. For three digit numbers you end up with 495.
It gets more interesting with 2 digit numbers. Consider 43.
93: 93 - 39 ----- 54 54: 54 - 45 ----- 09 09: 90 - 09 ----- 81 81: 81 - 18 ----- 63 63: 63 - 36 ----- 27 27: 72 - 27 ----- 45 45: 54 - 45 ----- 09
Notice that the series does not converge to one number - but it loops over a series of numbers: 09 -> 81 -> 63 -> 27 -> 45 and then back to 09 ad infinitum. Now this happens for all the 2-digit numbers (except 9 of them).
For number with more number of digits, it gets even more interesting. For 5-digit numbers, it can be one of three possible series; for 6-digit it can either end up in a series or could end up with one of two numbers (631764 or 549945).
The program
Back then I didn't have a computer at my disposal and so I did those calculations mostly by hand or by my calculator. Of course finding a series would take a long time and the possibility of error was high; also it would get pretty messy for large numbers. I remember finding some series for up to 12-13 digit numbers and then that was it ... I must have found something else that was exciting. Since it was impossible for me to sit down and do the calculations for each and every number, there was no guarantee that I had found all the series. This job was definitely for a computer program. After I learnt programming, the thought was there to create a program to find the Kaprekar series, but I don't remember actually writing down the program. So one day last week I decided that now was the time to create the program.
It started out with a simplistic program that would churn out the series after going through all the numbers; and went through a series of refinements.
The program is written is java. I know that C++ would be a better language to do this, but ...
The Logic
The core logic of the program is simple.
- Take a number.
- Break it up into digits.
- Sort the digits.
- Create the largest and smallest numbers from these digits.
- Subtract the two numbers.
- Check for a pattern.
- If a pattern was not found then repeat steps 2 onwards using this new number.
If a pattern is found then it is stored in a repository. Before storing a check is made if such a series already exists in the repository.
The Base Class
I created a base class to do the basic tasks of storing and retrieving the series. It would also take up the task of separating the digits, sorting the digits and finding the difference between the larger and smaller numbers - in essence getting the "next" number.
/**
* Base class for storing the Kaprekar series.
*/
public abstract class Kaprekar {
private Vector patterns = new Vector();
protected int nDigits;
public Kaprekar() {
super();
}
protected void addPattern(Series s) {
...
}
/**
* Abstract method that would find out all the
* Kaprekar series for a given size of the number.
* @param nDigits The number of digits in the number.
*/
public abstract void getAllSeries(int nDigits);
protected long getDiff(short[] digits) {
...
}
/**
* Check if the series has already been stored.
* @return boolean
* @param ser
*/
public boolean isKnown(Series ser) {
return patterns.contains(ser);
}
/**
* Sort the digits.
* @return short[]
* @param digits short[]
*/
protected short[] sort(short[] digits) {
...
}
/**
* Abstract method to find the series to which
* the specified number converges.
*
* @return
* @param n The number.
*/
public abstract Series findSeries(long n);
/**
* Method to find the "next" number.
* @return long
* @param n long
*/
protected long getNext(long n) {
...
}
/**
* Method to dump all the gathered patterns on to the screen.
*/
public void printPatterns() {
...
}
}
I have deleted the method implementations above to conserve space :) Click Here for the source code.
The Series class
Another class that was created was the Series class. This would actually store one Kaprekar series. It is basically a Vector.
The first implementation
The first implementation (named KaprBF) looped through all the numbers for the given digit and searched for patterns. This was obviously a very crude way of finding the series.
Kaprekar1 is a sub-class of Kaprekar and so has to implement two methods - getAllSeries and findSeries. And a main method is required to run it.
Here's the implementation:
public void getAllSeries(int nDig) {
nDigits = nDig;
long min, max;
min = (long) Math.pow(10, nDigits - 1); //the smallest n-Digit number
max = (long) (Math.pow(10, nDigits) - 1); //the largest n-Digit number
checkedNos = new Stack();
for(long i = min; i <= max; i++) {
Series s = findSeries(i);
addPattern(s);
}
}
public Series findSeries(long n) {
int location = -1;
Stack stack = new Stack();
while(true) {
stack.push(new Long(n));
n = getNext(n);
location = stack.search(new Long(n));
if( location != -1)
break;
}
List ser = stack.subList(stack.size() - location, stack.size());
return new Series(ser);
}
public static void main(String[] args) {
Kaprekar1 k = new Kaprekar1();
k.getAllSeries(i);
k.printPatterns();
}
A pretty small program. It runs fairly well for small numbers, but as you can guess, is all trouble for the bigger guys.
To give you an idea of the time taken:
For 2-digit numbers it took 290 ms!
For 3-digit 931 ms
For 4-digit 11 sec 987 ms
For 5-digit 2 min 28 sec 243 ms
You can see the steep rise in processing time as the number of digits is increased.
Click Here for the code.Improvement - Checked numbers
So the next step was to determine how to make this process faster. One simple way is to keep track of the numbers that have already been checked. If any point in time the program encounters a number that it has already worked on, it quits processing for the current number and proceeds to the next number.
This was accomplished by using another Stack that was a class member variable. And there was a dramatic improvement in performance:
For 2-digit 91 ms!!
For 3-digit 260 ms
For 4-digit 5 sec 859 ms
For 5-digit 56 sec 921 ms
This was good, but not good enough. One problem that I could foresee was memory related and I bet you can see it too.
Click Here for the code.Improvement - multiples.
Take a close look at the numbers in the Kaprekar series:
2-digit: 9 -> 81 -> 63 -> 27 -> 45
3-digit: 495
4-digit: 6174
5-digit: 74943 -> 62964 -> 71973 -> 83952
63954 -> 61974 -> 82962 -> 75933
53955 <-> 59994
Notice that all of them are multiples of 9! So what if we checked only multiples of 9, instead of all the numbers? That would reduce the number of numbers to check to 1/9th the original!
The time taken was: 2 -> 110 ms 3 -> 90 ms 4 -> 581 ms 5 -> 6 sec 299 ms 6 -> 2 min 34 sec 102 ms 7 -> 27 min 47 sec 304 ms
On checking the effect of removing the above "Checked Numbers" change, I noticed that it didn't make a drastic difference. I was willing to take on the overhead to reduce the memory burden.
2 -> 120 ms 3 -> 170 ms 4 -> 1 sec 432 ms 5 -> 16 sec 354 ms
Improvement - multiples again!
Now pay an ever closer attention to the numbers - to the series for odd number of digits. Ok, I didn't find it out that way, so I'll tell you how I found it out ...
Consider a 2-digit number whose digits have already been sorted.
Let the larger of the two numbers be n1.
n1 = 10 * a + b
So the smaller number will be
n2 = 10 * b + a
The difference will be:
n1 - n2 = (10 * a + b) - (10 * b + a)
= 9 * (a - b)
So that explains it - they are all multiples of nine. Cool thing. It reduced the number of digits to go through from 99 to 11!
Similarly for 3-digit numbers
Diff3 = (100 a + 10 b + c) - (100 c + 10 b + a)
= 99 (a - c)
Interesting.
Similarly
Diff4 = 9 * {111 (a - d) + 10 (b - c)}
Diff5 = 99 * {101 (a - e) + 10 (b - d)}
Diff6 = 9 * {11111 (a - f) + 1110 (b - e) + 100 (c - d)}
Diff7 = 99 * {10101 (a - g) + 1010 (b - f) + 100 (c - e)}
So you saw the pattern. All the odd-digited series would be a multiple of 99! To the more mathematical amongst you, I leave it as an exercise to prove that. For the rest - Click Here [or here if you have a mathml enabled browser] you lazy bums. BTW, this link is being tracked using cookies and cup-cakes; so I'll know who are lazy (which is all of you, right?) and who is not.
Adding this new improvement in the code dramatically reduces the time for the odd-digited numbers:
2 -> 110 ms 3 -> 30 ms 4 -> 581 ms 5 -> 621 ms 6 -> 2 min 35 sec 293 ms 7 -> 3 min 4 sec 395 ms
Click Here for the code.
Improvement - the equation
Now instead of using the multiples, what about using the above equations themselves???
For that we'll need to look into the pattern in the equations :)
Diff2 = 9 (a - b)
Diff3 = 99 (a - c)
Diff4 = 999 (a - d) + 90 (b - c)
Diff5 = 9999 (a - e) + 990 (b - d)
Diff6 = 99999 (a - f) + 9990 (b - e) + 900 (c - d)
Diff7 = 999999 (a - g) + 99990 (b - f) + 9900 (c - e)}
Ok. You see the pattern:
Diffn = (10^n - 1) * (...) + {10^(n-3) - 1} * 10 * (...) + ...
There is another interesting thing out here.
Consider Diff2.
a >= b and 0 <= a, b <= 9.
This implies, 0 <= (a - b) <= 9.
Consider Diff4
a >= b >= c >= d and 0 <= a, b, c, d <= 9
This implies 0 <= (a - d), (b - c) <= 9
Now, a >= b
=> (a + c) >= (b + c)
c >= d
=> (b + c) >= (b + d)
Therefore, (a + c) >= (b + d)
=> (a - d) >= (b - c)
Thus,
0 <= (b - c) <= (a - d) <= 9
It can similarly be proved that for Diff5
0 <= (b - d) <= (a - e) <= 9
and that for Diff6
0 <= (c - d) <= (b - e) <= (a - f) <= 9
A neat little discovery that would reduce the time taken by the program considerably.
So armed with these changes, you can find the Kaprekar series for 18-digit numbers in 3 minutes!!! Imagine how much time it would have taken for this on a calculator!
2 -> 160 ms 3 -> 70 ms 4 -> 180 ms 5 -> 251 ms 6 -> 801 ms 7 -> 881 ms 8 -> 2 sec 353 ms 9 -> 4 sec 256 ms 10 -> 6 sec 199 ms 11 -> 6 sec 599 ms 12 -> 13 sec 940 ms 13 -> 13 sec 960 ms 14 -> 34 sec 650 ms 15 -> 31 sec 515 ms 16 -> 1 min 17 sec 802 ms 17 -> 1 min 10 sec 321 ms 18 -> 3 min 4 sec 134 ms
Click Here for the code.
Improvement - Huge numbers
Try running the previous program for 19-digit numbers and you're in for a surprise. Well, not really - since you know that "long" cannot handle these huge numbers. So what do we do? Give up and say, well it's a problem of the Language - of those guys who designed java. Its all because of them that I can't handle these big integers! They are to be blamed for all my problems! And you'll soon know that it won't help :)
Ok. I ended up writing a new class called Huge. That can possibly deal with integers up to 1038654705646 - 1. But I wonder which crazy soul would need to deal with such huge integers ... well on second thoughts maybe I'll be that person someday ;)
To deal with Huge, three more methods need to be added - well really it's the change of the signatures of three of the methods:
public Series findSeries(Huge n)
protected Huge getDiffH(short[] digits)
protected Huge getNext(Huge n)
And how long does it take? Well, right now, for a 35-digit number it takes a complete hour!! At least it better than not being able to generate it at all, right? NO. I need to work on this one - some time ... maybe after another decade :-P
2 -> 91 ms 3 -> 0 ms 4 -> 20 ms 5 -> 40 ms 6 -> 50 ms 7 -> 60 ms 8 -> 190 ms 9 -> 281 ms 10 -> 490 ms 11 -> 581 ms 12 -> 1 sec 312 ms 13 -> 1 sec 392 ms 14 -> 3 sec 405 ms 15 -> 3 sec 425 ms 16 -> 7 sec 921 ms 17 -> 8 sec 413 ms 18 -> 17 sec 985 ms 19 -> 24 sec 705 ms 20 -> 45 sec 526 ms 21 -> 53 sec 908 ms 22 -> 1 min 35 sec 357 ms 23 -> 1 min 48 sec 537 ms 24 -> 3 min 12 sec 396 ms 25 -> 3 min 18 sec 255 ms 26 -> 6 min 20 sec 427 ms 27 -> 5 min 49 sec 813 ms 28 -> 12 min 5 sec 756 ms 29 -> 10 min 11 sec 962 ms 30 -> 23 min 43 sec 971 ms 31 -> 18 min 2 sec 774 ms 32 -> 40 min 17 sec 264 ms 33 -> 30 min 56 sec 724 ms 34 -> 1 hrs 10 min 35 sec 587 ms 35 -> 1 hrs 4 min 7 sec 101 ms
Click Here for the code.
A closer look at the Series class
To Those who've been wondering - why a separate class to hold the series, why not a Vector object instead? Or, why is the Series class the way it is?
Lets take an example!
Take a two digit number and lets iterate over it:
23 -> 9 ... WAIT, we've reached the series here, so we'll stop.
Lets take another,
92 -> 63 ... Again, we've reached the series.
Notice that we don't always enter the series at the same point, it could enter from anywhere in the series.
9 -> 81 -> 63
/\ |
| \/
- 45 <- 27
The class implements the equals method, which checks if this series is equivalent to the given object. This method is based on the tenet that a n-Digit number can belong to one and only one kaprekar series. So, if it finds the first number of the given series in this series, it declares the two series equivalent.
The equals method is used internally by the contains method of the Vector class.
Have fun!
Please note that the various programs were run under different conditions - so ideally speaking we can't compare the results critically. But we can surely take note of the trend - of decreasing run tine.
Also note that these programs were written college-style - i.e. no design was made. I just took the m/c and started typing out the code!
Your comments and suggestions are definitely welcome!
On the TODO List for these programs:
- Sorting of the digits [short[] Kaprekar::sort(short[] digits)] uses a very ancient mechanism to do the sorting. Need to change the code there.
- Huge is a huge kludge right now. Wish I had designed it first.
- Add more functionality to Huge (like multiplication) - but that's a totally different area.
- Write classes to find the statistical part of Kaprekar Series (KStat)
- What is the maximum number of iterations to reach a series?
- To which series do the maximum number of numbers zero in on?
- How many numbers reach a series in, say, 5 iterations?
- For each number in a series - how many numbers enter the series at that number.
- Wrote functions to represent the series in XML (Well, I had to do it! :). Write class to load (KReader) and to check it (KCheck).
- Write class to read the XML and present it in a presentable fashion (probably html).
- Allow staggered running of the program. One should be allowed to start the program - and pause it, if need be, to resume it later.
- The ever present "Improve Performance".
- Would multiple threads help in improving performance?
- Port it to C++ and maybe do part of it in ASM.
Click Here for the Kaprekar Series for 2-10 digit numbers.
Click Here to download The Kaprekar Series Generator Version 1.0.