url: Leetcode: Problem no. 483
Given an integer n
represented as a string, return the smallest good base of n
.
We call k >= 2
a good base of n
, if all digits of n
base k
are 1
's.
n
is an integer in the range $[3, 10^{18}]$.n
does not contain any leading zeros.I'd like to believe there is a formula for this.
Every digit should be 1 so
$$
n = \sum_{p=0}^t k^p
$$
I need to find out what k is. I don't know that t
gives the smallest k
for a given n
. I do know that for t=1
$n = k^0 + k^1$
$k = n-1$
so t = 1 gives the largest (and is some cases like example 3 above the only) base k
for which all the digits are 1.
I also know that as I keep increasing t, then I will keep finding a k
smaller than the previous one (if it exists).
But where do I stop? or better yet where do I begin high and keep decreasing t
until I find a valid k
since $k \ge 2$, we can find the largest t where k is 2.
$$
\begin{align*}
n &= \sum_{p=0}^t 2^p \\
n &= 2^0 + 2^1 + 2^2 + ... + 2^t \\
n &= \frac{1-2^{t+1}}{1-2} \\
n &= 2^{t+1} - 1 \\
t+1 &= \log_2(n+1) \\
t &= \log_2(\frac{n+1}{2})
\end{align*}
$$
so we can start at t
equal to the above value, and k=2
we can compute $n\prime$ as
$$
n\prime = \frac{k^{t+1} -1}{k-1}
$$
Now if this $n\prime$ is equal to the given n then we have found that the good base k
. If not, then there are two cases. If $n\prime \le n$, then we should increment k
otherwise decrement t
. Now the solution seems innocent enough.
Remember that in the worst case we have to go from k=2
go k = n-1
and thats not all. We are iterating over t
which starts at the log of n+1
and stops at t=2
. For simplification, lets say that we check for all values of k
in range [2, n-1], for all values of t
in range $[log_2{(n+1)/2}, 2]$. Then the worst case time complexity is $\mathcal{O}(n log(n))$.
Which doesn't seem much. The sorting algorithms are $\mathcal{O}(n log(n))$. But look at the constraints. The n can be as high as $10^{18}$. Now my laptop has a cpu with speed of 2GHz. (forget about how many processors ans such for now). That means that it can run a maximum of $2*10^9$ clock cycles per second(also forgetting about boost and overclock because I also don't know). Lets be very very very optimistic here and suppose that every operation in that algorithm takes a single clock cycle[I may not know hardware stuff very much but I did have microprocessor as a subject in one of semester. So trust me when I say this would have been the biggest lie I've ever told if it wasn't for the next line.]. Lets also suppose that the line of code translates 1:1 to operations then the code below (in python).
def good_base(n: float):
t = math.floor(math.log2((n+1) / 2))
k = 2
while True:
n_prime = (math.pow(k, t+1) - 1) / (k-1)
if n_prime == n:
break
elif n_prime < n:
k+=1
else:
t-=1
return k
Then the loop of 8 lines takes about $\frac{8n}{2*10^{9}}$ seconds to run. For numbers less than $10^9$, it takes less than a second. Why bother with the numbers in between, lets go straight to the last one. It takes about $4000000000$ seconds. That is, it takes $126.84$ years for this hugely oversimplified and optimistic imaginary computational model to find out that the required k is $10^{18} -1$.
So what to do. what to do.
aha. Good old binary search to the rescue
why do i have to search the the numbers for k 2 through n-1. what i can do is
check for every t from the maximum one given by $\log_2(\frac{n+1}{2})$ down to 1
if k = 2 results in $n\prime = n$. if it does we have found the smallest good base if not we have to go for the middle one. so what is the middle one
$$
k_{mid} = \frac{k_0(2)+?}{2}
$$
we need to figure out the maximum k. Now could just use n-1
. if we are doing binary search then we only check the log(n) numbers.
$log_2(10^{18}) = 59.79$
and we have to check that max t to 1 and max t is also given by log base 2 of n. So the number of elements we are checking for the maximum n defined for this problem is nearly $60*60=3600$. compare that with $60 * 10 ^{18}$ we were doing earlier.
Lo and behold
import math
def good_base2(n):
if n < 2:
return 1
t = 0
temp = (n + 1) // 2
while temp > 1:
temp //= 2
t += 1
found = False
while not found and t >= 0:
k = 2
k_m = n
prev_k = 1
while k < n:
numerator = pow(k, t + 1) - 1
denominator = k - 1
n_prime = numerator // denominator
if n_prime == n:
found = True
break
elif n_prime < n:
prev_k = k
k = (k + k_m) // 2
else:
k_m = k
k = (prev_k + k_m) // 2
if prev_k == k:
t -= 1
break
return k
an imaginary and hopefully delusional time of 126 years to
but we can do better. atleast theoretically. It does involve computing square root.
What am I taking about?
well you see how we computed the maximum t given a n.
we can also do the opposite. we can compute the maximum k
for a given t
such that any value of k more than this will make n prime
greater than n
.
we have
$$
n = k^0 + k ^1 + k^2 + .... + k^t
$$
lets us ignore all the expression of k
before $k^t$ so it becomes
$$
\begin{aligned}
n = k^t \
k = \sqrt[t]{n}
\end{aligned}
$$
for any k greater than this would make the calculated n prime greater than n.
as a bonus we can just check with the final value as $\sqrt[t]{n}$ and if the calculated n is smaller than given n then we just go to the next t. What if its bigger than the given then we just subtract one from the calculated k
for this t ($\sqrt[t]{n}$).
def good_base3(n):
if n < 2:
return 1
t = 0
temp = (n + 1) // 2
while temp > 1:
temp //= 2
t += 1
k = math.floor(n ** (1/t))
while t > 0:
numerator = pow(k, t + 1) - 1
denominator = k - 1
n_prime = numerator // denominator
if (n_prime == n):
break
if (n_prime > n):
k -= 1
else:
t-=1
k = math.floor(n ** (1/t))
return k
And I wanted to calculate how many iterations each one took.
The good_base2
took a total of 3511 iterations while the good_base3
took 62 iterations. This make me not sure about the correctness of the last one. I mean I could just try it on leetcode and see. Also I have been avoiding the String thing in the questions. I'm assuming I can just parse the string as u128 in rust.
okay second one is definetely wrong.
well it wasn't wrong the floats caused the error. if t ever becomes 1, the float power function to calculate k cannot
store the value exactly. (this was the problem in both rust and python). rust has f128 which didn't cause this issue but thats not a stable feature so i could not use that.
anyway if it t becomes 1 then we know the answer is n-1 so i run the loop until t is greater than 1.
when the loop exit, I know if it exited by finding the value or if t reaching one. I used bool found for that but as im writing this I realize I could have just checked t is 1 or not.
also python solution also caused the floating point approximation error. so does that mean python stores floating point as 64 bits too.
and why didn't this happen for the number $10^{18}$ but for a complete random number. I think this could all be explained if I made some proof and explaination of the problem but then again I'm really not in the mood. so
I don't really know what to write here. I mean usually I write how the solution works but for this one I feel like I've already explained everything in the [[#Journey]] section above. So have a look at the solution in rust.
impl Solution {
pub fn smallest_good_base(n: String) -> String {
let n = n.parse::<u128>().unwrap();
let mut temp = (n + 1) / 2;
let mut t = 0;
let mut found = false;
// approximate log base 2
while temp > 1 {
temp /= 2;
t += 1;
}
let mut k = ((n as f64).powf(1.0 / t as f64)).floor() as u128;
while t > 1 {
let n_prime = (k.pow(t + 1) - 1) / (k - 1);
if n_prime == n {
found = true;
break;
} else if n_prime > n {
k -= 1;
} else {
t -= 1;
k = ((n as f64).powf(1.0 / t as f64)).floor() as u128;
}
}
if found {
k.to_string()
} else {
(n-1).to_string()
}
}
}
algorithm
Maybe explore my archives. Find me on the Mathstodon. Send me your blog and I’ll read it.