I encountered LeetCode problem 1307. Verbal Arithmetic Puzzle in Weekly Contest 169, and it was fun to figure out why my first submission got a TLE.
Question
Given an equation represented by words on the left side and result on the right side, you need to check if the equation is solvable under the following rules:
- Each character is decoded as one digit (0 - 9).
- Every pair of different characters must map to different digits.
- Each words[i] and result are decoded as one number without leading zeros.
- The sum of the numbers on the left side (words) must equal the number on the right side (result).
Return true if the equation is solvable, otherwise return false.
Example 1:
Input: words = ["SEND","MORE"], result = "MONEY"
Output: true
Explanation: Map 'S'->9, 'E'->5, 'N'->6, 'D'->7, 'M'->1, 'O'->0, 'R'->8, 'Y'->2
Such that: "SEND" + "MORE" = "MONEY" , 9567 + 1085 = 10652
Example 2:
Input: words = ["SIX","SEVEN","SEVEN"], result = "TWENTY"
Output: true
Explanation: Map 'S'->6, 'I'->5, 'X'->0, 'E'->8, 'V'->7, 'N'->2, 'T'->1, 'W'->3, 'Y'->4
Such that: "SIX" + "SEVEN" + "SEVEN" = "TWENTY" , 650 + 68782 + 68782 = 138214
Example 3:
Input: words = ["THIS","IS","TOO"], result = "FUNNY"
Output: true
Example 4:
Input: words = ["LEET","CODE"], result = "POINT"
Output: false
Constraints:
- 2 <= words.length <= 5
- 1 <= words[i].length, result.length <= 7
- words[i] and result contain only uppercase English letters.
- The number of different characters used in the expression is at most 10.
My First Submission (TLE)
class Solution {
public:
    set<char> c_set;          // all distinct characters
    vector<char> c;
    set<char> no_zeros;       // leading characters, which cannot be 0
    map<char, int> l_factors; // sum of powers of 10 each char contributes on the left
    map<char, int> r_factors; // same, for the result
    set<int> used;            // digits already assigned

    bool isSolvable(vector<string>& words, string result) {
        for (int i = 0; i < words.size(); i++) {
            int factor = 1;
            for (int j = 0; j < words[i].size(); j++) {
                char tmp = words[i][words[i].size() - j - 1];
                if (j == words[i].size() - 1) {
                    no_zeros.insert(tmp);
                }
                c_set.insert(tmp);
                if (l_factors.find(tmp) == l_factors.end()) {
                    l_factors[tmp] = factor;
                } else {
                    l_factors[tmp] += factor;
                }
                factor *= 10;
            }
        }
        for (int i = 0, factor = 1; i < result.size(); i++) {
            char tmp = result[result.size() - i - 1];
            if (i == result.size() - 1) {
                no_zeros.insert(tmp);
            }
            c_set.insert(tmp);
            if (r_factors.find(tmp) == r_factors.end()) {
                r_factors[tmp] = factor;
            } else {
                r_factors[tmp] += factor;
            }
            factor *= 10;
        }
        for (auto i : c_set) {
            c.push_back(i);
        }
        return dfs(0, 0, 0);
    }

    // Try every unused digit for character c[r]; left/right accumulate
    // the two sides of the equation as digits are assigned.
    bool dfs(int r, int left, int right) {
        if (r == c.size()) {
            return left == right;
        }
        char tmp = c[r];
        int l_factor = 0;
        if (l_factors.find(tmp) != l_factors.end()) {
            l_factor = l_factors[tmp];
        }
        int r_factor = 0;
        if (r_factors.find(tmp) != r_factors.end()) {
            r_factor = r_factors[tmp];
        }
        for (int i = 0; i < 10; i++) {
            if (i == 0 && no_zeros.find(tmp) != no_zeros.end()) {
                continue;
            }
            if (used.find(i) != used.end()) {
                continue;
            }
            used.insert(i);
            if (dfs(r + 1, left + i * l_factor, right + i * r_factor)) {
                return true;
            }
            used.erase(i);
        }
        return false;
    }
};
Set vs Array
The main problem with the above solution is the set<int> used. We insert into and erase from the set O(n * 10!/(10-n)!) times, where n is the number of unique characters in the input strings, so the cost of each insert and erase matters. Interestingly, changing the set to an array is enough to pass the test cases. The complexity difference between set and array here should only be a constant factor, since the size of the set is bounded (10 in this question). That constant must be large enough for LeetCode's test cases to reliably TLE my solution. So what's the constant here?
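For concreteness, here is a sketch of the array-based variant that passes: the search is identical, and only the used bookkeeping changes. (ArraySolution is just my name for it here; LeetCode expects the class to be called Solution.)

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>
using namespace std;

// Same search as above; only the digit bookkeeping changes:
// set<int> used becomes bool used[10], so the find/insert/erase
// in the DFS loop each become a single array access.
class ArraySolution {
public:
    set<char> c_set;
    vector<char> c;
    set<char> no_zeros;
    map<char, int> l_factors;
    map<char, int> r_factors;
    bool used[10] = {false};  // was: set<int> used

    bool isSolvable(vector<string>& words, string result) {
        for (size_t i = 0; i < words.size(); i++) {
            int factor = 1;
            for (size_t j = 0; j < words[i].size(); j++) {
                char tmp = words[i][words[i].size() - j - 1];
                if (j == words[i].size() - 1) no_zeros.insert(tmp);
                c_set.insert(tmp);
                l_factors[tmp] += factor;  // operator[] default-initializes to 0
                factor *= 10;
            }
        }
        int factor = 1;
        for (size_t i = 0; i < result.size(); i++) {
            char tmp = result[result.size() - i - 1];
            if (i == result.size() - 1) no_zeros.insert(tmp);
            c_set.insert(tmp);
            r_factors[tmp] += factor;
            factor *= 10;
        }
        c.assign(c_set.begin(), c_set.end());
        return dfs(0, 0, 0);
    }

    bool dfs(size_t r, int left, int right) {
        if (r == c.size()) return left == right;
        char tmp = c[r];
        int l_factor = l_factors.count(tmp) ? l_factors[tmp] : 0;
        int r_factor = r_factors.count(tmp) ? r_factors[tmp] : 0;
        for (int i = 0; i < 10; i++) {
            if (i == 0 && no_zeros.count(tmp)) continue;
            if (used[i]) continue;   // was: used.find(i) != used.end()
            used[i] = true;          // was: used.insert(i)
            if (dfs(r + 1, left + i * l_factor, right + i * r_factor))
                return true;
            used[i] = false;         // was: used.erase(i)
        }
        return false;
    }
};
```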
Below is a poor man’s benchmark to figure out the constant:
#include <chrono>
#include <cstdlib>
#include <cstring>  // for memset
#include <ctime>
#include <iostream>
#include <set>
#include <string>
using namespace std;

#define SIZE 10
#define PRE_FILL (SIZE / 2)
#define OP 1000000

set<int> test_set;
bool test_array[SIZE];

void set_find() {
    int count = 0;
    for (int i = 0; i < OP; i++) {
        int j = std::rand() % SIZE;
        if (test_set.find(j) != test_set.end()) {
            count += 1;
        }
    }
}

void array_find() {
    int count = 0;
    for (int i = 0; i < OP; i++) {
        int j = std::rand() % SIZE;
        if (test_array[j]) {
            count += 1;
        }
    }
}

void set_insert_and_delete() {
    for (int i = 0; i < OP; i++) {
        int j = std::rand() % SIZE;
        test_set.insert(j);
        test_set.erase(j);
    }
}

void array_insert_and_delete() {
    for (int i = 0; i < OP; i++) {
        int j = std::rand() % SIZE;
        test_array[j] = true;
        test_array[j] = false;
    }
}

void measure(string name, void (*func)()) {
    auto start = std::chrono::high_resolution_clock::now();
    func();
    auto finish = std::chrono::high_resolution_clock::now();
    std::chrono::duration<double> elapsed = finish - start;
    std::cout << name << ": " << elapsed.count() << " s\n";
}

int main(int argc, char *argv[]) {
    std::srand(std::time(nullptr)); // use current time as seed for random generator
    memset(test_array, 0, sizeof test_array);
    for (int i = 0; i < PRE_FILL; i++) {
        int j = std::rand() % SIZE;
        test_set.insert(j);
        test_array[j] = true;
    }
    measure("set_find", set_find);
    measure("array_find", array_find);
    measure("set_insert_and_delete", set_insert_and_delete);
    measure("array_insert_and_delete", array_insert_and_delete);
}
Output (I was running on a MacBook Pro (Retina, 13-inch, Late 2012) with a 2.5 GHz Intel Core i5):
set_find: 0.161238 s
array_find: 0.0205961 s
set_insert_and_delete: 0.515822 s
array_insert_and_delete: 0.0111742 s
The constant factor is around 50 (in this specific run, 0.515822 / 0.0111742 ≈ 46), which is larger than I expected. I had assumed it would be close to log(10), since the underlying red-black tree operations are O(log N), but it turns out to be much more expensive than that. So it's pretty expensive to use a set in this case.
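As an aside (not part of my submission): since only ten flags are needed, they even fit into a single int used as a bitmask, making each test/set/clear a single bit operation. A minimal sketch:

```cpp
// Sketch: pack the ten "digit already used" flags into one int.
// Bit d set means digit d is taken.
struct DigitMask {
    int mask = 0;
    bool is_used(int d) const { return (mask >> d) & 1; }  // was: used.find(d)
    void take(int d)          { mask |= 1 << d; }          // was: used.insert(d)
    void release(int d)       { mask &= ~(1 << d); }       // was: used.erase(d)
};
```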
Lesson Learned
Rules of thumb matter. Had I been aware of the difference, I would certainly have reached for an array first.
Here are some related articles I found while searching on the topic:
- Verbal Arithmetic and Mastermind (2016)
- C++ Set Performance by Tino Didriksen
- Why you shouldn’t use set (and what you should use instead) by Matt Austern
Speaking of rules of thumb, here are the latency numbers Jeff Dean thinks every programmer should know:
Latency Comparison Numbers (~2012)
----------------------------------
L1 cache reference 0.5 ns
Branch mispredict 5 ns
L2 cache reference 7 ns 14x L1 cache
Mutex lock/unlock 25 ns
Main memory reference 100 ns 20x L2 cache, 200x L1 cache
Compress 1K bytes with Zippy 3,000 ns 3 us
Send 1K bytes over 1 Gbps network 10,000 ns 10 us
Read 4K randomly from SSD* 150,000 ns 150 us ~1GB/sec SSD
Read 1 MB sequentially from memory 250,000 ns 250 us
Round trip within same datacenter 500,000 ns 500 us
Read 1 MB sequentially from SSD* 1,000,000 ns 1,000 us 1 ms ~1GB/sec SSD, 4X memory
Disk seek 10,000,000 ns 10,000 us 10 ms 20x datacenter roundtrip
Read 1 MB sequentially from disk 20,000,000 ns 20,000 us 20 ms 80x memory, 20X SSD
Send packet CA->Netherlands->CA 150,000,000 ns 150,000 us 150 ms
Notes
-----
1 ns = 10^-9 seconds
1 us = 10^-6 seconds = 1,000 ns (it's called one microsecond.)
1 ms = 10^-3 seconds = 1,000 us = 1,000,000 ns (it's called one millisecond.)
Credit
------
By Jeff Dean: http://research.google.com/people/jeff/
Originally by Peter Norvig: http://norvig.com/21-days.html#answers
Contributions
-------------
'Humanized' comparison: https://gist.github.com/hellerbarde/2843375
Visual comparison chart: http://i.imgur.com/k0t1e.png