Given two strings a and b, find whether any anagram of string a is a sub-string of string b

I know this is a fairly (old and) simple question. But what people do not realize is that it has a catch! I saw this question here and browsed through most of the solutions. Almost all of them (except the crazy ones where folks wanted to do brutal force search) would fail on a very simple case. And this question, is all about just that case

#include<iostream>
#include<string>
#include<vector>
using namespace std;
int main() {
  string src, tgt;
  std::getline(cin, src);
  std::getline(cin, tgt);
  vector<int> buf(256, 0);
  const int tot = src.size();
  for (int i = 0; i < src.size(); ++i) {
    buf[src[i]]++;
  }
  vector<int> tmp(256, 0);
  int start = -1, tmp_tot = 0, end = 0;
  for (int i = 0; i < tgt.size(); ++i) {
    const unsigned char ch = tgt.at(i);
    if (buf[ch] > 0) {
      if (tmp[ch] == buf[ch]) {
        if (tmp_tot > 256) {
          tmp.clear();
          tmp.resize(256, 0);
        } else {
          while (tmp_tot > 0) tmp[tgt[i - tmp_tot--]]--;
        }
        start = i;
        tmp[ch]++;
        tmp_tot = 1;
      } else {
        if (start == -1) start = i;
        tmp[ch]++;
        ++tmp_tot;
        if (tmp_tot == tot) end = i;
      }
    }
  }
  if (end > 0) {
    cout << start << "    " << end << std::endl;
  } else {
    cout << "Not found\n";
  }
  return 0;
}
Advertisements

Cocktail Sort!

Cocktail Sort is something I came to know of recently. Its much better alternative to Bubble Sort and performs much better.

Most folks should be aware that, sorting algorithms have a fancy property. For relatively small input (<20 maybe) elements the naive algorithms, like Bubble, Insertion and now Cocktail can outperform QuickSort and MergeSort etc. And if you don’t know why so, try to figure out 🙂 So how much can we save by using them in QuickSort ?

So I wrote a small program which takes a bunch of randomly generated numbers and records the run time of sorting them with different algorithms viz. std::sort of C++, qsort of C and my own written Cocktail Sort (I did test my code extensively and it seems to work 😀 ). It does seem that Cocktail outperforms Quick and std::sort for <25 elements. Repeatedly running the experiment does confirm this. But I guess the complexity of incorporating this in qsort or std::sort is more painful than the gain 🙂

Plus this is the first C++11 program of my blog! From now on, I will try to use as much as C++11 as I can but not excessive 😀

// You need C++11 for this to compile!
// You should be having it, so compile this with 'g++ -std=c++11 filename.cc -O3'
// ./a.out N1 N2
// where N1 < N2 and the test will run for array sizes of N1, N1 + 1, ... N2
// and produce result for all the iterations.

#include <algorithm>
#include <chrono>
#include <iostream>
#include <random>
#include <vector>

#include <cassert>
#include <cstdlib>
#include <climits>

void CocktailSort(std::vector<int>& num, int start, int end) {
  const int i_end = end - 1;
  bool swapped = false;
  do {
    for (int i = start; i <= i_end; ++i) {
      if (num[i] > num[i + 1]) {
        std::swap(num[i], num[i + 1]);
        swapped = true;
      }
    }
    if (!swapped) break;
    swapped = false;
    for (int i = end - 1; i >= start; --i) {
      if (num[i] > num[i + 1]) {
        std::swap(num[i], num[i + 1]);
        swapped = true;
      }
    }
  } while (swapped);
}

// This sort works by following a simple invariant. Every time the inner
// for loop completes, we place the min and max elements of the iterated
// subarray at the two ends. We need to do this exactly n / 2 times,
// since with each iteration we place 2 elements correctly.
// I saw the animation on wiki page of Cocktail sort and without reading
// the pseudo code thought of this algorithm and 'assumed' this should be
// Cocktail sort. Well, it is not. But for the ~10k runs of comparison, this
// out performs Cocktail sort, so will use this in QSort.
// Above Cocktail is for demo!
void MyBadSort(std::vector<int>& num, const int start, const int end) {
  const int nelem = end - start + 1;
  if (nelem == 1) return;
  if (nelem == 2) {
    if (num[start] > num[end]) std::swap(num[start], num[end]);
    return;
  }
  int found = start;
  const int n_iter = nelem / 2;
  for (int ix = 0; ix < n_iter; ++ix) {
    const int loc = end - ix;
    int min = num[found];
    int max = num[loc];
    if (min > max) {
      std::swap(min, max);
      std::swap(num[found], num[loc]);
    }
    const int upto = end - ix;
    for (int i = found + 1; i < upto; ++i) {
      if (num[i] < min) {
        min = num[i];
        std::swap(num[found], num[i]);
      }
      if (num[i] > max) {
        max = num[i];
        std::swap(num[loc], num[i]);
      }
    }
    ++found;
  }
}

void MyQSortNaive(std::vector<int>& num, const int start, const int end) {
  const int nelem = end - start + 1;
  if (nelem == 1) return;
  if (nelem == 2) {
    if (num[start] > num[end]) std::swap(num[start], num[end]);
    return;
  }
  const int pivot = num[start];
  int pos = start;
  for (int i = start; i <= end; ++i) {
    if (num[i] < pivot) {
      std::swap(num[pos], num[i]);
      ++pos;
    }
  }
  if (pos > start) MyQSortNaive(num, start, pos - 1);
  for (int i = pos; i <= end; ++i) {
    if (num[i] == pivot) {
      std::swap(num[pos], num[i]);
      ++pos;
    }
  }
  if (pos < end) MyQSortNaive(num, pos, end);
}

void MyQSort(std::vector<int>& num, const int start, const int end) {
  const int nelem = end - start + 1;
  if (nelem == 1) return;
  if (nelem == 2) {
    if (num[start] > num[end]) std::swap(num[start], num[end]);
    return;
  }
  if (nelem == 3) {
    if (num[start] > num[start + 1]) std::swap(num[start], num[start + 1]);
    if (num[start + 1] > num[end]) std::swap(num[start + 1], num[end]);
    if (num[start] > num[start + 1]) std::swap(num[start], num[start + 1]);
    return;
  }
  if (nelem <= 10) {
    MyBadSort(num, start, end);
    return;
  }
  const int pivot = num[start];
  int pos = start;
  for (int i = start; i <= end; ++i) {
    if (num[i] < pivot) {
      std::swap(num[pos], num[i]);
      ++pos;
    }
  }
  if (pos > start) MyQSort(num, start, pos - 1);
  for (int i = pos; i <= end; ++i) {
    if (num[i] == pivot) {
      std::swap(num[pos], num[i]);
      ++pos;
    }
  }
  if (pos < end) MyQSort(num, pos, end);
}

int main(int argc, char** argv) {
  assert(argc == 3);
  const int MOD_VALUE = 1000;  // INT_MAX
  unsigned long long int std_qsort = 0, my_qsort_naive = 0, my_qsort = 0;
  for (int run = std::stoi(argv[1]); run <= std::stoi(argv[2]); ++run) {
    unsigned seed1 = std::chrono::system_clock::now()
        .time_since_epoch().count();
    std::mt19937 rnd(seed1 + run * 1000);
    int N = run;
    std::cout << "\n\nRunning comparison with " << run
              << " elements to sort\n";
    std::vector<int> num;
    num.reserve(N);

    // C++ std::sort
    num.clear();
    for (int i = 0; i < N; ++i) num.push_back(rnd() % MOD_VALUE);
    std::shuffle(num.begin(), num.end(), rnd);
    auto start = std::chrono::high_resolution_clock::now();
    std::sort(num.begin(), num.end());
    auto end = std::chrono::high_resolution_clock::now();
    auto elapsed = std::chrono::duration_cast
      <std::chrono::microseconds>(end - start);
    assert(std::is_sorted(num.begin(), num.end()));
    std_qsort += elapsed.count();
    // std::cout << "Time spent in std::sort: " << elapsed.count() << "\n";

    // My QSort implementation without the Cocktail support
    num.clear();
    for (int i = 0; i < N; ++i) num.push_back(rnd() % MOD_VALUE);
    std::shuffle(num.begin(), num.end(), rnd);
    start = std::chrono::high_resolution_clock::now();
    // qsort(&num[0], num.size(), sizeof(num[0]), cmp);
    MyQSortNaive(num, 0, num.size() - 1);
    end = std::chrono::high_resolution_clock::now();
    elapsed = std::chrono::duration_cast
      <std::chrono::microseconds>(end - start);
    assert(std::is_sorted(num.begin(), num.end()));
    // std::cout << "Time spent in Naive Qsort: " << elapsed.count() << "\n";
    my_qsort_naive += elapsed.count();

    // My QSort with Cocktail support
    num.clear();
    for (int i = 0; i < N; ++i) num.push_back(rnd() % MOD_VALUE);
    std::shuffle(num.begin(), num.end(), rnd);
    start = std::chrono::high_resolution_clock::now();
    MyQSort(num, 0, num.size() - 1);
    end = std::chrono::high_resolution_clock::now();
    elapsed = std::chrono::duration_cast
      <std::chrono::microseconds>(end - start);
    assert(std::is_sorted(num.begin(), num.end()));
    // std::cout << "Time spent in MyQsort with sorter: " << elapsed.count() << "\n";
    my_qsort += elapsed.count();
  }
  std::cout << "\n\n";
  std::cout << "Total time taken for std::sort: " << std_qsort << "\n";
  std::cout << "Total time taken for naive myqsort: " << my_qsort_naive << "\n";
  std::cout << "Total time taken for myqsort: " << my_qsort << "\n";
  return 0;
}

You should try to run this for small N eg. ./a.out 10 12 or ./a.out 12 15 etc.

Finally I did it !!

A friend of mine referred me to a problem from my own blog (i.e. here :D) and I must say I enjoyed every aspect of the problem. From designing its awesum O(nlogn) solution to coding it with handling all the minor details. It’s about to be 6 A.M. now and I can’t sleep without posting the code 😀

Problem Statement: You are given 2 arrays A, B sorted in decreasing order of size m and n respectively and a number k, such that 1 <= k <= m * n. You need to fine the kth largest sum(a+b) possible where a in A, and b in B.

The only caveat is that I did it for kth smallest, I find ascending order to be more natural 🙂

#include<iostream>
#include<vector>

using namespace std;

#define MID(min, max) ((min) + (((max) - (min)) / 2))

int GetSmallerPairsCount(const vector<int>& A,
                         const vector<int>& B,
                         const int sum) {
  int num_smaller_pairs = 0;
  int temp = 0;
  // O(n) loop
  for (int A_index = A.size() - 1, B_index = 0; A_index >= 0; --A_index) {
    for (; B_index < B.size(); ++B_index) {
      if (A[A_index] + B[B_index] <= sum) {
        ++temp;
      } else {
        break;
      }
    }
    num_smaller_pairs += temp;
  }
  return num_smaller_pairs;
}

bool CheckIfSumExists(const vector<int>& A, const vector<int>& B, int sum) {
  // O(n) loop
  for (int a = A.size() - 1, b = 0; a >= 0 && b < B.size();) {
    if (A[a] + B[b] == sum) return true;
    if (A[a] + B[b] < sum) {
      ++b;
    } else {
      --a;
    }
  }
  return false;
}

int FindNextLargestSum(const vector<int>& A,
                       const vector<int>& B,
                       const int sum) {
  int new_sum, n_l = A.back() + B.back();
  // O(n) loop
  for (int A_index = A.size() - 1, B_index = 0;
      A_index >= 0 && B_index < B.size();) {
    new_sum = A[A_index] + B[B_index];
    if ((n_l > new_sum) && (new_sum > sum)) {
      n_l = new_sum;
    }
    if (new_sum <= sum) {
      ++B_index;
    } else {
      --A_index;
    }
  }
  return n_l;
}

int FindPreviousSmallestSum(const vector<int>& A,
                            const vector<int>& B,
                            const int sum) {
  int new_sum, p_s = A.front() + B.front();
  // O(n) loop
  for (int A_index = A.size() - 1, B_index = 0;
      A_index >= 0 && B_index < B.size();) {
    new_sum = A[A_index] + B[B_index];
    if ((p_s < new_sum) && (new_sum < sum)) {
      p_s = new_sum;
    }
    if (new_sum < sum) {
      ++B_index;
    } else {
      --A_index;
    }
  }
  return p_s;
}

// O(n*log(n)) order with O(1) extra space
int KthSmallestPairSum(const vector<int>& A,
                       const vector<int>& B,
                       const int k) {
  int max = A.back() + B.back();
  int min = A.front() + B.front();
  // O(log(n)) loop
  while (1) {
    int curr_mid = CheckIfSumExists(A, B, MID(min, max)) ? MID(min, max) : FindNextLargestSum(A, B, MID(min, max));
    int prev_mid = FindPreviousSmallestSum(A, B, curr_mid);
    int next_mid = FindNextLargestSum(A, B, curr_mid);

    int curr_mid_smaller = GetSmallerPairsCount(A, B, curr_mid);
    int prev_mid_smaller = GetSmallerPairsCount(A, B, prev_mid);
    int next_mid_smaller = GetSmallerPairsCount(A, B, next_mid);

    if (k == prev_mid_smaller) return prev_mid;
    if (prev_mid_smaller < k && next_mid_smaller >= k) {
      if (curr_mid_smaller < k) return next_mid;
      return curr_mid;
    }

    if (prev_mid_smaller > k) {
      max = FindPreviousSmallestSum(A, B, prev_mid);
    } else if (next_mid_smaller < k) {
      min = FindNextLargestSum(A, B, next_mid);
    } else {
      cout << "BUG\n";
      exit(0);
    }
  }
  return -2;
}

int main() {
  int size;
  cout << "Length of first array: " ;
  cin >> size;
  cout << "Enter the elements of first array\n";
  vector<int> A;
  for (int t = -1; size > 0; --size) {
    cin >> t;
    A.push_back(t);
  }
  cout << "Size of second array: ";
  cin >> size;
  vector<int> B;
  for (int t = -1; size > 0; --size) {
    cin >> t;
    B.push_back(t);
  }
  sort(A.begin(), A.end());
  sort(B.begin(), B.end());

  // Build the brute-force solution beforehand to compare
  // the results of our optimized method
  vector<int> brute;
  for (int i = 0; i < A.size(); ++i) {
    for (int j = 0; j < B.size(); ++j) {
      brute.push_back(A[i] + B[j]);
    }
  }
  sort(brute.begin(), brute.end());

  for (int kk = 1; kk <= A.size() * B.size(); ++kk) {
    int k = kk;
    if (k < 1 || k > A.size() * B.size()) {
      cout << "Invalid k. 1 <= k <= " << A.size() * B.size();
    } else {
      int result = KthSmallestPairSum(A, B, k);
      if (result != brute[k-1]) {
        cout << "Method broke. Wake up Avi !!";
        exit(0);
      }
      cout << k << "th smallest pair sum: " << result;
    }
    cout << endl;
  }
  cout << "All Iz Bhel !! " << endl;
  return 0;
}

Zero sum sub-array

Question: There is an array consisting of postive and negative integers. find the subarray whose sum is 0.

Solution:

#include<ext/hash_map>
#include<iostream>
#include<vector>

using namespace __gnu_cxx;
using namespace std;

int main() {
  int n;
  cin >> n;
  int prefix_sum = 0;
  hash_map<int, int> find_subarray;
  int start_index = -1, end_index = -1;
  for (int i = 0, t = -1; i < n; ++i) {
    cin >> t;
    if (!t) {
      start_index = i;
      end_index = i;
      break;
    }
    prefix_sum += t;
    if (!prefix_sum) {
      start_index = 0;
      end_index = i;
      break;
    }
    if (find_subarray.find(prefix_sum) == find_subarray.end()) {
      find_subarray[prefix_sum] = i;
    } else {
      start_index = find_subarray[prefix_sum] + 1;
      end_index = i;
      break;
    }
  }
  if (start_index != -1) {
    cout << "Zero sum subarray found. Start index : " << start_index
         << ". End Index: " << end_index << "\n";
  }
  return 0;
}

Interview Problems 25

1. Given a set of KEY->VALUE pairs such that each KEY is unique, describe a method of storing these pairs on disk, and a method for accessing the corresponding VALUE given a KEY. Assume that RAM is fixed at 1gb and the set of pairs requires 40gb.

2. Consider there is an array with duplicates and you are given two numbers as input and you have to return the minimum distance between the two in the array with minimum complexity.

3. You are given an array of integers, say array1 of size n. You have to create another array (say array2) and fill its contents by following this rule: array2[i] will hold the value of the left-most integer from the range array1[i+1] to array1[n-1] such that it is greater than array1[i]. If no such element exists, assign -1.

4. An external system has an array. You are given 3 inbuilt functions which perform their operation in O(1) time.
a) get(i): returns ith element of array
b) length(): returns length of array
c) reverse(i, j): reverses elements of subarray starting from i and ending with j (both inclusive)

Write function SortArray() using these three functions.

5. Given a binary tree, find 2 leaf nodes say X and Y such that F(X,Y) is maximum where F(X,Y) = sum of nodes in the path from root to X + sum of nodes in the path from root to Y – sum of nodes in the common path from root to first common ancestor of the Nodes X and Y.

6. Given an array, A, find if all elements in the sorted version of A are consecutive in less than O(nlogn).
eg: A: 5 4 1 2 3 on sorting 1 2 3 4 5 all elements are consecutive — true
A: 1 9 2 22 on sorting 1 2 9 22 all elements are NOT consecutive — false

7. Given a string (assume there no spaces or punctuations), write a code that returns the max. length of the string that has repeated more than once.

Interview Problems 24

1. Given a linked list structure where every node represents a linked list and contains two pointers of its type:
(i) pointer to next node in the main list.
(ii) pointer to a linked list where this node is head.
Write a C function to flatten the list into a single linked list. (from here)

2. Hundred of millions of irregular shape objects are moving in random direction. Provide data structure and algo to detect collision. (from here)

3. You are given an array of integer A[1..n] and a maximum sliding window of size w. you need to output an array B where B[i] is the maximum in A[i, i+w-1]. (from here)

4. [Design] Find duplicate profiles in Facebook.

5. You have an array of 0s and 1s and you want to output all the intervals (i, j) where the number of 0s and numbers of 1s are equal. (from here)

6. You have a stream of random numbers that are inserted into an expanding array. How can you maintain the current median and what is the complexity.

7. Given n arrays, find n number such that sum of their differences is minimum. For e.g. if there are three arrays

A = {4, 10, 15, 20}
B = {1, 13, 29}
C = {5, 14, 28}

find three numbers a, b, c such that |a-b| + |b-c| + |c-a| is minimum. Here the answer is a = 15, b = 13, and c = 14

8. [Coding] Write a C/C++ program to create a bitmap of any size as determined by user. Say user says 64k bitmap, then create 64k long bitmap. Have set and unset methods.

9. Reorder 1st string based on 2nd string.
eg: (tractor,car)
output: carrtto or carrott or carrtot.

10. There is a straight roads with ‘n’ number of milestones. You are given an array with the distance between all the pairs of milestones in some random order. Find the position of milestones.
Example:
Consider a road with 4 milestones(a,b,c,d) :
a bcd
Distance between a and b = 3
Distance between a and c = 8
Distance between a and d = 10
Distance between b and c = 5
Distance between b and d = 7
Distance between c and d = 2
All the above values are given in a random order say 7, 10, 5, 2, 8, 3.
The output must be 3,5,2 or 2,5,3

Zeros and Ones .. One Pass

Have heard this question a lot and wanted to code it. I initially always used the version 1 (described below) of this code to form solutions, but I figured an optimization and the version 2 is actually a one-pass algorithm.

Version 1: ( debatable one pass)

#include <iostream>
#include <vector>

using namespace std;

int main() {
  int n;
  cin >> n;
  // containing 0s and 1s
  vector<int> binaries;
  for (int i = 0, t = -1; i < n; ++i) {
    cin >> t;
    binaries.push_back(t);
  }

  vector<int>::iterator back_iter = binaries.begin();
  vector<int>::iterator front_iter = binaries.begin();

  for (; front_iter != binaries.end(); ++front_iter) {
    if (*front_iter == 0) {
      // find the localtion of the back pointer which is pointing to a 1
      while (back_iter != front_iter && *back_iter == 0) {
        ++back_iter;
      }
      if (back_iter != front_iter) {
        // found a place to replace
        iter_swap(back_iter, front_iter);
      }
    }
  }

  for (int i = 0; i < binaries.size(); ++i) {
    cout << binaries[i] << " ";
  }
  cout << "\n";
  return 0;
}

Version 2: Definitely one – pass 🙂

#include <iostream>
#include <vector>

using namespace std;

int main() {
  int n;
  cin >> n;
  // containing 0s and 1s
  vector<int> binaries;
  for (int i = 0, t = -1; i < n; ++i) {
    cin >> t;
    binaries.push_back(t);
  }

  vector<int>::iterator front_iter = binaries.begin();
  vector<int>::iterator back_iter = binaries.end() - 1;

  for (; front_iter != binaries.end() && front_iter != back_iter; ) {
    if (*front_iter) {
      iter_swap(front_iter, back_iter);
      --back_iter;
    } else {
      ++front_iter;
    }
  }

  for (int i = 0; i < binaries.size(); ++i) {
    cout << binaries[i] << " ";
  }
  cout << "\n";
  return 0;
}

Finally I coded this. 😀

%d bloggers like this: