The Just Shall Live By Faith :D

I had posted something about fast encoding and decoding of integer array previously here. So here is something (maybe) more frequently used. A fast way to save space when you need to store large number of unsigned integers. I came up with two methods, one the evil but faster (exploiting the C features of C++) and the other which is more elegant, with similar performance in space but (might) slow down marginally. I’ll prefer the second, but C++ allows you to shoot yourself in the foot, so eager geeks go ahead :D

Using (non-recommended) C style type-casting:

#include <iostream>
#include <string>

using namespace std;

void EncodeNumber(unsigned int num, string* num_str) {
  unsigned char buf[4];
  unsigned char* ptr = (unsigned char*)&num;
  buf[0] = ptr[0];
  buf[1] = ptr[1];
  buf[2] = ptr[2];
  buf[3] = ptr[3];
  num_str->assign((const char*)buf, 4);
}

void DecodeNumber(const string& num_str, unsigned int* num) {
  const unsigned int* a = (const unsigned int*)num_str.c_str();
  *num = *a;
}

int main() {
  unsigned int num;
  cin >> num;
  string num_str;
  EncodeNumber(num, &num_str);
  cout << "String format of number: " << num_str << endl;
  DecodeNumber(num_str, &num);
  cout << "Decoded number: " << num << endl;
  return 0;
}

The one using (recommended) C++ style type-casting:

#include <iostream>
#include <string>

using namespace std;

void EncodeNumber(unsigned int num, string* num_str) {
  unsigned char buf[4];
  buf[3] = num >> 24; num <<= 8; num >>= 8;
  buf[2] = num >> 16; num <<= 16; num >>= 16;
  buf[1] = num >> 8; num <<= 24; num >>= 24;
  buf[0] = num;
  num_str->assign(reinterpret_cast<const char*>(buf), 4);
}

void DecodeNumber(const string& num_str, unsigned int* num) {
  const unsigned int* a
      = reinterpret_cast<const unsigned int*>(num_str.c_str());
  *num = *a;
}

int main() {
  unsigned int num;
  cin >> num;
  string num_str;
  EncodeNumber(num, &num_str);
  cout << "String format of number: " << num_str << endl;
  DecodeNumber(num_str, &num);
  cout << "Decoded number: " << num << endl;
  return 0;
}

PS1: The encoding and decoding is assured to work if both are done on machines with same endianness and will break if different for encoding and decoding.
PS2: All this was done while waiting for a computationally intensive task to complete. Unfortunately this can’t speed it up :( . Hope (at least) someone finds some use of this :)

Sayings of Chanakya

Wanted to log these hence posting here. A brilliant collection.

1) “Learn from the mistakes of others… you can’t live long enough to make them all yourselves!!”

2)”A person should not be too honest. Straight trees are cut first and Honest people are screwed first.”

3)”Even if a snake is not poisonous, it should pretend to be venomous.”

4)”There is some self-interest behind every friendship. There is no friendship without self-interests. This is a bitter truth.”

5)” Before you start some work, always ask yourself three questions – Why am I doing it, What the results might be and Will I be successful. Only when you think deeply and find satisfactory answers to these questions, go ahead.”

6)”As soon as the fear approaches near, attack and destroy it.”

7)”The world’s biggest power is the youth and beauty of a woman.”

8)”Once you start a working on something, don’t be afraid of failure and don’t abandon it. People who work sincerely are the happiest.”

9)”The fragrance of flowers spreads only in the direction of the wind. But the goodness of a person spreads in all direction.”

10)”God is not present in idols. Your feelings are your god. The soul is your temple.”

11) “A man is great by deeds, not by birth.”

12) “Never make friends with people who are above or below you in status. Such friendships will never give you any happiness.”

13) “Treat your kid like a darling for the first five years. For the next five years, scold them. By the time they turn sixteen, treat them like a friend. Your grown up children are your best friends.”

14) “Books are as useful to a stupid person as a mirror is useful to a blind person.”

15) “Education is the best friend. An educated person is respected everywhere. Education beats the beauty and the youth.”

A night out after long :)

So found the following question here (Antonio is awesum :) ) and worked out a O(n) time solution BUT with O(n) space. I claim the space usage can be brought down to O(1). Anyone game enough to post the O(1) space version?

Problem:

Given an array A of numbers, find a pair of numbers A[i] and A[j], such that A[i] < A[j] and j-i is maximized.

O(n) time O(n) space code

#include <iostream>
#include <string>
#include <utility>
#include <vector>

using namespace std;

int main() {
  vector<int> numbers;
  int n;
  cin >> n;
  numbers.resize(n);
  for (int i = 0, t; i < n; ++i) {
    cin >> t;
    numbers[i] = t;
  }
  vector<pair<int, int> > decr;
  decr.push_back(make_pair(numbers[0], 0));
  for (int i = 1; i < numbers.size(); ++i) {
    if (numbers[i] < decr.back().first) {
      decr.push_back(make_pair(numbers[i], i));
    }
  }
  vector<pair<int, int> > incr;
  incr.push_back(make_pair(numbers[n-1], n-1));
  for (int i = n - 2; i >= 0; --i) {
    if (numbers[i] > incr.back().first) {
      incr.push_back(make_pair(numbers[i], i));
    }
  }
  reverse(incr.begin(), incr.end());
  int iter = min(incr.size(), decr.size());
  int max_diff = -1, s_i = -1, s_j = -1;
  int i_i = 0, d_i = 0;
  for (; i_i < incr.size() && d_i < decr.size();) {
    if (incr[i_i].first > decr[d_i].first) {
      int tmp_diff = incr[i_i].second - decr[d_i].second;
      if (tmp_diff > max_diff) {
        max_diff = tmp_diff;
        s_i = decr[d_i].second;
        s_j = incr[i_i].second;
      }
      i_i++;
    } else if (incr[i_i].first < decr[d_i].first) {
      d_i++;
    } else {
      i_i++;
      d_i++;
    }
  }
  cout << "i: " << s_i << "  j: " << s_j << endl;
  return 0;
}

जाग तुझको दूर जाना!

An awesome composition by a Hindi poet, महादेवी वर्मा

चिर सजग आँखें उनींदी आज कैसा व्यस्त बाना!
जाग तुझको दूर जाना!

अचल हिमगिरि के हॄदय में आज चाहे कम्प हो ले!
या प्रलय के आँसुओं में मौन अलसित व्योम रो ले;
आज पी आलोक को ड़ोले तिमिर की घोर छाया
जाग या विद्युत शिखाओं में निठुर तूफान बोले!
पर तुझे है नाश पथ पर चिन्ह अपने छोड़ आना!
जाग तुझको दूर जाना!

बाँध लेंगे क्या तुझे यह मोम के बंधन सजीले?
पंथ की बाधा बनेंगे तितलियों के पर रंगीले?
विश्व का क्रंदन भुला देगी मधुप की मधुर गुनगुन,
क्या डुबो देंगे तुझे यह फूल दे दल ओस गीले?
तू न अपनी छाँह को अपने लिये कारा बनाना!
जाग तुझको दूर जाना!

वज्र का उर एक छोटे अश्रु कण में धो गलाया,
दे किसे जीवन-सुधा दो घँट मदिरा माँग लाया!
सो गई आँधी मलय की बात का उपधान ले क्या?
विश्व का अभिशाप क्या अब नींद बनकर पास आया?
अमरता सुत चाहता क्यों मृत्यु को उर में बसाना?
जाग तुझको दूर जाना!

कह न ठंढी साँस में अब भूल वह जलती कहानी,
आग हो उर में तभी दृग में सजेगा आज पानी;
हार भी तेरी बनेगी माननी जय की पताका,
राख क्षणिक पतंग की है अमर दीपक की निशानी!
है तुझे अंगार-शय्या पर मृदुल कलियां बिछाना!
जाग तुझको दूर जाना!

More compositions of महादेवी वर्मा can be found here

Finally I did it !!

A friend of mine referred me to a problem from my own blog (i.e. here :D ) and I must say I enjoyed every aspect of the problem. From designing its awesum O(nlogn) solution to coding it with handling all the minor details. It’s about to be 6 A.M. now and I can’t sleep without posting the code :D

Problem Statement: You are given 2 arrays A, B sorted in decreasing order of size m and n respectively and a number k, such that 1 <= k <= m * n. You need to fine the kth largest sum(a+b) possible where a in A, and b in B.

The only caveat is that I did it for kth smallest, I find ascending order to be more natural :)

#include<iostream>
#include<vector>

using namespace std;

#define MID(min, max) ((min) + (((max) - (min)) / 2))

int GetSmallerPairsCount(const vector<int>& A,
                         const vector<int>& B,
                         const int sum) {
  int num_smaller_pairs = 0;
  int temp = 0;
  // O(n) loop
  for (int A_index = A.size() - 1, B_index = 0; A_index >= 0; --A_index) {
    for (; B_index < B.size(); ++B_index) {
      if (A[A_index] + B[B_index] <= sum) {
        ++temp;
      } else {
        break;
      }
    }
    num_smaller_pairs += temp;
  }
  return num_smaller_pairs;
}

bool CheckIfSumExists(const vector<int>& A, const vector<int>& B, int sum) {
  // O(n) loop
  for (int a = A.size() - 1, b = 0; a >= 0 && b < B.size();) {
    if (A[a] + B[b] == sum) return true;
    if (A[a] + B[b] < sum) {
      ++b;
    } else {
      --a;
    }
  }
  return false;
}

int FindNextLargestSum(const vector<int>& A,
                       const vector<int>& B,
                       const int sum) {
  int new_sum, n_l = A.back() + B.back();
  // O(n) loop
  for (int A_index = A.size() - 1, B_index = 0;
      A_index >= 0 && B_index < B.size();) {
    new_sum = A[A_index] + B[B_index];
    if ((n_l > new_sum) && (new_sum > sum)) {
      n_l = new_sum;
    }
    if (new_sum <= sum) {
      ++B_index;
    } else {
      --A_index;
    }
  }
  return n_l;
}

int FindPreviousSmallestSum(const vector<int>& A,
                            const vector<int>& B,
                            const int sum) {
  int new_sum, p_s = A.front() + B.front();
  // O(n) loop
  for (int A_index = A.size() - 1, B_index = 0;
      A_index >= 0 && B_index < B.size();) {
    new_sum = A[A_index] + B[B_index];
    if ((p_s < new_sum) && (new_sum < sum)) {
      p_s = new_sum;
    }
    if (new_sum < sum) {
      ++B_index;
    } else {
      --A_index;
    }
  }
  return p_s;
}

// O(n*log(n)) order with O(1) extra space
int KthSmallestPairSum(const vector<int>& A,
                       const vector<int>& B,
                       const int k) {
  int max = A.back() + B.back();
  int min = A.front() + B.front();
  // O(log(n)) loop
  while (1) {
    int curr_mid = CheckIfSumExists(A, B, MID(min, max)) ? MID(min, max) : FindNextLargestSum(A, B, MID(min, max));
    int prev_mid = FindPreviousSmallestSum(A, B, curr_mid);
    int next_mid = FindNextLargestSum(A, B, curr_mid);

    int curr_mid_smaller = GetSmallerPairsCount(A, B, curr_mid);
    int prev_mid_smaller = GetSmallerPairsCount(A, B, prev_mid);
    int next_mid_smaller = GetSmallerPairsCount(A, B, next_mid);

    if (k == prev_mid_smaller) return prev_mid;
    if (prev_mid_smaller < k && next_mid_smaller >= k) {
      if (curr_mid_smaller < k) return next_mid;
      return curr_mid;
    }

    if (prev_mid_smaller > k) {
      max = FindPreviousSmallestSum(A, B, prev_mid);
    } else if (next_mid_smaller < k) {
      min = FindNextLargestSum(A, B, next_mid);
    } else {
      cout << "BUG\n";
      exit(0);
    }
  }
  return -2;
}

int main() {
  int size;
  cout << "Length of first array: " ;
  cin >> size;
  cout << "Enter the elements of first array\n";
  vector<int> A;
  for (int t = -1; size > 0; --size) {
    cin >> t;
    A.push_back(t);
  }
  cout << "Size of second array: ";
  cin >> size;
  vector<int> B;
  for (int t = -1; size > 0; --size) {
    cin >> t;
    B.push_back(t);
  }
  sort(A.begin(), A.end());
  sort(B.begin(), B.end());

  // Build the brute-force solution beforehand to compare
  // the results of our optimized method
  vector<int> brute;
  for (int i = 0; i < A.size(); ++i) {
    for (int j = 0; j < B.size(); ++j) {
      brute.push_back(A[i] + B[j]);
    }
  }
  sort(brute.begin(), brute.end());

  for (int kk = 1; kk <= A.size() * B.size(); ++kk) {
    int k = kk;
    if (k < 1 || k > A.size() * B.size()) {
      cout << "Invalid k. 1 <= k <= " << A.size() * B.size();
    } else {
      int result = KthSmallestPairSum(A, B, k);
      if (result != brute[k-1]) {
        cout << "Method broke. Wake up Avi !!";
        exit(0);
      }
      cout << k << "th smallest pair sum: " << result;
    }
    cout << endl;
  }
  cout << "All Iz Bhel !! " << endl;
  return 0;
}

Zero sum sub-array

Question: There is an array consisting of postive and negative integers. find the subarray whose sum is 0.

Solution:

#include<ext/hash_map>
#include<iostream>
#include<vector>

using namespace __gnu_cxx;
using namespace std;

int main() {
  int n;
  cin >> n;
  int prefix_sum = 0;
  hash_map<int, int> find_subarray;
  int start_index = -1, end_index = -1;
  for (int i = 0, t = -1; i < n; ++i) {
    cin >> t;
    if (!t) {
      start_index = i;
      end_index = i;
      break;
    }
    prefix_sum += t;
    if (!prefix_sum) {
      start_index = 0;
      end_index = i;
      break;
    }
    if (find_subarray.find(prefix_sum) == find_subarray.end()) {
      find_subarray[prefix_sum] = i;
    } else {
      start_index = find_subarray[prefix_sum] + 1;
      end_index = i;
      break;
    }
  }
  if (start_index != -1) {
    cout << "Zero sum subarray found. Start index : " << start_index
         << ". End Index: " << end_index << "\n";
  }
  return 0;
}

Truth beind Engineering

When you’re working with TBs and PBs of data you get some serious realizations. Beautifully put forth in the penultimate slide of this presentation.

All engineering solutions are transient.

Nothing’s perfect but some solutions are good enough for a while.

Scalability solutions aren’t magic. They involve partitioning, indexing and replication.

All data for real-time queries MUST be in memory.

Disk is for writes only.

Some problems can be solved by pre-computation, but a lot can’t.

Exploit locality where possible.

Source

Interview Problems 25

1. Given a set of KEY->VALUE pairs such that each KEY is unique, describe a method of storing these pairs on disk, and a method for accessing the corresponding VALUE given a KEY. Assume that RAM is fixed at 1gb and the set of pairs requires 40gb.

2. Consider there is an array with duplicates and you are given two numbers as input and you have to return the minimum distance between the two in the array with minimum complexity.

3. You are given an array of integers, say array1 of size n. You have to create another array (say array2) and fill its contents by following this rule: array2[i] will hold the value of the left-most integer from the range array1[i+1] to array1[n-1] such that it is greater than array1[i]. If no such element exists, assign -1.

4. An external system has an array. You are given 3 inbuilt functions which perform their operation in O(1) time.
a) get(i): returns ith element of array
b) length(): returns length of array
c) reverse(i, j): reverses elements of subarray starting from i and ending with j (both inclusive)

Write function SortArray() using these three functions.

5. Given a binary tree, find 2 leaf nodes say X and Y such that F(X,Y) is maximum where F(X,Y) = sum of nodes in the path from root to X + sum of nodes in the path from root to Y – sum of nodes in the common path from root to first common ancestor of the Nodes X and Y.

6. Given an array, A, find if all elements in the sorted version of A are consecutive in less than O(nlogn).
eg: A: 5 4 1 2 3 on sorting 1 2 3 4 5 all elements are consecutive — true
A: 1 9 2 22 on sorting 1 2 9 22 all elements are NOT consecutive — false

7. Given a string (assume there no spaces or punctuations), write a code that returns the max. length of the string that has repeated more than once.

Interview Problems 24

1. Given a linked list structure where every node represents a linked list and contains two pointers of its type:
(i) pointer to next node in the main list.
(ii) pointer to a linked list where this node is head.
Write a C function to flatten the list into a single linked list. (from here)

2. Hundred of millions of irregular shape objects are moving in random direction. Provide data structure and algo to detect collision. (from here)

3. You are given an array of integer A[1..n] and a maximum sliding window of size w. you need to output an array B where B[i] is the maximum in A[i, i+w-1]. (from here)

4. [Design] Find duplicate profiles in Facebook.

5. You have an array of 0s and 1s and you want to output all the intervals (i, j) where the number of 0s and numbers of 1s are equal. (from here)

6. You have a stream of random numbers that are inserted into an expanding array. How can you maintain the current median and what is the complexity.

7. Given n arrays, find n number such that sum of their differences is minimum. For e.g. if there are three arrays

A = {4, 10, 15, 20}
B = {1, 13, 29}
C = {5, 14, 28}

find three numbers a, b, c such that |a-b| + |b-c| + |c-a| is minimum. Here the answer is a = 15, b = 13, and c = 14

8. [Coding] Write a C/C++ program to create a bitmap of any size as determined by user. Say user says 64k bitmap, then create 64k long bitmap. Have set and unset methods.

9. Reorder 1st string based on 2nd string.
eg: (tractor,car)
output: carrtto or carrott or carrtot.

10. There is a straight roads with ‘n’ number of milestones. You are given an array with the distance between all the pairs of milestones in some random order. Find the position of milestones.
Example:
Consider a road with 4 milestones(a,b,c,d) :
a bcd
Distance between a and b = 3
Distance between a and c = 8
Distance between a and d = 10
Distance between b and c = 5
Distance between b and d = 7
Distance between c and d = 2
All the above values are given in a random order say 7, 10, 5, 2, 8, 3.
The output must be 3,5,2 or 2,5,3

Zeros and Ones .. One Pass

Have heard this question a lot and wanted to code it. I initially always used the version 1 (described below) of this code to form solutions, but I figured an optimization and the version 2 is actually a one-pass algorithm.

Version 1: ( debatable one pass)

#include <iostream>
#include <vector>

using namespace std;

int main() {
  int n;
  cin >> n;
  // containing 0s and 1s
  vector<int> binaries;
  for (int i = 0, t = -1; i < n; ++i) {
    cin >> t;
    binaries.push_back(t);
  }

  vector<int>::iterator back_iter = binaries.begin();
  vector<int>::iterator front_iter = binaries.begin();

  for (; front_iter != binaries.end(); ++front_iter) {
    if (*front_iter == 0) {
      // find the localtion of the back pointer which is pointing to a 1
      while (back_iter != front_iter && *back_iter == 0) {
        ++back_iter;
      }
      if (back_iter != front_iter) {
        // found a place to replace
        iter_swap(back_iter, front_iter);
      }
    }
  }

  for (int i = 0; i < binaries.size(); ++i) {
    cout << binaries[i] << " ";
  }
  cout << "\n";
  return 0;
}

Version 2: Definitely one – pass :)

#include <iostream>
#include <vector>

using namespace std;

int main() {
  int n;
  cin >> n;
  // containing 0s and 1s
  vector<int> binaries;
  for (int i = 0, t = -1; i < n; ++i) {
    cin >> t;
    binaries.push_back(t);
  }

  vector<int>::iterator front_iter = binaries.begin();
  vector<int>::iterator back_iter = binaries.end() - 1;

  for (; front_iter != binaries.end() && front_iter != back_iter; ) {
    if (*front_iter) {
      iter_swap(front_iter, back_iter);
      --back_iter;
    } else {
      ++front_iter;
    }
  }

  for (int i = 0; i < binaries.size(); ++i) {
    cout << binaries[i] << " ";
  }
  cout << "\n";
  return 0;
}

Finally I coded this. :D

Follow

Get every new post delivered to your Inbox.

Join 95 other followers