## Programming Problem Solution 2

February 27, 2010 7 Comments

This post is all about solving a programming problem which I had put in **here**. The problem 3 of this post is being solved here. The problem is

You r given a large string of characters lets call it Sbig. Then there is a small set of characters lets call it Ssmall. You have to find smallest substring of Sbig which contains all characters in Ssmall.

For example you are given Sbig = “hello what are you doing”

Ssmall= “eo”

answer is “ello”

The idea for the solution to this problem is that we need to find a “window”, in the big string, which contains all the characters in small string. Notice that any such window will start and end at characters which are in small string. Now once we have found such a window, we just set the left most end point of the window at the character which is just right to to and also in the big string. Now we are short of 1 character in the window but still our left and right end points are pointing to characters in small string. Now we advance the right end point until we find that missing character. When we find it, if the current length of window is smaller than previous, we record it. Then again repeat the process.

eg. Sbig : “theryq fanw rtip etug aifge”

Ssmall : “aie”

So, in the above case, **“eryq fanw rti”** is the first window which we find in the Sbig which has all the characters in Ssmall. Now we set the left end point from ‘e’ to ‘a’ (in ‘fanw’). Now ‘e’ is absent from the window, so we advance the right end point until we find it. We find it at ‘etug’ and also the length of this window is smaller than previous one. Now the solution window is **‘anw rtip e’**. Again we shift the left end point from ‘a’ to ‘i’ ( in rtip) and advance the right end point. This way we will find the solution to the problem. Notice that this algorithm runs in amortized O(n).

Now, this was just the idea behind the solution, the game isn’t over yet :P. This solution will only work when the Ssmall does not have repeated characters and the window which we find also does not have repeated characters ( of small string). Just make a small case and figure it out :). So what should be done ??? Well, the idea remains the same, but some other requirements need to be satisfied to keep the algorithms still in O(n). A lot of book-keeping has to be done to maintain the count of all the characters in small string and the window which we find and adjust them accordingly.

**The C program which solves it for all the cases.**

#include<stdio.h> #include<string.h> #include<stdlib.h> int main() { char Sbig[100000], Ssmall[1000], lookup[256]; scanf("%[^\n] ",Sbig); scanf("%[^\n]",Ssmall); int l_b = strlen(Sbig), l_s = strlen(Ssmall),i,ans_left = 0, ans_right = 0, min = 1<<30, left_anchor = 0, right_anchor = 0, count = 0; int lookup_count[256], temp_lookup_count[256], temp_right_anchor; memset((void *)lookup, 0, 256); memset((void *)lookup_count, 0, 256 * sizeof(int)); memset((void *)temp_lookup_count, 0, 256 * sizeof(int)); for (i = 0 ; i < l_s ; i++) { lookup[Ssmall[i]] = '1'; lookup_count[Ssmall[i]]++; } // Search for the left anchor to put for the first time while (lookup[Sbig[left_anchor]] != '1') left_anchor++; count = 1; temp_lookup_count[Sbig[left_anchor]] = 1; // An O(n) loop which implements the algorithm for (right_anchor = left_anchor + 1 ; right_anchor < l_b ; right_anchor++) { if ( count < l_s && lookup[Sbig[right_anchor]] == '1' ) { if (temp_lookup_count[Sbig[right_anchor]] < lookup_count[Sbig[right_anchor]]) count++; temp_lookup_count[Sbig[right_anchor]]++; } if (count == l_s) { if (right_anchor - left_anchor + 1 < min ) { ans_left = left_anchor; ans_right = right_anchor; min = right_anchor - left_anchor + 1; } temp_lookup_count[Sbig[left_anchor]]--; temp_right_anchor = right_anchor; if (temp_lookup_count[Sbig[left_anchor]] < lookup_count[Sbig[left_anchor]]) count--; else right_anchor--; // This is done to keep the right anchor fixed at its place, for loop will increment it to same value left_anchor++; while ( left_anchor < temp_right_anchor && lookup[Sbig[left_anchor]] != '1') left_anchor++; } } printf("Min length : %d : Start : %d End : %d\n",min, ans_left, ans_right); return 0; }

It is very simple but just the book-keeping has made it look tedious. I’d suggest one should think of how to handle the cases of duplicates, figure out a solution and then compare with my code.

Hope this has been helpful. 🙂

## Recent Activity