# Programming Problem Solution 2

February 27, 2010 7 Comments

This post is all about solving a programming problem which I had put in **here**. The problem 3 of this post is being solved here. The problem is

You r given a large string of characters lets call it Sbig. Then there is a small set of characters lets call it Ssmall. You have to find smallest substring of Sbig which contains all characters in Ssmall.

For example you are given Sbig = “hello what are you doing”

Ssmall= “eo”

answer is “ello”

The idea for the solution to this problem is that we need to find a “window”, in the big string, which contains all the characters in small string. Notice that any such window will start and end at characters which are in small string. Now once we have found such a window, we just set the left most end point of the window at the character which is just right to to and also in the big string. Now we are short of 1 character in the window but still our left and right end points are pointing to characters in small string. Now we advance the right end point until we find that missing character. When we find it, if the current length of window is smaller than previous, we record it. Then again repeat the process.

eg. Sbig : “theryq fanw rtip etug aifge”

Ssmall : “aie”

So, in the above case, **“eryq fanw rti”** is the first window which we find in the Sbig which has all the characters in Ssmall. Now we set the left end point from ‘e’ to ‘a’ (in ‘fanw’). Now ‘e’ is absent from the window, so we advance the right end point until we find it. We find it at ‘etug’ and also the length of this window is smaller than previous one. Now the solution window is **‘anw rtip e’**. Again we shift the left end point from ‘a’ to ‘i’ ( in rtip) and advance the right end point. This way we will find the solution to the problem. Notice that this algorithm runs in amortized O(n).

Now, this was just the idea behind the solution, the game isn’t over yet :P. This solution will only work when the Ssmall does not have repeated characters and the window which we find also does not have repeated characters ( of small string). Just make a small case and figure it out :). So what should be done ??? Well, the idea remains the same, but some other requirements need to be satisfied to keep the algorithms still in O(n). A lot of book-keeping has to be done to maintain the count of all the characters in small string and the window which we find and adjust them accordingly.

**The C program which solves it for all the cases.**

#include<stdio.h> #include<string.h> #include<stdlib.h> int main() { char Sbig[100000], Ssmall[1000], lookup[256]; scanf("%[^\n] ",Sbig); scanf("%[^\n]",Ssmall); int l_b = strlen(Sbig), l_s = strlen(Ssmall),i,ans_left = 0, ans_right = 0, min = 1<<30, left_anchor = 0, right_anchor = 0, count = 0; int lookup_count[256], temp_lookup_count[256], temp_right_anchor; memset((void *)lookup, 0, 256); memset((void *)lookup_count, 0, 256 * sizeof(int)); memset((void *)temp_lookup_count, 0, 256 * sizeof(int)); for (i = 0 ; i < l_s ; i++) { lookup[Ssmall[i]] = '1'; lookup_count[Ssmall[i]]++; } // Search for the left anchor to put for the first time while (lookup[Sbig[left_anchor]] != '1') left_anchor++; count = 1; temp_lookup_count[Sbig[left_anchor]] = 1; // An O(n) loop which implements the algorithm for (right_anchor = left_anchor + 1 ; right_anchor < l_b ; right_anchor++) { if ( count < l_s && lookup[Sbig[right_anchor]] == '1' ) { if (temp_lookup_count[Sbig[right_anchor]] < lookup_count[Sbig[right_anchor]]) count++; temp_lookup_count[Sbig[right_anchor]]++; } if (count == l_s) { if (right_anchor - left_anchor + 1 < min ) { ans_left = left_anchor; ans_right = right_anchor; min = right_anchor - left_anchor + 1; } temp_lookup_count[Sbig[left_anchor]]--; temp_right_anchor = right_anchor; if (temp_lookup_count[Sbig[left_anchor]] < lookup_count[Sbig[left_anchor]]) count--; else right_anchor--; // This is done to keep the right anchor fixed at its place, for loop will increment it to same value left_anchor++; while ( left_anchor < temp_right_anchor && lookup[Sbig[left_anchor]] != '1') left_anchor++; } } printf("Min length : %d : Start : %d End : %d\n",min, ans_left, ans_right); return 0; }

It is very simple but just the book-keeping has made it look tedious. I’d suggest one should think of how to handle the cases of duplicates, figure out a solution and then compare with my code.

Hope this has been helpful. 🙂

neat!

One clarification, does the order of the small string matter in the final string?

so u are looking for “aie” or any of the 6 permutation ordering of “aie”

the order of characters doesn’t matter, we are only looking for existence of all the characters of the small string in the large one.

Hi Avi,

The solution seems nice, but you say that it will work if Ssmall does not contain repeated characters and the window also does not contain repeated characters. That’s a pretty narrow criteria. Can you explain how are you solving it for all possible cases?

As I had said, the idea behind the solution remains the same, just some book-keeping has to be done to make sure that repeated characters don’t cause any issues. The code which I have provided takes care of all such conditions.

I count all the characters and the number of times they occur in the Ssmall. Now, when I scan through Sbig, I again keep a count of every character (which was also in Ssmall) I see and also the number of times it has come in the window. Keeping both these counts and decrementing the counts when we move the left pointer on the Sbig, we can find out the smallest window.

eg. Ssmall : aaab Sbig: adaaaabrt

In this case, the first window is ‘adaaaab’, notice that there are 5 ‘a’s in it, whereas we need only 3. But it still is a valid window. Now when we move the left pointer from leftmost ‘a’

onto ‘d’, we notice that still the window has 4 ‘a’s so, no need to extend the right pointer. But only this time, this window is smaller. Again, we shift the left pointer to ‘a’ (beside ‘d’), the same situation remains, so the window has again decreased. This way we find the smallest window.

I suggest you have a glance at the code, I’ve tried to explain to my best. It’s a hell lot of trouble to explain your own codes 😛

I completely agree that its a hell lot of trouble explaining your own codes. 🙂 .. Thanx for the effort. BTW i did figure out this. I should have given it a harder try ….

i doubt it if its o(n)….??

gud and clear