Programming Problem Solution 2

This post is all about solving a programming problem which I had put in here. The problem 3 of this post is being solved here. The problem is

You r given a large string of characters lets call it Sbig. Then there is a small set of characters lets call it Ssmall. You have to find smallest substring of Sbig which contains all characters in Ssmall.
For example you are given Sbig = “hello what are you doing”
Ssmall= “eo”
answer is “ello”

The idea for the solution to this problem is that we need to find a “window”, in the big string, which contains all the characters in small string. Notice that any such window will start and end at characters which are in small string. Now once we have found such a window, we just set the left most end point of the window at the character which is just right to to and also in the big string. Now we are short of 1 character in the window but still our left and right end points are pointing to characters in small string. Now we advance the right end point until we find that missing character. When we find it, if the current length of window is smaller than previous, we record it. Then again repeat the process.

eg. Sbig : “theryq fanw rtip etug aifge”
Ssmall : “aie”

So, in the above case, “eryq fanw rti” is the first window which we find in the Sbig which has all the characters in Ssmall. Now we set the left end point from ‘e’ to ‘a’ (in ‘fanw’). Now ‘e’ is absent from the window, so we advance the right end point until we find it. We find it at ‘etug’ and also the length of this window is smaller than previous one. Now the solution window is ‘anw rtip e’. Again we shift the left end point from ‘a’ to ‘i’ ( in rtip) and advance the right end point. This way we will find the solution to the problem. Notice that this algorithm runs in amortized O(n).

Now, this was just the idea behind the solution, the game isn’t over yet :P. This solution will only work when the Ssmall does not have repeated characters and the window which we find also does not have repeated characters ( of small string). Just make a small case and figure it out :). So what should be done ??? Well, the idea remains the same, but some other requirements need to be satisfied to keep the algorithms still in O(n). A lot of book-keeping has to be done to maintain the count of all the characters in small string and the window which we find and adjust them accordingly.
The C program which solves it for all the cases.

#include<stdio.h>
#include<string.h>
#include<stdlib.h>

int main()
{
	char Sbig[100000], Ssmall[1000], lookup[256];
	
	scanf("%[^\n] ",Sbig);
	scanf("%[^\n]",Ssmall);
	
	int l_b = strlen(Sbig), l_s = strlen(Ssmall),i,ans_left = 0, ans_right = 0, min = 1<<30, left_anchor = 0, right_anchor = 0, count = 0;
	int lookup_count[256], temp_lookup_count[256], temp_right_anchor;
	
	memset((void *)lookup, 0, 256);
	memset((void *)lookup_count, 0, 256 * sizeof(int));
	memset((void *)temp_lookup_count, 0, 256 * sizeof(int));
	
	for (i = 0 ; i < l_s ; i++) {
		lookup[Ssmall[i]] = '1';
		lookup_count[Ssmall[i]]++;
	}
	// Search for the left anchor to put for the first time
	while (lookup[Sbig[left_anchor]] != '1') left_anchor++;
	count = 1;
	temp_lookup_count[Sbig[left_anchor]] = 1;
	
	// An O(n) loop which implements the algorithm
	for (right_anchor = left_anchor + 1 ; right_anchor < l_b ; right_anchor++) {
		if ( count < l_s && lookup[Sbig[right_anchor]] == '1' ) {
			if (temp_lookup_count[Sbig[right_anchor]] < lookup_count[Sbig[right_anchor]])
				count++;
			temp_lookup_count[Sbig[right_anchor]]++;
		}
		if (count == l_s) {
			if (right_anchor - left_anchor + 1 < min ) {
				ans_left = left_anchor;
				ans_right = right_anchor;
				min = right_anchor - left_anchor + 1;
			}
			temp_lookup_count[Sbig[left_anchor]]--;
			temp_right_anchor = right_anchor;
			if (temp_lookup_count[Sbig[left_anchor]] < lookup_count[Sbig[left_anchor]])
				count--;
			else
				right_anchor--; // This is done to keep the right anchor fixed at its place, for loop will increment it to same value
			left_anchor++;
			while ( left_anchor < temp_right_anchor && lookup[Sbig[left_anchor]] != '1') left_anchor++;
		}
	}
	printf("Min length : %d   : Start : %d  End : %d\n",min, ans_left, ans_right);
	return 0;
}

It is very simple but just the book-keeping has made it look tedious. I’d suggest one should think of how to handle the cases of duplicates, figure out a solution and then compare with my code.

Hope this has been helpful. 🙂

Advertisements

About Avi Dullu
Jaako rakhe saiyan, maar sake na koi!

7 Responses to Programming Problem Solution 2

  1. Neet says:

    neat!

    One clarification, does the order of the small string matter in the final string?

    so u are looking for “aie” or any of the 6 permutation ordering of “aie”

  2. Amar says:

    Hi Avi,
    The solution seems nice, but you say that it will work if Ssmall does not contain repeated characters and the window also does not contain repeated characters. That’s a pretty narrow criteria. Can you explain how are you solving it for all possible cases?

    • Avi Dullu says:

      As I had said, the idea behind the solution remains the same, just some book-keeping has to be done to make sure that repeated characters don’t cause any issues. The code which I have provided takes care of all such conditions.

      I count all the characters and the number of times they occur in the Ssmall. Now, when I scan through Sbig, I again keep a count of every character (which was also in Ssmall) I see and also the number of times it has come in the window. Keeping both these counts and decrementing the counts when we move the left pointer on the Sbig, we can find out the smallest window.
      eg. Ssmall : aaab Sbig: adaaaabrt

      In this case, the first window is ‘adaaaab’, notice that there are 5 ‘a’s in it, whereas we need only 3. But it still is a valid window. Now when we move the left pointer from leftmost ‘a’
      onto ‘d’, we notice that still the window has 4 ‘a’s so, no need to extend the right pointer. But only this time, this window is smaller. Again, we shift the left pointer to ‘a’ (beside ‘d’), the same situation remains, so the window has again decreased. This way we find the smallest window.

      I suggest you have a glance at the code, I’ve tried to explain to my best. It’s a hell lot of trouble to explain your own codes 😛

      • Amar says:

        I completely agree that its a hell lot of trouble explaining your own codes. 🙂 .. Thanx for the effort. BTW i did figure out this. I should have given it a harder try ….

  3. Akash says:

    i doubt it if its o(n)….??

  4. Ankit says:

    gud and clear

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: