Tuesday, October 4, 2011

Longest subsequence (contd.)

Unfortunately, the solution I've suggested fails if the sequence is say, 10, 13, 12, 1, 2, 3, 4, 5. The solution is 5, but the above algorithm returns 2.

This means that a single stack is unlikely to solve the problem. We need a list of stacks, each starting off with the first number that cannot be added to an existing stack. Here's what the code looks like:

List<Stack<int>> stacks = new List<Stack<int>>();

foreach (int incoming in sequence)
{
bool makeNew = true;
foreach (Stack stack in stacks)
{
if (stack.top() < incoming)
{
stack.push(incoming);
makeNew = false;
}
}

if (makeNew == true)
{
Stack<int> s = new Stack<int>();
s.Push(incoming);
stacks.Add(s);
}
}



And at the end, the solution is the stack with the maximum Count.

This *almost* works. It fails to get the correct solution in a very interesting way. Consider the sequence, 10, 20, 30, 12, 13, 14, 15. The solution is 5 (10, 12, 13, 14, 15), but the algorithm will return the solution as 4 (12, 13, 14, 15).

Here's how it works. The algorithm creates a new stack for element 10. It then pushes 20, and 30 onto this stack. The next element 12 cannot be put into an existing stack, so it creates a new one and pushes to the new stack. The next elements 12, 13, 14, 15 are added to the top of the new stack giving a solution of 4.

So, we always need a stack handy with just the element 10 in it for the algorithm to succeed.


List<Stack<int>> stacks = new List<Stack<int>>();

foreach (int incoming in sequence)
{
bool makeNew = true;
foreach (Stack stack in stacks)
{
if (stack.top() < incoming)
{
Stack<int> dupStack = new Stack<int>();
//Copy the elements of stack to dupStack.
stacks.Add(dupStack);
stack.Push(incoming);
makeNew = false;
}
}

if (makeNew == true)
{
Stack<int> s = new Stack<int>();
s.Push(incoming);
stacks.Add(s);
}
}


This ensures that even when we add an element to an existing stack, the unchanged stack is also present in the list of stacks. This algorithm gives the correct solution under all circumstances.

Correctness accomplished!

Now, for optimization...

Here's an interesting observation: We are running the algorithm above, and we have a list of stacks. Consider 2 stacks in the list. They both have the same elements (Count is the same). However, the top element in stack B is larger than the top element in stack A. Given a new incoming element which is greater than both the top elements of A and B, either stack can work. In other words, since we need only one stack with that height, the stack with the smaller top value should be preferred. We can safely delete the stack with the larger top value. The end result will be unaffected.

So, after adding the element to the stacks, we can run though all stacks and compare them to each other. If the Count of 2 stacks is the same, delete the one with the larger top value.

List<Stack<int>> toDelete = new List<Stack<int>>();

foreach (Stack<int> a in stacks)
{
foreach (Stack<int> b in stacks)
{
if (a == b) //No need to compare the same objects.
continue;

if (a.Count == b.Count && a.Top() < b.Top())
toDelete.Add(b);
}
}

foreach (Stack s in toDelete)
{
stacks.Remove(s);
}



And finally, one more thing. The solution does not ask for the list in the largest sub-sequence. It asks simply for the count. Given that, we don't even need a family of stacks. We can simply do with a KeyValuePair where the first int remembers the height of the "stack" and the second contains the Top value of the stack.

The solution becomes space efficient. Instead of keeping a bunch of stacks in memory, it remembers a pair of numbers per stack. So, if the largest sub - is 100, the maximum number of elements that are stored in memory is 100 * 2 integers.

Since a pair of integers given that the key will always be unique can be stored in a single Dictionary, it prob. makes sense to make a single dictionary and work the algorithm around that.

I will write up that code on request.

No comments:

Post a Comment