Monday, June 29, 2009

Random Pointer


We've had enough of logic puzzles! Here goes a data structures problem:

You have a linked list of nodes with the following fields:

struct Node {
int data;
Node * next;
Node * random;
}

The random pointer here can point to any node in the list including itself.

Now given a pointer to such a list, your task is to create a similar copy! The structure of the links should remain the same.

To clarify, all the nodes in the list are linked through the next pointer;
Mind you, You may or may not reach all the nodes if you try to access random pointers.

Link to the Solution

21 comments:

Vandana said...

This can be done in 2-pass with additional space and time complexity of O(n):

Traverse the linked list and create corresponding nodes of the new linked list and keep a map of addresses of nodes of original and new nodes. Now, traverse the new list again, look up the map and assign the "random" pointers of nodes.

Dilip said...

@Vandana: Nice solution! I had never thought of such a solution! Now try doing this without using that extra O(n) space.

Unknown said...

(1) we create an array with size=size of the original list. We will make the new list out of this array.
(2)Iterate through the original list. copy original list's data on the corresponding index of the array. Also, set the next pointers in the new array's list.
Eg. newHead->next = &newArray[index+1];
(3) And while we are on it replace the original list's data with the index of the new array it was copied onto. This will help us linking the random nodes later.
(4) Now we have an index associated with the original list. We will know where the random link is pointing to by reading the index at that location, head->random->data = index in the array where this node is.
(5) Transverse the original list again - this time to set the random pointers. Something like: newHead->random = &newHead[i->random->data]
(6) we can copy back the original data we changed in the original list by reading from the newly created one. we just had to track the random node and we used that already.
(7) Phew!!

Unknown said...

darn, that looks like an elaborated Vandana's reply :)

mythalez said...

while copying data and generating "new" list with proper next links. do this:
new->rand = old (link to the corresponding node in the older list)

then traverse thus:both lists thus:
old->next = new
old = old->next->next->rand
new = new->next

later traverse again thus:
new->rand = new->rand->rand->next
old->next = new->next->rand
new = new->next
old=old->next

done ? :P

mythalez said...

simplified version:

then traverse thus:both lists thus:
old->next = new
old = old->next->next->rand or old = new->next->rand (both r same now)
new = new->next

later traverse again thus:

new->rand = new->rand->rand->next
new->rand->rand->next = new->rand->rand (to set the old list next links to original values)
new = new->next

mythalez said...

both my solns above are wrong
both have the same problem though in different ways. while assinging the random of new, i am restoring one of the old node's next to the original value. however, many randoms may point to same node, so tht old->next value maybe needed again .. hence restoring the original values is the problem :)

Anonymous said...

Here's a 2-pass O(N) solution requiring constant additional space. Unfortunately, this requires exclusive access to the original list while building the new, because of the restorable mutations it performs.
Pass-1: Iterate over orig pointing orig.random at new and new.random at orig.random.
NOTE: This builds a 1-way associative map where: (1) each orig.random points at it's peer node in new; (2) each new.random points at it's peer' random node.
Pass-2: Iterate over orig - orig.random will now be the peer node in new and new.random will be the old orig.random. Restore orig.random and set up new.random appropriately.

Dilip said...

@Anonymous: Nice solution. Your method will create a proper copy but it's not easy(possible?) to restore the orig.random pointers. Try thinking of how to get the orig.random pointers point back to it's proper locations!

Dilip said...

@Anonymous: By the way your solution is the same as mythalez's. Just that in your case the old rand pointers store the corresponding new nodes and in his the next did.

Dilip said...

@mythalez: The only problem with your solution is that you are using the random pointer to store some useful information which could be lost when you assign the actual values to the random pointer and hence while you are reassigning the random pointers u are having to restore the information that u stored (in ur case the old->next values). Try using only those pointers that do not need to be restored while reassigning the random pointers and can be restored later by a simple traversal after all the assignment is over to store the information (correspondences) you wanted to store. :P

Anonymous said...

Dilip said: "Your method will create a proper copy but it's not easy(possible?) to restore the orig.random pointers" -- Care to elaborate? Here's a proof of correctness. Precondition (Pass 2): orig.random points at it's peer node. new.random points at the old orig.random. (NOTE: This also means that the old orig.random' random points at it's peer node.)
Postcondition (Pass 2): orig.random is restored. new.random is 'correct'.

Here's how you maintain the invariants: Restore orig.random from new.random. Make new.random point at new.random' peer (the 'correct' node). NOTE: At any iteration you only mutated orig and it's peer node; so you didn't break any invariants to follow.

Dilip said: "By the way your solution is the same as mythalez's. Just that in your case the old rand pointers store the corresponding new nodes and in his the next did." -- This statement is wrong in so many ways, I don't know where to begin to elaborate. Try proving correctness of what mythalez did like I elaborated above.

mythalez said...

@anonymous, the problem with all the approaches, mine, vandana's and urs is that many random nodes can point to the same node. so you restore the old.random while setting new.random, but old.random's modified value will be required again as other random's might be pointing to it later on.

hope u understand this? or do u want an example as well? :P

Unknown said...

who the duck is anonymous?

Vandana said...

@anonymous: This is exactly the solution i had thought to avoid the extra space, but as pointed out by dilip and mythalez, it doesnt work if many random pointers point to the same node.

@dilip: give anonymous guy that 2-nodes example.

@mythalez: the soln i've posted here is 100% correct, I never posted that soln :P

Anonymous said...

Yeah; 'random' is a little too weak - so I break my invariant when I change orig.random.
Here's a 3-pass O(N) constant space solution. I'm finding it hard to prove it optimal, though:
Pass-1: orig.random=unchanged; orig.next=new; new.next=orig.next; next.random=orig.random
Pass-2: orig.random=unchanged; orig.next=new; new.next=orig.next; new.random=correct_random
Pass-3: orig.random=unchanged; orig.next=correct_next; new.next=correct_next; new.random=correct_random(unchanged)

Dilip said...

@Anonymous: Great solution. This should work perfectly fine! By the way if uo don't mind could you introduce yourself?

Unknown said...

Well, my solution was 2 pass, O(N) space complexity. I don't believe we need a 3rd pass. Dilip, your question, please verify :)

Dilip said...

@Vivek: Your solution is perfectly correct and it's true that u can do it in 2 pass but be it two pass or three pass it remains O(N) time right. And anonymous' solution takes constant space while yours/vandana's takes O(N) space. That was the difference in your solutions!

Anonymous said...

Vivek said: "Well, my solution was 2 pass, O(N) space complexity. I don't believe we need a 3rd pass."

I haven't been able to prove it rigorously yet, but I believe you do need a third pass to do it with O(1) space. (The simple proof is to enumerate!)

Apart from the obvious space implications, the "keep a map" solutions have another big drawback -- the constant factor you add is several orders of magnitude. Consider that the best hash_map implementations out there generally cost 1K ns on average for a lookup [SGI STL' set requires 2.5K ns (red-black tree based)] - amortized, for 100K entries, with keys sizing 64 bits (even if you managed to be really smart by hinting how many buckets the datastructure must keep). Inserts cost roughly the same. Compare this to modern processors, where an L1 cache reference costs 1ns (an L2 cache reference costs 10ns) and a main memory reference costs roughly 100ns.

I'd encourage you to benchmark your solution against the 3-pass solution I proposed.

Anonymous said...

@Vivek: Incredible that I hadn't read your solution up until now. Here's a list of things to get you started:

Vivek said: "(1) we create an array with size=size ..."

(a) "create an array" -- This requires you to have contiguous memory available, proportional to the size of the SLL! That's quite an ask!
(b) "with size=size" -- For an SLL (for example SGI STL' list), this is an O(N) operation.

Vivek said: "(5) Transverse the original list again ..."
Vivek said: "(6) we can copy back the original data ..."

So 2 more passes. So, in fact, your "solution" requires 3-passes.

Vivek said: "darn, that looks like an elaborated Vandana's reply :)"

Uhm, no, it's not. Vandana's solution is much less rigid, albeit inefficient.