InterviewBit Solutions : Find Duplicate in Array

Wednesday, September 14, 2016

Find Duplicate in Array

Given a read only array of n + 1 integers between 1 and n, find one number that repeats in linear time using less than O(n) space and traversing the stream sequentially O(1) times.

Sample Input:

[3 4 1 4 1]

Sample Output:

If there are multiple possible answers ( like in the sample case above ), output any one.

If there is no duplicate, output -1

Solution:

Had it not been a read-only array, we could have solved it in-place using the algorithm here.
http://www.geeksforgeeks.org/find-duplicates-in-on-time-and-constant-extra-space/

Since it is a read-only array, we cannot make changes to the array itself.
We can treat this question as finding the starting point of a loop in a linked.

Explanation:

Let us take an array A = [3, 4, 1, 4, 2]

Treat the elements as individual nodes. If there is a repetition , we will get a loop.

The above array can be converted to a linked list as shown:

We get 2 linked lists in this case. Since we need to output any one duplicate number, we can do so by finding the start of the loop in linked list. We do not actually need to create linked list for it. The slow and fast pointer can be simulated in the array itself. The solution contains 2 parts as is the case when we need to find the start node of a loop in a linked list.

Phase1: Take 2 pointers slow and fast. Fast moves with double the speed of slow. If there is a cycle the two pointers will meet somewhere.

Phase2: Keep the fast pointer where it is. Move the slow pointer to the start of the linked list. Now advance both the pointers at the same speed. The point where they meet is the start node of a loop.

In our case, that is our duplicate node.

Note: This does not take care of the case when there is no duplicate element. However, interviewbit seems to have no test cases for this scenario and the code passes.

Please see the code below:

Please comment if you can explain it better.

17 comments:

KartikSeptember 20, 2017 at 8:35 AM
can u explain how does this return -1 if no loop is found?
ReplyDelete
Replies
UnknownSeptember 26, 2017 at 1:32 AM
For this you can initially check whether the sum is (n)*(n+1)/2 or not. If the sum is equal to this return -1.
ReplyDelete
Replies
UnknownJanuary 17, 2018 at 4:53 AM
For the given question, a test case such that there is no duplicate is not possible. Because we have an array size of n+1 and we only have n numbers. So, at least one number has to repeat.
ReplyDelete
Replies
UnknownJune 17, 2018 at 5:32 AM
can't understand the logic behind the implementation of the slow and fast pointers, can you please explain it a bit more in detail?
ReplyDelete
Replies
UnknownJuly 17, 2018 at 9:56 PM
Why do we need to find a start node of loop?
ReplyDelete
Replies
anandvir singhMarch 31, 2019 at 3:50 AM
No need to complicate everything.. a simple solution in Cpp:
int Solution::repeatedNumber(const vector &A) {
vector v(A.size());
fill(v.begin(), v.end(), true);

for(int i=0;i<A.size();i++){
if(v[A[i]]){
v[A[i]]=false;
}

else
return A[i];
}
}
ReplyDelete
Replies
UnknownMay 29, 2020 at 11:04 AM
Found a really good explaination after working very hard
https://www.youtube.com/watch?v=MrHbSiQGvLk&t=14s
its really unique way to solve this problem by buketing
ReplyDelete
Replies
Robin SoniJuly 17, 2020 at 10:35 PM
If we are making the linked list then still I can see one problem here that is the space complexity will still not be less than O(n). can you please explain ?
ReplyDelete
Replies

Add comment