rahul, it may seem that way, but this is intentional. Please consider this: if the program actually knew who is who, it would not even ask for confirmation. But it does not. It can only measure the similarity between faces, and it makes mistakes, since recognition is never 100% accurate. There are also faces that are partially obscured, shot in the dark, captured at low resolution, or not even looking into the camera. That complicates things even more and greatly increases the chance of mistakes.
By working iteratively, we make sure that in each round the program learns from your previous input, so it can give you the best suggestions possible. And it looks like it is doing a great job, otherwise this question would not have come up in the first place. If we did not use this approach, the suggestions would be much less accurate, and instead of mostly just confirming, you would spend significantly more time cherry-picking the correct suggestions from the wrong ones.
I know this might all seem like an unnecessary burden, but it is a consequence of the fact that no recognition technology is perfect. And, as I said, this approach is significantly better than the alternatives in terms of productivity.
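To make the iterative idea a bit more concrete, here is a small Python sketch. It is only an illustration and not the program's actual code: the function names (`suggest`, `iterative_tagging`), the two-number "face vectors", the 0.9 threshold, and the idea of averaging confirmed faces into one reference are all assumptions made for the example. It shows how each round of confirmations can refine the reference, so that a face which was too dissimilar at first gets suggested in a later round.

```python
import math

def cosine(a, b):
    """Cosine similarity between two face vectors (hypothetical embeddings)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def centroid(vectors):
    """Average of all confirmed faces; assumed here to be the person's reference."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def suggest(confirmed, candidates, threshold=0.9):
    """Suggest candidates that are similar enough to the current reference."""
    ref = centroid(confirmed)
    return [c for c in candidates if cosine(ref, c) >= threshold]

def iterative_tagging(seed, unlabeled, confirm, threshold=0.9):
    """Repeat suggest -> confirm, learning from each round of answers."""
    confirmed = [seed]
    remaining = list(unlabeled)
    while True:
        batch = suggest(confirmed, remaining, threshold)
        if not batch:
            break  # nothing left that is similar enough
        accepted = [face for face in batch if confirm(face)]
        # Faces already shown (accepted or rejected) are not suggested again.
        remaining = [face for face in remaining if face not in batch]
        if not accepted:
            break  # the user rejected everything, so there is nothing new to learn
        confirmed.extend(accepted)
    return confirmed

seed = [1.0, 0.0]                                  # one face the user tagged by hand
unlabeled = [[0.9, 0.3], [0.8, 0.5], [0.0, 1.0]]   # the last one is a different person

one_shot = suggest([seed], unlabeled)
iterative = iterative_tagging(seed, unlabeled, confirm=lambda face: True)
print(len(one_shot), len(iterative) - 1)  # faces found in one pass vs. over several rounds
```

In this toy data the single pass finds only the closest face, while the iterative loop also picks up the second one after the first confirmation shifts the reference, and the unrelated face is never suggested. Lowering the threshold to catch everything in one pass would instead pull in more wrong faces, which is the cherry-picking situation described above.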