1. For k-means clustering, this link was very helpful