A web crawler is known as a robot or spider. Usually used by search engines to discover new or updated content on the web (Alex Xu) : web page, video, PDF file, etc.
Purposes :
Search engine indexing (SEI) : create local index for search engines.
Web archiving (WA) : collect info to preserve data for future uses.
Web mining (WMI) : data mining or useful knowledge from internet.
Web monitoring (WMO) : monitor copyright and trademark infringements over internet.
STEP 1 - Understand the problem and establish the scope