This paper introduces the development of a Web data sampling tool based on Java. The work covers three main topics: implementing the spider (finding and collecting information from Web pages requires a high-performance "Web spider" that searches the Internet on its own), parsing HTML (all information on the Web is built on HTML, so the first problem for a Web robot is how to parse HTML while crawling pages), and improving program performance (Java multithreading is used so that the spider can work efficiently on an Internet that contains an enormous number of Web pages). The program, developed in Eclipse, applies the core spider technique to crawl URLs and then download an entire Web site. I meet the above technical requirements by designing and using a variety of Java classes. In essence, the program is a Web spider. Its main advantage over other download tools is that it can fill in forms automatically (for example, to register automatically), using cookies to handle the session. It also has flexible download rules (based on, for example, the URL, the size of the Web page, or the MIME type) to restrict what is downloaded. Testing shows that it works well.
Keywords: data sampling, Java class, Web spider, Java multithreading
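
The download rules mentioned in the abstract (limits by URL, page size, and MIME type) can be illustrated with a minimal sketch. This is not the program's actual code: the class name DownloadRule, its fields, and the method allows are hypothetical, and the choice to match URLs by host name and to let pages of unknown size through is an assumption made only for this example.

```java
import java.net.URI;
import java.util.Locale;
import java.util.Set;

/** Hypothetical rule that decides whether the spider may download a resource. */
public class DownloadRule {
    private final String allowedDomain;          // e.g. "example.com" (assumed form of the URL rule)
    private final long maxBytes;                 // upper bound on the reported page size
    private final Set<String> allowedMimeTypes;  // e.g. "text/html"

    public DownloadRule(String allowedDomain, long maxBytes, Set<String> allowedMimeTypes) {
        this.allowedDomain = allowedDomain.toLowerCase(Locale.ROOT);
        this.maxBytes = maxBytes;
        this.allowedMimeTypes = allowedMimeTypes;
    }

    /**
     * Returns true only if the URL, size, and MIME type all pass.
     * A negative sizeBytes (size unknown) is accepted; that is an assumption of this sketch.
     */
    public boolean allows(String url, long sizeBytes, String contentType) {
        String host;
        try {
            host = URI.create(url).getHost();
        } catch (IllegalArgumentException e) {
            return false; // malformed URL
        }
        if (host == null) {
            return false;
        }
        host = host.toLowerCase(Locale.ROOT);
        // Accept the domain itself or any of its subdomains, but not look-alikes.
        boolean hostOk = host.equals(allowedDomain) || host.endsWith("." + allowedDomain);
        if (!hostOk || sizeBytes > maxBytes || contentType == null) {
            return false;
        }
        // Drop parameters such as "; charset=UTF-8" before comparing the MIME type.
        String mime = contentType.split(";", 2)[0].trim().toLowerCase(Locale.ROOT);
        return allowedMimeTypes.contains(mime);
    }
}
```

A spider could build one such rule per crawl, for example new DownloadRule("example.com", 1_000_000, Set.of("text/html")), and call allows with the size and content type reported by the server before fetching each page body.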