Abstract
In the early stages of the internet's development, there were few websites, so finding information was comparatively easy. With the explosive growth of the internet, however, finding information became very difficult for ordinary users, which created the need for dedicated search websites. A crucial component of web search engine technology is the web spider program.
This paper implements the full process from supplying a website address to performing a search: the program uses database connection technology to manage web page links and downloads the visited resources to the local hard drive. The Lucene toolkit is used to index the downloaded resources. The paper focuses on the following technologies: the core of the spider program (the communication core and the spider's working core), index creation for the downloaded resources, and search.
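The crawl workflow summarized above can be illustrated with a breadth-first traversal of links. The following is only a minimal sketch under stated assumptions, not the program developed in this paper: the names `SpiderSketch`, `crawl`, `linksOf`, and `maxPages` are hypothetical, an in-memory set stands in for the database-backed link management, and downloading and Lucene indexing are omitted.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Queue;
import java.util.Set;
import java.util.function.Function;

// Hypothetical illustration of a spider's working core.
class SpiderSketch {

    // Visits pages breadth-first starting from the seed address.
    // linksOf is an assumed callback that returns the links found on a page;
    // a real spider would fetch the page over HTTP and parse its links here.
    static List<String> crawl(String seed,
                              Function<String, List<String>> linksOf,
                              int maxPages) {
        Queue<String> frontier = new ArrayDeque<>(); // links waiting to be visited
        Set<String> seen = new HashSet<>();          // stands in for the link database
        List<String> visited = new ArrayList<>();    // pages in the order they were visited

        frontier.add(seed);
        seen.add(seed);

        while (!frontier.isEmpty() && visited.size() < maxPages) {
            String url = frontier.remove();
            visited.add(url); // a real spider would save the page to disk at this point

            for (String link : linksOf.apply(url)) {
                // Set.add returns false for links already recorded,
                // so each address is queued at most once.
                if (seen.add(link)) {
                    frontier.add(link);
                }
            }
        }
        return visited;
    }
}
```

Passing the link extractor as a parameter keeps the traversal logic separate from the communication layer, so the loop can be exercised on a small in-memory link graph before any network code is attached.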
Through design and analysis, I have completed my own web spider program. The program was implemented according to the initial design and collects and organizes web resources. These functions passed testing, and the program runs normally.