Visual FoxPro
Avoid URLs Matching Any of a Set of Patterns
Demonstrates how to use "avoid patterns" to prevent the spider from crawling any URL that matches a wildcard pattern. This example avoids URLs containing the substrings "java", "python", or "perl".
LOCAL lnSuccess
LOCAL loSpider
LOCAL i
lnSuccess = 0
loSpider = CreateObject('Chilkat.Spider')
* The spider object crawls a single web site at a time. As you'll see
* in later examples, you can collect outbound links and use them to
* crawl the web. For now, we'll simply spider 10 pages of chilkatsoft.com
loSpider.Initialize("www.chilkatsoft.com")
* Add the 1st URL:
loSpider.AddUnspidered("http://www.chilkatsoft.com/")
* Avoid URLs matching these patterns:
loSpider.AddAvoidPattern("*java*")
loSpider.AddAvoidPattern("*python*")
loSpider.AddAvoidPattern("*perl*")
* Begin crawling the site by calling CrawlNext repeatedly.
FOR i = 0 TO 9
    lnSuccess = loSpider.CrawlNext()
    IF (lnSuccess = 1) THEN
        * Show the URL of the page just spidered.
        ? loSpider.LastUrl
        * The HTML is available in the LastHtml property
    ELSE
        * Did we get an error or are there no more URLs to crawl?
        IF (loSpider.NumUnspidered = 0) THEN
            ? "No more URLs to spider"
            * Nothing is left in the queue, so stop looping.
            EXIT
        ELSE
            ? loSpider.LastErrorText
        ENDIF
    ENDIF
    * Sleep 1 second before spidering the next URL.
    loSpider.SleepMs(1000)
NEXT
RELEASE loSpider
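
* Illustrative sketch only: this is NOT how Chilkat implements avoid patterns.
* It uses Visual FoxPro's built-in LIKE() function to show how a wildcard
* pattern such as "*java*" could be tested against a URL. The URL below is
* hypothetical, and lowercasing it before the comparison is an assumption;
* the source does not say whether the spider's matching is case-sensitive.
LOCAL lcUrl, lcPattern, llAvoid
LOCAL ARRAY laPatterns[3]
laPatterns[1] = "*java*"
laPatterns[2] = "*python*"
laPatterns[3] = "*perl*"
* Hypothetical URL used only for this demonstration.
lcUrl = "http://www.chilkatsoft.com/Java-Examples.html"
llAvoid = .F.
FOR EACH lcPattern IN laPatterns
    * LIKE() is case-sensitive, so compare against the lowercased URL.
    IF LIKE(lcPattern, LOWER(lcUrl))
        llAvoid = .T.
        EXIT
    ENDIF
NEXT
? lcUrl, IIF(llAvoid, "would be avoided", "would be crawled")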