GetBaseDomain

The GetBaseDomain method is a utility function that converts a domain into its "base domain", which is useful for grouping URLs. For example, abc.chilkatsoft.com, xyz.chilkatsoft.com, and blog.chilkatsoft.com all share the base domain chilkatsoft.com. Country domains (.au, .uk, .se, .cn, etc.) and government, state, and .us domains complicate matters, because the base domain can span more than two labels: www.news.com.au, for instance, has the base domain news.com.au. Domains such as blogspot and wordpress are treated specially, so that "xyz.blogspot.com" has a base domain of "xyz.blogspot.com". Note: If you find other domains that should be treated like blogspot.com, send a request to support@chilkatsoft.com.

Tcl

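# Load the Chilkat Tcl extension (the path and file name depend on your installation).
load ./chilkat.dll

# Create the spider object that provides GetBaseDomain.
set spider [new_CkSpider]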

puts [CkSpider_getBaseDomain $spider www.chilkatsoft.com]
puts [CkSpider_getBaseDomain $spider blog.chilkatsoft.com]
puts [CkSpider_getBaseDomain $spider www.news.com.au]
puts [CkSpider_getBaseDomain $spider blogs.bbc.co.uk]
puts [CkSpider_getBaseDomain $spider xyz.blogspot.com]
puts [CkSpider_getBaseDomain $spider www.heaids.org.za]
puts [CkSpider_getBaseDomain $spider www.hec.gov.pk]
puts [CkSpider_getBaseDomain $spider www.e-mrs.org]
puts [CkSpider_getBaseDomain $spider cra.curtin.edu.au]

# Prints: 
# chilkatsoft.com
# chilkatsoft.com
# news.com.au
# bbc.co.uk
# xyz.blogspot.com
# heaids.org.za
# hec.gov.pk
# e-mrs.org
# curtin.edu.au

delete_CkSpider $spider
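
The description above says base domains are useful for grouping URLs. A minimal sketch of that grouping follows. The proc name groupByBaseDomain and its baseCmd parameter are hypothetical helpers for illustration, not part of the Chilkat API; the sketch assumes Tcl 8.5 or later (for dict and {*}).

```tcl
# Hypothetical helper (not part of Chilkat): group hostnames by base domain.
#   baseCmd - command prefix that takes a hostname and returns its base domain
#   hosts   - list of hostnames to group
# Returns a dict mapping each base domain to the list of hostnames under it.
proc groupByBaseDomain {baseCmd hosts} {
    set groups [dict create]
    foreach host $hosts {
        # Expand the command prefix and append the hostname as its argument.
        set base [{*}$baseCmd $host]
        dict lappend groups $base $host
    }
    return $groups
}
```

With a spider object created via new_CkSpider (and not yet deleted), the call would look like: groupByBaseDomain [list CkSpider_getBaseDomain $spider] $hosts. Going by the results printed in the example above, www.chilkatsoft.com and blog.chilkatsoft.com should land under the same key, chilkatsoft.com, while xyz.blogspot.com should form its own group.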