Optimizing Web Crawlers for Shared Hosts
- Published: Thursday, 09 May 2013 09:25
- Written by Alan Langford
Recently I've seen some of our shared servers get bogged down when web crawlers start working through certain large sites. What these sites have in common is sections where composing a page requires longer-running database queries. For example, one of them has a large database backing a directory.
From the hosting perspective, a slow shared server is a distressing prospect. It's fine if one site with inefficient queries takes longer to load, but it's not fine when that load affects other sites on the same server. It's pretty common for hosting companies (at least the few that monitor this sort of thing) to react by booting the problematic site, usually recommending a dedicated or virtual server in the process. But this can be impractical or grossly unfair. Why should a web site that has accumulated a large database but sees low overall traffic be forced into a much more expensive hosting plan just because of the way web crawlers work?