Website optimization work usually focuses heavily on site data such as indexing, snapshots, and rankings. On large sites in particular, many of these numbers are often less than ideal. When that happens, you cannot determine the cause from surface data alone; you need to dig deeper and use the web logs to see exactly how search engine spiders visit the site. When I analyze a website, I attach great importance to the logs, because that is usually where the root of a problem can be found. Today I am sharing some of my own methods and ideas, and I hope to exchange more with others.
View the overall crawl situation
View the total number of search engine spider crawls
Besides the total number of visits and the overall directory crawl, another very important thing to check is the overall page crawl. When I analyze websites, I often find that the pages spiders crawl most are pages of little importance. On a B2C site, for example, these are shopping cart links, the contact-us page, CSS files, and some theme files. These pages and files do nothing for rankings or indexing, yet in practice they are crawled the most, which wastes crawl time. After all, a spider's total number of visits to a site, its total crawl volume, and its total crawl depth are all fixed, so if the time is wasted on these pages, the indexing of key pages suffers. Once the logs reveal this, block such pages in robots.txt or with a robots meta tag on the page.
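The check described above can be sketched in Python. This is a minimal sketch, not a definitive tool: it assumes the server writes the Apache/Nginx "combined" log format, that spiders can be recognized by the user-agent tokens in `SPIDERS`, and the function names `top_pages` and `still_crawlable` are hypothetical. The robots.txt check uses the standard library's `urllib.robotparser`.

```python
import re
from collections import Counter
from urllib.parse import urlsplit
from urllib.robotparser import RobotFileParser

# Assumption: Apache/Nginx "combined" log format.
LOG_RE = re.compile(
    r'\S+ \S+ \S+ \[(?P<time>[^\]]+)\] "(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)
# Assumption: spiders are identified by these user-agent substrings.
SPIDERS = ("Baiduspider", "Googlebot", "bingbot")


def top_pages(lines, n=10):
    """Return the n paths most requested by spiders as (path, hits) pairs."""
    counts = Counter()
    for line in lines:
        m = LOG_RE.match(line)
        if not m:
            continue  # skip lines that do not fit the assumed format
        agent = m.group("agent").lower()
        if not any(s.lower() in agent for s in SPIDERS):
            continue  # not a spider request
        counts[urlsplit(m.group("path")).path] += 1  # drop the query string
    return counts.most_common(n)


def still_crawlable(paths, robots_txt, agent="Baiduspider"):
    """Return the paths that the given robots.txt text does not block."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [p for p in paths if parser.can_fetch(agent, p)]
```

Feeding the paths from `top_pages` into `still_crawlable` shows which heavily crawled, low-value pages are still open to spiders after robots.txt has been updated.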
For a site to be indexed, search engine spiders first have to crawl it. The logs show how many times spiders visit the site each day, which gives a rough indication of the site's weight. A high-weight site gets relatively more spider visits; a low-weight site, even a very large one, still gets a limited number. When the number of crawls is limited, the time and depth allocated are limited, and so is indexing. So our focus should be on providing more entrances to our site from the search engine; only with more entrances can the number of visits improve.
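Counting daily spider visits from the logs might look like the following. This is a rough sketch under stated assumptions: the log is in the Apache/Nginx "combined" format, spiders are recognized by the user-agent tokens in `SPIDERS`, and `daily_spider_visits` is a hypothetical helper name.

```python
import re
from collections import Counter

# Assumption: Apache/Nginx "combined" log format.
LOG_RE = re.compile(
    r'\S+ \S+ \S+ \[(?P<time>[^\]]+)\] "(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)
# Assumption: spiders are identified by these user-agent substrings.
SPIDERS = ("Baiduspider", "Googlebot", "bingbot")


def daily_spider_visits(lines):
    """Count spider requests per (day, spider) pair."""
    counts = Counter()
    for line in lines:
        m = LOG_RE.match(line)
        if not m:
            continue  # skip lines that do not fit the assumed format
        agent = m.group("agent").lower()
        spider = next((s for s in SPIDERS if s.lower() in agent), None)
        if spider is None:
            continue  # ordinary visitor, not a spider
        # The timestamp looks like "10/Oct/2023:13:55:36 +0800";
        # the part before the first colon is the day.
        day = m.group("time").split(":", 1)[0]
        counts[(day, spider)] += 1
    return counts
```

Comparing these daily totals over several weeks gives a trend line for how much attention the spiders are paying to the site.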
Viewing the overall directory crawl in the logs tells us which directories the search engine crawls most, and whether they are the directories we want to offer users. The top ten directories are usually easy to pick out. If the most-crawled directories are not the ones holding the key content we provide to users, adjustments are needed. Sometimes a directory is crawled heavily while the directories with real value get little attention from the search engine; spot this promptly and find the reason.
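A top-ten directory list can be produced with a short script. The sketch below makes several assumptions: the Apache/Nginx "combined" log format, the spider tokens in `SPIDERS`, and a definition of "directory" as the first path segment (files in the site root are grouped under `/`). The name `top_directories` is hypothetical.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Assumption: Apache/Nginx "combined" log format.
LOG_RE = re.compile(
    r'\S+ \S+ \S+ \[(?P<time>[^\]]+)\] "(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)
# Assumption: spiders are identified by these user-agent substrings.
SPIDERS = ("Baiduspider", "Googlebot", "bingbot")


def top_directories(lines, n=10):
    """Return the n first-level directories most requested by spiders."""
    counts = Counter()
    for line in lines:
        m = LOG_RE.match(line)
        if not m:
            continue  # skip lines that do not fit the assumed format
        agent = m.group("agent").lower()
        if not any(s.lower() in agent for s in SPIDERS):
            continue  # not a spider request
        path = urlsplit(m.group("path")).path
        # Drop the file name, then keep only the first directory segment.
        dir_part = path.rsplit("/", 1)[0]
        segments = [p for p in dir_part.split("/") if p]
        directory = "/" + segments[0] + "/" if segments else "/"
        counts[directory] += 1
    return counts.most_common(n)
```

If the directories at the top of this list are not the ones that hold the site's key content, that is the signal to adjust.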
View the top ten crawled pages
View the average crawl depth
View which directories are crawled