1. Java example
In Java, you can get the user agent from HttpServletRequest.
Example: the service is hosted at abcdefg.com
@Autowired
private HttpServletRequest request;
//...
String userAgent = request.getHeader("user-agent");
System.out.println("User Agent : " + userAgent);
if (!StringUtils.isEmpty(userAgent)) {
    if (userAgent.toLowerCase().contains("googlebot")) {
        System.out.println("This is Google bot");
    } else {
        System.out.println("Not from Google");
    }
}
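Note that the solution above works, but it cannot detect a fake user agent.
2. Fake user agent
It is easy to create a request with a fake (spoofed) user agent.
Example: send a request with a forged user agent to abcdefg.com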
package com.mkyong.web;
import org.apache.http.HttpResponse;
import org.apache.http.client.HttpClient;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.HttpClientBuilder;
public class test {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClientBuilder.create().build();
        // HttpGet needs a full URI including the scheme
        HttpGet request = new HttpGet("http://abcdefg.com");
        // pretend to be Googlebot
        request.setHeader("user-agent", "fake googlebot");
        HttpResponse response = client.execute(request);
    }
}
Output on abcdefg.com:
User Agent : fake googlebot
This is Google bot
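3. Verifying Googlebot
To verify a real Googlebot, you can run a reverse DNS lookup manually, then a forward lookup on the returned name to confirm that it resolves back to the same IP.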
> host 66.249.66.1
1.66.249.66.in-addr.arpa domain name pointer crawl-66-249-66-1.googlebot.com.
> host crawl-66-249-66-1.googlebot.com
crawl-66-249-66-1.googlebot.com has address 66.249.66.1
Source: Verifying Googlebot
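The two lookups above can also be done from Java without an external command. The following is only a minimal sketch, assuming the standard java.net.InetAddress API; the class name GooglebotDns and both method names are made up for illustration and are not part of the original example. The result depends on your DNS resolver, so no output is shown.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Locale;

// Hypothetical helper, not from the original article.
final class GooglebotDns {

    // True if the name is a subdomain of googlebot.com (a trailing dot is tolerated).
    static boolean isGooglebotHost(String hostname) {
        if (hostname == null) {
            return false;
        }
        String h = hostname.toLowerCase(Locale.ROOT);
        if (h.endsWith(".")) {
            h = h.substring(0, h.length() - 1);
        }
        return h.endsWith(".googlebot.com");
    }

    // Step 1: reverse lookup (IP -> name). Step 2: forward lookup (name -> IPs)
    // and check that the original IP is among the results.
    static boolean verify(String ip) {
        try {
            InetAddress addr = InetAddress.getByName(ip);
            String host = addr.getCanonicalHostName(); // returns the IP text if no name is found
            if (!isGooglebotHost(host)) {
                return false;
            }
            for (InetAddress a : InetAddress.getAllByName(host)) {
                if (a.equals(addr)) {
                    return true;
                }
            }
            return false;
        } catch (UnknownHostException e) {
            return false;
        }
    }
}
```

A call such as GooglebotDns.verify("66.249.66.1") would perform both lookups for the address used in the host example above.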
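4. Verifying Googlebot: Java example
Based on the theory above, we can simulate the first part of the reverse DNS lookup: use the host command to find out where the requesting IP points.
If the request comes from Googlebot, the output will show this pattern: xx *.googlebot.com.
P.S. The host command is only available on *nix systems.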
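A plain substring test for googlebot.com would also accept a name such as googlebot.com.evil.example, so the pattern check is worth making a little stricter. Here is a minimal sketch, assuming the host output has the "domain name pointer" format shown in section 3 (other versions of host may print something different); the class HostOutputCheck and the method pointsToGooglebot are hypothetical names.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not from the original article.
final class HostOutputCheck {

    // Captures the name after "domain name pointer", e.g.
    // "1.66.249.66.in-addr.arpa domain name pointer crawl-66-249-66-1.googlebot.com."
    private static final Pattern POINTER =
            Pattern.compile("domain name pointer\\s+(\\S+)");

    // True only if at least one pointer target ends with ".googlebot.com".
    static boolean pointsToGooglebot(String hostOutput) {
        if (hostOutput == null) {
            return false;
        }
        Matcher m = POINTER.matcher(hostOutput);
        while (m.find()) {
            String name = m.group(1).toLowerCase(Locale.ROOT);
            if (name.endsWith(".")) {
                name = name.substring(0, name.length() - 1); // drop the root dot
            }
            if (name.endsWith(".googlebot.com")) {
                return true;
            }
        }
        return false;
    }
}
```

The suffix match only covers the reverse lookup; the forward lookup from section 3 is still needed to confirm the name.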
Example: detecting a fake user agent
@Autowired
private HttpServletRequest request;
//...
String requestIp = getRequestIp();
String userAgent = request.getHeader("user-agent");
System.out.println("User Agent : " + userAgent);
if (!StringUtils.isEmpty(userAgent)) {
    if (userAgent.toLowerCase().contains("googlebot")) {
        // check fake user agent
        String output = executeCommand("host " + requestIp);
        System.out.println("Output : " + output);
        if (output.toLowerCase().contains("googlebot.com")) {
            System.out.println("This is Google bot");
        } else {
            System.out.println("This is fake user agent");
        }
    } else {
        System.out.println("Not from Google");
    }
}
// get requested IP
private String getRequestIp() {
    String ipAddress = request.getHeader("X-FORWARDED-FOR");
    if (ipAddress == null) {
        ipAddress = request.getRemoteAddr();
    }
    return ipAddress;
}
// execute external command
private String executeCommand(String command) {
    StringBuilder output = new StringBuilder();
    try {
        Process p = Runtime.getRuntime().exec(command);
        // read the output before waiting, so a full pipe buffer cannot block the child process
        try (BufferedReader reader =
                new BufferedReader(new InputStreamReader(p.getInputStream()))) {
            String line;
            while ((line = reader.readLine()) != null) {
                output.append(line).append("\n");
            }
        }
        p.waitFor();
    } catch (Exception e) {
        e.printStackTrace();
    }
    return output.toString();
}
Run the fake user agent example from step 2 again. This time you get the following output:
Output : Host 142.1.168.192.in-addr.arpa. not found: 3(NXDOMAIN) //this output may vary.
User Agent : fake googlebot
This is fake user agent
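One caveat about getRequestIp(): the X-Forwarded-For header is supplied by the client (or by proxies) and may hold a comma-separated list of addresses, so passing it straight into a command line is risky. The sketch below is one possible way to tidy it up, assuming the first entry is the client address; the class ForwardedIp and its two methods are hypothetical and not part of the original example.

```java
// Hypothetical helper, not from the original article.
final class ForwardedIp {

    // Returns the first entry of an X-Forwarded-For value,
    // or the fallback when the header is missing or blank.
    static String firstForwardedIp(String headerValue, String fallback) {
        if (headerValue == null || headerValue.trim().isEmpty()) {
            return fallback;
        }
        // limit 2 keeps empty trailing parts, so "," still yields a first element
        String first = headerValue.split(",", 2)[0].trim();
        return first.isEmpty() ? fallback : first;
    }

    // Conservative character check: only hex digits, dots and colons,
    // i.e. the characters that can appear in plain IPv4/IPv6 literals.
    static boolean looksLikeIpLiteral(String s) {
        return s != null && s.matches("[0-9a-fA-F:.]{2,45}");
    }
}
```

A caller could, for example, skip the host lookup entirely and treat the request as fake whenever looksLikeIpLiteral returns false.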
Source:
http://blog.mkfree.com/posts/204