Notes on Java Web Crawler Knowledge Gaps
Time: 2023-04-18 15:37:02
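HttpClient redirect handling
See: [HttpClient 4.5 Chinese Tutorial] Part 8: Terminating requests and redirect handling
How HttpClient differs from a browser
When a request is sent from a browser, the browser takes care of redirects, caching, and similar details for you. That is why, after a form is submitted by POST in a browser, the page still receives the data the server finally returns, no matter how the server redirects.
With HttpClient the same POST comes back with a 302 status code, because by default HttpClient does not follow redirects for requests submitted by POST. There are two ways to deal with this.
Method 1: handle the redirect manually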
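import org.apache.http.Header;
import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.util.EntityUtils;

// The statements below belong inside a method that declares "throws IOException".
CloseableHttpClient httpClient = HttpClients.createDefault();
HttpPost httpPost = new HttpPost("http://ip:port/xxx"); // placeholder URL
CloseableHttpResponse response = httpClient.execute(httpPost);
int statusCode = response.getStatusLine().getStatusCode();
System.out.println("statusCode==" + statusCode); // status code
// The redirect target is carried in the Location header
Header header = response.getFirstHeader("Location");
String location = header.getValue();
System.out.println(location);
response.close(); // release the connection before sending the next request
// Then send a new request to the redirect target
HttpGet httpGet = new HttpGet(location);
CloseableHttpResponse response2 = httpClient.execute(httpGet);
System.out.println("Response body: " + EntityUtils.toString(response2.getEntity(), "UTF-8"));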
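The Location value is not always an absolute URL, and not every response is a redirect, so the manual approach benefits from two small checks before the follow-up request is built. The sketch below uses only the JDK (java.net.URI). The class name RedirectTarget, its method names, and the example URLs are hypothetical and chosen only for illustration.

```java
import java.net.URI;
import java.util.Optional;

public class RedirectTarget {

    /** Returns true for the HTTP status codes that signal a redirect via a Location header. */
    public static boolean isRedirect(int statusCode) {
        return statusCode == 301 || statusCode == 302 || statusCode == 303
                || statusCode == 307 || statusCode == 308;
    }

    /**
     * Resolves the Location header against the URI of the original request.
     * Returns empty when the response is not a redirect or carries no Location.
     */
    public static Optional<URI> resolve(int statusCode, String requestUri, String location) {
        if (!isRedirect(statusCode) || location == null || location.isEmpty()) {
            return Optional.empty();
        }
        // URI.resolve leaves an absolute location unchanged and
        // interprets a relative one against the request URI.
        return Optional.of(URI.create(requestUri).resolve(location));
    }

    public static void main(String[] args) {
        // Hypothetical request URI and Location values
        System.out.println(resolve(302, "http://example.com/app/login", "/app/home"));
        System.out.println(resolve(302, "http://example.com/app/login", "http://other.example.com/x"));
        System.out.println(resolve(200, "http://example.com/app/login", null));
    }
}
```

The resolved URI can be passed to new HttpGet(...) in place of the raw header value.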
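Method 2: use the built-in redirect strategy
import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClientBuilder;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.impl.client.LaxRedirectStrategy;
import org.apache.http.util.EntityUtils;

HttpClientBuilder builder = HttpClients.custom()
    .disableAutomaticRetries() // turns off automatic request retries; redirect handling stays enabled
    .setRedirectStrategy(new LaxRedirectStrategy()); // LaxRedirectStrategy also follows redirects for POST
CloseableHttpClient client = builder.build();
HttpPost httpPost = new HttpPost("http://ip:port/xxx"); // placeholder URL
CloseableHttpResponse response = client.execute(httpPost);
int statusCode = response.getStatusLine().getStatusCode();
System.out.println("statusCode==" + statusCode); // status code
System.out.println("Response body: " + EntityUtils.toString(response.getEntity(), "UTF-8"));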
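Whether a redirect is followed by hand or by a redirect strategy, the follow-up request does not always reuse the original method. The sketch below models the rules of the HTTP specification (RFC 7231 and RFC 7538) as browsers commonly apply them: 303 switches to GET, 301 and 302 turn a POST into a GET, and 307 and 308 keep the method. It is a stand-alone illustration, not HttpClient's own implementation, and the class and method names are hypothetical.

```java
public class RedirectMethod {

    /** Returns the HTTP method to use for the follow-up request after a redirect. */
    public static String methodAfterRedirect(String originalMethod, int statusCode) {
        switch (statusCode) {
            case 307:
            case 308:
                // The method (and request body) must be preserved.
                return originalMethod;
            case 303:
                // "See Other": retrieve the target with GET (a HEAD request stays HEAD).
                return "HEAD".equals(originalMethod) ? "HEAD" : "GET";
            case 301:
            case 302:
                // Browsers historically rewrite POST to GET; other methods are kept.
                return "POST".equals(originalMethod) ? "GET" : originalMethod;
            default:
                throw new IllegalArgumentException("Not a redirect status: " + statusCode);
        }
    }

    public static void main(String[] args) {
        System.out.println(methodAfterRedirect("POST", 302)); // prints GET
        System.out.println(methodAfterRedirect("POST", 307)); // prints POST
    }
}
```

This is why a 302 answer to a POST is normally followed with a GET rather than by re-sending the form.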
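Two ways to get cookies with HttpClient
1. Getting cookies with the old HttpClient API
P.S. This approach is deprecated and no longer recommended by the project.
Instantiate the httpClient object with the DefaultHttpClient class: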
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import org.apache.http.HttpResponse;
import org.apache.http.NameValuePair;
import org.apache.http.client.CookieStore;
import org.apache.http.client.entity.UrlEncodedFormEntity;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.cookie.Cookie;
import org.apache.http.impl.client.DefaultHttpClient;
import org.apache.http.message.BasicNameValuePair;

public static String dooPost_deprecated(String url, Map<String, String> map, String charset) {
    DefaultHttpClient httpClient = null;
    String result = null;
    try {
        httpClient = new DefaultHttpClient();
        HttpPost httpPost = new HttpPost(url);
        // Set the form parameters
        List<NameValuePair> list = new ArrayList<NameValuePair>();
        for (Map.Entry<String, String> elem : map.entrySet()) {
            list.add(new BasicNameValuePair(elem.getKey(), elem.getValue()));
        }
        if (list.size() > 0) {
            UrlEncodedFormEntity entity = new UrlEncodedFormEntity(list, charset);
            httpPost.setEntity(entity);
        }
        HttpResponse response = httpClient.execute(httpPost);
        System.out.println(response.getStatusLine().getStatusCode());
        String JSESSIONID = null;
        String cookie_user = null;
        // Get the cookies collected by the client
        CookieStore cookieStore = httpClient.getCookieStore();
        List<Cookie> cookies = cookieStore.getCookies();
        // Iterate over the cookies
        for (Cookie cookie : cookies) {
            System.out.println(cookie);
            System.out.println("cookiename==" + cookie.getName());
            System.out.println("cookieValue==" + cookie.getValue());
            System.out.println("Domain==" + cookie.getDomain());
            System.out.println("Path==" + cookie.getPath());
            System.out.println("Version==" + cookie.getVersion());
            if (cookie.getName().equals("JSESSIONID")) {
                JSESSIONID = cookie.getValue();
            }
            if (cookie.getName().equals("cookie_user")) {
                cookie_user = cookie.getValue();
            }
        }
        // Return the session ID only when the cookie_user cookie is also present
        if (cookie_user != null) {
            result = JSESSIONID;
        }
    } catch (Exception ex) {
        ex.printStackTrace();
    } finally {
        // Release the connections held by the deprecated client
        if (httpClient != null) {
            httpClient.getConnectionManager().shutdown();
        }
    }
    return result;
}
2. Getting cookies with the new HttpClient API
Instantiate the httpClient object with the CloseableHttpClient class:
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import org.apache.http.NameValuePair;
import org.apache.http.client.CookieStore;
import org.apache.http.client.entity.UrlEncodedFormEntity;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.cookie.Cookie;
import org.apache.http.impl.client.BasicCookieStore;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.message.BasicNameValuePair;

public static String doPost(Map<String, String> map, String charset) {
    String result = null;
    // The client stores every cookie it receives in this CookieStore
    CookieStore cookieStore = new BasicCookieStore();
    // try-with-resources closes the client when the block exits
    try (CloseableHttpClient httpClient =
            HttpClients.custom().setDefaultCookieStore(cookieStore).build()) {
        HttpPost httpPost = new HttpPost("http://localhost:8080/testtoolmanagement/LoginServlet");
        // Set the form parameters
        List<NameValuePair> list = new ArrayList<NameValuePair>();
        for (Map.Entry<String, String> elem : map.entrySet()) {
            list.add(new BasicNameValuePair(elem.getKey(), elem.getValue()));
        }
        if (list.size() > 0) {
            UrlEncodedFormEntity entity = new UrlEncodedFormEntity(list, charset);
            httpPost.setEntity(entity);
        }
        // Only the cookies are needed, so close the response right away
        httpClient.execute(httpPost).close();
        String JSESSIONID = null;
        String cookie_user = null;
        List<Cookie> cookies = cookieStore.getCookies();
        for (Cookie cookie : cookies) {
            if (cookie.getName().equals("JSESSIONID")) {
                JSESSIONID = cookie.getValue();
            }
            if (cookie.getName().equals("cookie_user")) {
                cookie_user = cookie.getValue();
            }
        }
        // Return the session ID only when the cookie_user cookie is also present
        if (cookie_user != null) {
            result = JSESSIONID;
        }
    } catch (Exception ex) {
        ex.printStackTrace();
    }
    return result;
}
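Both versions repeat the same loop to pick JSESSIONID and cookie_user out of the cookie list. A minimal sketch of that lookup as a reusable helper follows. To stay runnable without extra dependencies it uses the JDK's java.net.HttpCookie as a stand-in for HttpClient's Cookie interface. The class name CookieLookup, its method names, and the sample cookie values are hypothetical.

```java
import java.net.HttpCookie;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

public class CookieLookup {

    /** Returns the value of the first cookie with the given name, if there is one. */
    public static Optional<String> findValue(List<HttpCookie> cookies, String name) {
        return cookies.stream()
                .filter(c -> c.getName().equals(name))
                .map(HttpCookie::getValue)
                .findFirst();
    }

    /**
     * Mirrors the logic of the methods above: the session ID is returned
     * only when a cookie_user cookie is present as well, otherwise null.
     */
    public static String sessionIdIfUserCookiePresent(List<HttpCookie> cookies) {
        if (!findValue(cookies, "cookie_user").isPresent()) {
            return null;
        }
        return findValue(cookies, "JSESSIONID").orElse(null);
    }

    public static void main(String[] args) {
        // Sample cookies with made-up values
        List<HttpCookie> cookies = Arrays.asList(
                new HttpCookie("JSESSIONID", "abc123"),
                new HttpCookie("cookie_user", "alice"));
        System.out.println(sessionIdIfUserCookiePresent(cookies)); // prints abc123
    }
}
```

With HttpClient the same stream pipeline can be written over the List<Cookie> returned by cookieStore.getCookies(), since Cookie also exposes getName() and getValue().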