Commit 0203d062, authored by 梦想橡皮擦

Case study: IP-limit anti-scraping

Parent f4be1693
......@@ -23,12 +23,17 @@
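15. [How I used one special Cookie to block other people's crawlers](https://blog.csdn.net/hihell/article/details/128474849)
16. [Bold move: using asynchronous loading for this little data?](https://blog.csdn.net/hihell/article/details/128474866?spm=1001.2014.3001.5501)
17. [My boss told me to manually throttle page rendering speed and said it would stop crawlers. I believed him.](https://blog.csdn.net/hihell/article/details/128474887?spm=1001.2014.3001.5501)
18. [Reason for resignation: making the BOSS learn the term "scroll loading"](https://dream.blog.csdn.net/article/details/128474916)
19. [Add simple encryption to the site's response data and block 80% of crawlers. Do you believe it?](https://dream.blog.csdn.net/article/details/128474924)
20. [One Token per second pushed to the front end, scaring every crawler engineer in the room](https://dream.blog.csdn.net/article/details/128474930)
21. [A technique every anti-scraping engineer uses: IP-limit anti-scraping, Crawler Training Ground](https://dream.blog.csdn.net/article/details/128550653)
## Supplementary quick-tip blog posts
1. [Quick tip: Crawler Training Ground project, Python Flask template updates require restarting the service every time](https://blog.csdn.net/hihell/article/details/128399376)
2. [Quick tip: Python Flask deployment, the Crawler Training Ground project in production](https://blog.csdn.net/hihell/article/details/128422613)
3. [Quick tip: adding Baidu Analytics to a Python web project, Crawler Training Ground](https://blog.csdn.net/hihell/article/details/128448271)
4. [Quick tip: adding a Bootstrap5 timeline to the Crawler Training Ground project](https://dream.blog.csdn.net/article/details/128543088)
## Site data preparation blog posts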
......
......@@ -3,10 +3,27 @@ from flask_sqlalchemy import SQLAlchemy
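from .config import BaseConfig  # import the configuration class
# Flask rate limiter
from flask_limiter import Limiter
from flask_limiter.util import get_remote_address, get_ipaddr

app = Flask(__name__)
app.config.from_object(BaseConfig)  # apply the configuration


def get_real_ip():
    # Assumes `request` is imported from flask above this hunk.
    # Prefer the X-Forwarded-For header set by a reverse proxy, else the socket address.
    if request.headers.getlist("X-Forwarded-For"):
        return request.headers.getlist("X-Forwarded-For")[0]
    return request.remote_addr


# Keyword arguments avoid relying on positional order, which differs between Flask-Limiter releases
limiter = Limiter(app=app, key_func=get_real_ip)
# limiter = Limiter(app=app, key_func=get_ipaddr)

db = SQLAlchemy()
db.init_app(app)  # initialize the database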
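`get_real_ip` returns the first `X-Forwarded-For` header value verbatim. When a request passes through several proxies, that value can be a comma-separated chain such as `client, proxy1, proxy2`, in which case the whole string would become the rate-limit key. Below is a minimal sketch of extracting only the left-most address, assuming the header is trusted because a reverse proxy you control sets it. `first_forwarded_ip` is a hypothetical helper, not part of this commit or of Flask-Limiter.

```python
from typing import Optional


def first_forwarded_ip(header_value: Optional[str], remote_addr: str) -> str:
    """Return the left-most address in an X-Forwarded-For value, else remote_addr."""
    if header_value:
        # The header is a comma-separated chain: "client, proxy1, proxy2"
        first = header_value.split(",")[0].strip()
        if first:
            return first
    return remote_addr
```

Clients can send any `X-Forwarded-For` value they like, so unless the proxy overwrites the header, a crawler could rotate it to dodge the limit.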
......
......@@ -7,6 +7,10 @@ from flask import Blueprint, jsonify, request
from flask import render_template
from ..model import School  # import from the parent package
# import the limiter object from app
from app import limiter
s = Blueprint('school', __name__, url_prefix='/ss')
......@@ -132,8 +136,6 @@ def encry_api():
"""
Generate a Cookie every 10 seconds
"""
......@@ -165,3 +167,24 @@ def token_list_school():
    pagination = pagination_object(page)
    return jsonify(pagination)


"""
Limit access by IP
"""


@s.route('/ajax_list3')
def ajax_list3():
    page = 1  # start with the first page of data
    pagination = pagination_object(page)
    return render_template('school/ajax_list3.html', pagination=pagination)


@s.route('/api3')
@limiter.limit("3/second")
def school_api3():
    page = int(request.args.get("page", 1))
    pagination = pagination_object(page)
    return jsonify(pagination)
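The `@limiter.limit("3/second")` decorator caps each key (here, the client IP) at three requests per second, and Flask-Limiter answers further requests with HTTP 429. The idea can be illustrated with a fixed-window counter. This is a simplified sketch for intuition only, not Flask-Limiter's actual implementation; `FixedWindowLimiter` and its `allow` method are hypothetical names.

```python
import time
from collections import defaultdict
from typing import Callable, Dict, Tuple


class FixedWindowLimiter:
    """Allow at most `limit` hits per key within each `window`-second window."""

    def __init__(self, limit: int, window: float = 1.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.limit = limit
        self.window = window
        self.clock = clock  # injectable so the behaviour can be tested without sleeping
        # key -> (window index, hits counted in that window)
        self._hits: Dict[str, Tuple[int, int]] = defaultdict(lambda: (-1, 0))

    def allow(self, key: str) -> bool:
        window_index = int(self.clock() // self.window)
        last_index, count = self._hits[key]
        if last_index != window_index:
            count = 0  # a new window started, so reset the counter
        if count >= self.limit:
            return False  # over the limit: a web app would answer 429 here
        self._hits[key] = (window_index, count + 1)
        return True
```

Each key is counted separately, which is why requests from a different IP are unaffected by another client's traffic.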
{% extends "base.html" %}
{% block content %}
<style>
@media (max-width: 540px) {
table {
font-size:10px !important;
}
}
</style>
<div class="container">
<table class="table table-hover table-bordered">
<table class="table table-hover table-bordered table-responsive">
<caption class="caption-top text-center">
<div class="alert alert-warning">
<p class="m-0">
<strong>CSDN 2022 Blog Star overall ranking</strong> 👉 Green background marks the top 200 by total score (promotion zone) 👈</p>
<p class="text-success p-0"><small>Data synced at: 2023-01-03 12:00</small></p>
<p class="m-0"><small>Since you're here, why not give 橡皮擦 a 5-point rating?</small> | <a target="_blank"
<p class="text-success p-0"><small>Data synced at: 2023-01-04 21:00</small></p>
<p class="m-0"><small>Since you're here, why not give 橡皮擦 a 5-point rating?</small> <br> <a target="_blank"
href="https://bbs.csdn.net/topics/611387187"><small>https://bbs.csdn.net/topics/611387187</small></a>
</p>
......@@ -18,16 +27,16 @@
<a class="btn btn-primary" href="/csdn/newstar">New stars only</a>
</div>
<br>
<div class="btn-group btn-group-sm mt-3 d-flex">
<div class="btn-group btn-group-sm mt-3 d-flex" style="font-size:12px;">
<a class="btn btn-success" href="/csdn/blogstar?page=1">Other</a>
<a class="btn btn-success" href="/csdn/blogstar?page=2">Front end</a>
<a class="btn btn-success" href="/csdn/blogstar?page=3">Back end</a>
<a class="btn btn-success" href="/csdn/blogstar?page=4">Big data</a>
<a class="btn btn-success" href="/csdn/blogstar?page=5">Cloud native</a>
<a class="btn btn-success" href="/csdn/blogstar?page=6">Frontier technology</a>
<a class="btn btn-success" href="/csdn/blogstar?page=6">Frontier</a>
<a class="btn btn-success" href="/csdn/blogstar?page=7">Artificial intelligence</a>
<a class="btn btn-success" href="/csdn/blogstar?page=8">Operations and security</a>
<a class="btn btn-success" href="/csdn/blogstar?page=9">Mobile development</a>
<a class="btn btn-success" href="/csdn/blogstar?page=8">Operations</a>
<a class="btn btn-success" href="/csdn/blogstar?page=9">Mobile</a>
<a class="btn btn-success" href="/csdn/blogstar?page=10">Internet of Things</a>
</div>
</caption>
......
{% extends "base.html" %}
{% block content %}
<style>
@media (max-width: 540px) {
table {
font-size:10px !important;
}
}
</style>
<div class="container">
<div class="table-responsive-sm">
<div class=" table-responsive">
<table class="table table-hover table-bordered">
<caption class="caption-top text-center">
<div class="alert alert-warning">
<p class="m-0">
<strong>CSDN 2022 Blog New Star overall ranking</strong> 👉 Green background marks the top 100 by total score (promotion zone) 👈</p>
<p class="text-success p-0"><small>Data synced at: 2023-12-30 9:00</small></p>
<p class="m-0"><small>Since you're here, why not give 橡皮擦 a 5-point rating?</small> | <a target="_blank"
<p class="text-success p-0"><small>Data synced at: 2023-01-04 21:00</small></p>
<p class="m-0"><small>Since you're here, why not give 橡皮擦 a 5-point rating?</small> <br><a target="_blank"
href="https://bbs.csdn.net/topics/611387187"><small>https://bbs.csdn.net/topics/611387187</small></a>
</p>
......@@ -25,7 +35,7 @@
<th>Nickname</th>
<th>Track</th>
<th>Registration time</th>
<th>目前得</th>
<th></th>
</tr>
</thead>
<tbody>
......@@ -52,9 +62,7 @@
{% endif %}
</td>
<td>
{{u.regtime}}
</td>
<td>{{u.regtime}}</td>
<td>{{u.totalScore}}</td>
</tr>
{%endfor%}
......
......@@ -234,7 +234,29 @@
</p>
</div>
<div class="card-footer text-end">
<a href="https://dream.blog.csdn.net/article/details/128474924" target="_blank" class="card-link text-muted small">Case build tutorial</a>
<a href="https://dream.blog.csdn.net/article/details/128474930" target="_blank" class="card-link text-muted small">Case build tutorial</a>
<a href="#" class="btn btn-success btn-sm card-link disabled" alt="Not yet available">Study blog</a>
</div>
</div>
</div>
<div class="col mt-2">
<div class="card border-info rounded-5 shadow-sm" style="min-height:306px;min-width:300px;">
<div class="card-header text-center">
<h4 class="card-title">IP-based crawler limiting</h4>
<div class="bg-danger text-white rounded p-1"
style="transform: rotate(20deg); position:absolute;right:0;top:0.5rem;">Latest update
</div>
</div>
<div class="card-body">
<p class="card-text">This case limits each IP to 3 API requests per second. To work through it, use a proxy IP pool or space out your requests.</p>
<p class="card-text text-left">Difficulty: ⭐⭐</p>
<p class="card-text">
Case:
<a href="/ss/ajax_list3" class="card-link text-success">School list</a>
</p>
</div>
<div class="card-footer text-end">
<a href="https://dream.blog.csdn.net/article/details/128550653" target="_blank" class="card-link text-muted small">Case build tutorial</a>
<a href="#" class="btn btn-success btn-sm card-link disabled" alt="Not yet available">Study blog</a>
</div>
</div>
......
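The IP-limit card suggests either a proxy IP pool or spacing out requests. Below is a minimal sketch of the second option: a pacer that keeps at least `1 / max_per_second` seconds between calls, so a crawler stays at or below three requests per second. `RequestPacer` is a hypothetical helper; the clock and sleep functions are injectable only to make it testable.

```python
import time
from typing import Callable


class RequestPacer:
    """Block just long enough to keep calls at or below `max_per_second`."""

    def __init__(self, max_per_second: float,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep) -> None:
        self.min_interval = 1.0 / max_per_second
        self.clock = clock
        self.sleep = sleep
        self._next_allowed = 0.0  # earliest time the next call may proceed

    def wait(self) -> float:
        """Sleep if needed; return the number of seconds slept."""
        now = self.clock()
        delay = max(0.0, self._next_allowed - now)
        if delay > 0:
            self.sleep(delay)
        # Schedule the following call one interval after this one
        self._next_allowed = max(now, self._next_allowed) + self.min_interval
        return delay
```

A crawler would create `RequestPacer(3)` once and call `wait()` before each request to `/ss/api3`. Network jitter can bunch requests together on the server side, so a slightly lower rate is likely safer in practice.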
{% extends "base.html" %}
{% block script %}
<script type="text/javascript" src="https://ajax.aspnetcdn.com/ajax/jQuery/jquery-3.6.0.min.js"></script>
<script type="text/javascript">
    function get_data(page) {
        $.ajax({
            type: "get",
            url: "/ss/api3",
            data: {
                page: page
            },
            success: function(response) {
                // AJAX request succeeded: render the returned data
                render_data(response);
                // Update the page numbers stored on the pagination links
                $('.prev').attr('page', response["prev_page"]);
                $('.next').attr('page', response["next_page"]);
                console.log("AJAX request succeeded!");
            },
            error: function(jqXHR, textStatus) {
                // jQuery passes the jqXHR object first, so log its HTTP status
                // (429 when the IP rate limit is exceeded)
                console.log("AJAX request failed: " + jqXHR.status + " " + textStatus);
            }
        });
    }

    function render_data(response) {
        var data_list = response["data_list"];
        if (data_list.length > 0) {
            $('#school_list').empty();
            $.each(data_list, function(index, item) {
                var row = $('<div>', {
                    'class': 'row mt-3',
                    'data-custom-attribute': 'value'
                });
                var col = $('<div>', {
                    'class': 'col'
                });
                var d_flex = $('<div>', {
                    'class': 'd-flex'
                });
                d_flex.append('<div class="flex-shrink-0"><a href="#"><img class="rounded-pill img-thumbnail" width="64" height="64" src="'+item.pic+'" alt=""></a></div>');
                // Build the feature badge markup
                var badge = "";
                $.each(item.feature.split(','), function(i, f) {
                    badge += ' <span class="badge rounded-pill bg-primary">'+f+'</span> ';
                });
                d_flex.append('<div class="flex-grow-1 ms-3"><h5 class="float-start pe-3">'+item.name+'</h5><p class="ms-3">'+badge+'</p><p><em>Province and city: <span class="text-black-50">'+item.province+'--'+item.city+'</span></em></p></div>');
                col.append(d_flex);
                row.append(col);
                $('#school_list').append(row);
            });
        }
    }

    $(function() {
        $('.page-item').on('click', function() {
            var page = $(this).attr('page');
            // Fetch the data for that page
            get_data(page);
        });
    });
</script>
{% endblock script %}
{% block content %}
<div class="container" id="school_list">
{% for school in pagination.data_list %}
<div class="row mt-3">
<div class="col">
<div class="d-flex">
<div class="flex-shrink-0">
<a href="#">
<img class="rounded-pill img-thumbnail" width="64" height="64" src="{{school.pic}}" alt="">
</a>
</div>
<div class="flex-grow-1 ms-3">
<h5 class="float-start pe-3">{{school.name}}</h5>
<p class="ms-3">
{% for fea in school.feature.split(',') %}
<span class="badge rounded-pill bg-primary">{{fea}}</span>
{% endfor %}
</p>
<p><em>Province and city: <span class="text-black-50">{{school.province}} -- {{school.city}}</span></em></p>
</div>
</div>
</div>
</div>
{% endfor %}
</div>
<div class="container">
<div class="row">
<div class="col">
<span class="text-dark float-end align-middle"
style="line-height: 40px;">{{pagination.total}} records in total</span>
<ul class="pagination float-end">
<li class="page-item prev" page="{{pagination.prev_page}}">
<a class="page-link" href="#">Previous</a>
</li>
<li class="page-item next" page="{{ pagination.next_page }}"><a class="page-link"
href="#">Next</a>
</li>
</ul>
</div>
</div>
</div>
{% endblock %}
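When the limit on `/ss/api3` is exceeded, the AJAX `error` callback in this template fires, because Flask-Limiter answers with HTTP 429 (Too Many Requests). A scraper working through this case can treat 429 as a signal to wait and retry. Below is a minimal sketch, assuming a caller-supplied `fetch(page)` function that returns a `(status_code, payload)` tuple; `fetch`, `fetch_with_retry`, and the backoff values are hypothetical.

```python
import time
from typing import Any, Callable, Tuple


def fetch_with_retry(fetch: Callable[[int], Tuple[int, Any]], page: int,
                     max_retries: int = 3, base_delay: float = 0.5,
                     sleep: Callable[[float], None] = time.sleep) -> Any:
    """Call fetch(page); on HTTP 429 wait with exponential backoff and retry."""
    for attempt in range(max_retries + 1):
        status, payload = fetch(page)
        if status != 429:
            # Any other status is handed back to the caller unchanged
            return payload
        if attempt < max_retries:
            sleep(base_delay * (2 ** attempt))  # 0.5s, 1s, 2s, ...
    raise RuntimeError(f"page {page}: still rate limited after {max_retries} retries")
```

Injecting `sleep` keeps the function testable; in a real crawler `fetch` would wrap whatever HTTP client is in use.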
......@@ -16,6 +16,22 @@
<span class="timeline-label">
<span class="label bg-success text-white p-1">Updates in progress</span>
</span>
<div class="timeline-item">
<div class="timeline-point timeline-point-success">
<i class="fa fa-times"></i>
</div>
<div class="timeline-event">
<div class="timeline-heading">
<h4>Crawler Training Ground V0.0.16 released</h4>
</div>
<div class="timeline-body">
<p>New anti-scraping case: per-IP request limit!</p>
</div>
<div class="timeline-footer">
<p class="text-right">2023-01-05 20:35</p>
</div>
</div>
</div>
<div class="timeline-item">
<div class="timeline-point timeline-point-success">
<i class="fa fa-times"></i>
......