Commit 368a344d authored by 梦想橡皮擦

Pharmaceutical data, CSDN like bot, Sogou Images

Parent f960ac86
name:银黄胶囊
name:阿胶益寿口服液
name:香菊片
name:舒阴洁洗剂
name:灵丹草合剂
name:田七痛经胶囊
name:枣仁安神液
name:复方庆大霉素膜
name:复方穿心莲片
name:橘半止咳颗粒
name:银黄胶囊
name:一清颗粒
name:虫草洋参胶囊
name:归圆口服液
name:五子衍宗丸
name:清气化痰丸
name:藿香清胃片
name:穿心莲片
name:维C银翘片
name:银黄颗粒
name:抗脑衰胶囊
name:苦胆草片
name:通便灵胶囊
name:复方鲜竹沥液
name:强力枇杷露
name:生脉饮
name:复方海蛇胶囊
name:宁心宝胶囊
name:银黄颗粒
name:六味地黄胶囊
\ No newline at end of file
name
银黄胶囊
阿胶益寿口服液
香菊片
舒阴洁洗剂
灵丹草合剂
田七痛经胶囊
枣仁安神液
复方庆大霉素膜
复方穿心莲片
橘半止咳颗粒
银黄胶囊
一清颗粒
虫草洋参胶囊
归圆口服液
五子衍宗丸
清气化痰丸
藿香清胃片
穿心莲片
维C银翘片
银黄颗粒
抗脑衰胶囊
苦胆草片
通便灵胶囊
复方鲜竹沥液
强力枇杷露
生脉饮
复方海蛇胶囊
宁心宝胶囊
银黄颗粒
六味地黄胶囊
# Automatically created by: scrapy startproject
#
# For more information about the [deploy] section see:
# https://scrapyd.readthedocs.io/en/latest/deploy.html
[settings]
default = yiyao.settings
[deploy]
#url = http://localhost:6800/
project = yiyao
# Define here the models for your scraped items
#
# See documentation in:
# https://docs.scrapy.org/en/latest/topics/items.html
import scrapy
class YiyaoItem(scrapy.Item):
# define the fields for your item here like:
name = scrapy.Field()
# Define here the models for your spider middleware
#
# See documentation in:
# https://docs.scrapy.org/en/latest/topics/spider-middleware.html
from scrapy import signals
# useful for handling different item types with a single interface
from itemadapter import is_item, ItemAdapter
class YiyaoSpiderMiddleware:
# Not all methods need to be defined. If a method is not defined,
# scrapy acts as if the spider middleware does not modify the
# passed objects.
@classmethod
def from_crawler(cls, crawler):
# This method is used by Scrapy to create your spiders.
s = cls()
crawler.signals.connect(s.spider_opened, signal=signals.spider_opened)
return s
def process_spider_input(self, response, spider):
# Called for each response that goes through the spider
# middleware and into the spider.
# Should return None or raise an exception.
return None
def process_spider_output(self, response, result, spider):
# Called with the results returned from the Spider, after
# it has processed the response.
# Must return an iterable of Request, or item objects.
for i in result:
yield i
def process_spider_exception(self, response, exception, spider):
# Called when a spider or process_spider_input() method
# (from other spider middleware) raises an exception.
# Should return either None or an iterable of Request or item objects.
pass
def process_start_requests(self, start_requests, spider):
# Called with the start requests of the spider, and works
# similarly to the process_spider_output() method, except
# that it doesn’t have a response associated.
# Must return only requests (not items).
for r in start_requests:
yield r
def spider_opened(self, spider):
spider.logger.info('Spider opened: %s' % spider.name)
class YiyaoDownloaderMiddleware:
# Not all methods need to be defined. If a method is not defined,
# scrapy acts as if the downloader middleware does not modify the
# passed objects.
@classmethod
def from_crawler(cls, crawler):
# This method is used by Scrapy to create your spiders.
s = cls()
crawler.signals.connect(s.spider_opened, signal=signals.spider_opened)
return s
def process_request(self, request, spider):
# Called for each request that goes through the downloader
# middleware.
# Must either:
# - return None: continue processing this request
# - or return a Response object
# - or return a Request object
# - or raise IgnoreRequest: process_exception() methods of
# installed downloader middleware will be called
return None
def process_response(self, request, response, spider):
# Called with the response returned from the downloader.
# Must either:
# - return a Response object
# - return a Request object
# - or raise IgnoreRequest
return response
def process_exception(self, request, exception, spider):
# Called when a download handler or a process_request()
# (from other downloader middleware) raises an exception.
# Must either:
# - return None: continue processing this exception
# - return a Response object: stops process_exception() chain
# - return a Request object: stops process_exception() chain
pass
def spider_opened(self, spider):
spider.logger.info('Spider opened: %s' % spider.name)
from scrapy.exporters import BaseItemExporter
class TXTItemExporter(BaseItemExporter):
def __init__(self, file, **kwargs):
super().__init__(dont_fail=True, **kwargs)
self.file = file
def export_item(self, item):
# _get_serialized_fields() yields (field name, serialized value) pairs for every field of the item
print(self._get_serialized_fields(item, default_value=''))
print(self.file)
for name, value in self._get_serialized_fields(item, default_value=''):
self.file.write(bytes("\nname:" + value, encoding="utf-8"))
# Define your item pipelines here
#
# Don't forget to add your pipeline to the ITEM_PIPELINES setting
# See: https://docs.scrapy.org/en/latest/topics/item-pipeline.html
# useful for handling different item types with a single interface
from itemadapter import ItemAdapter
class YiyaoPipeline:
def process_item(self, item, spider):
return item
# Scrapy settings for yiyao project
#
# For simplicity, this file contains only settings considered important or
# commonly used. You can find more settings consulting the documentation:
#
# https://docs.scrapy.org/en/latest/topics/settings.html
# https://docs.scrapy.org/en/latest/topics/downloader-middleware.html
# https://docs.scrapy.org/en/latest/topics/spider-middleware.html
BOT_NAME = 'yiyao'
SPIDER_MODULES = ['yiyao.spiders']
NEWSPIDER_MODULE = 'yiyao.spiders'
# Crawl responsibly by identifying yourself (and your website) on the user-agent
USER_AGENT = 'yiyao (+http://www.yourdomain.com)'
# Obey robots.txt rules
ROBOTSTXT_OBEY = False
# Configure maximum concurrent requests performed by Scrapy (default: 16)
# CONCURRENT_REQUESTS = 32
# Configure a delay for requests for the same website (default: 0)
# See https://docs.scrapy.org/en/latest/topics/settings.html#download-delay
# See also autothrottle settings and docs
# DOWNLOAD_DELAY = 3
# The download delay setting will honor only one of:
# CONCURRENT_REQUESTS_PER_DOMAIN = 16
# CONCURRENT_REQUESTS_PER_IP = 16
# Disable cookies (enabled by default)
# COOKIES_ENABLED = False
# Disable Telnet Console (enabled by default)
# TELNETCONSOLE_ENABLED = False
# Override the default request headers:
# DEFAULT_REQUEST_HEADERS = {
# 'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
# 'Accept-Language': 'en',
# }
# Enable or disable spider middlewares
# See https://docs.scrapy.org/en/latest/topics/spider-middleware.html
# SPIDER_MIDDLEWARES = {
# 'yiyao.middlewares.YiyaoSpiderMiddleware': 543,
# }
# Enable or disable downloader middlewares
# See https://docs.scrapy.org/en/latest/topics/downloader-middleware.html
# DOWNLOADER_MIDDLEWARES = {
# 'yiyao.middlewares.YiyaoDownloaderMiddleware': 543,
# }
# Enable or disable extensions
# See https://docs.scrapy.org/en/latest/topics/extensions.html
# EXTENSIONS = {
# 'scrapy.extensions.telnet.TelnetConsole': None,
# }
# Configure item pipelines
# See https://docs.scrapy.org/en/latest/topics/item-pipeline.html
ITEM_PIPELINES = {
'yiyao.pipelines.YiyaoPipeline': 300,
}
# Enable and configure the AutoThrottle extension (disabled by default)
# See https://docs.scrapy.org/en/latest/topics/autothrottle.html
# AUTOTHROTTLE_ENABLED = True
# The initial download delay
# AUTOTHROTTLE_START_DELAY = 5
# The maximum download delay to be set in case of high latencies
# AUTOTHROTTLE_MAX_DELAY = 60
# The average number of requests Scrapy should be sending in parallel to
# each remote server
# AUTOTHROTTLE_TARGET_CONCURRENCY = 1.0
# Enable showing throttling stats for every response received:
# AUTOTHROTTLE_DEBUG = False
# Enable and configure HTTP caching (disabled by default)
# See https://docs.scrapy.org/en/latest/topics/downloader-middleware.html#httpcache-middleware-settings
# HTTPCACHE_ENABLED = True
# HTTPCACHE_EXPIRATION_SECS = 0
# HTTPCACHE_DIR = 'httpcache'
# HTTPCACHE_IGNORE_HTTP_CODES = []
# HTTPCACHE_STORAGE = 'scrapy.extensions.httpcache.FilesystemCacheStorage'
# FEEDS = {
# '%(batch_id)d.csv': {
# 'format': 'csv',
# 'encoding': 'utf8',
# 'batch_item_count': 2,
# },
# }
# Register the custom exporter so the 'txt' feed format maps to TXTItemExporter in my_ext.py
FEED_EXPORTERS = {'txt': 'yiyao.my_ext.TXTItemExporter'}
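As a reading aid (not part of the commit): a minimal sketch of driving the yy spider from a script so that items flow through the TXTItemExporter registered above. It assumes the yiyao project is on the import path, and the output file name data.txt is only an illustrative choice.

```python
# Sketch only: run the yy spider and export items through the custom 'txt' feed format.
from scrapy.crawler import CrawlerProcess
from scrapy.utils.project import get_project_settings

settings = get_project_settings()
# 'txt' resolves to yiyao.my_ext.TXTItemExporter via the FEED_EXPORTERS setting above.
settings.set("FEEDS", {"data.txt": {"format": "txt"}})

process = CrawlerProcess(settings)
process.crawl("yy")  # spider name defined in yiyao/spiders/yy.py
process.start()
```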
# This package will contain the spiders of your Scrapy project
#
# Please refer to the documentation for information on how to create and manage
# your spiders.
import scrapy
from yiyao.items import YiyaoItem
class YySpider(scrapy.Spider):
name = 'yy'
allowed_domains = ['pharmnet.com.cn']
start_urls = ['http://www.pharmnet.com.cn/product/1111/1/1.html']
def parse(self, response):
all_items = response.css('a.green.fb.f13::text').getall()
for item in all_items:
ret = YiyaoItem()
ret["name"] = item
yield ret
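The spider above requests only page 1 of the listing. Below is a hedged sketch of seeding several pages up front; the page count of 5 and the assumption that the URL pattern simply increments the trailing number are mine, not the commit's.

```python
# Sketch only: same parse logic, but seeding the first five listing pages
# (the page count and the URL pattern beyond page 1 are assumptions).
import scrapy
from yiyao.items import YiyaoItem


class YyPagedSpider(scrapy.Spider):
    name = 'yy_paged'
    allowed_domains = ['pharmnet.com.cn']
    start_urls = [
        'http://www.pharmnet.com.cn/product/1111/1/{}.html'.format(page)
        for page in range(1, 6)
    ]

    def parse(self, response):
        for name in response.css('a.green.fb.f13::text').getall():
            item = YiyaoItem()
            item["name"] = name
            yield item
```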
# Automatically created by: scrapy startproject
#
# For more information about the [deploy] section see:
# https://scrapyd.readthedocs.io/en/latest/deploy.html
[settings]
default = sougou.settings
[deploy]
#url = http://localhost:6800/
project = sougou
# Define here the models for your scraped items
#
# See documentation in:
# https://docs.scrapy.org/en/latest/topics/items.html
import scrapy
class SougouItem(scrapy.Item):
# define the fields for your item here like:
# name = scrapy.Field()
pass
# Define here the models for your spider middleware
#
# See documentation in:
# https://docs.scrapy.org/en/latest/topics/spider-middleware.html
from scrapy import signals
# useful for handling different item types with a single interface
from itemadapter import is_item, ItemAdapter
class SougouSpiderMiddleware:
# Not all methods need to be defined. If a method is not defined,
# scrapy acts as if the spider middleware does not modify the
# passed objects.
@classmethod
def from_crawler(cls, crawler):
# This method is used by Scrapy to create your spiders.
s = cls()
crawler.signals.connect(s.spider_opened, signal=signals.spider_opened)
return s
def process_spider_input(self, response, spider):
# Called for each response that goes through the spider
# middleware and into the spider.
# Should return None or raise an exception.
return None
def process_spider_output(self, response, result, spider):
# Called with the results returned from the Spider, after
# it has processed the response.
# Must return an iterable of Request, or item objects.
for i in result:
yield i
def process_spider_exception(self, response, exception, spider):
# Called when a spider or process_spider_input() method
# (from other spider middleware) raises an exception.
# Should return either None or an iterable of Request or item objects.
pass
def process_start_requests(self, start_requests, spider):
# Called with the start requests of the spider, and works
# similarly to the process_spider_output() method, except
# that it doesn’t have a response associated.
# Must return only requests (not items).
for r in start_requests:
yield r
def spider_opened(self, spider):
spider.logger.info('Spider opened: %s' % spider.name)
class SougouDownloaderMiddleware:
# Not all methods need to be defined. If a method is not defined,
# scrapy acts as if the downloader middleware does not modify the
# passed objects.
@classmethod
def from_crawler(cls, crawler):
# This method is used by Scrapy to create your spiders.
s = cls()
crawler.signals.connect(s.spider_opened, signal=signals.spider_opened)
return s
def process_request(self, request, spider):
# Called for each request that goes through the downloader
# middleware.
# Must either:
# - return None: continue processing this request
# - or return a Response object
# - or return a Request object
# - or raise IgnoreRequest: process_exception() methods of
# installed downloader middleware will be called
return None
def process_response(self, request, response, spider):
# Called with the response returned from the downloader.
# Must either:
# - return a Response object
# - return a Request object
# - or raise IgnoreRequest
return response
def process_exception(self, request, exception, spider):
# Called when a download handler or a process_request()
# (from other downloader middleware) raises an exception.
# Must either:
# - return None: continue processing this exception
# - return a Response object: stops process_exception() chain
# - return a Request object: stops process_exception() chain
pass
def spider_opened(self, spider):
spider.logger.info('Spider opened: %s' % spider.name)
# Define your item pipelines here
#
# Don't forget to add your pipeline to the ITEM_PIPELINES setting
# See: https://docs.scrapy.org/en/latest/topics/item-pipeline.html
# useful for handling different item types with a single interface
from itemadapter import ItemAdapter
from scrapy.http import Request
from scrapy.pipelines.images import ImagesPipeline
class SougouPipeline:
def process_item(self, item, spider):
return item
class SogouImgPipeline(ImagesPipeline):
def get_media_requests(self, item, info):
name = item["name"]
for index, url in enumerate(item["image_urls"]):
yield Request(url, meta={'name': name, 'index': index})
def file_path(self, request, response=None, info=None):
# image title passed through request.meta
name = request.meta['name']
# index of the URL within the item's image_urls list
index = request.meta['index']
filename = u'{0}_{1}.jpg'.format(name, index)
print(filename)
return filename
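file_path uses the image title verbatim as the file name, which breaks when a title contains path separators or other characters the filesystem rejects. Below is a hedged sketch of a sanitizing helper it could call first; the character rule is an assumption, not from the commit.

```python
# Sketch only: strip characters that commonly break file names before building the path.
import re


def safe_filename(name, index):
    clean = re.sub(r'[\\/:*?"<>|\s]+', '_', name).strip('_')
    return '{0}_{1}.jpg'.format(clean or 'unnamed', index)
```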
# Scrapy settings for sougou project
#
# For simplicity, this file contains only settings considered important or
# commonly used. You can find more settings consulting the documentation:
#
# https://docs.scrapy.org/en/latest/topics/settings.html
# https://docs.scrapy.org/en/latest/topics/downloader-middleware.html
# https://docs.scrapy.org/en/latest/topics/spider-middleware.html
BOT_NAME = 'sougou'
SPIDER_MODULES = ['sougou.spiders']
NEWSPIDER_MODULE = 'sougou.spiders'
# Crawl responsibly by identifying yourself (and your website) on the user-agent
USER_AGENT = 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/95.0.4638.54 Safari/537.36'
# Obey robots.txt rules
ROBOTSTXT_OBEY = False
# Configure maximum concurrent requests performed by Scrapy (default: 16)
# CONCURRENT_REQUESTS = 32
# Configure a delay for requests for the same website (default: 0)
# See https://docs.scrapy.org/en/latest/topics/settings.html#download-delay
# See also autothrottle settings and docs
DOWNLOAD_DELAY = 1
RANDOMIZE_DOWNLOAD_DELAY = True
# The download delay setting will honor only one of:
# CONCURRENT_REQUESTS_PER_DOMAIN = 16
# CONCURRENT_REQUESTS_PER_IP = 16
# Disable cookies (enabled by default)
# COOKIES_ENABLED = False
# Disable Telnet Console (enabled by default)
# TELNETCONSOLE_ENABLED = False
# Override the default request headers:
DEFAULT_REQUEST_HEADERS = {
'Accept': 'application/json, text/plain, */*',
'Accept-Encoding': 'gzip, deflate, br',
'Accept-Language': 'zh-CN,zh;q=0.9',
'HOST': 'pic.sogou.com',
}
# Enable or disable spider middlewares
# See https://docs.scrapy.org/en/latest/topics/spider-middleware.html
# SPIDER_MIDDLEWARES = {
# 'sougou.middlewares.SougouSpiderMiddleware': 543,
# }
# Enable or disable downloader middlewares
# See https://docs.scrapy.org/en/latest/topics/downloader-middleware.html
# DOWNLOADER_MIDDLEWARES = {
# 'sougou.middlewares.SougouDownloaderMiddleware': 543,
# }
# Enable or disable extensions
# See https://docs.scrapy.org/en/latest/topics/extensions.html
# EXTENSIONS = {
# 'scrapy.extensions.telnet.TelnetConsole': None,
# }
# Configure item pipelines
# See https://docs.scrapy.org/en/latest/topics/item-pipeline.html
ITEM_PIPELINES = {
'sougou.pipelines.SogouImgPipeline': 1,
}
IMAGES_STORE = "images"
# Enable and configure the AutoThrottle extension (disabled by default)
# See https://docs.scrapy.org/en/latest/topics/autothrottle.html
# AUTOTHROTTLE_ENABLED = True
# The initial download delay
# AUTOTHROTTLE_START_DELAY = 5
# The maximum download delay to be set in case of high latencies
# AUTOTHROTTLE_MAX_DELAY = 60
# The average number of requests Scrapy should be sending in parallel to
# each remote server
# AUTOTHROTTLE_TARGET_CONCURRENCY = 1.0
# Enable showing throttling stats for every response received:
# AUTOTHROTTLE_DEBUG = False
# Enable and configure HTTP caching (disabled by default)
# See https://docs.scrapy.org/en/latest/topics/downloader-middleware.html#httpcache-middleware-settings
# HTTPCACHE_ENABLED = True
# HTTPCACHE_EXPIRATION_SECS = 0
# HTTPCACHE_DIR = 'httpcache'
# HTTPCACHE_IGNORE_HTTP_CODES = []
# HTTPCACHE_STORAGE = 'scrapy.extensions.httpcache.FilesystemCacheStorage'
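Two practical notes on the image settings above, offered as a hedged sketch rather than part of the commit: ImagesPipeline needs Pillow installed at runtime, and it supports optional size and expiry filters alongside IMAGES_STORE. The values below are purely illustrative.

```python
# Optional ImagesPipeline settings (values are illustrative, not from the commit).
IMAGES_MIN_HEIGHT = 100  # skip images shorter than 100 px
IMAGES_MIN_WIDTH = 100   # skip images narrower than 100 px
IMAGES_EXPIRES = 90      # re-download an image only if the cached copy is older than 90 days
```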
# This package will contain the spiders of your Scrapy project
#
# Please refer to the documentation for information on how to create and manage
# your spiders.
import scrapy
class SgSpider(scrapy.Spider):
name = 'sg'
allowed_domains = ['pic.sogou.com']
base_url = "https://pic.sogou.com/napi/pc/recommend?key=homeFeedData&category=feed&start={}&len=10"
start_urls = [base_url.format(0)]
def parse(self, response):
json_data = response.json()
if json_data is not None:
img_list = json_data["data"]["list"]
for img in img_list:
yield {
'name': img[0]['title'],
'image_urls': [_["originImage"] for _ in img[0]["picList"]],
}
else:
return None
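SgSpider consumes only the first batch (start=0). A hedged sketch of paging through the feed, meant to sit at the end of parse after the items are yielded; it assumes the API accepts larger start offsets, and the cap of 100 is purely an illustrative stop condition.

```python
# Sketch only: request the next batch after yielding the current items (place at the end of parse).
next_start = response.meta.get("start", 0) + 10
if next_start < 100:  # illustrative stop condition
    yield scrapy.Request(
        self.base_url.format(next_start),
        callback=self.parse,
        meta={"start": next_start},
    )
```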
# Scratch script: verify that browsercookie can read the local Firefox cookie store.
import browsercookie
import requests
import re

firefox_cookiejar = browsercookie.firefox()
# Sanity check: attach the browser cookies to a CSDN request and inspect the response.
# res = requests.get("https://img-home.csdnimg.cn/data_json/jsconfig/menu_path.json", cookies=firefox_cookiejar)
# print(res.text)
# Define here the models for your scraped items
#
# See documentation in:
# https://docs.scrapy.org/en/latest/topics/items.html
import scrapy
class CsdnItem(scrapy.Item):
# define the fields for your item here like:
# name = scrapy.Field()
pass
# Define here the models for your spider middleware
#
# See documentation in:
# https://docs.scrapy.org/en/latest/topics/spider-middleware.html
from scrapy import signals
from scrapy.downloadermiddlewares.cookies import CookiesMiddleware
import browsercookie
# useful for handling different item types with a single interface
from itemadapter import is_item, ItemAdapter
class CsdnSpiderMiddleware:
# Not all methods need to be defined. If a method is not defined,
# scrapy acts as if the spider middleware does not modify the
# passed objects.
@classmethod
def from_crawler(cls, crawler):
# This method is used by Scrapy to create your spiders.
s = cls()
crawler.signals.connect(s.spider_opened, signal=signals.spider_opened)
return s
def process_spider_input(self, response, spider):
# Called for each response that goes through the spider
# middleware and into the spider.
# Should return None or raise an exception.
return None
def process_spider_output(self, response, result, spider):
# Called with the results returned from the Spider, after
# it has processed the response.
# Must return an iterable of Request, or item objects.
for i in result:
yield i
def process_spider_exception(self, response, exception, spider):
# Called when a spider or process_spider_input() method
# (from other spider middleware) raises an exception.
# Should return either None or an iterable of Request or item objects.
pass
def process_start_requests(self, start_requests, spider):
# Called with the start requests of the spider, and works
# similarly to the process_spider_output() method, except
# that it doesn’t have a response associated.
# Must return only requests (not items).
for r in start_requests:
yield r
def spider_opened(self, spider):
spider.logger.info('Spider opened: %s' % spider.name)
class CsdnDownloaderMiddleware:
# Not all methods need to be defined. If a method is not defined,
# scrapy acts as if the downloader middleware does not modify the
# passed objects.
@classmethod
def from_crawler(cls, crawler):
# This method is used by Scrapy to create your spiders.
s = cls()
crawler.signals.connect(s.spider_opened, signal=signals.spider_opened)
return s
def process_request(self, request, spider):
# Called for each request that goes through the downloader
# middleware.
# Must either:
# - return None: continue processing this request
# - or return a Response object
# - or return a Request object
# - or raise IgnoreRequest: process_exception() methods of
# installed downloader middleware will be called
return None
def process_response(self, request, response, spider):
# Called with the response returned from the downloader.
# Must either:
# - return a Response object
# - return a Request object
# - or raise IgnoreRequest
return response
def process_exception(self, request, exception, spider):
# Called when a download handler or a process_request()
# (from other downloader middleware) raises an exception.
# Must either:
# - return None: continue processing this exception
# - return a Response object: stops process_exception() chain
# - return a Request object: stops process_exception() chain
pass
def spider_opened(self, spider):
spider.logger.info('Spider opened: %s' % spider.name)
class BrowserCookiesDownloaderMiddleware(CookiesMiddleware):
def __init__(self, debug=False):
super().__init__(debug)
self.load_browser_cookies()
def load_browser_cookies(self):
# Note: the jar key used here is 'firefox'; a request opts into it with meta={'cookiejar': 'firefox'}
jar = self.jars['firefox']
firefox_cookiejar = browsercookie.firefox()
for cookie in firefox_cookiejar:
jar.set_cookie(cookie)
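browsercookie ships loaders for more than one browser. Here is a hedged variant of load_browser_cookies that fills one jar per browser, so a request can pick a store with meta={'cookiejar': '<name>'}; the chrome entry is an assumption and is not part of the commit.

```python
# Sketch only: drop-in replacement for load_browser_cookies that fills one jar per browser.
def load_browser_cookies(self):
    for jar_name, loader in (("firefox", browsercookie.firefox),
                             ("chrome", browsercookie.chrome)):
        jar = self.jars[jar_name]
        for cookie in loader():
            jar.set_cookie(cookie)
```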
# Define your item pipelines here
#
# Don't forget to add your pipeline to the ITEM_PIPELINES setting
# See: https://docs.scrapy.org/en/latest/topics/item-pipeline.html
# useful for handling different item types with a single interface
from itemadapter import ItemAdapter
class CsdnPipeline:
def process_item(self, item, spider):
return item
# Scrapy settings for csdn project
#
# For simplicity, this file contains only settings considered important or
# commonly used. You can find more settings consulting the documentation:
#
# https://docs.scrapy.org/en/latest/topics/settings.html
# https://docs.scrapy.org/en/latest/topics/downloader-middleware.html
# https://docs.scrapy.org/en/latest/topics/spider-middleware.html
BOT_NAME = 'csdn'
SPIDER_MODULES = ['csdn.spiders']
NEWSPIDER_MODULE = 'csdn.spiders'
# Crawl responsibly by identifying yourself (and your website) on the user-agent
USER_AGENT = 'Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:92.0) Gecko/20100101 Firefox/92.0'
# Obey robots.txt rules
ROBOTSTXT_OBEY = False
# Configure maximum concurrent requests performed by Scrapy (default: 16)
# CONCURRENT_REQUESTS = 32
# Configure a delay for requests for the same website (default: 0)
# See https://docs.scrapy.org/en/latest/topics/settings.html#download-delay
# See also autothrottle settings and docs
# DOWNLOAD_DELAY = 3
# The download delay setting will honor only one of:
# CONCURRENT_REQUESTS_PER_DOMAIN = 16
# CONCURRENT_REQUESTS_PER_IP = 16
# Disable cookies (enabled by default)
# COOKIES_ENABLED = False
# Disable Telnet Console (enabled by default)
# TELNETCONSOLE_ENABLED = False
# Override the default request headers:
# DEFAULT_REQUEST_HEADERS = {
# 'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
# 'Accept-Language': 'en',
# }
# Enable or disable spider middlewares
# See https://docs.scrapy.org/en/latest/topics/spider-middleware.html
# SPIDER_MIDDLEWARES = {
# 'csdn.middlewares.CsdnSpiderMiddleware': 543,
# }
# Enable or disable downloader middlewares
# See https://docs.scrapy.org/en/latest/topics/downloader-middleware.html
DOWNLOADER_MIDDLEWARES = {
'scrapy.downloadermiddlewares.cookies.CookiesMiddleware': None,
'csdn.middlewares.BrowserCookiesDownloaderMiddleware': 543,
}
# Enable or disable extensions
# See https://docs.scrapy.org/en/latest/topics/extensions.html
# EXTENSIONS = {
# 'scrapy.extensions.telnet.TelnetConsole': None,
# }
# Configure item pipelines
# See https://docs.scrapy.org/en/latest/topics/item-pipeline.html
# ITEM_PIPELINES = {
# 'csdn.pipelines.CsdnPipeline': 300,
# }
# Enable and configure the AutoThrottle extension (disabled by default)
# See https://docs.scrapy.org/en/latest/topics/autothrottle.html
# AUTOTHROTTLE_ENABLED = True
# The initial download delay
# AUTOTHROTTLE_START_DELAY = 5
# The maximum download delay to be set in case of high latencies
# AUTOTHROTTLE_MAX_DELAY = 60
# The average number of requests Scrapy should be sending in parallel to
# each remote server
# AUTOTHROTTLE_TARGET_CONCURRENCY = 1.0
# Enable showing throttling stats for every response received:
# AUTOTHROTTLE_DEBUG = False
# Enable and configure HTTP caching (disabled by default)
# See https://docs.scrapy.org/en/latest/topics/downloader-middleware.html#httpcache-middleware-settings
# HTTPCACHE_ENABLED = True
# HTTPCACHE_EXPIRATION_SECS = 0
# HTTPCACHE_DIR = 'httpcache'
# HTTPCACHE_IGNORE_HTTP_CODES = []
# HTTPCACHE_STORAGE = 'scrapy.extensions.httpcache.FilesystemCacheStorage'
# This package will contain the spiders of your Scrapy project
#
# Please refer to the documentation for information on how to create and manage
# your spiders.
import scrapy
class ClikeSpider(scrapy.Spider):
name = 'clike'
allowed_domains = ['csdn.net']
like_url = 'https://blog.csdn.net/phoenix/web/v1/article/like'
def start_requests(self):
data = {
"articleId": "120845464",
}
yield scrapy.FormRequest(url=self.like_url, formdata=data, meta={'cookiejar': 'firefox'})
def parse(self, response):
print(response.json())
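start_requests likes a single hard-coded article. Below is a hedged sketch of looping over several article IDs instead; any ID beyond 120845464 would be a placeholder you supply yourself, and the request itself is unchanged.

```python
# Sketch only: like several articles in one run (extra IDs are placeholders you add yourself).
def start_requests(self):
    for article_id in ["120845464"]:
        yield scrapy.FormRequest(
            url=self.like_url,
            formdata={"articleId": article_id},
            meta={"cookiejar": "firefox"},
        )
```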
# Automatically created by: scrapy startproject
#
# For more information about the [deploy] section see:
# https://scrapyd.readthedocs.io/en/latest/deploy.html
[settings]
default = csdn.settings
[deploy]
#url = http://localhost:6800/
project = csdn
@@ -94,5 +94,10 @@
48. [A programmer helps a friend from another field: a Python crawler for feed-additive data, collection plus backup](https://dream.blog.csdn.net/article/details/121028282)
49. [The CSDN hot list and Huawei Cloud blogs both make good practice targets for Python Scrapy crawlers](https://dream.blog.csdn.net/article/details/121066927)
50. [Pure crawler knowledge: how well do you know Python Scrapy downloader middleware?](https://dream.blog.csdn.net/article/details/121083780)
51. [20 lines of Python code, a crawler, and the Lanqiao training camp: one blog post that ties these keywords together](https://dream.blog.csdn.net/article/details/121151700)
51. [20 lines of Python Scrapy code to scrape the Lanqiao training camp](https://dream.blog.csdn.net/article/details/121151700)
52. [Scrapy spider middleware, have you mastered it yet? This post walks through a case study](https://dream.blog.csdn.net/article/details/120969435)
53. [Learning crawlers through Taobao data: Python Scrapy request and response objects](https://dream.blog.csdn.net/article/details/120979533)
54. [Did you know Scrapy lets you customize the exported data format? A look at Scrapy exporters](https://dream.blog.csdn.net/article/details/120992365)
55. [Python Scrapy: a Sogou Images downloader in just a few lines of code](https://dream.blog.csdn.net/article/details/120996308)
56. [A Python crawler put to real use: an automated like bot, a post that flirts with getting banned](https://dream.blog.csdn.net/article/details/121000212)