crawler

Releaser page crawler

  1. Deploy on host BJ-GM-Prod-Cos-faiss001, under /srv/apps/
  2. Switch user: sudo su - gmuser
  3. Initialize conda: source /root/anaconda3/bin/activate
  4. Enter the virtual environment: conda activate crawler_env (leave it with conda deactivate)
  5. Start the crawl program: nohup python /srv/apps/crawler/crawler_sys/framework/update_data_in_target_releasers_multi_process_by_date_from_redis.py > /data/log/fect_task.log &
  6. Write the URLs to crawl into Redis: python /srv/apps/crawler/crawler_sys/framework/write_releasers_to_redis.py -p weibo -d 1 -proxies 2
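
Steps 3 to 6 can be collected into one script. The following is a minimal sketch, not a file from this repository: it assumes you are already logged in as gmuser on the deployment host, and the DRY_RUN switch and the function names are hypothetical additions for illustration. With the default DRY_RUN=1 it only prints the commands, so they can be checked before anything is started. The flag values (-p weibo -d 1 -proxies 2) are copied from step 6 unchanged.

```shell
#!/usr/bin/env bash
# Sketch only: wraps steps 3-6 above. Run as gmuser on the deployment host.

FRAMEWORK_DIR=/srv/apps/crawler/crawler_sys/framework
LOG_FILE=/data/log/fect_task.log

# DRY_RUN is a hypothetical switch for this sketch (not part of the repository).
# With the default DRY_RUN=1 the commands are printed instead of executed.
DRY_RUN="${DRY_RUN:-1}"

activate_env() {
    # Steps 3 and 4: initialize conda, then enter crawler_env.
    if [ "$DRY_RUN" = "1" ]; then
        echo "source /root/anaconda3/bin/activate"
        echo "conda activate crawler_env"
    else
        source /root/anaconda3/bin/activate
        conda activate crawler_env
    fi
}

start_crawler() {
    # Step 5: start the crawl program in the background.
    local script="$FRAMEWORK_DIR/update_data_in_target_releasers_multi_process_by_date_from_redis.py"
    if [ "$DRY_RUN" = "1" ]; then
        echo "nohup python $script > $LOG_FILE &"
    else
        nohup python "$script" > "$LOG_FILE" &
    fi
}

write_urls() {
    # Step 6: write the URLs to crawl into Redis (flags as given in step 6).
    local script="$FRAMEWORK_DIR/write_releasers_to_redis.py"
    if [ "$DRY_RUN" = "1" ]; then
        echo "python $script -p weibo -d 1 -proxies 2"
    else
        python "$script" -p weibo -d 1 -proxies 2
    fi
}

main() {
    activate_env
    start_crawler
    write_urls
}

main
```

Saved under a name of your choice, the script prints the four commands when run as is; running it with DRY_RUN=0 executes them in the order listed in the steps.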

## Search page crawler