Commit e0fc155c
authored 4 years ago by litaolemo

update

parent 7f7d3d81

Showing 2 changed files with 13 additions and 4 deletions

crawler_sys/scheduler/cal_ni_and_put_to_backend.py  +3 -1
crawler_sys/scheduler/push_crawler_data_to_mysql.py  +10 -3

crawler_sys/scheduler/cal_ni_and_put_to_backend.py
...
@@ -477,7 +477,9 @@ def task_main():
 {content}
     """.format(tractate_id=tractate_id, content=res_data["content"], level=res_data["level"])
     send_file_email("", "",
                     email_group=["<liulingling@igengmei.com>", "<liujinhuan@igengmei.com>", "<hongxu@igengmei.com>",
                                  "<yangjiayue@igengmei.com>", "<zhangweiwei@igengmei.com>", "<liuyiting@igengmei.com>"],
                     cc_group=["<duanyingrong@igengmei.com>", "<litao@igengmei.com>"],
                     email_msg_body_str=body_str, title_str=title_str)
     print("send to mysql")
 except Exception as e:
...
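
The hunk above fills a triple-quoted template through `str.format` with named fields before passing the result to `send_file_email`. The following is a minimal sketch of that templating step only. The template text and the `build_body` helper are hypothetical (the real body text and `send_file_email` are not visible in this commit view); `res_data` is assumed to be a dict with `content` and `level` keys, as the hunk suggests.

```python
# Hypothetical template: the real body text is not shown in the diff.
BODY_TEMPLATE = """
tractate_id: {tractate_id}
level: {level}
{content}
"""


def build_body(tractate_id, res_data):
    """Fill the named placeholders, mirroring the .format(...) call in the hunk."""
    return BODY_TEMPLATE.format(
        tractate_id=tractate_id,
        content=res_data["content"],
        level=res_data["level"],
    )


# Illustrative values only.
body_str = build_body(42, {"content": "hello", "level": 3})
```

Named fields keep the call site readable when the template has several placeholders, and a missing key raises `KeyError` at format time rather than producing a silently incomplete message.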

crawler_sys/scheduler/push_crawler_data_to_mysql.py
...
@@ -44,6 +44,7 @@ def send_email(query_id_dict: Dict):
 def scan_es_to_mysql():
     query_id_dict = {}
+    doc_id_list = []
     search_query = {
         "query": {
             "bool": {
...
@@ -60,8 +61,14 @@ def scan_es_to_mysql():
         if_exists = rds.sismember("article_id_list", res["_id"])
         tractate_id = None
         if not if_exists:
-            data = res["_source"]
-            data["doc_id"] = res["_id"]
+            doc_id_list.append(res["_id"])
+    for doc_id in doc_id_list:
+        try:
+            data = es_framework.get_source(index="crawler-data-raw", id=doc_id)
+        except:
+            continue
+        if data:
             try:
                 tractate_id = write_data_into_mysql(data, user_id_list)
                 print("write data %s %s into sql" % (tractate_id, res["_id"]))
...
@@ -74,7 +81,7 @@ def scan_es_to_mysql():
                 query_id_dict[search_word] = {}
             query_id_dict[search_word][tractate_id] = 1
             count += 1
-            if count % 1000 == 0:
+            if count % 200 == 0:
                 send_email(query_id_dict)
                 query_id_dict = {}
     send_email(query_id_dict)
...
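
Taken together, the hunks above change `scan_es_to_mysql` to first collect the ids of unseen documents, then re-fetch each document by id, and to flush the notification email every 200 writes instead of every 1000. A minimal sketch of that collect-then-fetch pattern with batched flushing follows. It uses in-memory stand-ins: `push_unseen`, `fetch_source`, `write_row`, and `notify` are hypothetical names replacing `es_framework.get_source`, `write_data_into_mysql`, and `send_email`, and `seen_ids` stands in for the Redis set check. Unlike the commit, the sketch catches `Exception` rather than using a bare `except`.

```python
from typing import Callable, Dict, Iterable, List, Optional, Set


def push_unseen(
    hits: Iterable[dict],
    seen_ids: Set[str],
    fetch_source: Callable[[str], Optional[dict]],
    write_row: Callable[[dict], int],
    notify: Callable[[Dict[int, int]], None],
    flush_every: int = 200,
) -> int:
    # Phase 1: keep only the ids of documents that were not seen before.
    doc_id_list: List[str] = [h["_id"] for h in hits if h["_id"] not in seen_ids]

    # Phase 2: re-fetch each document by id, write it, and notify in batches.
    written: Dict[int, int] = {}
    count = 0
    for doc_id in doc_id_list:
        try:
            data = fetch_source(doc_id)
        except Exception:
            continue  # skip documents that cannot be fetched
        if not data:
            continue
        row_id = write_row(data)
        written[row_id] = 1
        count += 1
        if count % flush_every == 0:
            notify(written)
            written = {}
    notify(written)  # final flush, as in the diff (the batch may be empty)
    return count


# Usage with in-memory stand-ins (all values are illustrative).
store = {"a": {"title": "A"}, "b": {"title": "B"}, "c": {"title": "C"}}
rows: List[dict] = []
batches: List[Dict[int, int]] = []


def write_row(data: dict) -> int:
    rows.append(data)
    return len(rows)  # pretend auto-increment id


n = push_unseen(
    hits=[{"_id": "a"}, {"_id": "b"}, {"_id": "c"}, {"_id": "missing"}],
    seen_ids={"b"},
    fetch_source=lambda doc_id: store[doc_id],  # raises KeyError for "missing"
    write_row=write_row,
    notify=lambda batch: batches.append(dict(batch)),
    flush_every=2,
)
```

Separating the scan from the writes means the second loop no longer has a current search hit in scope, so any per-document logging there should refer to `doc_id` rather than to the last hit of the scan.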