ML / ffm-baseline · Commits

Commit 4490b0c6
authored Nov 06, 2019 by 高雅喆
update

parent 55fbd6b4
Showing 1 changed file with 3 additions and 5 deletions

eda/smart_rank/gyz_test.py  (+3, -5)
@@ -10,7 +10,6 @@ from email.mime.application import MIMEApplication
import redis
import datetime
from pyspark import SparkConf
from pyspark import SparkContext
import time
from pyspark.sql import SparkSession
import json
@@ -79,14 +78,13 @@ sparkConf = SparkConf().set("spark.hive.mapred.supports.subdirectories", "true")
    .set("spark.sql.extensions", "org.apache.spark.sql.TiExtensions") \
    .set("spark.tispark.pd.addresses", "172.16.40.170:2379").set("spark.io.compression.codec", "lzf") \
    .set("spark.driver.maxResultSize", "8g").set("spark.sql.avro.compression.codec", "snappy")
sc = SparkContext()
sqlContext = SparkSession(sc)
spark = SparkSession.builder.config(conf=sparkConf).enableHiveSupport().getOrCreate()
spark.sparkContext.setLogLevel("WARN")
spark.sparkContext.addPyFile("/srv/apps/ffm-baseline_git/eda/smart_rank/tool.py")
device_ids_lst_rdd = spark.sparkContext.parallelize(device_info)
result = device_ids_lst_rdd.repartition(100).map(lambda x: get_user_service_portrait(x, all_word_tags, all_tag_tag_type, all_3tag_2tag, all_tags_name, size=None, pay_time=pay_time))
result.collect()
result.write.format('csv').save("~/test_df.csv")
print(result.take(10))
# result.write.format('csv').save("~/test_df.csv")
spark.stop()
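
A note on the session setup in this hunk: the script creates a bare SparkContext() and wraps it in SparkSession(sc) before calling SparkSession.builder.config(conf=sparkConf).getOrCreate(). Because getOrCreate() reuses a SparkContext that already exists, static settings collected in sparkConf (such as spark.sql.extensions) may never reach the running context. A minimal sketch of the usual single-entry-point pattern, not part of this commit:

# Sketch: build one SparkSession up front so sparkConf takes effect,
# then derive the context from the session instead of constructing it.
spark = (SparkSession.builder
         .config(conf=sparkConf)
         .enableHiveSupport()
         .getOrCreate())
sc = spark.sparkContext
sc.setLogLevel("WARN")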
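
The map closure captures four lookup structures (all_word_tags, all_tag_tag_type, all_3tag_2tag, all_tags_name) that Spark serializes with every task. A hedged sketch of the standard broadcast-variable pattern, which ships each table to every executor once; the bc_* names are illustrative, not from the repository:

# Sketch: broadcast the shared lookup tables once per executor instead
# of serializing them into each task closure.
bc_word_tags = spark.sparkContext.broadcast(all_word_tags)
bc_tag_type = spark.sparkContext.broadcast(all_tag_tag_type)
bc_3tag_2tag = spark.sparkContext.broadcast(all_3tag_2tag)
bc_tags_name = spark.sparkContext.broadcast(all_tags_name)

result = device_ids_lst_rdd.repartition(100).map(
    lambda x: get_user_service_portrait(
        x, bc_word_tags.value, bc_tag_type.value,
        bc_3tag_2tag.value, bc_tags_name.value,
        size=None, pay_time=pay_time))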
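
On the write line that appears both live and commented out: result comes from RDD.map, and RDDs have no .write attribute (that is the DataFrame writer API), so result.write.format('csv') would raise AttributeError; result.collect() also pulls every element to the driver only to discard the returned list, and "~" in the save path is not expanded by Hadoop-style writers. A minimal sketch of one way to persist the output, assuming each element can be flattened to a (device_id, portrait) tuple; the real return type of get_user_service_portrait is not visible in this diff:

# Sketch: convert the RDD to a DataFrame before using the writer API.
# The two-column schema is an assumption made for illustration only.
df = result.toDF(["device_id", "portrait"])
df.write.format("csv").mode("overwrite").save("/tmp/test_df_csv")
print(df.take(10))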