ffm-baseline
Commit 6f906fb2, authored Dec 28, 2018 by 高雅喆
drop duplicate
parent b058016d
Showing 1 changed file with 20 additions and 5 deletions
eda/feededa/src/main/scala/com/gmei/EsmmData.scala
@@ -577,11 +577,26 @@ object GetDevicePortrait {
         |(select device_id,max(level1_count) as max_count from tag_count group by device_id) b
         |on a.level1_count = b.max_count and a.device_id = b.device_id
-      """.stripMargin).rdd.map(x => (x(0).toString,x(1).toString,x(2).toString,x(3).toString))
-    max_count_tag.foreachPartition(GmeiConfig.updateDeviceFeat)
-
-    max_count_tag.take(10).foreach(println)
-    println(max_count_tag.count())
+      """.stripMargin)
+//      .rdd.map(x => (x(0).toString,x(1).toString,x(2).toString,x(3).toString))
+//    max_count_tag.foreachPartition(GmeiConfig.updateDeviceFeat)
+//
+//    max_count_tag.take(10).foreach(println)
+//    println(max_count_tag.count())
+    //drop duplicates
+    val max_count_tag_rdd = max_count_tag.rdd.groupBy(_.getAs[String]("device_id")).map {
+      case (device_id,data) =>
+        val stat_date = data.map(_.getAs[String]("stat_date")).head
+        val max_level1_id = data.map(_.getAs[String]("max_level1_id")).head
+        val max_level1_count = data.map(_.getAs[String]("max_level1_count")).head
+        (device_id,stat_date,max_level1_id,max_level1_count)
+    }.filter(_._1 != null)
+    max_count_tag_rdd.foreachPartition(GmeiConfig.updateDeviceFeat)
+    max_count_tag_rdd.take(10).foreach(println)
+    println(max_count_tag_rdd.count())
     sc.stop()
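The join on `a.level1_count = b.max_count` can return several rows for one device when two or more tags tie for the highest count, which is presumably the duplication this commit removes. The sketch below mirrors the group-then-take-first step using plain Scala collections, so it runs without a Spark session. `TagRow`, `DedupSketch`, and `dedupByDevice` are hypothetical names introduced for illustration and are not part of EsmmData.scala. The sample rows are made up.

```scala
// Hypothetical row type standing in for the Spark Row used in the commit.
final case class TagRow(
    deviceId: String,
    statDate: String,
    maxLevel1Id: String,
    maxLevel1Count: String
)

object DedupSketch {
  // Keep one row per device_id: drop null keys, group by key,
  // then take the first row of each group.
  def dedupByDevice(rows: Seq[TagRow]): Seq[TagRow] =
    rows
      .filter(_.deviceId != null)
      .groupBy(_.deviceId)
      .map { case (_, group) => group.head }
      .toSeq
      .sortBy(_.deviceId) // sorted only so the result order is deterministic

  def main(args: Array[String]): Unit = {
    val rows = Seq(
      TagRow("d1", "2018-12-27", "tagA", "5"),
      TagRow("d1", "2018-12-27", "tagB", "5"), // ties with tagA on the max count
      TagRow("d2", "2018-12-27", "tagC", "3"),
      TagRow(null, "2018-12-27", "tagD", "1")  // dropped by the null filter
    )
    dedupByDevice(rows).foreach(println)
  }
}
```

In this local sketch `head` returns the first row in input order. On a Spark RDD the order of values inside a `groupBy` group is not guaranteed after the shuffle, so which of the tied tags survives may vary between runs. If a stable choice matters, one option is to pick by an explicit rule (for example the smallest `max_level1_id`) instead of `head`.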