ML / ffm-baseline
Commit a0d42a6e, authored Jan 03, 2019 by 高雅喆
esmm train data first level1_id
Parent: 99e8c1dd
Showing 2 changed files with 15 additions and 4 deletions

eda/esmm/Model_pipline/submit.sh (+3 -3)
eda/feededa/src/main/scala/com/gmei/EsmmData.scala (+12 -1)
eda/esmm/Model_pipline/submit.sh @ a0d42a6e
@@ -20,9 +20,9 @@ echo "data2ffm"
 ${PYTHON_PATH} ${MODEL_PATH}/Feature_pipline/data2ffm.py > ${DATA_PATH}/infer.log
 all_sample=$((`cat ${DATA_PATH}/tr.csv | awk -F '\t' '{print$5}' | awk -F ',' '{print$2$3$4}' | sort | uniq | wc -l`))
-uniq_feat=$((`cat ${DATA_PATH}/tr.csv | awk -F '\t' '{print$5}' | awk -F ',' '{print$4}' | sort | uniq -u | wc -l`))
-repe_feat=$((all_sample-uniq_feat))
-echo "Bayes Error Rate": $((repe_feat*100/all_sample))%
+uniq_feat=$((`cat ${DATA_PATH}/tr.csv | awk -F '\t' '{print$5}' | awk -F ',' '{print$4}' | sort | uniq -u | wc -l`))
+repe_feat=$((all_sample-uniq_feat))
+echo "Bayes Error Rate": $((repe_feat*100/all_sample))%
 echo "split data"
 split -l $((`wc -l < ${DATA_PATH}/tr.csv`/15)) ${DATA_PATH}/tr.csv -d -a 4 ${DATA_PATH}/tr/tr_ --additional-suffix=.csv
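
To make the three counters in this hunk easier to follow, here is a minimal sketch that runs the same pipeline on a toy file. The input rows, the temporary directory, and the `rate` variable are assumptions made for illustration; the real layout of tr.csv is not shown in the diff, so the sketch only assumes what the awk calls imply: a tab-separated file whose fifth column is a comma-separated feature list.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical toy input (not from the repository): 5 tab-separated columns,
# the 5th being a comma-separated feature list.
DATA_PATH="$(mktemp -d)"
printf 'd1\tc1\t0\t0\ta,b,c,x\n' >  "${DATA_PATH}/tr.csv"
printf 'd2\tc2\t1\t0\ta,b,c,x\n' >> "${DATA_PATH}/tr.csv"
printf 'd3\tc3\t0\t0\ta,b,d,y\n' >> "${DATA_PATH}/tr.csv"
printf 'd4\tc4\t0\t0\ta,e,f,z\n' >> "${DATA_PATH}/tr.csv"

# Distinct concatenations of sub-fields 2, 3 and 4 of column 5: bcx, bdy, efz.
all_sample=$(( $(awk -F '\t' '{print $5}' "${DATA_PATH}/tr.csv" \
  | awk -F ',' '{print $2 $3 $4}' | sort | uniq | wc -l) ))

# Values of sub-field 4 that occur exactly once: y, z (uniq -u drops x).
uniq_feat=$(( $(awk -F '\t' '{print $5}' "${DATA_PATH}/tr.csv" \
  | awk -F ',' '{print $4}' | sort | uniq -u | wc -l) ))

repe_feat=$((all_sample - uniq_feat))
rate=$((repe_feat * 100 / all_sample))   # integer division, as in submit.sh

echo "Bayes Error Rate: ${rate}%"        # prints: Bayes Error Rate: 33%
rm -rf "${DATA_PATH}"
```

On this input `all_sample` is 3 and `uniq_feat` is 2, so the reported figure is 1 * 100 / 3, truncated to 33. Note that the two counts are taken over different keys (sub-fields 2 to 4 versus sub-field 4 alone), which is how submit.sh computes them.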
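
The split step at the end of the hunk divides the line count by 15 with integer arithmetic, so the number of output files is not always 15. The sketch below shows this on a 32-line file; it assumes GNU coreutils `split` (for `-d` and `--additional-suffix`), and the `work` directory and variable names are made up for the example.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical scratch area standing in for ${DATA_PATH}.
work="$(mktemp -d)"
mkdir -p "${work}/tr"
seq 1 32 > "${work}/tr.csv"

# Same arithmetic as submit.sh: 32 / 15 truncates to 2 lines per chunk.
lines_per_chunk=$(( $(wc -l < "${work}/tr.csv") / 15 ))

# Same flags as submit.sh: numeric suffixes, 4 digits wide, ".csv" appended.
split -l "${lines_per_chunk}" "${work}/tr.csv" -d -a 4 "${work}/tr/tr_" --additional-suffix=.csv

chunks=("${work}"/tr/tr_*.csv)
chunk_count=${#chunks[@]}
first_chunk=$(basename "${chunks[0]}")

# prints: lines per chunk: 2, chunks: 16, first: tr_0000.csv
echo "lines per chunk: ${lines_per_chunk}, chunks: ${chunk_count}, first: ${first_chunk}"
rm -rf "${work}"
```

Because the quotient is truncated, 32 lines yield 16 chunks rather than 15. For an input with fewer than 15 lines the quotient is 0, so a guard on `lines_per_chunk` may be worth adding before calling `split`.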
eda/feededa/src/main/scala/com/gmei/EsmmData.scala @ a0d42a6e
@@ -211,7 +211,18 @@ object EsmmData {
       )
       // union_data_scity_id.createOrReplaceTempView("union_data_scity_id")
       union_data_scity_id.show()
-      GmeiConfig.writeToJDBCTable("jdbc:mysql://10.66.157.22:4000/jerry_test?user=root&password=3SYz54LS9#^9sBvC&rewriteBatchedStatements=true", union_data_scity_id, table="esmm_train_data", SaveMode.Append)
+      val union_data_scity_id2 = sc.sql(
+        s"""
+           |select device_id,cid_id,first(stat_date) stat_date,first(ucity_id) ucity_id,first(diary_service_id) diary_service_id,first(y) y,
+           |first(z) z,first(clevel1_id) clevel1_id,first(slevel1_id) slevel1_id,first(ccity_name) ccity_name,first(scity_id) scity_id
+           |from union_data_scity_id
+           |group by device_id,cid_id
+         """.stripMargin
+      )
+      GmeiConfig.writeToJDBCTable("jdbc:mysql://10.66.157.22:4000/jerry_test?user=root&password=3SYz54LS9#^9sBvC&rewriteBatchedStatements=true", union_data_scity_id2, table="esmm_train_data", SaveMode.Append)
     } else {
       println("esmm_train_data already have param.date data")
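
The query added in this hunk collapses the training rows to one per (device_id, cid_id) pair by taking `first(...)` of every other column. As a rough analogue outside Spark, the sketch below keeps the first row seen for each key using awk. The sample rows and the reduced column set are assumptions for illustration. Unlike awk, which follows input order, Spark's `first` without an explicit ordering depends on row order after the shuffle, so which row survives in the real job may differ.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical rows: device_id,cid_id,stat_date,y (a subset of the real columns).
rows=$(printf '%s\n' \
  'dev1,c1,2019-01-01,0' \
  'dev1,c1,2019-01-02,1' \
  'dev1,c2,2019-01-01,0' \
  'dev2,c1,2019-01-03,1')

# Keep only the first row encountered for each (device_id, cid_id) key.
deduped=$(printf '%s\n' "${rows}" | awk -F ',' '!seen[$1 FS $2]++')

printf '%s\n' "${deduped}"
row_count=$(( $(printf '%s\n' "${deduped}" | wc -l) ))
echo "rows kept: ${row_count}"   # prints: rows kept: 3
```

The second `dev1,c1` row is dropped, leaving three rows, one per distinct key.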