Commit c5834874
authored Apr 16, 2019 by 王志伟

Merge branch 'master' of http://git.wanmeizhensuo.com/ML/ffm-baseline

parents 7182b1ce 500b2501
Showing 2 changed files with 3 additions and 5 deletions

eda/esmm/Model_pipline/feature.py  +2 -1
tensnsorflow/es/feature.py  +1 -4
eda/esmm/Model_pipline/feature.py @ c5834874

...
@@ -57,11 +57,12 @@ def get_data():
    # print(df.head(2)
    print("before")
    print(df.shape)
    print("after")
    df = df.drop_duplicates()
    df = df.drop_duplicates(["ucity_id", "clevel2_id", "ccity_name", "device_type", "manufacturer",
                             "channel", "top", "time", "stat_date", "app_list"])
    print("after")
    print(df.shape)
    app_list_number, app_list_map = multi_hot(df, "app_list", 1)
    level2_number, level2_map = multi_hot(df, "clevel2_id", 1 + app_list_number)
    # df["app_list"] = df["app_list"].fillna("lost_na")
...
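The hunk above shows both a bare `drop_duplicates()` call and one restricted to a list of key columns. As a rough illustration of how the two differ in pandas, here is a minimal sketch on toy data. The three key columns are a shortened stand-in for the ten-column list in the diff, and the `y` label column is hypothetical, added only to show rows that share a key but differ elsewhere.

```python
import pandas as pd

# Toy frame: rows 0-2 share the same key; rows 0 and 1 are identical,
# row 2 differs only in the hypothetical label column "y".
df = pd.DataFrame({
    "ucity_id": ["bj", "bj", "bj", "sh"],
    "clevel2_id": ["10", "10", "10", "11"],
    "stat_date": ["2019-04-15"] * 4,
    "y": [0, 0, 1, 0],  # hypothetical column, not taken from the diff
})

# No arguments: a row is dropped only if every column matches an earlier row.
full = df.drop_duplicates()

# With a subset: only the listed columns are compared, and the first row
# seen for each key is kept (keep="first" is the pandas default).
keyed = df.drop_duplicates(subset=["ucity_id", "clevel2_id", "stat_date"])

print("before", df.shape)
print("after full-row dedup", full.shape)
print("after key dedup", keyed.shape)
```

Deduplicating on a subset also discards rows that differ only outside the key, so the choice of key columns decides which samples survive.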
tensnsorflow/es/feature.py @ c5834874

...
@@ -80,10 +80,7 @@ def get_data():
    print(df.shape)
    df = df.drop_duplicates(["ucity_id", "clevel2_id", "ccity_name", "device_type", "manufacturer",
                             "channel", "top", "time", "stat_date", "app_list"])
    # df = df.drop_duplicates(["ucity_id", "clevel2_id", "ccity_name", "device_type", "manufacturer",
    # "channel", "top", "time", "stat_date", "app_list", "hospital_id", "level3_ids"])
                             "channel", "top", "time", "stat_date", "app_list", "hospital_id", "level3_ids"])
    print("去重后样本数量:", df.shape)  # "Sample count after deduplication:"
    app_list_number, app_list_map = multi_hot(df, "app_list", 2)
...
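The definition of `multi_hot` is not part of this diff. Judging only from the call sites in this commit (`multi_hot(df, "app_list", 1)`, `multi_hot(df, "clevel2_id", 1 + app_list_number)`, `multi_hot(df, "app_list", 2)`), it appears to return the number of distinct tokens in a column together with a token-to-index map that starts at the third argument. The sketch below is a hypothetical stand-in written under that assumption, not the repository's implementation. The comma separator and the `"lost_na"` placeholder for missing cells are also assumptions, the latter borrowed from the commented-out `fillna("lost_na")` line.

```python
def multi_hot(df, column, n):
    """Hypothetical stand-in: build a token -> index map starting at n.

    `df` may be anything indexable by column name that yields an iterable
    of cells, such as a dict of lists or a pandas DataFrame.
    """
    tokens = set()
    for cell in df[column]:
        if cell is None:
            cell = "lost_na"  # assumed placeholder for missing values
        tokens.update(cell.split(","))  # assumed comma-separated tokens
    ordered = sorted(tokens)  # sorted so the mapping is deterministic
    mapping = {tok: idx for idx, tok in enumerate(ordered, start=n)}
    return len(ordered), mapping


# Toy data in place of the real DataFrame.
rows = {
    "app_list": ["a,b", "b,c", None],
    "clevel2_id": ["10,11", "11", "12"],
}

# Chaining the offsets keeps the index ranges of the two columns disjoint.
app_list_number, app_list_map = multi_hot(rows, "app_list", 1)
level2_number, level2_map = multi_hot(rows, "clevel2_id", 1 + app_list_number)

print(app_list_number, app_list_map)
print(level2_number, level2_map)
```

Under this reading, changing the starting index from 1 to 2 shifts every index in the first map up by one, so any downstream offsets derived from it would need to shift as well.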