ML / ffm-baseline · Commits

Commit 72cbd9da, authored Jun 11, 2019 by Your Name
Parent: 856be02b

Parallel data transformation

Showing 1 changed file with 8 additions and 5 deletions:
eda/esmm/Model_pipline/train.py (+8, −5)
@@ -71,15 +71,18 @@ def input_fn(filenames, batch_size=32, num_epochs=1, perform_shuffle=False):
         return parsed, {"y": y, "z": z}

     # Extract lines from input files using the Dataset API, can pass one filename or filename list
-    dataset = tf.data.TFRecordDataset(filenames).map(_parse_fn, num_parallel_calls=8).prefetch(500000)  # multi-thread pre-process then prefetch
+    # dataset = tf.data.TFRecordDataset(filenames).map(_parse_fn, num_parallel_calls=8).prefetch(500000)  # multi-thread pre-process then prefetch
     # Randomizes input using a window of 256 elements (read into memory)
-    if perform_shuffle:
-        dataset = dataset.shuffle(buffer_size=256)
+    # if perform_shuffle:
+    #     dataset = dataset.shuffle(buffer_size=256)
     # epochs from blending together.
-    dataset = dataset.repeat(num_epochs)
-    dataset = dataset.batch(batch_size)  # Batch size to use
+    # dataset = dataset.repeat(num_epochs)
+    # dataset = dataset.batch(batch_size)  # Batch size to use
+    dataset = tf.data.TFRecordDataset(filenames).apply(tf.contrib.data.map_and_batch(map_func=_parse_fn, batch_size=batch_size))
     # dataset = dataset.padded_batch(batch_size, padded_shapes=({"feeds_ids": [None], "feeds_vals": [None], "title_ids": [None]}, [None]))  # pad variable-length sequences
     #return dataset.make_one_shot_iterator()
...
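For readers reassembling the hunk: after this commit the data pipeline inside input_fn reduces to a single fused stage. Below is a minimal sketch of that pipeline, assuming TF 1.x (where tf.contrib.data is available); the feature spec inside _parse_fn is hypothetical, since the real schema is defined earlier in train.py and lies outside this hunk.

import tensorflow as tf

def input_fn(filenames, batch_size=32, num_epochs=1, perform_shuffle=False):
    def _parse_fn(record):
        # Hypothetical feature spec, for illustration only; the real one is
        # defined earlier in train.py. "feeds_ids", "y", and "z" are names
        # taken from the diff above.
        spec = {"feeds_ids": tf.VarLenFeature(tf.int64),
                "y": tf.FixedLenFeature([], tf.float32),
                "z": tf.FixedLenFeature([], tf.float32)}
        parsed = tf.parse_single_example(record, spec)
        y = parsed.pop("y")
        z = parsed.pop("z")
        return parsed, {"y": y, "z": z}

    # Fused parse + batch: map_and_batch performs .map() and .batch() in one
    # stage, overlapping per-record parsing with batch assembly. Note that
    # num_epochs and perform_shuffle are unused after this commit, since the
    # repeat() and shuffle() calls were commented out.
    dataset = tf.data.TFRecordDataset(filenames).apply(
        tf.contrib.data.map_and_batch(map_func=_parse_fn,
                                      batch_size=batch_size))
    # Assumption: Estimator input_fns may return the Dataset directly in
    # TF >= 1.5; the original return statement is outside the visible hunk.
    return dataset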
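tf.contrib was removed in TensorFlow 2.x, so the fused transform above does not port directly. A hedged equivalent for TF >= 1.14 / 2.x is a plain map followed by batch, which tf.data's optimizer can fuse on its own, with AUTOTUNE choosing the parallelism instead of the hard-coded 8:

# Sketch of the same pipeline stage without tf.contrib (TF >= 1.14 / 2.x).
dataset = (tf.data.TFRecordDataset(filenames)
           .map(_parse_fn, num_parallel_calls=tf.data.experimental.AUTOTUNE)
           .batch(batch_size)
           .prefetch(tf.data.experimental.AUTOTUNE))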