Welcome to easyrec’s documentation!¶
Prerequisites¶
Windows, Linux or MacOS
Python 3+
Tensorflow 2+
Installation¶
Prepare environment¶
The very first thing to install easyrec is to ensure that Python is installed. If not, it is recommended to use Anaconda instead of pure Python.
Create a conda virtual environment.
conda create -n easyrec python=3.7
Activate the virtual environment.
conda activate easyrec
Install easyrec¶
PyPI¶
The easiest way to install easyrec is PyPI. You can directly run the following command to install:
pip install easyrec-python
Source¶
In addition, you can also install easyrec from source. First, locate the directory that you want to keep codes:
cd /path/for/codes/
Next, clone the source codes from Github:
git clone git@github.com:xu-zhiwei/easyrec.git
Finally, install easyrec in your Python environment:
cd easyrec
pip install requirements.txt
python setup.py install # or "python setup.py develop" for developers who wants to modify the codes
Verification¶
After installation, you can verify it by the following:
Switch to Python environment.
python
Verify installation.
import easyrec
The above code is supposed to run successfully upon you finish installation.
Background of example¶
easyrec provides a number of existing models proposed in recommender system fields. It is extremely easy to use and what you only need is to prepare the input of models.
To quickly acquire the usage of easyrec, we have finished some examples and make them open-sourced in Github.
Here we take MovieLens 1M Dataset (short as ml-1m) for dataset and Factorization Machine (FM) for model as an example.
Prepare dataset¶
After you have downloaded the dataset, you may clean the data as follows:
from pathlib import Path
import pandas as pd
dataset_path = Path('/path/for/dataset/')
# load ratings.dat
rating_df = pd.read_csv(dataset_path / 'ratings.dat', sep='::', engine='python', header=None,
names=['user_id', 'item_id', 'ctr', 'timestamp'])
rating_df.loc[rating_df['ctr'] <= 3, 'ctr'] = 0
rating_df.loc[rating_df['ctr'] > 3, 'ctr'] = 1
rating_df.pop('timestamp')
# load users.dat
user_df = pd.read_csv(dataset_path / 'users.dat', sep='::', engine='python', header=None,
names=['user_id', 'sex_id', 'age_id', 'occupation_id', 'zip_code_id'])
user_df['age_id'] = user_df['age_id'].astype(str)
user_df['occupation_id'] = user_df['occupation_id'].astype(str)
user_df['zip_code_id'] = user_df['zip_code_id'].astype(str)
# load movies.dat
item_df = pd.read_csv(dataset_path / 'movies.dat', sep='::', engine='python', header=None,
names=['item_id', 'title', 'genre_ids'])
item_df.pop('title') # title is not used in the example
item_df['genre_ids'] = item_df['genre_ids'].apply(lambda x: x.split('|'))
# join 3 tables
df = pd.merge(rating_df, user_df, how='left', on='user_id')
df = pd.merge(df, item_df, how='left', on='item_id')
Then, based on feature columns in Tensorflow 2, you can formally define the format of input for models and obtain the dataset generator.
Note: detailed introduction of feature columns is illustrated in Tutorial.
import tensorflow as tf
# define the feature columns
categorical_column_with_identity = tf.feature_column.categorical_column_with_identity
categorical_column_with_vocabulary_list = tf.feature_column.categorical_column_with_vocabulary_list
one_hot_feature_columns = [
categorical_column_with_identity(key='user_id', num_buckets=df['user_id'].max() + 1, default_value=0),
categorical_column_with_vocabulary_list(
key='sex_id', vocabulary_list=set(df['sex_id'].values), num_oov_buckets=1),
categorical_column_with_vocabulary_list(
key='age_id', vocabulary_list=set(df['age_id'].values), num_oov_buckets=1),
categorical_column_with_vocabulary_list(
key='occupation_id', vocabulary_list=set(df['occupation_id'].values), num_oov_buckets=1),
categorical_column_with_vocabulary_list(
key='zip_code_id', vocabulary_list=set(df['zip_code_id'].values), num_oov_buckets=1),
categorical_column_with_identity(key='item_id', num_buckets=df['item_id'].max() + 1, default_value=0),
]
# construct dataset generator
def train_validation_test_split(dataset: tf.data.Dataset,
dataset_size: int,
train_ratio: float,
validation_ratio: float
) -> Tuple[tf.data.Dataset, tf.data.Dataset, tf.data.Dataset]:
if train_ratio + validation_ratio >= 1:
raise ValueError('train_size + validation_size should be less than 1')
train_size, validation_size = round(train_ratio * dataset_size), round(validation_ratio * dataset_size)
train_dataset = dataset.take(train_size)
test_dataset = dataset.skip(train_size)
validation_dataset = test_dataset.take(validation_size)
test_dataset = test_dataset.skip(validation_size)
return train_dataset, validation_dataset, test_dataset
def transform_ragged_lists_to_sparse_tensor(ragged_lists: list):
indices, values = [], []
max_length = 0
for i, ragged_list in enumerate(ragged_lists):
for j, value in enumerate(ragged_list):
indices.append((i, j))
values.append(value)
max_length = max(max_length, len(ragged_list))
return tf.SparseTensor(
indices=indices,
values=values,
dense_shape=(len(ragged_lists), max_length)
)
train_ratio, validation_ratio, test_ratio = [0.6, 0.2, 0.2]
batch_size = 128
labels = df.pop('ctr')
features = dict(df)
features['genre_ids'] = transform_ragged_lists_to_sparse_tensor(features['genre_ids'])
dataset = tf.data.Dataset.from_tensor_slices((features, labels))
dataset = dataset.shuffle(buffer_size=200, seed=42)
train_dataset, validation_dataset, test_dataset = train_validation_test_split(dataset,
len(df),
train_ratio,
validation_ratio
)
train_dataset = train_dataset.batch(batch_size)
validation_dataset = validation_dataset.batch(batch_size)
test_dataset = test_dataset.batch(batch_size)
Low-level APIs¶
Next, train the model according to Low-level APIs (or the High-level APIs mentioned below).
from tensorflow.keras.losses import BinaryCrossentropy
from tensorflow.keras.metrics import Mean, AUC
from tensorflow.keras.optimizers import SGD
learning_rate = 1e-1
epochs = 50
output_ckpt_path = Path(output_ckpt_path)
model = FM(
one_hot_feature_columns,
k=32
)
start_epoch = 0
loss_obj = BinaryCrossentropy()
optimizer = SGD(learning_rate=learning_rate)
train_loss = Mean(name='train_loss')
train_auc = AUC(name='train_auc')
validation_loss = Mean(name='validation_loss')
validation_auc = AUC(name='validation_auc')
best_auc = 0
@tf.function
def train_step(x, y):
with tf.GradientTape() as tape:
predictions = model(x)
loss = loss_obj(y, predictions)
grads = tape.gradient(loss, model.trainable_variables)
optimizer.apply_gradients(zip(grads, model.trainable_variables))
train_loss(loss)
train_auc(y, predictions)
@tf.function
def validation_step(x, y):
predictions = model(x)
loss = loss_obj(y, predictions)
validation_loss(loss)
validation_auc(y, predictions)
# train
for epoch in range(start_epoch, epochs):
train_loss.reset_states()
train_auc.reset_states()
validation_loss.reset_states()
validation_auc.reset_states()
for features, labels in train_dataset:
train_step(features, labels)
for features, labels in validation_dataset:
validation_step(features, labels)
print('epoch: {}, train_loss: {}, train_auc: {}'.format(epoch + 1, train_loss.result().numpy(),
train_auc.result().numpy()))
print('epoch: {}, validation_loss: {}, validation_auc: {}'.format(epoch + 1, validation_loss.result().numpy(),
validation_auc.result().numpy()))
model.save(output_ckpt_path / str(epoch + 1))
if best_auc < validation_auc.result().numpy():
best_auc = validation_auc.result().numpy()
model.save(output_ckpt_path / 'best')
Finally, the model parameter with the best evaluation result can be loaded and carry out inference.
@tf.function
def test_step(x, y):
predictions = model(x)
loss = loss_obj(y, predictions)
test_loss(loss)
test_auc(y, predictions)
model = tf.keras.models.load_model(output_ckpt_path / 'best')
test_loss = Mean(name='test_loss')
test_auc = AUC(name='test_auc')
for features, labels in test_dataset:
test_step(features, labels)
print('test_loss: {}, test_auc: {}'.format(test_loss.result().numpy(),
test_auc.result().numpy()))
High-level APIs¶
Coming sooooooooon!
Feature columns¶
The most important part of easyrec is feature columns, which basically determines whether you can train models within your customized dataset.
In easyrec, the type of feature columns can be concluded into 2 groups, i.e., by data type or by usage.
Data type-aware feature columns¶
One hot¶
One hot feature columns indicate a list of categorical feature columns, and a data sample can belong to one and only one of the categories.
"""
Example Args in Functions:
one_hot_feature_columns: List[CategoricalColumn] encodes one hot feature fields, such as sex_id.
"""
import tensorflow as tf
categorical_column_with_identity = tf.feature_column.categorical_column_with_identity
categorical_column_with_vocabulary_list = tf.feature_column.categorical_column_with_vocabulary_list
one_hot_feature_columns = [
categorical_column_with_identity(key='user_id', num_buckets=df['user_id'].max() + 1, default_value=0),
categorical_column_with_vocabulary_list(
key='sex_id', vocabulary_list=set(df['sex_id'].values), num_oov_buckets=1),
categorical_column_with_vocabulary_list(
key='age_id', vocabulary_list=set(df['age_id'].values), num_oov_buckets=1),
categorical_column_with_vocabulary_list(
key='occupation_id', vocabulary_list=set(df['occupation_id'].values), num_oov_buckets=1),
categorical_column_with_vocabulary_list(
key='zip_code_id', vocabulary_list=set(df['zip_code_id'].values), num_oov_buckets=1),
categorical_column_with_identity(key='item_id', num_buckets=df['item_id'].max() + 1, default_value=0),
]
Multi hot¶
Multi hot feature columns indicate a list of categorical feature columns, and a data sample can belong to one or more than one of the categories.
"""
Example Args in Function:
multi_hot_feature_columns: List[CategoricalColumn] encodes multi hot feature fields, such as
historical_item_ids.
"""
import tensorflow as tf
categorical_column_with_vocabulary_list = tf.feature_column.categorical_column_with_vocabulary_list
multi_hot_feature_columns = [
categorical_column_with_vocabulary_list(
key='genre_ids', vocabulary_list=get_vocabulary_list_from_ragged_list_series(item_df['genre_ids']),
num_oov_buckets=1
)
]
Dense¶
Dense feature columns indicate a list of numerical feature columns.
"""
Example Args in Function:
dense_feature_columns: List[NumericalColumn] encodes numerical feature fields, such as age.
"""
import tensorflow as tf
dense_feature_columns = [
tf.feature_column.numeric_column(key='age')
]
Usage-aware feature columns¶
These feature columns indicate a list of feature columns that can be directly feed into model.
"""
Example Args:
user_feature_columns: List[FeatureColumn] to directly feed into tf.keras.layers.DenseFeatures, which
basically contains user feature fields.
item_feature_columns: List[FeatureColumn] to directly feed into tf.keras.layers.DenseFeatures, which
basically contains item feature fields.
feature columns: List[FeatureColumn] to directly feed into tf.keras.layers.DenseFeatures, which basically
contains all feature fields.
"""
import tensorflow as tf
categorical_column_with_identity = tf.feature_column.categorical_column_with_identity
categorical_column_with_vocabulary_list = tf.feature_column.categorical_column_with_vocabulary_list
indicator_column = tf.feature_column.indicator_column
user_feature_columns = [
categorical_column_with_identity(key='user_id', num_buckets=df['user_id'].max() + 1, default_value=0),
categorical_column_with_vocabulary_list(
key='sex_id', vocabulary_list=set(df['sex_id'].values), num_oov_buckets=1),
categorical_column_with_vocabulary_list(
key='age_id', vocabulary_list=set(df['age_id'].values), num_oov_buckets=1),
categorical_column_with_vocabulary_list(
key='occupation_id', vocabulary_list=set(df['occupation_id'].values), num_oov_buckets=1),
categorical_column_with_vocabulary_list(
key='zip_code_id', vocabulary_list=set(df['zip_code_id'].values), num_oov_buckets=1),
]
easyrec.models¶
easyrec.models.afm¶
- class easyrec.models.afm.AFM(*args, **kwargs)¶
Bases:
keras.engine.training.Model
Attentional Factorization Machines (AFM). Reference: Jun Xiao et al. Attentional Factorization Machines:Learning the Weight of Feature Interactions via Attention Networks. arXiv. 2017.
- Parameters
one_hot_feature_columns – List[CategoricalColumn] encodes one hot feature fields, such as sex_id.
k – Dimension of the second-order weights.
- call(inputs, training=None, mask=None)¶
Calls the model on new inputs.
In this case call just reapplies all ops in the graph to the new inputs (e.g. build a new computational graph from the provided inputs).
Note: This method should not be called directly. It is only meant to be overridden when subclassing tf.keras.Model. To call a model on an input, always use the __call__ method, i.e. model(inputs), which relies on the underlying call method.
- Parameters
inputs – Input tensor, or dict/list/tuple of input tensors.
training – Boolean or boolean scalar tensor, indicating whether to run the Network in training mode or inference mode.
mask – A mask or list of masks. A mask can be either a tensor or None (no mask).
- Returns
A tensor if there is a single output, or a list of tensors if there are more than one outputs.
easyrec.models.autoint¶
- class easyrec.models.autoint.AutoInt(*args, **kwargs)¶
Bases:
keras.engine.training.Model
Automatic Feature Interaction (AutoInt). Reference: AutoInt: Automatic Feature Interaction Learning via Self-Attentive Neural Networks. CIKM. 2019.
- Parameters
one_hot_feature_columns – List[CategoricalColumn] encodes one hot feature fields, such as sex_id.
multi_hot_feature_columns – List[CategoricalColumn] encodes multi hot feature fields, such as historical_item_ids.
dense_feature_columns – List[NumericalColumn] encodes numerical feature fields, such as age.
embedding_dimension – Dimension of embedded Column.
num_heads – Number of heads.
attention_qkv_dimension – Dimension of Query, Key and Value in self attention.
attention_output_dimension – Dimension of output in self attention.
- call(inputs, training=None, mask=None)¶
Calls the model on new inputs.
In this case call just reapplies all ops in the graph to the new inputs (e.g. build a new computational graph from the provided inputs).
Note: This method should not be called directly. It is only meant to be overridden when subclassing tf.keras.Model. To call a model on an input, always use the __call__ method, i.e. model(inputs), which relies on the underlying call method.
- Parameters
inputs – Input tensor, or dict/list/tuple of input tensors.
training – Boolean or boolean scalar tensor, indicating whether to run the Network in training mode or inference mode.
mask – A mask or list of masks. A mask can be either a tensor or None (no mask).
- Returns
A tensor if there is a single output, or a list of tensors if there are more than one outputs.
easyrec.models.dcn¶
- class easyrec.models.dcn.DCN(*args, **kwargs)¶
Bases:
keras.engine.training.Model
Deep & Cross Network (DCN). Reference: Ruoxi Wang et al. Deep & Cross Network for ad Click Predictions. ADKDD. 2017.
- Parameters
one_hot_feature_columns – List[CategoricalColumn] encodes one hot feature fields, such as sex_id.
multi_hot_feature_columns – List[CategoricalColumn] encodes multi hot feature fields, such as historical_item_ids.
dense_feature_columns – List[NumericalColumn] encodes numerical feature fields, such as age.
embedding_dimension – Dimension of embedded CategoricalColumn.
num_crosses – Number of crosses.
deep_units_list – Dimension of fully connected stack outputs in deep dense block.
deep_activation – Activation to use in deep dense block.
- call(inputs, training=None, mask=None)¶
Calls the model on new inputs.
In this case call just reapplies all ops in the graph to the new inputs (e.g. build a new computational graph from the provided inputs).
Note: This method should not be called directly. It is only meant to be overridden when subclassing tf.keras.Model. To call a model on an input, always use the __call__ method, i.e. model(inputs), which relies on the underlying call method.
- Parameters
inputs – Input tensor, or dict/list/tuple of input tensors.
training – Boolean or boolean scalar tensor, indicating whether to run the Network in training mode or inference mode.
mask – A mask or list of masks. A mask can be either a tensor or None (no mask).
- Returns
A tensor if there is a single output, or a list of tensors if there are more than one outputs.
easyrec.models.deep_crossing¶
- class easyrec.models.deep_crossing.DeepCrossing(*args, **kwargs)¶
Bases:
keras.engine.training.Model
Deep Crossing. Reference: Ying Shan et al. Deep Crossing: Web-Scale Modeling without Manually Crafted Combinatorial Features. KDD. 2016.
- Parameters
feature_columns – List[FeatureColumn] to directly feed into tf.keras.layers.DenseFeatures, which basically contains all feature fields.
num_residual_blocks – Number of residual blocks.
residual_units_list – Dimension of fully connected stack outputs in residual block.
residual_activation – Activation to use in residual block.
- call(inputs, training=None, mask=None)¶
Calls the model on new inputs.
In this case call just reapplies all ops in the graph to the new inputs (e.g. build a new computational graph from the provided inputs).
Note: This method should not be called directly. It is only meant to be overridden when subclassing tf.keras.Model. To call a model on an input, always use the __call__ method, i.e. model(inputs), which relies on the underlying call method.
- Parameters
inputs – Input tensor, or dict/list/tuple of input tensors.
training – Boolean or boolean scalar tensor, indicating whether to run the Network in training mode or inference mode.
mask – A mask or list of masks. A mask can be either a tensor or None (no mask).
- Returns
A tensor if there is a single output, or a list of tensors if there are more than one outputs.
easyrec.models.deepfm¶
- class easyrec.models.deepfm.DeepFM(*args, **kwargs)¶
Bases:
keras.engine.training.Model
Deep Factorization Machine (DeepFM). Reference: Huifeng Guo et al. DeepFM: A Factorization-Machine based Neural Network for CTR Prediction. arXiv. 2017.
- Parameters
one_hot_feature_columns – List[CategoricalColumn] encodes one hot feature fields, such as sex_id.
k – Dimension of the second-order weights.
deep_units_list – Dimension of fully connected stack outputs in deep block.
deep_activation – Activation to use in deep block.
- call(inputs, training=None, mask=None)¶
Calls the model on new inputs.
In this case call just reapplies all ops in the graph to the new inputs (e.g. build a new computational graph from the provided inputs).
Note: This method should not be called directly. It is only meant to be overridden when subclassing tf.keras.Model. To call a model on an input, always use the __call__ method, i.e. model(inputs), which relies on the underlying call method.
- Parameters
inputs – Input tensor, or dict/list/tuple of input tensors.
training – Boolean or boolean scalar tensor, indicating whether to run the Network in training mode or inference mode.
mask – A mask or list of masks. A mask can be either a tensor or None (no mask).
- Returns
A tensor if there is a single output, or a list of tensors if there are more than one outputs.
easyrec.models.dssm¶
- class easyrec.models.dssm.DSSM(*args, **kwargs)¶
Bases:
keras.engine.training.Model
Deep Structured Semantic Model (DSSM). Po-Sen Huang et al. Learning Deep Structured Semantic Models for Web Search using Clickthrough Data. CIKM. 2013.
- Parameters
user_feature_columns – List[FeatureColumn] to directly feed into tf.keras.layers.DenseFeatures, which basically contains user feature fields.
item_feature_columns – List[FeatureColumn] to directly feed into tf.keras.layers.DenseFeatures, which basically contains item feature fields.
user_units_list – Dimension of fully connected stack outputs in user dense block.
user_activation – Activation to use in user dense block.
item_units_list – Dimension of fully connected stack outputs in item dense block.
item_activation – Activation to use in item dense block.
score_function – Final output function to combine the user embedding and item embedding.
- call(inputs, training=None, mask=None)¶
Calls the model on new inputs.
In this case call just reapplies all ops in the graph to the new inputs (e.g. build a new computational graph from the provided inputs).
Note: This method should not be called directly. It is only meant to be overridden when subclassing tf.keras.Model. To call a model on an input, always use the __call__ method, i.e. model(inputs), which relies on the underlying call method.
- Parameters
inputs – Input tensor, or dict/list/tuple of input tensors.
training – Boolean or boolean scalar tensor, indicating whether to run the Network in training mode or inference mode.
mask – A mask or list of masks. A mask can be either a tensor or None (no mask).
- Returns
A tensor if there is a single output, or a list of tensors if there are more than one outputs.
easyrec.models.ffm¶
- class easyrec.models.ffm.FFM(*args, **kwargs)¶
Bases:
keras.engine.training.Model
Field-aware Factorization Machine (FFM). Reference: Yuchin Juan et al. Field-aware Factorization Machines for CTR Prediction. RecSys. 2016.
- Parameters
one_hot_feature_columns – List[CategoricalColumn] encodes one hot feature fields, such as sex_id.
k – Dimension of the second-order weights.
- call(inputs, training=None, mask=None)¶
Calls the model on new inputs.
In this case call just reapplies all ops in the graph to the new inputs (e.g. build a new computational graph from the provided inputs).
Note: This method should not be called directly. It is only meant to be overridden when subclassing tf.keras.Model. To call a model on an input, always use the __call__ method, i.e. model(inputs), which relies on the underlying call method.
- Parameters
inputs – Input tensor, or dict/list/tuple of input tensors.
training – Boolean or boolean scalar tensor, indicating whether to run the Network in training mode or inference mode.
mask – A mask or list of masks. A mask can be either a tensor or None (no mask).
- Returns
A tensor if there is a single output, or a list of tensors if there are more than one outputs.
easyrec.models.fm¶
- class easyrec.models.fm.FM(*args, **kwargs)¶
Bases:
keras.engine.training.Model
Factorization Machine (FM). Reference: Steffen Rendle. Factorization Machines. ICDM. 2010.
- Parameters
one_hot_feature_columns – List[CategoricalColumn] encodes one hot feature fields, such as sex_id.
k – Dimension of the second-order weights.
- call(inputs, training=None, mask=None)¶
Calls the model on new inputs.
In this case call just reapplies all ops in the graph to the new inputs (e.g. build a new computational graph from the provided inputs).
Note: This method should not be called directly. It is only meant to be overridden when subclassing tf.keras.Model. To call a model on an input, always use the __call__ method, i.e. model(inputs), which relies on the underlying call method.
- Parameters
inputs – Input tensor, or dict/list/tuple of input tensors.
training – Boolean or boolean scalar tensor, indicating whether to run the Network in training mode or inference mode.
mask – A mask or list of masks. A mask can be either a tensor or None (no mask).
- Returns
A tensor if there is a single output, or a list of tensors if there are more than one outputs.
easyrec.models.fnn¶
- class easyrec.models.fnn.FNN(*args, **kwargs)¶
Bases:
keras.engine.training.Model
Factorization-machine supported Neural Network (FNN). Reference: Weinan Zhang. Deep Learning over Multi-field Categorical Data – A Case Study on User Response Prediction. ECIR. 2016.
fm: Pretrained Factorization Machines. one_hot_feature_columns: List[CategoricalColumn] encodes one hot feature fields, such as sex_id. units_list: Dimension of fully connected stack outputs. activation: Activation to use.
- call(inputs, pretraining=True, training=None, mask=None)¶
Calls the model on new inputs.
In this case call just reapplies all ops in the graph to the new inputs (e.g. build a new computational graph from the provided inputs).
Note: This method should not be called directly. It is only meant to be overridden when subclassing tf.keras.Model. To call a model on an input, always use the __call__ method, i.e. model(inputs), which relies on the underlying call method.
- Parameters
inputs – Input tensor, or dict/list/tuple of input tensors.
training – Boolean or boolean scalar tensor, indicating whether to run the Network in training mode or inference mode.
mask – A mask or list of masks. A mask can be either a tensor or None (no mask).
- Returns
A tensor if there is a single output, or a list of tensors if there are more than one outputs.
easyrec.models.lr¶
- class easyrec.models.lr.LR(*args, **kwargs)¶
Bases:
keras.engine.training.Model
Logisitic Regression (LR).
- Parameters
feature_columns – List[FeatureColumn] to directly feed into tf.keras.layers.DenseFeatures, which basically contains all feature fields.
- call(inputs, training=None, mask=None)¶
Calls the model on new inputs.
In this case call just reapplies all ops in the graph to the new inputs (e.g. build a new computational graph from the provided inputs).
Note: This method should not be called directly. It is only meant to be overridden when subclassing tf.keras.Model. To call a model on an input, always use the __call__ method, i.e. model(inputs), which relies on the underlying call method.
- Parameters
inputs – Input tensor, or dict/list/tuple of input tensors.
training – Boolean or boolean scalar tensor, indicating whether to run the Network in training mode or inference mode.
mask – A mask or list of masks. A mask can be either a tensor or None (no mask).
- Returns
A tensor if there is a single output, or a list of tensors if there are more than one outputs.
easyrec.models.mlp¶
- class easyrec.models.mlp.MLP(*args, **kwargs)¶
Bases:
keras.engine.training.Model
Multi-layer Perceptron (MLP).
- Parameters
feature_columns – List[FeatureColumn] to directly feed into tf.keras.layers.DenseFeatures, which basically contains all feature fields.
units_list – Dimension of fully connected stack outputs.
activation – Activation to use.
- call(inputs, training=None, mask=None)¶
Calls the model on new inputs.
In this case call just reapplies all ops in the graph to the new inputs (e.g. build a new computational graph from the provided inputs).
Note: This method should not be called directly. It is only meant to be overridden when subclassing tf.keras.Model. To call a model on an input, always use the __call__ method, i.e. model(inputs), which relies on the underlying call method.
- Parameters
inputs – Input tensor, or dict/list/tuple of input tensors.
training – Boolean or boolean scalar tensor, indicating whether to run the Network in training mode or inference mode.
mask – A mask or list of masks. A mask can be either a tensor or None (no mask).
- Returns
A tensor if there is a single output, or a list of tensors if there are more than one outputs.
easyrec.models.mmoe¶
- class easyrec.models.mmoe.MMOE(*args, **kwargs)¶
Bases:
keras.engine.training.Model
Multi-gate Mixture-of-Experts. Reference: Jiaqi Ma et al. Modeling Task Relationships in Multi-task Learning with Multi-gate Mixture-of-Experts. KDD. 2018.
- Parameters
feature_columns – List[FeatureColumn] to directly feed into tf.keras.layers.DenseFeatures, which basically contains all feature fields.
num_experts – Number of experts.
expert_units_list – Dimension of fully connected stack outputs in expert dense block.
expert_activation – Activation to use in expert dense block.
num_towers – Number of towers (tasks).
tower_units_list – Dimension of fully connected stack outputs in tower dense block.
tower_activation – Activation to use in tower dense block.
- call(inputs, use_tower=0, training=None, mask=None)¶
Calls the model on new inputs.
In this case call just reapplies all ops in the graph to the new inputs (e.g. build a new computational graph from the provided inputs).
Note: This method should not be called directly. It is only meant to be overridden when subclassing tf.keras.Model. To call a model on an input, always use the __call__ method, i.e. model(inputs), which relies on the underlying call method.
- Parameters
inputs – Input tensor, or dict/list/tuple of input tensors.
training – Boolean or boolean scalar tensor, indicating whether to run the Network in training mode or inference mode.
mask – A mask or list of masks. A mask can be either a tensor or None (no mask).
- Returns
A tensor if there is a single output, or a list of tensors if there are more than one outputs.
easyrec.models.neumf¶
- class easyrec.models.neumf.NeuMF(*args, **kwargs)¶
Bases:
keras.engine.training.Model
Neural Matrix Factorization (NeuMF). Xiangnan He et al. Neural Factorization Machines for Sparse Predictive Analytics. SIGIR. 2017.
- Parameters
user_feature_column – CategoricalColumn to represent user_id.
item_feature_column – CategoricalColumn to represent item_id.
user_embedding_dimension – Dimension of user embedding.
item_embedding_dimension – Dimension of item embedding.
units_list – Dimension of fully connected stack outputs.
activation – Activation to use.
alpha – Tendency parameter for GMF, thus, 1 - alpha is used for MLP.
- call(inputs, training=None, mask=None)¶
Calls the model on new inputs.
In this case call just reapplies all ops in the graph to the new inputs (e.g. build a new computational graph from the provided inputs).
Note: This method should not be called directly. It is only meant to be overridden when subclassing tf.keras.Model. To call a model on an input, always use the __call__ method, i.e. model(inputs), which relies on the underlying call method.
- Parameters
inputs – Input tensor, or dict/list/tuple of input tensors.
training – Boolean or boolean scalar tensor, indicating whether to run the Network in training mode or inference mode.
mask – A mask or list of masks. A mask can be either a tensor or None (no mask).
- Returns
A tensor if there is a single output, or a list of tensors if there are more than one outputs.
easyrec.models.nfm¶
- class easyrec.models.nfm.NFM(*args, **kwargs)¶
Bases:
keras.engine.training.Model
Neural Factorization Machine (NFM). Xiangnan He et al. Neural Factorization Machines for Sparse Predictive Analytics. SIGIR. 2017.
- Parameters
one_hot_feature_columns – List[CategoricalColumn] encodes one hot feature fields, such as sex_id.
k – Dimension of the second-order weights.
- call(inputs, training=None, mask=None)¶
Calls the model on new inputs.
In this case call just reapplies all ops in the graph to the new inputs (e.g. build a new computational graph from the provided inputs).
Note: This method should not be called directly. It is only meant to be overridden when subclassing tf.keras.Model. To call a model on an input, always use the __call__ method, i.e. model(inputs), which relies on the underlying call method.
- Parameters
inputs – Input tensor, or dict/list/tuple of input tensors.
training – Boolean or boolean scalar tensor, indicating whether to run the Network in training mode or inference mode.
mask – A mask or list of masks. A mask can be either a tensor or None (no mask).
- Returns
A tensor if there is a single output, or a list of tensors if there are more than one outputs.
easyrec.models.pnn¶
- class easyrec.models.pnn.PNN(*args, **kwargs)¶
Bases:
keras.engine.training.Model
Product-based Neural Network (PNN). Reference: Yanru Qu et al. Product-based Neural Networks for User Response Prediction. ICDM. 2016.
- Parameters
one_hot_feature_columns – List[CategoricalColumn] encodes one hot feature fields, such as sex_id.
multi_hot_feature_columns – List[CategoricalColumn] encodes multi hot feature fields, such as historical_item_ids.
embedding_dimension – embedding dimension of each field.
use_inner_product – whether to use IPNN.
use_outer_product – whether to use OPNN.
units_list – Dimension of fully connected stack outputs.
activation – Activation to use.
- call(inputs, training=None, mask=None)¶
Calls the model on new inputs.
In this case call just reapplies all ops in the graph to the new inputs (e.g. build a new computational graph from the provided inputs).
Note: This method should not be called directly. It is only meant to be overridden when subclassing tf.keras.Model. To call a model on an input, always use the __call__ method, i.e. model(inputs), which relies on the underlying call method.
- Parameters
inputs – Input tensor, or dict/list/tuple of input tensors.
training – Boolean or boolean scalar tensor, indicating whether to run the Network in training mode or inference mode.
mask – A mask or list of masks. A mask can be either a tensor or None (no mask).
- Returns
A tensor if there is a single output, or a list of tensors if there are more than one outputs.
easyrec.models.wide_and_deep¶
- class easyrec.models.wide_and_deep.WideAndDeep(*args, **kwargs)¶
Bases:
keras.engine.training.Model
Wide & Deep. Reference: Heng-Tze Cheng et al. Wide & Deep Learning for Recommender Systems. RecSys. 2016.
- Parameters
one_hot_feature_columns – List[CategoricalColumn] encodes one hot feature fields, such as sex_id.
multi_hot_feature_columns – List[CategoricalColumn] encodes multi hot feature fields, such as historical_item_ids.
dense_feature_columns – List[NumericalColumn] encodes numerical feature fields, such as age.
embedding_dimension – Dimension of embedded CategoricalColumn.
deep_units_list – Dimension of fully connected stack outputs in deep dense block.
deep_activation – Activation to use in deep dense block.
- call(inputs, training=None, mask=None)¶
Calls the model on new inputs.
In this case call just reapplies all ops in the graph to the new inputs (e.g. build a new computational graph from the provided inputs).
Note: This method should not be called directly. It is only meant to be overridden when subclassing tf.keras.Model. To call a model on an input, always use the __call__ method, i.e. model(inputs), which relies on the underlying call method.
- Parameters
inputs – Input tensor, or dict/list/tuple of input tensors.
training – Boolean or boolean scalar tensor, indicating whether to run the Network in training mode or inference mode.
mask – A mask or list of masks. A mask can be either a tensor or None (no mask).
- Returns
A tensor if there is a single output, or a list of tensors if there are more than one outputs.
easyrec.models.xdeepfm¶
- class easyrec.models.xdeepfm.xDeepFM(*args, **kwargs)¶
Bases:
keras.engine.training.Model
Extreme Deep Factorization Machine (xDeepFM). Reference: Jianxun Lian et al. xDeepFM: Combining Explicit and Implicit Feature Interactions for Recommender Systems. KDD. 2018.
- Parameters
one_hot_feature_columns – List[CategoricalColumn] encodes one hot feature fields, such as sex_id.
multi_hot_feature_columns – List[CategoricalColumn] encodes multi hot feature fields, such as historical_item_ids.
k – Dimension of the second-order weights.
deep_units_list – Dimension of fully connected stack outputs in deep block.
deep_activation – Activation to use in deep block.
cross_units_list – Number of fields in the cross layer.
- call(inputs, training=None, mask=None)¶
Calls the model on new inputs.
In this case call just reapplies all ops in the graph to the new inputs (e.g. build a new computational graph from the provided inputs).
Note: This method should not be called directly. It is only meant to be overridden when subclassing tf.keras.Model. To call a model on an input, always use the __call__ method, i.e. model(inputs), which relies on the underlying call method.
- Parameters
inputs – Input tensor, or dict/list/tuple of input tensors.
training – Boolean or boolean scalar tensor, indicating whether to run the Network in training mode or inference mode.
mask – A mask or list of masks. A mask can be either a tensor or None (no mask).
- Returns
A tensor if there is a single output, or a list of tensors if there are more than one outputs.
easyrec.blocks¶
easyrec.blocks.interaction¶
- class easyrec.blocks.interaction.AFM(*args, **kwargs)¶
Bases:
keras.engine.training.Model
Attentional factorization machine layer.
- Parameters
one_hot_feature_columns – List[CategoricalColumn] encodes one hot feature fields, such as sex_id.
k – Dimension of the second-order weights.
- call(inputs, training=None, mask=None)¶
Calls the model on new inputs.
In this case call just reapplies all ops in the graph to the new inputs (e.g. build a new computational graph from the provided inputs).
Note: This method should not be called directly. It is only meant to be overridden when subclassing tf.keras.Model. To call a model on an input, always use the __call__ method, i.e. model(inputs), which relies on the underlying call method.
- Parameters
inputs – Input tensor, or dict/list/tuple of input tensors.
training – Boolean or boolean scalar tensor, indicating whether to run the Network in training mode or inference mode.
mask – A mask or list of masks. A mask can be either a tensor or None (no mask).
- Returns
A tensor if there is a single output, or a list of tensors if there are more than one outputs.
- class easyrec.blocks.interaction.FFM(*args, **kwargs)¶
Bases:
keras.engine.training.Model
Field-aware factorization machine layer.
- Parameters
one_hot_feature_columns – List[CategoricalColumn] encodes one hot feature fields, such as sex_id.
k – Dimension of the second-order weights.
- call(inputs, *args, **kwargs)¶
Calls the model on new inputs.
In this case call just reapplies all ops in the graph to the new inputs (e.g. build a new computational graph from the provided inputs).
Note: This method should not be called directly. It is only meant to be overridden when subclassing tf.keras.Model. To call a model on an input, always use the __call__ method, i.e. model(inputs), which relies on the underlying call method.
- Parameters
inputs – Input tensor, or dict/list/tuple of input tensors.
training – Boolean or boolean scalar tensor, indicating whether to run the Network in training mode or inference mode.
mask – A mask or list of masks. A mask can be either a tensor or None (no mask).
- Returns
A tensor if there is a single output, or a list of tensors if there are more than one outputs.
- class easyrec.blocks.interaction.FM(*args, **kwargs)¶
Bases:
keras.engine.training.Model
Factorization machine layer using vector w and matrix v.
- Parameters
one_hot_feature_columns – List[CategoricalColumn] encodes one hot feature fields, such as sex_id.
k – Dimension of the second-order weights.
- call(inputs, *args, **kwargs)¶
Calls the model on new inputs.
In this case call just reapplies all ops in the graph to the new inputs (e.g. build a new computational graph from the provided inputs).
Note: This method should not be called directly. It is only meant to be overridden when subclassing tf.keras.Model. To call a model on an input, always use the __call__ method, i.e. model(inputs), which relies on the underlying call method.
- Parameters
inputs – Input tensor, or dict/list/tuple of input tensors.
training – Boolean or boolean scalar tensor, indicating whether to run the Network in training mode or inference mode.
mask – A mask or list of masks. A mask can be either a tensor or None (no mask).
- Returns
A tensor if there is a single output, or a list of tensors if there are more than one outputs.
- class easyrec.blocks.interaction.NFM(*args, **kwargs)¶
Bases:
keras.engine.training.Model
Neural factorization machine layer.
- Parameters
one_hot_feature_columns – List[CategoricalColumn] encodes one hot feature fields, such as sex_id.
k – Dimension of the second-order weights.
units_list – Dimension of fully connected stack outputs.
activation – Activation to use.
- call(inputs, training=None, mask=None)¶
Calls the model on new inputs.
In this case call just reapplies all ops in the graph to the new inputs (e.g. build a new computational graph from the provided inputs).
Note: This method should not be called directly. It is only meant to be overridden when subclassing tf.keras.Model. To call a model on an input, always use the __call__ method, i.e. model(inputs), which relies on the underlying call method.
- Parameters
inputs – Input tensor, or dict/list/tuple of input tensors.
training – Boolean or boolean scalar tensor, indicating whether to run the Network in training mode or inference mode.
mask – A mask or list of masks. A mask can be either a tensor or None (no mask).
- Returns
A tensor if there is a single output, or a list of tensors if there are more than one outputs.
easyrec.blocks.nn¶
- class easyrec.blocks.nn.DenseBlock(*args, **kwargs)¶
Bases:
keras.engine.training.Model
Multi-perception layer.
- Parameters
units_list – Dimension of fully connected stack outputs.
activation – Activation to use.
- call(inputs, training=None, mask=None)¶
Calls the model on new inputs.
In this case call just reapplies all ops in the graph to the new inputs (e.g. build a new computational graph from the provided inputs).
Note: This method should not be called directly. It is only meant to be overridden when subclassing tf.keras.Model. To call a model on an input, always use the __call__ method, i.e. model(inputs), which relies on the underlying call method.
- Parameters
inputs – Input tensor, or dict/list/tuple of input tensors.
training – Boolean or boolean scalar tensor, indicating whether to run the Network in training mode or inference mode.
mask – A mask or list of masks. A mask can be either a tensor or None (no mask).
- Returns
A tensor if there is a single output, or a list of tensors if there are more than one outputs.
- class easyrec.blocks.nn.MultiHeadSelfAttention(*args, **kwargs)¶
Bases:
keras.engine.training.Model
Multi-head self attention layer.
- Parameters
input_dimension – Dimension of input.
qkv_dimension – Dimension of Query, Key and Value.
num_heads – Number of heads.
output_dimension – Dimension of final output.
use_normalization – Whether to use normalization in Query * Key process.
- call(inputs, training=None, mask=None)¶
Calls the model on new inputs.
In this case call just reapplies all ops in the graph to the new inputs (e.g. build a new computational graph from the provided inputs).
Note: This method should not be called directly. It is only meant to be overridden when subclassing tf.keras.Model. To call a model on an input, always use the __call__ method, i.e. model(inputs), which relies on the underlying call method.
- Parameters
inputs – Input tensor, or dict/list/tuple of input tensors.
training – Boolean or boolean scalar tensor, indicating whether to run the Network in training mode or inference mode.
mask – A mask or list of masks. A mask can be either a tensor or None (no mask).
- Returns
A tensor if there is a single output, or a list of tensors if there are more than one outputs.
- class easyrec.blocks.nn.ResidualBlock(*args, **kwargs)¶
Bases:
keras.engine.training.Model
Residual layer.
- call(inputs, training=None, mask=None)¶
Calls the model on new inputs.
In this case call just reapplies all ops in the graph to the new inputs (e.g. build a new computational graph from the provided inputs).
Note: This method should not be called directly. It is only meant to be overridden when subclassing tf.keras.Model. To call a model on an input, always use the __call__ method, i.e. model(inputs), which relies on the underlying call method.
- Parameters
inputs – Input tensor, or dict/list/tuple of input tensors.
training – Boolean or boolean scalar tensor, indicating whether to run the Network in training mode or inference mode.
mask – A mask or list of masks. A mask can be either a tensor or None (no mask).
- Returns
A tensor if there is a single output, or a list of tensors if there are more than one outputs.
- class easyrec.blocks.nn.SelfAttention(*args, **kwargs)¶
Bases:
keras.engine.training.Model
Self attention layer.
- Parameters
input_dimension – Dimension of input.
qkv_dimension – Dimension of Query, Key and Value.
use_normalization – Whether to use normalization in Query * Key process.
- call(inputs, training=None, mask=None)¶
Calls the model on new inputs.
In this case call just reapplies all ops in the graph to the new inputs (e.g. build a new computational graph from the provided inputs).
Note: This method should not be called directly. It is only meant to be overridden when subclassing tf.keras.Model. To call a model on an input, always use the __call__ method, i.e. model(inputs), which relies on the underlying call method.
- Parameters
inputs – Input tensor, or dict/list/tuple of input tensors.
training – Boolean or boolean scalar tensor, indicating whether to run the Network in training mode or inference mode.
mask – A mask or list of masks. A mask can be either a tensor or None (no mask).
- Returns
A tensor if there is a single output, or a list of tensors if there are more than one outputs.