An Unbiased Learning To Rank Algorithms (ULTRA) toolbox

Related tags

Deep LearningULTRA
Overview
logo

Unbiased Learning to Rank Algorithms (ULTRA)

Python 3.6 Documentation Status Build Status codecov License follow on Twitter

This is an Unbiased Learning To Rank Algorithms (ULTRA) toolbox, which provides a codebase for experiments and research on learning to rank with human annotated or noisy labels. With the unified data processing pipeline, ULTRA supports multiple unbiased learning-to-rank algorithms, online learning-to-rank algorithms, neural learning-to-rank models, as well as different methods to use and simulate noisy labels (e.g., clicks) to train and test different algorithms/ranking models. A user-friendly documentation can be found here.

Get Started

Create virtual environment (optional):

pip install --user virtualenv
~/.local/bin/virtualenv -p python3 ./venv
source venv/bin/activate

Install ULTRA from the source:

git clone https://github.com/ULTR-Community/ULTRA.git
cd ULTRA
make init # Replace 'tensorflow' with 'tensorflow-gpu' in requirements.txt for GPU support

Run toy example:

bash example/toy/offline_exp_pipeline.sh

Structure

structure

Input Layers

  1. ClickSimulationFeed: this is the input layer that generate synthetic clicks on fixed ranked lists to feed the learning algorithm.

  2. DeterministicOnlineSimulationFeed: this is the input layer that first create ranked lists by sorting documents according to the current ranking model, and then generate synthetic clicks on the lists to feed the learning algorithm. It can do result interleaving if required by the learning algorithm.

  3. StochasticOnlineSimulationFeed: this is the input layer that first create ranked lists by sampling documents based on their scores in the current ranking model and the Plackett-Luce distribution, and then generate synthetic clicks on the lists to feed the learning algorithm. It can do result interleaving if required by the learning algorithm.

  4. DirectLabelFeed: this is the input layer that directly feed the true relevance labels of each documents to the learning algorithm.

  5. [MTLSimulationFeed] (https://github.com/phyllist/ULTRA/blob/master/ultra/input_layer/mtl_simulation_feed.py): this is the input layer that generate synthetic click and dwell-time on fixed ranked lists to feed the learning algorithm.

Learning Algorithms

  1. NA: this model is an implementation of the naive algorithm that directly train models with input labels (e.g., clicks).

  2. DLA: this is an implementation of the Dual Learning Algorithm in Unbiased Learning to Rank with Unbiased Propensity Estimation.

  3. IPW: this model is an implementation of the Inverse Propensity Weighting algorithms in Learning to Rank with Selection Bias in Personal Search and Unbiased Learning-to-Rank with Biased Feedback

  4. REM: this model is an implementation of the regression-based EM algorithm in Position bias estimation for unbiased learning to rank in personal search

  5. PD: this model is an implementation of the pairwise debiasing algorithm in Unbiased LambdaMART: An Unbiased Pairwise Learning-to-Rank Algorithm.

  6. DBGD: this model is an implementation of the Dual Bandit Gradient Descent algorithm in Interactively optimizing information retrieval systems as a dueling bandits problem

  7. MGD: this model is an implementation of the Multileave Gradient Descent in Multileave Gradient Descent for Fast Online Learning to Rank

  8. NSGD: this model is an implementation of the Null Space Gradient Descent algorithm in Efficient Exploration of Gradient Space for Online Learning to Rank

  9. PDGD: this model is an implementation of the Pairwise Differentiable Gradient Descent algorithm in Differentiable unbiased online learning to rank

  10. PAIRREGM: this model is an implementation of the pairwise regression-based EM algorithm of our paper "Unbiased Pairwise Learning to Rank in Recommender Systems".

Ranking Models

  1. Linear: this is a linear ranking algorithm that compute ranking scores with a linear function.

  2. DNN: this is neural ranking algorithm that compute ranking scores with a multi-layer perceptron network (with non-linear activation functions).

  3. DLCM: this is an implementation of the Deep Listwise Context Model in Learning a Deep Listwise Context Model for Ranking Refinement.

  4. GSF: this is an implementation of the Groupwise Scoring Function in Learning Groupwise Multivariate Scoring Functions Using Deep Neural Networks.

  5. SetRank: this is an implementation of the SetRank model in SetRank: Learning a Permutation-Invariant Ranking Model for Information Retrieval.

  6. [BiasTowerDNN] (https://github.com/phyllist/ULTRA/blob/master/ultra/ranking_model/BiasTowerDNN.py): this is an implementation of the shallow tower based DNN model

Supported Evaluation Metrics

  1. MRR: the Mean Reciprocal Rank (inherited from TF-Ranking).

  2. ERR: the Expected Reciprocal Rank from Expected reciprocal rank for graded relevance.

  3. ARP: the Average Relevance Position (inherited from TF-Ranking).

  4. NDCG: the Normalized Discounted Cumulative Gain (inherited from TF-Ranking).

  5. DCG: the Discounted Cumulative Gain (inherited from TF-Ranking).

  6. Precision: the Precision (inherited from TF-Ranking).

  7. MAP: the Mean Average Precision (inherited from TF-Ranking).

  8. Ordered_Pair_Accuracy: the percentage of correctedly ordered pair (inherited from TF-Ranking).

Click Simulation Example

Create click models for click simulations

python ultra/utils/click_models.py pbm 0.1 1 4 1.0 example/ClickModel

* The output is a json file containing the click mode that could be used for click simulation. More details could be found in the code.

(Optional) Estimate examination propensity with result randomization

python ultra/utils/propensity_estimator.py example/ClickModel/pbm_0.1_1.0_4_1.0.json 
   
     example/PropensityEstimator/

   

* The output is a json file containing the estimated examination propensity (used for IPW). DATA_DIR is the directory for the prepared data created by ./libsvm_tools/prepare_exp_data_with_svmrank.py. More details could be found in the code.

Citation

If you use ULTRA in your research, please use the following BibTex entry.

@article{10.1145/3439861,
author = {Ai, Qingyao and Yang, Tao and Wang, Huazheng and Mao, Jiaxin},
title = {Unbiased Learning to Rank: Online or Offline?},
year = {2021},
issue_date = {February 2021},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
volume = {39},
number = {2},
issn = {1046-8188},
url = {https://doi.org/10.1145/3439861},
doi = {10.1145/3439861},
journal = {ACM Trans. Inf. Syst.},
month = feb,
articleno = {21},
numpages = {29},
keywords = {unbiased learning, online learning, Learning to rank}
}

@inproceedings{Ai:2018:ULR:3269206.3274274,
 author = {Ai, Qingyao and Mao, Jiaxin and Liu, Yiqun and Croft, W. Bruce},
 title = {Unbiased Learning to Rank: Theory and Practice},
 booktitle = {Proceedings of the 27th ACM International Conference on Information and Knowledge Management},
 series = {CIKM '18},
 year = {2018},
 isbn = {978-1-4503-6014-2},
 location = {Torino, Italy},
 pages = {2305--2306},
 numpages = {2},
 url = {http://doi.acm.org/10.1145/3269206.3274274},
 doi = {10.1145/3269206.3274274},
 acmid = {3274274},
 publisher = {ACM},
 address = {New York, NY, USA},
 keywords = {click model, counterfactual learning, unbiased learning to rank, user bias},
}

Development Team

​ ​ ​ ​

QingyaoAi
Qingyao Ai

Core Dev
ASST PROF, Univ. of Utah

Taosheng-ty
Tao Yang

Core Dev
Ph.D., Univ. of Utah

huazhengwang
Huazheng Wang

Core Dev
Ph.D., Univ. of Virginia

defaultstr
Jiaxin Mao

Core Dev
Postdoc, Tsinghua Univ.

Contribution

Please read the Contributing Guide before creating a pull request.

Project Organizers

  • Qingyao Ai
    • School of Computing, University of Utah
    • Homepage

License

Apache-2.0

Copyright (c) 2020-present, Qingyao Ai (QingyaoAi)

Random Walk Graph Neural Networks

Random Walk Graph Neural Networks This repository is the official implementation of Random Walk Graph Neural Networks. Requirements Code is written in

Giannis Nikolentzos 38 Jan 02, 2023
Rethinking Transformer-based Set Prediction for Object Detection

Rethinking Transformer-based Set Prediction for Object Detection Here are the code for the ICCV paper. The code is adapted from Detectron2 and AdelaiD

Zhiqing Sun 62 Dec 03, 2022
Use graph-based analysis to re-classify stocks and to improve Markowitz portfolio optimization

Dynamic Stock Industrial Classification Use graph-based analysis to re-classify stocks and experiment different re-classification methodologies to imp

Sheng Yang 10 Dec 05, 2022
SAFL: A Self-Attention Scene Text Recognizer with Focal Loss

SAFL: A Self-Attention Scene Text Recognizer with Focal Loss This repository implements the SAFL in pytorch. Installation conda env create -f environm

6 Aug 24, 2022
Official implementation of NeuralFusion: Online Depth Map Fusion in Latent Space

NeuralFusion This is the official implementation of NeuralFusion: Online Depth Map Fusion in Latent Space. We provide code to train the proposed pipel

53 Jan 01, 2023
HackBMU-5.0-Team-Ctrl-Alt-Elite - HackBMU 5.0 Team Ctrl Alt Elite

HackBMU-5.0-Team-Ctrl-Alt-Elite The search is over. We present to you ‘Health-A-

3 Feb 19, 2022
official code for dynamic convolution decomposition

Revisiting Dynamic Convolution via Matrix Decomposition (ICLR 2021) A pytorch implementation of DCD. If you use this code in your research please cons

Yunsheng Li 110 Nov 23, 2022
Pytorch implementation of PTNet for high-resolution and longitudinal infant MRI synthesis

Pyramid Transformer Net (PTNet) Project | Paper Pytorch implementation of PTNet for high-resolution and longitudinal infant MRI synthesis. PTNet: A Hi

Xuzhe Johnny Zhang 6 Jun 08, 2022
[ICCV 2021] Excavating the Potential Capacity of Self-Supervised Monocular Depth Estimation

EPCDepth EPCDepth is a self-supervised monocular depth estimation model, whose supervision is coming from the other image in a stereo pair. Details ar

Rui Peng 110 Dec 23, 2022
ICCV2021 Paper: AutoShape: Real-Time Shape-Aware Monocular 3D Object Detection

ICCV2021 Paper: AutoShape: Real-Time Shape-Aware Monocular 3D Object Detection

Zongdai 107 Dec 20, 2022
Image Super-Resolution by Neural Texture Transfer

SRNTT: Image Super-Resolution by Neural Texture Transfer Tensorflow implementation of the paper Image Super-Resolution by Neural Texture Transfer acce

Zhifei Zhang 413 Nov 30, 2022
source code for https://arxiv.org/abs/2005.11248 "Accelerating Antimicrobial Discovery with Controllable Deep Generative Models and Molecular Dynamics"

Accelerating Antimicrobial Discovery with Controllable Deep Generative Models and Molecular Dynamics This work will be published in Nature Biomedical

International Business Machines 71 Nov 15, 2022
The first machine learning framework that encourages learning ML concepts instead of memorizing class functions.

SeaLion is designed to teach today's aspiring ml-engineers the popular machine learning concepts of today in a way that gives both intuition and ways of application. We do this through concise algori

Anish 324 Dec 27, 2022
A pytorch implementation of the ACL2019 paper "Simple and Effective Text Matching with Richer Alignment Features".

RE2 This is a pytorch implementation of the ACL 2019 paper "Simple and Effective Text Matching with Richer Alignment Features". The original Tensorflo

287 Dec 21, 2022
Some code of the implements of Geological Modeling Using 3D Pixel-Adaptive and Deformable Convolutional Neural Network

3D-GMPDCNN Geological Modeling Using 3D Pixel-Adaptive and Deformable Convolutional Neural Network PyTorch implementation of "Geological Modeling Usin

5 Nov 21, 2022
WarpRNNT loss ported in Numba CPU/CUDA for Pytorch

RNNT loss in Pytorch - Numba JIT compiled (warprnnt_numba) Warp RNN Transducer Loss for ASR in Pytorch, ported from HawkAaron/warp-transducer and a re

Somshubra Majumdar 15 Oct 22, 2022
Implementation of Neonatal Seizure Detection using EEG signals for deploying on edge devices including Raspberry Pi.

NeonatalSeizureDetection Description Link: https://arxiv.org/abs/2111.15569 Citation: @misc{nagarajan2021scalable, title={Scalable Machine Learn

Vishal Nagarajan 11 Nov 08, 2022
ICLR 2021, Fair Mixup: Fairness via Interpolation

Fair Mixup: Fairness via Interpolation Training classifiers under fairness constraints such as group fairness, regularizes the disparities of predicti

Ching-Yao Chuang 49 Nov 22, 2022
The implement of papar "Enhanced Graph Learning for Collaborative Filtering via Mutual Information Maximization"

SIGIR2021-EGLN The implement of paper "Enhanced Graph Learning for Collaborative Filtering via Mutual Information Maximization" Neural graph based Col

15 Dec 27, 2022
Implement slightly different caffe-segnet in tensorflow

Tensorflow-SegNet Implement slightly different (see below for detail) SegNet in tensorflow, successfully trained segnet-basic in CamVid dataset. Due t

Tseng Kuan Lun 364 Oct 27, 2022