The implementation of the paper "HIST: A Graph-based Framework for Stock Trend Forecasting via Mining Concept-Oriented Shared Information".

Overview

The HIST framework for stock trend forecasting

The implementation of the paper "HIST: A Graph-based Framework for Stock Trend Forecasting via Mining Concept-Oriented Shared Information". image

Environment

  1. Install python3.7, 3.8 or 3.9.
  2. Install the requirements in requirements.txt.
  3. Install the Qlib and download the data:
    # install Qlib from source
    pip install --upgrade  cython
    git clone https://github.com/microsoft/qlib.git && cd qlib
    python setup.py install
    
    # Download the stock features of Alpha360
    python scripts/get_data.py qlib_data --target_dir ~/.qlib/qlib_data/cn_data --region cn --version v2
    

Reproduce the stock trend forecasting results

image

git clone https://github.com/Wentao-Xu/HIST.git
cd HIST
mkdir output

Reproduce our HIST framework

# CSI 100
python learn.py --model_name HIST --data_set csi100 --hidden_size 128 --num_layers 2 --outdir ./output/csi100_HIST

# CSI 300
python learn.py --model_name HIST --data_set csi300 --hidden_size 128 --num_layers 2 --outdir ./output/csi300_HIST

Reproduce the baselines

  • MLP
# MLP on CSI 100
python learn.py --model_name MLP --data_set csi100 --hidden_size 512 --num_layers 3 --outdir ./output/csi100_MLP

# MLP on CSI 300
python learn.py --model_name MLP --data_set csi300 --hidden_size 512 --num_layers 3 --outdir ./output/csi300_MLP
  • LSTM
# LSTM on CSI 100
python learn.py --model_name LSTM --data_set csi100 --hidden_size 128 --num_layers 2 --outdir ./output/csi100_LSTM

# LSTM on CSI 300
python learn.py --model_name LSTM --data_set csi300 --hidden_size 128 --num_layers 2 --outdir ./output/csi300_LSTM
  • GRU
# GRU on CSI 100
python learn.py --model_name GRU --data_set csi100 --hidden_size 128 --num_layers 2 --outdir ./output/csi100_GRU

# GRU on CSI 300
python learn.py --model_name GRU --data_set csi300 --hidden_size 64 --num_layers 2 --outdir ./output/csi300_GRU
  • SFM
# SFM on CSI 100
python learn.py --model_name SFM --data_set csi100 --hidden_size 64 --num_layers 2 --outdir ./output/csi100_SFM

# SFM on CSI 300
python learn.py --model_name SFM --data_set csi300 --hidden_size 128 --num_layers 2 --outdir ./output/csi300_SFM
  • GATs
# GATs on CSI 100
python learn.py --model_name GATs --data_set csi100 --hidden_size 128 --num_layers 2 --outdir ./output/csi100_GATs

# GATs on CSI 300
python learn.py --model_name GATs --data_set csi300 --hidden_size 64 --num_layers 2 --outdir ./output/csi300_GATs
  • ALSTM
# ALSTM on CSI 100
python learn.py --model_name ALSTM --data_set csi100 --hidden_size 64 --num_layers 2 --outdir ./output/csi100_ALSTM

# ALSTM on CSI 300
python learn.py --model_name ALSTM --data_set csi300 --hidden_size 128 --num_layers 2 --outdir ./output/csi300_ALSTM
  • Transformer
# Transformer on CSI 100
python learn.py --model_name Transformer --data_set csi100 --hidden_size 32 --num_layers 3 --outdir ./output/csi100_Transformer

# Transformer on CSI 300
python learn.py --model_name Transformer --data_set csi300 --hidden_size 32 --num_layers 3 --outdir ./output/csi300_Transformer
  • ALSTM+TRA

    We reproduce the ALSTM+TRA with its source code.

Owner
Wentao Xu
PhD Student in a Joint Program between MSRA and SYSU.
Wentao Xu
Domain Connectivity Analysis Tools to analyze aggregate connectivity patterns across a set of domains during security investigations

DomainCAT (Domain Connectivity Analysis Tool) Domain Connectivity Analysis Tool is used to analyze aggregate connectivity patterns across a set of dom

DomainTools 34 Dec 09, 2022
The Python ensemble sampling toolkit for affine-invariant MCMC

emcee The Python ensemble sampling toolkit for affine-invariant MCMC emcee is a stable, well tested Python implementation of the affine-invariant ense

Dan Foreman-Mackey 1.3k Jan 04, 2023
Implement the Perspective open source code in preparation for data visualization

Task Overview | Installation Instructions | Link to Module 2 Introduction Experience Technology at JP Morgan Chase Try out what real work is like in t

Abdulazeez Jimoh 1 Jan 23, 2022
Simple addon for snapping active object to mesh ground

Snap to Ground Simple addon for snapping active object to mesh ground How to install: install the Python file as an addon use shortcut "D" in 3D view

Iyad Ahmed 12 Nov 07, 2022
A simple python tool for explore your object detection dataset

A simple tool for explore your object detection dataset. The goal of this library is to provide simple and intuitive visualizations from your dataset and automatically find the best parameters for ge

GRADIANT - Centro Tecnolóxico de Telecomunicacións de Galicia 142 Dec 25, 2022
CONTRIBUTIONS ONLY: Voluptuous, despite the name, is a Python data validation library.

CONTRIBUTIONS ONLY What does this mean? I do not have time to fix issues myself. The only way fixes or new features will be added is by people submitt

Alec Thomas 1.8k Dec 31, 2022
Homework 2: Matplotlib and Data Visualization

Homework 2: Matplotlib and Data Visualization Overview These data visualizations were created for my introductory computer science course using Python

Sophia Huang 12 Oct 20, 2022
Bioinformatics tool for exploring RNA-Protein interactions

Explore RNA-Protein interactions. RNPFind is a bioinformatics tool. It takes an RNA transcript as input and gives a list of RNA binding protein (RBP)

Nahin Khan 3 Jan 27, 2022
Create matplotlib visualizations from the command-line

MatplotCLI Create matplotlib visualizations from the command-line MatplotCLI is a simple utility to quickly create plots from the command-line, levera

Daniel Moura 46 Dec 16, 2022
Time series visualizer is a flexible extension that provides filling world map by country from real data.

Time-series-visualizer Time series visualizer is a flexible extension that provides filling world map by country from csv or json file. You can know d

Long Ng 3 Jul 09, 2021
Visualize and compare datasets, target values and associations, with one line of code.

In-depth EDA (target analysis, comparison, feature analysis, correlation) in two lines of code! Sweetviz is an open-source Python library that generat

Francois Bertrand 2.3k Jan 05, 2023
Joyplots in Python with matplotlib & pandas :chart_with_upwards_trend:

JoyPy JoyPy is a one-function Python package based on matplotlib + pandas with a single purpose: drawing joyplots (a.k.a. ridgeline plots). The code f

Leonardo Taccari 462 Jan 02, 2023
Visualizations for machine learning datasets

Introduction The facets project contains two visualizations for understanding and analyzing machine learning datasets: Facets Overview and Facets Dive

PAIR code 7.1k Jan 07, 2023
Area-weighted venn-diagrams for Python/matplotlib

Venn diagram plotting routines for Python/Matplotlib Routines for plotting area-weighted two- and three-circle venn diagrams. Installation The simples

Konstantin Tretyakov 400 Dec 31, 2022
Compute and visualise incidence (reworking of the original incidence package)

incidence2 incidence2 is an R package that implements functions and classes to compute, handle and visualise incidence from linelist data. It refocuss

15 Nov 22, 2022
Flexitext is a Python library that makes it easier to draw text with multiple styles in Matplotlib

Flexitext is a Python library that makes it easier to draw text with multiple styles in Matplotlib

Tomás Capretto 93 Dec 28, 2022
A simple interpreted language for creating basic mathematical graphs.

graphr Introduction graphr is a small language written to create basic mathematical graphs. It is an interpreted language written in python and essent

2 Dec 26, 2021
Custom ROI in Computer Vision Applications

EasyROI Helper library for drawing ROI in Computer Vision Applications Table of Contents EasyROI Table of Contents About The Project Tech Stack File S

43 Dec 09, 2022
Process dataframe in a easily way.

Popanda Written by Shengxuan Wang at OSU. Used for processing dataframe, especially for machine learning. The name is from "Po" in the movie Kung Fu P

ShawnWang 1 Dec 24, 2021
A central task in drug discovery is searching, screening, and organizing large chemical databases

A central task in drug discovery is searching, screening, and organizing large chemical databases. Here, we implement clustering on molecular similarity. We support multiple methods to provide a inte

NVIDIA Corporation 124 Jan 07, 2023