A simple library that implements CLIP guided loss in PyTorch.

Last update: Dec 26, 2022

Overview

pytorch_clip_guided_loss: PyTorch implementation of the CLIP guided loss for Text-To-Image, Image-To-Image, or Image-To-Text generation.

Install package

pip install pytorch_clip_guided_loss

Install the latest version

pip install --upgrade git+https://github.com/bes-dev/pytorch_clip_guided_loss.git

Features

The library supports multiple prompts (images or texts) as targets for optimization.
The library automatically detects the language of the input text and translates it via Google Translate.
The library supports the original CLIP model by OpenAI and the ruCLIP model by SberAI.
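
To make the multi-prompt feature more concrete, here is a minimal sketch of how several target embeddings could be combined into a single loss value. It is not the library's actual implementation: the function name `multi_prompt_loss`, the per-prompt weights, and the choice of 1 - cosine similarity as the distance are assumptions made for illustration, and random tensors stand in for real CLIP embeddings.

```python
import torch
import torch.nn.functional as F


def multi_prompt_loss(image_emb, prompt_embs, weights=None):
    """Weighted mean of (1 - cosine similarity) between one image and several prompts.

    image_emb:   tensor of shape (D,)
    prompt_embs: tensor of shape (P, D), one row per text or image prompt
    weights:     optional tensor of shape (P,); defaults to equal weights
    """
    if weights is None:
        weights = torch.ones(prompt_embs.shape[0])
    # Cosine similarity of the image embedding against every prompt: shape (P,)
    sims = F.cosine_similarity(image_emb.unsqueeze(0), prompt_embs, dim=-1)
    # Distance is 0 for aligned vectors and 2 for opposite ones.
    distances = 1.0 - sims
    return (weights * distances).sum() / weights.sum()


# Random vectors stand in for real CLIP embeddings (assumption for this sketch).
torch.manual_seed(0)
prompts = torch.randn(3, 8)
image = torch.randn(8, requires_grad=True)

loss = multi_prompt_loss(image, prompts, weights=torch.tensor([1.0, 0.5, 0.5]))
loss.backward()
print(loss.item(), image.grad.shape)
```

Because the result is a single scalar, one backward pass yields a gradient that pulls the image embedding toward all prompts at once, in proportion to their weights.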

Usage

Simple code

import torch
from pytorch_clip_guided_loss import get_clip_guided_loss

# build the loss module and freeze its weights
loss_fn = get_clip_guided_loss(clip_type="ruclip", input_range=(-1, 1)).eval().requires_grad_(False)
# text prompt
loss_fn.add_prompt(text="text description of what we would like to generate")
# image prompt
loss_fn.add_prompt(image=torch.randn(1, 3, 224, 224))

# image variable to optimize
var = torch.randn(1, 3, 224, 224).requires_grad_(True)
loss = loss_fn(image=var)["loss"]
loss.backward()
print(var.grad)
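
The snippet above computes a single gradient; in a generation pipeline that gradient usually drives an optimizer over many steps. Below is a minimal sketch of such a loop. To keep it runnable without downloading CLIP weights, it uses a hypothetical stand-in, `toy_loss_fn`, which only mimics the dictionary return value of `loss_fn(image=var)`. The helper `optimize_image`, the Adam optimizer, the learning rate, the step count, and the clamping step are illustrative choices, not part of the library.

```python
import torch


def toy_loss_fn(image, target):
    """Hypothetical stand-in for the CLIP guided loss: mean squared error to a fixed target.

    Returns a dict with a "loss" entry to mirror loss_fn(image=var)["loss"].
    """
    return {"loss": ((image - target) ** 2).mean()}


def optimize_image(loss_fn, var, steps=50, lr=0.1, input_range=(-1.0, 1.0)):
    """Run gradient descent directly on the pixels of `var` and return the loss history."""
    optimizer = torch.optim.Adam([var], lr=lr)
    history = []
    for _ in range(steps):
        optimizer.zero_grad()
        loss = loss_fn(var)["loss"]
        loss.backward()
        optimizer.step()
        # Keep the pixels inside the range the loss expects.
        with torch.no_grad():
            var.clamp_(*input_range)
        history.append(loss.item())
    return history


torch.manual_seed(0)
target = torch.rand(1, 3, 32, 32) * 2 - 1            # fixed target in [-1, 1]
var = torch.zeros(1, 3, 32, 32, requires_grad=True)  # image being optimized
history = optimize_image(lambda img: toy_loss_fn(img, target), var)
print(f"loss: {history[0]:.4f} -> {history[-1]:.4f}")
```

With the real library, the stand-in would presumably be swapped for a wrapper such as `lambda img: loss_fn(image=img)`, where `loss_fn` is the object returned by `get_clip_guided_loss`.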

VQGAN-CLIP

We provide a tiny implementation of the VQGAN-CLIP pipeline for image generation as an example of how to use the library. To start using our VQGAN-CLIP implementation, please follow the documentation.

Owner

Sergei Belousov
