Styled text-to-drawing synthesis method. Featured at the 2021 NeurIPS Workshop on Machine Learning for Creativity and Design

Last update: Dec 23, 2022

Overview

StyleCLIPDraw

Peter Schaldenbrand, Zhixuan Liu, Jean Oh

September 2021

To be featured in the 2021 NeurIPS Workshop on Machine Learning for Creativity and Design

StyleCLIPDraw adds a style loss to the CLIPDraw (Frans et al. 2021) (code) text-to-drawing synthesis model to allow artistic control of the synthesized drawings in addition to control of the content via text. Whereas performing decoupled style transfer on a generated image only affects the texture, our proposed coupled approach is able to capture a style in both texture and shape, suggesting that the style of the drawing is coupled with the drawing process itself.

Check out our code on Colab

Method

Unlike most other image generation models, CLIPDraw produces drawings consisting of a series of Bezier curves, each defined by a list of coordinates, a color, and an opacity. The drawing begins as randomized Bezier curves on a canvas and is optimized to fit the given style and text. The StyleCLIPDraw model architecture is shown above. The brush strokes are rendered into a raster image via a differentiable renderer.

StyleCLIPDraw has two losses, one for each input. For the text loss, the text input and the augmented raster drawing are fed to the CLIP model, and their embeddings are compared using cosine distance; this encourages the drawing to fit the text input. The image is augmented so that the optimization through CLIP does not settle on shallow solutions. For the style loss, the raster image and the style image are fed through early layers of the VGG-16 model, and the difference in extracted features forms the loss that encourages the drawing to fit the style of the style image.
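The way the two losses combine can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the project's implementation: plain lists stand in for CLIP embeddings and VGG-16 feature maps, the style loss is assumed to be a mean squared difference per layer, the text loss is assumed to be averaged over augmentations, and `style_weight` is a hypothetical parameter for balancing the two terms.

```python
import math
from typing import Sequence


def cosine_distance(u: Sequence[float], v: Sequence[float]) -> float:
    """Return 1 - cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def text_loss(text_emb, augmented_image_embs) -> float:
    """Average cosine distance between the text embedding and the
    embedding of each augmented copy of the drawing (assumed averaging)."""
    distances = [cosine_distance(text_emb, emb) for emb in augmented_image_embs]
    return sum(distances) / len(distances)


def style_loss(drawing_feats, style_feats) -> float:
    """Sum over layers of the mean squared difference between the drawing's
    features and the style image's features (assumed form of 'difference')."""
    total = 0.0
    for d_layer, s_layer in zip(drawing_feats, style_feats):
        total += sum((a - b) ** 2 for a, b in zip(d_layer, s_layer)) / len(d_layer)
    return total


def total_loss(text_emb, augmented_image_embs, drawing_feats, style_feats,
               style_weight: float = 1.0) -> float:
    """Combine both terms; style_weight is a hypothetical balancing factor."""
    return (text_loss(text_emb, augmented_image_embs)
            + style_weight * style_loss(drawing_feats, style_feats))


if __name__ == "__main__":
    # Toy stand-ins: 2-D "embeddings" and a single 2-element "feature layer".
    loss = total_loss(
        text_emb=[1.0, 0.0],
        augmented_image_embs=[[1.0, 0.0], [0.0, 1.0]],
        drawing_feats=[[1.0, 2.0]],
        style_feats=[[0.0, 2.0]],
        style_weight=2.0,
    )
    print(loss)
```

In an actual optimization loop, both terms would be computed on tensors so that gradients can flow back through the differentiable renderer to the curve coordinates, colors, and opacities; the sketch above only shows how the quantities relate.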


