images-that-sound
See sound and hear images with Spectrogram Visions.
Pitch

Images that Sound is a project that merges visual art and audio. It generates spectrograms that can be viewed as images and played as sound, so the same canvas is both a picture and an audio clip. Our repository provides the tools to generate and manipulate these audio-visual creations using diffusion models.

Description

Images that Sound explores where visual art meets audio. Developed by Ziyang Chen, Daniel Geng, and Andrew Owens at the University of Michigan, the project shows that a spectrogram can be generated with diffusion models so that it reads as an image when viewed and produces sound when played as audio.
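To make the "spectrogram as image" idea concrete, here is a minimal sketch that turns a short tone into a magnitude spectrogram using only the Python standard library. The function name `spectrogram`, the frame size, and the hop length are illustrative choices, not taken from the repository, which may compute its spectrograms differently.

```python
import cmath
import math


def spectrogram(samples, frame_size=64, hop=32):
    """Return a magnitude spectrogram as rows of frequency bins over time frames."""
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = samples[start:start + frame_size]
        # Hann window to reduce spectral leakage at the frame edges
        windowed = [
            s * 0.5 * (1 - math.cos(2 * math.pi * n / (frame_size - 1)))
            for n, s in enumerate(frame)
        ]
        # Naive DFT, keeping only the non-negative frequency bins
        column = []
        for k in range(frame_size // 2 + 1):
            acc = sum(
                x * cmath.exp(-2j * math.pi * k * n / frame_size)
                for n, x in enumerate(windowed)
            )
            column.append(abs(acc))
        frames.append(column)
    # Transpose so each row is one frequency bin and each column one time frame
    return [list(row) for row in zip(*frames)]


# A 1 kHz tone sampled at 8 kHz (hypothetical test signal)
sample_rate = 8000
tone = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(512)]
spec = spectrogram(tone)
```

The result is a 2D grid of non-negative values, which is exactly the shape of a grayscale image: rows are frequencies, columns are time steps, and brightness is energy. A steady tone shows up as a single bright horizontal line.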

See the research paper and the project page for further details.

Key Features

  • Generate images that sound with a multimodal denoising method built on diffusion models.
  • Use pretrained models such as Stable Diffusion v1.5 and Auffusion for high-quality output.
  • Compare against two baseline methods, Imprint and SDS.
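One way to read "multimodal denoising" is that, at each denoising step, an image model and an audio model each estimate the noise in the same latent, and the two estimates are blended before the update. The sketch below shows only that blending step on plain Python lists. It is an assumption about the general idea, not the repository's implementation: `combine_noise_estimates` and `image_weight` are hypothetical names, and the real code operates on model tensors.

```python
def combine_noise_estimates(eps_image, eps_audio, image_weight=0.5):
    """Blend two noise estimates for the same latent, element by element.

    eps_image and eps_audio stand in for the outputs of an image diffusion
    model and an audio diffusion model (hypothetical inputs for illustration).
    """
    if len(eps_image) != len(eps_audio):
        raise ValueError("both estimates must describe the same latent")
    if not 0.0 <= image_weight <= 1.0:
        raise ValueError("image_weight must be between 0 and 1")
    audio_weight = 1.0 - image_weight
    return [
        image_weight * e_img + audio_weight * e_aud
        for e_img, e_aud in zip(eps_image, eps_audio)
    ]


# Equal weighting gives the plain average of the two estimates
blended = combine_noise_estimates([1.0, 2.0], [3.0, 4.0])
```

Shifting `image_weight` toward 1 would favor the image model's view of the latent, and shifting it toward 0 would favor the audio model's.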

Usage Examples

To create images that sound using our multimodal denoising method:

python src/main_denoise.py experiment=examples/bell

For the Imprint baseline method, you can run:

python src/main_imprint.py experiment=examples/bell

To use the SDS baseline, execute:

python src/main_sds.py experiment=examples/bell
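The three commands above differ only in the script they call, so a small helper can build them for any experiment. This is a convenience sketch, not part of the repository: `SCRIPTS` and `build_command` are hypothetical names, and only the script paths and the `experiment=` argument come from the commands shown above.

```python
import shlex

# Script paths as given in the usage examples
SCRIPTS = {
    "denoise": "src/main_denoise.py",
    "imprint": "src/main_imprint.py",
    "sds": "src/main_sds.py",
}


def build_command(method, experiment="examples/bell"):
    """Return the argument list for one method and experiment config."""
    if method not in SCRIPTS:
        raise ValueError(f"unknown method: {method!r}")
    return ["python", SCRIPTS[method], f"experiment={experiment}"]


# Print the command line for each method without running it
for method in SCRIPTS:
    print(shlex.join(build_command(method)))
```

To actually launch a run, the returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root.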

We also provide a colorization script that turns generated samples into colorized videos paired with their audio. To generate colorized videos, run:

python src/colorization/create_color_video.py \
  --sample_dir /path/to/generated/sample/dir \
  --prompt "a colorful photo of [object]" \
  --num_samples 16 --guidance_scale 10 \
  --num_inference_steps 30 --start_diffusion_step 7

Acknowledgements

We are grateful to the open-source community. This project relies on resources and tools such as Lightning-Hydra-Template and diffusers, among many others.

Explore the fascinating blend of imagery and sound with Images that Sound and immerse yourself in an entirely new multimedia experience!