Nodey

Nodes Glossary

Here’s a list of all the nodes available within the platform, along with their respective functionalities.

Nodey Nodes Glossary

Category

Node Name

Description

Audio

Audio Concat

The Audio Concat Node takes two or more separate audio streams and joins them chronologically. The output is a single audio file or stream where the second input begins immediately after the first ends.

Expects Inputs:
-Audio Audio 1: Primary clip.
-Audio Audio 2: Clip to append.

Provides Outputs:
-Audio Concatenated Audio: The full sequential stream.
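
As a rough illustration of what joining clips chronologically means at the sample level, the sketch below appends one clip to another. The `concat_audio` helper and the list-of-floats representation are assumptions made for this example, not the node's actual implementation, and both clips are assumed to share one sample rate.

```python
def concat_audio(*clips):
    """Join two or more clips end to end.

    Hypothetical helper: each clip is a list of float samples, and all
    clips are assumed to share one sample rate.
    """
    if len(clips) < 2:
        raise ValueError("expected at least two clips")
    joined = []
    for clip in clips:
        joined.extend(clip)  # the next clip starts right after the previous one ends
    return joined


audio_1 = [0.1, 0.2, 0.3]
audio_2 = [-0.5, 0.5]
print(concat_audio(audio_1, audio_2))  # [0.1, 0.2, 0.3, -0.5, 0.5]
```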

Audio

Audio Gain

The Audio Gain Node scales the magnitude of an audio signal. The node multiplies every sample in the digital audio stream by a specific value.

Expects Inputs:
-Audio Audio: Initial clip.
-Float Gain: The multiplier applied to every sample. A gain of 0.5 halves the amplitude (about -6 dB), and a gain of 2.0 doubles it (about +6 dB).

Provides Outputs:
-Audio Audio with Gain: Scaled audio output.
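
The relationship between a linear gain value and decibels can be checked with a short sketch. The helper names `apply_gain` and `gain_to_db` are hypothetical and samples are modeled as plain floats; the conversion itself is the standard amplitude formula, 20 * log10(gain).

```python
import math


def apply_gain(samples, gain):
    """Multiply every sample by the gain value (hypothetical helper)."""
    return [s * gain for s in samples]


def gain_to_db(gain):
    """Convert a linear amplitude gain to decibels."""
    return 20.0 * math.log10(gain)


print(apply_gain([0.5, -0.25, 1.0], 0.5))  # [0.25, -0.125, 0.5]
print(round(gain_to_db(0.5), 2))           # -6.02
print(round(gain_to_db(2.0), 2))           # 6.02
```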

Audio

Audio Mix

The Audio Mix Node combines multiple incoming audio signals into a single output stream by mathematically summing their waveforms.

Expects Inputs:
-Audio Audio 1: Initial clip.
-Audio Audio 2: Second clip to mix with Audio 1.
-Float Mix: Crossfades the amplitude between the two inputs using a center-weighted blend. At -1.0, Audio 2 is fully attenuated. As the value moves toward 1.0, Audio 1 fades out while Audio 2 fades in. At 0.0, both signals are summed at equal volume.
-Float Audio 2 Delay (secs): Offsets the start time of the second input in seconds.

Provides Outputs:
-Audio Mixed Audio: Mixed audio file.
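
One way to picture the Mix and Audio 2 Delay inputs is the sketch below. It assumes a linear center-weighted curve that matches the three points described above (-1.0, 0.0, and 1.0); the node's real curve is not documented here and may differ. The helper names, the list-of-floats representation, and the tiny sample rate are all illustrative.

```python
def mix_gains(mix):
    """Return (gain_1, gain_2) for a mix value in [-1.0, 1.0].

    Assumed linear curve: both gains are 1.0 at the center, and one
    input fades to 0.0 as the value moves to either extreme.
    """
    mix = max(-1.0, min(1.0, mix))
    return min(1.0, 1.0 - mix), min(1.0, 1.0 + mix)


def mix_audio(audio_1, audio_2, mix=0.0, delay_secs=0.0, sample_rate=4):
    """Sum two clips, offsetting the second by delay_secs (hypothetical helper)."""
    gain_1, gain_2 = mix_gains(mix)
    offset = max(0, int(round(delay_secs * sample_rate)))
    mixed = [0.0] * max(len(audio_1), offset + len(audio_2))
    for i, sample in enumerate(audio_1):
        mixed[i] += gain_1 * sample
    for i, sample in enumerate(audio_2):
        mixed[offset + i] += gain_2 * sample
    return mixed


print(mix_gains(-1.0))  # (1.0, 0.0)
print(mix_gains(0.0))   # (1.0, 1.0)
print(mix_gains(1.0))   # (0.0, 1.0)
print(mix_audio([0.5, 0.5, 0.5, 0.5], [0.25, 0.25], delay_secs=0.5))
# [0.5, 0.5, 0.75, 0.75]
```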

Audio

Audio Trim

The Audio Trim Node extracts a specific segment of audio and defines its timing and volume fades within the timeline.

Expects Inputs:
-Audio Audio: The full audio stream.
-Float Start (secs): The timestamp in the source at which the segment begins.
-Float Duration (secs): The length of the segment to keep.
-Float Delay (secs): The offset/start time on the main timeline.
-Float Fade In (secs): Volume ramp duration at the start.
-Float Fade Out (secs): Volume ramp duration at the end.

Provides Outputs:
-Audio Trimmed Audio: The processed audio segment.
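
The sketch below shows how the five timing inputs could interact. It assumes linear fade ramps and models Delay as leading silence, neither of which is confirmed for the actual node; `trim_audio` and the list-of-floats representation are hypothetical.

```python
def trim_audio(samples, sample_rate, start, duration,
               delay=0.0, fade_in=0.0, fade_out=0.0):
    """Cut a segment, apply linear fades, and prepend silence for the delay."""
    begin = int(round(start * sample_rate))
    end = begin + int(round(duration * sample_rate))
    segment = list(samples[begin:end])
    count = len(segment)

    fade_in_len = min(count, int(round(fade_in * sample_rate)))
    fade_out_len = min(count, int(round(fade_out * sample_rate)))
    for i in range(fade_in_len):
        segment[i] *= i / fade_in_len               # ramp up from silence
    for i in range(fade_out_len):
        segment[count - 1 - i] *= i / fade_out_len  # ramp down to silence

    silence = [0.0] * int(round(delay * sample_rate))
    return silence + segment


source = [1.0] * 10  # five seconds at 2 samples per second
print(trim_audio(source, 2, start=1.0, duration=3.0,
                 delay=0.5, fade_in=1.0, fade_out=1.0))
# [0.0, 0.0, 0.5, 1.0, 1.0, 0.5, 0.0]
```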

GenAI/3D

Meshy Image to 3D

The Meshy 3D Model Node converts 2D image inputs and text prompts into a 3D model with AI-generated texturing.

Expects Inputs:
-Image Model Image: The primary reference image for geometry generation.
-Image Alt Angle (optional): Up to three additional images to provide the AI with more perspective on the object.
-Integer Max Triangles: A cap on the mesh density (polygon count) to optimize performance.
-Text Texture Prompt: A text description guiding the look, material, and style of the surface.
-Image Texture Image: A reference image to guide specific patterns or material colors.
-Text Texture: Select No Texture to disable texture creation.
-Text Topology: Select either Triangle or Quad for topology.
-Text AI Model: The model to use.

Provides Outputs:
-Model-3D Generated Model: The resulting 3D model file, typically including vertex data and mapped textures.

GenAI/3D

Meshy Retopologizer

The Meshy Retopologizer Node rebuilds the polygon topology of a 3D model using AI. It replaces dense, irregular meshes (common in AI-generated or sculpted models) with cleaner, more efficient geometry suitable for animation, real-time rendering, or further production work.

Expects Inputs:
-Model-3D Model: The source 3D mesh to be retopologized.
-Integer Max Polygons: A cap on the total polygon count for the retopologized mesh.
-Float Height (meters): The target real-world height of the model in meters, used for correct scaling.
-Text Set Origin: Select Origin at the bottom to reposition the model's origin point to the bottom center of the bounding box, which is standard for placing characters on ground planes.
-Text Topology: Select either Triangle or Quad for topology.

Provides Outputs:
-Model-3D Retopologized Model: The rebuilt 3D model with optimized topology.

GenAI/3D

Meshy Rigger

The Meshy Rigger Node automatically generates a skeletal rig for a 3D model using AI, making it ready for animation. It analyzes the mesh geometry to place bones and joint hierarchies appropriate for the model's shape. Note that this process will strip any existing textures from the model.

Expects Inputs:
-Model-3D Model: The source 3D mesh to be rigged.
-Float Height (meters): The target real-world height of the model in meters, used to correctly scale the skeleton.
-Text Animate: Select Animate to apply a default animation cycle to the rigged model for immediate preview.

Provides Outputs:
-Model-3D Rigged Model: The 3D model with an embedded skeletal rig ready for animation.

GenAI/3D

Meshy Text to 3D

The Meshy Text to 3D Node generates a 3D model from a text description using AI. Unlike the image-based Meshy Node, this variant relies entirely on a written prompt to define the object's shape and appearance.

Expects Inputs:
-Text Prompt: A text description of the 3D object to generate (e.g., ‘a medieval wooden shield’).
-Integer Max Triangles: A cap on the mesh density (polygon count) to optimize performance.
-Text Texture Prompt: A text description guiding the look, material, and style of the surface.
-Image Texture Image: A reference image to guide specific patterns or material colors.
-Text Texture: Select No Texture to disable texture creation.
-Text Create Humanoid: Select Humanoid to generate a bipedal character model.
-Text Topology: Select either Triangle or Quad for topology.
-Text AI Model: The model to use.

Provides Outputs:
-Model-3D Generated Model: The resulting 3D model file, typically including vertex data and mapped textures.

GenAI/3D

Meshy Textureizer

The Meshy Textureizer Node applies AI-generated textures to an existing 3D model. It takes a bare or previously textured mesh and re-skins it based on a text prompt and optional reference image, allowing rapid iteration on surface appearance without regenerating the geometry.

Expects Inputs:
-Model-3D Model: The source 3D mesh to be textured.
-Text Texture Prompt: A text description guiding the desired material, color, and surface style.
-Image Texture Image: A visual reference to guide the patterns or colors of the texture map.
-Text Original UVs: Select Generate UVs to allow the model to generate new UVs.
-Text AI Model: The model to use.

Provides Outputs:
-Model-3D Textured Model: The 3D model with newly generated texture maps applied.

GenAI/3D

Sharp Splat from Image

Create a Gaussian splat using the Apple Sharp model.

Expects Inputs:
-Image Image: The image to use for the Gaussian splat generation.
-Text Host: The host for the Sharp service.

Provides Outputs:
-Model-splat Generated Splat: The generated Gaussian splat.

GenAI/3D

Tripo Image to 3D

The Tripo 3D Model Node generates a 3D mesh and high-fidelity textures from a primary reference image and text-based style prompts.

Expects Inputs:
-Image Model Image: The primary reference for geometry generation.
-Image Alt Angle x3: Additional perspectives to improve 3D reconstruction accuracy.
-Integer Max Triangles: Sets the polygon limit for the generated mesh.
-Integer Seed: Sets a seed value for generation.
-Texture: Set to true to inhibit texture generation.

Provides Outputs:
-Model-3D Generated Model: The finalized 3D mesh data.

GenAI/3D

World Labs Image to Splat

Generates a Gaussian splat from an image using World Labs Marble.

Expects Inputs:
-Text Prompt (optional): Text prompt to guide the generation.
-Image Front Image: The front-facing source image.
-Image Right Image (optional): The right-facing reference image.
-Image Back Image (optional): The back-facing reference image.
-Image Left Image (optional): The left-facing reference image.
-Image Front Right (optional): The front-right facing reference image.
-Image Back Right (optional): The back-right facing reference image.
-Image Back Left (optional): The back-left facing reference image.
-Image Front Left (optional): The front-left facing reference image.
-Boolean Image Is Panorama: Whether the input image is a panoramic image.
-Integer Seed (optional): If set, will influence the determinism of the generation.
-String Model: Select the desired model.

Provides Outputs:
-Model-splat Generated World: The generated Gaussian splat world.
-Model-3D Collider Model: A 3D collider model for the generated world.
-Image Panorama Image: A panoramic image of the generated world.

GenAI/3D

World Labs Text to Splat

Generates a Gaussian splat from a text prompt using World Labs Marble.

Expects Inputs:
-Text Prompt: The text prompt describing the world to generate.
-Integer Seed (optional): If set, will influence the determinism of the generation.
-String Model: Select the desired model.

Provides Outputs:
-Model-splat Generated World: The generated Gaussian splat world.
-Model-3D Collider Model: A 3D collider model for the generated world.
-Image Panorama Image: A panoramic image of the generated world.

GenAI/3D

World Labs Video to Splat

Generates a Gaussian splat from a video using World Labs Marble.

Expects Inputs:
-Text Prompt (optional): Text prompt to guide the generation.
-Video Video: The source video to generate the world from.
-Integer Seed (optional): If set, will influence the determinism of the generation.
-String Model: Select the desired model.

Provides Outputs:
-Model-splat Generated World: The generated Gaussian splat world.
-Model-3D Collider Model: A 3D collider model for the generated world.
-Image Panorama Image: A panoramic image of the generated world.

GenAI/Beeble

Beeble SwitchX

Performs video-to-video generation using the Beeble AI SwitchX model.

Expects Inputs:
-Video Video: Source video to be modified.
-Text Prompt: Prompt guiding the video modification. Either Prompt or Reference Image is required.
-Image Reference Image: Image guiding the video modification. Either Prompt or Reference Image is required. Also required if the Alpha Mode is set to Select Alpha mode.
-Video Alpha Video: A video of the alpha (matte) channel, required if the Alpha Mode is set to Custom Alpha mode.
-Text Max Resolution: The desired resolution of the generated video.

Provides Outputs:
-Video Generated Video: The generated content video.
-Video Generated Alpha: The generated alpha video.

GenAI/ElevenLabs

Voice Aggregator

Utility node to assist in processing ElevenLabs nodes that output both an audio sample and an ID. It is a pass-through node.

Expects Inputs:
-Audio Audio: The audio file.
-Text ID: The ID text.

Provides Outputs:
-Audio Audio: The audio file.
-Text ID: The ID text.

GenAI/ElevenLabs

Voice Changer

Transforms audio from one voice to another while maintaining full control over emotion, timing, and delivery, using the ElevenLabs Speech-to-Speech API.

Expects Inputs:
-Audio Input Audio: The source audio with the voice to be changed.
-Text Voice ID: The Voice ID text from ElevenLabs voices, or a Voice ID created with the ElevenLabs Voice Creator node.
-JSON Settings: Optional additional voice settings as JSON; see the ElevenLabs documentation for examples.
-Integer Seed: Optional GenAI seed to guide repeatable generations.
-Boolean Remove Noise: Set to Remove Noise to automatically remove background noise.
-Text Model: Model name to use.
-Text Output Format: The desired output format, in the form of Format, Sample Rate, Bitrate.

Provides Outputs:
-Audio Generated Audio: The audio with the voice changed to the voice defined by the Voice ID.

GenAI/ElevenLabs

Voice Creator

Create a voice from a previously generated voice preview using the ElevenLabs Create a voice API.

Expects Inputs:
-Text Voice Name: A unique name for the voice that will be created.
-Text Voice Descrip: A description of the new voice.
-Text Gen Voice ID: A Generated Voice ID from the ElevenLabs Voice Designer node.
-JSON Labels: Optional text-to-text map of desired labels for the voice.

Provides Outputs:
-Text Voice ID: A newly minted Voice ID that can be used with the ElevenLabs Voice Changer Node.
-Audio Preview Audio: Example audio with the new voice.
-Text Name: The name of the voice that was created.
-Text Category: The category that the created voice is in.
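
For the Labels input, a text-to-text map is simply a JSON object whose keys and values are all strings. The sketch below builds one; the `make_labels` helper and the label names `accent` and `age` are illustrative assumptions, not a list of labels that ElevenLabs requires.

```python
import json


def make_labels(**labels):
    """Serialize a text-to-text label map to JSON (hypothetical helper)."""
    for key, value in labels.items():
        if not isinstance(value, str):
            raise TypeError(f"label {key!r} must map to text, got {type(value).__name__}")
    return json.dumps(labels, sort_keys=True)


print(make_labels(accent="british", age="young"))
# {"accent": "british", "age": "young"}
```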

GenAI/ElevenLabs

Voice Designer

Design a voice via a prompt using the ElevenLabs Voice Design API.

Expects Inputs:
-Text Voice Descrip: A detailed description of the voice to be created, including elements like intonation, pacing, quality, etc.
-Text Text: Optional text for the created voice to read. If not provided, ElevenLabs will generate sample text automatically.
-Audio Ref Audio: Optional reference audio. May only be used with the Eleven TTV v3 model.
-Float Loud: Optional loudness setting that controls the volume level of the generated voice. -1 is quietest, 1 is loudest, and 0 corresponds to roughly -24 LUFS.
-Integer Seed: Optional GenAI seed to guide repeatable generations.
-Float Guidance: Optional Guidance Scale influencing how closely the voice adheres to the prompt, 0 being very loose and 100 being very tight.
-Text Remix ID: Optional Remixing Session ID as generated by the ElevenLabs Remixing node.
-Text Remix Iter ID: Optional Remixing iteration value, when iterating on remixes.
-Float Quality: Optional value between -1 and 1, trading off variability and quality, with -1 being highly variable and 1 being high quality.
-Float Strength: Optional value between 0 and 1, influencing how strongly the voice adheres to the description.
-Boolean Enhance: Set to Enhance to pre-process the reference audio.
-Text Model: Model name to use.
-Text Output Format: The desired output format, in the form of Format, Sample Rate, Bitrate.

Provides Outputs:
-Text Gen Voice ID: The Generated Voice ID that can be used with the ElevenLabs Voice Creator node.
-Audio Generated Audio: A sample audio created with the generated voice.

GenAI/ElevenLabs

Voice Remixer

Remix an existing voice via a prompt using the ElevenLabs Voice Remix API.

Expects Inputs:
-Text Voice ID: An ElevenLabs Voice ID on which to base the remixed voice.
-Text Voice Descript: A detailed description of the voice to be created, including elements like intonation, pacing, quality, etc.
-Text Text: Optional text for the created voice to read. If not provided, ElevenLabs will generate sample text automatically.
-Float Loud: Optional loudness setting, controls the volume level of the generated voice. -1 is quietest, 1 is loudest, 0 corresponds to roughly -24 LUFS.
-Integer Seed: Optional GenAI seed to guide repeatable generations.
-Float Guidance: Optional Guidance Scale influencing how closely the voice adheres to the prompt, 0 being very loose and 100 being very tight.
-Text Remix ID: Optional Remixing Session ID as generated by the ElevenLabs Remixing node.
-Text Remix Iter ID: Optional Remixing iteration value, when iterating on remixes.
-Float Strength: Optional value between 0 and 1, influencing how strongly the voice adheres to the description.
-Text Output Format: The desired output format, in the form of Format, Sample Rate, Bitrate.

Provides Outputs:
-Text Gen Voice ID: The Generated Voice ID that can be used with the ElevenLabs Voice Creator node.
-Audio Generated Audio: A sample audio created with the generated voice.

GenAI/Google

Chirp 3 Custom Voice TTS

Synthesizes speech using a custom cloned voice via Google's Chirp 3 API.

Expects Inputs:
-Text What to Say: The text you want spoken.
-Text Voice Cloning Key: The secret key for your cloned custom voice.
-Text Language Code: The language you want the speech generated in.

Provides Outputs:
-Audio Generated Audio: The synthesized audio stream.

GenAI/Google

Chirp 3 Custom Voice

Generates a voice cloning key using Google's Chirp 3 Instant Custom Voice.

Expects Inputs:
-Audio Reference Audio: An audio file containing the target voice to clone, about 10 seconds long. An example script: 'To build a great synthetic voice, we need to capture a wide variety of sounds. From the sharpest consonants to the smoothest vowels, every single syllable matters.'
-Audio Consent Audio: An audio file containing the consent statement read by the voice actor, about 10 seconds long. The script: 'I am the owner of this voice, and I consent to Google using this voice to create a synthetic voice model.'
-Text Language Code: The language of the audio (e.g., en-us).

Provides Outputs:
-Text Voice Cloning Key: The generated cloning key text, which can be passed as a voice_name to TTS nodes.

GenAI/Google

Chirp 3 HD TTS

Synthesizes ultra-realistic, emotionally resonant speech using Google's generative Chirp 3 HD Voices API.

Expects Inputs:
-Text What to Say: The text to be converted to speech.
-Text Voice Name: The exact name of the HD voice model.
-Text Language Code: The language code mapping to the selected voice.

Provides Outputs:
-Audio Generated Audio: The synthesized audio stream.

GenAI/Google

Gemini Flash2.5 Image

Generates high-quality images from text descriptions and optional reference images using the Gemini Flash 2.5 model.

Expects Inputs:
-Text Prompt: The text description detailing the subject, style, lighting, and composition of the desired image.
-Image Ref Image (optional) x2: Visual references that the AI uses to influence the style, structure, or content of the new generation.
-Text / Enum Aspect Ratio: The framing dimensions for the output (e.g., 1:1, 16:9, 9:16).
-Integer Seed: A specific number used to initialize the generation; using the same seed with the same prompt will produce the same result.

Provides Outputs:
-Image Generated Image: The final AI-generated image file.

GenAI/Google

Gemini Flash2.5 Isolate

Leverages vision-language models to identify a specific object in an image based on a text query and crop the image around it.

Expects Inputs:
-Text Item to Isolate: A natural language description of the object you want to extract (e.g., 'the blue coffee mug').
-Image Image: The source image containing the item.

Provides Outputs:
-Image Isolated Image: A smaller image file cropped in a square around the prompted item.
-Image Isolated Mask: A mask representation of that image.

GenAI/Google

Gemini Flash2.5 Segment

Leverages vision-language models to identify and segment specific objects from an image based on a text query.

Expects Inputs:
-Text Item to Segment: A natural language description of the object you want to extract (e.g., ‘the blue coffee mug’).
-Image Image: The source image containing the item.
-Integer Mask Threshold: Controls the tightness of the resulting mask.

Provides Outputs:
-Image Segmented Image: The segmented image.
-Image Segmented Mask: The segmentation mask.

GenAI/Google

Gemini Flash2.5 Text

Processes natural language prompts to generate creative copy, analyze data, or describe visual inputs. It can utilize reference images to provide high-context answers or follow specific Agent Instructions for tailored personas.

Expects Inputs:
-Text Prompt: The primary text instruction or question for the AI.
-Image Ref Image (optional) x2: Visual context for the AI to ‘look at’ when generating its response.
-Text Agent Instructions: High-level system instructions to define the AI’s behavior, tone, or specific formatting requirements.
-Integer Seed: A numerical value used to ensure reproducible text outputs.

Provides Outputs:
-Text Generated Text: The finalized text generated by the model.

GenAI/Google

Gemini Flash2.5 Transcribe

Utilizes Gemini’s multimodal capabilities to listen to audio streams and generate highly accurate transcriptions. It can distinguish between different voices and provide precise timing for when each word or sentence was spoken.

Expects Inputs:
-Audio Audio: The source sound file or stream to be transcribed.
-Boolean Include Timestamps: A toggle to determine if the output should include start and end times for the transcribed text.
-Boolean Include Speakers: A toggle to enable diarization (identifying and labeling different speakers in the audio).

Provides Outputs:
-Text Generated Text: The finalized transcription text, formatted based on the input settings.

GenAI/Google

Gemini Flash2.5 Text-to-Speech

Transforms text into synthetic speech.

Expects Inputs:
-Text What to Say: The actual text content to be converted into speech.
-Text voice or voice1, voice...: Specifies the desired voice model (e.g., en-US-Studio-O or en-US-Neural2-D) or a prioritized list of voices to use for the generation.
-Text Language Code: The BCP-47 language tag (e.g., en-US, fr-FR) to ensure correct pronunciation and accent.

Provides Outputs:
-Audio Generated Audio: The generated synthetic speech audio stream.

GenAI/Google

Gemini Flash3.1 Image

Generates high-quality images from text descriptions and optional reference images using the Gemini Flash 3.1 model.

Expects Inputs:
-Text Prompt: Detailed text description of the image to be generated.
-Image Ref Image (optional) x14: Multiple visual references used to guide the AI on style, layout, or specific details.
-Text Aspect Ratio: Defines the frame dimensions (e.g., 16:9, 1:1, 9:16).
-Text Size (512, 1K, 2K, or 4K): Determines the output resolution and detail density of the final image.
-Integer Seed: A numerical value to ensure reproducible results or for iterative tweaking of a specific generation.

Provides Outputs:
-Image Generated Image: The finalized AI-generated image asset.

GenAI/Google

Gemini Pro3.1 Media Inspector

Utilizes Gemini 3.1 Pro capabilities to inspect multiple forms of media (up to 8 images, 4 videos, and 1 audio file) and provide a text response based on a prompt.

Expects Inputs:
-Text Prompt: The text prompt sent to the model (required).
-Image Images: Up to 8 images.
-Video Videos: Up to 4 videos.
-Audio Audio: 1 audio file.

Provides Outputs:
-Text Generated Text: The response from the model.
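
The media limits above (up to 8 images, 4 videos, and 1 audio file) can be checked before a graph is run. This is a minimal sketch with a hypothetical `check_media` helper; how the node itself reacts to extra inputs is not described here.

```python
LIMITS = {"images": 8, "videos": 4, "audio": 1}  # limits quoted in the description


def check_media(images=(), videos=(), audio=()):
    """Return a list of limit violations; an empty list means the inputs fit."""
    counts = {"images": len(images), "videos": len(videos), "audio": len(audio)}
    return [
        f"{kind}: {count} supplied, at most {LIMITS[kind]} allowed"
        for kind, count in counts.items()
        if count > LIMITS[kind]
    ]


print(check_media(images=["a.png"] * 8, videos=["b.mp4"]))  # []
print(check_media(images=["a.png"] * 9))
# ['images: 9 supplied, at most 8 allowed']
```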

GenAI/Google

Gemini Pro3.1 Text

Processes natural language prompts to generate creative copy, analyze data, or describe visual inputs. It can utilize reference images to provide high-context answers or follow specific Agent Instructions for tailored personas.

Expects Inputs:
-Text Prompt: The primary text instruction or question for the AI.
-Image Ref Image (optional) x2: Visual context for the AI to ‘look at’ when generating its response.
-Text Agent Instructions: High-level system instructions to define the AI’s behavior, tone, or specific formatting requirements.
-Integer Seed: A numerical value used to ensure reproducible text outputs.

Provides Outputs:
-Text Generated Text: The AI-generated text response.

GenAI/Google

Gemini Pro3.0 Audio Inspector

Utilizes Gemini’s multimodal capabilities to listen to audio streams and process as requested in the prompt.

Expects Inputs:
-Text Prompt: The prompt to be sent to the model.
-Audio Audio: The source sound file or stream to be processed.

Provides Outputs:
-Text Generated Text: The finalized response from the model.

GenAI/Google

Gemini Pro3.0 Image

Generates high-quality images from text descriptions and optional reference images using the Gemini Pro 3.0 model.

Expects Inputs:
-Text Prompt: Detailed text description of the image to be generated.
-Image Ref Image (optional) x6: Multiple visual references used to guide the AI on style, layout, or specific details.
-Text Aspect Ratio: Defines the frame dimensions (e.g., 16:9, 1:1, 9:16).
-Text Size (1K, 2K, or 4K): Determines the output resolution and detail density of the final image.
-Integer Seed: A numerical value to ensure reproducible results or for iterative tweaking of a specific generation.

Provides Outputs:
-Image Generated Image: The finalized AI-generated image asset.

GenAI/Google

Gemini Pro3.0 Text

Processes natural language prompts to generate creative copy, analyze data, or describe visual inputs. It can utilize reference images to provide high-context answers or follow specific agent instructions for tailored personas.

Expects Inputs:
-Text Prompt: The primary text instruction or question for the AI.
-Image Ref Image (optional) x2: Visual context for the AI to 'look at' when generating its response.
-Text Agent Instructions: High-level system instructions to define the AI’s behavior, tone, or specific formatting requirements.
-Integer Seed: A numerical value used to ensure reproducible text outputs.

Provides Outputs:
-Text Generated Text: The finalized text generated by the model.

GenAI/Google

Imagen 3.0 Mask Inpaint

Allows for detailed image editing by painting new content into a defined area of an existing image. It utilizes a background image as a base and a mask to designate which pixels the AI should regenerate based on the provided description.

Expects Inputs:
-Text Describe Inpainting: A text prompt detailing exactly what should be generated within the masked area.
-Image Background Image: The original source image that serves as the base for the edit.
-Image Mask Image (optional): A grayscale or alpha mask where white/opaque areas indicate the region to be changed and black/transparent areas remain untouched.

Provides Outputs:
-Image Inpainted Image: The finalized image with the specified region inpainted.
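
The mask convention (white is regenerated, black is kept) can be pictured as a per-pixel blend. The sketch below is only an illustration of that convention on tiny grayscale grids; it assumes 8-bit mask values and a linear blend, and it does not describe how Imagen composites the result internally.

```python
def composite(background, generated, mask):
    """Blend generated pixels over the background using an 8-bit mask.

    Mask value 255 (white) takes the generated pixel, 0 (black) keeps
    the background pixel, and values in between mix the two.
    """
    result = []
    for bg_row, gen_row, mask_row in zip(background, generated, mask):
        row = []
        for bg, gen, m in zip(bg_row, gen_row, mask_row):
            weight = m / 255.0
            row.append(round(weight * gen + (1.0 - weight) * bg))
        result.append(row)
    return result


background = [[10, 10], [10, 10]]
generated = [[200, 200], [200, 200]]
mask = [[0, 255], [0, 255]]  # only the right column is regenerated
print(composite(background, generated, mask))  # [[10, 200], [10, 200]]
```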

GenAI/Google

Imagen 3.0 Mask Outpaint

Used for generative expansion of an image (uncropping), allowing artists to create larger scenes while maintaining the style and content of the original background.

Expects Inputs:
-Text Describe Outpainting: A text prompt detailing what should be generated in the expanded areas.
-Image Background Image: The source image to be expanded.
-Image Mask Image (optional): Defines the specific area outside the original image boundaries to be generated.
-Boolean Blend Original: A toggle to enable or disable feathered blending between the source image and the newly generated content.

Provides Outputs:
-Image Outpainted Image: The finalized, expanded image asset.

GenAI/Google

Imagen 4.0 Super-Res

Increases the pixel dimensions of an input image while intelligently reconstructing textures and edges. This is essential for bringing low-resolution generations or cropped assets up to production standards (2K/4K) without the blurring typical of standard bicubic upscaling.

Expects Inputs:
-Image Input Image: The source image to be upscaled.
-Integer Scale (2, 3, or 4): The multiplication factor for the resolution (e.g., a scale of 2 turns a 1080p image into 4K).

Provides Outputs:
-Image Super-Res Image: The high-resolution, enhanced version of the source image.
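
The scale arithmetic is simple enough to verify directly. The `upscaled_size` helper below is hypothetical and only reproduces the multiplication described for the Scale input.

```python
def upscaled_size(width, height, scale):
    """Return the output dimensions for a scale of 2, 3, or 4."""
    if scale not in (2, 3, 4):
        raise ValueError("scale must be 2, 3, or 4")
    return width * scale, height * scale


print(upscaled_size(1920, 1080, 2))  # (3840, 2160)
print(upscaled_size(1024, 1024, 4))  # (4096, 4096)
```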

GenAI/Google

Lyria2.0 Music

Creates high-quality 48kHz stereo audio compositions by translating natural language descriptions into fully realized musical pieces.

Expects Inputs:
-Text Music Prompt: A descriptive text in English detailing the desired genre, mood, instrumentation, and style (e.g., ‘A calm acoustic folk song with a gentle guitar melody’).
-Text Negative Prompt: A description of specific elements, such as vocals, drums, or fast tempo, that the model should exclude from the generated audio.
-Integer Seed: A specific value used for deterministic generation, ensuring that the same prompt and parameters produce the same audio output.

Provides Outputs:
-Audio Generated Music: A generated instrumental audio track, typically up to 30 seconds in length, provided in a high-quality format like WAV.

GenAI/Google

Lyria 3 Music

Generates high-quality audio using the Google Lyria 3 REST API. Provide a text prompt and an optional image URI, then choose a lyria-3 model.

Expects Inputs:
-Text Music Prompt: The prompt guiding the generation of music.
-Image Image: Optional reference image to guide the generation of music.
-String Model: The Lyria model to use.

Provides Outputs:
-Audio Generated Music: The generated music.
-Text Lyrics: Any lyrics generated in the music.
-Text Description: A description of the generated music.

GenAI/Google

Veo3

Produces high-quality video sequences from natural language descriptions using Veo3.

Expects Inputs:
-Text Prompt: A detailed description of the scene, including character actions, lighting, and camera movement.
-Image Image (optional): A reference image used to guide the initial frame, visual style, or character design for the video generation.
-Text Negative Prompt: Specific elements, visual styles, or camera behaviors to be excluded from the generated sequence.
-Boolean Portrait: A toggle to switch between standard widescreen (landscape) and vertical (portrait) aspect ratios for mobile-optimized delivery.
-Integer Seed: Set a value to control repeatability in video generation.
-Boolean Lossless: Set to true to generate a lossless video (very large file size).

Provides Outputs:
-Video Generated Video: The finalized cinematic video file with natively generated audio.

GenAI/Google

Veo3.1

Uses advanced generative AI to create smooth, natural video sequences that bridge two different images representing the first and last frame.

Expects Inputs:
-Text Prompt: A text description detailing the action, lighting, and style of the transition to guide the AI's interpolation.
-Image Start Image (optional): The starting reference frame that defines the beginning of the video sequence.
-Image End Image (optional): The ending reference frame that defines the final state of the video sequence.
-Text Negative Prompt: Elements or styles to be explicitly excluded from the generated transition.
-Boolean Portrait: A toggle to switch between widescreen (16:9) and vertical (9:16) aspect ratios.
-Float Duration (secs): Requested video duration (may not be respected in certain combinations of options).
-Integer Seed: Set a value to control repeatability in video generation.
-Boolean Lossless: Set to true to generate a lossless video (very large file size).
-Text Resolution: Desired video resolution (720p, 1080p, or 4k).
-Text Model: Veo model to use (Standard, Fast, or Lite). Note that 4k is not supported with the Lite model.
-Text Audio: Select No Audio to suppress audio generation.

Provides Outputs:
-Video Generated Video: The finalized video clip (typically 8 seconds) featuring smooth motion and natively generated audio.
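
Because not every Resolution and Model combination is valid, it can help to check the pair before running the node. The sketch below encodes only the options and the one restriction listed above; `validate_veo_options` is a hypothetical helper, and other restrictions may exist that are not documented here.

```python
RESOLUTIONS = ("720p", "1080p", "4k")
MODELS = ("Standard", "Fast", "Lite")


def validate_veo_options(resolution, model):
    """Return None if the pair is allowed, otherwise a message describing the problem."""
    if resolution not in RESOLUTIONS:
        return f"unknown resolution: {resolution}"
    if model not in MODELS:
        return f"unknown model: {model}"
    if resolution == "4k" and model == "Lite":
        return "4k is not supported with the Lite model"
    return None


print(validate_veo_options("1080p", "Lite"))  # None
print(validate_veo_options("4k", "Lite"))     # 4k is not supported with the Lite model
```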

GenAI/Google

Veo3.1 Ref Images

Generates cinematic video sequences while maintaining strict subject, character, or style consistency through multiple visual references.

Expects Inputs:
-Text Prompt: Detailed instructions on action, lighting, and camera movement.
-Image Ref Image (optional) x3: Upload up to three images of a specific subject to help the AI preserve their identity.
-Text Negative Prompt: Descriptive text for elements to exclude from the generation.
-Boolean Portrait: Toggle for vertical 9:16 or widescreen 16:9 aspect ratio.
-Integer Seed: Set a value to control repeatability in video generation.
-Boolean Lossless: Set to true to generate a lossless video (very large file size).
-Text Resolution: Desired video resolution (720p, 1080p, or 4k).
-Text Model: Veo model to use (Standard or Fast).
-Text Audio: Select No Audio to suppress audio generation.

Provides Outputs:
-Video Generated Video: The finalized 8-second cinematic video featuring smooth motion and natively generated audio.

GenAI/Google

Veo3.1 Video Extension

Uses generative AI to lengthen existing video clips by synthesizing new, contextually relevant frames based on a text prompt.

Expects Inputs:
-Text Prompt: A text description of the action or visual elements you want to see in the newly generated extension.
-Video Video: The source video clip that provides the starting context for the extension.
-Text Negative Prompt: A text description of elements you want the AI to specifically avoid in the generated footage (e.g., ‘blur’, ‘distorted faces’).
-Boolean Portrait: A toggle to define the aspect ratio; when enabled, the model optimizes for vertical mobile-first content rather than landscape cinematic content.
-Integer Seed: Set a value to control repeatability in video generation.
-Boolean Lossless: Set to true to generate a lossless video (very large file size).
-Text Model: Veo model to use (Standard, Fast, or Lite).

Provides Outputs:
-Video Generated Video: An extended version of the input video containing the original footage followed by the newly synthesized AI sequence.

GenAI/Sync

Sync Lipsync

Generates a lip-synced video using the Sync API. Uploads a source video and audio track, then uses AI to synchronize the lip movements in the video to match the provided audio.

Expects Inputs:
-Video Source Video: The video containing the face(s) to lip-sync. Must be under 20 MB.
-Audio Audio: The audio track to sync to the video. Must be under 20 MB.
-Text Prompt: Optional text prompt to guide the animation.
-Float Temperature: The temperature to use for the generation.
-Text Model: The model to use for the generation.
-Text Sync Mode: How to handle length differences between video and audio (Loop, Freeze, or Trim).
-Text Model Mode: The model mode to use for the generation (react-1 only).

Provides Outputs:
-Video Generated Video: The lip-synced output video.

GenAI/Topaz

Topaz Video Enhancement Resolution

This node is designed to take a source video and increase its quality and pixel count using Topaz.

Dropdowns (Top to Bottom):
-Codec (AV1, H264, H265, VP9).
-Output File Format (Auto, .mkv, .mp4, .mov).
-Audio Codec (AAC, AC3, PCM).
-Processing Mode (Copy, Convert, None).

Expects Inputs:
-Video Video Data: The source video file to be enhanced.
-Integer Width: The target horizontal resolution in pixels.
-Integer Height: The target vertical resolution in pixels.
For 4K Astra, use 3840x2160 (Landscape) or 2160x3840 (Portrait). This establishes a bounding box, and the aspect ratio is preserved.
-Text Enhancement Filter: The specific AI model used to reconstruct textures and remove artifacts. Use the Enhancement Filter Node to structure JSON.
-Text Frame Interpolation: The model used to generate intermediate frames for smoother motion. Use the Interpolation Filter Node to structure JSON.
-Text Astra Overrides: Sets paid diffusion flag needed for Astra models slf-1 and slc-1.

Provides Outputs:
-Video Generated Video: The finalized, enhanced, and upscaled video file.

GenAI/Topaz

Topaz Video Enhancement Filter

Formats settings into JSON for connection to Topaz Video Enhancement.

Expects Inputs:
-Text Model: Selects the core AI architecture (e.g., ahq-12, proteus, artemis). High-quality models like ahq are best for high-bitrate sources.
-Text Video Type: Defines the source video structure. Progressive is standard for digital/AI video; Interlaced is for older broadcast/tape footage.
-Text Auto: Toggles sub-algorithms between Auto (AI-determined) and Manual (user-defined) for the sliders below.
-Text Field Order: For interlaced video, where each frame is split into two fields of alternating lines.
-Text Focus Fix Level: Repairs slightly out-of-focus footage by downscaling to find a sharper base, then upscaling back.
-Text Creativity: Low (default) or high. Only applies to the slc-1 Starlight Creative Astra model.
-Float Compression (-1.0 to 1.0): Removes blocky artifacts from low-bitrate or crunchy source videos.
-Float Details (0.0 to 0.1): Reconstructs micro-textures like skin pores, fabric weaves, or fine foliage.
-Float Pre-noise (0.0 to 0.1): How much original sensor noise to ignore before enhancement begins.
-Float Noise (-1.0 to 1.0): Targets removal or preservation of luminance and chroma noise.
-Float Halo (-1.0 to 1.0): Reduces ringing or white outlines on edges from over-sharpened source material.
-Float Pre-blur (-1.0 to 1.0): Softens harsh edges or pre-processes pixelated footage so the AI can read shapes better.
-Float Blur (-1.0 to 1.0): General blur control.
-Float Grain (0.0 to 1.0): AI grain intensity to prevent an unnaturally smooth look.
-Integer Grain (0 to 5): Grain particle size for filmic vs. digital aesthetic.
-Float Recover Original Detail (0.0 to 1.0): Blends original unprocessed frames back to maintain a natural appearance.

Provides Outputs:
-Text Enhancement Filter: The formatted filter configuration.
-Text Astra Overrides: Additional Astra model overrides.

GenAI/ComfyUI

ComfyUI Workflow Bridge

Sends image, video, and text data to a user-provided ComfyUI workflow JSON.

Unlike the standard ComfyUI Bridge, which fetches the workflow from the connected ComfyUI instance, this node accepts a workflow JSON via connection (e.g., from a Load Text Node).

Requires the Nodey Bridge extension installed in ComfyUI.


Expects Inputs:
-Image Image (optional): Image data to send to ComfyUI.
-Video Video (optional): Video data to send to ComfyUI.
-Text Text (optional): Text/prompt data to send to ComfyUI.
-Text Workflow JSON: Connect a Load Text Node with the workflow in API format – exported via Save (API Format) in ComfyUI.
-Text Nodey ID: Unique identifier matching a NodeyInput Node in the workflow.

Provides Outputs:
-Image Result Image: First output image from ComfyUI workflow.
-Video Result Video: First output video/GIF from ComfyUI workflow.
-Text Result Text: Text output from ComfyUI workflow (if any).

GenAI/ComfyUI

ComfyUI Bridge

Sends image, video, and text data to ComfyUI workflows.

Requires the Nodey Bridge extension installed in ComfyUI.
Automatically fetches the current workflow from the connected ComfyUI instance.
Data is mapped to NodeyInput Nodes by matching the nodey_id.

Expects Inputs:
-Image Image (optional): Image data to send to ComfyUI.
-Video Video (optional): Video data to send to ComfyUI.
-Text Text (optional): Text/prompt data to send to ComfyUI.
-Text Nodey ID: Unique identifier matching a NodeyInput Node in ComfyUI.

Provides Outputs:
-Image Result Image: First output image from ComfyUI workflow.
-Video Result Video: First output video/GIF from ComfyUI workflow.
-Text Result Text: Text output from ComfyUI workflow (if any).
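
The nodey_id matching described for this node can be pictured with a minimal sketch. It assumes the workflow is in ComfyUI's API format (a dictionary of node keys, each with a class_type and an inputs mapping). The class_type value "NodeyInput", the sample workflow, and the helper name find_nodey_inputs are illustrative assumptions, not the bridge's actual implementation.

```python
import json


def find_nodey_inputs(workflow: dict, nodey_id: str) -> list:
    """Return the keys of workflow nodes whose nodey_id input matches."""
    matches = []
    for node_key, node in workflow.items():
        # Assumption: receiving nodes are registered with this class_type.
        if node.get("class_type") != "NodeyInput":
            continue
        if node.get("inputs", {}).get("nodey_id") == nodey_id:
            matches.append(node_key)
    return matches


# Hypothetical API-format workflow used only for illustration.
workflow = json.loads("""
{
  "3": {"class_type": "NodeyInput", "inputs": {"nodey_id": "hero_image"}},
  "4": {"class_type": "NodeyInput", "inputs": {"nodey_id": "prompt"}},
  "7": {"class_type": "SaveImage", "inputs": {"images": ["3", 0]}}
}
""")

print(find_nodey_inputs(workflow, "prompt"))  # ['4']
```

If no node carries the requested identifier, the sketch returns an empty list, which is a convenient point to report a mismatch between the Nodey ID input and the workflow.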

IO

Screenshare Output

Captures screen content as image frames or video recordings.

Uses the browser's Screen Capture API to record from a selected display, window, or browser tab.
Requires the user to establish a configuration for screen capture, including source selection (display, window, or tab) and capture settings.

Provides Outputs:
-Image Captured Frame: Single image frame from the screen capture.
-Video Captured Video: Video recording of the screen capture session.

GenAI/Topaz

Topaz Video Frame Interpolation Filter

This node creates a structured JSON configuration for temporal enhancements.

Expects Inputs:
-Text Model: Selects the AI architecture for motion estimation (e.g., apo-8, chronos, apollo). apo models are optimized for fast-moving action and high-accuracy slow motion.
-Integer Slowmo (1 to 16): Sets the deceleration factor. A value of 2 doubles the number of frames (half speed), while 16 creates extreme slow motion from standard footage.
-Integer FPS (15 to 240): Sets the target frames per second. The AI will generate exactly enough frames to reach this specific playback speed regardless of the source.
-Text Duplicate Frames: Toggles True/False detection and replacement of repeated frames, common in traditional animation or low-bandwidth web video.
-Float Duplicate Threshold (0.001 to 0.1): Sensitivity for duplicate detection. Higher values make the AI more aggressive in identifying slightly varying frames as duplicates.

Provides Outputs:
-Text Frame Interpolation Filter: The formatted interpolation configuration.
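
As a rough illustration of the Slowmo arithmetic described above, the sketch below multiplies the frame count by the slow-motion factor and derives the playback length. The function names and the sample numbers are assumptions for illustration; the actual Topaz frame counts may differ slightly at clip boundaries.

```python
def slowmo_frame_count(source_frames: int, slowmo: int) -> int:
    """Frames after applying a slow-motion factor (2 doubles the count)."""
    if not 1 <= slowmo <= 16:
        raise ValueError("slowmo must be between 1 and 16")
    return source_frames * slowmo


def playback_seconds(frame_count: int, fps: float) -> float:
    """Length of a clip when played back at the given frame rate."""
    return frame_count / fps


# 120 source frames (5 seconds at 24 fps) slowed down by a factor of 2.
frames = slowmo_frame_count(120, 2)
print(frames, playback_seconds(frames, 24))  # 240 10.0
```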

Image

Cube Map to Images

Converts a standard flat cube map image to the individual image planes.

Expects Inputs:
-Image Cubemap Image: The standard flat cube map image.

Provides Outputs:
-Image Front Image: The front-facing image.
-Image Left Image: The left-facing image.
-Image Back Image: The back-facing image.
-Image Right Image: The right-facing image.
-Image Top Image: The top-facing image.
-Image Bottom Image: The bottom-facing image.

Image

Cube to Panorama

Converts a set of 4 cube face images to a cylindrical projected panoramic image.

Expects Inputs:
-Image Front Image: The front-facing image.
-Image Left Image: The left-facing image.
-Image Back Image: The back-facing image.
-Image Right Image: The right-facing image.

Provides Outputs:
-Image Panoramic Image: The panoramic image as a cylindrical projection of the cube faces provided.

Image

Cube to Sphere

Converts a set of 6 cube face images to an equirectangular projected spherical image.

Expects Inputs:
-Image Front Image: The front-facing image.
-Image Left Image: The left-facing image.
-Image Back Image: The back-facing image.
-Image Right Image: The right-facing image.
-Image Top Image: The top-facing image.
-Image Bottom Image: The bottom-facing image.

Provides Outputs:
-Image Equirectangular Image: The equirectangular projected spherical image.

Image

Image Atlas

Aggregates multiple individual images into a single tiled composite or grid-based reference sheet.

Expects Inputs:
-Image Image 1-9: The source images to be tiled in the grid.
-Text Color: Defines the background or border color between the tiled images.

Provides Outputs:
-Image Image Atlas: The final combined grid image.

Image

Image Blur

Applies professional-grade softening effects to an image with support for asymmetric and box-style blurring.

Expects Inputs:
-Image Image: The source image to be blurred.
-Float Blur Radius: The primary intensity of the effect; larger values result in a more out-of-focus image.
-Float Asym Vert Blur (optional): Allows for independent control over vertical blurring to create anamorphic-style streaks or motion-blur effects.
-Boolean Box (not Gaussian): A toggle to switch from a smooth Gaussian curve to a faster, more linear Box blur algorithm, which can produce more stylized, angular edges.

Provides Outputs:
-Image Blurred Image: The blurred image.

Image

Image Color Correction

This node provides a streamlined interface to modify the tonal and color characteristics of an image. It is essential for matching AI-generated elements with real-world footage or correcting lighting inconsistencies within a workflow.

Expects Inputs:
-Image Image: The source image to be color corrected.

Provides Outputs:
-Image Edited Image: The color-corrected image.

Image

Image Composite

This node is the primary tool for layered image construction. It allows you to place a foreground element (Overlay) onto a base layer (Background) using a mask to precisely control which parts of the overlay are visible.

Expects Inputs:
-Image Background Image: The base layer that serves as the foundation of the composite.
-Image Overlay Image: The foreground element to be placed on top of the background.
-Image Overlay Mask Image: A grayscale image that determines the transparency of the Overlay; white areas are fully visible, while black areas are hidden.

Provides Outputs:
-Image Composited Image: The finalized composite image with the layers successfully merged.
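
The role of the mask can be sketched as a per-pixel linear blend. This is a minimal illustration that assumes 8-bit channel values and a standard alpha blend; the node's actual blending math is not documented here, and composite_pixel is a hypothetical helper.

```python
def composite_pixel(background: int, overlay: int, mask: int) -> int:
    """Blend one 0-255 channel value; mask 255 shows the overlay, 0 hides it."""
    alpha = mask / 255
    return round(overlay * alpha + background * (1 - alpha))


# One row of a single channel: background, overlay, and mask values.
background_row = [100, 100, 100]
overlay_row = [200, 200, 200]
mask_row = [0, 128, 255]

blended = [
    composite_pixel(b, o, m)
    for b, o, m in zip(background_row, overlay_row, mask_row)
]
print(blended)  # [100, 150, 200]
```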

Image

Image Crop

Extracts a specific rectangular sub-section of an image by defining precise pixel offsets and dimensions.

Expects Inputs:
-Image Image: The source image to be cropped.
-Integer Left: The horizontal starting point (X-coordinate) for the crop, measured in pixels from the left edge.
-Integer Top: The vertical starting point (Y-coordinate) for the crop, measured in pixels from the top edge.
-Integer Width: The horizontal length of the final cropped area.
-Integer Height: The vertical length of the final cropped area.

Provides Outputs:
-Image Cropped Image: The cropped image.
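
The relationship between the four crop inputs can be shown with a small sketch on a grid of pixel values. It assumes the image is a list of rows; the crop helper is illustrative only and does not reflect how the node is implemented.

```python
def crop(pixels: list, left: int, top: int, width: int, height: int) -> list:
    """Return the width x height region whose top-left corner is (left, top)."""
    return [row[left:left + width] for row in pixels[top:top + height]]


# 4x4 test image where each value encodes row * 10 + column.
image = [[r * 10 + c for c in range(4)] for r in range(4)]

print(crop(image, left=1, top=1, width=2, height=2))  # [[11, 12], [21, 22]]
```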

Image

Image Crop w/ UI

Extracts a specific rectangular sub-section of an image through a dedicated graphical interface.

Expects Inputs:
-Image Image: The source image to be cropped.

Provides Outputs:
-Image Cropped Image: The cropped image.

Image

Image Details

This node acts as an inspector that reads an image file and converts its internal properties into individual data streams.

Expects Inputs:
-Image Image: The source image asset you wish to analyze.

Provides Outputs:
-Integer Width: The horizontal resolution of the image in pixels (e.g., 1400).
-Integer Height: The vertical resolution of the image in pixels (e.g., 1024).
-Integer Channels: The number of color channels (e.g., 3 for standard RGB, 4 if there is an Alpha/Transparency channel).
-Text Format: The file extension or encoding type of the image (e.g., jpeg).

Image

Image FlipFlop

Mirrors an image vertically (Flip), horizontally (Flop), or both (FlipFlop).

Expects Inputs:
-Image Image: The source image to transform.
-Text Mode: The mirroring direction: Flip mirrors vertically (top ↔ bottom), Flop mirrors horizontally (left ↔ right), and FlipFlop applies both.

Provides Outputs:
-Image Flipped Image: The mirrored image.
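
The three modes can be illustrated on a tiny grid of pixel values. This is a sketch under the assumption that an image is a list of rows; the mirror helper is hypothetical.

```python
def mirror(pixels: list, mode: str) -> list:
    """Mirror a grid of pixels according to the Flip, Flop, or FlipFlop mode."""
    if mode == "Flip":  # top <-> bottom: reverse the row order
        return [row[:] for row in pixels[::-1]]
    if mode == "Flop":  # left <-> right: reverse each row
        return [row[::-1] for row in pixels]
    if mode == "FlipFlop":  # both axes
        return [row[::-1] for row in pixels[::-1]]
    raise ValueError(f"unknown mode: {mode}")


image = [[1, 2],
         [3, 4]]

print(mirror(image, "Flip"))      # [[3, 4], [1, 2]]
print(mirror(image, "Flop"))      # [[2, 1], [4, 3]]
print(mirror(image, "FlipFlop"))  # [[4, 3], [2, 1]]
```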

Image

Image Grey Convert

This node is used to strip color information from an image, converting it into a single-channel grayscale representation. It is essential for creating luminance masks, preparing images for specific AI depth-analysis models, or achieving a classic black-and-white aesthetic.

Expects Inputs:
-Image Image: The source image to be converted.

Provides Outputs:
-Image Converted Image: The grayscale image.

Image

Image Invert

This node mathematically flips the color information of an input image (e.g., changing white to black, red to cyan, etc.). It’s a fundamental utility for mask manipulation, allowing you to quickly swap the active and inactive areas of a transparency map.

Expects Inputs:
-Image Image: The source image or mask to be inverted.
-Boolean Also Invert Mask: A toggle that determines if the alpha channel (transparency) should also be flipped along with the RGB color data.

Provides Outputs:
-Image Inverted Image: The finalized inverted image asset.
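
A minimal sketch of the inversion, assuming 8-bit RGBA pixels, is shown below. The helper invert_pixel and its parameter name are illustrative assumptions.

```python
def invert_pixel(rgba: tuple, also_invert_mask: bool = False) -> tuple:
    """Invert the color channels of an 8-bit RGBA pixel, optionally alpha too."""
    r, g, b, a = rgba
    if also_invert_mask:
        a = 255 - a
    return (255 - r, 255 - g, 255 - b, a)


red = (255, 0, 0, 255)
print(invert_pixel(red))                         # (0, 255, 255, 255), cyan
print(invert_pixel(red, also_invert_mask=True))  # (0, 255, 255, 0)
```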

Image

Image Mask

This node applies masking data to a source image. It allows you to define which parts of an image should be visible or hidden based on a secondary mask.

Expects Inputs:
-Image Image: The primary source image intended to be masked.
-Image Mask (optional): A secondary image asset (typically grayscale) used to define transparency; white pixels generally represent opacity, while black pixels represent transparency.

Provides Outputs:
-Image Masked Image: The resulting image asset with the mask applied to its alpha channel.

Image

Image Merge

Reconstructs a full-color image by combining four independent grayscale channel maps.

Expects Inputs:
-Image Red Channel: A grayscale image used to define the intensity of the red color values.
-Image Green Channel: A grayscale image used to define the intensity of the green color values.
-Image Blue Channel: A grayscale image used to define the intensity of the blue color values.
-Image Alpha Channel: A grayscale image used to define transparency (white is opaque, black is transparent).

Provides Outputs:
-Image Merged Image: The finalized multi-channel color image resulting from the merge.

Image

Image Pad

This node is primarily used to prepare images for outpainting – the process of extending an image beyond its original borders. By adding empty space (padding) around the source, it provides the AI with a workspace to generate new content that blends with the original image.

Expects Inputs:
-Image Image: The source image asset to be padded.
-Integer Left: The number of pixels to add to the left side of the image.
-Integer Top: The number of pixels to add above the image.
-Integer Right: The number of pixels to add to the right side of the image.
-Integer Bottom: The number of pixels to add below the image.
-Text Color (optional): Defines the fill color for the padded areas; if not specified, it typically defaults to black (zero-padding).

Provides Outputs:
-Image Padded Image: The finalized image with the specified padding added to its dimensions.
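
How the four padding inputs change the image dimensions can be sketched on a grid of pixel values. The example assumes a single-channel image stored as a list of rows and a numeric fill value; pad is a hypothetical helper.

```python
def pad(pixels: list, left: int, top: int, right: int, bottom: int,
        fill: int = 0) -> list:
    """Surround a grid of pixels with the requested number of fill pixels."""
    new_width = len(pixels[0]) + left + right
    rows = [[fill] * new_width for _ in range(top)]
    rows += [[fill] * left + list(row) + [fill] * right for row in pixels]
    rows += [[fill] * new_width for _ in range(bottom)]
    return rows


image = [[5, 6]]  # 2 pixels wide, 1 pixel tall
padded = pad(image, left=1, top=1, right=2, bottom=0)
print(padded)  # [[0, 0, 0, 0, 0], [0, 5, 6, 0, 0]]
```

The output is width + left + right pixels wide and height + top + bottom pixels tall.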

Image

Image Paint

Provides a dedicated manual interface for painting, annotating, and creating multi-layered masks on an image.

Expects Inputs:
-Image Image: The source image to be used as the base canvas for painting.

Provides Outputs:
-Image Edited Image: The primary output that merges the original base image with all painted layers.
-Image Drawing: An isolated output containing only the brush strokes on a transparent or neutral background, excluding the original source image.
-Image Mask: A secondary output that may provide only the isolated paint/brush strokes as a separate asset.

Image

Image Resize

This node is a fundamental utility for controlling the physical size of image assets. It’s essential for ensuring images meet the specific input requirements of generation models or for preparing assets for final export. The node allows for both proportional scaling and forced aspect ratio changes.

Expects Inputs:
-Image Image: The source image asset to be resized.
-Integer Width: The target horizontal resolution in pixels.
-Integer Height: The target vertical resolution in pixels.
-Boolean Free Scale: A toggle that determines scaling behavior. When False, the node typically maintains the original aspect ratio (using the dimensions as a fit-within box); when True, it stretches the image to match the exact width and height provided.

Provides Outputs:
-Image Resized Image: The processed image asset at the new specified dimensions.
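
The two Free Scale behaviors can be compared with a short sketch. It assumes that, with Free Scale off, the requested width and height act as a fit-within box as described above; target_size is a hypothetical helper and the rounding is an assumption.

```python
def target_size(src_w: int, src_h: int, width: int, height: int,
                free_scale: bool) -> tuple:
    """Return the output (width, height) for the given resize settings."""
    if free_scale:
        # Stretch to the exact requested dimensions.
        return width, height
    # Keep the aspect ratio and fit inside the width x height box.
    scale = min(width / src_w, height / src_h)
    return max(1, round(src_w * scale)), max(1, round(src_h * scale))


print(target_size(1400, 1024, 700, 700, free_scale=False))  # (700, 512)
print(target_size(1400, 1024, 700, 700, free_scale=True))   # (700, 700)
```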

Image

Image Rotate

This node is used to adjust the orientation of an image asset. It’s essential for correcting crooked horizon lines in AI-generated landscapes or orienting character references.

Expects Inputs:
-Image Image: The source image asset to be rotated.
-Float Angle (degrees): The numerical value for the rotation, where positive values typically rotate clockwise and negative values rotate counter-clockwise.
-Text Color (optional): Defines the fill color for the wedges or empty areas created in the corners when an image is rotated at non-orthogonal angles.

Provides Outputs:
-Image Rotated Image: The finalized rotated image asset.

Image

Solid Image

This node is used to create base canvases, background layers, or solid-color masks. It is an essential utility for providing a clean background image for the Image Composite Node or for generating a specific color constant to be used in Image Merge operations.

Expects Inputs:
-Integer Width: The horizontal resolution in pixels.
-Integer Height: The vertical resolution in pixels.
-Integer Channels: The number of color channels.
-Text Color: The fill color for the solid image.

Provides Outputs:
-Image Solid Image: The generated solid color image.

Image

Image Split

Deconstructs a color image into its fundamental Red, Green, Blue, and Alpha channel components.

Expects Inputs:
-Image Image: The source color image asset you wish to deconstruct.

Provides Outputs:
-Image Red Channel (Top): A grayscale map representing the intensity of red values across the image.
-Image Green Channel (Middle-Top): A grayscale map representing the intensity of green values.
-Image Blue Channel (Middle-Bottom): A grayscale map representing the intensity of blue values.
-Image Alpha Channel (Bottom): A grayscale map representing transparency – fully opaque areas appear white, while transparent areas appear black.
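
Splitting is the inverse of the Image Merge Node, which the following sketch illustrates on a tiny RGBA image. It assumes pixels are stored as rows of (r, g, b, a) tuples; both helpers are hypothetical.

```python
def split_channels(pixels: list) -> tuple:
    """Return four grayscale grids (red, green, blue, alpha) from RGBA rows."""
    return tuple(
        [[px[i] for px in row] for row in pixels]
        for i in range(4)
    )


def merge_channels(red: list, green: list, blue: list, alpha: list) -> list:
    """Rebuild rows of RGBA tuples from four grayscale grids."""
    return [list(zip(*rows)) for rows in zip(red, green, blue, alpha)]


image = [[(255, 0, 0, 255), (0, 128, 0, 64)]]
red, green, blue, alpha = split_channels(image)

print(red, alpha)  # [[255, 0]] [[255, 64]]
print(merge_channels(red, green, blue, alpha) == image)  # True
```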

Image

Image 2 Video

This node acts as a bridge between image and video processing. It takes a single still frame and stretches it across time to create a video clip.

Expects Inputs:
-Image Image: The source still image intended to be converted.
-Float Duration (secs): The total length of the resulting video file in seconds (e.g., 5 seconds).
-Float Framerate (fps): The playback speed or temporal resolution of the video (e.g., 30 fps).

Provides Outputs:
-Video Video from Image: A video data stream containing the repeated frame sequence.
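
The number of repeated frames follows from the two numeric inputs. The sketch below assumes the frame count is the duration multiplied by the frame rate, rounded to the nearest whole frame; still_to_frames is a hypothetical helper.

```python
def still_to_frames(image, duration_secs: float, fps: float) -> list:
    """Repeat one still image for every frame of the requested clip length."""
    frame_count = round(duration_secs * fps)
    return [image] * frame_count


frames = still_to_frames("still.png", duration_secs=5, fps=30)
print(len(frames))  # 150
```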

IO

Load Audio

Loads an audio file.

Expects Inputs:
-Audio Audio File: The audio file to load.

Provides Outputs:
-Audio Audio Data: The loaded audio data.

IO

Save Audio

Saves an audio file.

Expects Inputs:
-Audio Audio to Save: The audio data to save.

Provides Outputs:
-Audio Audio Saved: The saved audio file.

IO

Load Image

Loads an image file.

Expects Inputs:
-Image Image File: The image file to load.

Provides Outputs:
-Image Image: The loaded image.

IO

Save Image

Saves an image file.

Expects Inputs:
-Image Image to Save: The image to save.

Provides Outputs:
-Image Image Saved: The saved image.

IO

Load Image Sequence

Imports a series of numbered image files and compiles them into a playable video data stream.

Expects Inputs:
-Video Sequence Data: The image sequence to load.

Provides Outputs:
-Video Video: The compiled video stream.

IO

Save Image Sequence

This node acts as an export engine that deconstructs video data back into its constituent frames.

Expects Inputs:
-Video Video Input: The video to export as frames.

Provides Outputs:
-Video Video Output: The processed video data.

IO

Load 3D Model

Loads a 3D model file.

Expects Inputs:
-Model-3D Model File: The 3D model file to load.

Provides Outputs:
-Model-3D Model: The loaded 3D model.

IO

Save 3D Model

Saves a 3D model file.

Expects Inputs:
-Model-3D 3D Model to Save: The 3D model to save.

Provides Outputs:
-Model-3D 3D Model Saved: The saved 3D model.

IO

Load JSON

Loads a JSON file and outputs the parsed JSON object.

Expects Inputs:
-Text JSON File: The JSON file to load.

Provides Outputs:
-json JSON Data: The parsed JSON object.

IO

Save JSON

Saves a JSON object to a file.

Expects Inputs:
-json JSON to Save: The JSON object to save.

Provides Outputs:
-json JSON Saved: The saved JSON data.

IO

Load Splat Model

Loads a Gaussian splat model file.

Expects Inputs:
-Model-splat Splat File: The splat model file to load.

Provides Outputs:
-Model-splat Splat Model: The loaded splat model.

IO

Save Splat Model

Saves a Gaussian splat model file.

Expects Inputs:
-Model-splat Splat Model to Save: The splat model to save.

Provides Outputs:
-Model-splat Splat Model Saved: The saved splat model.

IO

Load Text

Loads a text file.

Expects Inputs:
-Text Text File: The text file to load.

Provides Outputs:
-Text File Content: The loaded text content.

IO

Save Text

Saves a text file.

Expects Inputs:
-Text Text to Save: The text to save.

Provides Outputs:
-Text Text Saved: The saved text.

IO

Load Video

Loads a video file.

Expects Inputs:
-Video Video File: The video file to load.

Provides Outputs:
-Video Video: The loaded video.

IO

Load Audio URL

Loads audio from a remote URL.

Expects Inputs:
-Text URL: The remote URL of the audio to load.

Provides Outputs:
-Audio Audio Data: The loaded audio data.

IO

Load Image URL

Loads an image from a remote URL.

Expects Inputs:
-Text URL: The remote URL of the image to load.

Provides Outputs:
-Image Image Data: The loaded image.

IO

Load 3D Model URL

Loads a 3D model from a remote URL.

Expects Inputs:
-Text URL: The remote URL of the 3D model to load.

Provides Outputs:
-Model-3D 3D Model: The loaded 3D model.

IO

Load Splat URL

Loads a splat model from a remote URL.

Expects Inputs:
-Text URL: The remote URL of the splat model to load.

Provides Outputs:
-Model-splat Splat Model: The loaded splat model.

IO

Load Video URL

Loads a video from a remote URL.

Expects Inputs:
-Text URL: The remote URL of the video to load.

Provides Outputs:
-Video Video Data: The loaded video.

IO

Save Video

Saves a video file.

Expects Inputs:
-Video Video to Save: The video to save.

Provides Outputs:
-Video Video Saved: The saved video.

IO

Video LiLo

Video LiLo (Lots In, Lots Out) runs multiple parallel copies of the upstream video graph and collects results into separate output slots. Each copy executes independently, enabling batch video generation from a single pipeline.

Expects Inputs:
-Video Video to Save: The video pipeline to multiply.
-String Count: Number of parallel executions (1-16).

Provides Outputs:
-Video Video 1-16: Individual video outputs from each parallel execution.
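
The idea of running independent copies and collecting the results into numbered slots can be sketched with a thread pool. This is only an analogy: the pipeline callable, the slot numbering passed to it, and the helper run_lilo are assumptions, and the platform's own scheduler is not described in this glossary.

```python
from concurrent.futures import ThreadPoolExecutor


def run_lilo(pipeline, count: int) -> list:
    """Run `pipeline` once per slot in parallel and return results in slot order."""
    if not 1 <= count <= 16:
        raise ValueError("count must be between 1 and 16")
    with ThreadPoolExecutor(max_workers=count) as pool:
        # map() preserves input order, so result i belongs to output slot i + 1.
        return list(pool.map(pipeline, range(1, count + 1)))


# Hypothetical pipeline: each execution produces its own named result.
outputs = run_lilo(lambda slot: f"video_{slot}", count=3)
print(outputs)  # ['video_1', 'video_2', 'video_3']
```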

IO

Image LiLo

Image LiLo (Lots In, Lots Out) runs multiple parallel copies of the upstream image graph and collects results into separate output slots. Each copy executes independently, enabling batch image generation from a single pipeline.

Expects Inputs:
-Image Image to Save: The image pipeline to multiply.
-String Count: Number of parallel executions (1-16).

Provides Outputs:
-Image Image 1-16: Individual image outputs from each parallel execution.

JSON

JSON -> Text

Converts a JSON object to formatted text.

Expects Inputs:
-json JSON Data: The JSON object to convert.

Provides Outputs:
-Text JSON Text: The formatted text.

JSON

JSON Get Key

Extracts a value from a JSON object using a dot-notation key path (e.g., data.users.0.name). Numeric path segments are treated as array indices. The result is always returned as text.

Expects Inputs:
-json JSON Data: The JSON object to extract from.
-Text Key Path: Dot-notation path to the value (e.g. data.users.0.name).

Provides Outputs:
-Text Value: The extracted value as text.
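
Dot-notation lookup can be sketched as a walk over the path segments. In this minimal example, numeric segments index into arrays, and non-text results are serialized with json.dumps; that serialization choice and the helper name get_key are assumptions, since the glossary only states that the result is returned as text.

```python
import json


def get_key(data, key_path: str) -> str:
    """Follow a dot-notation path through nested objects and arrays."""
    current = data
    for segment in key_path.split("."):
        if isinstance(current, list):
            current = current[int(segment)]  # numeric segment = array index
        else:
            current = current[segment]
    # Assumption: values that are not already text are serialized as JSON.
    return current if isinstance(current, str) else json.dumps(current)


data = {"data": {"users": [{"name": "Ada", "age": 36}]}}

print(get_key(data, "data.users.0.name"))  # Ada
print(get_key(data, "data.users.0.age"))   # 36
```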

JSON

Text -> JSON

Parses a JSON-formatted text into a JSON object.

Expects Inputs:
-Text JSON Text: The JSON-formatted text to parse.

Provides Outputs:
-json Parsed JSON: The parsed JSON object.

Math

Add

Adds two numbers.

Expects Inputs:
-Float a: First number.
-Float b: Second number.

Provides Outputs:
-Float sum: The sum of a and b.

Math

Divide

Divides two numbers.

Expects Inputs:
-Float a: Dividend.
-Float b: Divisor.

Provides Outputs:
-Float quotient: The result of a divided by b.

Math

Expression

Evaluates a mathematical expression.

Expects Inputs:
-Text Expression: The mathematical expression to evaluate.

Provides Outputs:
-Float Result (float): The result as a floating point number.
-Integer Result (integer): The result as an integer.
-Boolean Result (boolean): The result as a boolean.

Math

Float -> Int

Converts a float to an integer.

Expects Inputs:
-Float f: The float value to convert.

Provides Outputs:
-Integer int: The integer result.

Math

Int -> Float

Converts an integer to a float.

Expects Inputs:
-Integer i: The integer value to convert.

Provides Outputs:
-Float float: The float result.

Math

Multiply

Multiplies two numbers.

Expects Inputs:
-Float a: First number.
-Float b: Second number.

Provides Outputs:
-Float product: The product of a and b.

Math

Power

Raises a number to a power.

Expects Inputs:
-Float a: The base number.
-Float b: The exponent.

Provides Outputs:
-Float power: The result of a raised to the power of b.

Math

Subtract

Subtracts two numbers.

Expects Inputs:
-Float a: First number.
-Float b: Second number.

Provides Outputs:
-Float difference: The result of a minus b.

Model

GLB Apply Textures

Appends external PBR textures to a GLB's binary payload and updates the first material definition to reference them.

Expects Inputs:
-Image Base Color: The base color material.
-Image Metallic Roughness: The metallic roughness component.
-Image Normal Map: The normal map.
-Image Occlusion Map: The occlusion map.
-Image Emissive Map: The emissive map.

Provides Outputs:
-Model Textured Model: The model with the materials and textures applied.

Model

GLB Extract Textures

Extracts standard PBR textures from the given material index of a GLB file.

Expects Inputs:
-Model Model: The input GLB model with materials and textures.
-Integer Material Index: Optional material index to extract.

Provides Outputs:
-Image Base Color: The base color material.
-Image Metallic Roughness: The metallic roughness component.
-Image Normal Map: The normal map.
-Image Occlusion Map: The occlusion map.
-Image Emissive Map: The emissive map.

Model

GLB Strip Textures

Removes all material, texture, and image references from a GLB 3D model.

Expects Inputs:
-Model Model: The input GLB model with materials and textures to be removed.

Provides Outputs:
-Model Model: The output GLB model with the materials and textures removed.

ShotGrid

Load Shotgrid Published Image

Loads a published file image from the currently active ShotGrid project.

Expects Inputs:
-Text Published Metadata: Publish metadata retrieved from ShotGrid.

Provides Outputs:
-Image Image: Image from the published metadata.

ShotGrid

Load Shotgrid Published Video

Loads a published file video from the currently active ShotGrid project.

Expects Inputs:
-Text Published Metadata: Publish metadata retrieved from ShotGrid.

Provides Outputs:
-Video Video: Video from the published metadata.

ShotGrid

Shotgrid Published Files

Provides the ability to specify filters for Shot, Task, Name Search, Pipeline Step, and Published File Type to generate a selection list of ShotGrid published files.

Provides Outputs:
-Text JSON Metadata: A JSON text containing metadata for the selected ShotGrid published file.

Text

Int -> Text

Converts an integer to text.

Expects Inputs:
-Integer I: The integer to convert.

Provides Outputs:
-Text Text: The text representation.

Text

Text -> Int

Converts text to an integer.

Expects Inputs:
-Text Text: The text to convert.

Provides Outputs:
-Integer Int: The integer result.

Text

Float -> Text

Converts a float to text.

Expects Inputs:
-Float F: The float to convert.

Provides Outputs:
-Text Text: The text representation.

Text

Text -> Float

Converts text to a float.

Expects Inputs:
-Text Text: The text to convert.

Provides Outputs:
-Float Float: The float result.

Text

Text Concatenate

Concatenates two text values.

Expects Inputs:
-Text Text 1: The first text.
-Text Text 2: The second text.

Provides Outputs:
-Text Concatenated Text: The combined text.

Text

Text Portion

Extracts a portion (substring) of a text value.

Expects Inputs:
-Text Text: The source text.
-Integer 1st Character (from 1): The starting character position.
-Integer Number of Characters: The number of characters to extract.

Provides Outputs:
-Text Portion: The extracted portion of the text.
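
Because the first character is counted from 1, the position has to be shifted by one for zero-based languages. The sketch below shows that conversion; text_portion is a hypothetical helper, and rejecting positions below 1 is an assumption.

```python
def text_portion(text: str, first_character: int, count: int) -> str:
    """Return `count` characters starting at a 1-based character position."""
    if first_character < 1:
        raise ValueError("first_character is counted from 1")
    start = first_character - 1  # convert to a zero-based index
    return text[start:start + count]


print(text_portion("Nodey Nodes", 7, 5))  # Nodes
print(text_portion("Nodey Nodes", 1, 5))  # Nodey
```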

Video

Video Audio Mix

Synchronizes independent audio and video streams into a unified media asset with precise timing and volume control.

Expects Inputs:
-Video Video: The primary visual data stream.
-Audio Audio: The sound file or audio stream to be combined with the video.
-Float Mix (-1.0 to 1.0): Adjusts the output volume balance. 0 represents the original volume, while negative values attenuate and positive values boost the signal.
-Float Audio Delay (secs): Offsets the audio start time relative to the video. Positive values delay the audio, while negative values make the audio start earlier to fix sync drift.

Provides Outputs:
-Video Video With Audio: A finalized media container containing both the visual and auditory tracks synced together.

Video

Video Audio Split

Splits the audio track out of a video that contains one.

Expects Inputs:
-Video Video: The video with the audio track to be extracted.
-Boolean Strip Audio from Video: If true, remove the audio track from the returned video.

Provides Outputs:
-Video Video: The original video, optionally with the audio removed.
-Audio Audio: The extracted audio track (if any).

Video

Video Color Correction

This node provides a streamlined interface to modify the tonal and color characteristics of a video. It is essential for matching AI-generated elements with real-world footage or correcting lighting inconsistencies within a workflow.

Expects Inputs:
-Video Video: The source video to be color corrected.

Provides Outputs:
-Video Edited Video: The color corrected video.

Video

Video Composite

Layers a foreground video (Overlay) onto a base video (Background), using a third video stream (Mask) to define visibility.

Expects Inputs:
-Video Background Video: The base video layer that provides the foundation for the composition.
-Video Overlay Video: The foreground video layer to be placed on top of the background.
-Video Overlay Mask Video: A grayscale video stream that determines the transparency of the Overlay; white areas are visible, and black areas are hidden.
-Float Overlay Delay (secs): Offsets the start time of the Overlay and Mask videos relative to the Background. A positive value waits before starting the overlay, while a negative value starts it earlier.

Provides Outputs:
-Video Composited Video: The finalized composite video stream with layers merged.

Video

Video Concat

This node acts as a basic non-linear editor within the node graph. It appends Video 2 directly to the end of Video 1, allowing for the creation of multi-shot sequences or the stitching together of AI-generated clips without needing an external video editor.

Expects Inputs:
-Video Video 1: The primary video clip that will appear first in the sequence.
-Video Video 2: The second video clip that will be appended to the first.
-Boolean Trim Joining Frame: A toggle that, when enabled, removes the overlapping or redundant frame at the exact point where the two videos meet.

Provides Outputs:
-Video Concatenated Video: A single video data stream containing the combined sequence of both input clips.

Video

Video Crop

Extracts a specific rectangular sub-section of a video by defining precise pixel offsets and dimensions.

Expects Inputs:
-Video Video: The source video to be cropped.
-Integer Left: The horizontal starting point (X-coordinate) for the crop, measured in pixels from the left edge.
-Integer Top: The vertical starting point (Y-coordinate) for the crop, measured in pixels from the top edge.
-Integer Width: The horizontal length of the final cropped area.
-Integer Height: The vertical length of the final cropped area.

Provides Outputs:
-Video Cropped Video: The cropped video.
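
As a rough illustration of how the Left, Top, Width, and Height inputs define a crop rectangle, the sketch below converts them to corner coordinates and checks them against the frame size. The function name is hypothetical, and rejecting out-of-bounds values with an error is an assumption; the node itself may clamp or handle them differently.

```python
def crop_box(left: int, top: int, width: int, height: int,
             video_w: int, video_h: int) -> tuple[int, int, int, int]:
    """Return (x0, y0, x1, y1) for the crop, with x1 and y1 exclusive."""
    if width <= 0 or height <= 0:
        raise ValueError("Width and Height must be positive")
    if left < 0 or top < 0 or left + width > video_w or top + height > video_h:
        raise ValueError("Crop rectangle extends outside the video frame")
    return left, top, left + width, top + height
```

For example, a 640x360 crop at Left 100, Top 50 of a 1920x1080 video covers x from 100 to 740 and y from 50 to 410.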

Video

Video Crop w/ UI

Extracts a specific rectangular sub-section of a video through a dedicated graphical interface.

Expects Inputs:
-Video Video: The source video to be cropped.

Provides Outputs:
-Video Cropped Video: The cropped video.

Video

Video Details

This node acts as an inspector for video files, deconstructing a video stream into its fundamental technical specifications.

Expects Inputs:
-Video Video: The source video asset you wish to analyze.

Provides Outputs:
-Integer Width: The horizontal resolution of the video in pixels (e.g., 1920).
-Integer Height: The vertical resolution of the video in pixels (e.g., 1080).
-Float FPS: The temporal resolution or playback speed in frames per second (e.g., 29.97).
-Float Duration: The total length of the video file in seconds (e.g., 106.44).
-Integer Audio Channels: The number of audio channels detected (e.g., 2 for stereo).

Video

Video Edit

This node serves as a "Human-in-the-Loop" editor within a node-based workflow. It allows users to manually define the start and end points of a clip, rearrange sequences, or select specific segments for AI processing. This is essential for isolating a particular action within a long video before sending it to more resource-heavy nodes like Topaz Video Enhance Resolution.

Expects Inputs:
-Video Video: The source video to be edited.

Provides Outputs:
-Video Edited Video: The edited video.

Video

Video FlipFlop

Mirrors a video vertically (Flip), horizontally (Flop), or both (FlipFlop).

Expects Inputs:
-Video Video: The source video to transform.
-Text Mode: The mirroring direction. Flip mirrors vertically (top ↔ bottom), Flop mirrors horizontally (left ↔ right), and FlipFlop applies both.

Provides Outputs:
-Video Flipped Video: The mirrored video.

Video

Video Frame

This node acts as a "frame-grabber," allowing you to isolate one moment in time from a video clip.

Expects Inputs:
-Video Video: The source video stream from which the frame will be pulled.
-Integer Frame Number: The exact index of the frame to be extracted (e.g., frame 0 for the very first frame).

Provides Outputs:
-Image Frame Image: The extracted still frame as a standard image asset.

Video

Video Grey Convert

This node is used to strip color information from a video, converting it into a single-channel grayscale representation. It’s essential for creating luminance masks, preparing videos for specific AI depth-analysis models, or achieving a classic black-and-white aesthetic.

Expects Inputs:
-Video Video: The source video to convert.

Provides Outputs:
-Video Converted Video: The grayscale video.

Video

Video Mask

This node is a hybrid compositing tool that layers a foreground video onto a background video based on a single, non-moving image mask. It is ideal for picture-in-picture effects, static logo overlays, or framing a video within a specific static shape (like a circle or border) throughout its entire duration.

Expects Inputs:
-Video Background Video: The primary video stream that serves as the bottom layer.
-Video Overlay Video: The secondary video stream to be placed on top.
-Image Overlay Mask Image: A static grayscale image where white pixels reveal the overlay and black pixels hide it.
-Float Overlay Delay (secs): Offsets the start time of the overlay video relative to the background.

Provides Outputs:
-Video Masked Video: The finalized video stream with the masked overlay applied.

Video

Video Pad

This node adds empty space (padding) around a video. It is primarily used to prepare footage for outpainting, the process of extending the frame beyond its original borders, by giving the AI a workspace in which to generate new content that blends with the original video.

Expects Inputs:
-Video Video: The source video asset to be padded.
-Integer Left: The number of pixels to add to the left side of the video.
-Integer Top: The number of pixels to add above the video.
-Integer Right: The number of pixels to add to the right side of the video.
-Integer Bottom: The number of pixels to add below the video.
-Text Color (optional): Defines the fill color for the padded areas; if not specified, it typically defaults to black (zero-padding).

Provides Outputs:
-Video Padded Video: The finalized video with the specified padding added to its dimensions.

Video

Video Resize

This node is a fundamental utility for controlling the physical size of video assets. It’s essential for ensuring videos meet the specific input requirements of generation models or for preparing videos for final export. The node allows for both proportional scaling and forced aspect ratio changes.

Expects Inputs:
-Video Video: The source video asset to be resized.
-Integer Width: The target horizontal resolution in pixels.
-Integer Height: The target vertical resolution in pixels.
-Boolean Free Scale: A toggle that determines scaling behavior. When False, the node typically maintains the original aspect ratio (using the dimensions as a "fit-within" box); when True, it stretches the video to match the exact width and height provided.

Provides Outputs:
-Video Resized Video: The processed video asset at the new specified dimensions.
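
The two Free Scale behaviors can be sketched as follows. This is a minimal illustration of a "fit-within" calculation, not the node's confirmed algorithm; the function names and the rounding choice are assumptions.

```python
def fit_within(src_w: int, src_h: int, box_w: int, box_h: int) -> tuple[int, int]:
    """Scale the source to the largest size that fits the box, keeping aspect ratio."""
    scale = min(box_w / src_w, box_h / src_h)
    return max(1, round(src_w * scale)), max(1, round(src_h * scale))


def target_size(src_w: int, src_h: int, width: int, height: int,
                free_scale: bool) -> tuple[int, int]:
    """Return the output size for the given Width, Height, and Free Scale inputs."""
    if free_scale:
        # Free Scale on: use the requested size exactly, ignoring aspect ratio.
        return width, height
    # Free Scale off: treat Width x Height as a bounding box.
    return fit_within(src_w, src_h, width, height)
```

With a 1920x1080 source and a 1280x1280 target, this sketch yields 1280x720 when Free Scale is False and 1280x1280 when it is True.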

Video

Video Reverse

This node mathematically reorders the frames of an input video so that the last frame becomes the first and the first frame becomes the last. It’s a creative utility used for boomerang-style loops, corrective temporal adjustments, or achieving specific visual storytelling effects in a video pipeline.

Expects Inputs:
-Video Video: The source video to reverse.

Provides Outputs:
-Video Reversed Video: The reversed video.

Video

Video Rotate

This node is used to adjust the orientation of a video asset. It’s useful for correcting crooked horizon lines in AI-generated landscapes or orienting character references.

Expects Inputs:
-Video Video: The source video asset to be rotated.
-Float Angle (degrees): The numerical value for the rotation, where positive values typically rotate clockwise and negative values rotate counter-clockwise.
-Text Color (optional): Defines the fill color for the wedges or empty areas created in the corners when a video is rotated at non-orthogonal angles.

Provides Outputs:
-Video Rotated Video: The finalized rotated video asset.

Video

Video Split

Deconstructs a color video into its fundamental Red, Green, Blue, and Audio channel components.

Expects Inputs:
-Video Input Video: The source color video asset you wish to deconstruct.

Provides Outputs:
-Video Red Channel (Top): A grayscale map representing the intensity of red values across the video.
-Video Green Channel (Middle-Top): A grayscale map representing the intensity of green values.
-Video Blue Channel (Middle-Bottom): A grayscale map representing the intensity of blue values.
-Audio Audio: The independent audio data stream extracted from the video container.

Video

Video Trim

This node is used to isolate a specific portion of a long video file without needing to launch the full Video Editor interface.

Expects Inputs:
-Video Video: The source video stream to be trimmed.
-Integer Start (frame): The exact frame index where the new clip should begin.
-Float Duration (secs): The total length of the resulting clip in seconds.

Provides Outputs:
-Video Trimmed Video: A new video data stream containing only the specified temporal segment.
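
Because Start is given as a frame index while Duration is given in seconds, it can help to work out which frames a trim covers. The sketch below does that arithmetic using the FPS reported by a node such as Video Details. The function name, the rounding, and the clamping to the clip length are assumptions for illustration only.

```python
def trim_frame_range(start_frame: int, duration_secs: float,
                     fps: float, total_frames: int) -> tuple[int, int]:
    """Return (first_frame, end_frame_exclusive) for a trim."""
    # Convert the duration in seconds to a whole number of frames.
    frame_count = round(duration_secs * fps)
    # Keep the range inside the source clip.
    first = max(0, min(start_frame, total_frames))
    end = min(total_frames, first + frame_count)
    return first, end
```

For a 24 FPS clip of 240 frames, a Start of 48 and a Duration of 2.5 seconds covers frames 48 up to (but not including) 108.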

Root

Float

A floating-point number constant.

Expects Inputs:
-Float Value: The float value.

Provides Outputs:
-Float Value: The float value.

Root

Integer

An integer constant.

Expects Inputs:
-Integer Value: The integer value.

Provides Outputs:
-Integer Value: The integer value.

Root

Text

A text constant.

Expects Inputs:
-Text text_data: The text value.

Provides Outputs:
-Text text_data: The text value.

Root

Boolean

A boolean constant.

Expects Inputs:
-Text Value: True or False.

Provides Outputs:
-Boolean Value: The boolean value.

GenAI/3D

Meshy Text to 3D

The Meshy Text to 3D Node generates a 3D model from a text description using AI. Unlike the image-based Meshy node, this variant relies entirely on a written prompt to define the object's shape and appearance.

Expects Inputs:
-Text Prompt: A text description of the 3D object to generate (e.g., ‘a medieval wooden shield’).
-Boolean Create Humanoid: A toggle that optimizes the generation pipeline for bipedal character models when enabled.
-Integer Max Triangles: A cap on the mesh density (polygon count) to optimize performance.
-Text Texture Prompt: A text description guiding the look, material, and style of the surface.
-Image Texture Image: A reference image to guide specific patterns or material colors.

Provides Outputs:
-Model-3D Generated Model: The resulting 3D model file, typically including vertex data and mapped textures.

GenAI/3D

Meshy Textureizer

The Meshy Textureizer Node applies AI-generated textures to an existing 3D model. It takes a bare or previously textured mesh and re-skins it based on a text prompt and optional reference image, allowing rapid iteration on surface appearance without regenerating the geometry.

Expects Inputs:
-Model-3D Model: The source 3D mesh to be textured.
-Text Texture Prompt: A text description guiding the desired material, color, and surface style.
-Image Texture Image: A visual reference to guide the patterns or colors of the texture map.
-Boolean Ignore Original UVs: When enabled, the AI discards the model's existing UV layout and generates a new one optimized for the new texture.

Provides Outputs:
-Model-3D Textured Model: The 3D model with newly generated texture maps applied.

GenAI/3D

Meshy Rigger

The Meshy Rigger Node automatically generates a skeletal rig for a 3D model using AI, making it ready for animation. It analyzes the mesh geometry to place bones and joint hierarchies appropriate for the model's shape.

Expects Inputs:
-Model-3D Model: The source 3D mesh to be rigged.
-Float Height (meters): The target real-world height of the model in meters, used to correctly scale the skeleton.
-Boolean Animate: When enabled, applies a default animation cycle to the rigged model for immediate preview.

Provides Outputs:
-Model-3D Rigged Model: The 3D model with an embedded skeletal rig ready for animation.

GenAI/3D

Meshy Retopologizer

The Meshy Retopologizer Node rebuilds the polygon topology of a 3D model using AI. It replaces dense, irregular meshes (common in AI-generated or sculpted models) with cleaner, more efficient geometry suitable for animation, real-time rendering, or further production work.

Expects Inputs:
-Model-3D Model: The source 3D mesh to be retopologized.
-Boolean Generate Quads: When enabled, the output mesh uses quad-dominant topology instead of triangles, which is preferred for subdivision and animation workflows.
-Integer Max Polygons: A cap on the total polygon count for the retopologized mesh.
-Float Height (meters): The target real-world height of the model in meters, used for correct scaling.
-Boolean Set Origin to Bottom: When enabled, repositions the model's origin point to the bottom center of the bounding box, which is standard for placing characters on ground planes.

Provides Outputs:
-Model-3D Retopologized Model: The rebuilt 3D model with optimized topology.

ShotGrid

ShotGrid Publish

The ShotGrid Publish Node uploads and registers an asset file to the connected ShotGrid project. It creates two published file entities in ShotGrid, one for the asset and one for the graph from which the publish was made. Both are provided as outputs for downstream pipeline consumers and review workflows.

Requires:

  • The user must be logged into ShotGrid through the connections panel.

  • The user must have selected a ShotGrid project in the connections panel.

Expects Inputs:
-Text ShotGrid Context: A JSON text containing the names and IDs of the selected Project, Shot, and Task (e.g., from the ShotGrid Context node).

Provides Outputs:
-Text Last Published Asset: A JSON text representation of the asset PublishedFile entity created by this node, including its ID, name, and path.
-Text Last Published Graph: A JSON text representation of the graph PublishedFile entity created by this node, including its ID, name, and path.

Image

Image Composite w/ UI

The Image Composite w/ UI Node provides an interactive graphical interface for layering a foreground element (Overlay) onto a base layer (Background). Unlike the standard Image Composite Node, which requires a separate mask input, this variant includes a built-in visual tool for positioning, scaling, and adjusting the overlay directly. It supports traditional blend modes, including Normal, Multiply, Screen, Overlay, Darken, Lighten, Difference, Exclusion, Hard Light, Soft Light, Color Dodge, Color Burn, Hue, Saturation, Color, and Luminosity.

Expects Inputs:
-Image Background Image: The base layer that serves as the foundation of the composite.
-Image Overlay Image: The foreground element to be placed on top of the background.
-Composite Tool: An interactive UI control for visually adjusting the overlay's position, scale, and blending mode within the composite.

Provides Outputs:
-Image Composited Image: The finalized composite image with the layers merged according to the UI settings.

OKO

OKO Space Selection

Allows the user to select an OKO space from the available options.

Provides Outputs:
-Text Space ID: The OKO Space ID.

OKO

OKO Publish Splat

Allows the user to publish a splat asset to an OKO space library.

Expects Inputs:
-Text Space ID: The OKO space to save to.
-Text Asset Name: The name of the asset.
-Image Asset Thumbnail: The thumbnail image for the asset.
-Model-splat Asset Splat Model: The splat model file for the asset.

OKO

OKO Publish Model

Allows the user to publish a model asset to an OKO space library.

Expects Inputs:
-Text Space ID: The OKO space to save to.
-Text Asset Name: The name of the asset.
-Model-3D Asset Model: The 3D model file for the asset.

OKO

OKO Publish Audio

Allows the user to publish an audio asset to an OKO space library.

Expects Inputs:
-Text Space ID: The OKO space to save to.
-Text Asset Name: The name of the asset.
-Audio Asset Audio: The audio file for the asset.

OKO

OKO Publish Video

Allows the user to publish a video asset to an OKO space library.

Expects Inputs:
-Text Space ID: The OKO space to save to.
-Text Asset Name: The name of the asset.
-Image Asset Thumbnail: The thumbnail image for the asset.
-Video Asset Video: The video file for the asset.

OKO

OKO Publish Image

Allows the user to publish an image asset to an OKO space library.

Expects Inputs:
-Text Space ID: The OKO space to save to.
-Text Asset Name: The name of the asset.
-Image Asset Image: The image file for the asset.

OKO

OKO Asset Selection

Allows the user to select an OKO asset from the available options.

Expects Inputs:
-Text Space ID: The OKO Space ID.

Provides Outputs:
-Text Asset URL: The OKO Asset URL.

GenAI/Fal

Happy Horse Image-To-Video

Generates video from a still image using Alibaba's Happy Horse 1.0 model. The input image is used as the first frame, with optional text guidance. Supports up to 1080p resolution and 3-15 seconds of video with synchronized native audio.

Expects Inputs:
-Image Image: The source image to animate (min 300px, aspect ratio 1:2.5 to 2.5:1, max 10 MB).
-Text Prompt: Optional text guidance for the animation (max 2500 characters).
-Integer Seed: Seed for reproducibility.
-Text Resolution: Output video resolution – 720p or 1080p.
-Text Duration: Output video duration in seconds (3-15).

Provides Outputs:
-Video Generated Video: The AI-generated video.

GenAI/Fal

Seedance V1.5 Pro Image-To-Video

Generates high-quality videos from text descriptions and images using the Seedance 1.5 Pro model.

Expects Inputs:
-Text Prompt: The text description detailing the subject, style, lighting, and composition of the desired video.
-Image Image: The image to use to drive the video creation.
-Integer Seed: A seed value to guide the determinism of the generation.
-Text Resolution: The resolution of video to generate, any of 480p, 720p, or 1080p.
-Text Duration: The duration of the video to generate, any of 5 or 10 seconds.

Provides Outputs:
-Video Output: The final AI-generated video file.

GenAI/Fal

 

Generates video from multi-modal references using ByteDance's Seedance 2.0 model. Supports up to 9 images, 3 videos, and 3 audio clips as reference inputs, with no more than 12 total reference items. Reference them in the prompt as @Image1, @Video1, @Audio1, etc.

Expects Inputs:
-Text Prompt: Text description for the video. Reference media using @Image1, @Video1, @Audio1, etc.
-Image Image 1–9: Up to 9 reference images (max 30 MB each, JPEG/PNG/WebP).
-Video Video 1–3: Up to 3 reference videos (combined max 50 MB, 2–15s total duration).
-Audio Audio 1–3: Up to 3 reference audio clips (max 15 MB each, combined max 15s). Requires at least one image or video.
-Integer Seed: Seed for reproducibility.
-Text Generate Audio: Whether to generate synchronized audio.
-Text Aspect Ratio: Output aspect ratio (auto, 21:9, 16:9, 4:3, 1:1, 3:4, 9:16).
-Text Resolution: Output resolution (480p, 720p, 1080p).
-Text Duration: Video length in seconds (auto, or 4–15).

Provides Outputs:
-Video Generated Video: The AI-generated video.
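
The reference limits listed for this node can be checked before a run. The sketch below encodes only the count rules stated above (at most 9 images, 3 videos, 3 audio clips, 12 items in total, and audio requiring at least one image or video). The function name is hypothetical, and file size and duration limits are not covered.

```python
def check_reference_counts(images: int, videos: int, audio: int) -> list[str]:
    """Return a list of problems with the reference counts (empty if valid)."""
    problems = []
    if images > 9:
        problems.append("at most 9 reference images are supported")
    if videos > 3:
        problems.append("at most 3 reference videos are supported")
    if audio > 3:
        problems.append("at most 3 reference audio clips are supported")
    if images + videos + audio > 12:
        problems.append("no more than 12 reference items in total")
    if audio > 0 and images + videos == 0:
        problems.append("audio references require at least one image or video")
    return problems
```

Note that each per-type maximum can be satisfied while the total is still exceeded: 9 images, 3 videos, and 3 audio clips add up to 15 items.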

GenAI/Fal

Seedance V1.5 Pro Text-To-Video

Generates high-quality videos from text descriptions using the Seedance 1.5 Pro model.

Expects Inputs:
-Text Prompt: The text description detailing the subject, style, lighting, and composition of the desired video.
-Integer Seed: A seed value to guide the determinism of the generation.
-Text Aspect Ratio: The aspect ratio of the video to generate.
-Text Resolution: The resolution of video to generate, any of 480p, 720p, or 1080p.
-Text Duration: The duration of the video to generate, any of 5 or 10 seconds.

Provides Outputs:
-Video Output: The final AI-generated video file.

GenAI/Fal

Seedance V2.0 Image-To-Video

Generates high-quality videos from text descriptions and images using the Seedance 2.0 model.

Expects Inputs:
-Text Prompt: The text description detailing the subject, style, lighting, and composition of the desired video.
-Image Image: The image to use to drive the video creation.
-Integer Seed: A seed value to guide the determinism of the generation.
-Text Resolution: The resolution of video to generate, any of 480p, 720p, or 1080p.
-Text Duration: The duration of the video to generate, any of 5 or 10 seconds.

Provides Outputs:
-Video Output: The final AI-generated video file.

GenAI/Fal

Seedance V2.0 Text-To-Video

Generates high-quality videos from text descriptions using the Seedance 2.0 model.

Expects Inputs:
-Text Prompt: The text description detailing the subject, style, lighting, and composition of the desired video.
-Integer Seed: A seed value to guide the determinism of the generation.
-Text Aspect Ratio: The aspect ratio of the video to generate.
-Text Resolution: The resolution of video to generate, any of 480p, 720p, or 1080p.
-Text Duration: The duration of the video to generate, any of 5 or 10 seconds.

Provides Outputs:
-Video Output: The final AI-generated video file.

GenAI/Fal

Kling O3 Pro Video Edit

Edit videos using Kling O3.

Expects Inputs:
-Text Prompt: Prompt text. Reference the video as @Video1, reference images as @Image1, etc., and elements as @Element1.
-Video Video: Reference video.
-Image Front Image: Image to use as the front image.
-Image Alternate Image: An alternate view of the image.
-Image Reference 1: Reference image for style/appearance.
-Image Reference 2: Reference image for style/appearance.
-Text Keep Audio: Keep or discard original sound.

Provides Outputs:
-Video Generated Video: The edited video.

GenAI/Fal

Kling O3 Pro Video Reference

Kling O3 generates new shots guided by the input reference video.

Expects Inputs:
-Text Prompt: Prompt text. Reference the video as @Video1, reference images as @Image1, etc., and elements as @Element1.
-Video Video: Reference video.
-Image Front Image: Image to use as the front image.
-Image Alternate Image: An alternate view of the image.
-Image Reference 1: Reference image for style/appearance.
-Image Reference 2: Reference image for style/appearance.
-Text Keep Audio: Keep or discard original sound.
-Text Aspect Ratio: Any of auto, 16:9, 9:16, 1:1, where auto infers the aspect ratio from the input video.
-Text Duration: The number of seconds to generate, any whole number from 3 to 15.

Provides Outputs:
-Video Generated Video: The generated video.

GenAI/Fal

Kling V3 Video Lip-sync

Lip-sync a video to an audio track.

Expects Inputs:
-Video Video: Reference video.
-Audio Audio: Audio to sync with the video.
-Text Sync Mode: Cut Off, Loop, Bounce, Silence, or Remap.

Provides Outputs:
-Video Generated Video: The resultant generated lip-synced video.

GenAI/Fal

Kling V3 Pro Video Motion Control

Transfer movements from a reference video to any character image.

Expects Inputs:
-Text Prompt: Prompt text.
-Video Video: Reference video. The character actions will be consistent with this reference video.
-Image Image: Reference image. Characters and backgrounds are based on this image.
-Text Keep Original Sound: Whether to keep original sound (default Keep Sound).
-Text Character Source: Choose Character from Image when the image should define the character, and the result should better follow camera movement (max 10s). Choose Character from Video when the video should define the character, and it’s better suited for complex motions (max 30s).

Provides Outputs:
-Video Generated Video: The resultant generated motion controlled video.

GenAI/Fal

LTX 2.3 Audio to Video

Generates a video synchronized to an input audio clip using the LTX 2.3 model. Audio duration must be 2-20 seconds.

Expects Inputs:
-Audio Audio: The audio clip to generate a video from (2-20 seconds).
-Text Prompt: Text description of how the video should look. Required if no start image is provided.
-Image Start Image (Optional): An image to use as the first frame of the video.
-Float Guidance Scale: Controls how closely the output follows the prompt. Defaults to 5 for text, 9 with an image.
-Text Aspect Ratio: The aspect ratio of the generated video (auto, 16:9, or 9:16).

Provides Outputs:
-Video Generated Video: The final AI-generated video file synchronized to the audio.

GenAI/Fal

LTX 2.3 HDR

Converts SDR video to HDR using a self-hosted LTX 2.3 IC-LoRA container. Produces a lossless 16-bit H.265 MP4 in ACEScg colour space and/or a tonemapped MP4 preview.

Expects Inputs:
-Video Video: The SDR video to convert to HDR.
-Text Host: Hostname or IP of the LTX-HDR container (default: 127.0.0.1).
-Integer Seed: Seed for reproducibility (default: 10).
-Integer Max Frames: Maximum number of frames to process (default: 161, max: 161).
-Integer Inference Steps (optional): Override for stage 1 inference steps.
-Integer Stage 2 Inference Steps (optional): Override for stage 2 (upscaler) inference steps.
-Text Output Mode: SDR and HDR or SDR Only.

Provides Outputs:
-Video SDR Preview: Tonemapped SDR MP4 preview.
-Video HDR Video: Lossless 16-bit H.265 MP4 in ACEScg colour space.

GenAI/Fal

LTX 2.3 Image to Video

Generates high-quality videos from text descriptions and images using the LTX 2.3 model.

Expects Inputs:
-Text Prompt: The text description detailing the subject, style, lighting, and composition of the desired video.
-Image Image: The image to use to drive the video creation.
-Image Last Image (optional): The last image to use to drive the video creation.
-Text Aspect Ratio: The aspect ratio of the video to generate, any of auto, 16:9, 9:16.
-Text Resolution: The resolution of video to generate, any of 1080p, 1440p, and 2160p.
-Text Duration: The duration of the video to generate, any of 6, 8, or 10 seconds.
-Text FPS: The frames per second of the video to generate, any of 24, 25, 48, or 50.
-Text Audio: Audio or No Audio.

Provides Outputs:
-Video Output: The final AI-generated video file.

GenAI/Fal

LTX 2.3 Reference Video to Video

Generates a video from a reference video and text prompt using the LTX 2.3 22B model. Can optionally use audio and start/end images.

Expects Inputs:
-Text Prompt: The text description detailing the desired video.
-Video Reference Video: The source video to reference.
-Audio Audio (Optional): Optional audio to use for the video.
-Image Start Image (Optional): Image to use as the first frame.
-Image End Image (Optional): Image to use as the last frame.
-Text Negative Prompt: Text describing behaviors to suppress.
-Integer Seed: Seed for reproducibility.
-Integer Inference Steps: Number of inference steps.
-JSON Tuning Dictionary (optional): Tunable parameters in a JSON dictionary, accepts keys with float values: video_cfg_scale, video_stg_scale, video_rescaling_scale, video_modality_scale, audio_cfg_scale, audio_stg_scale, audio_rescaling_scale, audio_modality_scale, gradient_estimation_gamma, camera_lora_scale, distill_lora_first_pass_scale, distill_lora_second_pass_scale, video_strength, audio_strength.
-Integer Frame Count: If not matching video length, the number of frames to generate.
-Text Match Video Length: Whether to match output to the input video length or use frame count.
-Text Aspect Ratio: Resulting aspect ratio of the generated video.
-Text Generate Audio: Whether to generate audio.
-Text Use Multiscale: Generate coherently starting from a smaller version.
-Text Camera LoRA: Camera movement to apply to the generated video.
-Text Preprocessor: Preprocessing to apply to the reference video.
-Text Video Quality: Quality of the generated video.

Provides Outputs:
-Video Generated Video: The final AI-generated video file.
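
A Tuning Dictionary value could be assembled and checked along these lines. The key names come from the input description above; the helper name and the numeric values in the usage note are purely illustrative assumptions, not recommended settings or documented defaults.

```python
import json

# Keys accepted by the Tuning Dictionary input, as listed in this glossary entry.
ALLOWED_KEYS = {
    "video_cfg_scale", "video_stg_scale", "video_rescaling_scale",
    "video_modality_scale", "audio_cfg_scale", "audio_stg_scale",
    "audio_rescaling_scale", "audio_modality_scale",
    "gradient_estimation_gamma", "camera_lora_scale",
    "distill_lora_first_pass_scale", "distill_lora_second_pass_scale",
    "video_strength", "audio_strength",
}


def build_tuning_json(params: dict) -> str:
    """Validate key names and return a JSON dictionary with float values."""
    unknown = set(params) - ALLOWED_KEYS
    if unknown:
        raise ValueError(f"Unknown tuning keys: {sorted(unknown)}")
    return json.dumps({key: float(value) for key, value in params.items()})
```

For example, `build_tuning_json({"video_cfg_scale": 3, "audio_strength": 0.8})` returns a JSON string with both values as floats, which can be pasted into the Tuning Dictionary input.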

GenAI/Fal

LTX 2.3 Text to Video

Generates high-quality videos from text descriptions using the LTX 2.3 model.

Expects Inputs:
-Text Prompt: The text description detailing the subject, style, lighting, and composition of the desired video.
-Text Aspect Ratio: The aspect ratio of the video to generate, any of 16:9 or 9:16.
-Text Resolution: The resolution of video to generate, any of 1080p, 1440p, or 2160p.
-Text Duration: The duration of the video to generate, any of 6, 8, or 10 seconds.
-Text FPS: The frames per second of the video to generate, any of 24, 25, 48, or 50.
-Text Audio: Audio or No Audio.

Provides Outputs:
-Video Output: The final AI-generated video file.

GenAI/Fal

LTX 2.3 Video Extension

Extends an existing video, preserving the action in the original video.

Expects Inputs:
-Video Video: The input video to be extended.
-Text Prompt (optional): The prompt to guide the extension of the video.
-Float Duration (secs): The duration of the video extension.
-Float Seconds to reference: The number of seconds of the input video to use as guidance for the extension.
-Text Mode: Whether to extend the start or the end of the input video, either Start or End.

Provides Outputs:
-Video Output: The final AI-generated video file.

GenAI/Fal

LTX 2.3 Retake Video

Generates a new video from an existing video with the content replaced as defined in the prompt.

Expects Inputs:
-Video Video: The input video to be modified (currently limited to about 15 MB).
-Text Prompt: The prompt guiding the video modification, describing what should be changed in the video and how it should be updated.
-Float Start at (secs): The time at which to start applying the updates.
-Float Duration (secs): The amount of time to apply the updates.
-Text Retake Mode: The mode to use for the retake, any of Replace Audio, Replace Video, or Replace Audio and Video.

Provides Outputs:
-Video Output: The final AI-generated video file.

GenAI/Fal

WAN 2.2 Video Style Transfer

Transfers a character or environment style from an image onto an existing video.

Expects Inputs:
-Video Video: The source video upon which the style will be conveyed.
-Image Image: The source image from which to draw the style to be conveyed.
-Float CFG: The classifier-free guidance scale, which controls how strongly the output adheres to the style of the image.
-Integer Inference Steps: The number of inference steps in the process – higher is more accurate but it takes longer.
-Float Shift: Influences the effect the image will have on the video, between 1.0 and 10.0.
-Integer Seed: Influences the determinism of the generation.
-Text Mode: What to transfer from the image, either Character Transfer or Style Transfer.
-Text Resolution: Any of 480p, 580p, or 720p.
-Text Video Quality: Any of Low, Medium, High, or Maximum.
-Text Turbo: Any of Standard or Turbo, influences the speed of inference with a tradeoff in quality.

Provides Outputs:
-Video Generated Video: The resultant generated video with the style transferred from the image.

GenAI/Fal

WAN 2.5 Image to Video

Generates high-quality videos from text descriptions and images using the Wan 2.5 model.

Expects Inputs:
-Text Prompt: The text description detailing the subject, style, lighting, and composition of the desired video.
-Image Image: The image to use to drive the video creation.
-Text Negative Prompt: The text description defining behaviors to be suppressed in video generation.
-Integer Seed: A seed value to guide the determinism of the generation.
-Text Resolution: The resolution of video to generate, any of 480p, 720p, or 1080p.
-Text Duration: The duration of the video to generate, any of 5 or 10 seconds.

Provides Outputs:
-Video Output: The final AI-generated video file.

GenAI/Fal

WAN 2.5 Text to Video

Generates high-quality videos from text descriptions using the Wan 2.5 model.

Expects Inputs:
-Text Prompt: The text description detailing the subject, style, lighting, and composition of the desired video.
-Text Negative Prompt: The text description defining behaviors you wish to suppress.
-Integer Seed: A seed value to guide the determinism of the generation.
-Text Aspect Ratio: The aspect ratio of the video to generate, any of 1:1, 16:9, or 9:16.
-Text Resolution: The resolution of video to generate, any of 480p, 720p, or 1080p.
-Text Duration: The duration of the video to generate, any of 5 or 10 seconds.

Provides Outputs:
-Video Output: The final AI-generated video file.

GenAI/Fal

WAN 2.6 Reference to Video

Generate a video using reference videos for character/subject consistency (R2V). Models characters referenced as @Video1, @Video2, @Video3 in the prompt.

Expects Inputs:
-Text Prompt: The text prompt describing the desired video.
-Video Video 1: The first reference video.
-Video Video 2: The second reference video.
-Video Video 3: The third reference video.
-Text Negative Prompt: The text prompt describing behaviors to suppress in the generated video.
-Integer Seed: The seed for the random number generator.
-Text Aspect Ratio: The aspect ratio of the generated video.
-Text Resolution: The resolution of the generated video.
-Text Duration: The duration of the generated video.
-Text Multi Shot: Whether the generated video should be a multi-shot video.

Provides Outputs:
-Video Generated Video: The final AI-generated video file.

GenAI/Fal

WAN 2.7 Edit Video

Edits a video using the Wan 2.7 Video Edit model. Supports instruction-based editing, reference image-based editing, and video style transfer.

Expects Inputs:
-Video Input Video: The input video to be edited.
-Text Prompt: Editing instruction or style transfer description.
-Image Reference Image: An optional reference image URL for reference-based editing.
-Integer Seed: The seed for the random number generator.
-Text Resolution: Output video resolution tier (720p or 1080p).
-Text Aspect Ratio: Aspect ratio of the generated video.
-Text Audio Setting: Audio handling (Auto Audio or Original Audio).

Provides Outputs:
-Video Generated Video: The final AI-generated video file.

GenAI/Fal

WAN Motion

Transfers motion from a driving video onto a reference character image using the Wan Motion model.

Expects Inputs:
-Video Driving Video: The driving video that provides the motion.
-Image Reference Image: The reference image that provides the character’s appearance.
-Text Prompt: An optional text prompt describing the desired video.
-Integer Seed: The seed for the random number generator.
-Text Acceleration: The acceleration level to use.
-Text Adapt Motion: Whether to adapt the driving video's motion to match the reference image's body proportions.

Provides Outputs:
-Video Generated Video: The final AI-generated video file.

Context

Shot Context

Captures shot-level context for graph execution and governance.

Expects Inputs:
-Text Shot ID: The shot identifier to use for this graph run.

Provides Outputs:
-Text Shot ID: The normalised shot identifier (trimmed).
-Text Shot Payload: A JSON string payload containing the shot ID (for example: {"shot_id": "sh010"}).
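
The two outputs can be pictured with a short sketch: trim the incoming ID, then wrap it in a JSON object. The function name is hypothetical, and the single `shot_id` key follows the example above; the real payload may carry additional fields.

```python
import json


def shot_context(shot_id: str) -> tuple[str, str]:
    """Return (normalised shot ID, JSON payload string)."""
    # Normalise by trimming surrounding whitespace.
    normalised = shot_id.strip()
    # Serialise the ID into a JSON string payload.
    payload = json.dumps({"shot_id": normalised})
    return normalised, payload
```

Downstream nodes that accept JSON text can then parse the payload with any standard JSON parser.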